I am using LlamaIndex’s BM25Retriever as part of a Retrieval-Augmented Generation (RAG) pipeline:
```python
from llama_index.retrievers.bm25 import BM25Retriever

bm25_retriever = BM25Retriever.from_defaults(nodes=nodes, similarity_top_k=3)
```
However, I cannot find a clear way to inspect or explicitly confirm the BM25 hyperparameters used (e.g., k1, b, tokenization settings).
From the literature, BM25 typically uses default Okapi parameters such as:
k1 ≈ 1.2
b ≈ 0.75
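To make clear what these two parameters control, here is a minimal, self-contained sketch of textbook Okapi BM25 scoring. It is not LlamaIndex's implementation: the function name `bm25_scores` is hypothetical, the IDF term is the non-negative "Lucene-style" variant (other variants exist), and the inputs are assumed to be pre-tokenized, non-empty documents.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.2, b=0.75):
    """Score every document against the query with textbook Okapi BM25.

    k1 controls term-frequency saturation; b controls how strongly
    scores are normalized by document length (b=0 disables it).
    Assumes docs_tokens is a non-empty list of non-empty token lists.
    """
    n_docs = len(docs_tokens)
    avgdl = sum(len(doc) for doc in docs_tokens) / n_docs

    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs_tokens:
        df.update(set(doc))

    scores = []
    for doc in docs_tokens:
        tf = Counter(doc)
        # Length normalization factor, governed by b.
        length_norm = 1 - b + b * len(doc) / avgdl
        score = 0.0
        for term in query_tokens:
            if term not in tf:
                continue
            # Non-negative IDF variant (an assumption; others are in use).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Saturating term-frequency component, governed by k1.
            score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
        scores.append(score)
    return scores
```

Because both `k1` and `b` enter the score directly, two runs with different values can rank the same documents differently, which is why I want to pin down the values the retriever actually uses.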
But in LlamaIndex, these parameters are not clearly exposed in the public API or object attributes.
What are the exact default BM25 parameters used internally by BM25Retriever in LlamaIndex?
Are these parameters configurable through the public API?
If not exposed directly, where in the source code are they defined?
Is there a recommended way to verify or log these parameters for reproducibility in experiments?
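For the last point, this is the kind of logging I have in mind: a generic helper that walks an object's instance attributes and records anything named like a BM25 hyperparameter. It is only a sketch. `find_params` and `log_params` are hypothetical names, and it makes no assumption about how `BM25Retriever` stores its state, so it may well find nothing if the values live elsewhere (for example inside a compiled or slotted object).

```python
import logging

logger = logging.getLogger("bm25_params")


def find_params(obj, names=("k1", "b", "delta"), max_depth=2, _path="retriever"):
    """Recursively collect instance attributes whose name is in `names`.

    Only plain instance dictionaries are inspected, and recursion is
    bounded by max_depth so reference cycles cannot loop forever.
    """
    found = {}
    attrs = getattr(obj, "__dict__", None)
    if not isinstance(attrs, dict) or max_depth < 0:
        return found
    for key, value in attrs.items():
        path = f"{_path}.{key}"
        if key in names:
            found[path] = value
        elif hasattr(value, "__dict__"):
            # Descend into nested objects, e.g. a wrapped scorer.
            found.update(find_params(value, names, max_depth - 1, path))
    return found


def log_params(obj):
    """Log whatever find_params discovers and return it for later use."""
    params = find_params(obj)
    if params:
        for path, value in sorted(params.items()):
            logger.info("%s = %r", path, value)
    else:
        logger.warning("No BM25 hyperparameters found on %r", type(obj).__name__)
    return params
```

Calling `log_params(bm25_retriever)` at the start of each experiment would record the values alongside the results, if they are reachable this way at all.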