Tuning BM25
One of the nice features of BM25 is that, unlike TF/IDF, it has two parameters that allow it to be tuned:
- k1: This parameter controls how quickly an increase in term frequency results in term-frequency saturation. The default value is 1.2. Lower values result in quicker saturation, and higher values in slower saturation.
- b: This parameter controls how much effect field-length normalization should have. A value of 0.0 disables normalization completely, and a value of 1.0 normalizes fully. The default is 0.75.
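To make the effect of the two parameters concrete, here is a minimal Python sketch of the textbook BM25 term-frequency component. It is an illustration only: the function name `bm25_tf` and its arguments are assumptions made for this example, the IDF part of the score is left out, and the actual Lucene implementation differs in details such as how field lengths are stored.

```python
def bm25_tf(tf, field_len, avg_field_len, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency component (hypothetical helper, IDF omitted)."""
    # Length normalization: b=0 ignores field length, b=1 applies it fully.
    norm = 1.0 - b + b * (field_len / avg_field_len)
    # Saturating function of tf: it approaches (k1 + 1) as tf grows.
    return tf * (k1 + 1.0) / (tf + k1 * norm)


# Compare how fast the component flattens out for different k1 values,
# using a field of exactly average length.
for k1 in (0.5, 1.2, 2.0):
    row = [round(bm25_tf(tf, 100, 100, k1=k1), 3) for tf in (1, 2, 5, 10)]
    print(f"k1={k1}: {row}")
```

With a lower `k1`, the value gets close to its ceiling of `k1 + 1` after only a few occurrences of the term. With `b` set to 0, the field length drops out of the calculation entirely.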
The practicalities of tuning BM25 are another matter. The default values for k1 and b should be suitable for most document collections, but the optimal values really depend on the collection. Finding good values for your collection is a matter of adjusting, checking, and adjusting again.
The similarity algorithm can be set on a per-field basis. It’s just a matter of specifying the chosen algorithm in the field’s mapping:
    PUT /my_index
    {
      "mappings": {
        "doc": {
          "properties": {
            "title": {
              "type":       "string",
              "similarity": "BM25"
            },
            "body": {
              "type":       "string",
              "similarity": "default"
            }
          }
        }
      }
    }

The title field uses BM25 similarity.
The body field uses the default similarity.
Currently, it is not possible to change the similarity mapping for an existing field. You would need to reindex your data in order to do that.
Configuring BM25
Configuring a similarity is much like configuring an analyzer. Custom similarities can be specified when creating an index. For instance:
    PUT /my_index
    {
      "settings": {
        "similarity": {
          "my_bm25": {
            "type": "BM25",
            "b":    0
          }
        }
      },
      "mappings": {
        "doc": {
          "properties": {
            "title": {
              "type":       "string",
              "similarity": "my_bm25"
            },
            "body": {
              "type":       "string",
              "similarity": "BM25"
            }
          }
        }
      }
    }

Reference: https://www.elastic.co/guide/en/elasticsearch/guide/current/changing-similarities.html
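Since tuning means trying several parameter combinations, it can help to generate the index-creation body instead of editing it by hand. The following Python sketch builds the same kind of body as the request above for a given k1 and b. The helper name `bm25_index_body`, the value grid, and the choice of the `title` field are assumptions made for illustration. The sketch also assumes that the similarity settings accept a `k1` key alongside `b`, and it only constructs the JSON without sending it anywhere.

```python
import json


def bm25_index_body(similarity_name, k1=1.2, b=0.75, field="title"):
    """Build an index-creation body with a custom BM25 similarity (hypothetical helper)."""
    return {
        "settings": {
            "similarity": {
                # Custom similarity based on the built-in BM25 type.
                similarity_name: {"type": "BM25", "k1": k1, "b": b}
            }
        },
        "mappings": {
            "doc": {
                "properties": {
                    # The field refers to the custom similarity by name.
                    field: {"type": "string", "similarity": similarity_name}
                }
            }
        },
    }


# One candidate body per (k1, b) pair; the grid values are arbitrary examples.
candidates = {
    (k1, b): bm25_index_body("my_bm25", k1=k1, b=b)
    for k1 in (0.8, 1.2, 2.0)
    for b in (0.0, 0.5, 0.75)
}

print(json.dumps(candidates[(1.2, 0.0)], indent=2))
```

Each candidate would be used as the body of a separate `PUT` request against a fresh index, because the similarity of an existing field cannot be changed without reindexing.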