Milvus supports multiple types of indexes to suit various scenarios. This page describes each index and the main parameters that affect query performance or recall rate when using it. It also shows whether each index is compatible with CPU-only and GPU-enabled Milvus.
Some concepts used in this page are explained as follows:
- nq: number of queried vectors.
- nlist: total number of buckets generated. The feature space is partitioned into nlist buckets.
- nprobe: number of buckets to search at query time.
- gpu_search_threshold: the threshold that nq is compared against to decide where the search computation runs. If nq >= gpu_search_threshold, the search is executed on GPUs only. If nq < gpu_search_threshold, the search is executed on both CPUs and GPUs.
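The gpu_search_threshold rule above can be written as a small decision function. This is a minimal sketch for illustration only: `search_devices` is a hypothetical helper, not a Milvus function, and the threshold value of 1000 is an assumed example.

```python
def search_devices(nq: int, gpu_search_threshold: int) -> str:
    """Return where the search runs, following the rule described above.

    Hypothetical helper for illustration; not part of the Milvus API.
    """
    if nq >= gpu_search_threshold:
        return "GPU only"
    return "CPU and GPU"


# Assumed threshold of 1000, chosen only for this example.
for nq in (1, 999, 1000, 5000):
    print(nq, "->", search_devices(nq, gpu_search_threshold=1000))
```

Note that the boundary case nq == gpu_search_threshold falls on the GPU-only side, because the rule uses >=.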
| Type | Class name | Description | Affecting parameters | CPU-only Milvus | GPU-enabled Milvus |
|---|---|---|---|---|---|
| | | Provides 100% query recall rate. Compared to other indexes, … | | | |
| Inverted file with exact post-verification | | Vectors are partitioned into buckets without any quantization. This partition-based multi-probing method is balanced between speed and accuracy. | | | |
| IVF and scalar quantizer | | Adopts a scalar quantizer to significantly reduce the size of a vector (by about 3/4). Compared to … | | | |
| IVFSQ8 hybrid search on both CPU and GPU | | Optimized version of … | | | |
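The bucket partitioning and multi-probing described above, and the roles of nlist and nprobe, can be illustrated with a small pure-Python sketch. It is not Milvus code: the function names are hypothetical, the data is random, and centroids are sampled from the data rather than trained (real IVF indexes typically train them with a clustering algorithm such as k-means). An exhaustive scan is included as the exact baseline for measuring recall.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, nlist, seed=0):
    """Partition vectors into nlist buckets around sampled centroids.

    Assumption: centroids are sampled from the data to keep the sketch
    short; production indexes usually train them by clustering.
    """
    rng = random.Random(seed)
    centroids = rng.sample(vectors, nlist)
    buckets = [[] for _ in range(nlist)]
    for idx, vec in enumerate(vectors):
        nearest = min(range(nlist), key=lambda c: sq_dist(vec, centroids[c]))
        buckets[nearest].append(idx)
    return centroids, buckets


def ivf_search(query, vectors, centroids, buckets, nprobe, topk):
    """Scan only the nprobe buckets whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: sq_dist(query, centroids[c]))
    candidates = [i for c in order[:nprobe] for i in buckets[c]]
    return sorted(candidates, key=lambda i: sq_dist(query, vectors[i]))[:topk]


def flat_search(query, vectors, topk):
    """Exhaustive scan over every vector: exact results, 100% recall."""
    return sorted(range(len(vectors)), key=lambda i: sq_dist(query, vectors[i]))[:topk]


rng = random.Random(42)
data = [[rng.random() for _ in range(8)] for _ in range(2000)]
query = [rng.random() for _ in range(8)]
centroids, buckets = build_ivf(data, nlist=32)

exact = flat_search(query, data, topk=10)
for nprobe in (1, 4, 32):
    approx = ivf_search(query, data, centroids, buckets, nprobe, topk=10)
    recall = len(set(approx) & set(exact)) / len(exact)
    print(f"nprobe={nprobe:2d}  recall={recall:.2f}")
```

A larger nprobe scans more buckets, which costs more time but can only add candidates. When nprobe equals nlist, every bucket is scanned and the result matches the exhaustive search.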
To choose an appropriate index for your application scenario, read How to select an index in Milvus.
When using the IVFSQ8H index, building an index and querying concurrently on the GPU may cause the server to go down.