Conduct a Vector Search

Milvus supports searching vectors in a collection or partition.

Search for vectors in a collection

  1. Create search parameters DSL.

    # This DSL searches for the topk entities that are closest to vectors[:1],
    # using the `IVF_FLAT` index with `nprobe = 10` and `metric_type = L2`,
    # AND whose field "A" is in [1, 2, 5],
    # AND whose field "B" is greater than 1 and less than 100.
    >>> dsl = {
    ...     "bool": {
    ...         "must":[
    ...             {
    ...                 "term": {"A": [1, 2, 5]}
    ...             },
    ...             {
    ...                 "range": {"B": {"GT": 1, "LT": 100}}
    ...             },
    ...             {
    ...                 "vector": {
    ...                    "Vec": {"topk": 10, "query": vectors[:1], "metric_type": "L2", "params": {"nprobe": 10}}
    ...                 }
    ...             }
    ...         ]
    ...     }
    ... }

    // Basic hybrid search:
    // Given the query embeddings, find the `topK` entities that are most similar
    // to them by the L2 metric_type (Euclidean distance).
    // In addition to vector similarity, filter the entities so that:
    // - field `A` is greater than 250,
    // - field `B` is 1, 5, or 10.
    List<List<Float>> queryEmbedding = /* your query vectors */;
    final long topK = 10;
    String dsl = String.format(
        "{\"bool\": {"
            + "\"must\": [{"
            + "    \"range\": {"
            + "        \"A\": {\"GT\": 250}" // "GT" for greater than
            + "    }},{"
            + "    \"term\": {"
            + "        \"B\": [1, 5, 10]" // "term" is a list
            + "    }},{"
            + "    \"vector\": {"
            + "        \"embedding\": {"
            + "            \"topk\": %d, \"metric_type\": \"L2\", \"type\": \"float\", \"query\": %s"
            + "    }}}]}}", topK, queryEmbedding.toString());
    • topk refers to the k vectors closest to the target vector in the vector space.
    • The range of topk is [1, 16384].
    • Different indexes require different search parameters. To conduct an embedding search, you must assign values to all search parameters of the index in use. See Vector Indexes for more information.
  2. Conduct a similarity search:

    >>> results = client.search('test01', dsl)

    SearchParam searchParam = SearchParam.create("test01").setDsl(dsl);
    SearchResult searchResult = client.search(searchParam);
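The DSL from step 1 can also be assembled programmatically. The following is a minimal sketch, not part of the Milvus SDK: `build_dsl` and `MAX_TOPK` are hypothetical names introduced here for illustration. It only builds a dictionary in the shape shown above and rejects a `topk` outside the documented range [1, 16384]. It does not contact a Milvus server.

```python
from typing import Any, Dict, List, Optional

MAX_TOPK = 16384  # upper bound of topk as documented above (hypothetical constant name)


def build_dsl(
    vector_field: str,
    query_vectors: List[List[float]],
    topk: int,
    nprobe: int,
    metric_type: str = "L2",
    terms: Optional[Dict[str, List[Any]]] = None,
    ranges: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Assemble a search DSL dict in the shape shown in step 1 (hypothetical helper)."""
    if not 1 <= topk <= MAX_TOPK:
        raise ValueError(f"topk must be in [1, {MAX_TOPK}], got {topk}")

    must: List[Dict[str, Any]] = []
    # One "term" clause per scalar field that must match one of the listed values.
    for field, values in (terms or {}).items():
        must.append({"term": {field: list(values)}})
    # One "range" clause per scalar field, e.g. {"GT": 1, "LT": 100}.
    for field, bounds in (ranges or {}).items():
        must.append({"range": {field: dict(bounds)}})
    # The vector clause carries the query vectors and the index search parameters.
    must.append({
        "vector": {
            vector_field: {
                "topk": topk,
                "query": query_vectors,
                "metric_type": metric_type,
                "params": {"nprobe": nprobe},
            }
        }
    })
    return {"bool": {"must": must}}


# Rebuild the Python DSL from step 1 with a placeholder query vector.
dsl = build_dsl(
    "Vec",
    [[0.1, 0.2, 0.3, 0.4]],
    topk=10,
    nprobe=10,
    terms={"A": [1, 2, 5]},
    ranges={"B": {"GT": 1, "LT": 100}},
)
```

Validating `topk` on the client side surfaces an out-of-range value before the request is sent.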

You can also set Milvus to return a specified field. Here, we retrieve values in the B field:

>>> results = client.search('test01', dsl, fields=["B"])

SearchParam searchParam = SearchParam
     .create("test01")
     .setDsl(dsl)
     .setParamsInJson("{\"fields\": [\"B\"]}");
SearchResult searchResult = client.search(searchParam);

Search for vectors in a partition

>>> results = client.search('test01', dsl, partition_tags=['tag01'])

// SearchParam method that restricts the search to the given partitions:
setPartitionTags(java.util.List<java.lang.String> partitionTags);

If you do not specify partition_tags, Milvus searches for similar vectors in the entire collection.


FAQ

Why is my recall rate unsatisfying?

You can increase the value of nprobe when searching from a client. The greater the value, the more accurate the result, and the more time the search takes. See Performance Tuning > Index for more information.
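
To judge whether a larger nprobe actually improves results, you can measure recall against an exact brute-force search on a small sample. The following is a minimal sketch in pure Python. `l2_squared`, `exact_topk`, and `recall_at_k` are hypothetical helper names, and `approx_ids` stands in for the IDs an index-based search might return. It is not output from Milvus.

```python
from typing import List, Sequence


def l2_squared(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance; it ranks neighbors the same way as L2."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def exact_topk(query: Sequence[float], vectors: List[Sequence[float]], k: int) -> List[int]:
    """Brute-force ground truth: indices of the k vectors closest to the query."""
    order = sorted(range(len(vectors)), key=lambda i: l2_squared(query, vectors[i]))
    return order[:k]


def recall_at_k(approx_ids: Sequence[int], exact_ids: Sequence[int]) -> float:
    """Fraction of the true nearest neighbors that the approximate search found."""
    if not exact_ids:
        raise ValueError("exact_ids must not be empty")
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
query = [0.0, 0.0]

exact_ids = exact_topk(query, vectors, k=3)
approx_ids = [0, 2, 3]  # placeholder for IDs returned by an approximate search
recall = recall_at_k(approx_ids, exact_ids)
```

Repeating the measurement for several nprobe values shows where recall stops improving, which helps balance accuracy against search time.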

Does Milvus support inserting while searching?

Yes.

Does the size of a collection affect vector searches in one of its partitions, especially when it holds up to 100 million vectors?

No. If you have specified partitions when conducting a vector search, Milvus searches the specified partitions only.

Does Milvus load the whole collection to the memory if I search only certain partitions in that collection?

No, Milvus only loads the partitions to search.

Are queries in segments processed in parallel?

Yes, but the parallel processing mechanism varies with the Milvus version.

Suppose a collection has multiple segments. When a query request comes in:

  • CPU-only Milvus processes the segment reading tasks and the segment searching tasks in a pipeline.
  • On top of this pipeline mechanism, GPU-enabled Milvus distributes the segments among the available GPUs.

See How Does Milvus Schedule Query Tasks for more information.

Will a batch query benefit from multi-threading?

If your batch query is on a small scale (nq < 64), Milvus combines the query requests, in which case multi-threading helps.

Otherwise, the resources are already exhausted, so multi-threading does not help much.

Why is the search very slow?

Check whether the value of cache.cache_size in milvus.yaml is greater than the size of the collection.
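
As a rough way to apply this check, the sketch below estimates the raw size of the float vectors in a collection and compares it with a cache size. It rests on stated assumptions: vectors are stored as 32-bit floats (4 bytes per dimension), the cache size is expressed in GiB, and index files and scalar fields are ignored, so the real footprint may be larger. `estimate_raw_size_gib` and `fits_in_cache` are hypothetical helper names, not Milvus APIs.

```python
BYTES_PER_FLOAT32 = 4   # assumption: vectors are stored as 32-bit floats
BYTES_PER_GIB = 1024 ** 3


def estimate_raw_size_gib(num_vectors: int, dim: int) -> float:
    """Raw float vector data only; index files and scalar fields are not counted."""
    return num_vectors * dim * BYTES_PER_FLOAT32 / BYTES_PER_GIB


def fits_in_cache(collection_gib: float, cache_size_gib: float) -> bool:
    """True if the estimated collection size is below the configured cache size."""
    return collection_gib < cache_size_gib


# Example: one million 128-dimensional vectors against an assumed 4 GiB cache.
small = estimate_raw_size_gib(1_000_000, 128)
# Example: 100 million 128-dimensional vectors against the same cache.
large = estimate_raw_size_gib(100_000_000, 128)

small_fits = fits_in_cache(small, 4.0)
large_fits = fits_in_cache(large, 4.0)
```

If the estimate exceeds the cache size, consider raising cache.cache_size as far as the available memory permits.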

Why do I see a surge in memory usage when conducting a vector search immediately after an index is created?

This is because:

  • Milvus loads the newly created index file to the memory for the vector search.
  • The original vector files used to create the index are not yet released from the memory, because the combined size of the original vector files and the index file has not exceeded the upper limit specified by cache.cache_size.

Why does the first search take a long time after Milvus restarts?

This is because, after restarting, Milvus needs to load data from the disk to the memory for the first vector search. To avoid the delay, you can set preload_collection in milvus.yaml and list as many collections as the memory permits. Milvus then loads these collections to the memory each time it restarts.

Alternatively, you can call load_collection() to load collections to the memory.
