
SPLADE

SPLADE is an embedding model that offers highly sparse representations for documents and queries, inheriting desirable properties from bag-of-words (BOW) models such as exact term matching and efficiency.

Milvus integrates with the SPLADE model via the SpladeEmbeddingFunction class. This class provides methods for encoding documents and queries; the resulting embeddings are sparse vectors compatible with Milvus indexing.

To use this feature, install the necessary dependencies:

pip install --upgrade pymilvus
pip install "pymilvus[model]"

To instantiate the SpladeEmbeddingFunction, use the command:

from pymilvus import model

splade_ef = model.sparse.SpladeEmbeddingFunction(
    model_name="naver/splade-cocondenser-selfdistil", 
    device="cpu"
)

Parameters:

  • model_name (string)

    The name of the SPLADE model to use for encoding. Valid options are naver/splade-cocondenser-ensembledistil (default), naver/splade_v2_max, naver/splade_v2_distil, and naver/splade-cocondenser-selfdistil. For more information, refer to Play with models.

  • device (string)

    The device to use, with cpu for the CPU and cuda:n for the nth GPU device.
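
If a GPU is available, you can load the model onto it instead of the CPU. A minimal sketch, assuming a CUDA-capable device (the default model_name is used, so only device is set):

from pymilvus import model

# Load the default SPLADE model onto the first GPU.
# Assumes a CUDA-capable device; fall back to device="cpu" otherwise.
splade_ef_gpu = model.sparse.SpladeEmbeddingFunction(device="cuda:0")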

To create embeddings for documents, use the encode_documents() method:

docs = [
    "Artificial intelligence was founded as an academic discipline in 1956.",
    "Alan Turing was the first person to conduct substantial research in AI.",
    "Born in Maida Vale, London, Turing was raised in southern England.",
]

docs_embeddings = splade_ef.encode_documents(docs)

# Print embeddings
print("Embeddings:", docs_embeddings)
# Since the output embeddings are in a 2D csr_array format, convert them to a list for easier manipulation.
print("Sparse dim:", splade_ef.dim, list(docs_embeddings)[0].shape)

The expected output is similar to the following:

Embeddings:   (0, 2001) 0.6392706036567688
  (0, 2034) 0.024093208834528923
  (0, 2082) 0.3230178654193878
...
  (2, 23602)    0.5671860575675964
  (2, 26757)    0.5770265460014343
  (2, 28639)    3.1990697383880615
Sparse dim: 30522 (1, 30522)
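
These document embeddings can be stored in a collection with a SPARSE_FLOAT_VECTOR field. The following is a minimal sketch, not the canonical setup: the database file, collection name, and field names are placeholders, and it assumes pymilvus 2.4+ with Milvus Lite (or point MilvusClient at a running Milvus server):

from pymilvus import MilvusClient, DataType

# Placeholder database file and collection/field names.
client = MilvusClient("splade_demo.db")

schema = client.create_schema(auto_id=True)
schema.add_field(field_name="id", datatype=DataType.INT64, is_primary=True)
schema.add_field(field_name="text", datatype=DataType.VARCHAR, max_length=512)
schema.add_field(field_name="sparse_vector", datatype=DataType.SPARSE_FLOAT_VECTOR)

index_params = client.prepare_index_params()
index_params.add_index(
    field_name="sparse_vector",
    index_type="SPARSE_INVERTED_INDEX",
    metric_type="IP",  # inner product is the usual metric for SPLADE weights
)

client.create_collection(
    collection_name="splade_demo", schema=schema, index_params=index_params
)

# pymilvus accepts scipy sparse rows for SPARSE_FLOAT_VECTOR fields;
# iterating the 2D csr_array yields one (1, 30522) row per document.
client.insert(
    collection_name="splade_demo",
    data=[
        {"text": doc, "sparse_vector": emb}
        for doc, emb in zip(docs, docs_embeddings)
    ],
)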

To create embeddings for queries, use the encode_queries() method:

queries = ["When was artificial intelligence founded", 
           "Where was Alan Turing born?"]

query_embeddings = splade_ef.encode_queries(queries)

# Print embeddings
print("Embeddings:", query_embeddings)
# Since the output embeddings are in a 2D csr_array format, convert them to a list for easier manipulation.
print("Sparse dim:", splade_ef.dim, list(query_embeddings)[0].shape)

The expected output is similar to the following:

Embeddings:   (0, 2001)        0.6353746056556702
  (0, 2194)        0.015553371049463749
  (0, 2301)        0.2756537199020386
...
  (1, 18522)        0.1282549500465393
  (1, 23602)        0.13133203983306885
  (1, 28639)        2.8150033950805664
Sparse dim: 30522 (1, 30522)
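
Because both outputs are csr_arrays over the same 30522-dimensional vocabulary, you can also score queries against documents directly with a sparse dot product, without a Milvus deployment. A minimal sketch reusing the docs, queries, docs_embeddings, and query_embeddings variables from the snippets above:

# Inner-product scores between every query and every document.
# SPLADE weights are non-negative, so higher means a stronger term-level match.
scores = (query_embeddings @ docs_embeddings.T).toarray()  # shape: (2, 3)

for i, query in enumerate(queries):
    best = scores[i].argmax()
    print(f"{query!r} -> {docs[best]!r} (score={scores[i][best]:.4f})")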