Insert and Delete Entities

You can insert entities into a collection or a partition, and delete entities by ID. This article covers the following topics:

Insert entities to a collection

  1. Generate 10,000 random 128-dimensional vectors.
>>> import random
# Generate 10000 entities.
>>> list_of_int = [random.randint(0, 255) for _ in range(10000)]
>>> vectors = [[random.random() for _ in range(128)] for _ in range(10000)]
  private static List<List<Float>> randomFloatVectors() {
    SplittableRandom splitCollectionRandom = new SplittableRandom();
    List<List<Float>> vectors = new ArrayList<>(10000);
    for (int i = 0; i < 10000; ++i) {
      splitCollectionRandom = splitCollectionRandom.split();
      DoubleStream doubleStream = splitCollectionRandom.doubles(128);
      List<Float> vector =
          doubleStream.boxed().map(Double::floatValue).collect(Collectors.toList());
      vectors.add(vector);
    }
    return vectors;
  }
  2. Insert the entities, including their vectors, into the collection.

# Insert embeddings.
>>> hybrid_entities = [
        {"name": "duration", "values": list_of_int, "type": DataType.INT32},
        {"name": "release_year", "values": list_of_int, "type": DataType.INT64},
        {"name": "embedding", "values": vectors, "type":DataType.FLOAT_VECTOR}
    ]
>>> client.insert('demo_films', hybrid_entities)
    // Insert 10,000 films with their duration, release year, and fake embeddings into the collection "demo_films".
    List<Integer> durations = /* A list of 10,000 Integers. */
    List<Long> releaseYears = LongStream.range(0, 10000).boxed().collect(Collectors.toList());
    List<List<Float>> embeddings = randomFloatVectors();

    InsertParam insertParam = InsertParam
        .create(collectionName)
        .addField("duration", DataType.INT32, durations)
        .addField("release_year", DataType.INT64, releaseYears)
        .addVectorField("embedding", DataType.VECTOR_FLOAT, embeddings);
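An insert passes one list of values per field, so a length or dimension mismatch is easy to introduce when the lists are built separately. The following is a minimal sketch of a pre-insert check, not part of the Milvus client: `check_entities` is a hypothetical helper, and it assumes that every field must carry one value per entity and that all vectors in a field share one dimension.

```python
import random


def check_entities(entities):
    """Return the shared entity count, or raise ValueError if fields disagree."""
    if not entities:
        raise ValueError("no fields given")
    counts = {field["name"]: len(field["values"]) for field in entities}
    if len(set(counts.values())) != 1:
        raise ValueError(f"fields have different lengths: {counts}")
    for field in entities:
        values = field["values"]
        # Assumption: a field whose values are lists or tuples is a vector field.
        if values and isinstance(values[0], (list, tuple)):
            dims = {len(vector) for vector in values}
            if len(dims) != 1:
                raise ValueError(f"{field['name']}: mixed dimensions {sorted(dims)}")
    return len(entities[0]["values"])


# Small stand-in data with the same layout as hybrid_entities; the "type"
# key is left out because this check never reads it.
sample_ints = [random.randint(0, 255) for _ in range(5)]
sample_vectors = [[random.random() for _ in range(8)] for _ in range(5)]
sample_entities = [
    {"name": "duration", "values": sample_ints},
    {"name": "release_year", "values": sample_ints},
    {"name": "embedding", "values": sample_vectors},
]
entity_count = check_entities(sample_entities)
```

Running the check on the real hybrid_entities list before calling client.insert would surface a mismatch locally instead of as a server-side error.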

If auto_id is set to False for the collection, you can specify the entity IDs yourself:

>>> entity_ids = [id for id in range(10000)]
>>> client.insert('demo_films', hybrid_entities, ids=entity_ids)
    // Insert 10,000 films with their IDs, duration, release year, and fake embeddings into the collection "demo_films".
    List<Long> ids = LongStream.range(0, 10000).boxed().collect(Collectors.toList());
    List<Integer> durations = /* A list of 10,000 Integers. */
    List<Long> releaseYears = LongStream.range(0, 10000).boxed().collect(Collectors.toList());
    List<List<Float>> embeddings = randomFloatVectors();

    InsertParam insertParam = InsertParam
        .create(collectionName)
        .addField("duration", DataType.INT32, durations)
        .addField("release_year", DataType.INT64, releaseYears)
        .addVectorField("embedding", DataType.VECTOR_FLOAT, embeddings)
        .setEntityIds(ids);
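When you supply the IDs yourself, it can help to validate them before the insert call. The sketch below is illustrative only: `validate_entity_ids` is a hypothetical helper, and the upper bound of 2**63 - 1 assumes that IDs are stored as signed 64-bit integers. Milvus accepts duplicate IDs, so the duplicate check is an optional policy on the caller's side and is off by default.

```python
INT64_MAX = 2**63 - 1  # assumption: IDs are stored as signed 64-bit integers


def validate_entity_ids(ids, entity_count, allow_duplicates=True):
    """Return the IDs as a list, or raise ValueError if any ID is unusable."""
    if len(ids) != entity_count:
        raise ValueError(f"expected {entity_count} IDs, got {len(ids)}")
    for entity_id in ids:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(entity_id, bool) or not isinstance(entity_id, int):
            raise ValueError(f"ID {entity_id!r} is not an integer")
        if not 0 <= entity_id <= INT64_MAX:
            raise ValueError(f"ID {entity_id} is outside the range 0..{INT64_MAX}")
    if not allow_duplicates and len(set(ids)) != len(ids):
        raise ValueError("duplicate IDs in this batch")
    return list(ids)


entity_ids = validate_entity_ids(list(range(10000)), entity_count=10000)
```

The check only looks at the IDs in the current batch; it cannot tell whether an ID already exists in the collection.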

Insert entities to a partition

>>> client.insert('demo_films', hybrid_entities, partition_tag="American")
    // Insert 10,000 films with their IDs, duration, release year, and fake embeddings into the partition "American".
    List<Long> ids = LongStream.range(0, 10000).boxed().collect(Collectors.toList());
    List<Integer> durations = /* A list of 10,000 Integers. */
    List<Long> releaseYears =  LongStream.range(0, 10000).boxed().collect(Collectors.toList());
    List<List<Float>> embeddings = randomFloatVectors();

    InsertParam insertParam = InsertParam
        .create(collectionName)
        .addField("duration", DataType.INT32, durations)
        .addField("release_year", DataType.INT64, releaseYears)
        .addVectorField("embedding", DataType.VECTOR_FLOAT, embeddings)
        .setEntityIds(ids)
        .setPartitionTag(partitionTag);

Delete entities by ID

Suppose your collection contains entities with the following IDs:

>>> ids = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Run the following command to delete the specified entities:

>>> client.delete_entity_by_id('demo_films', ids)
client.deleteEntityByID(collectionName, ids.subList(0, 10));

After deleting, call flush again so that newly inserted data becomes visible and the deleted data can no longer be recovered.

FAQ

Is there a length limit on the self-defined entity IDs? Entity IDs must be non-negative 64-bit integers.
Can I insert vectors with existing IDs? Yes, you can. If you insert vectors with an ID that already exists, the collection ends up with duplicate IDs.
Does Milvus support inserting while searching? Yes.
Is there a volume limit on the vectors inserted each time? Vectors inserted each time must not exceed 256 MB.
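
The 256 MB limit can be turned into a batch size with a little arithmetic. The following is a rough sketch under stated assumptions: each vector dimension is a 4-byte float, 256 MB is read as 256 * 1024 * 1024 bytes, and only the vector data is counted, so scalar fields and serialization overhead are ignored. `rows_per_batch` and `batch_ranges` are hypothetical helpers, not Milvus APIs.

```python
MAX_BATCH_BYTES = 256 * 1024 * 1024  # assumption: 256 MB read as 256 MiB
BYTES_PER_FLOAT = 4  # assumption: float vectors use 4 bytes per dimension


def rows_per_batch(dim, max_bytes=MAX_BATCH_BYTES, bytes_per_value=BYTES_PER_FLOAT):
    """Return how many dim-dimensional vectors fit in one insert."""
    row_bytes = dim * bytes_per_value
    if row_bytes <= 0 or row_bytes > max_bytes:
        raise ValueError(f"a {dim}-dimensional vector does not fit in {max_bytes} bytes")
    return max_bytes // row_bytes


def batch_ranges(total, batch_size):
    """Return (start, end) index pairs that cover range(total) in order."""
    return [(start, min(start + batch_size, total)) for start in range(0, total, batch_size)]


limit = rows_per_batch(128)          # 524288 vectors of 128 dimensions
ranges = batch_ranges(10000, limit)  # [(0, 10000)]: the example fits in one insert
```

Each (start, end) pair can be used to slice every field's value list, for example vectors[start:end], before calling insert once per batch. Because the estimate ignores overhead, choosing a batch size well below the computed limit is the safer option.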
© 2019 - 2020 Milvus. All rights reserved.