Milvus can be used to build intelligent systems in most AI application scenarios:
- Query by image content, including content-based image retrieval such as bio-identification and object detection and recognition.
- Real-time object detection and tracking.
- Natural language analysis: semantics-based text analysis and suggestion, and text similarity search.
- Voiceprint recognition and audio search
- Remove duplicated files by file fingerprint
The application architecture of Milvus as a feature vector search engine is as follows:
Unstructured data (images, videos, text, audio) is transformed into feature vectors by feature extraction models and saved to the Milvus database. When you input a target vector, Milvus searches the current vector collection for the most similar vectors and returns their IDs.
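The search step described above can be sketched in plain Python. This is only a brute-force illustration, not the Milvus API: the `collection` dictionary, the 3-dimensional vectors, and the `search` function are hypothetical stand-ins, and Euclidean distance is assumed as the similarity measure.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(collection, target, top_k=2):
    """Return the IDs of the top_k stored vectors closest to the target."""
    ranked = sorted(collection, key=lambda vid: euclidean(collection[vid], target))
    return ranked[:top_k]

# Hypothetical feature vectors keyed by ID. In practice they would be
# produced by a feature extraction model and have far more dimensions.
collection = {
    101: [0.9, 0.1, 0.0],
    102: [0.0, 1.0, 0.0],
    103: [0.8, 0.2, 0.1],
}
target = [1.0, 0.0, 0.0]

result_ids = search(collection, target, top_k=2)
print(result_ids)  # [101, 103]
```

A real deployment would replace the linear scan with a vector index, since comparing the target against every stored vector does not scale to large collections.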
When you shop or browse online, you often see sections such as "You may also like" or "Related products". Many tech companies have embedded recommendation algorithms into their mobile apps, for example Toutiao, NetEase News, Pinduoduo, and WeChat. With Milvus, you can implement your own personalized recommendation system.
Recommend personalized content based on user persona.
Taking personalized ad recommendation as an example, the application architecture is:
Create a user persona through data analysis and key feature extraction
A user persona can be built by analyzing the user's history data and extracting key features. For example, if the user's history contains news about tennis, the Wimbledon Championships, sports, and Tennis Masters, we can conclude from these keywords that the user is a tennis fan.
- Convert user keywords to vectors, load them into Milvus, and extract user feature vectors.
Recommend content to users based on feature vectors and a logistic regression model.
- Search for and select the top 100 ads that the user might be interested in and has not yet viewed.
- Extract the keywords and click-through rate of the top 100 ads.
- Rank and recommend ad content to the user with a logistic regression model trained on user history data.
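The two stages above (similarity-based candidate selection, then logistic regression ranking) might look roughly like the following. This is a sketch under stated assumptions rather than the actual architecture: the ad records, the `recommend` function, the choice of cosine similarity and click-through rate as the only two features, and the weights and bias are all hypothetical. In practice the weights would be learned from user history data.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def recommend(user_vec, ads, viewed, weights, bias, candidates=100, top_n=2):
    # Stage 1: drop ads the user has already viewed, then keep the
    # `candidates` ads whose vectors are most similar to the user vector.
    unviewed = [ad for ad in ads if ad["id"] not in viewed]
    unviewed.sort(key=lambda ad: cosine(user_vec, ad["vector"]), reverse=True)
    shortlist = unviewed[:candidates]

    # Stage 2: score each shortlisted ad with a logistic regression model
    # over two assumed features: similarity and click-through rate (CTR).
    scored = []
    for ad in shortlist:
        z = weights[0] * cosine(user_vec, ad["vector"]) + weights[1] * ad["ctr"] + bias
        scored.append((sigmoid(z), ad["id"]))
    scored.sort(reverse=True)
    return [ad_id for _, ad_id in scored[:top_n]]

# Hypothetical data: a tennis fan's feature vector and four ads.
user_vec = [1.0, 0.0]
ads = [
    {"id": "tennis-racket", "vector": [0.9, 0.1], "ctr": 0.05},
    {"id": "tennis-shoes", "vector": [0.8, 0.3], "ctr": 0.12},
    {"id": "cookware", "vector": [0.1, 0.9], "ctr": 0.20},
    {"id": "tennis-balls", "vector": [1.0, 0.0], "ctr": 0.08},
]
viewed = {"tennis-balls"}

picks = recommend(user_vec, ads, viewed, weights=(4.0, 10.0), bias=-3.0)
print(picks)  # ['tennis-shoes', 'tennis-racket']
```

Note that the already-viewed ad is excluded even though it is the closest match, and that the higher click-through rate lifts "tennis-shoes" above the slightly more similar "tennis-racket".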
Online sellers need to prepare product images and tag product categories to help buyers understand their products. As product categories grow, so does the number of product images to manage. If these images are not well organized, sellers often cannot find a previously prepared image and have to retake it.
Manage product images and run multimodal similarity search based on keywords, for example, finding the images most similar to those of the most popular products.
Milvus helps you implement product feature extraction and multimodal search through the following steps:
- Convert product images to vectors.
- Load these vectors into Milvus, together with other structured data such as price, publish date, and quantity sold.
- Begin a multimodal search, specifying the query range as "among the top 10 best-selling products".
- Find the most similar images that belong to the top 10 products.
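The steps above combine a structured filter with a vector similarity search. The following is a minimal in-memory sketch of that idea, not Milvus code: the product records, the field names `image_vec` and `sold`, and the function `search_among_best_sellers` are hypothetical, and cosine similarity is an assumed metric.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search_among_best_sellers(products, query_vec, best_n=10, top_k=3):
    # Structured filter: restrict the search range to the best_n products
    # with the highest quantity sold.
    best_sellers = sorted(products, key=lambda p: p["sold"], reverse=True)[:best_n]
    # Vector search: rank the remaining products by image similarity.
    ranked = sorted(best_sellers,
                    key=lambda p: cosine(query_vec, p["image_vec"]),
                    reverse=True)
    return [p["id"] for p in ranked[:top_k]]

# Hypothetical catalog: image vectors plus one structured field.
products = [
    {"id": "A", "image_vec": [1.0, 0.0], "sold": 500},
    {"id": "B", "image_vec": [0.9, 0.1], "sold": 20},
    {"id": "C", "image_vec": [0.6, 0.8], "sold": 900},
    {"id": "D", "image_vec": [0.0, 1.0], "sold": 700},
]

matches = search_among_best_sellers(products, [1.0, 0.0], best_n=3, top_k=2)
print(matches)  # ['A', 'C']
```

Product "B" has a very similar image but is excluded because it falls outside the best-seller range, which is the point of applying the structured condition before ranking by similarity.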
Today, online shopping and product trading have become a daily routine. On commodity trading platforms such as Taobao and Xianyu, sellers can display products to customers more fully and intuitively through product videos. At the same time, copying and plagiarism of product videos have also appeared. One way to find duplicate videos is vector similarity search.
Take Xianyu, a second-hand commodity trading platform, as an example. Given its current product volume and business growth trend, the vector index system needs to support billions of videos with an average length of 20 seconds, with one 1024-dimensional vector extracted per second of video.
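To get a feel for this scale, the figures above can be turned into a rough size estimate. The calculation below assumes exactly one billion videos (the text only says "billions") and 4-byte float32 values; both are assumptions made for illustration, and the result covers raw vector data only, not index overhead.

```python
def raw_vector_footprint(num_videos, seconds_per_video, dim, bytes_per_value=4):
    """Return (number of vectors, raw size in bytes) for one vector per second."""
    num_vectors = num_videos * seconds_per_video
    return num_vectors, num_vectors * dim * bytes_per_value

# Assumed: 1 billion videos, float32 components.
total_vectors, total_bytes = raw_vector_footprint(
    num_videos=1_000_000_000,
    seconds_per_video=20,
    dim=1024,
)
print(total_vectors)       # 20000000000
print(total_bytes / 1e12)  # 81.92 (terabytes, decimal)
```

Under these assumptions, each billion videos contributes 20 billion vectors and roughly 82 TB of raw data, which is why an index structure rather than a linear scan is needed.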
Recognize and remove duplicate videos
The core of video deduplication is the high-dimensional vector index. Milvus helps you recognize duplicate videos through these steps:
Convert video data to vectors with a feature extraction algorithm. The choice of algorithm determines how precisely the vectors represent the original video.
Vector distance computation
Once videos are represented by vectors, the similarity of two videos can be measured by the similarity of their vectors. The distance between vectors can be calculated using cosine similarity (the cosine of the angle between them), Euclidean distance, or the inner product.
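As a quick reference, the three measures mentioned above can be written out in a few lines of plain Python. The function names and the sample vectors are illustrative only.

```python
import math

def inner_product(a, b):
    """Sum of element-wise products; larger means more similar."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors; 1.0 means same direction."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    return inner_product(a, b) / (norm_a * norm_b)

a = [1.0, 2.0, 2.0]
b = [2.0, 4.0, 4.0]  # same direction as a, twice the length

print(inner_product(a, b))       # 18.0
print(euclidean_distance(a, b))  # 3.0
print(cosine_similarity(a, b))   # 1.0
```

The example shows why the choice matters: `a` and `b` point in the same direction, so cosine similarity treats them as identical, while Euclidean distance still reports a gap of 3.0 because their lengths differ.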
Search for the most similar vectors using vector indexing methods such as tree-based, hash-based, and vector quantization approaches.
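To illustrate one of these families, here is a small sketch of a hash-based index using random hyperplanes, a common form of locality-sensitive hashing. It is a generic teaching example and not a description of how Milvus implements its indexes. All function names, the seed, and the sample vectors are hypothetical.

```python
import random

def make_hyperplanes(dim, num_bits, seed=0):
    """Draw num_bits random hyperplanes (normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]

def signature(vec, hyperplanes):
    """One bit per hyperplane: 1 if vec lies on its non-negative side, else 0."""
    return tuple(
        1 if sum(h * x for h, x in zip(plane, vec)) >= 0 else 0
        for plane in hyperplanes
    )

def build_index(vectors, hyperplanes):
    """Group vector IDs into buckets keyed by their signature."""
    buckets = {}
    for vid, vec in vectors.items():
        buckets.setdefault(signature(vec, hyperplanes), []).append(vid)
    return buckets

def candidates(query, buckets, hyperplanes):
    """Return the IDs sharing the query's bucket; only these need exact comparison."""
    return buckets.get(signature(query, hyperplanes), [])

vectors = {
    "video-1": [1.0, 0.5, -0.2, 0.3],
    "video-2": [-1.0, -0.5, 0.2, -0.3],  # points in the opposite direction
}
planes = make_hyperplanes(dim=4, num_bits=8)
index = build_index(vectors, planes)

# A query with the same direction as video-1 falls into the same bucket.
query = [2 * x for x in vectors["video-1"]]
print(candidates(query, index, planes))  # ['video-1']
```

Vectors separated by a small angle tend to share a signature, so a query is compared exactly only against the few vectors in its bucket. The trade-off is that near neighbors can land in different buckets, which is why practical systems typically use several hash tables or probe neighboring buckets.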