We’re excited to partner with Milvus to bring you Milvus Lite, the newly available, lightweight, in-memory version of their leading vector database. This powerful tool is now just a pip install away, ready to run on Jupyter Notebooks, laptops, or edge devices, and is fully integrated with Voyage AI embeddings, making the development of GenAI applications easier than ever.
Our cutting-edge general-purpose and domain-specific embedding models are easily accessible through a hosted API endpoint. Voyage embedding models dramatically boost retrieval quality for enterprise retrieval-augmented generation (RAG) applications, and Voyage’s portfolio of embedding models frequently tops the Massive Text Embedding Benchmark (MTEB) leaderboards. Highlights include the general-purpose voyage-large-2-instruct and the legal-specific voyage-law-2, ranked #1 for legal retrieval. Voyage models consistently outperform commercial alternatives, including those from OpenAI and Cohere.
Together, Milvus Lite and Voyage’s hosted models let developers add powerful semantic search to their GenAI applications within seconds, with little to no code change when scaling to production. The same client-side code works against self-hosted Milvus on Kubernetes or managed Milvus on Zilliz Cloud, simplifying migration and saving valuable time.
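As a minimal sketch of what that migration looks like, the main change is the URI passed to MilvusClient (the server address and API key below are placeholders, not real endpoints):
from pymilvus import MilvusClient
# Local development: Milvus Lite stores everything in a single local file
client = MilvusClient("milvus_demo.db")
# Self-hosted Milvus (e.g., on Kubernetes): connect to the server instead
# client = MilvusClient(uri="http://localhost:19530")
# Managed Milvus on Zilliz Cloud: supply your cluster endpoint and API key
# client = MilvusClient(uri="https://<your-cluster-endpoint>", token="<your-api-key>")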
See it in Action
Here’s a simple demo illustrating how easy it is to build semantic search with Milvus Lite and Voyage embeddings.
First, install the Milvus and Voyage AI packages.
pip install --upgrade pymilvus voyageai
Our code starts with importing the packages and initializing the respective clients.
from pymilvus import MilvusClient
import voyageai
# Milvus client
milvus_client = MilvusClient("milvus_voyage_demo.db")
# Embedding model to use; see https://docs.voyageai.com/docs/embeddings for available models
MODEL_ID = "voyage-large-2-instruct"
# Voyage client
# This automatically uses the VOYAGE_API_KEY environment variable.
# Alternatively, pass the key explicitly: voyage_client = voyageai.Client(api_key="<your secret key>")
voyage_client = voyageai.Client()
docs = [
    "AI was founded as an academic discipline in 1956.",
    "Alan Turing was the first person to conduct substantial research in AI.",
    "Born in Maida Vale, London, Turing was raised in southern England.",
    "Artificial intelligence interest waned in the mid-70s."
]
Next, we embed our documents and prepare them for insertion into Milvus.
# Vectorize (embed) documents
vectors = voyage_client.embed(
    texts=docs,
    model=MODEL_ID,
    truncation=False
).embeddings
# Get dimensions
EMBED_DIM = len(vectors[0])
data = [{"id": i, "vector": vectors[i], "text": docs[i], "subject": "history"} for i in range(len(docs))]
We create a collection and insert the data.
COLLECTION_NAME = "demo_collection" # Milvus collection name
# Clear data before inserting
has_collection = milvus_client.has_collection(COLLECTION_NAME)
if has_collection:
    milvus_client.drop_collection(COLLECTION_NAME)
# Create collection
milvus_client.create_collection(
    collection_name=COLLECTION_NAME,
    dimension=EMBED_DIM
)
# Insert data
res = milvus_client.insert(
    collection_name=COLLECTION_NAME,
    data=data
)
# Verify the insert
print(str(res["insert_count"]) + " records inserted.")
Now, we’re ready for semantic search!
# Queries
queries = ["When was artificial intelligence founded?"]
# Embed queries
query_vectors = voyage_client.embed(
    texts=queries,
    model=MODEL_ID,
    truncation=False
).embeddings
# Semantic search
res = milvus_client.search(
    collection_name=COLLECTION_NAME,  # target collection
    data=query_vectors,  # query vectors
    limit=2,  # number of returned entities
    output_fields=["text", "subject"],  # fields to return in the results
)
# Semantic search results
for q in queries:
    print('Query:', q)
    for result in res:
        print(result)
    print("\n")
We can see that the search retrieved the appropriate document, semantically understanding that AI is synonymous with artificial intelligence.
Query: When was artificial intelligence founded?
[{'id': 0, 'distance': 0.8130185604095459, 'entity': {'text': 'AI was founded as an academic discipline in 1956.', 'subject': 'history'}}, {'id': 1, 'distance': 0.7873374223709106, 'entity': {'text': 'Alan Turing was the first person to conduct substantial research in AI.', 'subject': 'history'}}]
Get Started Now
Start adding semantic search to your GenAI applications today with Milvus Lite and Voyage embeddings. Choose the cutting-edge embedding model from our portfolio that best fits your use case. Voyage AI also develops rerankers, which can further boost your semantic search quality. Learn more about Milvus here.
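As a rough sketch of how a reranker could plug into the demo above (the rerank call follows Voyage’s Python client, but the model name "rerank-lite-1" and parameters are illustrative; see https://docs.voyageai.com/docs/reranker for current options):
# Rerank the documents Milvus retrieved for the first query.
# Model name and parameters below are illustrative; check the Voyage docs.
retrieved_texts = [hit["entity"]["text"] for hit in res[0]]
reranking = voyage_client.rerank(
    query=queries[0],
    documents=retrieved_texts,
    model="rerank-lite-1",
    top_k=2
)
for r in reranking.results:
    print(r.relevance_score, r.document)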