This project demonstrates how to convert sentences into embeddings using Ollama (with the llama2 model) and store them in a vector database (FAISS). The system can then perform a similarity search to find the most semantically similar sentence in the stored collection.
To run this project, you’ll need the following:
ollama
faiss
numpy
You can install the required libraries using the following command (note that FAISS is published on PyPI as faiss-cpu):
pip install ollama faiss-cpu numpy
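The Ollama server must also be running locally, and the model used by the script has to be downloaded before it can produce embeddings. Assuming the default llama2 model used in the code below, you can pull it with:
ollama pull llama2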
Convert Sentences to Embeddings
The script converts a set of sample sentences into embeddings using Ollama and stores them in FAISS.
Perform Similarity Search
After storing the embeddings, you can input a new sentence, and the system will return the most similar sentence from the stored collection.
import faiss
import ollama
import numpy as np

def text_to_embedding_ollama(text):
    """Return the embedding vector for a piece of text using Ollama."""
    model_name = "llama2"
    response = ollama.embed(model=model_name, input=text)
    # The response holds one embedding per input string; return the first (and only) one.
    return response["embeddings"][0]

# Sample sentences
sentences = [
    "Artificial Intelligence is the future.",
    "AI requires large datasets to train models.",
    "Machines learn by analyzing data."
]

# Convert to embeddings
embeddings = [text_to_embedding_ollama(sentence) for sentence in sentences]

# Store in a FAISS index for similarity search (FAISS expects float32 vectors)
dimension = len(embeddings[0])
index = faiss.IndexFlatL2(dimension)
index.add(np.array(embeddings, dtype="float32"))

# Query the most similar sentence
query = text_to_embedding_ollama("AI evolves with data.")
D, I = index.search(np.array([query], dtype="float32"), 1)
print(f"The most similar sentence is: {sentences[I[0][0]]}")
You can test the system by adding new sentences to the collection and running additional similarity searches to verify that the results make sense, as shown in the sketch below.
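A minimal sketch of such a test, assuming the script above has already run so that sentences, index, and text_to_embedding_ollama are in scope (the new sentence and the query text are illustrative):

# Add a new sentence to both the collection and the FAISS index
new_sentence = "Neural networks improve with more training data."
sentences.append(new_sentence)
index.add(np.array([text_to_embedding_ollama(new_sentence)], dtype="float32"))

# Search again; the query can now match the newly added sentence as well
query = text_to_embedding_ollama("Models get better as data grows.")
D, I = index.search(np.array([query], dtype="float32"), 1)
print(f"The most similar sentence is: {sentences[I[0][0]]}")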
This project is licensed under the MIT License.
Feel free to submit issues or pull requests. Contributions are welcome!
For any questions or feedback, reach out at info@tameronline.com.