Retrieval-Augmented Generation (RAG) combines the strengths of retrieval systems and generative models to produce highly relevant, context-aware outputs. It queries external knowledge sources—like databases or search indexes—to retrieve relevant information, which is then refined by a generative model.
Despite its potential, implementing RAG can be complex: it requires selecting models, organizing data, and applying best practices. Existing tools simplify prototyping but lack an open-source template for scalable deployment. Enter Cognita.
Cognita is an open-source RAG framework that simplifies building and deploying scalable applications. By breaking RAG into modular steps, it ensures easy maintenance, interoperability with other AI tools, customization, and compliance. Cognita balances adaptability and user-friendliness while staying scalable for future advancements.
Using MongoDB as a vector database for your Retrieval-Augmented Generation (RAG) application can be beneficial depending on your requirements. Here's why MongoDB could be a good choice:
MongoDB supports vector indexing through its Atlas Vector Search. This enables efficient similarity searches over high-dimensional data, which is central to RAG workflows. Key benefits:
RAG applications often require managing both unstructured data (e.g., text and embeddings) and structured data (e.g., metadata, user preferences).
MongoDB, being a document database, lets you store embeddings alongside their associated metadata in a single record. For example:
{ "embedding": [0.1, 0.2, 0.3, ...], "text": "This is a sample document.", "metadata": { "source": "document_1", "timestamp": "2024-12-06T10:00:00Z" } }
This avoids the complexity of managing embeddings in a separate system.
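As a sketch of how such a record might be assembled before insertion, here is a small helper. The function name, field values, and collection names are illustrative only; they are not part of Cognita's or PyMongo's API:

```python
from datetime import datetime, timezone

def make_record(embedding, text, source):
    """Bundle an embedding with its source text and metadata in one document."""
    return {
        "embedding": embedding,  # vector produced by your embedding model
        "text": text,            # the original chunk of text
        "metadata": {
            "source": source,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    }

record = make_record([0.1, 0.2, 0.3], "This is a sample document.", "document_1")

# Against a live Atlas cluster, you would persist it with PyMongo, e.g.:
# from pymongo import MongoClient
# MongoClient(uri)["cognita"]["documents"].insert_one(record)
```

Because everything lives in one document, a single read returns the vector, the text it came from, and its provenance.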
- Schema-less Design: MongoDB's schema flexibility makes it easy to iterate on your data model as your RAG application evolves.
- Horizontal Scaling: MongoDB's sharding capability allows handling large datasets and scaling as your application grows.
- Cloud-native Features: MongoDB Atlas provides fully managed services, including scaling, backups, and monitoring.
A video tutorial is available that walks through getting your free MongoDB Atlas cluster.
Copy the sample model configuration:

```shell
cp models_config.sample.yaml models_config.yaml
```

In `compose.env`, set the vector database configuration to point at your Atlas cluster:

```shell
VECTOR_DB_CONFIG='{"provider":"mongo","url":"mongodb+srv://username:password@clustername.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0", "config": {"database_name": "cognita"}}'
```

Then start Cognita with Docker Compose:

```shell
docker-compose --env-file compose.env --profile ollama --profile infinity up
```
Once you have set up Cognita, the following steps show how to use the UI to query documents:
1. Create Data Source
2. Create Collection
3. Upon creating a new collection, here is what happens behind the scenes:
```python
from pymongo.operations import SearchIndexModel

search_index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",
                "numDimensions": self.get_embedding_dimensions(embeddings),
                "similarity": "cosine",
            }
        ]
    },
    name="vector_search_index",
    type="vectorSearch",
)

# Create the search index
result = self.db[collection_name].create_search_index(model=search_index_model)
```
This ensures that the newly created collection is ready for vector search queries. Note that creating an index on MongoDB may take up to a minute.
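Once the index is built, similarity queries run through Atlas's `$vectorSearch` aggregation stage. Below is a minimal sketch of such a pipeline; the query vector, candidate counts, and projection are placeholders for illustration, and Cognita builds its own queries internally:

```python
query_vector = [0.12, 0.23, 0.31]  # embedding of the user's question (placeholder values)

pipeline = [
    {
        "$vectorSearch": {
            "index": "vector_search_index",  # name given when the index was created
            "path": "embedding",             # document field holding the vectors
            "queryVector": query_vector,
            "numCandidates": 100,            # candidates scanned before ranking
            "limit": 5,                      # top matches to return
        }
    },
    # Keep only the fields the generator needs, plus the similarity score
    {"$project": {"text": 1, "metadata": 1, "score": {"$meta": "vectorSearchScore"}}},
]

# Against a live cluster: results = db[collection_name].aggregate(pipeline)
```

The retrieved `text` and `metadata` fields are what ultimately get passed to the generative model as context.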
4. As soon as you create the collection, data ingestion begins; you can view its status by selecting your collection in the Collections tab. This step parses your files, chunks them, and adds them to your MongoDB collection. You can also add additional data sources later and index them into the collection. Move to the next step once the status is "Completed".
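Cognita's parsers handle the chunking step for you. As an illustration of the idea only, a naive fixed-size chunker with overlap might look like this (the sizes here are arbitrary, not Cognita's defaults):

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping chunks so context isn't lost at boundaries."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i : i + chunk_size] for i in range(0, len(text), step)]

chunks = chunk_text("a" * 500, chunk_size=200, overlap=50)
# Chunks start 150 characters apart, each spanning 200 characters (the last may be shorter).
```

Each chunk is then embedded and stored as a separate document, so retrieval can surface the most relevant passage rather than an entire file.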
This tutorial demonstrated how to build a production-ready RAG application using Cognita and MongoDB in just 10 minutes. Cognita's adaptability and ease of use, combined with MongoDB's flexible document model and native vector search, provide a robust foundation for building advanced AI applications.