Knowledge Base
A knowledge base is a batteries-included RAG system that you can create and insert data into, as well as query as if it were a table.
Internally, a knowledge base uses a vector store and an embedding model. By default, it uses the ChromaDB vector store and the OpenAI embedding model, which requires an OpenAI API key set as an environment variable. You can also specify your own vector store and embedding model, as in the examples below.
Syntax
The general syntax for creating a knowledge base accepts the following parameters, all of which are optional:
model: This is an embedding model created within MindsDB with CREATE MODEL embedding_model. If you do not provide the model parameter, OpenAI’s text-embedding-ada-002 model is used by default, provided that the OPENAI_API_KEY environment variable is defined.
storage: This is a vector database that stores embedded data. It defaults to ChromaDB, or you can connect your own vector store to MindsDB with CREATE DATABASE vector_database.
metadata_columns: The list of column names that will be stored in the metadata column of the knowledge base. If not set, the metadata column is not used.
content_columns: The list of column names that will be stored in the content column of the knowledge base. If not set, all columns are stored in the content column.
id_column: The name of the column that will be stored in the id column of the knowledge base to uniquely identify the data.
Each knowledge base comprises the id, content, and metadata columns that store data defined in the parameters.
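The parameters above can be combined as in the sketch below. The statement keyword CREATE KNOWLEDGE_BASE and the USING clause layout are assumptions that may differ between MindsDB versions, and every object and column name (my_kb, embedding_model, vector_database.storage_table, and the listed columns) is a placeholder.

```sql
-- Sketch only: all names are placeholders, and every parameter is optional.
CREATE KNOWLEDGE_BASE my_kb
USING
    model = embedding_model,                  -- embedding model created with CREATE MODEL
    storage = vector_database.storage_table,  -- table in a connected vector store
    metadata_columns = ['date', 'creator'],   -- stored in the metadata column
    content_columns = ['review', 'content'],  -- stored in the content column
    id_column = 'index';                      -- stored in the id column
```

Omitting the USING clause entirely would fall back to the defaults described earlier: ChromaDB for storage and the OpenAI embedding model.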
Examples
This section presents examples of how to create knowledge bases and insert data for storage in the form of embeddings.
Knowledge Base with OpenAI Embedding Model
Note that using OpenAI’s embedding model requires an OpenAI API key.
First, create an engine through which the model is accessed.
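A minimal sketch of this step follows. It assumes the engine is built from a handler named langchain_embedding, which is not stated on this page; the engine name embedding is a placeholder.

```sql
-- Assumption: the langchain_embedding handler is installed in your MindsDB instance.
CREATE ML_ENGINE embedding
FROM langchain_embedding;
```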
Next, create an embedding model, providing an OpenAI API key.
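One possible form of this statement is sketched below. It assumes an engine named embedding already exists; the parameter names class, openai_api_key, and input_columns are assumptions about the handler and should be checked against its reference before use.

```sql
-- Sketch: parameter names are assumed, and the key value is a placeholder.
CREATE MODEL embedding_model
PREDICT embeddings
USING
    engine = 'embedding',
    class = 'openai',
    openai_api_key = 'your-openai-api-key',
    input_columns = ['content'];
```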
Analyze the data that you want to insert into a knowledge base:
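For instance, a quick look at the first few rows might be sketched as follows, where my_datasource.home_rentals is a hypothetical table in a connected data source:

```sql
-- Hypothetical source table; replace with your own data source and table.
SELECT *
FROM my_datasource.home_rentals
LIMIT 3;
```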
Here is the output:
Decide which columns to use as content and which ones as metadata. For example, we use the days_on_market and neighborhood columns as metadata and the location and rental_price columns as content.
Now that you have an embedding model, create a knowledge base, passing this embedding model and defining the content and metadata columns.
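With the column split described above, the statement might look like this sketch. The knowledge base name rentals_kb is a placeholder, and embedding_model is assumed to be an existing embedding model.

```sql
-- Sketch: rentals_kb is a placeholder name.
CREATE KNOWLEDGE_BASE rentals_kb
USING
    model = embedding_model,
    content_columns = ['location', 'rental_price'],
    metadata_columns = ['days_on_market', 'neighborhood'];
```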
After successful creation of a knowledge base, insert data to store it in the form of embeddings.
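A sketch of the insert, assuming a knowledge base named rentals_kb and a hypothetical source table my_datasource.home_rentals that holds the four columns mentioned earlier:

```sql
-- Rows selected here are embedded and stored by the knowledge base.
INSERT INTO rentals_kb
SELECT location, rental_price, days_on_market, neighborhood
FROM my_datasource.home_rentals;
```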
Finally, you can verify that the data has been inserted into the knowledge base by querying it.
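Because a knowledge base can be queried as if it were a table, a plain SELECT is enough for this check. The name rentals_kb is a placeholder.

```sql
-- Returns the id, content, and metadata columns of the stored rows.
SELECT *
FROM rentals_kb
LIMIT 5;
```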
Here is the output:
Knowledge Base with Hugging Face Embedding Model
This example uses an open-source embedding model.
First, create an engine through which the model is accessed.
Next, create an embedding model.
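These two steps might be sketched as below. The handler name langchain_embedding, the class value HuggingFaceEmbeddings, and the input_columns parameter are assumptions not confirmed by this page; the engine and model names are placeholders.

```sql
-- Assumption: the langchain_embedding handler can load Hugging Face embedding classes.
CREATE ML_ENGINE hf_embedding_engine
FROM langchain_embedding;

-- No API key is passed here, since the model is open source.
CREATE MODEL hf_embedding_model
PREDICT embeddings
USING
    engine = 'hf_embedding_engine',
    class = 'HuggingFaceEmbeddings',
    input_columns = ['content'];
```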
Now that you have an embedding model, create a knowledge base, passing this embedding model.
After successful creation of a knowledge base, insert data to store it in the form of embeddings.
Finally, you can verify that the data has been inserted into the knowledge base by querying it.
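The remaining three steps can be sketched together. All names are placeholders: hf_embedding_model is assumed to be an existing embedding model and my_datasource.home_rentals a hypothetical source table. Since no content_columns are set, all inserted columns are stored in the content column.

```sql
-- Create the knowledge base with only the model parameter set.
CREATE KNOWLEDGE_BASE hf_kb
USING
    model = hf_embedding_model;

-- Insert data to be stored as embeddings.
INSERT INTO hf_kb
SELECT location, rental_price
FROM my_datasource.home_rentals;

-- Verify that the data has been inserted.
SELECT *
FROM hf_kb
LIMIT 5;
```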
Knowledge Base with Custom Vector Store
This example shows how to create a knowledge base with a custom vector database.
First, connect to your vector database. Here, it is ChromaDB.
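A sketch of a local ChromaDB connection follows. The connection name chromadb_dev and the directory path are placeholders, and the persist_directory parameter is an assumption about the ChromaDB handler to verify against its reference.

```sql
-- Sketch: a ChromaDB instance persisted to a local folder.
CREATE DATABASE chromadb_dev
WITH ENGINE = 'chromadb',
PARAMETERS = {
    "persist_directory": "chromadb/"
};
```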
Create an index in the vector store and insert one example point.
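One way to do this, sketched under the assumption that an embedding model named embedding_model exists and returns an embeddings column, is to create a table from a single embedded value. The index name world_news_index is a placeholder.

```sql
-- Embeds the text 'Test' and stores it as the first point in the new index.
CREATE TABLE chromadb_dev.world_news_index (
    SELECT content, embeddings
    FROM embedding_model
    WHERE content = 'Test'
);
```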
Next, create a knowledge base, passing this vector database connection.
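The storage parameter points at a table within the connected vector database. In this sketch, chromadb_dev.world_news_index and embedding_model are assumed to exist already, and world_news_kb is a placeholder name.

```sql
-- Sketch: embeddings are stored in the given ChromaDB index instead of the default store.
CREATE KNOWLEDGE_BASE world_news_kb
USING
    model = embedding_model,
    storage = chromadb_dev.world_news_index;
```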
Automate Data Sync with Knowledge Base
This example shows how to set up a job that inserts newly available data from your database into a knowledge base.
Here is how you can automate adding content to the knowledge base every time new data becomes available:
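A minimal sketch of such a job, assuming a knowledge base named world_news_kb and a hypothetical source table my_datasource.world_news with id and text columns. The job name and the hourly schedule are illustrative choices.

```sql
-- Runs every hour and inserts only the rows added since the previous run.
CREATE JOB keep_knowledge_base_up_to_date AS (
    INSERT INTO world_news_kb
    SELECT id, text AS content
    FROM my_datasource.world_news
    WHERE id > LAST
) EVERY hour;
```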
The LAST keyword enables the query to select only the newly added data. See the documentation on the LAST keyword to learn more.
Query the knowledge base as below.
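A sketch of such a query, assuming a knowledge base named world_news_kb. The search term is illustrative, and treating a filter on the content column as a semantic search rather than an exact match is an assumption to confirm for your MindsDB version.

```sql
-- Assumption: the content filter returns the stored rows most similar to the search text.
SELECT *
FROM world_news_kb
WHERE content = 'YouTube'
LIMIT 1;
```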