This documentation describes the Retrieval Augmented Generation integration that can be used to create, train, and deploy models within MindsDB.

It supports the following:

Prerequisites

Before proceeding, ensure the following prerequisites are met:

  1. Install MindsDB locally via Docker or Docker Desktop.
  2. To use RAG within MindsDB, install the required dependencies following this instruction.
  3. Obtain the OpenAI or Writer API key.

Setup

Create an AI engine from the RAG handler. You can create a RAG engine using this command and providing either OpenAI or Writer parameters:

CREATE ML_ENGINE rag_engine
FROM rag
USING
    openai_api_key="openai-api-key", 
    writer_org_id="writer-org", -- optional if openai is used
    writer_api_key="writer-api-key"; -- optional if openai is used

Create a model using the rag_engine:

CREATE MODEL rag_model
PREDICT answer
USING
   engine = 'rag_engine',
   llm_type = 'openai',                          -- choose one of OpenAI or Writer
   url = 'link-to-webpage',                      -- this is optional if the FROM cluse is provided
   vector_store_folder_name = 'db_connection',   -- provide a folder name that will/stores vector db data
   input_column = 'question';                       -- provide column name that stores the input to the model

Where:

NameDescription
llm_typeIt defines which LLM is used.
urlIt is used to find the knowlege from a website.
vector_store_folder_nameIt is a folder name to which the vector db data in between sessions is persisted.
input_columnIt is a column name that stores the input to the model.

When creating a RAG model, it is required to provide data either in the url parameter or in the FROM clause.

Usage

The following examples illustrate various ways to integrate RAG with different data sources, including files, URLs, databases, and vector databases.

From a URL

The following example utilize rag_engine to create a model with the CREATE MODEL statement.

CREATE ML_ENGINE rag_engine
FROM rag
USING
    openai_api_key = 'sk-xxx';

Create a model using this engine:

CREATE MODEL mindsdb_rag_model
predict answer
USING
   engine = "rag_engine",
   llm_type = "openai",
   url='https://docs.mindsdb.com/what-is-mindsdb',
   vector_store_folder_name = 'db_connection',
   input_column = 'question'; 

Check the status of the model.

DESCRIBE mindsdb_rag_model;

Now you can use the model to answer your questions.

SELECT *
FROM mindsdb_rag_model
WHERE question = 'What ML use cases does MindsDB support?';

On execution, we get:

answersource_documentsquestion
MindsDB supports various machine learning use cases such as anomaly detection, forecasting, recommenders, classification, and regression. It also supports multimedia use cases such as video and text semantic search, text to audio, text to video, and text to image.{} What ML use cases does MindsDB support?

From Database

The following example utilize rag_engine to create a model with the CREATE MODEL statement and MySQL database as a knowlege base.

CREATE ML_ENGINE rag_engine
FROM rag
USING
    openai_api_key = 'sk-xxx';

Connect to MySQL database:

CREATE DATABASE mysql_demo_db
WITH ENGINE = 'mysql',
PARAMETERS = {
    "user": "user",
    "password": "MindsDBUser123!",
    "host": "samples.mindsdb.com",
    "port": "3306",
    "database": "public"
};

Create a model using this engine and include the FORM clause:

CREATE MODEL rag_handler_db
FROM mysql_demo_db 
    (SELECT * FROM demo_fda_context LIMIT 2)
PREDICT answer
USING
   engine="rag",
   llm_type="openai",
   vector_store_folder_name='test_db',
   input_column='question';

Now you can use the model to answer your questions.

SELECT *
FROM rag_handler_db
WHERE question='what product is best for treating a cold?';

On execution, we get:

answersource_documentsquestion
ShopRite Arthritis Pain Acetaminophen is not specifically designed for treating a cold. It may help to temporarily relieve minor aches and pains associated with a cold, but it is not the best product for treating a cold. It is always best to consult with a doctor or pharmacist for the most appropriate medication for treating a cold.{"column":["full_ingredients","indications_and_usage","intended_purpose_of_product","active_ingredient"],"sources_content":["ShopRite Arthritis ..."],"sources_document":["dataframe"],"sources_row":[1,1,1,1]}what product is best for treating a cold?

From File

The following example utilize rag_engine to create a model with the CREATE MODEL statement and uploaded file as a knowlege base.

CREATE ML_ENGINE rag_engine
FROM rag
USING
    openai_api_key = 'sk-xxx';

Upload a file using the GUI Upload File option. Create a model using this engine and include the FORM clause:

CREATE MODEL rag_handler_files
FROM files 
    (SELECT * FROM uploaded_file)
PREDICT answer
USING
   engine="rag",
   llm_type="openai",
   vector_store_folder_name='test_db',
   input_column='question';

Now you can use the model to answer your questions.

SELECT *
FROM rag_handler_files
WHERE question='what product is best for treating a cold?';

On execution, we get:

answersource_documentsquestion
ShopRite Arthritis Pain Acetaminophen is not specifically designed for treating a cold. It may help to temporarily relieve minor aches and pains associated with a cold, but it is not the best product for treating a cold. It is always best to consult with a doctor or pharmacist for the most appropriate medication for treating a cold.{"column":["full_ingredients","indications_and_usage","intended_purpose_of_product","active_ingredient"],"sources_content":["ShopRite Arthritis ..."],"sources_document":["dataframe"],"sources_row":[1,1,1,1]}what product is best for treating a cold?