LlamaIndex Handler

This documentation describes the integration of MindsDB with LlamaIndex, a framework for building context-augmented generative AI applications with LLMs.

Prerequisites

Before proceeding, ensure the following prerequisites are met:

  1. Install MindsDB locally via Docker or Docker Desktop.
  2. To use LlamaIndex within MindsDB, install the required dependencies by following the dependency installation instructions.
  3. Obtain the OpenAI API key required to use OpenAI LLMs. Follow the instructions for obtaining the API key.

Setup

Create an AI engine from the LlamaIndex handler.

CREATE ML_ENGINE llama_index
FROM llama_index
USING
      openai_api_key = 'api-key-value';
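
To confirm that the engine was registered, you can list the available ML engines. This is a minimal check that assumes the engine was created under the name llama_index, as in the statement above; the exact columns returned may vary between MindsDB versions.

```sql
-- List all ML engines; the llama_index engine created above should appear in the result.
SHOW ML_ENGINES;
```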

Create a model using llama_index as an engine and OpenAI as a model provider.

CREATE MODEL chatbot_model
PREDICT answer
USING
  engine = 'llama_index',  -- engine name as created via CREATE ML_ENGINE
  input_column = 'question',
  mode = 'conversational', -- optional
  user_column = 'question', -- optional: used only for conversational mode
  assistant_column = 'answer'; -- optional: used only for conversational mode
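
Model creation runs in the background, so it can help to check the model before querying it. The sketch below assumes the chatbot_model name used above; the result typically includes a status field, though the exact output layout depends on the MindsDB version.

```sql
-- Inspect the model created above; wait until it has finished building
-- before sending queries to it.
DESCRIBE chatbot_model;
```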

Usage

Here is how to create a model that answers questions by reading a page from the web:

CREATE MODEL qa_model
PREDICT answer
USING 
  engine = 'llama_index', 
  reader = 'SimpleWebPageReader',
  source_url_link = 'https://mindsdb.com/about',
  input_column = 'question';

Query the model to get an answer:

SELECT question, answer
FROM mindsdb.qa_model
WHERE question = "What is MindsDB's story?";

Here is the output:

+--------------------------+-------------------------------------------+
| question                 | answer                                    |
+--------------------------+-------------------------------------------+
| What is MindsDB's story? | MindsDB is a fast-growing open-source ... |
+--------------------------+-------------------------------------------+
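
To answer several questions at once, the model can be joined with a table of questions. The following is a sketch only: my_datasource.questions_table is a hypothetical table that is not part of this guide, and it is assumed to contain a question column matching the model's input_column.

```sql
-- Batch predictions: each row of the (hypothetical) questions table
-- is passed to qa_model, which returns one answer per question.
SELECT t.question, m.answer
FROM my_datasource.questions_table AS t  -- assumption: replace with your own table
JOIN mindsdb.qa_model AS m;
```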

Configuring SimpleWebPageReader for Specific Domains

When SimpleWebPageReader is used, it can be restricted to specific domains with the web_crawling_allowed_sites setting in the config.json file. The handler then reads and processes content only from the domains you specify, which gives you tighter security and control over web interactions.

To configure this, list the allowed domains under the web_crawling_allowed_sites key in config.json. For example:

"web_crawling_allowed_sites": [
    "https://docs.mindsdb.com",
    "https://another-allowed-site.com"
]
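
With the allowlist above in place, a model whose source URL falls under one of the listed sites should be accepted. The sketch below assumes that configuration; docs_qa_model is a hypothetical model name. A URL on a domain outside the list would presumably be refused, but the exact error behavior is not described in this guide.

```sql
-- Assumption: config.json allows only the two sites listed above.
CREATE MODEL docs_qa_model  -- hypothetical model name
PREDICT answer
USING
  engine = 'llama_index',
  reader = 'SimpleWebPageReader',
  source_url_link = 'https://docs.mindsdb.com',  -- matches an allowed site
  input_column = 'question';
```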

Next Steps

Go to the Use Cases section to see more examples.