OpenAI

This documentation describes the integration of MindsDB with OpenAI, an AI research organization known for developing AI models like GPT-3 and GPT-4. The integration allows for the deployment of OpenAI models within MindsDB, providing the models with access to data from various data sources.

Prerequisites

Before proceeding, ensure the following prerequisites are met:

Install MindsDB locally via Docker or Docker Desktop.
To use OpenAI within MindsDB, install the required dependencies by following the dependency installation instructions in the MindsDB documentation.
Obtain the OpenAI API key required to deploy and use OpenAI models within MindsDB. Follow the instructions for obtaining the API key.

Setup

Create an AI engine from the OpenAI handler.

CREATE ML_ENGINE openai_engine
FROM openai
USING
      openai_api_key = 'api-key-value';
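
To confirm that the engine was registered, one option is to list the available ML engines. This is a minimal sketch that assumes the SHOW ML_ENGINES statement is available in your MindsDB version; openai_engine should appear in the result if the CREATE ML_ENGINE statement succeeded.

```sql
-- List the registered ML engines; look for openai_engine in the output.
SHOW ML_ENGINES;
```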

Create a model using openai_engine as an engine.

CREATE MODEL openai_model
PREDICT target_column
USING
      engine = 'openai_engine',  -- engine name as created via CREATE ML_ENGINE
      api_base = 'base-url', -- optional, replaces the default base URL
      mode = 'mode_name', -- optional, mode to run the model in
      model_name = 'openai_model_name',  -- optional with default value of gpt-3.5-turbo
      question_column = 'question',  -- optional, column name that stores user input
      context_column = 'context',  -- optional, column that stores context of the user input
      prompt_template = 'input message to the model here', -- optional, user provides instructions to the model here
      user_column = 'user_input', -- optional, stores user input
      assistant_column = 'conversation_context', -- optional, stores conversation context
      prompt = 'instruction to the model', -- optional, stores instruction to the model
      max_tokens = 100, -- optional, token limit for answer
      temperature = 0.3, -- optional, sampling temperature
      json_struct = {
        'key': 'value',
        ...
      };
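
Model creation may take a moment. As a rough sketch, the DESCRIBE statement (also used later in this document with the OpenRouter model) can be used to inspect the model before querying it. The exact columns returned depend on the MindsDB version, so treat the comment about the status field as an assumption.

```sql
-- Inspect the model's metadata. Assumption: the output includes a status
-- field that indicates when the model is ready to be queried.
DESCRIBE openai_model;
```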

If you want to update the prompt_template parameter, you do not have to recreate the model. Instead, you can override the prompt_template parameter at prediction time like this:

SELECT question, answer
FROM openai_model
WHERE question = 'input question here'
USING prompt_template = 'input new message to the model here';

The following parameters are available to use when creating an OpenAI model:

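engine: the engine name, as created via CREATE ML_ENGINE.
api_base: optional, replaces the default base URL.
mode: optional, the mode to run the model in.
model_name: optional, defaults to gpt-3.5-turbo.
question_column: optional, the column that stores user input.
context_column: optional, the column that stores the context of the user input.
prompt_template: optional, the instructions to the model.
max_tokens: optional, the token limit for the answer.
temperature: optional, the sampling temperature.
json_struct: optional, a JSON structure given as key-value pairs.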

Usage with OpenAI-Compatible APIs

The OpenAI handler can be used with any OpenAI-compatible API by setting the api_base parameter to the base URL of that API. Here is an example of how to use the OpenAI handler with OpenRouter, an OpenAI-compatible interface for accessing LLMs.

CREATE MODEL openrouter_model
PREDICT answer
USING
  engine = 'openai_engine',
  api_base = 'https://openrouter.ai/api/v1',
  openai_api_key = 'openrouter-api-key',
  model_name = 'mistralai/devstral-small-2505',
  prompt_template = 'answer a question: {{question}}';

DESCRIBE openrouter_model;

SELECT * FROM openrouter_model
WHERE question = 'how many planets are in the solar system?';

When using an OpenAI-compatible API, you must provide both the base URL in the api_base parameter and the API key in the openai_api_key parameter.

Usage

Here are the combinations of parameters for creating a model:

Provide a prompt_template alone.
Provide a question_column and optionally a context_column.
Provide a prompt, user_column, and assistant_column to create a model in the conversational mode.

The following usage examples utilize openai_engine to create a model with the CREATE MODEL statement.

Answering questions without context

Here is how to create a model that answers questions without context.

CREATE MODEL openai_model
PREDICT answer
USING
    engine = 'openai_engine',
    question_column = 'question';

Query the model to get predictions.

SELECT question, answer
FROM openai_model
WHERE question = 'Where is Stockholm located?';

Here is the output:

+---------------------------+-------------------------------+
|question                   |answer                         |
+---------------------------+-------------------------------+
|Where is Stockholm located?|Stockholm is located in Sweden.|
+---------------------------+-------------------------------+
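
The same model can also answer many questions at once by joining it with a table, using the JOIN form that appears in the Troubleshooting Guide below. The following is a sketch only: my_datasource and its questions table are hypothetical names, and the table is assumed to contain a question column that matches the model's question_column.

```sql
-- Hypothetical data source and table; replace with your own.
-- Each row's question column is passed to the model, which returns an answer.
SELECT input.question, output.answer
FROM my_datasource.questions AS input
JOIN openai_model AS output;
```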

Answering questions with context

Here is how to create a model that answers questions with context.

CREATE MODEL openai_model
PREDICT answer
USING
    engine = 'openai_engine',
    question_column = 'question',
    context_column = 'context';

Query the model to get predictions.

SELECT context, question, answer
FROM openai_model
WHERE context = 'Answer accurately'
AND question = 'How many planets exist in the solar system?';

On execution, we get:

+-------------------+-------------------------------------------+----------------------------------------------+
|context            |question                                   |answer                                        |
+-------------------+-------------------------------------------+----------------------------------------------+
|Answer accurately  |How many planets exist in the solar system?| There are eight planets in the solar system. |
+-------------------+-------------------------------------------+----------------------------------------------+

Prompt completion

Here is how to create a model that offers the most flexible mode of operation. It answers any query provided in the prompt_template parameter.

Good prompts are the key to getting great completions out of large language models like the ones that OpenAI offers. For best performance, we recommend you read their prompting guide before trying your hand at prompt templating.

Let’s look at an example that reuses the openai_model model created earlier and overrides parameters at prediction time.

SELECT instruction, answer
FROM openai_model
WHERE instruction = 'Speculate extensively'
USING
    prompt_template = '{{instruction}}. What does Tom Hanks like?',
    max_tokens = 100,
    temperature = 0.5;

On execution, we get:

+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|instruction           |answer                                                                                                                                                                                                                         |
+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|Speculate extensively |Some people speculate that Tom Hanks likes to play golf, while others believe that he enjoys acting and directing. It is also speculated that he likes to spend time with his family and friends, and that he enjoys traveling.|
+----------------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
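
Instead of overriding parameters at prediction time, the template can be set when the model is created, which corresponds to the "provide a prompt_template alone" combination listed above. This is a sketch under that assumption; openai_prompt_model is a hypothetical model name, and the parameter values are copied from the query above.

```sql
-- Hypothetical model name; the template and limits are fixed at creation time.
CREATE MODEL openai_prompt_model
PREDICT answer
USING
    engine = 'openai_engine',
    prompt_template = '{{instruction}}. What does Tom Hanks like?',
    max_tokens = 100,
    temperature = 0.5;

-- The instruction column fills the {{instruction}} placeholder in the template.
SELECT instruction, answer
FROM openai_prompt_model
WHERE instruction = 'Speculate extensively';
```

With this approach the query no longer needs a USING clause, although parameters such as prompt_template can still be overridden at prediction time as described in the Setup section.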

Conversational mode

Here is how to create a model in the conversational mode.

CREATE MODEL openai_chat_model
PREDICT response
USING
  engine = 'openai_engine',
  mode = 'conversational',
  model_name = 'gpt-3.5-turbo',
  user_column = 'user_input',
  assistant_column = 'conversation_history',
  prompt = 'Answer the question in a helpful way.';

And here is how to query this model:

SELECT response
FROM openai_chat_model
WHERE user_input = '<question>'
AND conversation_history = '<optionally, provide the context for the question>';

Next Steps

Follow this tutorial on sentiment analysis and this tutorial on finetuning OpenAI models to see more use case examples.

Troubleshooting Guide

Authentication Error

Symptoms: Failure to authenticate to the OpenAI API.
Checklist:
1. Make sure that your OpenAI account is active.
2. Confirm that your API key is correct.
3. Ensure that your API key has not been revoked.
4. Ensure that you have not exceeded the API usage or rate limit.

SQL statement cannot be parsed by mindsdb_sql

Symptoms: SQL queries failing or not recognizing table and model names containing spaces or special characters.

Checklist:

Ensure table names with spaces or special characters are enclosed in backticks. Examples:

Incorrect:

SELECT input.text, output.sentiment
FROM integration.travel data AS input
JOIN openai_engine AS output

Incorrect:

SELECT input.text, output.sentiment
FROM integration.'travel data' AS input
JOIN openai_engine AS output

Correct:

SELECT input.text, output.sentiment
FROM integration.`travel data` AS input
JOIN openai_engine AS output
