This documentation describes the integration of MindsDB with Anyscale Endpoints, a fast and scalable API to integrate OSS LLMs into apps. The integration allows for the deployment of Anyscale Endpoints models within MindsDB, providing the models with access to data from various data sources.

Prerequisites

Before proceeding, ensure the following prerequisites are met:

  1. Install MindsDB locally via Docker or Docker Desktop.
  2. To use Anyscale Endpoints within MindsDB, install the required dependencies following these instructions.
  3. Obtain the Anyscale Endpoints API key required to deploy and use Anyscale Endpoints models within MindsDB. Follow the instructions for obtaining the API key.

Setup

Create an AI engine from the Anyscale Endpoints handler.

CREATE ML_ENGINE anyscale_endpoints_engine
FROM anyscale_endpoints
USING
      anyscale_endpoints_api_key = 'api-key-value';
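
To confirm the engine was registered, you can list the available ML engines using MindsDB's standard syntax:

SHOW ML_ENGINES;

The newly created anyscale_endpoints_engine should appear in the output.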

Create a model using anyscale_endpoints_engine as an engine.

CREATE MODEL anyscale_endpoints_model
[FROM integration
         (SELECT * FROM table)]
PREDICT target_column
USING
      engine = 'anyscale_endpoints_engine',   -- engine name as created via CREATE ML_ENGINE
      api_base = 'base-url', -- optional, replaces the default base URL
      mode = 'conversational', -- optional, mode to run the model in
      model_name = 'anyscale_endpoints_model_name',  -- optional, the LLM to use
      prompt = 'You are a helpful assistant. Your task is to continue the chat.',  -- optional, system prompt for the model
      question_column = 'question',  -- optional, column name that stores user input
      context_column = 'context',  -- optional, column that stores context of the user input
      prompt_template = 'Answer the users input in a helpful way: {{question}}', -- optional, base template with placeholders used to provide input to the model 
      max_tokens = 100, -- optional, token limit for model output
      temperature = 0.3, -- optional, randomness setting for the model output
      json_struct = {
        'key': 'value',
        ...
      }; -- optional, the parameter for extracting JSON data from `prompt_template`
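
As a concrete illustration of the json_struct parameter, the sketch below creates a model that extracts structured fields from free text. The integration name, table name, column name, and extracted keys are hypothetical:

CREATE MODEL rental_extract_model
FROM my_integration
         (SELECT description FROM rentals)  -- hypothetical table with a text column
PREDICT json
USING
      engine = 'anyscale_endpoints_engine',
      json_struct = {
        'rental_price': 'rental price',
        'location': 'location'
      },
      prompt_template = 'Extract the requested details from the input text: {{description}}';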

It is possible to override certain parameters set for a model at prediction time instead of recreating the model. For example, to change the temperature parameter for a specific prediction, use the following query:

SELECT question, answer
FROM anyscale_endpoints_model
WHERE question = 'Where is Stockholm located?'
USING
      temperature = 0.9,
      prompt_template = 'Answer the users input as a pirate: {{question}}';

The parameters that can be overridden in this way are listed in the detailed explanation below.
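
To inspect the parameters a model was created with, and their current values, you can use MindsDB's DESCRIBE statement; a quick check, assuming the model name from the Setup section:

DESCRIBE anyscale_endpoints_model;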

The following is a more detailed explanation of the parameters used in the CREATE MODEL statement:

The implementation of this integration is based on the engine for the OpenAI API, as Anyscale conforms to it. There are a few notable differences, though:

  1. All models supported by Anyscale Endpoints are open source. A full list can be found here for inference-only under section Supported models.
  2. Not every model is supported for fine-tuning. You can find a list here under section Fine Tuning - Supported models.

Please check both lists regularly, as they are subject to change. If you try to fine-tune a model that is not supported, you will get a warning and subsequently an error from the Anyscale endpoint.

  3. This integration only offers chat-based text completion models, either for normal text or specialized for code.
  4. When providing a description, this integration returns the respective Hugging Face model card.
  5. Fine-tuning requires that your dataset complies with the chat format. That is, each row should contain a context and a role. The context is the text of the message in the chat, and the role is who authored it (system, user, or assistant, where the last one is the model). For more information, please check the fine-tuning guide in the Anyscale Endpoints docs.
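
Based on the chat-format requirement above, a fine-tuning call could look like the following sketch. It uses MindsDB's FINETUNE statement; files.chat_dataset and its columns are hypothetical placeholders for a dataset that complies with the chat format:

FINETUNE anyscale_endpoints_model
FROM files
         (SELECT role, context FROM chat_dataset);  -- hypothetical chat-formatted data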

The base URL for this API is https://api.endpoints.anyscale.com/v1.

Usage

The following usage examples utilize anyscale_endpoints_engine to create a model with the CREATE MODEL statement.

The output generated for a single input will be the same regardless of the mode used. The difference between the modes is in how the model handles multiple inputs.

files.unrelated_questions is a simple CSV file, uploaded to MindsDB, that contains a question column with simple (unrelated) questions, while files.related_questions is a similar file containing related questions. files.unrelated_questions_with_context and files.related_questions_with_context are similar files containing an additional context column.

These files are used in the examples given below to provide multiple inputs to the models created. It is possible to use any other supported data source in the same manner.
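
For example, batch predictions over one of these files can be made by joining the file with the model, following MindsDB's standard JOIN syntax. This is a sketch: output.answer assumes the model was created to predict a column named answer.

SELECT input.question, output.answer
FROM files.unrelated_questions AS input
JOIN anyscale_endpoints_model AS output;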

Next Steps

Follow this tutorial to see more use case examples.

Troubleshooting Guide

Authentication Error

  • Symptoms: Failure to authenticate to Anyscale Endpoints.
  • Checklist:
    1. Make sure that your Anyscale account is active.
    2. Confirm that your API key is correct.
    3. Ensure that your API key has not been revoked.
    4. Ensure that you have not exceeded the API usage or rate limit.

SQL statement cannot be parsed by mindsdb_sql

  • Symptoms: SQL queries failing or not recognizing table and model names containing spaces or special characters.
  • Checklist:
    1. Ensure table names with spaces or special characters are enclosed in backticks. Examples:
      • Incorrect:
        SELECT input.text, output.sentiment
        FROM integration.travel data AS input
        JOIN anyscale_endpoints_model AS output
        
      • Incorrect:
        SELECT input.text, output.sentiment
        FROM integration.'travel data' AS input
        JOIN anyscale_endpoints_model AS output
        
      • Correct:
        SELECT input.text, output.sentiment
        FROM integration.`travel data` AS input
        JOIN anyscale_endpoints_model AS output