# Build an Application Handler

Source: https://docs.mindsdb.com/contribute/app-handlers

In this section, you'll find how to add new application integrations to MindsDB.

**Prerequisite**

You should have the latest version of the MindsDB repository installed locally. Follow [this guide](/contribute/install/) to learn how to install MindsDB for development.

## What are API Handlers?

Application handlers act as a bridge between MindsDB and any application that provides APIs. You use application handlers to create databases using the [`CREATE DATABASE`](/sql/create/databases/) statement. So you can reach data from any application that has its handler implemented within MindsDB.

**Database Handlers**

To learn more about handlers and how to implement a database handler, visit our [doc page here](/contribute/data-handlers/).

## Creating an Application Handler

You can create your own application handler within MindsDB by inheriting from the [`APIHandler`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L150) class.

By providing the implementation for some or all of the methods contained in the [`APIHandler`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L150) class, you can interact with the application APIs.

### Core Methods

Apart from the `__init__()` method, there are five core methods that must be implemented. We recommend checking actual examples in the codebase to get an idea of what goes into each of these methods, as they can change a bit depending on the nature of the system being integrated.

Let's review the purpose of each method.

| Method | Purpose |
| ------ | ------- |
| [`_register_table()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L164) | It registers the data resource in memory. For example, if you are using the Twitter API, it registers the `tweets` resource from `/api/v2/tweets`. |
| [`connect()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L23) | It performs the necessary steps to connect/authenticate to the underlying system. |
| [`check_connection()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L39) | It evaluates if the connection is alive and healthy. |
| [`native_query()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L47) | It parses any *native* statement string and acts upon it (for example, raw syntax commands). |
| `call_application_api()` | It calls the application API and maps the data to a pandas DataFrame. This method handles the pagination and data mapping. |

Authors can opt for adding private methods, new files and folders, or any combination of these to structure all the necessary work that will enable the core methods to work as intended.

**Other Common Methods**

Under the [`mindsdb.integrations.utilities`](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/utilities) library, contributors can find various methods that may be useful while implementing new handlers.
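To make the division of labor between these methods concrete, below is a minimal, illustrative sketch of a handler for a hypothetical REST service. The service URL, the `api_key` parameter, and the `posts` resource are assumptions made for demonstration only; the real handlers in the codebase are the authoritative reference.

```py theme={null}
import requests
import pandas as pd

from mindsdb.integrations.libs.api_handler import APIHandler
from mindsdb.integrations.libs.response import HandlerStatusResponse


class ExampleServiceHandler(APIHandler):
    """Sketch of a handler for a hypothetical REST service."""

    def __init__(self, name: str, connection_data: dict = None, **kwargs):
        super().__init__(name)
        self.connection_data = connection_data or {}
        self.session = None
        self.is_connected = False
        # Each resource is registered as a table, e.g.:
        # self._register_table('posts', PostsTable(self))
        # (PostsTable is an APITable subclass; see the next section.)

    def connect(self):
        # Keep a session that carries the API key for all later calls.
        self.session = requests.Session()
        self.session.headers['Authorization'] = f"Bearer {self.connection_data.get('api_key')}"
        self.is_connected = True
        return self.check_connection()

    def check_connection(self) -> HandlerStatusResponse:
        try:
            if self.session is None:
                self.connect()
            response = self.session.get('https://api.example.com/health')
            response.raise_for_status()
            return HandlerStatusResponse(success=True)
        except Exception as e:
            return HandlerStatusResponse(success=False, error_message=str(e))

    def call_application_api(self, method_name: str = None, params: dict = None) -> pd.DataFrame:
        # Page through the API and map the accumulated records to a DataFrame.
        rows, page = [], 1
        while True:
            response = self.session.get(
                f'https://api.example.com/{method_name}',
                params={**(params or {}), 'page': page},
            )
            batch = response.json()
            if not batch:
                break
            rows.extend(batch)
            page += 1
        return pd.DataFrame(rows)
```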
### API Table

Once the data returned from the API call is registered using the [`_register_table()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L164) method, you can map it to the [`APITable`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L93) class.

The [`APITable`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L93) class provides CRUD methods.

| Method | Purpose |
| --------------- | ------- |
| `select()` | It implements the mappings from `ast.Select` and calls the actual API through `call_application_api()`. |
| `insert()` | It implements the mappings from `ast.Insert` and calls the actual API through `call_application_api()`. |
| `update()` | It implements the mappings from `ast.Update` and calls the actual API through `call_application_api()`. |
| `delete()` | It implements the mappings from `ast.Delete` and calls the actual API through `call_application_api()`. |
| `add()` | It adds new rows to the data dictionary. |
| `list()` | It lists data based on certain conditions by providing `FilterCondition`, limits, sorting, and target fields. |
| `get_columns()` | It maps the data columns returned by the API. |

### Implementation

Each application handler should inherit from the [`APIHandler`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L150) class.

Here is a step-by-step guide:

* Implementing the [`__init__()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/api_handler.py#L155) method:

This method initializes the handler.

```py theme={null}
def __init__(self, name: str):
    """ constructor
    Args:
        name (str): the handler name
    """
    super().__init__(name)
    self._tables = {}
```

* Implementing the [`connect()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L23) method:

The `connect()` method sets up the connection.

```py theme={null}
def connect(self) -> HandlerStatusResponse:
    """ Set up any connections required by the handler
    Should return output of check_connection() method after attempting connection.
    Should switch self.is_connected.
    Returns:
        HandlerStatusResponse
    """
```

* Implementing the [`check_connection()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L39) method:

The `check_connection()` method performs the health check for the connection.

```py theme={null}
def check_connection(self) -> HandlerStatusResponse:
    """ Check connection to the handler
    Returns:
        HandlerStatusResponse
    """
```

* Implementing the [`native_query()`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L47) method:

The `native_query()` method runs commands of the native API syntax.

```py theme={null}
def native_query(self, query: Any) -> HandlerResponse:
    """Receive raw query and act upon it somehow.
    Args:
        query (Any): query in native format (str for sql databases, api's json etc)
    Returns:
        HandlerResponse
    """
```

* Implementing the `call_application_api()` method:

This method makes the API calls. It is **not mandatory** to implement this method, but it can help make the code more reliable and readable.

```py theme={null}
def call_application_api(self, method_name: str = None, params: dict = None) -> DataFrame:
    """Call the application API and map the response to a DataFrame.
    Args:
        method_name (str): name of the API method/resource to call
        params (dict): parameters to pass with the API call
    Returns:
        DataFrame
    """
```
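Before moving on to the connection arguments, here is a minimal, hedged sketch of an `APITable` subclass that maps a `SELECT` query onto the handler's `call_application_api()` method. The `PostsTable` name, the `posts` resource, and the column list are illustrative assumptions; real handlers in the repository show the exact utilities and patterns they rely on.

```py theme={null}
import pandas as pd

from mindsdb.integrations.libs.api_handler import APITable
from mindsdb.integrations.utilities.sql_utils import extract_comparison_conditions


class PostsTable(APITable):
    """Sketch: maps SQL queries on a hypothetical `posts` resource to API calls."""

    def select(self, query) -> pd.DataFrame:
        # Translate simple WHERE equality conditions (e.g. author = 'x')
        # into API request parameters.
        params = {}
        for op, arg1, arg2 in extract_comparison_conditions(query.where):
            if op == '=':
                params[arg1] = arg2

        # Delegate the HTTP work (including pagination) to the handler.
        df = self.handler.call_application_api('posts', params)
        return df[self.get_columns()] if len(df) > 0 else df

    def get_columns(self) -> list:
        # Columns this table exposes to the SQL layer.
        return ['id', 'author', 'title', 'created_at']
```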
### Exporting the `connection_args` Dictionary

The `connection_args` dictionary contains all of the arguments used to establish the connection along with their descriptions, types, labels, and whether they are required or not. The `connection_args` dictionary should be stored in the `connection_args.py` file inside the handler folder.

The `connection_args` dictionary is stored in a separate file in order to be able to hide sensitive information such as passwords or API keys. By default, when querying for `connection_data` from the `information_schema.databases` table, all sensitive information is hidden. To unhide it, use this command:

```sql theme={null}
set show_secrets=true;
```

Here is an example of the `connection_args.py` file from the [GitHub handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/github_handler) where the API key value is set to hidden with `"secret": True`.

```py theme={null}
from collections import OrderedDict

from mindsdb.integrations.libs.const import HANDLER_CONNECTION_ARG_TYPE as ARG_TYPE

connection_args = OrderedDict(
    repository={
        "type": ARG_TYPE.STR,
        "description": "GitHub repository name.",
        "required": True,
        "label": "Repository",
    },
    api_key={
        "type": ARG_TYPE.PWD,
        "description": "Optional GitHub API key to use for authentication.",
        "required": False,
        "label": "Api key",
        "secret": True
    },
    github_url={
        "type": ARG_TYPE.STR,
        "description": "Optional GitHub URL to connect to a GitHub Enterprise instance.",
        "required": False,
        "label": "Github url",
    },
)

connection_args_example = OrderedDict(
    repository="mindsdb/mindsdb",
    api_key="ghp_xxx",
    github_url="https://github.com/mindsdb/mindsdb"
)
```

### Exporting All Required Variables

The following should be exported in the `__init__.py` file of the handler:

* The `Handler` class.
* The `version` of the handler.
* The `name` of the handler.
* The `type` of the handler, either `DATA` handler or `ML` handler.
* The `icon_path` to the file with the database icon.
* The `title` of the handler or a short description.
* The `description` of the handler.
* The `connection_args` dictionary with the connection arguments.
* The `connection_args_example` dictionary with an example of the connection arguments.
* The `import_error` message that is used if the import of the `Handler` class fails.

A few of these variables are defined in another file called `__about__.py`. This file is imported into the `__init__.py` file.

Here is an example of the `__init__.py` file for the [GitHub handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/github_handler).
```py theme={null}
from mindsdb.integrations.libs.const import HANDLER_TYPE

from .__about__ import __version__ as version, __description__ as description
from .connection_args import connection_args, connection_args_example

try:
    from .github_handler import (
        GithubHandler as Handler,
        connection_args_example,
        connection_args,
    )

    import_error = None
except Exception as e:
    Handler = None
    import_error = e

title = "GitHub"
name = "github"
type = HANDLER_TYPE.DATA
icon_path = "icon.svg"

__all__ = [
    "Handler",
    "version",
    "name",
    "type",
    "title",
    "description",
    "import_error",
    "icon_path",
    "connection_args_example",
    "connection_args",
]
```

The `__about__.py` file for the same [GitHub handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/github_handler) contains the following variables:

```py theme={null}
__title__ = "MindsDB GitHub handler"
__package_name__ = "mindsdb_github_handler"
__version__ = "0.0.1"
__description__ = "MindsDB handler for GitHub"
__author__ = "Artem Veremey"
__github__ = "https://github.com/mindsdb/mindsdb"
__pypi__ = "https://pypi.org/project/mindsdb/"
__license__ = "MIT"
__copyright__ = "Copyright 2023 - mindsdb"
```

## Check out our Application Handlers!

To see some integration handlers that are currently in use, we encourage you to check out the following handlers inside the MindsDB repository:

* [GitHub handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/github_handler)
* [Twitter handler](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/twitter_handler)

And here are [all the handlers available in the MindsDB repository](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers).

# Join our Community

Source: https://docs.mindsdb.com/contribute/community

If you have questions or you want to chat with the MindsDB core team or other community members, you can join our [Slack workspace](https://mindsdb.com/joincommunity).

## MindsDB Newsletter

To get updates on MindsDB's latest announcements, releases, and events, [sign up for our newsletter](https://mindsdb.com/newsletter/).

## Contact Us

If you are interested in MindsDB for large-scale projects, contact us by submitting [this form](https://mindsdb.com/contact-us/).

# How to Contribute to MindsDB

Source: https://docs.mindsdb.com/contribute/contribute

Thank you for your interest in contributing to MindsDB. MindsDB is free, open-source software and all types of contributions are welcome, whether they’re documentation changes, bug reports, bug fixes, or new source code changes.

In order to contribute to MindsDB:

* fork the MindsDB GitHub repository,
* [install MindsDB locally](/contribute/install),
* implement and test your changes,
* **push your changes to the `develop` branch**.

1. Fork the MindsDB repository from [MindsDB GitHub](https://github.com/mindsdb/mindsdb).

2. Clone the MindsDB repository locally from your fork and go inside the repository folder.

```bash theme={null}
cd /path/mindsdb-repo-folder-name
```

3. Fetch all other branches from the MindsDB repository with these commands:

```bash theme={null}
git remote add upstream https://github.com/mindsdb/mindsdb
git fetch upstream
```

4. Switch to the `develop` branch.

```bash theme={null}
git checkout develop
```

5. Create a new branch for your changes from the `develop` branch.

```bash theme={null}
git checkout -b new-branch-name
```

6. Make your changes on this branch.

7. Commit and push your changes to GitHub.
```bash theme={null}
git add *
git commit -m "commit message"
git push --set-upstream origin new-branch-name
```

8. Go to GitHub and create a PR to the `develop` branch of the MindsDB repository.

## MindsDB Release Process

The `main` branch of the [MindsDB repository](https://github.com/mindsdb/mindsdb) contains the latest stable version of MindsDB and represents the GA (General Availability) release. Learn more about [MindsDB release types here](/releases).

MindsDB follows the [Gitflow branching model](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow) to manage development and releases as follows.

All code changes are first committed to the `develop` branch. When a release is approaching, a short-lived `release` branch is created from the `develop` branch.

* This branch is used for final testing and validation.
* Pre-GA artifacts are built at this stage, including both the Python package and the Docker image, and shared for broader testing and feedback.

After successful testing and validation:

* The `release` branch is merged into the `main` branch, making it an official GA release.
* The final GA versions of the Python package and Docker image are released, while the pre-GA versions are removed.

## Contributor Testing Requirements

As a contributor, you are responsible for writing the code according to the [Python Coding Standards](/contribute/python-coding-standards) and thoroughly testing all features or fixes that you implement before they are merged into the `develop` branch.

### Feature Branch Testing

Before merging your changes, the following types of testing must be completed to validate your work in isolation:

* Unit Tests

  Verify that individual components or functions behave as expected during development.

* Integration Tests

  Ensure that your new code works correctly with existing functionality and doesn't introduce regressions.

### Post-Release Testing

After a release that includes your features or fixes is published, contributors are encouraged to:

* Test their changes in the released environment, and
* Report any issues or unexpected behavior that may arise.

# Build a Database Handler

Source: https://docs.mindsdb.com/contribute/data-handlers

In this section, you'll find how to add new integrations/databases to MindsDB.

**Prerequisite**

You should have the latest version of the MindsDB repository installed locally. Follow [this guide](/contribute/install/) to learn how to install MindsDB for development.

## What are Database Handlers?

Database handlers act as a bridge to any database. You use database handlers to create databases using [the `CREATE DATABASE` command](/sql/create/databases/). So you can reach data from any database that has its handler implemented within MindsDB.

## Creating a Database Handler

You can create your own database handler within MindsDB by inheriting from the [`DatabaseHandler`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L102) class.

By providing the implementation for some or all of the methods contained in the `DatabaseHandler` class, you can connect with the database of your choice.

### Core Methods

Apart from the `__init__()` method, there are seven core methods that must be implemented. We recommend checking actual examples in the codebase to get an idea of what goes into each of these methods, as they can change a bit depending on the nature of the system being integrated.

Let's review the purpose of each method.
| Method | Purpose |
| -------------------- | ------- |
| `connect()` | It performs the necessary steps to connect to the underlying system. |
| `disconnect()` | It gracefully closes connections established in the `connect()` method. |
| `check_connection()` | It evaluates if the connection is alive and healthy. This method is called frequently. |
| `native_query()` | It parses any *native* statement string and acts upon it (for example, raw SQL commands). |
| `query()` | It takes a parsed SQL command in the form of an abstract syntax tree and executes it. |
| `get_tables()` | It lists and returns all the available tables. Each handler decides what a *table* means for the underlying system when interacting with it from the data layer. Typically, these are actual tables. |
| `get_columns()` | It returns columns of a table registered in the handler with the respective data type. |

Authors can opt for adding private methods, new files and folders, or any combination of these to structure all the necessary work that will enable the core methods to work as intended.

**Other Common Methods**

Under the `mindsdb.integrations.libs.utils` library, contributors can find various methods that may be useful while implementing new handlers.

Also, there are wrapper classes for the `DatabaseHandler` instances called [HandlerResponse](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/response.py#L7) and [HandlerStatusResponse](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/response.py#L32). You should use them to ensure proper output formatting.
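To see how these pieces fit together before walking through each method, here is a compact, illustrative sketch built on a generic DB-API driver. The `exampledb_driver` import is a hypothetical placeholder, and the `str(query)` rendering shortcut is an assumption; the step-by-step guide below and the real handlers in the repository show the production patterns.

```py theme={null}
import pandas as pd

from mindsdb.integrations.libs.base import DatabaseHandler
from mindsdb.integrations.libs.response import (
    HandlerResponse,
    HandlerStatusResponse,
    RESPONSE_TYPE,
)


class ExampleDBHandler(DatabaseHandler):
    name = 'exampledb'

    def __init__(self, name: str, connection_data: dict = None, **kwargs):
        super().__init__(name)
        self.connection_data = connection_data or {}
        self.connection = None
        self.is_connected = False

    def connect(self):
        import exampledb_driver  # hypothetical DB-API driver, for illustration only
        self.connection = exampledb_driver.connect(**self.connection_data)
        self.is_connected = True
        return self.check_connection()

    def disconnect(self):
        if self.is_connected:
            self.connection.close()
            self.is_connected = False

    def check_connection(self) -> HandlerStatusResponse:
        try:
            if not self.is_connected:
                self.connect()
            return HandlerStatusResponse(success=True)
        except Exception as e:
            return HandlerStatusResponse(success=False, error_message=str(e))

    def native_query(self, query: str) -> HandlerResponse:
        # Run a raw SQL string and wrap the result for MindsDB.
        cursor = self.connection.cursor()
        cursor.execute(query)
        columns = [desc[0] for desc in cursor.description]
        df = pd.DataFrame(cursor.fetchall(), columns=columns)
        return HandlerResponse(RESPONSE_TYPE.TABLE, data_frame=df)

    def query(self, query) -> HandlerResponse:
        # Real handlers render the ASTNode into the target SQL dialect here;
        # for this sketch we assume str(query) yields valid SQL.
        return self.native_query(str(query))

    def get_tables(self) -> HandlerResponse:
        # Delegate to the (hypothetical) database's catalog command.
        return self.native_query('SHOW TABLES;')
```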
### Implementation

Each database handler should inherit from the [`DatabaseHandler`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/libs/base.py#L102) class.

Here is a step-by-step guide:

* Setting the `name` class property:

MindsDB uses it internally as the name of the handler. For example, the `CREATE DATABASE` statement uses the handler's name.

```sql theme={null}
CREATE DATABASE integration_name
WITH ENGINE = 'postgres',    --- here, the handler's name is `postgres`
PARAMETERS = {
  'host': '127.0.0.1',
  'user': 'root',
  'password': 'password'
};
```

* Implementing the `__init__()` method:

This method initializes the handler. The `connection_data` argument contains the `PARAMETERS` from the `CREATE DATABASE` statement, such as `user`, `password`, etc.

```py theme={null}
def __init__(self, name: str, connection_data: Optional[dict]):
    """ constructor
    Args:
        name (str): the handler name
    """
```

* Implementing the `connect()` method:

The `connect()` method sets up the connection.

```py theme={null}
def connect(self) -> HandlerStatusResponse:
    """ Set up any connections required by the handler
    Should return the output of check_connection() method after attempting connection.
    Should switch self.is_connected.
    Returns:
        HandlerStatusResponse
    """
```

* Implementing the `disconnect()` method:

The `disconnect()` method closes the existing connection.

```py theme={null}
def disconnect(self):
    """ Close any existing connections
    Should switch self.is_connected.
    """
```

* Implementing the `check_connection()` method:

The `check_connection()` method performs the health check for the connection.

```py theme={null}
def check_connection(self) -> HandlerStatusResponse:
    """ Check connection to the handler
    Returns:
        HandlerStatusResponse
    """
```

* Implementing the `native_query()` method:

The `native_query()` method runs commands of the native database language.

```py theme={null}
def native_query(self, query: Any) -> HandlerResponse:
    """Receive raw query and act upon it somehow.
    Args:
        query (Any): query in native format (str for sql databases, etc)
    Returns:
        HandlerResponse
    """
```

* Implementing the `query()` method:

The `query()` method runs parsed SQL commands.

```py theme={null}
def query(self, query: ASTNode) -> HandlerResponse:
    """Receive query as AST (abstract syntax tree) and act upon it somehow.
    Args:
        query (ASTNode): sql query represented as AST. May be any kind of query: SELECT, INSERT, DELETE, etc
    Returns:
        HandlerResponse
    """
```

* Implementing the `get_tables()` method:

The `get_tables()` method lists all the available tables.

```py theme={null}
def get_tables(self) -> HandlerResponse:
    """ Return list of entities
    Return a list of entities that will be accessible as tables.
    Returns:
        HandlerResponse: should have the same columns as information_schema.tables
            (https://dev.mysql.com/doc/refman/8.0/en/information-schema-tables-table.html)
            Column 'TABLE_NAME' is mandatory, other is optional.
    """
```

* Implementing the `get_columns()` method:

The `get_columns()` method lists all columns of a specified table.

```py theme={null}
def get_columns(self, table_name: str) -> HandlerResponse:
    """ Returns a list of entity columns
    Args:
        table_name (str): name of one of tables returned by self.get_tables()
    Returns:
        HandlerResponse: should have the same columns as information_schema.columns
            (https://dev.mysql.com/doc/refman/8.0/en/information-schema-columns-table.html)
            Column 'COLUMN_NAME' is mandatory, other is optional.
            Highly recommended to define also 'DATA_TYPE': it should be one of
            python data types (by default it is str).
    """
```

### Exporting the `connection_args` Dictionary

The `connection_args` dictionary contains all of the arguments used to establish the connection along with their descriptions, types, labels, and whether they are required or not. The `connection_args` dictionary should be stored in the `connection_args.py` file inside the handler folder.

The `connection_args` dictionary is stored in a separate file in order to be able to hide sensitive information such as passwords or API keys. By default, when querying for `connection_data` from the `information_schema.databases` table, all sensitive information is hidden. To unhide it, use this command:

```sql theme={null}
set show_secrets=true;
```

Here is an example of the `connection_args.py` file from the [MySQL handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/mysql_handler) where the password value is set to hidden with `'secret': True`.

```py theme={null}
from collections import OrderedDict

from mindsdb.integrations.libs.const import HANDLER_CONNECTION_ARG_TYPE as ARG_TYPE

connection_args = OrderedDict(
    url={
        'type': ARG_TYPE.STR,
        'description': 'The URI-like connection string to the MySQL server. If provided, it will override the other connection arguments.',
        'required': False,
        'label': 'URL'
    },
    user={
        'type': ARG_TYPE.STR,
        'description': 'The user name used to authenticate with the MySQL server.',
        'required': True,
        'label': 'User'
    },
    password={
        'type': ARG_TYPE.PWD,
        'description': 'The password to authenticate the user with the MySQL server.',
        'required': True,
        'label': 'Password',
        'secret': True
    },
    database={
        'type': ARG_TYPE.STR,
        'description': 'The database name to use when connecting with the MySQL server.',
        'required': True,
        'label': 'Database'
    },
    host={
        'type': ARG_TYPE.STR,
        'description': 'The host name or IP address of the MySQL server. NOTE: use \'127.0.0.1\' instead of \'localhost\' to connect to local server.',
        'required': True,
        'label': 'Host'
    },
    port={
        'type': ARG_TYPE.INT,
        'description': 'The TCP/IP port of the MySQL server. Must be an integer.',
        'required': True,
        'label': 'Port'
    },
    ssl={
        'type': ARG_TYPE.BOOL,
        'description': 'Set it to True to enable ssl.',
        'required': False,
        'label': 'ssl'
    },
    ssl_ca={
        'type': ARG_TYPE.PATH,
        'description': 'Path or URL of the Certificate Authority (CA) certificate file',
        'required': False,
        'label': 'ssl_ca'
    },
    ssl_cert={
        'type': ARG_TYPE.PATH,
        'description': 'Path name or URL of the server public key certificate file',
        'required': False,
        'label': 'ssl_cert'
    },
    ssl_key={
        'type': ARG_TYPE.PATH,
        'description': 'The path name or URL of the server private key file',
        'required': False,
        'label': 'ssl_key',
    }
)

connection_args_example = OrderedDict(
    host='127.0.0.1',
    port=3306,
    user='root',
    password='password',
    database='database'
)
```

### Exporting All Required Variables

The following should be exported in the `__init__.py` file of the handler:

* The `Handler` class.
* The `version` of the handler.
* The `name` of the handler.
* The `type` of the handler, either `DATA` handler or `ML` handler.
* The `icon_path` to the file with the database icon.
* The `title` of the handler or a short description.
* The `description` of the handler.
* The `connection_args` dictionary with the connection arguments.
* The `connection_args_example` dictionary with an example of the connection arguments.
* The `import_error` message that is used if the import of the `Handler` class fails.

A few of these variables are defined in another file called `__about__.py`. This file is imported into the `__init__.py` file.

Here is an example of the `__init__.py` file for the [MySQL handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/mysql_handler).
```py theme={null}
from mindsdb.integrations.libs.const import HANDLER_TYPE

from .__about__ import __version__ as version, __description__ as description
from .connection_args import connection_args, connection_args_example

try:
    from .mysql_handler import (
        MySQLHandler as Handler,
        connection_args_example,
        connection_args
    )

    import_error = None
except Exception as e:
    Handler = None
    import_error = e

title = 'MySQL'
name = 'mysql'
type = HANDLER_TYPE.DATA
icon_path = 'icon.svg'

__all__ = [
    'Handler', 'version', 'name', 'type', 'title', 'description',
    'connection_args', 'connection_args_example', 'import_error', 'icon_path'
]
```

The `__about__.py` file for the same [MySQL handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/mysql_handler) contains the following variables:

```py theme={null}
__title__ = 'MindsDB MySQL handler'
__package_name__ = 'mindsdb_mysql_handler'
__version__ = '0.0.1'
__description__ = "MindsDB handler for MySQL"
__author__ = 'MindsDB Inc'
__github__ = 'https://github.com/mindsdb/mindsdb'
__pypi__ = 'https://pypi.org/project/mindsdb/'
__license__ = 'MIT'
__copyright__ = 'Copyright 2022- mindsdb'
```

### Exporting Requirements

If the integration requires other packages to function correctly, list them in the `requirements.txt` file. Create a text file named `requirements.txt` that stores all packages required for using the integration.

Here is an example:

```
mysql-connector-python==9.1.0
...
```

## Check out our Database Handlers!

To see some integration handlers that are currently in use, we encourage you to check out the following handlers inside the MindsDB repository:

* [MySQL](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/mysql_handler)
* [Postgres](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/postgres_handler)

And here are [all the handlers available in the MindsDB repository](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers).

# How to Write MindsDB Documentation

Source: https://docs.mindsdb.com/contribute/docs

This section gets you started on how to contribute to the MindsDB documentation.

MindsDB's documentation is run using Mintlify. If you want to contribute to our docs, please follow the steps below to set up the environment locally.

## Running the Docs Locally

**Prerequisite**

You should have installed Git (version 2.30.1 or higher) and Node.js (version 18.10.0 or higher).

Step 1. Clone the MindsDB Git repository:

```console theme={null}
git clone https://github.com/mindsdb/mindsdb.git
```

Step 2. Install Mintlify on your OS:

```console theme={null}
npm i mintlify -g
```

Step 3. Go to the `docs` folder inside the cloned MindsDB Git repository and start Mintlify there:

```console theme={null}
mintlify dev
```

The documentation website is now available at `http://localhost:3000`.

**Getting an Error?**

If you use the Windows operating system, you may get an error saying `no such file or directory: C:/Users/Username/.mintlify/mint/client`. Here are the steps to troubleshoot it:

* Go to the `C:/Users/Username/.mintlify/` directory.
* Remove the `mint` folder.
* Open the Git Bash in this location and run `git clone https://github.com/mintlify/mint.git`.
* Repeat step 3.

## MindsDB Repository Structure

Here is the structure of the MindsDB docs repository:

```
docs                        # All documentation source files
|__assets/                  # Images and icons used throughout the docs
│   ├─ ...
│__folders_with_mdx_files/  # All remaining folders that store the .mdx files
|__mdx_files                # Some of the .mdx files are stored in the docs directory
|__mintlify.json            # This JSON file stores navigation and page setup
```

# MindsDB Installation for Development

Source: https://docs.mindsdb.com/contribute/install

If you want to contribute to the development of MindsDB, you need to install it from source. If you do not want to contribute to the development of MindsDB but simply install and use it, then [install MindsDB via Docker](/setup/self-hosted/docker).

## Install MindsDB for Development

Here are the steps to install MindsDB from source. You can either follow the steps below or visit the provided link.

Before installing MindsDB from source, ensure that you use one of the following Python versions: `3.10.x`, `3.11.x`, `3.12.x`, `3.13.x`.

1. Fork the [MindsDB repository from GitHub](https://github.com/mindsdb/mindsdb).

2. Clone the fork locally:

```bash theme={null}
git clone https://github.com/<your-username>/mindsdb.git
```

3. Create a virtual environment:

```bash theme={null}
python -m venv mindsdb-venv
```

4. Activate the virtual environment:

Windows:

```bash theme={null}
.\mindsdb-venv\Scripts\activate
```

macOS/Linux:

```bash theme={null}
source mindsdb-venv/bin/activate
```

5. Install MindsDB with its local development dependencies:

```bash theme={null}
cd mindsdb
pip install -e .
```

6. Start MindsDB:

```bash theme={null}
python -m mindsdb
```

By default, MindsDB starts the `http` and `mysql` APIs. You can define which APIs to start using the `api` flag as below.

```bash theme={null}
python -m mindsdb --api http,mysql
```

If you want to start MindsDB without the graphical user interface (GUI), use the `--no_studio` flag as below.

```bash theme={null}
python -m mindsdb --no_studio
```

Alternatively, you can use a makefile to install dependencies and start MindsDB:

```bash theme={null}
make install_mindsdb
make run_mindsdb
```

Now you should see the following message in the console:

```
...
mindsdb.api.http.initialize: - GUI available at http://127.0.0.1:47334/
mindsdb.api.mysql.mysql_proxy.mysql_proxy: Starting MindsDB Mysql proxy server on tcp://127.0.0.1:47335
mindsdb.api.mysql.mysql_proxy.mysql_proxy: Waiting for incoming connections...
mindsdb: mysql API: started on 47335
mindsdb: http API: started on 47334
```

You can access the MindsDB Editor at `localhost:47334`.

## Install dependencies

The core installation includes everything needed to run the Federated Query Engine and essential database capabilities. The dependencies for many of the data or ML integrations are not installed by default.

If you need additional features — such as Agents, the Knowledge Base, MCP or A2A protocol — you can enable them through extras, rather than installing everything by default.

### Install Features via Extras

Optional integrations and features can be installed as needed using extras.

| Feature | Install command |
| ------- | --------------- |
| Agents / LLMs | `pip install ".[agents]"` |
| Knowledge Base | `pip install ".[kb]"` |
| Multiple features at once | `pip install ".[agents,knowledgebase]"` |
| Integrations | `pip install ".[integration_name]"` |

You can find all available [handlers here](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers).

## What's Next?

Now that you installed and started MindsDB locally, go ahead and find out how to create and train a model using the [`CREATE MODEL`](/sql/create/model) statement.
Check out the [Use Cases](/use-cases/overview) section to follow tutorials that cover Large Language Models, Chatbots, Time Series, Classification, and Regression models, Semantic Search, and more.

# How to Write Handlers README

Source: https://docs.mindsdb.com/contribute/integrations-readme

The README file is a crucial document that guides users in understanding, using, and contributing to the MindsDB integration. It serves as the first point of contact for anyone interacting with the integration, hence the need for it to be comprehensive, clear, and user-friendly.

## Sections to Include

### Table of Contents

A well-organized table of contents is provided for easy navigation through the document, allowing users to quickly find the information they need.

### About

Explain what specific database, application, or framework the integration targets. Provide a concise overview of the integration’s purpose, highlighting its key features and benefits.

### Handler Implementation

* Setup
  * Detail the installation and initial setup process, including any prerequisites.
* Connection
  * Describe the steps to establish and manage connections, with clear instructions.
  * Include SQL examples for better clarity.
* Required Parameters
  * List and describe all essential parameters necessary for the operation of the integration.
* Optional Parameters
  * Detail additional, non-mandatory parameters that can enhance the integration's functionality.

### Example Usage

* Practical Examples: Offer detailed examples showing how to use the integration effectively.
* Coverage: Ensure examples encompass a range of functionalities, from basic to advanced operations.
* SQL Examples: Include SQL statements and their expected outputs to illustrate use cases.

### Supported Tables/Tasks

Clearly enumerate the tables, tasks, or operations that the integration supports, possibly in a list or table format.

### Limitations

Transparently outline any limitations or constraints known in the integration.

### TODO

* Future Developments: Highlight areas for future enhancements or improvements.
* GitHub Issues: Link to open GitHub issues tagged as enhancements, indicating ongoing or planned feature additions.

# Python Coding Standards

Source: https://docs.mindsdb.com/contribute/python-coding-standards

# PEP8

Strict adherence to [PEP8](https://peps.python.org/pep-0008/) standards is mandatory for all code contributions to MindsDB.

**Why PEP8?**

[PEP8](https://peps.python.org/pep-0008/) provides an extensive set of guidelines for Python code styling, promoting readability and a uniform coding standard. By aligning with PEP8, we ensure our codebase remains clean, maintainable, and easily understandable for Python developers at any level.

#### Automated Checks

* Upon submission of a Pull Request (PR), an automated process checks the code for PEP8 compliance.
* Non-compliance with PEP8 can result in the failure of the build process. Adherence to PEP8 is not just a best practice but a necessity to ensure smooth integration of new code into the codebase.
* If a PR fails due to PEP8 violations, the contributor is required to review the automated feedback provided.
* Pay special attention to common PEP8 compliance issues such as proper indentation, appropriate line length, correct use of whitespace, and following the recommended naming conventions.
* Contributors are encouraged to iteratively improve their code based on the feedback until full compliance is achieved.

# Logging

Always instantiate a logger using the MindsDB utilities module.
This practice ensures a uniform approach to logging across different parts of the application.

Example of Logger Creation:

```python theme={null}
from mindsdb.utilities import log

logger = log.getLogger(__name__)
```

### Setting Logging

* Environment Variable: Use `MINDSDB_LOG_LEVEL` to set the desired logging level. This approach allows for dynamic adjustment of log verbosity without needing code modifications.
* Log Levels: Available levels include:
  * `DEBUG`: Detailed information, typically of interest only when diagnosing problems.
  * `INFO`: Confirmation that things are working as expected.
  * `WARNING`: An indication that something unexpected happened, or indicative of some problem in the near future.
  * `ERROR`: Due to a more serious problem, the software has not been able to perform some function.
  * `CRITICAL`: A serious error, indicating that the program itself may be unable to continue running.
* Avoid `print()` statements. They lack the flexibility and control offered by logging mechanisms, particularly in terms of output redirection and level-based filtering.
* The logger name should be `__name__` to automatically reflect the module's name. This convention is crucial for pinpointing the origin of log messages.

# Docstrings

Docstrings are essential for documenting Python code. They provide a clear explanation of the functionality of classes, functions, modules, etc., making the codebase easier to understand and maintain.

A well-written docstring should include:

* Function's Purpose: Describe what the function/class/module does.
* Parameters: List and explain the parameters it takes.
* Return Value: Describe what the function returns.
* Exceptions: Mention any exceptions that the function might raise.

```python theme={null}
def example_function(param1, param2):
    """This is an example docstring.

    Args:
        param1 (type): Description of param1.
        param2 (type): Description of param2.

    Returns:
        type: Description of the return value.

    Raises:
        ExceptionType: Description of the exception.
    """
    # function body...
```

# Exception Handling

Implementing robust error handling strategies is essential to maintain the stability and reliability of MindsDB. Proper exception management ensures that the application behaves predictably under error conditions, providing clear feedback and preventing unexpected crashes or behavior.

* Utilizing MindsDB Exceptions: To ensure uniformity and clarity in error reporting, always use predefined exceptions from the MindsDB exceptions library.
* Adding New Exceptions: If during development you encounter a scenario where none of the existing exceptions adequately represent the error, consider defining a new, specific exception, as in the sketch below.
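For illustration, here is a hedged sketch of that pattern. The class and function names are hypothetical, and the base class here is plain `Exception`; in real contributions, check the MindsDB exceptions library for the actual base classes to inherit from.

```python theme={null}
# Hypothetical sketch: the names below are illustrative, not the actual
# MindsDB exceptions API -- inherit from a predefined MindsDB exception
# whenever one fits the error scenario.
class ServiceUnreachableError(Exception):
    """Raised when a handler cannot reach the underlying service."""


def ping_or_raise(client):
    try:
        client.ping()
    except TimeoutError as e:
        # Re-raise as a specific, descriptive exception instead of letting
        # a generic low-level error propagate to the user.
        raise ServiceUnreachableError(f"Could not reach the service: {e}") from e
```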
# Data Catalog for Integrations

Source: https://docs.mindsdb.com/data_catalog/integrations/overview

As of now, the Data Catalog is available for the following integrations:

* [Snowflake](/integrations/data-integrations/snowflake)
* [Salesforce](/integrations/app-integrations/salesforce)
* [BigQuery](/integrations/data-integrations/google-bigquery)
* [MS SQL Server](/integrations/data-integrations/microsoft-sql-server)
* [MySQL](/integrations/data-integrations/mysql)
* [Oracle](/integrations/data-integrations/oracle)
* [PostgreSQL](/integrations/data-integrations/postgresql)

### Enabling the Data Catalog

To enable the Data Catalog feature in MindsDB, update your `config.json` file by setting the `data_catalog` flag to `true`:

```json theme={null}
{
  "data_catalog": {
    "enabled": true
  }
}
```

Follow this doc page to learn how to [start MindsDB with custom configuration](/setup/custom-config).

Note that the data catalog is generated for a data source only after this data source is connected to an agent. Here is an example:

```sql theme={null}
CREATE DATABASE snowflake_data
WITH ENGINE = 'snowflake',
PARAMETERS = {
  "account": "abc123-xyz987",
  "user": "username",
  "password": "password",
  "database": "database_name",
  "schema": "schema_name",
  "warehouse": "warehouse_name"
};

CREATE AGENT my_agent
USING
  include_tables = ['snowflake_data.table_name', ...];
```

Now you can [query the data catalog](/data_catalog/integrations/query) generated for the `snowflake_data` integration.

### How It Works

When you create an [agent](/mindsdb_sql/agents/agent) in MindsDB that connects to one of the supported integrations, the Data Catalog automatically:

1. Inspects the data source.
2. Extracts metadata for all accessible tables and columns.
3. Stores this information in a dedicated catalog schema (`DATA_CATALOG`).
4. Makes this metadata available to agents and users via both SQL queries and internal reasoning.

**Current Limitations**

This feature is still evolving and has some known limitations:

* **One-Time Snapshot**: Metadata is generated only once, at the time the agent is created. If the data schema changes (e.g., new columns, renamed tables), the Data Catalog will not automatically update. A refresh mechanism is planned in a future release.
* **No Manual Feedback**: If any metadata appears to be incorrect (e.g., wrong row counts or data types), there is currently no way for users to flag or correct it. A feedback system will be introduced soon.

# Querying Data Catalog for Integrations

Source: https://docs.mindsdb.com/data_catalog/integrations/query

MindsDB exposes collected metadata from connected data sources via virtual tables in the `INFORMATION_SCHEMA` schema. These views allow users to inspect and query the Data Catalog using familiar SQL syntax.

## Available Data Catalog Tables

To filter results for a specific data integration, use `WHERE TABLE_SCHEMA = 'integration_name'`.

### `INFORMATION_SCHEMA.META_TABLES`

Provides high-level metadata about available tables in a given integration.

Here are the available columns:

* `TABLE_NAME` (string): Name of the table.
* `TABLE_TYPE` (string, optional): Type of table (e.g., `BASE TABLE`, `VIEW`).
* `TABLE_SCHEMA` (string, optional): Schema name or integration name.
* `TABLE_DESCRIPTION` (string, optional): Description of the table.
* `ROW_COUNT` (integer, optional): Estimated row count.
Here is how to query it for a specific data integration:

```sql theme={null}
SELECT *
FROM INFORMATION_SCHEMA.META_TABLES
WHERE TABLE_SCHEMA = 'integration_name';
```

### `INFORMATION_SCHEMA.META_COLUMNS`

Returns detailed column-level metadata for all tables in the specified integration.

Here are the available columns:

* `TABLE_NAME` (string): Name of the table.
* `COLUMN_NAME` (string): Column name.
* `DATA_TYPE` (string): Data type of the column.
* `COLUMN_DESCRIPTION` (string, optional): Description of the column.
* `IS_NULLABLE` (boolean, optional): Whether nulls are allowed.
* `COLUMN_DEFAULT` (string, optional): Default value, if any.

Here is how to query it for a specific data integration:

```sql theme={null}
SELECT *
FROM INFORMATION_SCHEMA.META_COLUMNS
WHERE TABLE_SCHEMA = 'integration_name';
```

### `INFORMATION_SCHEMA.META_COLUMN_STATISTICS`

Provides statistical insights about each column’s values and distribution.

Here are the available columns:

* `TABLE_NAME` (string): Name of the table.
* `COLUMN_NAME` (string): Column name.
* `MOST_COMMON_VALUES` (array of strings, optional)
* `MOST_COMMON_FREQUENCIES` (array of integers, optional)
* `NULL_PERCENTAGE` (float, optional)
* `MINIMUM_VALUE` (string, optional)
* `MAXIMUM_VALUE` (string, optional)
* `DISTINCT_VALUES_COUNT` (integer, optional)

Here is how to query it for a specific data integration:

```sql theme={null}
SELECT *
FROM INFORMATION_SCHEMA.META_COLUMN_STATISTICS
WHERE TABLE_SCHEMA = 'integration_name';
```

### `INFORMATION_SCHEMA.META_KEY_COLUMN_USAGE`

Describes the primary key columns for tables in the integration.

Here are the available columns:

* `TABLE_NAME` (string): Name of the table.
* `COLUMN_NAME` (string): Column name.
* `ORDINAL_POSITION` (integer, optional)
* `CONSTRAINT_NAME` (string, optional)

Here is how to query it for a specific data integration:

```sql theme={null}
SELECT *
FROM INFORMATION_SCHEMA.META_KEY_COLUMN_USAGE
WHERE TABLE_SCHEMA = 'integration_name';
```

### `INFORMATION_SCHEMA.META_TABLE_CONSTRAINTS`

Lists table-level constraints, including primary and foreign keys.

Here are the available columns:

* `TABLE_NAME` (string): Name of the table.
* `CONSTRAINT_NAME` (string, optional)
* `CONSTRAINT_TYPE` (string): e.g., PRIMARY KEY, FOREIGN KEY.

Here is how to query it for a specific data integration:

```sql theme={null}
SELECT *
FROM INFORMATION_SCHEMA.META_TABLE_CONSTRAINTS
WHERE TABLE_SCHEMA = 'integration_name';
```

### `INFORMATION_SCHEMA.META_HANDLER_INFO`

Returns a textual summary of the integration implementation, including supported SQL features and capabilities.

Here are the available columns:

* `HANDLER_INFO` (string): Description.

Here is how to query it for a specific data integration:

```sql theme={null}
SELECT *
FROM INFORMATION_SCHEMA.META_HANDLER_INFO
WHERE TABLE_SCHEMA = 'integration_name';
```

# Data Catalog

Source: https://docs.mindsdb.com/data_catalog/overview

The **Data Catalog** in MindsDB plays a key role in enhancing the context available to [agents](/mindsdb_sql/agents/agent) when querying data sources. By automatically indexing and storing metadata, such as table names, column types, constraints, and statistics, the catalog empowers agents to understand the structure and semantics of the data, leading to more accurate and efficient query generation.

### Why It Matters

When agents interpret natural language questions or generate SQL queries, access to metadata improves their ability to:

* Understand relationships between tables and fields.
* Infer joins, filters, and aggregations more intelligently.
* Avoid syntax errors due to missing or unknown schema information.

This metadata layer provides agents with the necessary context to avoid making uninformed queries.

# Benefits of MindsDB

Source: https://docs.mindsdb.com/faqs/benefits

MindsDB facilitates development of AI-powered apps by bridging the gap between data and AI. Thanks to its numerous integrations with data sources (including databases, vector stores, and applications) and AI frameworks (including LLMs and AutoML), you can mix and match between the available integrations to create custom AI workflows with MindsDB.

Here are some prominent benefits of using MindsDB:

1. **Unified AI Deployment and Management**
   MindsDB integrates directly with the database, warehouse, or stream. This eliminates the need to build and maintain custom, complex data pipelines or separate systems for AI/ML deployment.

2. **Automated AI Workflows**
   MindsDB automates the entire AI workflow to execute on time-based or event-based triggers. No need to build custom automation logic to get predictions, move data, or (re)train models.

3. **Turn every developer into an AI Engineer**
   MindsDB enables developers to leverage their existing SQL skills, accelerating the adoption of AI across teams and departments, turning every developer into an AI Engineer.

4. **Enhanced Scalability and Performance**
   Whether in your private cloud or using MindsDB’s managed service, MindsDB enables you to handle large-scale AI/ML workloads efficiently. MindsDB can scale to meet the demands of your use case, ensuring optimal performance and responsiveness.

# Disposable Email Domains and OpenAI

Source: https://docs.mindsdb.com/faqs/disposable-email-doman-and-openai

Disposable email domains can't make use of OpenAI; therefore, users with such domains will encounter errors when using MindsDB's integration with OpenAI.

To check if your email domain is disposable, you can verify it on [QuickEmailVerification](https://quickemailverification.com/tools/disposable-email-address-detector) or [VerifyEmail.IO](https://verifymail.io/domain/ipnuc.com).

# How to Interact with MindsDB from PHP

Source: https://docs.mindsdb.com/faqs/mindsdb-with-php

To get started with MindsDB, you need to install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).

There are a few ways you can interact with MindsDB from the PHP code.

1. You can connect to MindsDB using the [PHP Data Objects](https://www.php.net/manual/en/book.pdo.php) and execute statements directly on MindsDB with the `PDO::query` method.
2. You can use the [REST API](/rest/overview) endpoints to interact with MindsDB directly from PHP.

# Missing required CPU features

Source: https://docs.mindsdb.com/faqs/missing-required-cpu-features

Depending on the operating system and its setup, you may encounter this runtime warning when starting MindsDB:

```bash theme={null}
RuntimeWarning: Missing required CPU features.
The following required CPU features were not detected: avx2, fma, bmi1, bmi2, lzcnt
```

The solution is to install the `polars-lts-cpu` package in the environment where MindsDB runs.

If you are on an Apple ARM machine (e.g. M1), this warning is likely due to running Python under Rosetta. To troubleshoot it, install a native version of Python that does not run under Rosetta x86-64 emulation.

# How to Persist Predictions

Source: https://docs.mindsdb.com/faqs/persist-predictions

MindsDB provides a range of options for persisting predictions and forecasts. Let's explore all possibilities to save the prediction results.

**Reasons to Save Predictions**

Every time you want to get predictions, you need to query the model, usually joined with an input data table, like this:

```sql theme={null}
SELECT input.product_name,
       input.review,
       output.sentiment
FROM mysql_demo_db.amazon_reviews AS input
JOIN sentiment_classifier AS output;
```

However, querying the model returns a result set that is not persistent by default. For future use, it is recommended to persist the result set instead of querying the model again with the same data.

MindsDB enables you to save predictions into a view or a table, or download them as a CSV file.

## Creating a View

After creating the model, you can save the prediction results into a view.

```sql theme={null}
CREATE VIEW review_sentiment (
  -- querying for predictions
  SELECT input.product_name,
         input.review,
         output.sentiment
  FROM mysql_demo_db.amazon_reviews AS input
  JOIN sentiment_classifier AS output
  LIMIT 10
);
```

Now the `review_sentiment` view stores sentiment predictions made for all customer reviews.

Here is a [comprehensive tutorial](/nlp/sentiment-analysis-inside-mysql-with-openai) on how to predict sentiment of customer reviews using OpenAI.

## Creating a Table

After creating the model, you can save predictions into a database table.
```sql theme={null}
CREATE TABLE local_postgres.question_answers (
  -- querying for predictions
  SELECT input.article_title,
         input.question,
         output.answer
  FROM mysql_demo_db.questions AS input
  JOIN question_answering_model AS output
  LIMIT 10
);
```

Here, the `local_postgres` database is a PostgreSQL database connected to MindsDB with a user that has write access. Now the `question_answers` table stores all prediction results.

Here is a [comprehensive tutorial](/nlp/question-answering-inside-mysql-with-openai) on how to answer questions using OpenAI.

## Downloading a CSV File

After executing the `SELECT` statement, you can download the output as a CSV file.

Click the `Export` button and choose the `CSV` option.

# Bring Your Own Model

Source: https://docs.mindsdb.com/integrations/ai-engines/byom

The Bring Your Own Model (BYOM) feature lets you upload your own models in the form of Python code and use them within MindsDB.

## How It Works

You can upload your custom model via the MindsDB editor by clicking `Add` and `Upload custom model`, like this:

Here is the form that needs to be filled out in order to bring your model to MindsDB:

Let's briefly go over the files that need to be uploaded:

* The Python file stores an implementation of your model. It should contain the class with the implementation for the `train` and `predict` methods. Here is the sample format:

```py theme={null}
class CustomPredictor():

    def train(self, df, target_col, args=None):
        return ''

    def predict(self, df):
        return df
```

```py theme={null}
import os
import pandas as pd

from sklearn.cross_decomposition import PLSRegression
from sklearn import preprocessing


class CustomPredictor():

    def train(self, df, target_col, args=None):
        print(args, '1111')

        self.target_col = target_col
        y = df[self.target_col]
        x = df.drop(columns=self.target_col)
        x_cols = list(x.columns)

        x_scaler = preprocessing.StandardScaler().fit(x)
        y_scaler = preprocessing.StandardScaler().fit(y.values.reshape(-1, 1))

        xs = x_scaler.transform(x)
        ys = y_scaler.transform(y.values.reshape(-1, 1))

        pls = PLSRegression(n_components=1)
        pls.fit(xs, ys)

        T = pls.x_scores_
        W = pls.x_weights_
        P = pls.x_loadings_
        R = pls.x_rotations_

        self.x_cols = x_cols
        self.x_scaler = x_scaler
        self.P = P

        def calc_limit(df):
            res = None
            for column in df.columns:
                if column == self.target_col:
                    continue
                tbl = df.groupby(self.target_col).agg({column: ['mean', 'min', 'max', 'std']})
                tbl.columns = tbl.columns.get_level_values(1)
                tbl['name'] = column
                tbl['std'] = tbl['std'].fillna(0)
                tbl['lower'] = tbl['mean'] - 3 * tbl['std']
                tbl['upper'] = tbl['mean'] + 3 * tbl['std']
                tbl['lower'] = tbl[["lower", "min"]].max(axis=1)  # lower >= min
                tbl['upper'] = tbl[["upper", "max"]].min(axis=1)  # upper <= max
                tbl = tbl[['name', 'lower', 'mean', 'upper']]
                try:
                    res = pd.concat([res, tbl])
                except:
                    res = tbl
            return res

        trdf = pd.DataFrame()
        trdf[self.target_col] = y.values
        trdf['T1'] = T.squeeze()
        limit = calc_limit(trdf).reset_index()

        self.limit = limit

        return "Trained predictor ready to be stored"

    def predict(self, df):

        yt = df[self.target_col].values
        xt = df[self.x_cols]

        xt = self.x_scaler.transform(xt)

        excess_cols = list(set(df.columns) - set(self.x_cols))

        pred_df = df[excess_cols].copy()

        pred_df[self.target_col] = yt
        pred_df['T1'] = (xt @ self.P).squeeze()

        pred_df = pd.merge(pred_df, self.limit[[self.target_col, 'lower', 'upper']], how='left', on=self.target_col)

        return pred_df
```

* The optional requirements file, or `requirements.txt`, stores all dependencies along with their versions. Here is the sample format:

```text theme={null}
dependency_package_1 == version
dependency_package_2 >= version
dependency_package_3 >= version, < version
...
```

```text theme={null}
pandas
scikit-learn
```

Once you upload the above files, please provide an engine name. Please note that your custom model is uploaded to MindsDB as an engine. Then you can use this engine to create a model.

## Configuration

The BYOM feature can be configured with the following environment variables:

* `MINDSDB_BYOM_ENABLED`

  This environment variable defines whether the BYOM feature is enabled (`MINDSDB_BYOM_ENABLED=true`) or disabled (`MINDSDB_BYOM_ENABLED=false`). Note that when running MindsDB locally, it is enabled by default.

* `MINDSDB_BYOM_DEFAULT_TYPE`

  This environment variable defines the modes of operation of the BYOM feature.

  * `MINDSDB_BYOM_DEFAULT_TYPE=venv`
    When using the `venv` mode, MindsDB creates a virtual environment and installs in it the packages listed in the `requirements.txt` file. This virtual environment is dedicated to the custom model. Note that when running MindsDB locally, it is the default mode.

  * `MINDSDB_BYOM_DEFAULT_TYPE=inhouse`
    When using the `inhouse` mode, there is no dedicated virtual environment for the custom model. It uses the environment of MindsDB; therefore, the `requirements.txt` file is not used with this mode.

* `MINDSDB_BYOM_INHOUSE_ENABLED`

  This environment variable defines whether the `inhouse` mode is enabled (`MINDSDB_BYOM_INHOUSE_ENABLED=true`) or disabled (`MINDSDB_BYOM_INHOUSE_ENABLED=false`). Note that when running MindsDB locally, it is enabled by default.

## Example

We upload the custom model, as below:

Here we upload the `model.py` file that stores an implementation of the model and the `requirements.txt` file that stores all the dependencies.

Once the model is uploaded, it becomes an ML engine within MindsDB. Now we use this `custom_model_engine` to create a model as follows:

```sql theme={null}
CREATE MODEL custom_model
FROM my_integration
  (SELECT * FROM my_table)
PREDICT target
USING ENGINE = 'custom_model_engine';
```

Let's query for predictions by joining the custom model with the data table.

```sql theme={null}
SELECT input.feature_column, model.target_column
FROM my_integration.my_table AS input
JOIN custom_model AS model;
```

Check out the [BYOM handler folder](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/byom_handler) to see the implementation details.

# MindsDB and MLflow

Source: https://docs.mindsdb.com/integrations/ai-engines/mlflow

MLflow allows you to create, train, and serve machine learning models, apart from other features, such as organizing experiments, tracking metrics, and more.

## How to Use MLflow Models in MindsDB

Here are the prerequisites for using MLflow-served models in MindsDB:

1. Train a model via a wrapper class that inherits from the `mlflow.pyfunc.PythonModel` class. It should expose the `predict()` method that returns the predicted output for some input data when called. Please ensure that the Python version specified for the Conda environment matches the one used to train the model.

2. Start the MLflow server:

```bash theme={null}
mlflow server -p 5001 --backend-store-uri sqlite:////path/to/mlflow.db --default-artifact-root ./artifacts --host 0.0.0.0
```

3. Serve the trained model:

```bash theme={null}
mlflow models serve --model-uri ./model_folder_name
```

## Example

Let's create a model that registers an MLflow-served model as an AI Table:

```sql theme={null}
CREATE MODEL mindsdb.mlflow_model
PREDICT target
USING
  engine = 'mlflow',
  model_name = 'model_folder_name',                    -- replace the model_folder_name variable with a real value
  mlflow_server_url = 'http://0.0.0.0:5001/',          -- match the port number with the MLflow server (point 2 in the previous section)
  mlflow_server_path = 'sqlite:////path/to/mlflow.db', -- replace the path with a real value (here we use the sqlite database)
  predict_url = 'http://localhost:5000/invocations';   -- match the port number that serves the trained model (point 3 in the previous section)
```

Here is how to check the model's status:

```sql theme={null}
DESCRIBE mlflow_model;
```

Once the status is `complete`, we can query for predictions.

One way is to query for a single prediction using synthetic data in the `WHERE` clause.

```sql theme={null}
SELECT target
FROM mindsdb.mlflow_model
WHERE text = 'The tsunami is coming, seek high ground';
```

Another way is to query for batch predictions by joining the model with the data table.

```sql theme={null}
SELECT t.text, m.predict
FROM mindsdb.mlflow_model AS m
JOIN files.some_text AS t;
```

Here, the data table comes from the `files` integration. It is joined with the model, and predictions are made for all the records at once.

**Get More Insights**

Check out the article on [How to bring your own machine learning model to databases](https://medium.com/mindsdb/how-to-bring-your-own-machine-learning-model-to-databases-47a188d6db00) by [Patricio Cerda Mardini](https://medium.com/@paxcema) to learn more.

# Binance

Source: https://docs.mindsdb.com/integrations/app-integrations/binance

In this section, we present how to connect Binance to MindsDB.
[Binance](https://www.binance.com/en) is one of the world's largest cryptocurrency exchanges. It's an online platform where you can buy, sell, and trade a wide variety of cryptocurrencies. Binance offers a range of services beyond just trading, including staking, lending, and various financial products related to cryptocurrencies. Binance provides real-time trade data that can be utilized within MindsDB to make real-time forecasts. ## Connection This handler integrates with the [Binance API](https://binance-docs.github.io/apidocs/spot/en/#change-log) to make aggregate trade (kline) data available to use for model training and predictions. Since there are no parameters required to connect to Binance using MindsDB, you can use the below statement: ```sql theme={null} CREATE DATABASE my_binance WITH ENGINE = 'binance'; ``` ## Usage ### Select Data By default, aggregate data (klines) from the latest 1,000 trading intervals with a length of one minute (1m) each will be returned. ```sql theme={null} SELECT * FROM my_binance.aggregated_trade_data WHERE symbol = 'BTCUSDT'; ``` Here is the sample output data: ``` | symbol | open_time | open_price | high_price | low_price | close_price | volume | close_time | quote_asset_volume | number_of_trades | taker_buy_base_asset_volume | taker_buy_quote_asset_volume | | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ----------- | ------------------ | ---------------- | --------------------------- | ---------------------------- | | BTCUSDT | 1678338600 | 21752.65000 | 21761.33000 | 21751.53000 | 21756.7000 | 103.8614100 | 1678338659.999 | 2259656.20520700 | 3655 | 55.25763000 | 1202219.60971860 | ``` where: * `symbol` - Trading pair (BTC to USDT in the above example) * `open_time` - Start time of interval in seconds since the Unix epoch (default interval is 1m) * `open_price` - Price of a base asset at the beginning of a trading interval * `high_price` - The highest price of a base asset during trading interval * `low_price` - Lowest price of a base asset during a trading interval * `close_price` - Price of a base asset at the end of a trading interval * `volume` - Total amount of base asset traded during an interval * `close_time` - End time of interval in seconds since the Unix epoch * `quote_asset_volume` - Total amount of quote asset (USDT in the above case) traded during an interval * `number_of_trades` - Total number of trades made during an interval * `taker_buy_base_asset_volume` - How much of the base asset volume is contributed by taker buy orders * `taker_buy_quote_asset_volume` - How much of the quote asset volume is contributed by taker buy orders To get a customized response, we can pass open\_time, close\_time, and interval: ```sql theme={null} SELECT * FROM my_binance.aggregated_trade_data WHERE symbol = 'BTCUSDT' AND open_time > '2023-01-01' AND close_time < '2023-01-03 08:00:00' AND interval = '1s' LIMIT 10000; ``` Supported intervals are [listed here](https://binance-docs.github.io/apidocs/spot/en/#kline-candlestick-data). ### Train a Model Here is how to create a time series model using 10,000 trading intervals in the past with a duration of 1m. ```sql theme={null} CREATE MODEL mindsdb.btc_forecast_model FROM my_binance ( SELECT * FROM aggregated_trade_data WHERE symbol = 'BTCUSDT' AND close_time < '2023-01-01' AND interval = '1m' LIMIT 10000 ) PREDICT open_price ORDER BY open_time WINDOW 100 HORIZON 10; ``` For more accuracy, the limit can be set to a higher value (e.g., 100,000).
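Since training runs asynchronously, you can monitor the model's status with the same `DESCRIBE` statement shown in the MLflow section above:

```sql theme={null}
-- returns the model's training status; wait until it shows `complete`
DESCRIBE btc_forecast_model;
```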
### Making Predictions First, let's create a view for the most recent BTCUSDT aggregate trade data: ```sql theme={null} CREATE VIEW recent_btcusdt_data AS ( SELECT * FROM my_binance.aggregated_trade_data WHERE symbol = 'BTCUSDT' ); ``` Now let's predict the future price of BTC: ```sql theme={null} SELECT m.* FROM recent_btcusdt_data AS t JOIN mindsdb.btc_forecast_model AS m WHERE m.open_time > LATEST; ``` This will give the predicted BTC price for the next 10 minutes (as the horizon is set to 10) in terms of USDT. # Confluence Source: https://docs.mindsdb.com/integrations/app-integrations/confluence This documentation describes the integration of MindsDB with [Confluence](https://www.atlassian.com/software/confluence), a popular collaboration and documentation tool developed by Atlassian. The integration allows MindsDB to access data from Confluence and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). ## Connection Establish a connection to Confluence from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/confluence_handler) as an engine. ```sql theme={null} CREATE DATABASE confluence_datasource WITH ENGINE = 'confluence', PARAMETERS = { "api_base": "https://example.atlassian.net", "username": "john.doe@example.com", "password": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6" }; ``` Required connection parameters include the following: * `api_base`: The base URL for your Confluence instance/server. * `username`: The email address associated with your Confluence account. * `password`: The API token generated for your Confluence account. Refer to this [guide](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/) for instructions on how to create API tokens for your account. ## Usage Retrieve data from a specified table by providing the integration and table names: ```sql theme={null} SELECT * FROM confluence_datasource.table_name LIMIT 10; ``` The above example utilizes `confluence_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. ## Supported Tables * `spaces`: The table containing information about the spaces in Confluence. * `pages`: The table containing information about the pages in Confluence. * `blogposts`: The table containing information about the blog posts in Confluence. * `whiteboards`: The table containing information about the whiteboards in Confluence. * `databases`: The table containing information about the databases in Confluence. * `tasks`: The table containing information about the tasks in Confluence.
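For example, you can query the `pages` table listed above:

```sql theme={null}
-- fetch Confluence pages through the integration
SELECT *
FROM confluence_datasource.pages
LIMIT 10;
```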
# Docker Hub Source: https://docs.mindsdb.com/integrations/app-integrations/dockerhub In this section, we present how to connect a Docker Hub repository to MindsDB. [Docker Hub](https://hub.docker.com/) is the world's easiest way to create, manage, and deliver your team's container applications. Data from Docker Hub can be utilized within MindsDB to train models and make predictions about Docker Hub repositories. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Docker Hub to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Docker Hub. ## Connection This handler is implemented using the `requests` library that makes HTTP calls to [https://docs.docker.com/docker-hub/api/latest/#tag/resources](https://docs.docker.com/docker-hub/api/latest/#tag/resources). The required arguments to establish a connection are as follows: * `username`: Username used to log in to DockerHub. * `password`: Password used to log in to DockerHub. Read about creating an account [here](https://hub.docker.com/). Here is how to connect to Docker Hub using MindsDB: ```sql theme={null} CREATE DATABASE dockerhub_datasource WITH ENGINE = 'dockerhub', PARAMETERS = { "username": "username", "password": "password" }; ``` ## Usage Now, you can query Docker Hub as follows: ```sql theme={null} SELECT * FROM dockerhub_datasource.repo_images_summary WHERE namespace="docker" AND repository="trusted-registry-nginx"; ``` Both the `namespace` and `repository` parameters are required in the WHERE clause. # Email Source: https://docs.mindsdb.com/integrations/app-integrations/email In this section, we present how to connect Email accounts to MindsDB. By connecting your email account to MindsDB, you can utilize various AI models available within MindsDB to summarize emails, detect spam, or even automate email replies. Please note that currently you can connect Gmail and Outlook accounts using this integration. ## Connection This handler was implemented using standard Python libraries: `email`, `imaplib`, and `smtplib`. The Email handler is initialized with the following required parameters: * `email` stores an email address used for authentication. * `password` stores a password used for authentication. Additionally, the following optional parameters can be passed: * `smtp_server` used to send emails. Defaults to `smtp.gmail.com`. * `smtp_port` used to send emails. Defaults to `587`. * `imap_server` used to receive emails. Defaults to `imap.gmail.com`. At the moment, the handler has been tested with Gmail and Outlook accounts. To use the handler on a Gmail account, you must create an app password following [this instruction](https://support.google.com/accounts/answer/185833?hl=en) and use its value for the `password` parameter. By default, the Email handler connects to Gmail. If you want to use other email providers, such as Outlook, add the values for the `imap_server` and `smtp_server` parameters. ### Gmail To connect your Gmail account to MindsDB, use the below `CREATE DATABASE` statement: ```sql theme={null} CREATE DATABASE email_datasource WITH ENGINE = 'email', PARAMETERS = { "email": "youremail@gmail.com", "password": "yourpassword" }; ``` It creates a database that comes with the `emails` table. ### Outlook To connect your Outlook account to MindsDB, use the below `CREATE DATABASE` statement: ```sql theme={null} CREATE DATABASE email_datasource WITH ENGINE = 'email', PARAMETERS = { "email": "youremail@outlook.com", "password": "yourpassword", "smtp_server": "smtp.office365.com", "smtp_port": "587", "imap_server": "outlook.office365.com" }; ``` It creates a database that comes with the `emails` table.
## Usage Now you can query for emails like this: ```sql theme={null} SELECT * FROM email_datasource.emails; ``` And you can apply filters like this: ```sql theme={null} SELECT id, body, subject, to_field, from_field, datetime FROM email_datasource.emails WHERE subject = 'MindsDB' ORDER BY id LIMIT 5; ``` Or, write emails like this: ```sql theme={null} INSERT INTO email_datasource.emails(to_field, subject, body) VALUES ("toemail@outlook.com", "MindsDB", "Hello from MindsDB!"); ``` # GitHub Source: https://docs.mindsdb.com/integrations/app-integrations/github In this section, we present how to connect a GitHub repository to MindsDB. [GitHub](https://github.com/) is a web-based platform and service that is primarily used for version control and collaborative software development. It provides a platform for developers and teams to host, review, and manage source code for software projects. Data from GitHub, including issues and PRs, can be utilized within MindsDB to make relevant predictions or automate the issue/PR creation. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect GitHub to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to GitHub. ## Connection This handler is implemented using the `pygithub` library, a Python library that wraps GitHub API v3. The required arguments to establish a connection are as follows: * `repository` is the GitHub repository name. * `api_key` is an optional GitHub API key to use for authentication. * `github_url` is an optional GitHub URL to connect to a GitHub Enterprise instance. Check out [this guide](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) on how to create the GitHub API key. It is recommended to use the API key to avoid the `API rate limit exceeded` error. Here is how to connect to the MindsDB GitHub repository: ```sql theme={null} CREATE DATABASE mindsdb_github WITH ENGINE = 'github', PARAMETERS = { "repository": "mindsdb/mindsdb" }; ``` ## Usage The `mindsdb_github` connection contains two tables: `issues` and `pull_requests`. Here is how to query for all issues: ```sql theme={null} SELECT * FROM mindsdb_github.issues; ``` You can run more advanced queries to fetch specific issues in a defined order: ```sql theme={null} SELECT number, state, creator, assignees, title, labels FROM mindsdb_github.issues WHERE state = 'open' LIMIT 10; ``` And the same goes for pull requests: ```sql theme={null} SELECT number, state, title, creator, head, commits FROM mindsdb_github.pull_requests WHERE state = 'open' LIMIT 10; ``` For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/github_handler/README.md). # GitLab Source: https://docs.mindsdb.com/integrations/app-integrations/gitlab In this section, we present how to connect a GitLab repository to MindsDB. [GitLab](https://about.gitlab.com/) is a DevSecOps platform. Data from GitLab, including issues and MRs, can be utilized within MindsDB to make relevant predictions or automate the issue/MR creation. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1.
Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect GitLab to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to GitLab. ## Connection This handler was implemented using the [python-gitlab](https://github.com/python-gitlab/python-gitlab) library. python-gitlab is a Python library that wraps the GitLab API. The GitLab handler is initialized with the following parameters: * `repository`: a required name of a GitLab repository to connect to. * `api_key`: an optional GitLab API key to use for authentication. Here is how to connect MindsDB to a GitLab repository: ```sql theme={null} CREATE DATABASE mindsdb_gitlab WITH ENGINE = 'gitlab', PARAMETERS = { "repository": "gitlab-org/gitlab", "api_key": "api_key" -- optional GitLab API key }; ``` ## Usage The `mindsdb_gitlab` connection contains two tables: `issues` and `merge_requests`. Now, you can use this established connection to query these tables as follows: ```sql theme={null} SELECT * FROM mindsdb_gitlab.issues; ``` You can run more advanced queries to fetch specific issues in a defined order: ```sql theme={null} SELECT number, state, creator, assignee, title, created, labels FROM mindsdb_gitlab.issues WHERE state="opened" ORDER BY created ASC, creator DESC LIMIT 10; ``` And the same goes for merge requests: ```sql theme={null} SELECT number, state, creator, reviewers, title, created, has_conflicts FROM mindsdb_gitlab.merge_requests WHERE state="merged" ORDER BY created ASC, creator DESC LIMIT 10; ``` For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/gitlab_handler/README.md). # Gmail Source: https://docs.mindsdb.com/integrations/app-integrations/gmail In this section, we present how to connect Gmail accounts to MindsDB. [Gmail](https://gmail.com/) is a widely used and popular email service developed by Google. By connecting your Gmail account to MindsDB, you can utilize various AI models available within MindsDB to summarize emails, detect spam, or even automate email replies. Please note that currently you can connect your Gmail account to a local MindsDB installation by providing a path to the credentials file stored locally. If you want to connect your Gmail account to MindsDB Cloud, you can upload the credentials file, for instance, to your S3 bucket and provide a link to it as a parameter. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Gmail to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Gmail. ## Connection The required arguments to establish a connection are as follows: * `credentials_file` local path to the credentials.json or `credentials_url` in case your file is uploaded to S3. Follow the instructions below to generate the credentials file. * `scopes` define the level of access granted. It is optional and by default it uses '[https://.../gmail.compose](https://.../gmail.compose)' and '[https://.../gmail.readonly](https://.../gmail.readonly)' scopes.
In order to make use of this handler and connect the Gmail app to MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE mindsdb_gmail WITH ENGINE = 'gmail', parameters = { "credentials_file": "mindsdb/integrations/handlers/gmail_handler/credentials.json", "scopes": ['https://.../gmail.compose', 'https://.../gmail.readonly', ...] }; ``` Or, you can also connect by providing the credentials file from an S3 [pre-signed URL](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html). To do this, pass the [pre-signed URL](https://docs.aws.amazon.com/AmazonS3/latest/userguide/ShareObjectPreSignedURL.html) in the credentials\_url parameter. For example: ```sql theme={null} CREATE DATABASE mindsdb_gmail WITH ENGINE = 'gmail', parameters = { "credentials_url": "https://s3.amazonaws.com/your_bucket/credentials.json?response-content-disposition=inline&X-Amz-Security-Token=12312...", -- "scopes": ['SCOPE_1', 'SCOPE_2', ...] -- Optional scopes. By default 'https://.../gmail.compose' & 'https://.../gmail.readonly' scopes are used }; ``` You need a Google account in order to use this integration. Here is how to get the credentials file: 1. Create a Google Cloud Platform (GCP) Project: 1.1 Go to the GCP Console ([https://console.cloud.google.com/](https://console.cloud.google.com/)). 1.2 If you haven't created a project before, you'll be prompted to do so now. 1.3 Give your new project a name. 1.4 Click `Create` to create the new project. 2. Enable the Gmail API: 2.1 In the GCP Console, select your project. 2.2 Navigate to `APIs & Services` > `Library`. 2.3 In the search bar, search for `Gmail`. 2.4 Click on `Gmail API`, then click `Enable`. 3. Create credentials for the Gmail API: 3.1 Navigate to `APIs & Services` > `Credentials`. 3.2 Click on the `Create Credentials` button and choose `OAuth client ID`. 3.3 If you haven't configured the OAuth consent screen before, you'll be prompted to do so now. Make sure to choose `External` for User Type, and select the necessary scopes. Make sure to save the changes. Now, create the OAuth client ID. Choose `Web application` for the Application Type and give it a name. 3.4 Add the following MindsDB URL to `Authorized redirect URIs`: * For local installation, add `http://localhost/verify-auth` * For Cloud, add `http://cloud.mindsdb.com/verify-auth`. 3.5 Click `Create`. 4. Download the JSON file: 4.1 After creating your credentials, click the download button (an icon of an arrow pointing down) on the right side of your client ID. This will download a JSON file, so you will use its location in the `credentials_file` param. ## Usage This creates a database called mindsdb\_gmail. This database ships with a table called `emails` that we can use to search for emails as well as to write emails. Now you can use your Gmail data, like this: * searching for email: ```sql theme={null} SELECT * FROM mindsdb_gmail.emails WHERE query = 'alert from:*@google.com' AND label_ids = "INBOX,UNREAD" LIMIT 20; ``` * writing emails: ```sql theme={null} INSERT INTO mindsdb_gmail.emails (thread_id, message_id, to_email, subject, body) VALUES ('187cbdd861350934d', '8e54ccfd-abd0-756b-a12e-f7bc95ebc75b@Spark', 'test@example2.com', 'Trying out MindsDB', 'This seems awesome. You must try it out whenever you can.'); ``` ## Example 1: Automating Email Replies Now that we know how to pull emails into our database and write emails, we can make use of the OpenAI engine to write email replies.
First, create an OpenAI engine, passing your OpenAI API key: ```sql theme={null} CREATE ML_ENGINE openai_engine FROM openai USING openai_api_key = 'your-openai-api-key'; ``` Then, create a model using this engine: ```sql theme={null} CREATE MODEL mindsdb.gpt_model PREDICT response USING engine = 'openai_engine', max_tokens = 500, api_key = 'your_api_key', model_name = 'gpt-3.5-turbo', prompt_template = 'From input message: {{body}}\ by from_user: {{sender}}\ In less than 500 characters, write an email response to {{sender}} in the following format:\ Start with proper salutation and respond with a short message in a casual tone, and sign the email with my name mindsdb'; ```
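The prompt template expects `body` and `sender` as input columns. To generate replies, you can join the model with the emails, following the same view-then-join pattern as the spam example below. This is a sketch: the `email_inputs` view is a hypothetical helper that assumes the `emails` table exposes `body` and `from_field` columns (as in the Email integration above), so adjust the column names to your schema.

```sql theme={null}
-- hypothetical helper view renaming columns to match the prompt variables
CREATE VIEW mindsdb.email_inputs AS (
    SELECT body, from_field AS sender
    FROM mindsdb_gmail.emails
);

-- generate a draft reply for each email
SELECT t.sender, m.response
FROM mindsdb.email_inputs AS t
JOIN mindsdb.gpt_model AS m
LIMIT 5;
```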
## Example 2: Detecting Spam Emails You can check if an email is spam by using one of the Hugging Face pre-trained models. ```sql theme={null} CREATE MODEL mindsdb.spam_classifier PREDICT PRED USING engine = 'huggingface', task = 'text-classification', model_name = 'mrm8488/bert-tiny-finetuned-sms-spam-detection', input_column = 'text_spammy', labels = ['ham', 'spam']; ``` Then, create a view that contains the snippet or the body of the email. ```sql theme={null} CREATE VIEW mindsdb.emails_text AS ( SELECT snippet AS text_spammy FROM mindsdb_gmail.emails ); ``` Finally, you can use the model to classify emails into spam or ham: ```sql theme={null} SELECT h.PRED, h.PRED_explain, t.text_spammy AS input_text FROM mindsdb.emails_text AS t JOIN mindsdb.spam_classifier AS h; ``` For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/gmail_handler/README.md). # Gong Source: https://docs.mindsdb.com/integrations/app-integrations/gong This documentation describes the integration of MindsDB with [Gong](https://www.gong.io/), a conversation intelligence platform that captures, analyzes, and provides insights from customer conversations. The integration allows MindsDB to access call recordings, transcripts, analytics, and other conversation data from Gong and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). 2. To connect Gong to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies). 3. Obtain a Gong API key from your [Gong API settings page](https://app.gong.io/settings/api-keys). ## Connection Establish a connection to Gong from MindsDB by executing the following SQL command and providing its handler name as an engine. ### Using Bearer Token (Recommended) ```sql theme={null} CREATE DATABASE gong_datasource WITH ENGINE = 'gong', PARAMETERS = { "api_key": "your_gong_api_key_here" }; ``` ### Using Basic Authentication ```sql theme={null} CREATE DATABASE gong_datasource WITH ENGINE = 'gong', PARAMETERS = { "access_key": "your_access_key", "secret_key": "your_secret_key" }; ``` Required connection parameters include the following: **Authentication (choose one method):** * `api_key`: Bearer token for authentication (recommended) * `access_key` + `secret_key`: Basic authentication credentials (alternative method) Optional connection parameters include the following: * `base_url`: Gong API base URL. This parameter defaults to `https://api.gong.io`. * `timeout`: Request timeout in seconds. This parameter defaults to `30`. If both authentication methods are provided, basic auth (`access_key` + `secret_key`) takes precedence. ## Usage The following usage examples utilize `gong_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. ### Available Tables The Gong handler provides access to the following tables: * `calls` - Access call recordings and metadata * `users` - Get user information and permissions * `analytics` - Access AI-generated conversation insights * `transcripts` - Get full conversation transcripts ### Basic Queries Retrieve recent calls with date filters (recommended for best performance): ```sql theme={null} SELECT * FROM gong_datasource.calls WHERE date >= '2024-01-01' AND date < '2024-02-01' ORDER BY date DESC LIMIT 20; ``` Get all users in your organization: ```sql theme={null} SELECT user_id, name, email, role, status FROM gong_datasource.users LIMIT 100; ``` Get analytics for calls with high sentiment scores: ```sql theme={null} SELECT call_id, sentiment_score, key_phrases, topics FROM gong_datasource.analytics WHERE sentiment_score > 0.7 AND date >= '2024-01-01' LIMIT 50; ``` Get transcripts for a specific call: ```sql theme={null} SELECT speaker, timestamp, text FROM gong_datasource.transcripts WHERE call_id = '12345' ORDER BY timestamp; ``` ### Advanced Queries with JOINs Get calls with their sentiment analysis: ```sql theme={null} SELECT c.title, c.date, c.duration, a.sentiment_score, a.key_phrases FROM gong_datasource.calls c JOIN gong_datasource.analytics a ON c.call_id = a.call_id WHERE c.date >= '2024-01-01' AND c.date < '2024-02-01' ORDER BY a.sentiment_score DESC LIMIT 25; ``` Find calls where specific keywords were mentioned: ```sql theme={null} SELECT c.title, c.date, t.speaker, t.text FROM gong_datasource.calls c JOIN gong_datasource.transcripts t ON c.call_id = t.call_id WHERE c.date >= '2024-01-01' AND t.text LIKE '%pricing%' LIMIT 50; ``` Get user performance with call sentiment: ```sql theme={null} SELECT u.name, u.email, c.call_id, c.title, a.sentiment_score FROM gong_datasource.users u JOIN gong_datasource.calls c ON u.user_id = c.user_id JOIN gong_datasource.analytics a ON c.call_id = a.call_id WHERE c.date >= '2024-01-01' AND a.sentiment_score > 0.8 LIMIT 100; ``` ## Data Schema ### calls Table | Column | Description | | --------------- | -------------------------------------------- | | `call_id` | Unique identifier for the call (Primary Key) | | `title` | Call title or description | | `date` | Call date and time (ISO-8601 format) | | `duration` | Call duration in seconds | | `recording_url` | URL to the call recording | | `call_type` | Type of call (e.g., "sales", "demo") | | `user_id` | ID of the user who made the call | | `participants` | Comma-separated list of participants | | `status` | Call status | ### users Table | Column | Description | | ------------- | -------------------------------------------- | | `user_id` | Unique identifier for the user (Primary Key) | | `name` | User's full name | | `email` | User's email address | | `role` | User's role in the organization | | `permissions` | Comma-separated list of user permissions | | `status` | User status | ### analytics Table | Column | Description | | ------------------ | ------------------------------------------------------------------ | | `call_id` | Reference to the call (Primary Key, Foreign Key to calls.call\_id) | | `sentiment_score` | Sentiment analysis score | | `topic_score` | Topic detection score | | `key_phrases` | Comma-separated list of key phrases |
| `topics` | Comma-separated list of detected topics | | `emotions` | Comma-separated list of detected emotions | | `confidence_score` | Confidence score for the analysis | ### transcripts Table | Column | Description | | ------------ | ---------------------------------------------------------- | | `segment_id` | Unique identifier for the transcript segment (Primary Key) | | `call_id` | Reference to the call (Foreign Key to calls.call\_id) | | `speaker` | Name of the speaker | | `timestamp` | Timestamp of the transcript segment (ISO-8601 format) | | `text` | Transcribed text | | `confidence` | Confidence score for the transcription | ## Troubleshooting `Authentication Error` * **Symptoms**: Failure to connect MindsDB with Gong. * **Checklist**: 1. Verify that your Gong API key is valid and not expired. 2. Ensure you have the necessary permissions in Gong to access the API. 3. Check that your API key has access to the specific data you're querying. 4. If using basic authentication, verify both `access_key` and `secret_key` are correct. `Empty Results or Missing Data` * **Symptoms**: Queries return no results or incomplete data. * **Checklist**: 1. Verify that date filters are included in your query (required for calls, analytics, transcripts). 2. Check that the date range includes data (analytics and transcripts have \~1 hour lag). 3. Ensure call\_id exists when querying transcripts for a specific call. 4. Verify that your Gong account has data for the requested time period. `Slow Query Performance` * **Symptoms**: Queries take a long time to execute. * **Checklist**: 1. Add date filters to limit the data range (essential for large datasets). 2. Use LIMIT to restrict the number of results. 3. Filter by call\_id when querying transcripts. 4. Avoid querying transcripts without filters (can return thousands of rows per call). # Google Analytics Source: https://docs.mindsdb.com/integrations/app-integrations/google-analytics In this section, we present how to connect Google Analytics to MindsDB. [Google Analytics](https://analytics.google.com/) is a web analytics service offered by Google that tracks and reports website traffic, as well as mobile app traffic and events. Data from Google Analytics can be utilized within MindsDB to train AI models, make predictions, and automate user engagement and events with AI. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Google Analytics to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Google Analytics. ## Connection The required arguments to establish a connection are as follows: * `credentials_file`: optional, a path to the JSON file that stores credentials to the Google account. * `credentials_json`: optional, the content of the JSON file that stores credentials to the Google account. * `property_id`: required, the property ID of your Google Analytics website. [Here](https://developers.google.com/analytics/devguides/reporting/data/v1/property-id) is some information on how to get the property ID. > ⚠️ One of credentials\_file or credentials\_json must be provided. Please note that a Google account with the Google Analytics Admin API enabled is required.
You can find more information [here](https://developers.google.com/analytics/devguides/config/admin/v1/quickstart-client-libraries).
Also, an active website connected to Google Analytics is required. You can find more information [here](https://support.google.com/analytics/answer/9304153?hl=en).
To make use of this handler and connect the Google Analytics app to MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE my_ga WITH ENGINE = 'google_analytics', PARAMETERS = { 'credentials_file': '\path-to-your-file\credentials.json', 'property_id': '' }; ``` You need a Google account in order to use this integration. Here is how to get the credentials file: 1. Create a Google Cloud Platform (GCP) Project: 1.1 Go to the GCP Console ([https://console.cloud.google.com/](https://console.cloud.google.com/)). 1.2 If you haven't created a project before, you'll be prompted to do so now. 1.3 Give your new project a name. 1.4 Click `Create` to create the new project. 2. Enable the Google Analytics Admin API: 2.1 In the GCP Console, select your project. 2.2 Navigate to `APIs & Services` > `Library`. 2.3 In the search bar, search for `Google Analytics Admin API`. 2.4 Click on `Google Analytics Admin API`, then click `Enable`. 3. Create credentials for the Google Analytics Admin API: 3.1 Navigate to `APIs & Services` > `Credentials`. 3.2 Click on the `Create Credentials` button and choose `Service account`. 3.3 Enter a unique `Service account ID`. 3.4 Click `Done`. 3.5 Copy the service account you created. Find it under `Service Accounts`. 3.6 Now click on the service account you created, and navigate to `KEYS`. 3.7 Click `ADD KEY` > `Create new key`. 3.8 Choose `JSON`, then click `CREATE`. 3.9 After this, the credentials file will be downloaded directly. Locate the file and use its location in the `credentials_file` param. 4. Add Service account to Google Analytics Property: 4.1 In the Google Analytics Admin Console, select the Account or Property to which you want to grant access. 4.2 Navigate to the `Admin` panel. 4.3 Navigate to `Account` > `Account Access Management`. 4.4 Click on the "+" icon to add a new user. 4.5 Enter the service account you copied in step 3.5 as the email address. 4.6 Assign the appropriate permissions to the service account. At a minimum, you'll need to grant it `Edit` permissions. 4.7 Click on the `Add` button to add the service account as a user with the specified permissions. ## Usage This creates a database that comes with the `conversion_events` table. Now you can use your Google Analytics data like this: * searching for conversion events: ```sql theme={null} SELECT event_name, custom, countingMethod FROM my_ga.conversion_events; ``` * creating a conversion event: ```sql theme={null} INSERT INTO my_ga.conversion_events (event_name, countingMethod) VALUES ('mindsdb_event', 2); ``` * updating one conversion event: ```sql theme={null} UPDATE my_ga.conversion_events SET countingMethod = 1 WHERE name = ''; ``` * deleting one conversion event: ```sql theme={null} DELETE FROM my_ga.conversion_events WHERE name = ''; ``` For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/google_analytics_handler). # Google Calendar Source: https://docs.mindsdb.com/integrations/app-integrations/google-calendar In this section, we present how to connect Google Calendar to MindsDB. [Google Calendar](https://calendar.google.com/calendar/) is an online calendar service and application developed by Google. It allows users to create, manage, and share events and appointments, as well as schedule and organize their personal, work, or team activities.
Data from Google Calendar can be utilized within MindsDB to train AI models, make predictions, and automate time management with AI. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Google Calendar to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Google Calendar. ## Connection The required arguments to establish a connection are as follows: * `credentials_file` is a path to the JSON file that stores credentials to the Google account. Please note that a Google account with Google Calendar enabled is required. You can find more information [here](https://developers.google.com/calendar/api/quickstart/python). In order to make use of this handler and connect the Google Calendar app to MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE my_calendar WITH ENGINE = 'google_calendar', PARAMETERS = { 'credentials_file': '\path-to-your-file\credentials.json' }; ``` You need a Google account in order to use this integration. Here is how to get the credentials file: 1. Create a Google Cloud Platform (GCP) Project: 1.1 Go to the GCP Console ([https://console.cloud.google.com/](https://console.cloud.google.com/)). 1.2 If you haven't created a project before, you'll be prompted to do so now. 1.3 Give your new project a name. 1.4 Click `Create` to create the new project. 2. Enable the Google Calendar API: 2.1 In the GCP Console, select your project. 2.2 Navigate to `APIs & Services` > `Library`. 2.3 In the search bar, search for `Google Calendar API`. 2.4 Click on `Google Calendar API`, then click `Enable`. 3. Create credentials for the Google Calendar API: 3.1 Navigate to `APIs & Services` > `Credentials`. 3.2 Click on the `Create Credentials` button and choose `OAuth client ID`. 3.3 If you haven't configured the OAuth consent screen before, you'll be prompted to do so now. Make sure to choose `External` for User Type, and add all the necessary scopes. Make sure to save the changes. Now, create the OAuth client ID. Choose `Desktop app` for the Application Type and give it a name. 3.4 Click `Create`. 4. Download the JSON file: 4.1 After creating your credentials, click the download button (an icon of an arrow pointing down) on the right side of your client ID. This will download a JSON file, so you will use its location in the `credentials_file` param. ## Usage This creates a database that comes with the `events` table.
Now you can use your Google Calendar data, like this: * searching for events: ```sql theme={null} SELECT id, created_at, author_username, text FROM my_calendar.events WHERE start_time = '2023-02-16' AND end_time = '2023-04-09' LIMIT 20; ``` * creating events: ```sql theme={null} INSERT INTO my_calendar.events(start_time, end_time, summary, description, location, attendees, reminders, timeZone) VALUES ('2023-02-16 10:00:00', '2023-02-16 11:00:00', 'MindsDB Meeting', 'Discussing the future of MindsDB', 'MindsDB HQ', '', '', 'Europe/Athens'); ``` * updating one or more events: ```sql theme={null} UPDATE my_calendar.events SET summary = 'MindsDB Meeting', description = 'Discussing the future of MindsDB', location = 'MindsDB HQ', attendees = '', reminders = '' WHERE event_id > 1 AND event_id < 10; -- used to update events in a given range ``` * deleting one or more events: ```sql theme={null} DELETE FROM my_calendar.events WHERE id = '1'; ``` For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/google_calendar_handler/README.md). # Hacker News Source: https://docs.mindsdb.com/integrations/app-integrations/hackernews In this section, we present how to connect Hacker News to MindsDB. [Hacker News](https://news.ycombinator.com/) is an online platform and community for discussions related to technology, startups, computer science, entrepreneurship, and a wide range of other topics of interest to the tech and hacker communities. It was created by Y Combinator, a well-known startup accelerator. Data from Hacker News, including articles and user comments, can be utilized within MindsDB to train AI models and chatbots with the knowledge and discussions shared at Hacker News. ## Connection This handler is implemented using the official Hacker News API. It provides a simple and easy-to-use interface to access the Hacker News API. There are no connection arguments required. In order to make use of this handler and connect Hacker News to MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE my_hackernews WITH ENGINE = 'hackernews'; ``` It creates a database that comes with the `stories` and `comments` tables. ## Usage Now you can query the articles, like this: ```sql theme={null} SELECT * FROM my_hackernews.stories LIMIT 2; ``` And here is how to fetch comments for a specific article: ```sql theme={null} SELECT * FROM my_hackernews.comments WHERE item_id = 35662571 LIMIT 1; ``` # Instatus Source: https://docs.mindsdb.com/integrations/app-integrations/instatus In this section, we present how to connect Instatus to MindsDB. [Instatus](https://instatus.com/) is a cloud-based status page software that enables users to communicate status information using incidents and maintenances. It serves as a SaaS platform for creating status pages for services. The Instatus Handler for MindsDB offers an interface to connect with Instatus via APIs and retrieve status pages. ## Connection Initialize the Instatus handler with the following parameter: * `api_key`: Instatus API key for authentication. Obtain it from [Instatus Developer Dashboard](https://dashboard.instatus.com/developer). Start by creating a database with the new instatus engine using the following SQL command: ```sql theme={null} CREATE DATABASE mindsdb_instatus --- Display name for the database. WITH ENGINE = 'instatus', --- Name of the MindsDB handler. PARAMETERS = { "api_key": "" --- Instatus API key to use for authentication.
}; ``` ## Usage To get a status page, use the `SELECT` statement: ```sql theme={null} SELECT id, name, status, subdomain FROM mindsdb_instatus.status_pages WHERE id = '' LIMIT 10; ``` To create a new status page, use the `INSERT` statement: ```sql theme={null} INSERT INTO mindsdb_instatus.status_pages (email, name, subdomain, components, logoUrl, faviconUrl, websiteUrl, language, useLargeHeader, brandColor, okColor, disruptedColor, degradedColor, downColor, noticeColor, unknownColor, googleAnalytics, subscribeBySms, smsService, twilioSid, twilioToken, twilioSender, nexmoKey, nexmoSecret, nexmoSender, htmlInMeta, htmlAboveHeader, htmlBelowHeader, htmlAboveFooter, htmlBelowFooter, htmlBelowSummary, cssGlobal, launchDate, dateFormat, dateFormatShort, timeFormat) VALUES ('yourname@gmail.com', 'mindsdb', 'mindsdb-instatus', '["Website", "App", "API"]', 'https://instatus.com/sample.png', 'https://instatus.com/favicon-32x32.png', 'https://instatus.com', 'en', true, '#111', '#33B17E', '#FF8C03', '#ECC94B', '#DC123D', '#70808F', '#DFE0E1', 'UA-00000000-1', true, 'twilio', 'YOUR_TWILIO_SID', 'YOUR_TWILIO_TOKEN', 'YOUR_TWILIO_SENDER', null, null, null, null, null, null, null, null, null, null, null, 'MMMMMM d, yyyy', 'MMM yyyy', 'p'); ``` The following fields are required when inserting new status pages: * `email` (e.g. '[yourname@gmail.com](mailto:yourname@gmail.com)') * `name` (e.g. 'mindsdb') * `subdomain` (e.g. 'mindsdb-docs') * `components` (e.g. '\["Website", "App", "API"]') The other fields are optional. To update an existing status page, use the `UPDATE` statement: ```sql theme={null} UPDATE mindsdb_instatus.status_pages SET name = 'mindsdb', status = 'UP', logoUrl = 'https://instatus.com/sample.png', faviconUrl = 'https://instatus.com/favicon-32x32.png', websiteUrl = 'https://instatus.com', language = 'en', translations = '{ "name": { "fr": "nasa" } }' WHERE id = ''; ``` # Intercom Source: https://docs.mindsdb.com/integrations/app-integrations/intercom [Intercom](https://intercom.com) is a software company that provides customer messaging and engagement tools for businesses. They offer products and services for customer support, marketing, and sales, allowing companies to communicate with their customers through various channels like chat, email, and more. ## Connection To get started with the Intercom API, you need to initialize the API handler with the required access token for authentication. You can do this as follows: * `access_token`: Your Intercom access token for authentication. Check out [this guide](https://developers.intercom.com/docs/build-an-integration/learn-more/authentication/) on how to get the Intercom access token in order to access Intercom data. To create a database using the Intercom engine, you can use a SQL-like syntax as shown below: ```sql theme={null} CREATE DATABASE myintercom WITH ENGINE = 'intercom', PARAMETERS = { "access_token" : "your-intercom-access-token" }; ``` ## Usage You can retrieve data from Intercom using a `SELECT` statement. For example: ```sql theme={null} SELECT * FROM myintercom.articles; ``` You can filter data based on specific criteria using a `WHERE` clause. Here's an example: ```sql theme={null} SELECT * FROM myintercom.articles WHERE id = <article_id>; ``` To create a new article in Intercom, you can use the `INSERT` statement.
Here's an example: ```sql theme={null} INSERT INTO myintercom.articles (title, description, body, author_id, state, parent_id, parent_type) VALUES ( 'Thanks for everything', 'Description of the Article', 'Body of the Article', 6840572, 'published', 6801839, 'collection' ); ``` You can update existing records in Intercom using the `UPDATE` statement. For instance: ```sql theme={null} UPDATE myintercom.articles SET title = 'Christmas is here!', body = 'New gifts in store for the jolly season' WHERE id = <article_id>; ``` # Jira Source: https://docs.mindsdb.com/integrations/app-integrations/jira This documentation describes the integration of MindsDB with [Jira](https://www.atlassian.com/software/jira/guides/getting-started/introduction), the #1 agile project management tool used by teams to plan, track, release and support world-class software with confidence. The integration allows MindsDB to access data from Jira and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). 2. To connect Jira to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to Jira from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/jira_handler) as an engine. ```sql theme={null} CREATE DATABASE jira_datasource WITH ENGINE = 'jira', PARAMETERS = { "url": "https://example.atlassian.net", "username": "john.doe@example.com", "api_token": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6" }; ``` Required connection parameters include the following: * `url`: The base URL for your Jira instance/server. * `username`: The email address associated with your Jira account. * `api_token`: The API token generated for your Jira account. * `cloud`: (Optional) Set to `true` for Jira Cloud or `false` for Jira Server. Defaults to `true`. Refer to this [guide](https://support.atlassian.com/atlassian-account/docs/manage-api-tokens-for-your-atlassian-account/) for instructions on how to create API tokens for your account. ## Usage Retrieve data from a specified table by providing the integration and table names: ```sql theme={null} SELECT * FROM jira_datasource.table_name LIMIT 10; ``` The above example utilizes `jira_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.
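For instance, a query against an `issues` table might look as follows. This is a sketch: the `issues` table name is an assumption here, so confirm the exact table names in the [Jira handler](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/jira_handler) documentation.

```sql theme={null}
-- assumes the Jira handler exposes an `issues` table; table names may differ
SELECT *
FROM jira_datasource.issues
LIMIT 10;
```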
# MediaWiki Source: https://docs.mindsdb.com/integrations/app-integrations/mediawiki In this section, we present how to connect MediaWiki to MindsDB. [MediaWiki](https://www.mediawiki.org/wiki/MediaWiki) is a free and open-source wiki software platform that is designed to enable the creation and management of wikis. It was originally developed for and continues to power Wikipedia. MediaWiki is highly customizable and can be used to create a wide range of collaborative websites and knowledge bases. Data from MediaWiki can be utilized within MindsDB to train AI models and chatbots using the wide range of available information. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect MediaWiki to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to MediaWiki. ## Connection This handler was implemented using [MediaWikiAPI](https://github.com/lehinevych/MediaWikiAPI), the Python wrapper for the MediaWiki API. There are no connection arguments required to initialize the handler. To connect the MediaWiki API to MindsDB, the following CREATE DATABASE statement can be used: ```sql theme={null} CREATE DATABASE mediawiki_datasource WITH ENGINE = 'mediawiki'; ``` ## Usage Now, you can query the MediaWiki API as follows: ```sql theme={null} SELECT * FROM mediawiki_datasource.pages; ``` You can run more advanced queries to fetch specific pages in a defined order: ```sql theme={null} SELECT * FROM mediawiki_datasource.pages WHERE title = 'Barack' ORDER BY pageid LIMIT 5; ``` # Microsoft One Drive Source: https://docs.mindsdb.com/integrations/app-integrations/microsoft-onedrive This documentation describes the integration of MindsDB with [Microsoft OneDrive](https://www.microsoft.com/en-us/microsoft-365/onedrive/online-cloud-storage), a cloud storage service that lets you back up, access, edit, share, and sync your files from any device. ## Prerequisites 1. Before proceeding, ensure that MindsDB is installed locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. Register an application in the [Azure portal](https://portal.azure.com/). * Navigate to the [Azure Portal](https://portal.azure.com/#home) and sign in with your Microsoft account. * Locate the **Microsoft Entra ID** service and click on it. * Click on **App registrations** and then click on **New registration**. * Enter a name for your application and select the `Accounts in this organizational directory only` option for the **Supported account types** field. * Keep the **Redirect URI** field empty and click on **Register**. * Click on **API permissions** and then click on **Add a permission**. * Select **Microsoft Graph** and then click on **Delegated permissions**. * Search for the `Files.Read` permission and select it. * Click on **Add permissions**. * Request an administrator to grant consent for the above permissions. If you are the administrator, click on **Grant admin consent for \[your organization]** and then click on **Yes**. * Copy the **Application (client) ID** and record it as the `client_id` parameter, and copy the **Directory (tenant) ID** and record it as the `tenant_id` parameter. * Click on **Certificates & secrets** and then click on **New client secret**. * Enter a description for your client secret and select an expiration period. * Click on **Add** and copy the generated client secret and record it as the `client_secret` parameter. * Click on **Authentication** and then click on **Add a platform**. * Select **Web** and enter the URL where MindsDB is deployed, followed by `/verify-auth`, in the **Redirect URIs** field. For example, if you are running MindsDB locally (on `https://localhost:47334`), enter `https://localhost:47334/verify-auth` in the **Redirect URIs** field. ## Connection Establish a connection to Microsoft OneDrive from MindsDB by executing the following SQL command: ```sql theme={null} CREATE DATABASE one_drive_datasource WITH engine = 'one_drive', parameters = { "client_id": "12345678-90ab-cdef-1234-567890abcdef", "client_secret": "abcd1234efgh5678ijkl9012mnop3456qrst7890uvwx", "tenant_id": "abcdef12-3456-7890-abcd-ef1234567890" }; ``` Note that sample parameter values are provided here for reference, and you should replace them with your connection parameters. Required connection parameters include the following: * `client_id`: The client ID of the registered application. * `client_secret`: The client secret of the registered application. * `tenant_id`: The tenant ID of the registered application.
## Usage Retrieve data from a specified file in Microsoft OneDrive by providing the integration name and the file name: ```sql theme={null} SELECT * FROM one_drive_datasource.`my-file.csv` LIMIT 10; ``` Wrap the object key in backticks (\`) to avoid any issues parsing the SQL statements provided. This is especially important when the file name contains spaces, special characters or prefixes, such as `my-folder/my-file.csv`. At the moment, the supported file formats are CSV, TSV, JSON, and Parquet. The above examples utilize `one_drive_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. The special `files` table can be used to list the files available in Microsoft OneDrive: ```sql theme={null} SELECT * FROM one_drive_datasource.files LIMIT 10; ``` The content of files can also be retrieved by explicitly requesting the `content` column. This column is empty by default to avoid unnecessary data transfer: ```sql theme={null} SELECT path, content FROM one_drive_datasource.files LIMIT 10; ``` This table will return all objects regardless of the file format; however, only the supported file formats mentioned above can be queried. ## Troubleshooting Guide `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with Microsoft OneDrive. * **Checklist**: 1. Ensure the `client_id`, `client_secret` and `tenant_id` parameters are correctly provided. 2. Ensure the registered application has the required permissions. 3. Ensure the generated client secret is not expired. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing object names containing spaces, special characters or prefixes. * **Checklist**: 1. Ensure object names with spaces, special characters or prefixes are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel/travel\_data.csv * Incorrect: SELECT \* FROM integration.'travel/travel\_data.csv' * Correct: SELECT \* FROM integration.\`travel/travel\_data.csv\` # Microsoft Teams Source: https://docs.mindsdb.com/integrations/app-integrations/microsoft-teams This documentation describes the integration of MindsDB with [Microsoft Teams](https://www.microsoft.com/en-us/microsoft-teams/group-chat-software), the ultimate messaging app for your organization. The integration allows MindsDB to access data from Microsoft Teams and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Microsoft Teams to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to Microsoft Teams from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/ms_teams_handler) as an engine. ```sql theme={null} CREATE DATABASE teams_datasource WITH ENGINE = 'teams', PARAMETERS = { "client_id": "12345678-90ab-cdef-1234-567890abcdef", "client_secret": "a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6", "tenant_id": "abcdef12-3456-7890-abcd-ef1234567890" }; ``` Required connection parameters include the following: * `client_id`: The client ID of the registered Microsoft Entra ID application. * `client_secret`: The client secret of the registered Microsoft Entra ID application.
* `tenant_id`: The tenant ID of the Microsoft Entra ID directory. Optional connection parameters include the following: * `permission_mode`: The type of permissions used to access data in Microsoft Teams. Can be either `delegated` (default) or `application`. The `delegated` permission mode requires user sign-in and allows the app to access data on behalf of the signed-in user. The `application` permission mode does not require user sign-in and allows the app to access data without a user context. You can learn more about permission types in the [Microsoft Graph permissions documentation](https://learn.microsoft.com/en-us/graph/auth/auth-concepts#delegated-and-application-permissions). Note that application permissions generally require higher privileges and admin consent compared to delegated permissions, as they allow broader access to organizational data without user context. Microsoft Entra ID was previously known as Azure Active Directory (Azure AD). ### How to set up the Microsoft Entra ID app registration Follow the instructions below to set up the Microsoft Teams app that will be used to connect with MindsDB. * Navigate to Microsoft Entra ID in the Azure portal, click on *Add* and then on *App registration*. * Click on *New registration* and fill out the *Name* and select the `Accounts in any organizational directory (Any Azure AD directory - Multitenant)` option under *Supported account types*. * If you chose the `application` permission mode, you may skip this step, but if you are using `delegated` permissions, select `Web` as the platform and enter the URL where MindsDB is deployed, followed by `/verify-auth`, under *Redirect URI*. For example, if you are running MindsDB locally (on [https://localhost:47334](https://localhost:47334)), enter [https://localhost:47334/verify-auth](https://localhost:47334/verify-auth) in the Redirect URIs field. * Click on *Register*. **Save the *Application (client) ID* and *Directory (tenant) ID* for later use.** * Click on *API Permissions* and then click on *Add a permission*. * Select *Microsoft Graph* and then click on either *Delegated permissions* or *Application permissions* based on the permission mode you have chosen. * Search for the following permissions and select them: * `delegated` permission mode: * Team.ReadBasic.All * Channel.ReadBasic.All * ChannelMessage.Read.All * Chat.Read * `application` permission mode: * Group.Read.All * ChannelMessage.Read.All * Chat.Read.All * Click on **Add permissions**. * Request an administrator to grant consent for the above permissions. If you are the administrator, click on **Grant admin consent for \[your organization]** and then click on **Yes**. * Click on *Certificates & secrets* under *Manage*. * Click on *New client secret* and fill out the *Description* and select an appropriate *Expires* period, and click on *Add*. * Copy and **save the client secret in a secure location.** If you already have an existing app registration, you can use it instead of creating a new one and skip the above steps. * Open the MindsDB editor and create a connection to Microsoft Teams using the client ID, client secret and tenant ID obtained in the previous steps. Use the `CREATE DATABASE` statement as shown above. ## Usage Retrieve data from a specified table by providing the integration and table names: ```sql theme={null} SELECT * FROM teams_datasource.table_name LIMIT 10; ``` The above example utilizes `teams_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.
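For instance, you can query the `channels` table, one of the supported tables listed below:

```sql theme={null}
-- list channels available in the connected Microsoft Teams organization
SELECT *
FROM teams_datasource.channels
LIMIT 10;
```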
## Supported Tables

* `teams`: The table containing information about the teams in Microsoft Teams.
* `channels`: The table containing information about the channels in Microsoft Teams.
* `channel_messages`: The table containing information about messages from channels in Microsoft Teams.
* `chats`: The table containing information about the chats in Microsoft Teams.
* `chat_messages`: The table containing information about messages from chats in Microsoft Teams.

# News API

Source: https://docs.mindsdb.com/integrations/app-integrations/newsapi

In this section, we present how to connect News API to MindsDB.

[News API](https://newsapi.org/) is a simple HTTP REST API for searching and retrieving live articles from all over the web. Data from News API can be utilized within MindsDB for model training and predictions.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect News API to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to News API.

## Connection

This handler is implemented using the [newsapi-python](https://newsapi.org/docs/client-libraries/python) library.

The required arguments to establish a connection are as follows:

* `api_key`: News API key to use for authentication.

Check out [this guide](https://newsapi.org/docs/authentication) on how to create the API key. It is recommended to use the API key to avoid the `API rate limit exceeded` error.

Here is how to connect News API to MindsDB:

```sql theme={null}
CREATE DATABASE newsAPI
WITH ENGINE = 'newsapi'
PARAMETERS = {
  "api_key": "Your api key"
};
```

## Usage

Simple search for recent articles:

```sql theme={null}
SELECT *
FROM newsAPI.article
WHERE query = 'Python';
```

Advanced search for recent articles from specific sources between dates:

```sql theme={null}
SELECT *
FROM newsAPI.article
WHERE query = 'Python'
  AND sources = 'bbc-news'
  AND publishedAt >= '2021-03-23'
  AND publishedAt <= '2023-04-23'
LIMIT 4;
```

For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/newsapi_handler/README.md).

# PayPal

Source: https://docs.mindsdb.com/integrations/app-integrations/paypal

In this section, we present how to connect PayPal to MindsDB.

[PayPal](https://www.bankrate.com/finance/credit-cards/guide-to-using-paypal/) is an online payment system that makes paying for things online and sending and receiving money safe and secure.

Data from PayPal can be utilized within MindsDB to train models and make predictions about your transactions.

## Connection

This handler is implemented using [PayPal-Python-SDK](https://github.com/paypal/PayPal-Python-SDK), the Python SDK for PayPal RESTful APIs.

The required arguments to establish a connection are as follows:

* `mode`: The mode of the PayPal API. Can be `sandbox` or `live`.
* `client_id`: The client ID of the PayPal API.
* `client_secret`: The client secret of the PayPal API.
To connect to PayPal using MindsDB, the following CREATE DATABASE statement can be used:

```sql theme={null}
CREATE DATABASE paypal_datasource
WITH ENGINE = 'paypal',
PARAMETERS = {
  "mode": "your-paypal-mode",
  "client_id": "your-paypal-client-id",
  "client_secret": "your-paypal-client-secret"
};
```

Check out [this guide](https://developer.paypal.com/api/rest/) on how to create client credentials for PayPal.

## Usage

Now, you can query PayPal as follows:

Payments:

```sql theme={null}
SELECT * FROM paypal_datasource.payments
```

Invoices:

```sql theme={null}
SELECT * FROM paypal_datasource.invoices
```

Subscriptions:

```sql theme={null}
SELECT * FROM paypal_datasource.subscriptions
```

Orders:

```sql theme={null}
SELECT * FROM paypal_datasource.orders
```

Payouts:

```sql theme={null}
SELECT * FROM paypal_datasource.payouts
```

You can also run more advanced queries on your data:

Payments:

```sql theme={null}
SELECT intent, cart
FROM paypal_datasource.payments
WHERE state = 'approved'
ORDER BY id
LIMIT 5
```

Invoices:

```sql theme={null}
SELECT invoice_number, total_amount
FROM paypal_datasource.invoices
WHERE status = 'PAID'
ORDER BY total_amount DESC
LIMIT 5
```

Subscriptions:

```sql theme={null}
SELECT id, state, name
FROM paypal_datasource.subscriptions
WHERE state = 'CREATED'
LIMIT 5
```

Orders:

```sql theme={null}
SELECT id, state, amount
FROM paypal_datasource.orders
WHERE state = 'APPROVED'
ORDER BY amount DESC
LIMIT 5
```

Payouts:

```sql theme={null}
SELECT payout_batch_id, amount_currency, amount_value
FROM paypal_datasource.payouts
ORDER BY amount_value DESC
LIMIT 5
```

## Supported Tables

The following tables are supported by the PayPal handler:

* `payments`: payments made.
* `invoices`: invoices created.
* `subscriptions`: subscriptions created.
* `orders`: orders created.
* `payouts`: payouts made.

# Plaid

Source: https://docs.mindsdb.com/integrations/app-integrations/plaid

In this section, we present how to connect Plaid to MindsDB.

[Plaid](https://plaid.com/) is a financial technology company that offers a platform and a set of APIs that facilitate the integration of financial services and data into applications and websites. Its services primarily focus on enabling developers to connect with and access financial accounts and data from various financial institutions.

Data from Plaid can be utilized within MindsDB to train AI models and make financial forecasts.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Plaid to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Plaid.

## Connection

The required arguments to establish a connection are as follows:

* `client_id`
* `secret`
* `access_token`
* `plaid_env`

You can get the `client_id`, `secret`, and `access_token` values [here](https://dashboard.plaid.com/team/keys) once you sign in to your Plaid account. And [here](https://plaid.com/docs/api/items/#itempublic_tokenexchange) is how you generate the `access_token` value.
In order to make use of this handler and connect the Plaid app to MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE my_plaid
WITH ENGINE = 'plaid',
PARAMETERS = {
  "client_id": "YOUR_CLIENT_ID",
  "secret": "YOUR_SECRET",
  "access_token": "YOUR_ACCESS_TOKEN",
  "plaid_env": "ENV"
};
```

It creates a database that comes with two tables: `transactions` and `balance`.

## Usage

Now you can query your data, like this:

```sql theme={null}
SELECT id, merchant_name, authorized_date, amount, payment_channel
FROM my_plaid.transactions
WHERE start_date = '2022-01-01'
  AND end_date = '2023-04-11'
LIMIT 20;
```

And if you want to use functions provided by the Plaid API, you can use the native queries syntax, like this:

```sql theme={null}
SELECT * FROM my_plaid (
  get_transactions(
    start_date = '2022-01-01',
    end_date = '2022-02-01'
  )
);
```

For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/plaid_handler/README.md).

# PyPI

Source: https://docs.mindsdb.com/integrations/app-integrations/pypi

In this section, we present how to connect PyPI to MindsDB.

[PyPI](https://pypi.org) is a host for maintaining and storing Python packages. It's a good place for publishing your Python packages in different versions and releases. Data from PyPI can be utilized within MindsDB to train models and make predictions about your Python packages.

## Connection

This handler is implemented using the standard Python `requests` library. It is used to connect to the RESTful service that [pypistats.org](https://pypistats.org) is serving.

There are no connection arguments required to initialize the handler.

To connect to PyPI using MindsDB, the following CREATE DATABASE statement can be used:

```sql theme={null}
CREATE DATABASE pypi_datasource
WITH ENGINE = 'pypi';
```

## Usage

Now, you can use the following queries to view the statistics for Python packages (MindsDB, for example):

Overall downloads, including mirrors:

```sql theme={null}
SELECT *
FROM pypi_datasource.overall
WHERE package="mindsdb" AND mirrors=true;
```

Overall downloads on CPython==2.7:

```sql theme={null}
SELECT *
FROM pypi_datasource.python_minor
WHERE package="mindsdb" AND version="2.7";
```

Recent downloads:

```sql theme={null}
SELECT *
FROM pypi_datasource.recent
WHERE package="mindsdb";
```

Recent downloads in the last day:

```sql theme={null}
SELECT *
FROM pypi_datasource.recent
WHERE package="mindsdb" AND period="day";
```

All downloads on Linux-based distributions:

```sql theme={null}
SELECT date, downloads
FROM pypi_datasource.system
WHERE package="mindsdb" AND os="Linux";
```

Each table takes a required `package` argument in the WHERE clause, which is the name of the package you want to query.

## Supported Tables

The following tables are supported by the PyPI handler:

* `overall`: daily download quantities for packages.
* `recent`: recent download quantities for packages.
* `python_major`: daily download quantities for packages, grouped by Python major version.
* `python_minor`: daily download quantities for packages, grouped by Python minor version.
* `system`: daily download quantities for packages, grouped by operating system.

# Reddit

Source: https://docs.mindsdb.com/integrations/app-integrations/reddit

In this section, we present how to connect Reddit to MindsDB.
[Reddit](https://www.reddit.com/) is a social media platform and online community where registered users can engage in discussions, share content, and participate in various communities called subreddits. Data from Reddit can be utilized within MindsDB to train AI models and chatbots. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Reddit to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Reddit. ## Connection This handler is implemented using the [PRAW (Python Reddit API Wrapper)](https://praw.readthedocs.io/en/latest/) library, which is a Python package that provides a simple and easy-to-use interface to access the Reddit API. The required arguments to establish a connection are as follows: * `client_id` is a Reddit API client ID. * `client_secret` is a Reddit API client secret. * `user_agent` is a user agent string to identify your application. Here is how to get your Reddit credentials: 1. Go to Reddit App Preferences at [https://www.reddit.com/prefs/apps](https://www.reddit.com/prefs/apps) or [https://old.reddit.com/prefs/apps/](https://old.reddit.com/prefs/apps/) 2. Scroll down to the bottom of the page and click *Create another app...* 3. Fill out the form with the name, description, and redirect URL for your app, then click *Create app* 4. Now you should be able to see the personal user script, secret, and name of your app. Store those as environment variables: `CLIENT_ID`, `CLIENT_SECRET`, and `USER_AGENT`, respectively. In order to make use of this handler and connect the Reddit app to MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE my_reddit WITH ENGINE = 'reddit', PARAMETERS = { "client_id": "YOUR_CLIENT_ID", "client_secret": "YOUR_CLIENT_SECRET", "user_agent": "YOUR_USER_AGENT" }; ``` It creates a database that comes with two tables: `submission` and `comment`. ## Usage Now you can fetch data from Reddit, like this: ```sql theme={null} SELECT * FROM my_reddit.submission WHERE subreddit = 'MachineLearning' AND sort_type = 'top' -- specifies the sorting type for the subreddit (possible values include 'hot', 'new', 'top', 'controversial', 'gilded', 'wiki', 'mod', 'rising') AND items = 5; -- specifies the number of items to fetch from the subreddit ``` You can also fetch comments for a particular post/submission, like this: ```sql theme={null} SELECT * FROM my_reddit.comment WHERE submission_id = '12gls93' ``` For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/reddit_handler/README.md). # Salesforce Source: https://docs.mindsdb.com/integrations/app-integrations/salesforce This documentation describes the integration of MindsDB with [Salesforce](https://www.salesforce.com/), the world’s most trusted customer relationship management (CRM) platform. The integration allows MindsDB to access data from Salesforce and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). 2. 
To connect Salesforce to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to Salesforce from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/salesforce_handler) as an engine. ```sql theme={null} CREATE DATABASE salesforce_datasource WITH ENGINE = 'salesforce', PARAMETERS = { "username": "demo@example.com", "password": "demo_password", "client_id": "3MVG9lKcPoNINVBIPJjdw1J9LLM82HnZz9Yh7ZJnY", "client_secret": "5A52C1A1E21DF9012IODC9ISNXXAADDA9" }; ``` Required connection parameters include the following: * `username`: The username for the Salesforce account. * `password`: The password for the Salesforce account. * `client_id`: The client ID (consumer key) from a connected app in Salesforce. * `client_secret`: The client secret (consumer secret) from a connected app in Salesforce. Optional connection parameters include the following: * `is_sandbox`: The setting to indicate whether to connect to a Salesforce sandbox environment (`true`) or production environment (`false`). This parameter defaults to `false`. To create a connected app in Salesforce and obtain the client ID and client secret, follow the steps given below: 1. Log in to your Salesforce account. 2. Go to `Settings` > `Open Advanced Setup` > `Apps` > `App Manager`. 3. Click `New Connected App`, select `Create a Connected App` and click `Continue`. 4. Fill in the required details, i.e., `Connected App Name`, `API Name` and `Contact Phone`. 5. Select the `Enable OAuth Settings` checkbox, set the `Callback URL` to wherever MindsDB is deployed followed by `/verify-auth` (e.g., `http://localhost:47334/verify-auth`), and choose the following OAuth scopes: * Manage user data via APIs (api) * Perform requests at any time (refresh\_token, offline\_access) 6. Click `Save` and then `Continue`. 7. Click on `Manage Consumer Details` under `API (Enable OAuth Settings)`, and copy the Consumer Key (client ID) and Consumer Secret (client secret). 8. Click on `Back to Manage Connected Apps` and then `Manage`. 9. Click `Edit Policies`. 10. Under `OAuth Policies`, configure the `Permitted Users` and `IP Relaxation` settings according to your security policies. For example, to enable all users to access the app without enforcing any IP restrictions, select `All users may self-authorize` and `Relax IP restrictions` respectively. Leave the `Refresh Token Policy` set to `Refresh token is valid until revoked`. 11. Click `Save`. 12. Go to `Identity` > `OAuth and OpenID Connect Settings`. 13. Ensure that the `Allow OAuth Username-Password Flows` checkbox is checked. ## Usage Retrieve data from a specified table by providing the integration and table names: ```sql theme={null} SELECT * FROM salesforce_datasource.table_name LIMIT 10; ``` Run [SOQL](https://developer.salesforce.com/docs/atlas.en-us.soql_sosl.meta/soql_sosl/sforce_api_calls_soql.htm) queries directly on the connected Salesforce account: ```sql theme={null} SELECT * FROM salesforce_datasource ( --Native Query Goes Here SELECT Name, Account.Name, Account.Industry FROM Contact WHERE Account.Industry = 'Technology' LIMIT 5 ); ``` The above examples utilize `salesforce_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. 
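To see which tables the connection exposes, you can run `SHOW TABLES FROM salesforce_datasource;`. The sketch below is illustrative only: it assumes the standard `Account` object is exposed as an `account` table with `Id`, `Name`, and `Industry` columns, so adjust the names to match the objects available in your org:

```sql theme={null}
-- Illustrative sketch: the table and column names are assumptions,
-- not guaranteed names; check SHOW TABLES for your connection.
SELECT Id, Name, Industry
FROM salesforce_datasource.account
WHERE Industry = 'Technology'
LIMIT 5;
```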
## Salesforce Table Filtering

We have implemented a filtering logic to exclude tables that are generally not useful for direct business queries, which fall into the following categories:

* System and Auditing Tables: We exclude tables that track field history, record sharing rules, and data change events (e.g., objects ending in History, Share, or ChangeEvent). These are important for system administration but not for typical business analysis.
* Configuration and Metadata: We remove tables that define the structure and configuration of Salesforce itself. This includes objects related to user permissions, internal rules, platform settings, and data definitions (e.g., FieldDefinition, PermissionSet, AssignmentRule).
* Feature-Specific Technical Objects: Tables that support specific backend Salesforce features are excluded. This includes objects related to:
  * AI and Einstein: (AI...)
  * Developer Components: (Apex..., Aura...)
  * Data Privacy and Consent: (objects ending in Consent or containing Policy)
  * Chatter and Collaboration Feeds: (...Feed, Collaboration...)
* Archived or Legacy Objects: Older objects that have been replaced by modern equivalents, such as ContentWorkspace, are also excluded to simplify the list.

# Sendinblue

Source: https://docs.mindsdb.com/integrations/app-integrations/sendinblue

In this section, we present how to connect Sendinblue to MindsDB.

[Brevo (formerly Sendinblue)](https://www.brevo.com/) is an all-in-one platform to automate your marketing campaigns over Email, SMS, WhatsApp or chat. Data from Sendinblue can be used to understand the impact of email marketing.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Sendinblue to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Sendinblue.

## Connection

This handler is implemented using the [sib-api-v3-sdk](https://github.com/sendinblue/APIv3-python-library) library, a Python library that wraps Sendinblue APIs.

The required arguments to establish a connection are as follows:

* `api_key`: a required Sendinblue API key to use for authentication.

Check out [this guide](https://developers.brevo.com/docs) on how to create the Sendinblue API key. It is recommended to use the API key to avoid the `API rate limit exceeded` error.

Here is how to connect Sendinblue to MindsDB:

```sql theme={null}
CREATE DATABASE sib_datasource
WITH ENGINE = 'sendinblue',
PARAMETERS = {
  "api_key": "xkeysib-..."
};
```

## Usage

Use the established connection to query your database:

```sql theme={null}
SELECT * FROM sib_datasource.email_campaigns
```

Run more advanced queries:

```sql theme={null}
SELECT id, name
FROM sib_datasource.email_campaigns
WHERE status = 'sent'
ORDER BY name
LIMIT 5
```

For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/sendinblue_handler/README.md).

# Shopify

Source: https://docs.mindsdb.com/integrations/app-integrations/shopify

In this section, we present how to connect Shopify to MindsDB.

[Shopify](https://www.shopify.com/) is an e-commerce platform that enables businesses to create and manage online stores.
It is one of the leading e-commerce solutions, providing a wide range of tools and services to help entrepreneurs and businesses sell products and services online.

Data from Shopify can be utilized within MindsDB to train AI models and chatbots using Products, Customers and Orders data, and make predictions relevant for businesses.

## Connection

The required arguments to establish a connection are as follows:

* `shop_url`: a required URL to your Shopify store.
* `access_token`: a required access token to use for authentication.

Here is how you can [create a Shopify access token](https://www.youtube.com/watch?v=4f_aiC5oTNc\&t=302s).

Optionally, if you want to access customer reviews, provide the following parameters:

* `yotpo_app_key`: a token needed to access customer reviews via the Yotpo Product Reviews app.
* `yotpo_access_token`: a token needed to access customer reviews via the Yotpo Product Reviews app.

If you want to query customer reviews, use the [Yotpo Product Reviews](https://apps.shopify.com/yotpo-social-reviews) app available in Shopify. Here are the steps to follow:

1. Install the [Yotpo Product Reviews](https://apps.shopify.com/yotpo-social-reviews) app for your Shopify store.
2. Generate `yotpo_app_key` following [this instruction](https://support.yotpo.com/docs/finding-your-yotpo-app-key-and-secret-key) for retrieving your app key. Learn more about [Yotpo authentication here](https://apidocs.yotpo.com/reference/yotpo-authentication).
3. Generate `yotpo_access_token` following [this instruction](https://develop.yotpo.com/reference/generate-a-token).

To connect your Shopify account to MindsDB, you must first create a new handler instance. You can do so with the following query:

```sql theme={null}
CREATE DATABASE shopify_datasource
WITH ENGINE = 'shopify',
PARAMETERS = {
  "shop_url": "your-shop-name.myshopify.com",
  "access_token": "shppa_..."
};
```

## Usage

Once you have created the database, you can query the following tables:

* Products table
* Customers table
* Orders table
* CustomerReviews table (requires the [Yotpo Product Reviews](https://apps.shopify.com/yotpo-social-reviews) app to be installed in your Shopify account)
* InventoryLevel table
* Location table
* CarrierService table
* ShippingZone table
* SalesChannel table

### Products table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.products;
```

Also, you can run more advanced queries and filter products by status, like this:

```sql theme={null}
SELECT id, title
FROM shopify_datasource.products
WHERE status = 'active'
ORDER BY id
LIMIT 5;
```

To insert new data, run the `INSERT INTO` statement, providing the following values: `title`, `body_html`, `vendor`, `product_type`, `tags`, `status`.

To update existing data, run the `UPDATE` statement. To delete data, run the `DELETE` statement.

### Customers table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.customers;
```

To insert new data, run this statement:

```sql theme={null}
INSERT INTO shopify_datasource.customers(first_name, last_name, email, phone)
VALUES ('John', 'Doe', 'john.doe@example.com', '+10001112222');
```

To update existing data, run the `UPDATE` statement. To delete data, run the `DELETE` statement.

### Orders table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.orders;
```

To insert new data, run the `INSERT INTO` statement. To update existing data, run the `UPDATE` statement (see the sketch below). To delete data, run the `DELETE` statement.
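The statement below sketches the update flow mentioned above; the order `id` and the `note` column are illustrative assumptions, so substitute values from your own store:

```sql theme={null}
-- Illustrative sketch: the id and the note column are assumed values.
UPDATE shopify_datasource.orders
SET note = 'Priority shipping requested'
WHERE id = 5899662344;
```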
### CustomerReviews table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.customer_reviews;
```

### InventoryLevel table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.inventory_level;
```

### Location table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.locations;
```

### CarrierService table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.carrier_service;
```

To insert new data, run the `INSERT INTO` statement, providing the following values: `name`, `callback_url`, `service_discovery`.

To update existing data, run the `UPDATE` statement. To delete data, run the `DELETE` statement.

### ShippingZone table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.shipping_zone;
```

### SalesChannel table

You can query this table as below:

```sql theme={null}
SELECT * FROM shopify_datasource.sales_channel;
```

For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/shopify_handler/README.md).

# Slack

Source: https://docs.mindsdb.com/integrations/app-integrations/slack

This documentation describes the integration of MindsDB with [Slack](https://slack.com/), a cloud-based collaboration platform. The integration allows MindsDB to access data from Slack and enhance Slack with AI capabilities.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Slack to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Slack.

## Connection

Establish a connection to Slack from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/slack_handler) as an engine.

```sql theme={null}
CREATE DATABASE slack_datasource
WITH ENGINE = 'slack',
PARAMETERS = {
  "token": "values",     -- required parameter
  "app_token": "values"  -- optional parameter
};
```

The Slack handler is initialized with the following parameters:

* `token` is a Slack bot token to use for authentication.
* `app_token` is a Slack app token to use for authentication.

Please note that `app_token` is an optional parameter. If you do not provide it, you need to integrate the app into a Slack channel, as described in Method 2 below.

### Method 1: Chatbot responds in direct messages to a Slack app

One way to connect Slack is to use both bot and app tokens. By following the instructions below, you'll set up the Slack app and be able to message this Slack app directly to chat with the bot.

If you want to use Slack in the [`CREATE CHATBOT`](/agents/chatbot) syntax, use this method of connecting Slack to MindsDB.

Here is how to set up a Slack app and generate both a Slack bot token and a Slack app token:

1. Follow [this link](https://api.slack.com/apps) and sign in with your Slack account.
2. Create a new app `From scratch` or select an existing app.
   * Please note that the following instructions support apps created `From scratch`.
   * For apps created `From an app manifest`, please follow the [Slack docs here](https://api.slack.com/reference/manifests).
3. Go to *Basic Information* under *Settings*.
   * Under *App-Level Tokens*, click on *Generate Token and Scopes*.
   * Name the token `socket` and add the `connections:write` scope.
   * **Copy and save the `xapp-...` token - you'll need it to publish the chatbot.**
4. Go to *Socket Mode* under *Settings* and toggle the button to *Enable Socket Mode*.
5. Go to *OAuth & Permissions* under *Features*.
   * Add the following *Bot Token Scopes*:
     * app\_mentions:read
     * channels:history
     * channels:read
     * chat:write
     * groups:history
     * groups:read (optional)
     * im:history
     * im:read
     * im:write
     * mpim:read (optional)
     * users.profile:read
     * users:read (optional)
   * In the *OAuth Tokens for Your Workspace* section, click on *Install to Workspace* and then *Allow*.
   * **Copy and save the `xoxb-...` token - you'll need it to publish the chatbot.**
6. Go to *App Home* under *Features* and click on the checkbox to *Allow users to send Slash commands and messages from the messages tab*.
7. Go to *Event Subscriptions* under *Features*.
   * Toggle the button to *Enable Events*.
   * Under *Subscribe to bot events*, click on *Add Bot User Event* and add `app_mention` and `message.im`.
   * Click on *Save Changes*.
8. Now you can use tokens from points 3 and 5 to initialize the Slack handler in MindsDB.

This connection method enables you to chat directly with an app via Slack. Alternatively, you can connect an app to the Slack channel:

* Go to the channel where you want to use the bot.
* Right-click on the channel and select *View Channel Details*.
* Select *Integrations*.
* Click on *Add an App*.

Here is how to connect Slack to MindsDB:

```sql theme={null}
CREATE DATABASE slack_datasource
WITH ENGINE = 'slack',
PARAMETERS = {
  "token": "xoxb-...",
  "app_token": "xapp-..."
};
```

It comes with the `conversations` and `messages` tables.

### Method 2: Chatbot responds on a defined Slack channel

Another way to connect to Slack is to use the bot token only. By following the instructions below, you'll set up the Slack app and integrate it into one of the channels from which you can directly chat with the bot.

Here is how to set up a Slack app and generate a Slack bot token:

1. Follow [this link](https://api.slack.com/apps) and sign in with your Slack account.
2. Create a new app `From scratch` or select an existing app.
   * Please note that the following instructions support apps created `From scratch`.
   * For apps created `From an app manifest`, please follow the [Slack docs here](https://api.slack.com/reference/manifests).
3. Go to the *OAuth & Permissions* section.
4. Under the *Scopes* section, add the *Bot Token Scopes* necessary for your application. You can add more later as well.
   * channels:history
   * channels:read
   * chat:write
   * groups:read
   * im:read
   * mpim:read
   * users:read
5. Install the bot in your workspace.
6. Under the *OAuth Tokens for Your Workspace* section, copy the *Bot User OAuth Token* value.
7. Open your Slack application and add the App/Bot to one of the channels:
   * Go to the channel where you want to use the bot.
   * Right-click on the channel and select *View Channel Details*.
   * Select *Integrations*.
   * Click on *Add an App*.
8. Now you can use the token from step 6 to initialize the Slack handler in MindsDB and use the channel name to query and write messages.

Here is how to connect Slack to MindsDB:

```sql theme={null}
CREATE DATABASE slack_datasource
WITH ENGINE = 'slack',
PARAMETERS = {
  "token": "xoxb-..."
};
```

## Usage

The following usage applies when **Connection Method 2** was used to connect Slack.
See the usage for **Connection Method 1** [via the `CREATE CHATBOT` syntax](/sql/tutorials/create-chatbot).

Retrieve data from a specified table by providing the integration and table names:

```sql theme={null}
SELECT *
FROM slack_datasource.table_name
LIMIT 10;
```

## Supported Tables

The Slack integration supports the following tables:

### `conversations` Table

The `conversations` virtual table is used to query conversations (channels, DMs, and groups) in the connected Slack workspace.

```sql theme={null}
-- Retrieve all conversations in the workspace
SELECT * FROM slack_datasource.conversations;

-- Retrieve a specific conversation using its ID
SELECT * FROM slack_datasource.conversations
WHERE id = "channel-id";

-- Retrieve a specific conversation using its name
SELECT * FROM slack_datasource.conversations
WHERE name = "channel-name";
```

### `messages` Table

The `messages` virtual table is used to query, post, update, and delete messages in specific conversations within the connected Slack workspace.

```sql theme={null}
-- Retrieve all messages from a specific conversation
-- channel_id is a required parameter and can be found in the conversations table
SELECT * FROM slack_datasource.messages
WHERE channel_id = "channel-id";

-- Post a new message
-- channel_id and text are required parameters
INSERT INTO slack_datasource.messages (channel_id, text)
VALUES ("channel-id", "Hello from SQL!");

-- Update a bot-posted message
-- channel_id, ts, and text are required parameters
UPDATE slack_datasource.messages
SET text = "Updated message content"
WHERE channel_id = "channel-id" AND ts = "message-timestamp";

-- Delete a bot-posted message
-- channel_id and ts are required parameters
DELETE FROM slack_datasource.messages
WHERE channel_id = "channel-id" AND ts = "message-timestamp";
```

You can also find the channel ID by right-clicking on the conversation in Slack, selecting 'View conversation details' or 'View channel details,' and copying the channel ID from the bottom of the 'About' tab.

### `threads` Table

The `threads` virtual table is used to query and post messages in threads within the connected Slack workspace.

```sql theme={null}
-- Retrieve all messages in a specific thread
-- channel_id and thread_ts are required parameters
-- thread_ts is the timestamp of the parent message and can be found in the messages table
SELECT * FROM slack_datasource.threads
WHERE channel_id = "channel-id" AND thread_ts = "thread-timestamp";

-- Post a message to a thread
INSERT INTO slack_datasource.threads (channel_id, thread_ts, text)
VALUES ("channel-id", "thread-timestamp", "Replying to the thread!");
```

### `users` Table

The `users` virtual table is used to query user information in the connected Slack workspace.

```sql theme={null}
-- Retrieve all users in the workspace
SELECT * FROM slack_datasource.users;

-- Retrieve a specific user by name
SELECT * FROM slack_datasource.users
WHERE name = "John Doe";
```

## Rate Limit Considerations

The Slack API enforces rate limits on data retrieval. Therefore, when querying the above tables, by default, the first 1000 (999 for `messages`) records are returned. To retrieve more records, use the `LIMIT` clause in your SQL queries. For example:

```sql theme={null}
SELECT *
FROM slack_datasource.conversations
LIMIT 2000;
```

When using the `LIMIT` clause to query additional records, you may encounter Slack API rate limits.

## Next Steps

Follow [this tutorial](/use-cases/ai_agents/build_ai_agents) to build an AI agent with MindsDB.
# Strapi

Source: https://docs.mindsdb.com/integrations/app-integrations/strapi

[Strapi](https://strapi.io/) is a popular open-source Headless Content Management System (CMS) that empowers developers to work with their preferred tools and frameworks, while providing content editors with a user-friendly interface to manage and distribute content across various platforms.

The Strapi Handler is a MindsDB handler that enables SQL-based querying of Strapi collections. This documentation provides a brief overview of its features, initialization parameters, and example usage.

## Connection

To use the Strapi Handler, initialize it with the following parameters:

* `host`: Strapi server host.
* `port`: Strapi server port (typically 1337).
* `api_token`: Strapi server API token for authentication.
* `plural_api_ids`: List of plural API IDs for the collections.

To get started, create a Strapi engine database with the following SQL command:

```sql theme={null}
CREATE DATABASE myshop                    --- Display name for the database.
WITH ENGINE = 'strapi',                   --- Name of the MindsDB handler.
PARAMETERS = {
  "host": "<host>",                       --- Host (can be an IP address or URL).
  "port": "<port>",                       --- Common port is 1337.
  "api_token": "<api-token>",             --- API token of the Strapi server.
  "plural_api_ids": ["<plural-api-id>"]   --- Plural API IDs of the collections.
};
```

## Usage

Retrieve data from a collection:

```sql theme={null}
SELECT * FROM myshop.<collection-name>;
```

Filter data based on specific criteria:

```sql theme={null}
SELECT * FROM myshop.<collection-name> WHERE id = <id>;
```

Insert new data into a collection:

```sql theme={null}
INSERT INTO myshop.<collection-name> (<column1>, <column2>, ...)
VALUES (<value1>, <value2>, ...);
```

Note: You are only able to insert data into collections that have the `create` permission.

Modify existing data in a collection:

```sql theme={null}
UPDATE myshop.<collection-name>
SET <column1> = <value1>,
    <column2> = <value2>,
    ...
WHERE id = <id>;
```

Note: You are only able to update data in collections that have the `update` permission.

# Stripe

Source: https://docs.mindsdb.com/integrations/app-integrations/stripe

In this section, we present how to connect Stripe to MindsDB.

[Stripe](https://stripe.com/) is a financial technology company that provides a set of software and payment processing solutions for businesses and individuals to accept payments over the internet. Stripe is one of the leading payment gateway and online payment processing platforms.

Data from Stripe can be utilized within MindsDB to train AI models and chatbots based on customers, products, and payment intents, and make relevant predictions and forecasts.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Stripe to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Stripe.

## Connection

This handler was implemented using [stripe-python](https://github.com/stripe/stripe-python), the Python library for the Stripe API.

There is only one parameter required to set up the connection with Stripe:

* `api_key`: a Stripe API key. You can find your API keys in the Stripe Dashboard. [Read more](https://stripe.com/docs/keys).

To connect to Stripe using MindsDB, the following CREATE DATABASE statement can be used:

```sql theme={null}
CREATE DATABASE stripe_datasource
WITH ENGINE = 'stripe',
PARAMETERS = {
  "api_key": "sk_..."
};
```

## Usage

Now, you can query the data in your Stripe account (customers, for example) as follows:

```sql theme={null}
SELECT * FROM stripe_datasource.customers
```

You can run more advanced queries to fetch specific customers in a defined order:

```sql theme={null}
SELECT name, email
FROM stripe_datasource.customers
WHERE currency = 'inr'
ORDER BY name
LIMIT 5
```

### Supported tables

The following tables are supported by the Stripe handler:

* `customers`
* `products`
* `payment_intents`

# Symbl

Source: https://docs.mindsdb.com/integrations/app-integrations/symbl

This documentation describes the integration of MindsDB with [Symbl](https://symbl.ai/), a platform with state-of-the-art and task-specific LLMs that enables businesses to analyze multi-party conversations at scale. This integration allows MindsDB to process conversation data and extract insights from it.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Symbl to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

Please note that in order to successfully install the dependencies for Symbl, it is necessary to install `portaudio` and a few other Linux packages in the Docker container first. To do this, run the following commands:

1. Start an interactive shell in the container:

```bash theme={null}
docker exec -it mindsdb_container sh
```

If you haven't specified a name when spinning up the MindsDB container with `docker run`, you can find it by running `docker ps`.

If you are using Docker Desktop, you can navigate to 'Containers', locate the multi-container application running the extension, click on the `mindsdb_service` container and then click on the 'Exec' tab to start an interactive shell.

2. Install the required packages:

```bash theme={null}
apt-get update && apt-get install -y \
    libportaudio2 libportaudiocpp0 portaudio19-dev \
    python3-dev \
    build-essential \
    && rm -rf /var/lib/apt/lists/*
```

## Connection

Establish a connection to Symbl from MindsDB by executing the following SQL command:

```sql theme={null}
CREATE DATABASE mindsdb_symbl
WITH ENGINE = 'symbl',
PARAMETERS = {
  "app_id": "app_id",
  "app_secret": "app_secret"
};
```

Required connection parameters include the following:

* `app_id`: The Symbl app identifier.
* `app_secret`: The Symbl app secret.

## Usage

First, process the conversation data and get the conversation ID via the `get_conversation_id` table:

```sql theme={null}
SELECT *
FROM mindsdb_symbl.get_conversation_id
WHERE audio_url = "https://symbltestdata.s3.us-east-2.amazonaws.com/newPhonecall.mp3";
```

Next, use the conversation ID to get the results of the above from the other supported tables:

```sql theme={null}
SELECT *
FROM mindsdb_symbl.get_messages
WHERE conversation_id = "5682305049034752";
```

Other supported tables include:

* `get_topics`
* `get_questions`
* `get_analytics`
* `get_action_items`

The above examples utilize `mindsdb_symbl` as the datasource name, which is defined in the `CREATE DATABASE` command.

# Twitter

Source: https://docs.mindsdb.com/integrations/app-integrations/twitter

In this section, we present how to connect Twitter accounts to MindsDB.

[Twitter](https://twitter.com/) is a widely recognized social media platform and microblogging service that allows users to share short messages called tweets.
The Twitter handler enables you to fetch tweets and create replies utilizing AI models within MindsDB. Furthermore, you can automate the process of fetching tweets, preparing replies, and sending replies to Twitter.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Twitter to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Twitter.

## Connection

To connect a Twitter account to MindsDB, you need a Twitter developer account. Please note that it requires a paid developer account.

We recommend you use [Elevated access](https://developer.twitter.com/en/support/twitter-api/developer-account), which allows you to pull 2M tweets per month and helps you avoid the *parameters or authentication issue* error that you might otherwise encounter. You can check [this step-by-step guide](https://medium.com/@skillcate/set-up-twitter-api-to-pull-2m-tweets-month-44d004c6f7ce) describing how to apply for Elevated access.

If you don't already have a Twitter developer account, follow the steps in the video below to apply for one.

[Begin here to apply for a Twitter developer account](https://developer.twitter.com/apply-for-access)

Watch this [step-by-step video](https://www.youtube.com/watch?v=qVe7PeC0sUQ) explaining the process.

When presented with questions under *How will you use the Twitter API or Twitter Data?*, use answers similar to the ones below (tweak to fit your exact use case). The more thorough your answers are, the more likely it is your account will get approved.

**Intended Usage (In Your Words)**

*I have a blog and want to educate users how to use the Twitter API with MindsDB.*

*I will read tweets that mention me and use them with MindsDB machine learning to generate responses. I plan to post tweets 2-3 times a day and keep using Twitter like I normally would.*

**Are you planning to analyze Twitter data?**

*I plan to build machine learning algorithms based on Twitter data. I am interested in doing sentiment analysis and topic analysis.*

*I will potentially extract:*

* *Tweet text*
* *Favorite count and retweet count*
* *Hashtags and mentions*

**Will your app use Tweet, Retweet, Like, Follow, or Direct Message functionality?**

*I will use the Twitter API to post responses to tweets that mention me.*

*I will have word filters to make sure that I never share offensive or potentially controversial subjects.*

**Do you plan to display Tweets or aggregate data about Twitter content outside Twitter?**

*I plan to share aggregate data as examples for users of my upcoming blog. I don't intend to create an automated dashboard that consumes a lot of Twitter API calls.*

*Every API call will be done locally, or automated on a simple web server. Aggregate of data will be for educational purposes only.*

**Will your product, service, or analysis make Twitter content or derived information available to a government entity?**

Answer NO to this one.

If you already have a Twitter developer account, you need to generate API keys following the instructions below or heading to the [Twitter developer website](https://developer.twitter.com/en).

* Create an application with Read/Write permissions activated:
  * Open [developer portal](https://developer.twitter.com/en/portal/projects-and-apps).
  * Select the `Add app` button to create a new app.
  * Select the `Create new` button.
  * Select `Production` and give it a name.
  * Copy and populate the following in the below `CREATE DATABASE` statement:
    * `Bearer Token` as a value of the `bearer_token` parameter.
    * `API Key` as a value of the `consumer_key` parameter.
    * `API Key Secret` as a value of the `consumer_secret` parameter.
* Set up user authentication settings:
  * Click `Setup` under `User authentication settings`:
    * On `Permissions`, select `Read and Write`.
    * On `Type of app`, select `Web App`, `Automated App or Bot`.
    * On `App info`, provide any URL for the callback URL and website URL (you can use the URL of this page).
    * Click `Save`.
* Generate access tokens:
  * Once you are back in the app settings, click `Keys and Tokens`:
    * Generate `Access Token` and `Access Token Secret` and populate them in the below `CREATE DATABASE` statement:
      * `Access Token` as a value of the `access_token` parameter.
      * `Access Token Secret` as a value of the `access_token_secret` parameter.

Once you have all the tokens and keys, here is how to connect your Twitter account to MindsDB:

```sql theme={null}
CREATE DATABASE my_twitter
WITH ENGINE = 'twitter',
PARAMETERS = {
  "bearer_token": "twitter bearer token",
  "consumer_key": "twitter consumer key",
  "consumer_secret": "twitter consumer key secret",
  "access_token": "twitter access token",
  "access_token_secret": "twitter access token secret"
};
```

## Usage

The `my_twitter` database contains a table called `tweets` by default. Here is how to search tweets containing the `mindsdb` keyword:

```sql theme={null}
SELECT id, created_at, author_username, text
FROM my_twitter.tweets
WHERE query = '(mindsdb OR #mindsdb) -is:retweet -is:reply'
  AND created_at > '2023-02-16'
LIMIT 20;
```

Please note that you can see only recent tweets from the past seven days. The `created_at` condition is skipped if the provided date is more than seven days in the past.

Alternatively, you can use a Twitter native query, as below:

```sql theme={null}
SELECT * FROM my_twitter (
  search_recent_tweets(
    query = '(mindsdb OR #mindsdb) -is:retweet -is:reply',
    start_time = '2023-03-16T00:00:00.000Z',
    max_results = 2
  )
);
```

To learn more about native queries in MindsDB, visit our docs [here](/sql/native-queries).

Here is how to write tweets:

```sql theme={null}
INSERT INTO my_twitter.tweets (reply_to_tweet_id, text)
VALUES
  (1626198053446369280, 'MindsDB is great! now its super simple to build ML powered apps'),
  (1626198053446369280, 'Holy!! MindsDB is the best thing they have invented for developers doing ML');
```

For more information about available actions and development plans, visit [this page](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/twitter_handler/README.md).

**What's next?**

Check out the [tutorial on how to create a Twitter chatbot](/sql/tutorials/twitter-chatbot) to see one of the interesting applications of this integration.

# Web Crawler

Source: https://docs.mindsdb.com/integrations/app-integrations/web-crawler

In this section, we present how to use a web crawler within MindsDB. A web crawler is an automated script designed to systematically browse and index content on the internet. Within MindsDB, you can utilize a web crawler to efficiently collect data from various websites.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To use Web Crawler with MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
## Connection

This handler does not require any connection parameters.

Here is how to initialize a web crawler:

```sql theme={null}
CREATE DATABASE my_web
WITH ENGINE = 'web';
```

The above query creates a database called `my_web`. This database by default has a table called `crawler` that stores data from a given URL or multiple URLs.

## Usage

### Parameters

#### Crawl Depth

The `crawl_depth` parameter defines how deep the crawler should navigate through linked pages:

* `crawl_depth = 0`: Crawls only the specified page.
* `crawl_depth = 1`: Crawls the specified page and all linked pages on it.
* Higher values continue the pattern.

#### Page Limits

There are multiple ways to limit the number of pages returned:

* The `LIMIT` clause defines the maximum number of pages returned globally.
* The `per_url_limit` parameter limits the number of pages returned for each specific URL, if more than one URL is provided.

### Crawling a Single URL

The following example retrieves data from a single webpage:

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url = 'https://docs.mindsdb.com/';
```

Returns **1 row** by default.

To retrieve more pages from the same URL, specify the `LIMIT`:

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url = 'https://docs.mindsdb.com/'
LIMIT 30;
```

Returns up to **30 rows**.

### Crawling Multiple URLs

To crawl multiple URLs at once:

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url IN ('https://docs.mindsdb.com/', 'https://dev.mysql.com/doc/', 'https://mindsdb.com/');
```

Returns **3 rows** by default (1 row per URL).

To apply a per-URL limit:

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url IN ('https://docs.mindsdb.com/', 'https://dev.mysql.com/doc/')
AND per_url_limit = 2;
```

Returns **4 rows** (2 rows per URL).

### Crawling with Depth

To crawl all pages linked within a website:

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url = 'https://docs.mindsdb.com/'
AND crawl_depth = 1;
```

Returns **1 + x rows**, where `x` is the number of linked webpages.

For multiple URLs with crawl depth:

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url IN ('https://docs.mindsdb.com/', 'https://dev.mysql.com/doc/')
AND crawl_depth = 1;
```

Returns **2 + x + y rows**, where `x` and `y` are the number of linked pages from each URL.

### Get PDF Content

MindsDB accepts [file uploads](/sql/create/file) of `csv`, `xlsx`, `xls`, `sheet`, `json`, and `parquet`. However, you can also configure the web crawler to fetch data from PDF files accessible via URLs.

```sql theme={null}
SELECT *
FROM my_web.crawler
WHERE url = '<link-to-pdf-file>'
LIMIT 1;
```

### Configuring Web Handler for Specific Domains

The Web Handler can be configured to interact only with specific domains by using the `web_crawling_allowed_sites` setting in the `config.json` file. This feature allows you to restrict the handler to crawl and process content only from the domains you specify, enhancing security and control over web interactions.

To configure this, simply list the allowed domains under the `web_crawling_allowed_sites` key in `config.json`. For example:

```json theme={null}
"web_crawling_allowed_sites": [
  "https://docs.mindsdb.com",
  "https://another-allowed-site.com"
]
```

## Troubleshooting

`Web crawler encounters character encoding issues`

* **Symptoms**: Extracted text appears garbled or contains strange characters instead of the expected text.
* **Checklist**:
  1. Open a GitHub Issue: If you encounter a bug or a repeatable error with encoding, report it on the [MindsDB GitHub](https://github.com/mindsdb/mindsdb/issues) repository by opening an issue.

`Web crawler times out while trying to fetch content`

* **Symptoms**: The crawler fails to retrieve data from a website, resulting in timeout errors.
* **Checklist**:
  1. Check the network connection to ensure the target site is reachable.

# YouTube

Source: https://docs.mindsdb.com/integrations/app-integrations/youtube

In this section, we present how to connect YouTube to MindsDB.

[YouTube](https://www.youtube.com/) is a popular online video-sharing platform and social media website where users can upload, view, share, and interact with videos created by individuals and organizations from around the world.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB on your system or obtain access to cloud options.
2. To use YouTube with MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

## Connection

There are two ways you can connect YouTube to MindsDB:

1. Limited permissions: This option provides MindsDB with read-only access to YouTube, including viewing comments data.
2. Elevated permissions: This option provides MindsDB with full access to YouTube, including viewing comments data and posting replies to comments.

### Option 1: Limited permissions

Establish a connection to YouTube from MindsDB by executing the below SQL command:

```sql theme={null}
CREATE DATABASE mindsdb_youtube
WITH ENGINE = 'youtube',
PARAMETERS = {
  "youtube_api_token": "<your-youtube-api-token>"
};
```

Alternatively, you can connect YouTube to MindsDB via the form. To do that, click on the `Add` button, choose `New Datasource`, search for `YouTube`, and follow the instructions in the form. After providing the connection name and the YouTube API token, click on the `Test Connection` button. Once the connection is established, click on the `Save and Continue` button.

Required connection parameters include the following:

* `youtube_api_token`: It is a Google API key used for authentication. Check out [this guide](https://blog.hubspot.com/website/how-to-get-youtube-api-key) on how to create the API key to access YouTube data.

### Option 2: Elevated permissions

Establish a connection to YouTube from MindsDB by executing the below SQL command and following the Google authorization link provided as output:

```sql theme={null}
CREATE DATABASE mindsdb_youtube
WITH ENGINE = 'youtube',
PARAMETERS = {
  "credentials_file": "path-to-credentials-json-file"  -- alternatively, use the credentials_url parameter
};
```

Alternatively, you can connect YouTube to MindsDB via the form. To do that, click on the `Add` button, choose `New Datasource`, search for `YouTube`, and follow the instructions in the form. After providing the connection name and the credentials file or URL, click on the `Test Connection` button and complete the authorization process in the pop-up window. Once the connection is established, click on the `Save and Continue` button.

Required connection parameters include one of the following:

* `credentials_file`: It is a path to a file generated from the Google Cloud Console, as described below.
* `credentials_url`: It is a URL to a file generated from the Google Cloud Console, as described below.

1. Open the Google Cloud Console.
2. Create a new project.
3. Inside this project, go to APIs & Services:
* Go to Enabled APIs & services:
  * Click on ENABLE APIS AND SERVICES from the top bar.
  * Search for YouTube Data API v3 and enable it.
* Go to OAuth consent screen:
  * Click on GET STARTED.
  * Provide app name and support email.
  * Choose Audience based on who will be using the app.
  * Add the Contact Information (email address) of the developer.
  * Agree to the terms and click on CONTINUE.
  * Click on Create.
  * Click on Audience on the left sidebar and, under Test users, add the email addresses of the users who will be testing the app. When you are ready to publish the app, you can come back here and click on PUBLISH APP; the app will then become available to either the organization or the public, based on the audience you have chosen.
* Go to Credentials:
  * Click on CREATE CREDENTIALS from the top bar and choose OAuth client ID.
  * Choose type as `Web application` and provide a name. Under Authorized redirect URIs, enter the URL where MindsDB has been deployed, followed by `/verify-auth`. For example, if you are running MindsDB locally (on `https://localhost:47334`), enter `https://localhost:47334/verify-auth`.
  * Click on CREATE.
  * Download the JSON file that is required to connect YouTube to MindsDB.

## Usage

Use the established connection to query the `comments` table.

You can query for one video's comments:

```sql theme={null}
SELECT *
FROM mindsdb_youtube.comments
WHERE video_id = "raWFGQ20OfA";
```

Or for one channel's comments:

```sql theme={null}
SELECT *
FROM mindsdb_youtube.comments
WHERE channel_id = "UC-...";
```

You can order and limit the output data:

```sql theme={null}
SELECT *
FROM mindsdb_youtube.comments
WHERE video_id = "raWFGQ20OfA"
ORDER BY display_name ASC
LIMIT 5;
```

Use the established connection to query the `channels` table.

```sql theme={null}
SELECT *
FROM mindsdb_youtube.channels
WHERE channel_id = "UC-...";
```

Here, the `channel_id` column is mandatory in the `WHERE` clause.

Use the established connection to query the `videos` table.

```sql theme={null}
SELECT *
FROM mindsdb_youtube.videos
WHERE video_id = "id";
```

Here, the `video_id` column is mandatory in the `WHERE` clause.

With connection option 2, you can insert replies to comments:

```sql theme={null}
INSERT INTO mindsdb_youtube.comments (comment_id, reply)
VALUES ("comment_id", "reply message");
```

# Airtable

Source: https://docs.mindsdb.com/integrations/data-integrations/airtable

This is the implementation of the Airtable data handler for MindsDB.

[Airtable](https://www.airtable.com/lp/campaign/database) is a platform that makes it easy to build powerful, custom applications. These tools can streamline just about any process, workflow, or project. And best of all, you can build them without ever learning to write a single line of code.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Airtable to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Airtable.

## Implementation

This handler is implemented using `duckdb`, a library that allows SQL queries to be executed on `pandas` DataFrames.

In essence, when querying a particular table, the entire table is first pulled into a `pandas` DataFrame using the [Airtable API](https://airtable.com/api).
Once this is done, SQL queries can be run on the DataFrame using `duckdb`.

The required arguments to establish a connection are as follows:

* `base_id` is the Airtable base ID.
* `table_name` is the Airtable table name.
* `api_key` is the API key for the Airtable API.

## Usage

In order to make use of this handler and connect to the Airtable database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE airtable_datasource
WITH engine = 'airtable',
parameters = {
  "base_id": "dqweqweqrwwqq",
  "table_name": "iris",
  "api_key": "knlsndlknslk"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM airtable_datasource.example_tbl;
```

At the moment, only the `SELECT` statement is allowed to be executed through `duckdb`. This, however, does not restrict you from running machine learning algorithms against your data in Airtable using the `CREATE MODEL` statement.

# Amazon Aurora

Source: https://docs.mindsdb.com/integrations/data-integrations/amazon-aurora

This is the implementation of the Amazon Aurora handler for MindsDB.

[Amazon Aurora](https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/CHAP_AuroraOverview.html) is a fully managed relational database engine that's compatible with MySQL and PostgreSQL.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Amazon Aurora to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Amazon Aurora.

## Implementation

This handler was implemented using the existing MindsDB handlers for MySQL and PostgreSQL.

The required arguments to establish a connection are as follows:

* `host`: the host name or IP address of the Amazon Aurora DB cluster.
* `port`: the TCP/IP port of the Amazon Aurora DB cluster.
* `user`: the username used to authenticate with the Amazon Aurora DB cluster.
* `password`: the password to authenticate the user with the Amazon Aurora DB cluster.
* `database`: the database name to use when connecting with the Amazon Aurora DB cluster.

The optional arguments that can be used are as follows:

* `db_engine`: the database engine of the Amazon Aurora DB cluster. This can take one of two values: 'mysql' or 'postgresql'. This parameter is optional, but if it is not provided, the `aws_access_key_id` and `aws_secret_access_key` parameters must be provided.
* `aws_access_key_id`: the access key for the AWS account. This parameter is optional and is only required to be provided if the `db_engine` parameter is not provided.
* `aws_secret_access_key`: the secret key for the AWS account. This parameter is optional and is only required to be provided if the `db_engine` parameter is not provided.
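When `db_engine` is omitted, the AWS credentials stand in for it, presumably so that the handler can look up the cluster's engine type through the AWS API. Below is a minimal sketch of such a connection; the host and key values are placeholders, not real credentials:

```sql theme={null}
CREATE DATABASE aurora_datasource
WITH engine = 'aurora',
parameters = {
  "host": "mycluster.cluster-123456789012.us-east-1.rds.amazonaws.com",  -- placeholder host
  "port": 3306,
  "user": "admin",
  "password": "password",
  "aws_access_key_id": "AKIAIOSFODNN7EXAMPLE",                           -- placeholder key
  "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",   -- placeholder key
  "database": "example_db"
};
```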
## Usage

In order to make use of this handler and connect to an Amazon Aurora MySQL DB Cluster in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE aurora_mysql_datasource
WITH engine = 'aurora',
parameters = {
  "db_engine": "mysql",
  "host": "mysqlcluster.cluster-123456789012.us-east-1.rds.amazonaws.com",
  "port": 3306,
  "user": "admin",
  "password": "password",
  "database": "example_db"
};
```

Now, you can use this established connection to query your database as follows:

```sql theme={null}
SELECT *
FROM aurora_mysql_datasource.example_table;
```

Similar commands can be used to establish a connection and query an Amazon Aurora PostgreSQL DB Cluster:

```sql theme={null}
CREATE DATABASE aurora_postgres_datasource
WITH engine = 'aurora',
parameters = {
  "db_engine": "postgresql",
  "host": "postgresmycluster.cluster-123456789012.us-east-1.rds.amazonaws.com",
  "port": 5432,
  "user": "postgres",
  "password": "password",
  "database": "example_db"
};

SELECT *
FROM aurora_postgres_datasource.example_table;
```

If you want to switch to a different database, you can include it in your query as follows:

```sql theme={null}
SELECT *
FROM aurora_datasource.new_database.example_table;
```

# Amazon DynamoDB

Source: https://docs.mindsdb.com/integrations/data-integrations/amazon-dynamodb

This documentation describes the integration of MindsDB with [Amazon DynamoDB](https://aws.amazon.com/dynamodb/), a serverless, NoSQL database service that enables you to develop modern applications at any scale.

This data source integration is thread-safe, utilizing a connection pool where each thread is assigned its own connection. When handling requests in parallel, threads retrieve connections from the pool as needed.

## Prerequisites

Before proceeding, ensure that MindsDB is installed locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).

## Connection

Establish a connection to your Amazon DynamoDB from MindsDB by executing the following SQL command:

```sql theme={null}
CREATE DATABASE dynamodb_datasource
WITH engine = 'dynamodb',
parameters = {
  "aws_access_key_id": "PCAQ2LJDOSWLNSQKOCPW",
  "aws_secret_access_key": "U/VjewPlNopsDmmwItl34r2neyC6WhZpUiip57i",
  "region_name": "us-east-1"
};
```

Required connection parameters include the following:

* `aws_access_key_id`: The AWS access key that identifies the user or IAM role.
* `aws_secret_access_key`: The AWS secret access key that identifies the user or IAM role.
* `region_name`: The AWS region to connect to.

Optional connection parameters include the following:

* `aws_session_token`: The AWS session token that identifies the user or IAM role. This becomes necessary when using temporary security credentials.

## Usage

Retrieve data from a specified table by providing the integration name and the table name:

```sql theme={null}
SELECT *
FROM dynamodb_datasource.table_name
LIMIT 10;
```

Indexes can also be queried by adding a third-level namespace:

```sql theme={null}
SELECT *
FROM dynamodb_datasource.table_name.index_name
LIMIT 10;
```

The queries issued to Amazon DynamoDB are in PartiQL, a SQL-compatible query language for Amazon DynamoDB. For more information, refer to the [PartiQL documentation](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ql-reference.html).

There are a few limitations to keep in mind when querying data from Amazon DynamoDB (some of which are specific to PartiQL):

* The `LIMIT`, `GROUP BY` and `HAVING` clauses are not supported in PartiQL `SELECT` statements.
  Furthermore, subqueries and joins are not supported either. Refer to the [PartiQL documentation for SELECT statements](https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/ql-reference.select.html) for more information.
* `INSERT` statements are not supported by this integration. However, this can be overcome by issuing a 'native query' via an established connection. An example of this is provided below.

Run PartiQL queries directly on Amazon DynamoDB:

```sql theme={null}
SELECT * FROM dynamodb_datasource (
  --Native Query Goes Here
  INSERT INTO "Music" value {'Artist' : 'Acme Band1','SongTitle' : 'PartiQL Rocks'}
);
```

The above examples utilize `dynamodb_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting Guide

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with Amazon DynamoDB.
* **Checklist**:
  1. Confirm that provided AWS credentials are correct. Try making a direct connection to Amazon DynamoDB using the AWS CLI.
  2. Ensure a stable network between MindsDB and AWS.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing special characters.
* **Checklist**:
  1. Ensure table names with special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel-data
     * Incorrect: SELECT \* FROM integration.'travel-data'
     * Correct: SELECT \* FROM integration.\`travel-data\`

# Amazon Redshift

Source: https://docs.mindsdb.com/integrations/data-integrations/amazon-redshift

This documentation describes the integration of MindsDB with [Amazon Redshift](https://docs.aws.amazon.com/redshift/latest/mgmt/welcome.html), a fully managed, petabyte-scale data warehouse service in the cloud. You can start with just a few hundred gigabytes of data and scale to a petabyte or more, enabling you to use your data to acquire new insights for your business and customers.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Redshift to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

## Connection

Establish a connection to your Redshift database from MindsDB by executing the following SQL command:

```sql theme={null}
CREATE DATABASE redshift_datasource
WITH engine = 'redshift',
parameters = {
  "host": "examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com",
  "port": 5439,
  "database": "example_db",
  "user": "awsuser",
  "password": "my_password"
};
```

Required connection parameters include the following:

* `host`: The host name or IP address of the Redshift cluster.
* `port`: The port to use when connecting with the Redshift cluster.
* `database`: The database name to use when connecting with the Redshift cluster.
* `user`: The username to authenticate the user with the Redshift cluster.
* `password`: The password to authenticate the user with the Redshift cluster.

Optional connection parameters include the following:

* `schema`: The database schema to use. Default is `public`.
* `sslmode`: The SSL mode for the connection.
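To illustrate the optional parameters, here is a sketch of a connection that also sets `schema` and `sslmode`; the schema name is a made-up example, and since Redshift speaks the PostgreSQL wire protocol, `sslmode` is assumed to accept the standard PostgreSQL values such as `require`:

```sql theme={null}
CREATE DATABASE redshift_datasource
WITH engine = 'redshift',
parameters = {
  "host": "examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com",
  "port": 5439,
  "database": "example_db",
  "user": "awsuser",
  "password": "my_password",
  "schema": "sales",       -- placeholder: a non-default schema
  "sslmode": "require"     -- assumption: standard PostgreSQL SSL mode value
};
```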
## Usage

Retrieve data from a specified table by providing the integration name, schema, and table name:

```sql theme={null}
SELECT *
FROM redshift_datasource.schema_name.table_name
LIMIT 10;
```

Run Amazon Redshift SQL queries directly on the connected Redshift database:

```sql theme={null}
SELECT * FROM redshift_datasource (
  --Native Query Goes Here
  WITH VENUECOPY AS (SELECT * FROM VENUE)
  SELECT * FROM VENUECOPY ORDER BY 1 LIMIT 10;
);
```

The above examples utilize `redshift_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting Guide

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the Amazon Redshift cluster.
* **Checklist**:
  1. Make sure the Redshift cluster is active.
  2. Confirm that host, port, user, password and database are correct. Try a direct Redshift connection using a client like DBeaver.
  3. Ensure that the security settings of the Redshift cluster allow connections from MindsDB.
  4. Ensure a stable network between MindsDB and Redshift.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters.
* **Checklist**:
  1. Ensure table names with spaces or special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel data
     * Incorrect: SELECT \* FROM integration.'travel data'
     * Correct: SELECT \* FROM integration.\`travel data\`

This [troubleshooting guide](https://docs.aws.amazon.com/redshift/latest/mgmt/troubleshooting-connections.html) provided by AWS might also be helpful.

# Amazon S3

Source: https://docs.mindsdb.com/integrations/data-integrations/amazon-s3

This documentation describes the integration of MindsDB with [Amazon S3](https://docs.aws.amazon.com/AmazonS3/latest/userguide/Welcome.html), an object storage service that offers industry-leading scalability, data availability, security, and performance.

This data source integration is thread-safe, utilizing a connection pool where each thread is assigned its own connection. When handling requests in parallel, threads retrieve connections from the pool as needed.

## Prerequisites

Before proceeding, ensure that MindsDB is installed locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).

## Connection

Establish a connection to your Amazon S3 bucket from MindsDB by executing the following SQL command:

```sql theme={null}
CREATE DATABASE s3_datasource
WITH engine = 's3',
parameters = {
  "aws_access_key_id": "AQAXEQK89OX07YS34OP",
  "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
  "bucket": "my-bucket"
};
```

Note that sample parameter values are provided here for reference, and you should replace them with your connection parameters.

Required connection parameters include the following:

* `aws_access_key_id`: The AWS access key that identifies the user or IAM role.
* `aws_secret_access_key`: The AWS secret access key that identifies the user or IAM role.

Optional connection parameters include the following:

* `aws_session_token`: The AWS session token that identifies the user or IAM role. This becomes necessary when using temporary security credentials.
* `bucket`: The name of the Amazon S3 bucket. If not provided, all available buckets can be queried; however, this can affect performance, especially when listing all of the available objects.
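When using temporary security credentials (for example, those issued by AWS STS), include the session token alongside the key pair. A minimal sketch with placeholder values:

```sql theme={null}
CREATE DATABASE s3_datasource
WITH engine = 's3',
parameters = {
  "aws_access_key_id": "ASIAIOSFODNN7EXAMPLE",                          -- placeholder temporary key
  "aws_secret_access_key": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",  -- placeholder secret
  "aws_session_token": "FwoGZXIvYXdzEXAMPLETOKEN",                      -- placeholder session token
  "bucket": "my-bucket"
};
```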
## Usage

Retrieve data from a specified object (file) in an S3 bucket by providing the integration name and the object key:

```sql theme={null}
SELECT *
FROM s3_datasource.`my-file.csv`
LIMIT 10;
```

If a bucket name is provided in the `CREATE DATABASE` command, querying will be limited to that bucket and the bucket name can be omitted from the object key, as shown in the example above. However, if the bucket name is not provided, the object key must include the bucket name, such as `` s3_datasource.`my-bucket/my-folder/my-file.csv` ``.

Wrap the object key in backticks (\`) to avoid any issues parsing the SQL statements provided. This is especially important when the object key contains spaces, special characters or prefixes, such as `my-folder/my-file.csv`.

At the moment, the supported file formats are CSV, TSV, JSON, and Parquet.

The above examples utilize `s3_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

The special `files` table can be used to list all objects available in the specified bucket or all buckets if the bucket name is not provided:

```sql theme={null}
SELECT *
FROM s3_datasource.files
LIMIT 10;
```

The content of files can also be retrieved by explicitly requesting the `content` column. This column is empty by default to avoid unnecessary data transfer:

```sql theme={null}
SELECT path, content
FROM s3_datasource.files
LIMIT 10;
```

This table will return all objects regardless of the file format; however, only the supported file formats mentioned above can be queried.

## Troubleshooting Guide

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the Amazon S3 bucket.
* **Checklist**:
  1. Make sure the Amazon S3 bucket exists.
  2. Confirm that provided AWS credentials are correct. Try making a direct connection to the S3 bucket using the AWS CLI.
  3. Ensure a stable network between MindsDB and AWS.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing object names containing spaces, special characters or prefixes.
* **Checklist**:
  1. Ensure object names with spaces, special characters or prefixes are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel/travel\_data.csv
     * Incorrect: SELECT \* FROM integration.'travel/travel\_data.csv'
     * Correct: SELECT \* FROM integration.\`travel/travel\_data.csv\`

# Apache Cassandra

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-cassandra

This is the implementation of the Cassandra data handler for MindsDB.

[Cassandra](https://cassandra.apache.org/_/index.html) is a free and open-source, distributed, wide-column store, NoSQL database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Apache Cassandra to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Apache Cassandra.

## Implementation

As ScyllaDB is API-compatible with Apache Cassandra, the Cassandra data handler extends the ScyllaDB handler and uses the `scylla-driver` Python library.

The required arguments to establish a connection are as follows:

* `host` is the host name or IP address of the Cassandra database.
* `port` is the port to use when connecting.
* `user` is the user to authenticate.
* `password` is the password to authenticate the user.
* `keyspace` is the keyspace to connect to, the top-level container for tables.
* `protocol_version` is not required and defaults to 4.

## Usage

In order to make use of this handler and connect to the Cassandra server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE cassandra_datasource
WITH engine = "cassandra",
parameters = {
  "host": "127.0.0.1",
  "port": "9043",
  "user": "user",
  "password": "pass",
  "keyspace": "test_data",
  "protocol_version": 4
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM cassandra_datasource.example_table
LIMIT 10;
```

# Apache Druid

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-druid

This is the implementation of the Druid data handler for MindsDB.

[Apache Druid](https://druid.apache.org/docs/latest/design) is a real-time analytics database designed for fast slice-and-dice analytics (*OLAP* queries) on large data sets. Most often, Druid powers use cases where real-time ingestion, fast query performance, and high uptime are important.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Apache Druid to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Apache Druid.

## Implementation

This handler was implemented using the `pydruid` library, the Python API for Apache Druid.

The required arguments to establish a connection are as follows:

* `host` is the host name or IP address of the Apache Druid database.
* `port` is the port that Apache Druid is running on.
* `path` is the query path.
* `scheme` is the URI schema. This parameter is optional and defaults to `http`.
* `user` is the username used to authenticate with Apache Druid. This parameter is optional.
* `password` is the password used to authenticate with Apache Druid. This parameter is optional.

## Usage

In order to make use of this handler and connect to Apache Druid in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE druid_datasource
WITH engine = 'druid',
parameters = {
  "host": "localhost",
  "port": 8888,
  "path": "/druid/v2/sql/",
  "scheme": "http"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM druid_datasource.example_tbl;
```

# Apache Hive

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-hive

This documentation describes the integration of MindsDB with [Apache Hive](https://hive.apache.org/), a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. The integration allows MindsDB to access data from Apache Hive and enhance Apache Hive with AI capabilities.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).
2. To connect Apache Hive to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies).

## Connection

Establish a connection to Apache Hive from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/hive_handler) as an engine.

```sql theme={null}
CREATE DATABASE hive_datasource
WITH engine = 'hive',
parameters = {
  "username": "demo_user",
  "password": "demo_password",
  "host": "127.0.0.1",
  "database": "default"
};
```

Required connection parameters include the following:

* `host`: The hostname, IP address, or URL of the Apache Hive server.
* `database`: The name of the Apache Hive database to connect to.

Optional connection parameters include the following:

* `username`: The username for the Apache Hive database.
* `password`: The password for the Apache Hive database.
* `port`: The port number for connecting to the Apache Hive server. Default is `10000`.
* `auth`: The authentication mechanism to use. Default is `CUSTOM`. Other options are `NONE`, `NOSASL`, `KERBEROS` and `LDAP`.

## Usage

Retrieve data from a specified table by providing the integration and table names:

```sql theme={null}
SELECT *
FROM hive_datasource.table_name
LIMIT 10;
```

Run HiveQL queries directly on the connected Apache Hive database:

```sql theme={null}
SELECT * FROM hive_datasource (
  --Native Query Goes Here
  FROM (
    FROM (FROM src SELECT TRANSFORM(value) USING 'mapper' AS value, count) mapped
    SELECT cast(value as double) AS value, cast(count as int) AS count
    SORT BY value, count
  ) sorted
  SELECT TRANSFORM(value, count) USING 'reducer' AS whatever
);
```

The above examples utilize `hive_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the Apache Hive database.
* **Checklist**:
  1. Ensure that the Apache Hive server is running and accessible.
  2. Confirm that host, port, user, and password are correct. Try a direct Apache Hive connection using a client like DBeaver.
  3. Test the network connection between the MindsDB host and the Apache Hive server.

# Apache Ignite

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-ignite

This is the implementation of the Apache Ignite data handler for MindsDB.

[Apache Ignite](https://ignite.apache.org/docs/latest/) is a distributed database for high-performance computing with in-memory speed.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Apache Ignite to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Apache Ignite.

## Implementation

This handler is implemented using the `pyignite` library, the Apache Ignite thin (binary protocol) client for Python.

The required arguments to establish a connection are as follows:

* `host` is the host name or IP address of the Apache Ignite cluster's node.
* `port` is the TCP/IP port of the Apache Ignite cluster's node. Must be an integer.

There are several optional arguments that can be used as well:

* `username` is the username used to authenticate with the Apache Ignite cluster. This parameter is optional. Default: None.
* `password` is the password to authenticate the user with the Apache Ignite cluster. This parameter is optional. Default: None.
* `schema` is the schema to use for the connection to the Apache Ignite cluster. This parameter is optional. Default: PUBLIC.

## Usage

In order to make use of this handler and connect to an Apache Ignite database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE ignite_datasource
WITH ENGINE = 'ignite',
PARAMETERS = {
  "host": "127.0.0.1",
  "port": 10800,
  "username": "admin",
  "password": "password",
  "schema": "example_schema"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM ignite_datasource.demo_table
LIMIT 10;
```

Currently, a connection can be established only to a single node in the cluster. In the future, we'll configure the client to automatically fail over to another node if the connection to the current node fails or times out by providing the hosts and ports for many nodes, as explained [here](https://ignite.apache.org/docs/latest/thin-clients/python-thin-client).

# Apache Impala

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-impala

This is the implementation of the Impala data handler for MindsDB.

[Apache Impala](https://impala.apache.org/) is an MPP (Massive Parallel Processing) SQL query engine for processing huge volumes of data that are stored in an Apache Hadoop cluster. It is open-source software written in C++ and Java. It provides high performance and low latency compared to other SQL engines for Hadoop. In other words, Impala is the highest-performing SQL engine (giving an RDBMS-like experience) that provides the fastest way to access data stored in the Hadoop Distributed File System.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Apache Impala to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Apache Impala.

## Implementation

This handler is implemented using `impyla`, a Python library that allows you to use Python code to run SQL commands on Impala.

The required arguments to establish a connection are:

* `user` is the username associated with the database.
* `password` is the password to authenticate your access.
* `host` is the server IP address or hostname.
* `port` is the port through which a TCP/IP connection is to be made.
* `database` is the database name to be connected.

## Usage

In order to make use of this handler and connect to the Impala database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE impala_datasource
WITH engine = 'impala',
parameters = {
  "user": "root",
  "password": "p@55w0rd",
  "host": "127.0.0.1",
  "port": 21050,
  "database": "Db_NamE"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM impala_datasource.TEST;
```

# Apache Pinot

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-pinot

This is the implementation of the Pinot data handler for MindsDB.

[Apache Pinot](https://pinot.apache.org/) is a real-time distributed OLAP database designed for low-latency query execution even at extremely high throughput. Apache Pinot can ingest directly from streaming sources like Apache Kafka and make events available for querying immediately.
## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Apache Pinot to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Apache Pinot.

## Implementation

This handler was implemented using the `pinotdb` library, the Python DB-API and SQLAlchemy dialect for Pinot.

The required arguments to establish a connection are as follows:

* `host` is the host name or IP address of the Apache Pinot cluster.
* `broker_port` is the port that the Broker of the Apache Pinot cluster is running on.
* `controller_port` is the port that the Controller of the Apache Pinot cluster is running on.
* `path` is the query path.

Additionally, the `scheme` of the URI can be provided, as in the example below. This parameter is optional and defaults to `http`.

## Usage

In order to make use of this handler and connect to the Pinot cluster in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE pinot_datasource
WITH engine = 'pinot',
parameters = {
  "host": "localhost",
  "broker_port": 8000,
  "controller_port": 9000,
  "path": "/query/sql",
  "scheme": "http"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM pinot_datasource.example_tbl;
```

# Apache Solr

Source: https://docs.mindsdb.com/integrations/data-integrations/apache-solr

This is the implementation of the Solr data handler for MindsDB.

[Apache Solr](https://solr.apache.org/) is a highly reliable, scalable and fault-tolerant search platform, providing distributed indexing, replication and load-balanced querying, automated failover and recovery, centralized configuration, and more.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Apache Solr to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Apache Solr.

## Implementation

This handler is implemented using the `sqlalchemy-solr` library, which provides a Python/SQLAlchemy interface.

The required arguments to establish a connection are as follows:

* `username` is the username used to authenticate with the Solr server. This parameter is optional.
* `password` is the password to authenticate the user with the Solr server. This parameter is optional.
* `host` is the host name or IP address of the Solr server.
* `port` is the port number of the Solr server.
* `server_path` defaults to `solr` if not provided.
* `collection` is the Solr collection name.
* `use_ssl` defaults to `false` if not provided.

Further reference: [https://pypi.org/project/sqlalchemy-solr/](https://pypi.org/project/sqlalchemy-solr/).

## Usage

In order to make use of this handler and connect to the Solr database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE solr_datasource
WITH engine = 'solr',
parameters = {
  "username": "demo_user",
  "password": "demo_password",
  "host": "127.0.0.1",
  "port": "8981",
  "server_path": "solr",
  "collection": "gettingstarted",
  "use_ssl": "false"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM solr_datasource.gettingstarted
LIMIT 10000;
```

**Requirements** A Solr instance with Parallel SQL support, up and running.
There are certain limitations that need to be taken into account when issuing queries to Solr. Refer to [https://solr.apache.org/guide/solr/latest/query-guide/sql-query.html#parallel-sql-queries](https://solr.apache.org/guide/solr/latest/query-guide/sql-query.html#parallel-sql-queries). Don't forget to put a `LIMIT` clause at the end of the SQL statement.

# CKAN

Source: https://docs.mindsdb.com/integrations/data-integrations/ckan

## CKAN Integration handler

This handler facilitates integration with [CKAN](https://ckan.org/), an open-source data catalog platform for managing and publishing open data. CKAN organizes datasets and stores data in its [DataStore](http://docs.ckan.org/en/2.11/maintaining/datastore.html). To retrieve data from CKAN, the [CKANAPI](https://github.com/ckan/ckanapi) must be used.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).
2. To connect CKAN to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies).

The CKAN handler is included with MindsDB by default, so no additional installation is required.

## Configuration

To use the CKAN handler, you need to provide the URL of the CKAN instance you want to connect to. You can do this via the `url` parameter in the `CREATE DATABASE` statement. For example:

```sql theme={null}
CREATE DATABASE ckan_datasource
WITH ENGINE = 'ckan',
PARAMETERS = {
  "url": "https://your-ckan-instance-url.com",
  "api_key": "your-api-key-if-required"
};
```

> ***NOTE:*** Some CKAN instances will require you to provide an API Token. You can create one in the CKAN user panel.

## Usage

The CKAN handler provides three main tables:

* `datasets`: Lists all datasets in the CKAN instance.
* `resources`: Lists all resources metadata across all packages.
* `datastore`: Allows querying individual datastore resources.

## Example Queries

1. List all datasets:

```sql theme={null}
SELECT * FROM `your-datasource`.datasets;
```

2. List all resources:

```sql theme={null}
SELECT * FROM `your-datasource`.resources;
```

3. Query a specific datastore resource:

```sql theme={null}
SELECT * FROM `your-datasource`.datastore
WHERE resource_id = 'your-resource-id';
```

Replace `your-resource-id` with the actual resource ID you want to query.

## Querying Large Resources

The CKAN handler supports automatic pagination when querying datastore resources. This allows you to retrieve large datasets without worrying about API limits. You can still use the `LIMIT` clause to limit the number of rows returned by the query. For example:

```sql theme={null}
SELECT * FROM ckan_datasource.datastore
WHERE resource_id = 'your-resource-id-here'
LIMIT 1000;
```

## Limitations

* The handler currently supports read operations only. Write operations are not supported.
* Performance may vary depending on the size of the CKAN instance and the complexity of your queries.
* The handler may not work with all CKAN instances, especially those with custom configurations.
* The handler does not support all CKAN API features. Some advanced features may not be available.
* The datastore search will return a limited number of records, up to 32,000. Please refer to the [CKAN API](https://docs.ckan.org/en/2.11/maintaining/datastore.html#ckanext.datastore.logic.action.datastore_search_sql) documentation for more information.
# ClickHouse

Source: https://docs.mindsdb.com/integrations/data-integrations/clickhouse

This documentation describes the integration of MindsDB with [ClickHouse](https://clickhouse.com/docs/en/intro), a high-performance, column-oriented SQL database management system (DBMS) for online analytical processing (OLAP). The integration allows MindsDB to access data from ClickHouse and enhance ClickHouse with AI capabilities.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect ClickHouse to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

## Connection

Establish a connection to ClickHouse from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/clickhouse_handler) as an engine.

```sql theme={null}
CREATE DATABASE clickhouse_conn
WITH ENGINE = 'clickhouse',
PARAMETERS = {
  "host": "127.0.0.1",
  "port": "8443",
  "user": "root",
  "password": "mypass",
  "database": "test_data",
  "protocol": "https"
};
```

Required connection parameters include the following:

* `host`: The hostname or IP address of the ClickHouse server.
* `port`: The TCP/IP port of the ClickHouse server.
* `user`: The username used to authenticate with the ClickHouse server.
* `password`: The password to authenticate the user with the ClickHouse server.
* `database`: The database name to use when connecting with the ClickHouse server. Defaults to `default`.
* `protocol`: An optional parameter that defaults to `native`. Its supported values are `native`, `http` and `https`.

## Usage

The following usage examples utilize the connection to ClickHouse made via the `CREATE DATABASE` statement and named `clickhouse_conn`.

Retrieve data from a specified table by providing the integration and table name.

```sql theme={null}
SELECT *
FROM clickhouse_conn.table_name
LIMIT 10;
```

## Troubleshooting

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the ClickHouse database.
* **Checklist**:
  1. Ensure that the ClickHouse server is running and accessible.
  2. Confirm that host, port, user, and password are correct. Try a direct ClickHouse connection using a client.
  3. Test the network connection between the MindsDB host and the ClickHouse server.

`Slow Connection Initialization`

* **Symptoms**: Connecting to the ClickHouse server takes an exceptionally long time, or connections hang without completing.
* **Checklist**:
  1. Ensure that you are using the appropriate protocol (http, https, or native) for your ClickHouse setup. Misconfigurations here can lead to significant delays.
  2. Ensure that firewalls or security groups (in cloud environments) are properly configured to allow traffic on the necessary ports (such as 8123 for HTTP or 9000 for native).

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing spaces, reserved words or special characters.
* **Checklist**:
  1. Ensure table names with spaces or special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel data
     * Incorrect: SELECT \* FROM integration.'travel data'
     * Correct: SELECT \* FROM integration.\`travel data\`

# Cloud Spanner

Source: https://docs.mindsdb.com/integrations/data-integrations/cloud-spanner

This is the implementation of the Cloud Spanner data handler for MindsDB.

[Cloud Spanner](https://cloud.google.com/spanner) is a fully managed, mission-critical, relational database service that offers transactional consistency at global scale and automatic, synchronous replication for high availability. It supports two SQL dialects: GoogleSQL (ANSI 2011 with extensions) and PostgreSQL.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Cloud Spanner to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Cloud Spanner.

## Implementation

This handler was implemented using the `google-cloud-spanner` Python client library.

The required arguments to establish a connection are as follows:

* `instance_id` is the instance identifier.
* `database_id` is the database identifier.
* `project` is the identifier of the project that owns the resources.
* `credentials` is a stringified GCP service account key JSON.

## Usage

In order to make use of this handler and connect to the Cloud Spanner database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE cloud_spanner_datasource
WITH engine = 'cloud_spanner',
parameters = {
  "instance_id": "my-instance",
  "database_id": "example-id",
  "project": "my-project",
  "credentials": "{...}"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM cloud_spanner_datasource.my_table;
```

Cloud Spanner supports both PostgreSQL and GoogleSQL dialects. However, not all PostgreSQL features are supported.

# CockroachDB

Source: https://docs.mindsdb.com/integrations/data-integrations/cockroachdb

This is the implementation of the CockroachDB data handler for MindsDB.

[CockroachDB](https://www.cockroachlabs.com/docs/) was architected for complex, high-performance distributed writes and delivers scale-out read capability. CockroachDB delivers simple relational SQL transactions and abstracts complexity away from developers. It is wire-compatible with PostgreSQL and provides a familiar and easy interface for developers.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect CockroachDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to CockroachDB.

## Implementation

CockroachDB is wire-compatible with PostgreSQL. Therefore, its implementation extends the PostgreSQL handler.

The required arguments to establish a connection are as follows:

* `host` is the host name or IP address of the CockroachDB server.
* `database` is the name of the database to connect to.
* `user` is the user to authenticate with the CockroachDB server.
* `port` is the port to use when connecting.
* `password` is the password to authenticate the user.
## Usage

In order to make use of this handler and connect to the CockroachDB server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE cockroachdb
WITH engine = 'cockroachdb',
parameters = {
  "host": "localhost",
  "database": "dbname",
  "user": "admin",
  "password": "password",
  "port": "5432"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM cockroachdb.public.db;
```

# Couchbase

Source: https://docs.mindsdb.com/integrations/data-integrations/couchbase

This is the implementation of the Couchbase data handler for MindsDB.

[Couchbase](https://www.couchbase.com/) is an open-source, distributed multi-model NoSQL document-oriented database software package optimized for interactive applications. These applications may serve many concurrent users by creating, storing, retrieving, aggregating, manipulating, and presenting data.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Couchbase to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Couchbase.

## Implementation

This handler is implemented using the `couchbase` library, the Python driver for Couchbase.

The required arguments to establish a connection are as follows:

* `connection_string`: the connection string for the endpoint of the Couchbase server.
* `bucket`: the bucket name to use when connecting with the Couchbase server.
* `user`: the user to authenticate with the Couchbase server.
* `password`: the password to authenticate the user with the Couchbase server.
* `scope`: scopes are a level of data organization within a bucket. If omitted, it defaults to `_default`.

Note: The connection string expects either the `couchbases://` or `couchbase://` protocol.

If you are using Couchbase Capella, you can find the `connection_string` under the Connect tab. You will also be required to whitelist the machine(s) that will be running MindsDB, and database credentials will need to be created for the user. These steps can also be taken under the Connect tab.

## Usage

In order to make use of this handler and connect to a Couchbase server in MindsDB, the following syntax can be used. Note that the example uses the default `travel-sample` bucket, which can be enabled from the Couchbase UI with pre-defined scope and documents.

```sql theme={null}
CREATE DATABASE couchbase_datasource
WITH engine = 'couchbase',
parameters = {
  "connection_string": "couchbase://localhost",
  "bucket": "travel-sample",
  "user": "admin",
  "password": "password",
  "scope": "inventory"
};
```

Now, you can use this established connection to query your database as follows:

```sql theme={null}
SELECT * FROM couchbase_datasource.airport;
```

# CrateDB

Source: https://docs.mindsdb.com/integrations/data-integrations/cratedb

This is the implementation of the CrateDB data handler for MindsDB.

[CrateDB](https://crate.io/) is a distributed SQL database management system that integrates a fully searchable document-oriented data store. It is open-source, written in Java, based on a shared-nothing architecture, and designed for high scalability. CrateDB includes components from Lucene, Elasticsearch and Netty.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:
1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect CrateDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to CrateDB.

## Implementation

This handler is implemented using `crate`, a Python library that allows you to use Python code to run SQL commands on CrateDB.

The required arguments to establish a connection are as follows:

* `user` is the username associated with the database.
* `password` is the password to authenticate your access.
* `host` is the hostname or IP address of the server.
* `port` is the port through which the connection is to be made.
* `schema_name` is the schema name to get tables from. It defaults to `doc`.

## Usage

In order to make use of this handler and connect to the CrateDB database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE crate_datasource
WITH engine = 'crate',
parameters = {
  "user": "crate",
  "password": "",
  "host": "127.0.0.1",
  "port": 4200,
  "schema_name": "doc"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM crate_datasource.demo;
```

# D0lt

Source: https://docs.mindsdb.com/integrations/data-integrations/d0lt

This is the implementation of the D0lt data handler for MindsDB.

[D0lt](https://docs.dolthub.com/introduction/what-is-dolt) is a single-node and embedded DBMS that incorporates Git-style versioning as a first-class entity. D0lt behaves like Git - it is a content-addressable local database where the main objects are tables instead of files. In D0lt, a user creates a database locally. The database contains tables that can be read and updated using SQL. Similar to Git, writes are staged until the user issues a commit. Upon commit, the writes are appended to permanent storage. Branch and merge semantics are supported, allowing the tables to evolve at a different pace for multiple users. This allows for loose collaboration on data as well as multiple views on the same core data. Merge conflicts are detected for schema and data conflicts. Data conflicts are cell-based, not line-based. Remote repositories allow for cooperation among repository instances. Clone, push, and pull semantics are all available.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect D0lt to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to D0lt.

## Implementation

This handler is implemented using `mysql-connector`, a Python library that allows you to use Python code to run SQL commands on the D0lt database.

The required arguments to establish a connection are as follows:

* `user` is the username associated with the database.
* `password` is the password to authenticate your access.
* `host` is the hostname or IP address of the server.
* `port` is the port through which a TCP/IP connection is to be made.
* `database` is the database name to be connected.
## Usage

In order to make use of this handler and connect to the D0lt database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE d0lt_datasource
WITH engine = 'd0lt',
parameters = {
  "user": "root",
  "password": "",
  "host": "127.0.0.1",
  "port": 3306,
  "database": "information_schema"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM d0lt_datasource.TEST;
```

# Databend

Source: https://docs.mindsdb.com/integrations/data-integrations/databend

This is the implementation of the Databend data handler for MindsDB.

[Databend](https://databend.rs/) is a modern cloud data warehouse that empowers your object storage for real-time analytics.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Databend to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Databend.

## Implementation

This handler is implemented by extending the ClickHouse handler.

The required arguments to establish a connection are as follows:

* `protocol` is the protocol to query Databend. Supported values include `native`, `http`, and `https`. It defaults to `native` if not provided.
* `host` is the host name or IP address of the Databend warehouse.
* `port` is the TCP/IP port of the Databend warehouse.
* `user` is the username used to authenticate with the Databend warehouse.
* `password` is the password to authenticate the user with the Databend warehouse.
* `database` is the database name to use when connecting with the Databend warehouse.

## Usage

In order to make use of this handler and connect to the Databend database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE databend_datasource
WITH engine = 'databend',
parameters = {
  "protocol": "https",
  "user": "root",
  "port": 443,
  "password": "password",
  "host": "some-url.aws-us-east-2.default.databend.com",
  "database": "test_db"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT * FROM databend_datasource.example_tbl;
```

# Databricks

Source: https://docs.mindsdb.com/integrations/data-integrations/databricks

This documentation describes the integration of MindsDB with [Databricks](https://www.databricks.com/), the world's first data intelligence platform powered by generative AI. The integration allows MindsDB to access data stored in a Databricks workspace and enhance it with AI capabilities.

This data source integration is thread-safe, utilizing a connection pool where each thread is assigned its own connection. When handling requests in parallel, threads retrieve connections from the pool as needed.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Databricks to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

If the Databricks cluster you are attempting to connect to is terminated, executing the queries given below will attempt to start the cluster; therefore, the first query may take a few minutes to execute. To avoid any delays, ensure that the Databricks cluster is running before executing the queries.
## Connection

Establish a connection to your Databricks workspace from MindsDB by executing the following SQL command:

```sql theme={null}
CREATE DATABASE databricks_datasource
WITH engine = 'databricks',
parameters = {
  "server_hostname": "adb-1234567890123456.7.azuredatabricks.net",
  "http_path": "sql/protocolv1/o/1234567890123456/1234-567890-test123",
  "access_token": "dapi1234567890ab1cde2f3ab456c7d89efa",
  "schema": "example_db"
};
```

Required connection parameters include the following:

* `server_hostname`: The server hostname for the cluster or SQL warehouse.
* `http_path`: The HTTP path of the cluster or SQL warehouse.
* `access_token`: A Databricks personal access token for the workspace.

Refer to the instructions given at [https://docs.databricks.com/en/integrations/compute-details.html](https://docs.databricks.com/en/integrations/compute-details.html) and [https://docs.databricks.com/en/dev-tools/python-sql-connector.html#authentication](https://docs.databricks.com/en/dev-tools/python-sql-connector.html#authentication) to find the connection parameters mentioned above for your compute resource.

Optional connection parameters include the following:

* `session_configuration`: Additional (key, value) pairs to set as Spark session configuration parameters. This should be provided as a JSON string.
* `http_headers`: Additional (key, value) pairs to set in HTTP headers on every RPC request the client makes. This should be provided as `"http_headers": [['Header-1', 'value1'], ['Header-2', 'value2']]`.
* `catalog`: The catalog to use for the connection. Default is `hive_metastore`.
* `schema`: The schema (database) to use for the connection. Default is `default`.

## Usage

Retrieve data from a specified table by providing the integration name, catalog, schema, and table name:

```sql theme={null}
SELECT *
FROM databricks_datasource.catalog_name.schema_name.table_name
LIMIT 10;
```

The catalog and schema names only need to be provided if the table to be queried is not in the specified (or default) catalog and schema.

Run Databricks SQL queries directly on the connected Databricks workspace:

```sql theme={null}
SELECT * FROM databricks_datasource (
  --Native Query Goes Here
  SELECT city, car_model, RANK() OVER (PARTITION BY car_model ORDER BY quantity) AS rank
  FROM dealer
  QUALIFY rank = 1;
);
```

The above examples utilize `databricks_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting Guide

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the Databricks workspace.
* **Checklist**:
  1. Make sure the Databricks workspace is active.
  2. Confirm that the server hostname, HTTP path, and access token are correctly provided. If the catalog and schema are provided, ensure they are correct as well.
  3. Ensure a stable network between MindsDB and the Databricks workspace.

`SQL statements running against tables (of reasonable size) are taking longer than expected`

* **Symptoms**: SQL queries taking longer than expected to execute.
* **Checklist**:
  1. Ensure the Databricks cluster is running before executing the queries.
  2. Check the network connection between MindsDB and the Databricks workspace.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing special characters.
* **Checklist**:
  1. Ensure table names with special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel-data
     * Incorrect: SELECT \* FROM integration.'travel-data'
     * Correct: SELECT \* FROM integration.\`travel-data\`

# DataStax

Source: https://docs.mindsdb.com/integrations/data-integrations/datastax

This is the implementation of the DataStax data handler for MindsDB.

[DataStax Astra DB](https://docs.datastax.com/en/astra-db-serverless/index.html) is a cloud database-as-a-service based on Apache Cassandra. DataStax also offers on-premises solutions, DataStax Enterprise (DSE) and Hyper-Converged Database (HCD), as well as Astra Streaming, a messaging and event streaming cloud service based on Apache Pulsar.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect DataStax to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Create an [Astra DB database](https://docs.datastax.com/en/astra-db-serverless/databases/create-database.html).

## Implementation

DataStax Astra DB is API-compatible with Apache Cassandra and ScyllaDB. Therefore, its implementation extends the ScyllaDB handler and uses the `scylla-driver` Python library.

The required arguments to establish a connection are as follows:

* `user`: The literal string `token`.
* `password`: An [Astra application token](https://docs.datastax.com/en/astra-db-serverless/administration/manage-application-tokens.html).
* `secure_connect_bundle`: The path to your database's [Secure Connect Bundle](https://docs.datastax.com/en/astra-db-serverless/databases/secure-connect-bundle.html) zip file.

## Usage

In order to make use of this handler and connect to the Astra DB database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE astra_connection
WITH engine = "astra",
parameters = {
  "user": "token",
  "password": "application_token",
  "secure_connect_bundle": "/home/Downloads/file.zip"
};
```

Or, reference the bundle from DataStax S3 as follows:

```sql theme={null}
CREATE DATABASE astra_connection
WITH ENGINE = "astra",
PARAMETERS = {
  "user": "token",
  "password": "application_token",
  "secure_connect_bundle": "https://datastax-cluster-config-prod.s3.us-east-2.amazonaws.com/32312-b9eb-4e09-a641-213eaesa12-1/secure-connect-demo.zip?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AK..."
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM astra_connection.keystore.example_table
LIMIT 10;
```

# DuckDB

Source: https://docs.mindsdb.com/integrations/data-integrations/duckdb

This is the implementation of the DuckDB data handler for MindsDB.

[DuckDB](https://duckdb.org/) is an open-source analytical database system. It is designed for fast execution of analytical queries. There are no external dependencies and the DBMS runs completely embedded within a host process, similar to SQLite. DuckDB provides a rich SQL dialect with support for complex queries with transactional guarantees (ACID).

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect DuckDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to DuckDB. ## Implementation This handler is implemented using the `duckdb` Python client library. The DuckDB handler currently uses the `0.7.1.dev187` pre-release version of the Python client library. In case of issues, make sure your DuckDB database is compatible with this version. See the [`requirements.txt`](https://github.com/mindsdb/mindsdb/blob/main/mindsdb/integrations/handlers/duckdb_handler/requirements.txt) for details. The required arguments to establish a connection are as follows: * `database` is the name of the DuckDB database file. It can be set to `:memory:` to create an in-memory database. The optional arguments are as follows: * `read_only` is a flag that specifies whether the connection is in read-only mode. This is required if multiple processes want to access the same database file at the same time. ## Usage In order to make use of this handler and connect to the DuckDB database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE duckdb_datasource WITH engine = 'duckdb', parameters = { "database": "db.duckdb" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM duckdb_datasource.my_table; ``` # EdgelessDB Source: https://docs.mindsdb.com/integrations/data-integrations/edgelessdb This is the implementation of the EdgelessDB data handler for MindsDB. [EdgelessDB](https://edgeless.systems/) is a full SQL database, tailor-made for confidential computing. It seamlessly integrates with your existing tools and workflows to help you unlock the full potential of your data. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect EdgelessDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to EdgelessDB. ## Implementation This handler is implemented by extending the MySQL connector.
The required arguments to establish a connection are as follows: * `host`: the host name of the EdgelessDB connection * `port`: the port to use when connecting * `user`: the user to authenticate * `password`: the password to authenticate the user * `database`: database name To use the full potential of EdgelessDB, you can also specify the following arguments: * `ssl`: whether to use SSL or not * `ssl_ca`: path or URL to the CA certificate * `ssl_cert`: path or URL to the client certificate * `ssl_key`: path or URL to the client key ## Usage In order to use EdgelessDB as a data source in MindsDB, you need to use the following syntax: ```sql theme={null} CREATE DATABASE edgelessdb_datasource WITH ENGINE = "EdgelessDB", PARAMETERS = { "user": "root", "password": "test123@!Aabvhj", "host": "localhost", "port": 3306, "database": "test_schema" }; ``` Or you can use the following syntax: ```sql theme={null} CREATE DATABASE edgelessdb_datasource2 WITH ENGINE = "EdgelessDB", PARAMETERS = { "user": "root", "password": "test123@!Aabvhj", "host": "localhost", "port": 3306, "database": "test_schema", "ssl_cert": "/home/marios/demo/cert.pem", "ssl_key": "/home/marios/demo/key.pem" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM edgelessdb_datasource.table_name; ``` # ElasticSearch Source: https://docs.mindsdb.com/integrations/data-integrations/elasticsearch This documentation describes the integration of MindsDB with [ElasticSearch](https://www.elastic.co/), a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. The integration allows MindsDB to access data from ElasticSearch and enhance ElasticSearch with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). 2. To connect ElasticSearch to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to ElasticSearch. ## Connection Establish a connection to ElasticSearch from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/elasticsearch_handler) as an engine. ```sql theme={null} CREATE DATABASE elasticsearch_datasource WITH ENGINE = 'elasticsearch', PARAMETERS={ 'cloud_id': 'xyz', -- optional, if hosts are provided 'hosts': 'https://xyz.xyz.gcp.cloud.es.io:123', -- optional, if cloud_id is provided 'api_key': 'xyz', -- optional, if user and password are provided 'user': 'elastic', -- optional, if api_key is provided 'password': 'xyz' -- optional, if api_key is provided }; ``` The connection parameters include the following: * `cloud_id`: The Cloud ID provided with the ElasticSearch deployment. Required only when `hosts` is not provided. * `hosts`: The ElasticSearch endpoint provided with the ElasticSearch deployment. Required only when `cloud_id` is not provided. * `api_key`: The API key that you generated for the ElasticSearch deployment. Required only when `user` and `password` are not provided. * `user` and `password`: The user and password used to authenticate. Required only when `api_key` is not provided.
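For instance, a minimal connection sketch for a cloud deployment that relies on `cloud_id` and `api_key` alone might look as follows; the datasource name and all values are placeholders:

```sql theme={null}
CREATE DATABASE elasticsearch_cloud_datasource
WITH ENGINE = 'elasticsearch',
PARAMETERS = {
  'cloud_id': 'my-deployment:abc123',  -- Cloud ID from the deployment (placeholder)
  'api_key': 'xyz'                     -- API key generated for the deployment (placeholder)
};
```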
If you want to connect to the local instance of ElasticSearch, use the statement below: ```sql theme={null} CREATE DATABASE elasticsearch_datasource WITH ENGINE = 'elasticsearch', PARAMETERS = { "hosts": "127.0.0.1:9200", "user": "user", "password": "password" }; ``` Required connection parameters include the following (at least one of these parameters should be provided): * `hosts`: The IP address and port where ElasticSearch is deployed. * `user`: The user used to authenticate access. * `password`: The password used to authenticate access. ## Usage Retrieve data from a specified index by providing the integration name and index name: ```sql theme={null} SELECT * FROM elasticsearch_datasource.my_index LIMIT 10; ``` The above examples utilize `elasticsearch_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. At the moment, the Elasticsearch SQL API has certain limitations that have an impact on the queries that can be issued via MindsDB. The most notable of these limitations are listed below: 1. Only `SELECT` queries are supported at the moment. 2. Array fields are not supported. 3. Nested fields cannot be queried directly. However, they can be accessed using the `.` operator. For a detailed guide on the limitations of the Elasticsearch SQL API, refer to the [official documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-limitations.html). ## Troubleshooting Guide `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the Elasticsearch server. * **Checklist**: 1. Make sure the Elasticsearch server is active. 2. Confirm that server, cloud ID and credentials are correct. 3. Ensure a stable network between MindsDB and Elasticsearch. `Transport Error` or `Request Error` * **Symptoms**: Errors related to the issuing of unsupported queries to Elasticsearch. * **Checklist**: 1. Ensure the query is a `SELECT` query. 2. Avoid querying array fields. 3. Access nested fields using the `.` operator. 4. Refer to the [official documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-limitations.html) for more information if needed. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing index names containing special characters. * **Checklist**: 1. Ensure table names with special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel-data * Incorrect: SELECT \* FROM integration.'travel-data' * Correct: SELECT \* FROM integration.\`travel-data\` This [troubleshooting guide](https://www.elastic.co/guide/en/elasticsearch/reference/current/troubleshooting.html) provided by Elasticsearch might also be helpful. # Firebird Source: https://docs.mindsdb.com/integrations/data-integrations/firebird This is the implementation of the Firebird data handler for MindsDB. [Firebird](https://firebirdsql.org/en/about-firebird/) is a relational database offering many ANSI SQL standard features that runs on Linux, Windows, and a variety of Unix platforms. Firebird offers excellent concurrency, high performance, and powerful language support for stored procedures and triggers. It has been used in production systems, under a variety of names, since 1981. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2.
To connect Firebird to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Firebird. ## Implementation This handler is implemented using the `fdb` library, the Python driver for Firebird. The required arguments to establish a connection are as follows: * `host` is the host name or IP address of the Firebird server. * `database` is the path to the database file to connect to on the Firebird server. * `user` is the username to authenticate the user with the Firebird server. * `password` is the password to authenticate the user with the Firebird server. ## Usage In order to make use of this handler and connect to the Firebird server in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE firebird_datasource WITH engine = 'firebird', parameters = { "host": "localhost", "database": "C:\\Users\\minura\\Documents\\mindsdb\\example.fdb", "user": "sysdba", "password": "password" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM firebird_datasource.example_tbl; ``` # Google BigQuery Source: https://docs.mindsdb.com/integrations/data-integrations/google-bigquery This documentation describes the integration of MindsDB with [Google BigQuery](https://cloud.google.com/bigquery?hl=en), a fully managed, AI-ready data analytics platform that helps you maximize value from your data. The integration allows MindsDB to access data stored in the BigQuery warehouse and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect BigQuery to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to your BigQuery warehouse from MindsDB by executing the following SQL command: ```sql theme={null} CREATE DATABASE bigquery_datasource WITH engine = "bigquery", parameters = { "project_id": "bgtest-1111", "dataset": "mydataset", "service_account_keys": "/tmp/keys.json" }; ``` Required connection parameters include the following: * `project_id`: The globally unique identifier for your project in Google Cloud where BigQuery is located. * `dataset`: The default dataset to connect to. Optional connection parameters include the following: * `service_account_keys`: The full path to the service account key file. * `service_account_json`: The content of the service account key file, provided as JSON. One of `service_account_keys` or `service_account_json` has to be provided to establish a connection to BigQuery.
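As a sketch, a connection that passes the key content inline via `service_account_json` instead of a file path might look as follows; the datasource name is a placeholder, the fields shown mirror a standard Google service account key file, and whether the value is passed as an inline JSON object is an assumption here:

```sql theme={null}
CREATE DATABASE bigquery_json_datasource
WITH engine = "bigquery",
parameters = {
  "project_id": "bgtest-1111",
  "dataset": "mydataset",
  "service_account_json": {
    "type": "service_account",      -- standard key-file fields; values elided
    "project_id": "bgtest-1111",
    "private_key_id": "...",
    "private_key": "...",
    "client_email": "..."
  }
};
```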
## Usage Retrieve data from a specified table in the default dataset by providing the integration name and table name: ```sql theme={null} SELECT * FROM bigquery_datasource.table_name LIMIT 10; ``` Retrieve data from a specified table in a different dataset by providing the integration name, dataset name and table name: ```sql theme={null} SELECT * FROM bigquery_datasource.dataset_name.table_name LIMIT 10; ``` Run SQL in the BigQuery dialect directly on the connected BigQuery database: ```sql theme={null} SELECT * FROM bigquery_datasource ( --Native Query Goes Here SELECT * FROM t1 WHERE t1.a IN (SELECT t2.a FROM t2 FOR SYSTEM_TIME AS OF t1.timestamp_column); ); ``` The above examples utilize `bigquery_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. ## Troubleshooting Guide `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the BigQuery warehouse. * **Checklist**: 1. Make sure that the Google Cloud account is active and the Google BigQuery service is enabled. 2. Confirm that the project ID, dataset and service account credentials are correct. Try a direct BigQuery connection using a client like DBeaver. 3. Ensure a stable network between MindsDB and Google BigQuery. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters. * **Checklist**: 1. Ensure table names with spaces or special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel data * Incorrect: SELECT \* FROM integration.'travel data' * Correct: SELECT \* FROM integration.\`travel data\` # Google Cloud SQL Source: https://docs.mindsdb.com/integrations/data-integrations/google-cloud-sql This is the implementation of the Google Cloud SQL data handler for MindsDB. [Cloud SQL](https://cloud.google.com/sql) is a fully-managed database service that makes it easy to set up, maintain, manage, and administer your relational PostgreSQL, MySQL, and SQL Server databases in the cloud. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Google Cloud SQL to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Google Cloud SQL. ## Implementation This handler was implemented using the existing MindsDB handlers for MySQL, PostgreSQL and SQL Server. The required arguments to establish a connection are as follows: * `host`: the host name or IP address of the Google Cloud SQL instance. * `port`: the TCP/IP port of the Google Cloud SQL instance. * `user`: the username used to authenticate with the Google Cloud SQL instance. * `password`: the password to authenticate the user with the Google Cloud SQL instance. * `database`: the database name to use when connecting with the Google Cloud SQL instance. * `db_engine`: the database engine of the Google Cloud SQL instance. This can take one of three values: 'mysql', 'postgresql' or 'mssql'.
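Since `db_engine` selects the underlying handler, connecting to a PostgreSQL-based instance is a matter of swapping `db_engine` and the port; here is a hedged sketch with placeholder values:

```sql theme={null}
CREATE DATABASE cloud_sql_postgres_datasource
WITH ENGINE = 'cloud_sql',
PARAMETERS = {
  "db_engine": "postgresql",  -- one of 'mysql', 'postgresql' or 'mssql'
  "host": "53.170.61.16",     -- placeholder instance IP
  "port": 5432,               -- default PostgreSQL port
  "user": "admin",
  "password": "password",
  "database": "example_db"
};
```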
## Usage In order to make use of this handler and connect to the Google Cloud SQL instance, you need to create a datasource with the following syntax: ```sql theme={null} CREATE DATABASE cloud_sql_mysql_datasource WITH ENGINE = 'cloud_sql', PARAMETERS = { "db_engine": "mysql", "host": "53.170.61.16", "port": 3306, "user": "admin", "password": "password", "database": "example_db" }; ``` To successfully connect to the Google Cloud SQL instance you have to make sure that the IP address of the machine you are using to connect is added to the authorized networks of the Google Cloud SQL instance. You can do this by following the steps below: 1. Go to the [Cloud SQL Instances](https://console.cloud.google.com/sql/instances) page. 2. Click on the instance you want to add authorized networks to. 3. Click on the **Connections** tab. 4. Click on the **Networking** tab. 5. Click on **Add network**. 6. Enter the IP address of the machine you want to connect from. If you are using the MindsDB Cloud version, you can use the following IP addresses: `18.220.205.95`, `3.19.152.46`, `52.14.91.162`. You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM cloud_sql_mysql_datasource.example_tbl; ``` # Google Sheets Source: https://docs.mindsdb.com/integrations/data-integrations/google-sheets This is the implementation of the Google Sheets data handler for MindsDB. [Google Sheets](https://www.google.com/sheets/about/) is a spreadsheet program included as a part of the free, web-based Google Docs Editors suite offered by Google. Please note that the integration of MindsDB with Google Sheets works for public sheets only. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Google Sheets to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Google Sheets. ## Implementation This handler is implemented using `duckdb`, a library that allows SQL queries to be executed on `pandas` DataFrames. In essence, when querying a particular sheet, the entire sheet is first pulled into a `pandas` DataFrame using the [Google Visualization API](https://developers.google.com/chart/interactive/docs/reference). Once this is done, SQL queries can be run on the DataFrame using `duckdb`. Since the entire sheet needs to be pulled into memory first (DataFrame), it is recommended to be somewhat careful when querying large datasets so as not to overload your machine. The required arguments to establish a connection are as follows: * `spreadsheet_id` is the unique ID of the Google Sheet. * `sheet_name` is the name of the sheet within the Google Sheet. ## Usage In order to make use of this handler and connect to a Google Sheet in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE sheets_datasource WITH engine = 'sheets', parameters = { "spreadsheet_id": "12wgS-1KJ9ymUM-6VYzQ0nJYGitONxay7cMKLnEE2_d0", "sheet_name": "iris" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM sheets_datasource.iris; ``` The name of the table will be the name of the relevant sheet, provided as an input to the `sheet_name` parameter. At the moment, only the `SELECT` statement is allowed to be executed through `duckdb`.
This, however, does not restrict you from running machine learning algorithms against your data in Google Sheets using the `CREATE MODEL` statement. # GreptimeDB Source: https://docs.mindsdb.com/integrations/data-integrations/greptimedb This is the implementation of the GreptimeDB data handler for MindsDB. [GreptimeDB](https://greptime.com/) is an open-source, cloud-native time series database that features analytical capabilities, scalability, and open protocol support. ## Implementation This handler is implemented by extending the MySQL data handler. Connect GreptimeDB to MindsDB by providing the following parameters: * `host` is the host name, IP address, or URL. * `port` is the port used to make TCP/IP connection. * `database` is the database name. * `user` is the database user. * `password` is the database password. There are several optional parameters that can be used as well. * `ssl` indicates whether SSL is enabled (`True`) or disabled (`False`). * `ssl_ca` is the SSL Certificate Authority. * `ssl_cert` stores SSL certificates. * `ssl_key` stores SSL keys. ## Usage In order to make use of this handler and connect to the GreptimeDB database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE greptimedb_datasource WITH engine = 'greptimedb', parameters = { "host": "127.0.0.1", "port": 4002, "database": "public", "user": "username", "password": "password" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM greptimedb_datasource.example_table; ``` # IBM Db2 Source: https://docs.mindsdb.com/integrations/data-integrations/ibm-db2 This documentation describes the integration of MindsDB with [IBM Db2](https://www.ibm.com/db2), the cloud-native database built to power low-latency transactions, real-time analytics and AI applications at scale. The integration allows MindsDB to access data stored in the IBM Db2 database and enhance it with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect IBM Db2 to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to your IBM Db2 database from MindsDB by executing the following SQL command: ```sql theme={null} CREATE DATABASE db2_datasource WITH engine = 'db2', parameters = { "host": "127.0.0.1", "user": "db2inst1", "password": "password", "database": "example_db" }; ``` Required connection parameters include the following: * `host`: The hostname, IP address, or URL of the IBM Db2 database. * `user`: The username for the IBM Db2 database. * `password`: The password for the IBM Db2 database. * `database`: The name of the IBM Db2 database to connect to. Optional connection parameters include the following: * `port`: The port number for connecting to the IBM Db2 database. Default is `50000`. * `schema`: The database schema to use within the IBM Db2 database.
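For example, a connection sketch that also sets the optional `port` and `schema` parameters (all values are placeholders) might look as follows:

```sql theme={null}
CREATE DATABASE db2_datasource
WITH engine = 'db2',
parameters = {
  "host": "127.0.0.1",
  "port": 50000,            -- optional; defaults to 50000
  "user": "db2inst1",
  "password": "password",
  "database": "example_db",
  "schema": "db2inst1"      -- optional schema within the database (placeholder)
};
```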
## Usage Retrieve data from a specified table by providing the integration name, schema, and table name: ```sql theme={null} SELECT * FROM db2_datasource.schema_name.table_name LIMIT 10; ``` Run IBM Db2 native queries directly on the connected database: ```sql theme={null} SELECT * FROM db2_datasource ( --Native Query Goes Here WITH DINFO (DEPTNO, AVGSALARY, EMPCOUNT) AS (SELECT OTHERS.WORKDEPT, AVG(OTHERS.SALARY), COUNT(*) FROM EMPLOYEE OTHERS GROUP BY OTHERS.WORKDEPT ), DINFOMAX AS (SELECT MAX(AVGSALARY) AS AVGMAX FROM DINFO) SELECT THIS_EMP.EMPNO, THIS_EMP.SALARY, DINFO.AVGSALARY, DINFO.EMPCOUNT, DINFOMAX.AVGMAX FROM EMPLOYEE THIS_EMP, DINFO, DINFOMAX WHERE THIS_EMP.JOB = 'SALESREP' AND THIS_EMP.WORKDEPT = DINFO.DEPTNO ); ``` The above examples utilize `db2_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. ## Troubleshooting Guide `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the IBM Db2 database. * **Checklist**: 1. Make sure the IBM Db2 database is active. 2. Confirm that host, user, password and database are correct. Try a direct connection using a client like DBeaver. 3. Ensure a stable network between MindsDB and the IBM Db2 database. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters. * **Checklist**: 1. Ensure table names with spaces or special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel-data * Incorrect: SELECT \* FROM integration.'travel-data' * Correct: SELECT \* FROM integration.\`travel-data\` This [guide](https://www.ibm.com/docs/en/db2/11.5?topic=connect-common-db2-problems) to common Db2 connection issues provided by IBM might also be helpful. # IBM Informix Source: https://docs.mindsdb.com/integrations/data-integrations/ibm-informix This is the implementation of the IBM Informix data handler for MindsDB. [IBM Informix](https://www.ibm.com/products/informix) is a product family within IBM's Information Management division that is centered on several relational database management system (RDBMS) offerings. The Informix server supports object–relational models and (through extensions) data types that are not a part of the SQL standard. The most widely used of these are the JSON, BSON, time series, and spatial extensions, which provide both data type support and language extensions that permit high-performance domain-specific queries and efficient storage for data sets based on semi-structured, time series, and spatial data. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect IBM Informix to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to IBM Informix. ## Implementation This handler is implemented using `IfxPy/IfxPyDbi`, a Python library that allows you to use Python code to run SQL commands on the Informix database. The required arguments to establish a connection are as follows: * `user` is the username associated with the database. * `password` is the password to authenticate your access. * `host` is the hostname or IP address of the server. * `port` is the port through which TCP/IP connection is to be made. * `database` is the database name to be connected.
* `schema_name` is the schema name to get tables. * `server` is the name of the server you want to connect to. * `logging_enabled` defines whether logging is enabled or not. Defaults to `True` if not provided. ## Usage In order to make use of this handler and connect to the Informix database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE informix_datasource WITH engine='informix', parameters={ "server": "server", "host": "127.0.0.1", "port": 9091, "user": "informix", "password": "in4mix", "database": "stores_demo", "schema_name": "love", "logging_enabled": False }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM informix_datasource.items; ``` This integration uses `IfxPy`. As it is in the development stage, it can be installed using `pip install IfxPy`. However, it does not work with newer versions of Python; in that case, you have to build it from source. On Linux: 1. This code downloads and extracts the `onedb-ODBC` driver used to make the connection: ```bash theme={null} cd $HOME mkdir Informix cd Informix mkdir -p home/informix/cli wget https://hcl-onedb.github.io/odbc/OneDB-Linux64-ODBC-Driver.tar sudo tar xvf OneDB-Linux64-ODBC-Driver.tar -C ./home/informix/cli rm OneDB-Linux64-ODBC-Driver.tar ``` 2. Add environment variables in the `.bashrc` file: ```bash theme={null} export INFORMIXDIR=$HOME/Informix/home/informix/cli/onedb-odbc-driver export LD_LIBRARY_PATH=${LD_LIBRARY_PATH}${INFORMIXDIR}/lib:${INFORMIXDIR}/lib/esql:${INFORMIXDIR}/lib/cli ``` 3. This code clones the `IfxPy` repo, builds a wheel, and installs it: ```bash theme={null} pip install wheel mkdir Temp cd Temp git clone https://github.com/OpenInformix/IfxPy.git cd IfxPy/IfxPy python setup.py bdist_wheel pip install --find-links=./dist IfxPy cd .. cd .. cd .. rm -rf Temp ``` On Windows: 1. This code downloads and extracts the `onedb-ODBC` driver used to make the connection: ```bash theme={null} cd $HOME mkdir Informix cd Informix mkdir /home/informix/cli wget https://hcl-onedb.github.io/odbc/OneDB-Win64-ODBC-Driver.zip tar xvf OneDB-Win64-ODBC-Driver.zip -C ./home/informix/cli del OneDB-Win64-ODBC-Driver.zip ``` 2. Add an environment variable: ```bash theme={null} set INFORMIXDIR=$HOME/Informix/home/informix/cli/onedb-odbc-driver ``` 3. Add `%INFORMIXDIR%\bin` to the PATH environment variable. 4. This code clones the `IfxPy` repo, builds a wheel, and installs it: ```bash theme={null} pip install wheel mkdir Temp cd Temp git clone https://github.com/OpenInformix/IfxPy.git cd IfxPy/IfxPy python setup.py bdist_wheel pip install --find-links=./dist IfxPy cd .. cd .. cd .. rmdir Temp ``` # InfluxDB Source: https://docs.mindsdb.com/integrations/data-integrations/influxdb This is the implementation of the InfluxDB data handler for MindsDB. [InfluxDB](https://www.influxdata.com/) is a time series database that can be used to collect data and monitor the system and devices, especially edge devices. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect InfluxDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to InfluxDB. ## Implementation The required arguments to establish a connection are as follows: * `influxdb_url` is the hosted URL of InfluxDB Cloud.
* `influxdb_token` is the authentication token for the hosted InfluxDB Cloud instance. * `influxdb_db_name` is the database name of the InfluxDB Cloud instance. * `influxdb_table_name` is the table name of the InfluxDB Cloud instance. Please follow [this link](https://docs.influxdata.com/influxdb/cloud/security/tokens/create-token/#create-a-token-in-the-influxdb-ui) to generate a token for accessing the InfluxDB API. ## Usage In order to make use of this handler and connect to the InfluxDB database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE influxdb_source WITH ENGINE = 'influxdb', PARAMETERS = { "influxdb_url": "", "influxdb_token": "", "influxdb_db_name": "", "influxdb_table_name": "" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT name, time, sensor_id, temperature FROM influxdb_source.tables ORDER BY temperature DESC LIMIT 65; ``` # MariaDB Source: https://docs.mindsdb.com/integrations/data-integrations/mariadb This documentation describes the integration of MindsDB with [MariaDB](https://mariadb.org/), one of the most popular open source relational databases. The integration allows MindsDB to access data from MariaDB and enhance MariaDB with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). 2. To connect MariaDB to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to MariaDB from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/mariadb_handler) as an engine. ```sql theme={null} CREATE DATABASE mariadb_conn WITH ENGINE = 'mariadb', PARAMETERS = { "host": "host-name", "port": 3307, "database": "db-name", "user": "user-name", "password": "password" }; ``` Or: ```sql theme={null} CREATE DATABASE mariadb_conn WITH ENGINE = 'mariadb', PARAMETERS = { "url": "mariadb://user-name@host-name:3307" }; ``` Required connection parameters include the following: * `user`: The username for the MariaDB database. * `password`: The password for the MariaDB database. * `host`: The hostname, IP address, or URL of the MariaDB server. * `port`: The port number for connecting to the MariaDB server. * `database`: The name of the MariaDB database to connect to. Or: * `url`: You can specify a connection to MariaDB Server using a URI-like string, as an alternative connection option. You can also use `mysql://` as the protocol prefix. Optional connection parameters include the following: * `ssl`: Boolean parameter that indicates whether SSL encryption is enabled for the connection. Set to True to enable SSL and enhance connection security, or set to False to use the default non-encrypted connection. * `ssl_ca`: Specifies the path to the Certificate Authority (CA) file in PEM format. * `ssl_cert`: Specifies the path to the SSL certificate file. This certificate should be signed by a trusted CA specified in the `ssl_ca` file or be a self-signed certificate trusted by the server. * `ssl_key`: Specifies the path to the private key file (in PEM format). * `use_pure` (`True` by default): Whether to use pure Python or C Extension.
If `use_pure=False` and the C Extension is not available, then Connector/Python will automatically fall back to the pure Python implementation. ## Usage The following usage examples utilize the connection to MariaDB made via the `CREATE DATABASE` statement and named `mariadb_conn`. Retrieve data from a specified table by providing the integration and table name. ```sql theme={null} SELECT * FROM mariadb_conn.table_name LIMIT 10; ``` ## Troubleshooting `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the MariaDB database. * **Checklist**: 1. Ensure that the MariaDB server is running and accessible. 2. Confirm that host, port, user, and password are correct. Try a direct MySQL connection. 3. Test the network connection between the MindsDB host and the MariaDB server. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing table names containing spaces, reserved words or special characters. * **Checklist**: 1. Ensure table names with spaces or special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel data * Incorrect: SELECT \* FROM integration.'travel data' * Correct: SELECT \* FROM integration.\`travel data\` # MatrixOne Source: https://docs.mindsdb.com/integrations/data-integrations/matrixone This is the implementation of the MatrixOne data handler for MindsDB. [MatrixOne](https://github.com/matrixorigin/matrixone) is a future-oriented hyper-converged cloud and edge native DBMS that supports transactional, analytical, and streaming workloads with a simplified and distributed database engine, across multiple data centers, clouds, edges, and other heterogeneous infrastructures. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect MatrixOne to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to MatrixOne. ## Implementation This handler is implemented using `PyMySQL`, a Python library that allows you to use Python code to run SQL commands on the MatrixOne database. The required arguments to establish a connection are as follows: * `user` is the username associated with the database. * `password` is the password to authenticate your access. * `host` is the hostname or IP address of the database. * `port` is the port through which TCP/IP connection is to be made. * `database` is the database name to be connected. There are several optional arguments that can be used as well. * `ssl` indicates whether SSL is enabled (`True`) or disabled (`False`). * `ssl_ca` is the SSL Certificate Authority. * `ssl_cert` stores the SSL certificates. * `ssl_key` stores the SSL keys. ## Usage In order to make use of this handler and connect to the MatrixOne database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE matrixone_datasource WITH engine = 'matrixone', parameters = { "user": "dump", "password": "111", "host": "127.0.0.1", "port": 6001, "database": "mo_catalog" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM matrixone_datasource.demo; ``` # Microsoft Access Source: https://docs.mindsdb.com/integrations/data-integrations/microsoft-access This is the implementation of the Microsoft Access data handler for MindsDB.
[Microsoft Access](https://www.microsoft.com/en-us/microsoft-365/access) is a pseudo-relational database engine from Microsoft. It is part of the Microsoft Office suite of applications that also includes Word, Outlook, and Excel, among others. Access is also available for purchase as a stand-alone product. It uses the Jet Database Engine for data storage. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Microsoft Access to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to Microsoft Access. ## Implementation This handler is implemented using `pyodbc`, the Python ODBC bridge. The only required argument to establish a connection is `db_file`, which points to the database file to be queried. ## Usage In order to make use of this handler and connect to the Access database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE access_datasource WITH engine = 'access', parameters = { "db_file":"C:\\Users\\minurap\\Documents\\example_db.accdb" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM access_datasource.example_tbl; ``` # Microsoft SQL Server Source: https://docs.mindsdb.com/integrations/data-integrations/microsoft-sql-server This documentation describes the integration of MindsDB with Microsoft SQL Server, a relational database management system developed by Microsoft. The integration allows for advanced SQL functionalities, extending Microsoft SQL Server's capabilities with MindsDB's features. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB [locally via Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or use [MindsDB Cloud](https://cloud.mindsdb.com/). 2. To connect Microsoft SQL Server to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). ### Installation The MSSQL handler supports two connection methods: #### Option 1: Standard Connection (pymssql - Recommended) ```bash theme={null} pip install mindsdb[mssql] ``` This installs `pymssql`, which provides native FreeTDS-based connections. Works on all platforms. #### Option 2: ODBC Connection (pyodbc) ```bash theme={null} pip install mindsdb[mssql-odbc] ``` This installs both `pymssql` and `pyodbc` for ODBC driver support.
**Additional requirements for ODBC:** * **System ODBC libraries**: On Linux, install `unixodbc` and `unixodbc-dev` ```bash theme={null} sudo apt-get install unixodbc unixodbc-dev ``` * **Microsoft ODBC Driver for SQL Server**: * **Linux**: ```bash theme={null} # Add Microsoft repository curl https://packages.microsoft.com/keys/microsoft.asc | sudo tee /etc/apt/trusted.gpg.d/microsoft.asc curl https://packages.microsoft.com/config/ubuntu/$(lsb_release -rs)/prod.list | sudo tee /etc/apt/sources.list.d/mssql-release.list # Install ODBC Driver 18 sudo apt-get update sudo ACCEPT_EULA=Y apt-get install -y msodbcsql18 ``` * **macOS**: `brew install msodbcsql18` * **Windows**: Download from [Microsoft](https://learn.microsoft.com/en-us/sql/connect/odbc/download-odbc-driver-for-sql-server) To verify installed drivers: ```bash theme={null} odbcinst -q -d ``` ## Connection Establish a connection to your Microsoft SQL Server database from MindsDB by executing the following SQL command: ```sql theme={null} CREATE DATABASE mssql_datasource WITH ENGINE = 'mssql', PARAMETERS = { "host": "127.0.0.1", "port": 1433, "user": "sa", "password": "password", "database": "master" }; ``` Required connection parameters include the following: * `user`: The username for the Microsoft SQL Server. * `password`: The password for the Microsoft SQL Server. * `host`: The hostname, IP address, or URL of the Microsoft SQL Server. * `database`: The name of the Microsoft SQL Server database to connect to. Optional connection parameters include the following: * `port`: The port number for connecting to the Microsoft SQL Server. Default is 1433. * `server`: The server name to connect to. Typically only used with named instances or Azure SQL Database. ### ODBC Connection The handler also supports ODBC connections via `pyodbc` for advanced scenarios like Windows Authentication or specific driver requirements. #### Setup 1. Install: `pip install mindsdb[mssql-odbc]` 2. Install the system ODBC driver (see the Installation section above) Basic ODBC Connection: ```sql theme={null} CREATE DATABASE mssql_odbc_datasource WITH ENGINE = 'mssql', PARAMETERS = { "host": "127.0.0.1", "port": 1433, "user": "sa", "password": "password", "database": "master", "driver": "ODBC Driver 18 for SQL Server" -- Specifying driver enables ODBC }; ``` ODBC-specific Parameters: * `driver`: The ODBC driver name (e.g., "ODBC Driver 18 for SQL Server"). When specified, enables ODBC mode. * `use_odbc`: Set to `true` to explicitly use ODBC. Optional if `driver` is specified. If set to `true` without specifying `driver`, the driver defaults to `ODBC Driver 17 for SQL Server`. * `encrypt`: Connection encryption: `"yes"` or `"no"`. Driver 18 defaults to `"yes"`. * `trust_server_certificate`: Whether to trust self-signed certificates: `"yes"` or `"no"`. * `connection_string_args`: Additional connection string arguments.
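For instance, a sketch that enables ODBC explicitly via `use_odbc` without naming a driver, which then falls back to the default `ODBC Driver 17 for SQL Server` (connection values are placeholders):

```sql theme={null}
CREATE DATABASE mssql_odbc17_datasource
WITH ENGINE = 'mssql',
PARAMETERS = {
  "host": "127.0.0.1",
  "port": 1433,
  "user": "sa",
  "password": "password",
  "database": "master",
  "use_odbc": true  -- no "driver" given, so ODBC Driver 17 for SQL Server is used
};
```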
#### Example: Azure SQL Database with Encryption: ```sql theme={null} CREATE DATABASE azure_sql_datasource WITH ENGINE = 'mssql', PARAMETERS = { "host": "myserver.database.windows.net", "port": 1433, "user": "adminuser", "password": "SecurePass123!", "database": "mydb", "driver": "ODBC Driver 18 for SQL Server", "encrypt": "yes", "trust_server_certificate": "no" }; ``` #### Example: Local Development (Self-Signed Certificate): ```sql theme={null} CREATE DATABASE local_mssql WITH ENGINE = 'mssql', PARAMETERS = { "host": "localhost", "port": 1433, "user": "sa", "password": "YourStrong@Passw0rd", "database": "testdb", "driver": "ODBC Driver 18 for SQL Server", "encrypt": "yes", "trust_server_certificate": "yes" -- Allow self-signed certs }; ``` ## Usage Retrieve data from a specified table by providing the integration name, schema, and table name: ```sql theme={null} SELECT * FROM mssql_datasource.schema_name.table_name LIMIT 10; ``` Run T-SQL queries directly on the connected Microsoft SQL Server database: ```sql theme={null} SELECT * FROM mssql_datasource ( --Native Query Goes Here SELECT SUM(orderqty) total FROM Product p JOIN SalesOrderDetail sd ON p.productid = sd.productid JOIN SalesOrderHeader sh ON sd.salesorderid = sh.salesorderid JOIN Customer c ON sh.customerid = c.customerid WHERE (Name = 'Racing Socks, L') AND (companyname = 'Riding Cycles'); ); ``` The above examples utilize `mssql_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. ### Performance Optimization for Large Datasets The handler is optimized for efficient data processing, but for very large result sets (millions of rows): 1. **Use SQL Server's filtering**: Apply `WHERE` clauses to filter data on the server side 2. **Use pagination**: Use `TOP`/`OFFSET-FETCH` in SQL Server or `LIMIT` in MindsDB queries 3. **Aggregate when possible**: Use `GROUP BY`, `COUNT()`, `AVG()`, etc. to reduce data volume 4. **Index your tables**: Ensure proper indexes on SQL Server for query performance **Example - Paginated Query:** ```sql theme={null} SELECT * FROM mssql_datasource ( SELECT TOP 100000 * FROM large_table ORDER BY id OFFSET 0 ROWS ); ``` ## Troubleshooting Guide `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the Microsoft SQL Server database. * **Checklist**: 1. Make sure the Microsoft SQL Server is active. 2. Confirm that host, port, user, and password are correct. Try a direct Microsoft SQL Server connection using a client like SQL Server Management Studio or DBeaver. 3. Ensure a stable network between MindsDB and Microsoft SQL Server. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters. * **Checklist**: 1. Ensure table names with spaces or special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel data * Incorrect: SELECT \* FROM integration.'travel data' * Correct: SELECT \* FROM integration.\`travel data\` `ODBC Driver Connection Error` * **Symptoms**: Errors like "Driver not found", "Can't open lib 'ODBC Driver 17 for SQL Server'", or "pyodbc is not installed". * **Checklist**: 1. **Verify pyodbc is installed**: `pip list | grep pyodbc` 2. **Check system ODBC libraries**: `ldconfig -p | grep odbc` (Linux) should show libodbc.so 3. **Verify ODBC drivers**: Run `odbcinst -q -d` to list installed drivers 4. **Match driver name exactly**: Use the exact name from `odbcinst -q -d` (case-sensitive) 5. 
**For Driver 18 encryption errors**: Add `"encrypt": "yes", "trust_server_certificate": "yes"` for local/dev servers 6. **Test connection manually**: ```python theme={null} import pyodbc print(pyodbc.drivers()) # Should list available drivers ``` # MonetDB Source: https://docs.mindsdb.com/integrations/data-integrations/monetdb This is the implementation of the MonetDB data handler for MindsDB. [MonetDB](https://www.monetdb.org/) is an open-source column-oriented relational database management system originally developed at the Centrum Wiskunde & Informatica in the Netherlands. It is designed to provide high performance on complex queries against large databases, such as combining tables with hundreds of columns and millions of rows. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect MonetDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to MonetDB. ## Implementation This handler is implemented using `pymonetdb`, a Python library that allows you to use Python code to run SQL commands on the MonetDB database. The required arguments to establish a connection are as follows: * `user` is the username associated with the database. * `password` is the password to authenticate your access. * `host` is the host name or IP address. * `port` is the port through which TCP/IP connection is to be made. * `database` is the database name to be connected. * `schema_name` is the schema name to get tables. It is optional and defaults to the current schema if not provided. ## Usage In order to make use of this handler and connect to the MonetDB database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE monetdb_datasource WITH engine = 'monetdb', parameters = { "user": "monetdb", "password": "monetdb", "host": "127.0.0.1", "port": 50000, "schema_name": "sys", "database": "demo" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM monetdb_datasource.demo; ``` # MySQL Source: https://docs.mindsdb.com/integrations/data-integrations/mysql This documentation describes the integration of MindsDB with [MySQL](https://www.mysql.com/), a fast, reliable, and scalable open-source database. The integration allows MindsDB to access data from MySQL and enhance MySQL with AI capabilities. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop). 2. To connect MySQL to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to MySQL from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/mysql_handler) as an engine. 
```sql theme={null} CREATE DATABASE mysql_conn WITH ENGINE = 'mysql', PARAMETERS = { "host": "host-name", "port": 3306, "database": "db-name", "user": "user-name", "password": "password" }; ``` Or: ```sql theme={null} CREATE DATABASE mysql_datasource WITH ENGINE = 'mysql', PARAMETERS = { "url": "mysql://user-name@host-name:3306" }; ``` Required connection parameters include the following: * `user`: The username for the MySQL database. * `password`: The password for the MySQL database. * `host`: The hostname, IP address, or URL of the MySQL server. * `port`: The port number for connecting to the MySQL server. * `database`: The name of the MySQL database to connect to. Or: * `url`: You can specify a connection to MySQL Server using a URI-like string, as an alternative connection option. Optional connection parameters include the following: * `ssl`: Boolean parameter that indicates whether SSL encryption is enabled for the connection. Set to True to enable SSL and enhance connection security, or set to False to use the default non-encrypted connection. * `ssl_ca`: Specifies the path to the Certificate Authority (CA) file in PEM format. * `ssl_cert`: Specifies the path to the SSL certificate file. This certificate should be signed by a trusted CA specified in the `ssl_ca` file or be a self-signed certificate trusted by the server. * `ssl_key`: Specifies the path to the private key file (in PEM format). * `use_pure` (`True` by default): Whether to use pure Python or C Extension. If `use_pure=False` and the C Extension is not available, then Connector/Python will automatically fall back to the pure Python implementation. ## Usage The following usage examples utilize the connection to MySQL made via the `CREATE DATABASE` statement and named `mysql_conn`. Retrieve data from a specified table by providing the integration and table name. ```sql theme={null} SELECT * FROM mysql_conn.table_name LIMIT 10; ``` **Next Steps** Follow [this tutorial](https://docs.mindsdb.com/use-cases/data_enrichment/text-summarization-inside-mysql-with-openai) to see more use case examples. ## Troubleshooting `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the MySQL database. * **Checklist**: 1. Ensure that the MySQL server is running and accessible. 2. Confirm that host, port, user, and password are correct. Try a direct MySQL connection. 3. Test the network connection between the MindsDB host and the MySQL server. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing table names containing spaces, reserved words or special characters. * **Checklist**: 1. Ensure table names with spaces or special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel data * Incorrect: SELECT \* FROM integration.'travel data' * Correct: SELECT \* FROM integration.\`travel data\` # OceanBase Source: https://docs.mindsdb.com/integrations/data-integrations/oceanbase This is the implementation of the OceanBase data handler for MindsDB. OceanBase is a distributed relational database. It is the only distributed database in the world that has broken both TPC-C and TPC-H records. OceanBase adopts an independently developed integrated architecture, which encompasses both the scalability of a distributed architecture and the performance advantage of a centralized architecture. It supports hybrid transaction/analytical processing (HTAP) with one engine.
Its features include strong data consistency, high availability, high performance, online scalability, high compatibility with SQL and mainstream relational databases, transparency to applications, and a high cost/performance ratio. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect OceanBase to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to OceanBase. ## Implementation This handler is implemented by extending the MySQL data handler. The required arguments to establish a connection are as follows: * `user` is the database user. * `password` is the database password. * `host` is the host name, IP address, or URL. * `port` is the port used to make TCP/IP connection. * `database` is the database name. ## Usage In order to make use of this handler and connect to the OceanBase server in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE oceanbase_datasource WITH ENGINE = 'oceanbase', PARAMETERS = { "host": "127.0.0.1", "user": "oceanbase_user", "password": "password", "port": 2881, "database": "oceanbase_db" }; ``` Now, you can use this established connection to query your database as follows: ```sql theme={null} SELECT * FROM oceanbase_datasource.demo_table LIMIT 10; ``` # OpenGauss Source: https://docs.mindsdb.com/integrations/data-integrations/opengauss This is the implementation of the OpenGauss data handler for MindsDB. [OpenGauss](https://opengauss.org/en/) is an open-source relational database management system released under the Mulan PSL v2, with a kernel built on Huawei's years of experience in the database field. It continuously provides competitive features tailored to enterprise-grade scenarios. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect OpenGauss to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to OpenGauss. ## Implementation This handler is implemented by extending the PostgreSQL data handler. The required arguments to establish a connection are as follows: * `user` is the database user. * `password` is the database password. * `host` is the host name, IP address, or URL. * `port` is the port used to make TCP/IP connection. * `database` is the database name.
## Usage In order to make use of this handler and connect to the OpenGauss database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE opengauss_datasource WITH ENGINE = 'opengauss', PARAMETERS = { "host": "127.0.0.1", "port": 5432, "database": "opengauss", "user": "mindsdb", "password": "password" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM opengauss_datasource.demo_table LIMIT 10; ``` # Oracle Source: https://docs.mindsdb.com/integrations/data-integrations/oracle This documentation describes the integration of MindsDB with [Oracle](https://www.techopedia.com/definition/8711/oracle-database), one of the most trusted and widely used relational database engines for storing, organizing and retrieving data by type while still maintaining relationships between the various types. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect Oracle to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to your Oracle database from MindsDB by executing the following SQL command: ```sql theme={null} CREATE DATABASE oracle_datasource WITH ENGINE = 'oracle', PARAMETERS = { "host": "localhost", "service_name": "FREEPDB1", "user": "SYSTEM", "password": "password" }; ``` Required connection parameters include the following: * `user`: The username for the Oracle database. * `password`: The password for the Oracle database. * `dsn`: The data source name (DSN) for the Oracle database. OR * `host`: The hostname, IP address, or URL of the Oracle server. AND * `sid`: The system identifier (SID) of the Oracle database. OR * `service_name`: The service name of the Oracle database. Optional connection parameters include the following: * `port`: The port number for connecting to the Oracle database. Default is 1521. * `disable_oob`: The boolean parameter to disable out-of-band breaks. Default is `false`. * `auth_mode`: The authorization mode to use. * `thick_mode`: Set to `true` to use thick mode for the connection. Thin mode is used by default. * `oracle_client_lib_dir`: The directory path where Oracle Client libraries are located. Required if `thick_mode` is set to `true`. ## Usage Retrieve data from a specified table by providing the integration name, schema, and table name: ```sql theme={null} SELECT * FROM oracle_datasource.schema_name.table_name LIMIT 10; ``` Run PL/SQL queries directly on the connected Oracle database: ```sql theme={null} SELECT * FROM oracle_datasource ( --Native Query Goes Here SELECT employee_id, first_name, last_name, email, hire_date FROM hr.employees WHERE department_id = 10 ORDER BY hire_date DESC; ); ``` The above examples utilize `oracle_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command. ## Troubleshooting Guide `Database Connection Error` * **Symptoms**: Failure to connect MindsDB with the Oracle database. * **Checklist**: 1. Make sure the Oracle database is active. 2. Confirm that the connection parameters provided (DSN, host, SID, service\_name) and the credentials (user, password) are correct. 3. Ensure a stable network between MindsDB and Oracle. * **Symptoms**: Connection timeout errors. * **Checklist**: 1.
  1. Verify that the Oracle database is reachable from the MindsDB server.
  2. Check for any firewall or network restrictions that might be causing delays.

* **Symptoms**: `Can't connect to db: Failed to initialize Oracle client: DPI-1047: Cannot locate a 64-bit Oracle Client library`
* **Checklist**:
  1. Ensure that the Oracle Client libraries are installed on the MindsDB server.
  2. Verify that the `oracle_client_lib_dir` parameter is set correctly in the connection configuration.
  3. Check that the installed Oracle Client libraries match the architecture (64-bit) of the MindsDB server.

This [troubleshooting guide](https://docs.oracle.com/en/database/oracle/oracle-database/19/ntqrf/database-connection-issues.html) provided by Oracle might also be helpful.

# OrioleDB

Source: https://docs.mindsdb.com/integrations/data-integrations/orioledb

This is the implementation of the OrioleDB data handler for MindsDB.

[OrioleDB](https://www.orioledata.com/) is a new storage engine for PostgreSQL, bringing a modern approach to database capacity, capabilities, and performance to the world's most-loved database platform. It consists of an extension, building on the innovative table access method framework and other standard Postgres extension interfaces. By extending and enhancing the current table access methods, OrioleDB opens the door to a future of more powerful storage models that are optimized for cloud and modern hardware architectures.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect OrioleDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to OrioleDB.

## Implementation

This handler is implemented by extending the PostgreSQL data handler.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `server` is the OrioleDB server.
* `database` is the database name.

## Usage

In order to make use of this handler and connect to the OrioleDB server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE orioledb_datasource
WITH ENGINE = 'orioledb',
PARAMETERS = {
    "user": "orioledb_user",
    "password": "password",
    "host": "127.0.0.1",
    "port": 55505,
    "server": "server_name",
    "database": "oriole_db"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM orioledb_datasource.demo_table
LIMIT 10;
```

# PlanetScale

Source: https://docs.mindsdb.com/integrations/data-integrations/planetscale

This is the implementation of the PlanetScale data handler for MindsDB.

[PlanetScale](https://planetscale.com/) is a MySQL-compatible, serverless database platform.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect PlanetScale to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to PlanetScale.

## Implementation

This handler is implemented by extending the MySQL data handler.
The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.

## Usage

In order to make use of this handler and connect to the PlanetScale database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE planetscale_datasource
WITH ENGINE = 'planet_scale',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 3306,
    "user": "planetscale_user",
    "password": "password",
    "database": "planetscale_db"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM planetscale_datasource.my_table;
```

# PostgreSQL

Source: https://docs.mindsdb.com/integrations/data-integrations/postgresql

This documentation describes the integration of MindsDB with [PostgreSQL](https://www.postgresql.org/), a powerful, open-source, object-relational database system. The integration allows MindsDB to access data stored in the PostgreSQL database and enhance PostgreSQL with AI capabilities.

This data source integration is thread-safe, utilizing a connection pool where each thread is assigned its own connection. When handling requests in parallel, threads retrieve connections from the pool as needed.

### Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).
2. To connect PostgreSQL to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies).

## Connection

Establish a connection to your PostgreSQL database from MindsDB by executing the following SQL command:

```sql theme={null}
CREATE DATABASE postgresql_conn
WITH ENGINE = 'postgres',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 5432,
    "database": "postgres",
    "user": "postgres",
    "schema": "data",
    "password": "password"
};
```

Required connection parameters include the following:

* `user`: The username for the PostgreSQL database.
* `password`: The password for the PostgreSQL database.
* `host`: The hostname, IP address, or URL of the PostgreSQL server.
* `port`: The port number for connecting to the PostgreSQL server.
* `database`: The name of the PostgreSQL database to connect to.

Optional connection parameters include the following:

* `schema`: The database schema to use. Default is `public`.
* `sslmode`: The SSL mode for the connection.
* `connection_parameters`: allows passing any PostgreSQL libpq parameters (see the sketch below), such as:
  * SSL settings: sslrootcert, sslcert, sslkey, sslcrl, sslpassword
  * Network and reliability options: connect\_timeout, keepalives, keepalives\_idle, keepalives\_interval, keepalives\_count
  * Session options: application\_name, options, client\_encoding
  * Any other libpq-supported parameter
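For illustration, here is a sketch of a connection that passes libpq settings through the optional `connection_parameters` object. It assumes `connection_parameters` accepts a JSON object of libpq settings; all values shown are illustrative, not defaults:

```sql theme={null}
-- A sketch only: datasource name, credentials, and option values are placeholders.
CREATE DATABASE postgresql_conn_tuned
WITH ENGINE = 'postgres',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 5432,
    "database": "postgres",
    "user": "postgres",
    "password": "password",
    "connection_parameters": {
        "connect_timeout": 10,        -- fail fast if the server is unreachable
        "application_name": "mindsdb", -- label the session for monitoring
        "keepalives": 1                -- enable TCP keepalives
    }
};
```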
## Usage

The following usage examples utilize the connection to PostgreSQL made via the `CREATE DATABASE` statement and named `postgresql_conn`.

Retrieve data from a specified table by providing the integration name, schema, and table name:

```sql theme={null}
SELECT *
FROM postgresql_conn.table_name
LIMIT 10;
```

Run PostgreSQL-native queries directly on the connected PostgreSQL database:

```sql theme={null}
SELECT * FROM postgresql_conn (

    --Native Query Goes Here
    SELECT
        model,
        COUNT(*) OVER (PARTITION BY model, year) AS units_to_sell,
        ROUND((CAST(tax AS decimal) / price), 3) AS tax_div_price
    FROM used_car_price

);
```

**Next Steps**

Follow [this tutorial](https://docs.mindsdb.com/use-cases/predictive_analytics/house-sales-forecasting) to see more use case examples.

## Troubleshooting

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the PostgreSQL database.
* **Checklist**:
  1. Make sure the PostgreSQL server is active.
  2. Confirm that host, port, user, schema, and password are correct. Try a direct PostgreSQL connection.
  3. Ensure a stable network between MindsDB and PostgreSQL.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters.
* **Checklist**:
  1. Ensure table names with spaces or special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel data
     * Incorrect: SELECT \* FROM integration.'travel data'
     * Correct: SELECT \* FROM integration.\`travel data\`

# QuestDB

Source: https://docs.mindsdb.com/integrations/data-integrations/questdb

This is the implementation of the QuestDB data handler for MindsDB.

[QuestDB](https://questdb.io/) is a columnar time-series database with high performance ingestion and SQL analytics. It is open-source and available on the cloud.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect QuestDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to QuestDB.

## Implementation

This handler is implemented by extending the PostgreSQL data handler.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.
* `public` is a boolean flag (`True` or `False`). Defaults to `True` if left blank.

## Usage

In order to make use of this handler and connect to the QuestDB server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE questdb_datasource
WITH ENGINE = 'questdb',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 8812,
    "database": "qdb",
    "user": "admin",
    "password": "password"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM questdb_datasource.demo_table
LIMIT 10;
```

# SAP HANA

Source: https://docs.mindsdb.com/integrations/data-integrations/sap-hana

This documentation describes the integration of MindsDB with [SAP HANA](https://www.sap.com/products/technology-platform/hana/what-is-sap-hana.html), a multi-model database with a column-oriented in-memory design that stores data in its memory instead of keeping it on a disk. The integration allows MindsDB to access data from SAP HANA and enhance SAP HANA with AI capabilities.
## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).
2. To connect SAP HANA to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies).

## Connection

Establish a connection to SAP HANA from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/hana_handler) as an engine.

```sql theme={null}
CREATE DATABASE sap_hana_datasource
WITH ENGINE = 'hana',
PARAMETERS = {
    "address": "123e4567-e89b-12d3-a456-426614174000.hana.trial-us10.hanacloud.ondemand.com",
    "port": "443",
    "user": "demo_user",
    "password": "demo_password",
    "encrypt": true
};
```

Required connection parameters include the following:

* `address`: The hostname, IP address, or URL of the SAP HANA database.
* `port`: The port number for connecting to the SAP HANA database.
* `user`: The username for the SAP HANA database.
* `password`: The password for the SAP HANA database.

Optional connection parameters include the following:

* `database`: The name of the database to connect to. This parameter is not used for SAP HANA Cloud.
* `schema`: The database schema to use. Defaults to the user's default schema.
* `encrypt`: The setting to enable or disable encryption. Defaults to `true`.

## Usage

Retrieve data from a specified table by providing the integration, schema and table names:

```sql theme={null}
SELECT *
FROM sap_hana_datasource.schema_name.table_name
LIMIT 10;
```

Run SAP HANA SQL queries directly on the connected SAP HANA database:

```sql theme={null}
SELECT * FROM sap_hana_datasource (

    --Native Query Goes Here
    SELECT customer, year, SUM(sales)
    FROM t1
    GROUP BY ROLLUP(customer, year);

    SELECT customer, year, SUM(sales)
    FROM t1
    GROUP BY GROUPING SETS
    (
        (customer, year),
        (customer)
    )
    UNION ALL
    SELECT NULL, NULL, SUM(sales)
    FROM t1;

);
```

The above examples utilize `sap_hana_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the SAP HANA database.
* **Checklist**:
  1. Make sure the SAP HANA database is active.
  2. Confirm that address, port, user and password are correct. Try a direct connection using a client like DBeaver.
  3. Ensure a stable network between MindsDB and SAP HANA.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters.
* **Checklist**:
  1. Ensure table names with spaces or special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel-data
     * Incorrect: SELECT \* FROM integration.'travel-data'
     * Correct: SELECT \* FROM integration.\`travel-data\`

# SAP SQL Anywhere

Source: https://docs.mindsdb.com/integrations/data-integrations/sap-sql-anywhere

This is the implementation of the SAP SQL Anywhere data handler for MindsDB.

[SAP SQL Anywhere](https://www.sap.com/products/technology-platform/sql-anywhere.html) is an embedded database for application software that enables secure and reliable data management for servers where no DBA is available and synchronization for tens of thousands of mobile devices, Internet of Things (IoT) systems, and remote environments.
## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect SAP SQL Anywhere to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to SAP SQL Anywhere.

## Implementation

This handler is implemented using `sqlanydb`, the Python driver for SAP SQL Anywhere.

The required arguments to establish a connection are as follows:

* `host` is the host name or IP address of the SAP SQL Anywhere instance.
* `port` is the port number of the SAP SQL Anywhere instance.
* `user` specifies the user name.
* `password` specifies the password for the user.
* `database` sets the current database.
* `server` sets the current server.

## Usage

You can use the below SQL statements to create a table in SAP SQL Anywhere called `TEST`.

```sql theme={null}
CREATE TABLE TEST
(
    ID          INTEGER NOT NULL,
    NAME        NVARCHAR(1),
    DESCRIPTION NVARCHAR(1)
);

CREATE UNIQUE INDEX TEST_ID_INDEX ON TEST (ID);

ALTER TABLE TEST ADD CONSTRAINT TEST_PK PRIMARY KEY (ID);

INSERT INTO TEST VALUES (1, 'h', 'w');
```

In order to make use of this handler and connect to the SAP SQL Anywhere database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE sap_sqlany_trial
WITH ENGINE = 'sqlany',
PARAMETERS = {
    "user": "DBADMIN",
    "password": "password",
    "host": "localhost",
    "port": "55505",
    "server": "TestMe",
    "database": "MINDSDB"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM sap_sqlany_trial.test;
```

On execution, we get:

| ID | NAME | DESCRIPTION |
| -- | ---- | ----------- |
| 1  | h    | w           |

# ScyllaDB

Source: https://docs.mindsdb.com/integrations/data-integrations/scylladb

This is the implementation of the ScyllaDB data handler for MindsDB.

[ScyllaDB](https://www.scylladb.com/) is an open-source distributed NoSQL wide-column data store. It was purposefully designed to offer compatibility with Apache Cassandra while outperforming it with higher throughputs and reduced latencies. For a comprehensive understanding of ScyllaDB, visit ScyllaDB's official website.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect ScyllaDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to ScyllaDB.

### Implementation

The ScyllaDB handler for MindsDB was developed using the `scylla-driver` library for Python.

The required arguments to establish a connection are as follows:

* `host`: Host name or IP address of ScyllaDB.
* `port`: Connection port.
* `user`: Authentication username. Optional; required only if authentication is enabled.
* `password`: Authentication password. Optional; required only if authentication is enabled.
* `keyspace`: The specific keyspace (top-level container for tables) to connect to.
* `protocol_version`: Optional. Defaults to 4.
* `secure_connect_bundle`: Optional. Needed only for connections to DataStax Astra (see the sketch below).
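For instance, a connection to DataStax Astra might supply the optional `secure_connect_bundle` parameter. This is only a sketch: the datasource name, credentials, and bundle path are placeholders, not working values:

```sql theme={null}
-- A sketch only: replace the credentials and bundle path with your own values.
CREATE DATABASE astra_datasource
WITH ENGINE = 'scylladb',
PARAMETERS = {
    "user": "astra_client_id",
    "password": "astra_client_secret",
    "host": "127.0.0.1",
    "port": "9042",
    "keyspace": "test_data",
    "secure_connect_bundle": "/path/to/secure-connect-bundle.zip"
};
```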
## Usage

To set up a connection between MindsDB and a Scylla server, utilize the following SQL syntax:

```sql theme={null}
CREATE DATABASE scylladb_datasource
WITH ENGINE = 'scylladb',
PARAMETERS = {
    "user": "user@mindsdb.com",
    "password": "pass",
    "host": "127.0.0.1",
    "port": "9042",
    "keyspace": "test_data"
};
```

The protocol version is set to 4 by default. Should you wish to modify it, simply include `"protocol_version": 5` within the `PARAMETERS` dictionary in the query above.

With the connection established, you can execute queries on your keyspace as demonstrated below:

```sql theme={null}
SELECT *
FROM scylladb_datasource.test_data.example_table
LIMIT 10;
```

# SingleStore

Source: https://docs.mindsdb.com/integrations/data-integrations/singlestore

This is the implementation of the SingleStore data handler for MindsDB.

[SingleStore](https://www.singlestore.com/) is a proprietary, cloud-native database designed for data-intensive applications. It is a distributed, relational SQL database management system with ANSI SQL support, known for speed in data ingest, transaction processing, and query processing.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect SingleStore to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to SingleStore.

## Implementation

This handler is implemented by extending the MySQL data handler.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.

There are several optional arguments that can be used as well.

* `ssl` is the `ssl` parameter value that indicates whether SSL is enabled (`True`) or disabled (`False`).
* `ssl_ca` is the SSL Certificate Authority.
* `ssl_cert` stores SSL certificates.
* `ssl_key` stores SSL keys.

## Usage

In order to make use of this handler and connect to the SingleStore database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE singlestore_datasource
WITH ENGINE = 'singlestore',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 3306,
    "database": "singlestore",
    "user": "root",
    "password": "password"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM singlestore_datasource.example_table;
```

# Snowflake

Source: https://docs.mindsdb.com/integrations/data-integrations/snowflake

This documentation describes the integration of MindsDB with [Snowflake](https://www.snowflake.com/en/), a cloud data warehouse used to store and analyze data. The integration allows MindsDB to access data stored in the Snowflake database and enhance it with AI capabilities.

**Important!**

When querying data from Snowflake, MindsDB automatically converts column names to lower-case. To prevent this, users can provide an alias name as shown below.

**This update is introduced with the MindsDB version 25.3.4.1. It is not backward-compatible and has the following implications:**

1. Queries to Snowflake will return column names in lower-case from now on.
2. The models created with Snowflake as a data source must be recreated.
**How it works**

The below query presents how Snowflake columns are output when queried from MindsDB.

```sql theme={null}
SELECT
    CC_NAME,                -- converted to lower-case
    CC_CLASS AS `CC_CLASS`, -- provided alias name in upper-case
    CC_EMPLOYEES,
    cc_employees
FROM snowflake_data.TPCDS_SF100TCL.CALL_CENTER;
```

Here is the output:

```sql theme={null}
+--------------+----------+--------------+--------------+
| cc_name      | CC_CLASS | cc_employees | cc_employees |
+--------------+----------+--------------+--------------+
| NY Metro     | large    | 597159671    | 597159671    |
| Mid Atlantic | medium   | 944879074    | 944879074    |
+--------------+----------+--------------+--------------+
```

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Snowflake to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

## Connection

The Snowflake handler supports two authentication methods:

### 1. Password Authentication (Legacy)

Establish a connection using username and password:

```sql theme={null}
CREATE DATABASE snowflake_datasource
WITH ENGINE = 'snowflake',
PARAMETERS = {
    "account": "tvuibdy-vm85921",
    "user": "your_username",
    "password": "your_password",
    "database": "test_db",
    "auth_type": "password"
};
```

### 2. Key Pair Authentication (Recommended)

Key pair authentication is more secure and is the method recommended by Snowflake:

```sql theme={null}
CREATE DATABASE snowflake_datasource
WITH ENGINE = 'snowflake',
PARAMETERS = {
    "account": "tvuibdy-vm85921",
    "user": "your_username",
    "private_key_path": "/path/to/your/private_key.pem",
    "database": "test_db",
    "auth_type": "key_pair"
};
```

If the private key cannot be accessed from disk (for example when running MindsDB on Cloud), provide the PEM content directly:

```sql theme={null}
CREATE DATABASE snowflake_datasource
WITH ENGINE = 'snowflake',
PARAMETERS = {
    "account": "tvuibdy-vm85921",
    "user": "your_username",
    "private_key": "-----BEGIN PRIVATE KEY-----\\n...\\n-----END PRIVATE KEY-----",
    "database": "test_db",
    "auth_type": "key_pair"
};
```

With encrypted private key (passphrase protected):

```sql theme={null}
CREATE DATABASE snowflake_datasource
WITH ENGINE = 'snowflake',
PARAMETERS = {
    "account": "tvuibdy-vm85921",
    "user": "your_username",
    "private_key_path": "/path/to/your/private_key.pem",
    "private_key_passphrase": "your_passphrase",
    "database": "test_db",
    "auth_type": "key_pair"
};
```

### Connection Parameters

Required parameters:

* `account`: The Snowflake account identifier. This [guide](https://docs.snowflake.com/en/user-guide/admin-account-identifier) will help you find your account identifier.
* `user`: The username for the Snowflake account.
* `database`: The name of the Snowflake database to connect to.
* `auth_type`: The authentication type to use. Options: `"password"` or `"key_pair"`.

Authentication parameters (one method required):

* `password`: The password for the Snowflake account (password authentication).
* `private_key_path`: Path to the private key file for key pair authentication.
* `private_key`: PEM-formatted private key content for key pair authentication.
* `private_key_passphrase`: Optional passphrase for encrypted private key (key pair authentication).

Optional parameters:

* `warehouse`: The Snowflake warehouse to use for running queries.
* `schema`: The database schema to use within the Snowflake database. Default is `PUBLIC`.
* `role`: The Snowflake role to use.

For detailed instructions on setting up key pair authentication, please refer to the `AUTHENTICATION.md` file in the [Snowflake handler directory](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/snowflake_handler) of the MindsDB repository or the [Snowflake Key Pair Authentication documentation](https://docs.snowflake.com/en/user-guide/key-pair-auth.html).

## Usage

Retrieve data from a specified table by providing the integration name, schema, and table name:

```sql theme={null}
SELECT *
FROM snowflake_datasource.schema_name.table_name
LIMIT 10;
```

Run Snowflake SQL queries directly on the connected Snowflake database:

```sql theme={null}
SELECT * FROM snowflake_datasource (

    --Native Query Goes Here
    SELECT
        employee_table.* EXCLUDE department_id,
        department_table.* RENAME department_name AS department
    FROM employee_table
    INNER JOIN department_table
        ON employee_table.department_id = department_table.department_id
    ORDER BY department, last_name, first_name;

);
```

The above examples utilize `snowflake_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting Guide

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the Snowflake account.
* **Checklist**:
  1. Make sure the Snowflake account is active.
  2. Confirm that account, user, password and database are correct. Try a direct Snowflake connection using a client like DBeaver.
  3. Ensure a stable network between MindsDB and Snowflake.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters.
* **Checklist**:
  1. Ensure table names with spaces or special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel data
     * Incorrect: SELECT \* FROM integration.'travel data'
     * Correct: SELECT \* FROM integration.\`travel data\`

This [troubleshooting guide](https://community.snowflake.com/s/article/Snowflake-Client-Connectivity-Troubleshooting) provided by Snowflake might also be helpful.

# SQLite

Source: https://docs.mindsdb.com/integrations/data-integrations/sqlite

This is the implementation of the SQLite data handler for MindsDB.

[SQLite](https://www.sqlite.org/about.html) is an in-process library that implements a self-contained, serverless, zero-configuration, transactional SQL database engine. The code for SQLite is in the public domain and is thus free to use for any commercial or private purpose. SQLite is the most widely deployed database in the world, with more applications than we can count, including several high-profile projects.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect SQLite to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to SQLite.

## Implementation

This handler is implemented using the standard `sqlite3` library that comes with Python.

The only required argument to establish a connection is `db_file`, which points to the database file to connect to. Optionally, this may also be set to `:memory:` to create an in-memory database, as shown in the sketch below.
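For example, a transient in-memory database can be created with the same `CREATE DATABASE` syntax by passing `:memory:` as the file name; a minimal sketch (the datasource name is illustrative):

```sql theme={null}
-- A sketch only: creates a connection backed by an in-memory SQLite database.
CREATE DATABASE sqlite_memory_datasource
WITH engine = 'sqlite',
parameters = {
    "db_file": ":memory:"
};
```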
## Usage

In order to make use of this handler and connect to the SQLite database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE sqlite_datasource
WITH engine = 'sqlite',
parameters = {
    "db_file": "example.db"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM sqlite_datasource.example_tbl;
```

# StarRocks

Source: https://docs.mindsdb.com/integrations/data-integrations/starrocks

This is the implementation of the StarRocks data handler for MindsDB.

[StarRocks](https://www.starrocks.io/) is the next-generation data platform designed to make data-intensive real-time analytics fast and easy. It delivers query speeds 5 to 10 times faster than other popular solutions. StarRocks performs real-time analytics well while updating historical records, and can easily enrich real-time analytics with historical data from data lakes. With StarRocks, you can do away with denormalized tables and still get the best performance and flexibility.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect StarRocks to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to StarRocks.

## Implementation

This handler is implemented by extending the MySQL data handler.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.

## Usage

In order to make use of this handler and connect to the StarRocks server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE starrocks_datasource
WITH ENGINE = 'starrocks',
PARAMETERS = {
    "host": "127.0.0.1",
    "user": "starrocks_user",
    "password": "password",
    "port": 8030,
    "database": "starrocks_db"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM starrocks_datasource.demo_table
LIMIT 10;
```

# Supabase

Source: https://docs.mindsdb.com/integrations/data-integrations/supabase

This is the implementation of the Supabase data handler for MindsDB.

[Supabase](https://supabase.com/) is an open-source Firebase alternative.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Supabase to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Supabase.

## Implementation

This handler is implemented by extending the PostgreSQL data handler.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.
## Usage

In order to make use of this handler and connect to the Supabase server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE supabase_datasource
WITH ENGINE = 'supabase',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 54321,
    "database": "test",
    "user": "supabase",
    "password": "password"
};
```

You can use this established connection to query your database as follows:

```sql theme={null}
SELECT *
FROM supabase_datasource.public.rentals
LIMIT 10;
```

# SurrealDB

Source: https://docs.mindsdb.com/integrations/data-integrations/surrealdb

This is the implementation of the SurrealDB data handler for MindsDB.

[SurrealDB](https://surrealdb.com/) is an innovative NewSQL cloud database, suitable for serverless applications, jamstack applications, single-page applications, and traditional applications.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect SurrealDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to SurrealDB.

## Implementation

This handler was implemented using the Python library `pysurrealdb`.

The required arguments to establish a connection are:

* `host`: the host name of the SurrealDB server
* `port`: the port to use when connecting
* `user`: the user to authenticate
* `password`: the password to authenticate the user
* `database`: the database name to connect to
* `namespace`: the namespace name to connect to

## Usage

To establish a connection from the MindsDB public cloud instance to a SurrealDB server running locally, we are going to use ngrok tunneling to make the local SurrealDB server reachable from the cloud instance. You can follow this [guide](https://docs.mindsdb.com/sql/create/database#making-your-local-database-available-to-mindsdb) for that.

Let's make the connection with the MindsDB public cloud:

```sql theme={null}
CREATE DATABASE exampledb
WITH ENGINE = 'surrealdb',
PARAMETERS = {
    "host": "6.tcp.ngrok.io",
    "port": "17141",
    "user": "root",
    "password": "root",
    "database": "testdb",
    "namespace": "testns"
};
```

Please change the `host` and `port` properties in the `PARAMETERS` clause based on the values provided by ngrok.

We can now query the `dev` table, which we created earlier:

```sql theme={null}
SELECT *
FROM exampledb.dev;
```

# TDengine

Source: https://docs.mindsdb.com/integrations/data-integrations/tdengine

This is the implementation of the TDengine data handler for MindsDB.

[TDengine](https://tdengine.com/) is an open source, high-performance, cloud native time-series database optimized for Internet of Things (IoT), Connected Cars, and Industrial IoT. It enables efficient, real-time data ingestion, processing, and monitoring of TB and even PB scale data per day, generated by billions of sensors and data collectors. TDengine differentiates itself from other time-series databases with numerous advantages, such as high performance, simplified solution, cloud-native, ease of use, easy data analytics, and open-source.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect TDengine to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to TDengine.
## Implementation

This handler is implemented using `taos/taosrest`, a Python library that allows you to use Python code to run SQL commands on the TDengine server.

The required arguments to establish a connection are as follows:

* `user` is the username associated with the server.
* `password` is the password to authenticate your access.
* `url` is the URL of the TDengine server. For a local server, the URL is `localhost:6041` by default.
* `token` is the unique token provided when using TDengine Cloud.
* `database` is the database name to be connected.

## Usage

In order to make use of this handler and connect to the TDengine database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE tdengine_datasource
WITH ENGINE = 'tdengine',
PARAMETERS = {
    "user": "tdengine_user",
    "password": "password",
    "url": "localhost:6041",
    "token": "token",
    "database": "tdengine_db"
};
```

You can specify `token` instead of `user` and `password` when using TDengine Cloud.

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM tdengine_datasource.demo_table;
```

# Teradata

Source: https://docs.mindsdb.com/integrations/data-integrations/teradata

This documentation describes the integration of MindsDB with [Teradata](https://www.teradata.com/why-teradata), the complete cloud analytics and data platform for Trusted AI. The integration allows MindsDB to access data from Teradata and enhance Teradata with AI capabilities.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](https://docs.mindsdb.com/setup/self-hosted/docker) or [Docker Desktop](https://docs.mindsdb.com/setup/self-hosted/docker-desktop).
2. To connect Teradata to MindsDB, install the required dependencies following [this instruction](https://docs.mindsdb.com/setup/self-hosted/docker#install-dependencies).

## Connection

Establish a connection to Teradata from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/teradata_handler) as an engine.

```sql theme={null}
CREATE DATABASE teradata_datasource
WITH ENGINE = 'teradata',
PARAMETERS = {
    "host": "192.168.0.41",
    "user": "demo_user",
    "password": "demo_password",
    "database": "example_db"
};
```

Required connection parameters include the following:

* `host`: The hostname, IP address, or URL of the Teradata server.
* `user`: The username for the Teradata database.
* `password`: The password for the Teradata database.

Optional connection parameters include the following:

* `database`: The name of the Teradata database to connect to. Default is the user's default database.

## Usage

Retrieve data from a specified table by providing the integration, database and table names:

```sql theme={null}
SELECT *
FROM teradata_datasource.database_name.table_name
LIMIT 10;
```

Run Teradata SQL queries directly on the connected Teradata database:

```sql theme={null}
SELECT * FROM teradata_datasource (

    --Native Query Goes Here
    SELECT emp_id, emp_name, job_duration AS tsp
    FROM employee
    EXPAND ON job_duration AS tsp
        BY INTERVAL '1' YEAR
        FOR PERIOD(DATE '2006-01-01', DATE '2008-01-01');

);
```

The above examples utilize `teradata_datasource` as the datasource name, which is defined in the `CREATE DATABASE` command.

## Troubleshooting

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the Teradata database.
* **Checklist**: 1. Make sure the Teradata database is active. 2. Confirm that host, user and password are correct. Try a direct connection using a client like DBeaver. 3. Ensure a stable network between MindsDB and Teradata. `SQL statement cannot be parsed by mindsdb_sql` * **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters. * **Checklist**: 1. Ensure table names with spaces or special characters are enclosed in backticks. 2. Examples: * Incorrect: SELECT \* FROM integration.travel-data * Incorrect: SELECT \* FROM integration.'travel-data' * Correct: SELECT \* FROM integration.\`travel-data\` `Connection Timeout Error` * **Symptoms**: Connection to the Teradata database times out or queries take too long to execute. * **Checklist**: 1. Ensure the Teradata server is running and accessible (if the server has been idle for a long time, it may have shut down automatically). # TiDB Source: https://docs.mindsdb.com/integrations/data-integrations/tidb This is the implementation of the TiDB data handler for MindsDB. [TiDB](https://www.pingcap.com/tidb/) is an open-source NewSQL database that supports Hybrid Transactional and Analytical Processing workloads. It is MySQL-compatible and can provide horizontal scalability, strong consistency, and high availability. ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect TiDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). 3. Install or ensure access to TiDB. ## Implementation This handler is implemented by extending the MySQL data handler. The required arguments to establish a connection are as follows: * `user` is the database user. * `password` is the database password. * `host` is the host name, IP address, or URL. * `port` is the port used to make TCP/IP connection. * `database` is the database name. ## Usage In order to make use of this handler and connect to the TiDB database in MindsDB, the following syntax can be used: ```sql theme={null} CREATE DATABASE tidb_datasource WITH ENGINE = 'tidb', PARAMETERS = { "host": "127.0.0.1", "port": 4000, "database": "tidb", "user": "root", "password": "password" }; ``` You can use this established connection to query your table as follows: ```sql theme={null} SELECT * FROM tidb_datasource.demo_table; ``` # TimescaleDB Source: https://docs.mindsdb.com/integrations/data-integrations/timescaledb This documentation describes the integration of MindsDB with [TimescaleDB](https://docs.timescale.com). ## Prerequisites Before proceeding, ensure the following prerequisites are met: 1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop). 2. To connect TimescaleDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies). ## Connection Establish a connection to TimescaleDB from MindsDB by executing the following SQL command and providing its [handler name](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/timescaledb_handler) as an engine. 
```sql theme={null}
CREATE DATABASE timescaledb_datasource
WITH engine = 'timescaledb',
parameters = {
    "host": "examplehost.timescaledb.com",
    "port": 5432,
    "user": "example_user",
    "password": "my_password",
    "database": "tsdb"
};
```

Required connection parameters include the following:

* `user`: The username for the TimescaleDB database.
* `password`: The password for the TimescaleDB database.
* `host`: The hostname, IP address, or URL of the TimescaleDB server.
* `port`: The port number for connecting to the TimescaleDB server.
* `database`: The name of the TimescaleDB database to connect to.

Optional connection parameters include the following:

* `schema`: The database schema to use. Default is `public`.

## Usage

Before attempting to connect to a TimescaleDB server using MindsDB, ensure that it accepts incoming connections using [this guide](https://docs.timescale.com/latest/getting-started/setup/remote-connections/).

The following usage examples utilize the connection to TimescaleDB made via the `CREATE DATABASE` statement and named `timescaledb_datasource`.

Retrieve data from a specified table by providing the integration and table name. You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM timescaledb_datasource.sensor;
```

Run PostgreSQL-native queries directly on the connected TimescaleDB database:

```sql theme={null}
SELECT * FROM timescaledb_datasource (

    --Native Query Goes Here
    SELECT
        model,
        COUNT(*) OVER (PARTITION BY model, year) AS units_to_sell,
        ROUND((CAST(tax AS decimal) / price), 3) AS tax_div_price
    FROM used_car_price

);
```

## Troubleshooting

`Database Connection Error`

* **Symptoms**: Failure to connect MindsDB with the TimescaleDB database.
* **Checklist**:
  1. Make sure the TimescaleDB server is active.
  2. Confirm that host, port, user, schema, and password are correct. Try a direct TimescaleDB connection.
  3. Ensure a stable network between MindsDB and TimescaleDB.

`SQL statement cannot be parsed by mindsdb_sql`

* **Symptoms**: SQL queries failing or not recognizing table names containing spaces or special characters.
* **Checklist**:
  1. Ensure table names with spaces or special characters are enclosed in backticks.
  2. Examples:
     * Incorrect: SELECT \* FROM integration.travel data
     * Incorrect: SELECT \* FROM integration.'travel data'
     * Correct: SELECT \* FROM integration.\`travel data\`

# Trino

Source: https://docs.mindsdb.com/integrations/data-integrations/trino

This is the implementation of the Trino data handler for MindsDB.

[Trino](https://trino.io/) is an open-source distributed SQL query engine designed to query large data sets distributed over one or more heterogeneous data sources.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Trino to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Trino.

## Implementation

This handler is implemented using `pyhive`, a collection of Python DB-API and SQLAlchemy interfaces for Presto and Hive.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.

There are some optional arguments as follows:

* `auth` is the authentication method. Currently, only `basic` is supported.
* `http_scheme` takes the value of `http` by default. It can be set to `https` as well.
* `catalog` is the catalog.
* `schema` is the schema name.
* `with` defines the default WITH-clause (properties) for ALL tables. This parameter is experimental and might be changed or removed in a future release.

## Usage

In order to make use of this handler and connect to the Trino database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE trino_datasource
WITH ENGINE = 'trino',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 443,
    "auth": "basic",
    "http_scheme": "https",
    "user": "trino",
    "password": "password",
    "catalog": "default",
    "schema": "test",
    "with": "with (transactional = true)"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM trino_datasource.demo_table;
```

# Vertica

Source: https://docs.mindsdb.com/integrations/data-integrations/vertica

This is the implementation of the Vertica data handler for MindsDB.

The column-oriented [Vertica Analytics Platform](https://www.vertica.com/overview/) was designed to manage large, fast-growing volumes of data and to provide fast query performance for data warehouses and other query-intensive applications. The product claims to greatly improve query performance over traditional relational database systems, and to provide high availability and exabyte scalability on commodity enterprise servers. Vertica runs on multiple cloud computing systems as well as on Hadoop nodes. Vertica's Eon Mode separates compute from storage, using S3 object storage and dynamic allocation of compute nodes.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Vertica to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Vertica.

## Implementation

This handler is implemented using `vertica-python`, a Python library that allows you to use Python code to run SQL commands on the Vertica database.

The required arguments to establish a connection are as follows:

* `user` is the username associated with the database.
* `password` is the password to authenticate your access.
* `host` is the host name or IP address of the server.
* `port` is the port through which TCP/IP connection is to be made.
* `database` is the database name to be connected.
* `schema_name` is the schema name to get tables from.

## Usage

In order to make use of this handler and connect to the Vertica database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE vertica_datasource
WITH engine = 'vertica',
parameters = {
    "user": "dbadmin",
    "password": "password",
    "host": "127.0.0.1",
    "port": 5433,
    "schema_name": "public",
    "database": "VMart"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM vertica_datasource.TEST;
```

# Vitess

Source: https://docs.mindsdb.com/integrations/data-integrations/vitess

This is the implementation of the Vitess data handler for MindsDB.

[Vitess](https://vitess.io/) is a database solution for deploying, scaling, and managing large clusters of open-source database instances. It currently supports MySQL and Percona Server for MySQL.
It's architected to run as effectively in a public or private cloud architecture as it does on dedicated hardware. It combines and extends many important SQL features with the scalability of a NoSQL database.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Vitess to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Vitess.

## Implementation

This handler is implemented by extending the MySQL data handler.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.

## Usage

In order to make use of this handler and connect to the Vitess server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE vitess_datasource
WITH ENGINE = 'vitess',
PARAMETERS = {
    "user": "root",
    "password": "",
    "host": "localhost",
    "port": 33577,
    "database": "commerce"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM vitess_datasource.product
LIMIT 10;
```

# YugabyteDB

Source: https://docs.mindsdb.com/integrations/data-integrations/yugabytedb

This is the implementation of the YugabyteDB data handler for MindsDB.

[YugabyteDB](https://www.yugabyte.com/) is a high-performance, cloud-native distributed SQL database that aims to support all PostgreSQL features. It is best suited for cloud-native OLTP (i.e. real-time, business-critical) applications that need absolute data correctness and require at least one of the following: scalability, high tolerance to failures, or globally-distributed deployments.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect YugabyteDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to YugabyteDB.

## Implementation

This handler is implemented using `psycopg2`, a Python library that allows you to use Python code to run SQL commands on the YugabyteDB database.

The required arguments to establish a connection are as follows:

* `user` is the database user.
* `password` is the database password.
* `host` is the host name, IP address, or URL.
* `port` is the port used to make TCP/IP connection.
* `database` is the database name.
* `schema` is the schema to which your table belongs.

## Usage

In order to make use of this handler and connect to the YugabyteDB database in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE yugabyte_datasource
WITH engine = 'yugabyte',
parameters = {
    "user": "admin",
    "password": "1234",
    "host": "127.0.0.1",
    "port": 5433,
    "database": "yugabyte",
    "schema": "your_schema_name"
};
```

You can use this established connection to query your table as follows:

```sql theme={null}
SELECT *
FROM yugabyte_datasource.demo;
```

NOTE: If you are using YugabyteDB Cloud with the MindsDB Cloud website, you need to add the below 3 static IPs of MindsDB Cloud to the allow IP list so that it can be accessed publicly.
``` 18.220.205.95 3.19.152.46 52.14.91.162 ``` ![public](https://github-production-user-asset-6210df.s3.amazonaws.com/75653580/238903548-1b054591-f5db-4a6d-a3d0-d048671e4cfa.png) # Data Integrations Source: https://docs.mindsdb.com/integrations/data-overview MindsDB integrates with numerous data sources, including databases, vector stores, and applications, making data available to AI models by connecting data sources to MindsDB. **MindsDB supports Model Context Protocol (MCP)** MindsDB is an MCP server that enables your MCP applications to answer questions over large-scale federated data. [Learn more here](/mcp/overview). This section contains instructions on how to connect data sources to MindsDB. Note that MindsDB doesn't store or copy your data. Instead, it fetches data directly from your connected sources each time you make a query, ensuring that any changes to the data are instantly reflected. This means your data remains in its original location, and MindsDB always works with the most up-to-date information. ### Applications ### Databases ### Vector Stores
If you don't find a data source of your interest, you can [request a feature here](https://github.com/mindsdb/mindsdb/issues/new?assignees=\&labels=enhancement\&projects=\&template=feature_request_v2.yaml) or build a handler following [this instruction for data handlers](/contribute/data-handlers) and [this instruction for applications](/contribute/app-handlers).

**Metadata about data handlers and data sources**

**Data handlers** represent a raw implementation of the integration between MindsDB and a data source.

Here is how you can query for all the available data handlers used to connect data sources to MindsDB.

```sql theme={null}
SELECT *
FROM information_schema.handlers
WHERE type = 'data';
```

Or, alternatively:

```sql theme={null}
SHOW HANDLERS
WHERE type = 'data';
```

And here is how you can query for all the connected databases (data sources):

```sql theme={null}
SELECT *
FROM information_schema.databases;
```

Or, alternatively:

```sql theme={null}
SHOW DATABASES;
```

# Upload CSV, XLSX, XLS files to MindsDB

Source: https://docs.mindsdb.com/integrations/files/csv-xlsx-xls

You can upload CSV, XLSX, and XLS files of any size to MindsDB that runs locally via [Docker](/setup/self-hosted/docker) or [pip](/contribute/install).

CSV, XLSX, and XLS files are stored in the form of a table inside MindsDB.

## Upload files

Follow the steps below to upload a file:

1. Click on the `Add` dropdown and choose `Upload file`.

2. Upload a file and provide a name used to access it within MindsDB.

3. Alternatively, upload a file as a link and provide a name used to access it within MindsDB.

## Query files

XLSX and XLS files may contain one or more sheets. Here is how to query data within MindsDB.

Query for the list of available sheets in the file uploaded under the name `my_file`.

```sql theme={null}
SELECT *
FROM files.my_file;
```

Query for the content of one of the sheets listed with the command above.

```sql theme={null}
SELECT *
FROM files.my_file.my_sheet;
```

# Upload JSON files to MindsDB

Source: https://docs.mindsdb.com/integrations/files/json

You can upload JSON files of any size to MindsDB that runs locally via [Docker](/setup/self-hosted/docker) or [pip](/contribute/install).

JSON files are converted into a table if the JSON file structure allows for it. Otherwise, JSON files are stored similarly to text files.

Here is the sample format of a JSON file that can be uploaded to MindsDB:

```
[
    {
        "id": 1,
        "name": "Alice",
        "contact": {
            "email": "alice@example.com",
            "phone": "123-456-7890"
        },
        "address": {
            "street": "123 Maple Street",
            "city": "Wonderland",
            "zip": "12345"
        }
    },
    {
        "id": 2,
        "name": "Bob",
        "contact": {
            "email": "bob@example.com",
            "phone": "987-654-3210"
        },
        "address": {
            "street": "456 Oak Avenue",
            "city": "Builderland",
            "zip": "67890"
        }
    }
]
```

MindsDB converts it into a table where each row stores the high-level object.

```sql theme={null}
| id  | name  | contact                                              | address                                                         |
| --- | ----- | ---------------------------------------------------- | --------------------------------------------------------------- |
| 1   | Alice | {"email":"alice@example.com","phone":"123-456-7890"} | {"city":"Wonderland","street":"123 Maple Street","zip":"12345"} |
| 2   | Bob   | {"email":"bob@example.com","phone":"987-654-3210"}   | {"city":"Builderland","street":"456 Oak Avenue","zip":"67890"}  |
```

You can extract the JSON fields from `contact` and `address` columns with the `json_extract` function.

```sql theme={null}
SELECT
    id,
    name,
    json_extract(contact, '$.email') AS email,
    json_extract(address, '$.city') AS city
FROM files.json_file_name;
```

## Upload files

Follow the steps below to upload a file:

1. Click on the `Add` dropdown and choose `Upload file`.

2. Upload a file and provide a name used to access it within MindsDB.

3. Alternatively, upload a file as a link and provide a name used to access it within MindsDB.

## Query files Here is how to query data within MindsDB. Query for the content of the file uploaded under the name `my_file`. ```sql theme={null} SELECT * FROM files.my_file; ``` # Upload Parquet files to MindsDB Source: https://docs.mindsdb.com/integrations/files/parquet You can upload Parquet files of any size to MindsDB that runs locally via [Docker](/setup/self-hosted/docker) or [pip](/contribute/install). Parquet files are stored in the form of a table inside MindsDB. ## Upload files Follow the steps below to upload a file: 1. Click on the `Add` dropdown and choose `Upload file`.

2. Upload a file and provide a name used to access it within MindsDB.

3. Alternatively, upload a file as a link and provide a name used to access it within MindsDB.

## Query files Here is how to query data within MindsDB. Query for the content of the file uploaded under the name `my_file`. ```sql theme={null} SELECT * FROM files.my_file; ``` # Upload PDF files to MindsDB Source: https://docs.mindsdb.com/integrations/files/pdf You can upload PDF files of any size to MindsDB that runs locally via [Docker](/setup/self-hosted/docker) or [pip](/contribute/install). Note that MindsDB supports only searchable PDFs, as opposed to scanned PDFs. These are stored in the form of a table inside MindsDB. ## Upload files Follow the steps below to upload a file: 1. Click on the `Add` dropdown and choose `Upload file`.

2. Upload a file and provide a name used to access it within MindsDB.

## Query files Here is how to query data within MindsDB. Query for the content of the file uploaded under the name `my_file`. ```sql theme={null} SELECT * FROM files.my_file; ``` # Upload TXT files to MindsDB Source: https://docs.mindsdb.com/integrations/files/txt You can upload TXT files of any size to MindsDB that runs locally via [Docker](/setup/self-hosted/docker) or [pip](/contribute/install). TXT files are divided into chunks and stored in multiple table cells. MindsDB uses the [TextLoader from LangChain](https://api.python.langchain.com/en/latest/document_loaders/langchain_community.document_loaders.text.TextLoader.html) to load TXT files. ## Upload files Follow the steps below to upload a file: 1. Click on the `Add` dropdown and choose `Upload file`.

2. Upload a file and provide a name used to access it within MindsDB.

## Query files

Here is how to query data within MindsDB. Query for the content of the file uploaded under the name `my_file`.

```sql theme={null}
SELECT * FROM files.my_file;
```

# Sample Database
Source: https://docs.mindsdb.com/integrations/sample-database

MindsDB provides a read-only PostgreSQL database pre-loaded with various datasets. These datasets are curated to cover a wide range of scenarios and use cases, allowing you to experiment with different features of MindsDB.

Our publicly accessible PostgreSQL database is designed for testing and playground purposes. By using these datasets, you can quickly get started with MindsDB, understand how it works, and see how it can be applied to real-world problems.

## Connection

To connect to our read-only PostgreSQL database and access the example datasets, use the following connection parameters:

```sql theme={null}
CREATE DATABASE postgresql_conn
WITH ENGINE = 'postgres',
PARAMETERS = {
    "user": "demo_user",
    "password": "demo_password",
    "host": "samples.mindsdb.com",
    "port": "5432",
    "database": "demo",
    "schema": "demo"
};
```

Below is the list of all available datasets as tables.

## Data Tables

### Fraud Detection Dataset

This `fraud_detection` table contains data on mobile money transactions where each step represents an hour of simulation.

| step | type     | amount  | nameOrig    | oldbalanceOrg | newbalanceOrig | nameDest    | oldbalanceDest | newbalanceDest | isFraud | isFlaggedFraud |
| ---- | -------- | ------- | ----------- | ------------- | -------------- | ----------- | -------------- | -------------- | ------- | -------------- |
| 1    | PAYMENT  | 9839.64 | C1231006815 | 170136.0      | 160296.36      | M1979787155 | 0.0            | 0.0            | 0       | 0              |
| 1    | PAYMENT  | 1864.28 | C1666544295 | 21249.0       | 19384.72       | M2044282225 | 0.0            | 0.0            | 0       | 0              |
| 1    | TRANSFER | 181.0   | C1305486145 | 181.0         | 0.0            | C553264065  | 0.0            | 0.0            | 1       | 0              |

To learn more about this dataset, please visit [this link](https://www.kaggle.com/datasets/ealaxi/paysim1).

### Customer Churn Dataset

This `customer_churn` table contains U.S. bank customer churn data.

| RowNumber | CustomerId | Surname  | CreditScore | Geography | Gender | Age | Tenure | Balance  | NumOfProducts | HasCrCard | IsActiveMember | EstimatedSalary | Exited |
| --------- | ---------- | -------- | ----------- | --------- | ------ | --- | ------ | -------- | ------------- | --------- | -------------- | --------------- | ------ |
| 1         | 15634602   | Hargrave | 619         | France    | Female | 42  | 2      | 0        | 1             | 1         | 1              | 101348.88       | 1      |
| 2         | 15647311   | Hill     | 608         | Spain     | Female | 41  | 1      | 83807.86 | 1             | 0         | 1              | 112542.58       | 0      |
| 3         | 15619304   | Onio     | 502         | France    | Female | 42  | 8      | 159660.8 | 3             | 1         | 0              | 113931.57       | 1      |

To learn more about this dataset, please visit [this link](https://www.kaggle.com/datasets/shantanudhakadd/bank-customer-churn-prediction).

### Customer Support Chat Dataset

This `customer_support_chat` table contains customer support chat data, which can be used for fine-tuning and domain adaptation of Large Language Models (LLMs).
| flags | instruction | category | intent | response |
| ----- | ----------- | -------- | ------ | -------- |
| B | question about cancelling order Order Number | ORDER | cancel\_order | I've understood you have a question regarding canceling order Order Number, and I'm here to provide you with the information you need. Please go ahead and ask your question, and I'll do my best to assist you. |
| BQZ | i have a question about cancelling order Order Number | ORDER | cancel\_order | I've been informed that you have a question about canceling order Order Number. I'm here to assist you! Please go ahead and let me know what specific question you have, and I'll provide you with all the information and guidance you need. Your satisfaction is my top priority. |

To learn more about this dataset, please visit [this link](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset).

### Bank Customer Transactions Dataset

This `bank_customer_transactions` table contains customer transaction data with demographic and shopping behavior information.

| Customer ID | Name     | Surname   | Gender | Birthdate  | Transaction Amount | Date       | Merchant Name          | Category |
| ----------- | -------- | --------- | ------ | ---------- | ------------------ | ---------- | ---------------------- | -------- |
| 752858      | Sean     | Rodriguez | F      | 2002-10-20 | 35.47              | 2023-04-03 | Smith-Russell          | Cosmetic |
| 26381       | Michelle | Phelps    |        | 1985-10-24 | 2552.72            | 2023-07-17 | Peck, Spence and Young | Travel   |
| 305449      | Jacob    | Williams  | M      | 1981-10-25 | 115.97             | 2023-09-20 | Steele Inc             | Clothing |

To learn more about this dataset, please visit [this link](https://www.kaggle.com/datasets/bkcoban/customer-transactions).

### Telecom Customer Churn Dataset

This `telecom_customer_churn` table contains data on customer activities, preferences, and behaviors.

| age | gender | security\_no | region\_category | membership\_category | joining\_date | joined\_through\_referral | referral\_id | preferred\_offer\_types | medium\_of\_operation | internet\_option | last\_visit\_time | days\_since\_last\_login | avg\_time\_spent | avg\_transaction\_value | avg\_frequency\_login\_days | points\_in\_wallet | used\_special\_discount | offer\_application\_preference | past\_complaint | complaint\_status | feedback | churn\_risk\_score |
| --- | ------ | ------------ | ---------------- | -------------------- | ------------- | ------------------------- | ------------ | ----------------------- | --------------------- | ---------------- | ----------------- | ------------------------ | ---------------- | ----------------------- | --------------------------- | ------------------ | ----------------------- | ------------------------------ | --------------- | ------------------- | ------------------------ | ------------------ |
| 18 | F | XW0DQ7H | Village | Platinum Membership | 17-08-2017 | No | xxxxxxxx | Gift Vouchers/Coupons | ? | Wi-Fi | 16:08:02 | 17 | 300.63 | 53005.25 | 17 | 781.75 | Yes | Yes | No | Not Applicable | Products always in Stock | 0 |
| 32 | F | 5K0N3X1 | City | Premium Membership | 28-08-2017 | ? | CID21329 | Gift Vouchers/Coupons | Desktop | Mobile\_Data | 12:38:13 | 16 | 306.34 | 12838.38 | 10 |  | Yes | No | Yes | Solved | Quality Customer Care | 0 |
| 44 | F | 1F2TCL3 | Town | No Membership | 11-11-2016 | Yes | CID12313 | Gift Vouchers/Coupons | Desktop | Wi-Fi | 22:53:21 | 14 | 516.16 | 21027 | 22 | 500.69 | No | Yes | Yes | Solved in Follow-up | Poor Website | 1 |

To learn more about this dataset, please visit [this link](https://huggingface.co/datasets/d0r1h/customer_churn).

### House Sales Dataset

This `house_sales` table contains data on houses sold throughout the years.

| saledate   | ma     | type  | bedrooms | created\_at                |
| ---------- | ------ | ----- | -------- | -------------------------- |
| 2007-09-30 | 441854 | house | 2        | 2007-02-02 15:41:51.922127 |
| 2007-12-31 | 441854 | house | 2        | 2007-02-23 22:36:08.540248 |
| 2008-03-31 | 441854 | house | 2        | 2007-02-25 19:23:52.585358 |

To learn more about this dataset, please visit [this link](https://www.kaggle.com/datasets/).

# ChromaDB
Source: https://docs.mindsdb.com/integrations/vector-db-integrations/chromadb

In this section, we present how to connect ChromaDB to MindsDB.

[ChromaDB](https://www.trychroma.com/) is the open-source embedding database. Chroma makes it easy to build LLM apps by making knowledge, facts, and skills pluggable for LLMs.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect ChromaDB to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to ChromaDB.

## Connection

This handler is implemented using the `chromadb` Python library.

To connect to a remote ChromaDB instance, use the following statement:

```sql theme={null}
CREATE DATABASE chromadb_datasource
WITH ENGINE = 'chromadb',
PARAMETERS = {
    "host": "YOUR_HOST",
    "port": YOUR_PORT,
    "distance": "l2/cosine/ip" -- optional, default is cosine
};
```

The connection parameters are:

* `host`: The host name or IP address of the ChromaDB instance.
* `port`: The TCP/IP port of the ChromaDB instance.
* `distance`: Optional. It defines how the distance between vectors is calculated. Available methods include `l2`, `cosine`, and `ip`, as [explained here](https://docs.trychroma.com/docs/collections/configure).

To connect to an in-memory ChromaDB instance, use the following statement:

```sql theme={null}
CREATE DATABASE chromadb_datasource
WITH ENGINE = 'chromadb',
PARAMETERS = {
    "persist_directory": "YOUR_PERSIST_DIRECTORY",
    "distance": "l2/cosine/ip" -- optional
};
```

The connection parameters are:

* `persist_directory`: The directory to use for persisting data.
* `distance`: Optional. It defines how the distance between vectors is calculated. Available methods include `l2`, `cosine`, and `ip`, as [explained here](https://docs.trychroma.com/docs/collections/configure).

## Usage

Now, you can use the established connection to create a collection (or table in the context of MindsDB) in ChromaDB and insert data into it:

```sql theme={null}
CREATE TABLE chromadb_datasource.test_embeddings (
    SELECT embeddings, '{"source": "fda"}' AS metadata
    FROM mysql_datasource.test_embeddings
);
```

`mysql_datasource` is another MindsDB data source that has been created by connecting to a MySQL database. The `test_embeddings` table in the `mysql_datasource` data source contains the embeddings that we want to store in ChromaDB.
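You can also insert individual rows directly; a minimal sketch, assuming the collection accepts the standard vector-table columns (`id`, `content`, `metadata`, `embeddings`), as other vector database integrations in MindsDB do:

```sql theme={null}
-- Insert one row with a hypothetical id and a three-dimensional embedding
INSERT INTO chromadb_datasource.test_embeddings (id, content, metadata, embeddings)
VALUES ('id1', 'sample document text', '{"source": "fda"}', '[1.0, 2.0, 3.0]');
```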
You can query your collection (table) as shown below:

```sql theme={null}
SELECT * FROM chromadb_datasource.test_embeddings;
```

To filter the data in your collection (table) by metadata, you can use the following query:

```sql theme={null}
SELECT * FROM chromadb_datasource.test_embeddings
WHERE `metadata.source` = "fda";
```

To conduct a similarity search, the following query can be used:

```sql theme={null}
SELECT * FROM chromadb_datasource.test_embeddings
WHERE search_vector = (
    SELECT embeddings
    FROM mysql_datasource.test_embeddings
    LIMIT 1
);
```

# Couchbase
Source: https://docs.mindsdb.com/integrations/vector-db-integrations/couchbase

This is the implementation of the Couchbase Vector store data handler for MindsDB.

[Couchbase](https://www.couchbase.com/) is an open-source, distributed multi-model NoSQL document-oriented database software package optimized for interactive applications. These applications may serve many concurrent users by creating, storing, retrieving, aggregating, manipulating, and presenting data.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Couchbase to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Couchbase.

## Implementation

In order to make use of this handler and connect to a Couchbase server in MindsDB, the following syntax can be used. Note that the example uses the default `travel-sample` bucket, which can be enabled from the Couchbase UI with pre-defined scopes and documents.

```sql theme={null}
CREATE DATABASE couchbase_vectorsource
WITH engine='couchbasevector',
parameters={
    "connection_string": "couchbase://localhost",
    "bucket": "travel-sample",
    "user": "admin",
    "password": "password",
    "scope": "inventory"
};
```

This handler is implemented using the `couchbase` library, the Python driver for Couchbase.

The required arguments to establish a connection are as follows:

* `connection_string`: the connection string for the endpoint of the Couchbase server
* `bucket`: the bucket name to use when connecting with the Couchbase server
* `user`: the user to authenticate with the Couchbase server
* `password`: the password to authenticate the user with the Couchbase server
* `scope`: the scope to use; scopes are a level of data organization within a bucket. If omitted, it defaults to `_default`

Note: The connection string expects either the `couchbases://` or `couchbase://` protocol. If you are using Couchbase Capella, you can find the connection string under the Connect tab. You will also need to whitelist the machine(s) running MindsDB and create database credentials for the user. Both of these steps can be done under the Connect tab.

## Usage

### Creating tables

Now, you can use the established connection to create a collection (or table in the context of MindsDB) in Couchbase and insert data into it:

```sql theme={null}
CREATE TABLE couchbase_vectorsource.test_embeddings (
    SELECT embeddings
    FROM mysql_datasource.test_embeddings
);
```

`mysql_datasource` is another MindsDB data source that has been created by connecting to a MySQL database.
The `test_embeddings` table in the `mysql_datasource` data source contains the embeddings that we want to store in Couchbase.

### Querying and searching

You can query your collection (table) as shown below:

```sql theme={null}
SELECT * FROM couchbase_vectorsource.test_embeddings;
```

To filter the data in your collection (table) by metadata, you can use the following query:

```sql theme={null}
SELECT * FROM couchbase_vectorsource.test_embeddings
WHERE id = "some_id";
```

To perform a vector search, the following query can be used:

```sql theme={null}
SELECT * FROM couchbase_vectorsource.test_embeddings
WHERE embeddings = (
    SELECT embeddings
    FROM mysql_datasource.test_embeddings
    LIMIT 1
);
```

### Deleting records

You can delete documents using `DELETE` just like in SQL.

```sql theme={null}
DELETE FROM couchbase_vectorsource.test_embeddings
WHERE `metadata.test` = 'test1';
```

### Dropping connection

To drop the connection, use this command:

```sql theme={null}
DROP DATABASE couchbase_vectorsource;
```

# Milvus
Source: https://docs.mindsdb.com/integrations/vector-db-integrations/milvus

This is the implementation of the Milvus handler for MindsDB. Milvus is an open-source, blazing-fast vector database built for scalable similarity search.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Milvus to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).

## Connection and Usage

Visit the [Milvus page for details](https://milvus.io/docs/integration_with_mindsdb.md).

# PGVector
Source: https://docs.mindsdb.com/integrations/vector-db-integrations/pgvector

This is the implementation of the PGVector handler for MindsDB. PGVector is an open-source vector similarity search for Postgres. It supports the following:

* exact and approximate nearest neighbor search,
* L2 distance, inner product, and cosine distance,
* any language with a Postgres client,
* ACID compliance, point-in-time recovery, JOINs, and all of the other great features of Postgres.

## Connection

This handler uses the `pgvector` Python library. To connect to a PGVector instance, use the following statement:

```sql theme={null}
CREATE DATABASE pvec
WITH ENGINE = 'pgvector',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 5432,
    "database": "postgres",
    "user": "user",
    "password": "password",
    "distance": "cosine"
};
```

The arguments to establish a connection are the following:

* `host`: The host name or IP address of the Postgres instance.
* `port`: The port to use when connecting.
* `database`: The database to connect to.
* `user`: The user to connect as.
* `password`: The password to use when connecting.
* `distance`: Optional. It defines how the distance between vectors is calculated. Available methods include cosine (default), l1, l2, ip, hamming, and jaccard. [Learn more here](https://github.com/pgvector/pgvector/blob/master/README.md).
## Usage

### Installing the pgvector extension

On the machine where Postgres is installed, run the following commands to install the pgvector extension:

```bash theme={null}
cd /tmp
git clone --branch v0.4.4 https://github.com/pgvector/pgvector.git
cd pgvector
make
make install
```

### Installing the pgvector python library

Ensure you install all dependencies from the `requirements.txt` file in the `pgvector_handler` folder.

### Creating a database connection in MindsDB

You can create a database connection like you would for a regular Postgres database; the only difference is that you need to specify the engine as `pgvector`:

```sql theme={null}
CREATE DATABASE pvec
WITH ENGINE = 'pgvector',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 5432,
    "database": "postgres",
    "user": "user",
    "password": "password"
};
```

You can insert data into a new collection like so:

```sql theme={null}
CREATE TABLE pvec.embed (
    SELECT embeddings FROM mysql_demo_db.test_embeddings
);

CREATE ML_ENGINE openai
FROM openai
USING api_key = 'your-openai-api-key';

CREATE MODEL openai_emb
PREDICT embedding
USING
    engine = 'openai',
    model_name = 'text-embedding-ada-002',
    mode = 'embedding',
    question_column = 'review';

CREATE TABLE pvec.itemstest (
    SELECT m.embedding AS embeddings, t.review AS content
    FROM mysql_demo_db.amazon_reviews t
    JOIN openai_emb m
);
```

You can query a collection within your PGVector as follows:

```sql theme={null}
SELECT * FROM pvec.embed LIMIT 5;
SELECT * FROM pvec.itemstest LIMIT 5;
```

You can run a semantic search like so:

```sql theme={null}
SELECT * FROM pvec.itemstest
WHERE embeddings = (SELECT * FROM mindsdb.embedding)
LIMIT 5;
```

# Pinecone
Source: https://docs.mindsdb.com/integrations/vector-db-integrations/pinecone

This is the implementation of the Pinecone handler for MindsDB. Pinecone is a vector database which is fully-managed, developer-friendly, and easily scalable.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Pinecone to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Pinecone.

## Implementation

This handler uses the `pinecone-client` Python library to connect to a Pinecone environment.

The required arguments to establish a connection are:

* `api_key`: the API key that can be found in your Pinecone account

These optional arguments are used with `CREATE TABLE` statements:

* `dimension`: dimensions of the vectors to be stored in the index (default=8)
* `metric`: distance metric to be used for similarity search (default='cosine')
* `spec`: the spec of the index to be created. This is a dictionary that can contain the following keys:
  * `cloud`: the cloud provider to use (default='aws')
  * `region`: the region to use (default='us-east-1')

Only the creation of serverless indexes is supported at the moment when running `CREATE TABLE` statements.

## Limitations

* [ ] `DROP TABLE` support
* [ ] Support for [namespaces](https://docs.pinecone.io/docs/namespaces)
* [ ] Display score/distance
* [ ] Support for creating/reading sparse values
* [ ] `content` column is not supported since it does not exist in Pinecone

## Usage

In order to make use of this handler and connect to an environment, use the following syntax:

```sql theme={null}
CREATE DATABASE pinecone_dev
WITH ENGINE = "pinecone",
PARAMETERS = {
    "api_key": "..."
};
```

You can query Pinecone indexes (`temp` in the following examples) based on `id` or `search_vector`, but not both:

```sql theme={null}
SELECT * FROM pinecone_dev.temp
WHERE id = "abc"
LIMIT 1
```

```sql theme={null}
SELECT * FROM pinecone_dev.temp
WHERE search_vector = "[1,2,3,4,5,6,7,8]"
```

If you are using subqueries, make sure that the result is only a single row, since the use of multiple search vectors is not allowed:

```sql theme={null}
SELECT * FROM pinecone_dev.temp
WHERE search_vector = (
    SELECT embeddings
    FROM sqlitetesterdb.test
    WHERE id = 10
)
```

Optionally, you can filter based on metadata too:

```sql theme={null}
SELECT * FROM pinecone_dev.temp
WHERE id = "abc" AND metadata.hello < 100
```

You can delete records using `id` or `metadata` like so:

```sql theme={null}
DELETE FROM pinecone_dev.temp
WHERE id = "abc"
```

Note that deletion through metadata is not supported on the Starter tier:

```sql theme={null}
DELETE FROM pinecone_dev.temp
WHERE metadata.tbd = true
```

You can insert data into a new collection like so:

```sql theme={null}
CREATE TABLE pinecone_dev.temp (
    SELECT * FROM mysql_demo_db.temp LIMIT 10
);
```

To update records, you can use the `INSERT` statement. When there is a conflicting ID in the Pinecone index, the record is updated with the new values. It might take a moment for the change to be reflected.

```sql theme={null}
INSERT INTO pinecone_test.testtable (id, content, metadata, embeddings)
VALUES (
    'id1',
    'this is a test',
    '{"test": "test"}',
    '[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]'
);
```

# Weaviate
Source: https://docs.mindsdb.com/integrations/vector-db-integrations/weaviate

This is the implementation of the Weaviate handler for MindsDB. Weaviate is an open-source vector database. It allows you to store data objects and vector embeddings from your favorite ML models, and scale seamlessly into billions of data objects.

## Prerequisites

Before proceeding, ensure the following prerequisites are met:

1. Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).
2. To connect Weaviate to MindsDB, install the required dependencies following [this instruction](/setup/self-hosted/docker#install-dependencies).
3. Install or ensure access to Weaviate.

## Implementation

This handler uses the `weaviate-client` Python library to connect to a Weaviate instance.

The required arguments to establish a connection are:

* `weaviate_url`: URL of the Weaviate database
* `weaviate_api_key`: API key to authenticate with Weaviate (in the case of a cloud instance).
* `persistence_directory`: the directory to be used in the case of local storage

### Creating connection

In order to make use of this handler and connect to a Weaviate server in MindsDB, the following syntax can be used:

```sql theme={null}
CREATE DATABASE weaviate_datasource
WITH ENGINE = "weaviate",
PARAMETERS = {
    "weaviate_url": "https://sample.weaviate.network",
    "weaviate_api_key": "api-key"
};
```

```sql theme={null}
CREATE DATABASE weaviate_datasource
WITH ENGINE = "weaviate",
PARAMETERS = {
    "weaviate_url": "https://localhost:8080"
};
```

```sql theme={null}
CREATE DATABASE weaviate_datasource
WITH ENGINE = "weaviate",
PARAMETERS = {
    "persistence_directory": "db_path"
};
```

### Dropping connection

To drop the connection, use this command:

```sql theme={null}
DROP DATABASE weaviate_datasource;
```

### Creating tables

To insert data from a pre-existing table, use `CREATE TABLE`:

```sql theme={null}
CREATE TABLE weaviate_datasource.test (SELECT * FROM sqlitedb.test);
```

Weaviate currently doesn't support a JSON field type, so MindsDB creates a separate table for the `metadata` field and stores a reference in the original table that points to its metadata entry.

Weaviate follows GraphQL conventions where classes (which are table schemas) start with a capital letter and properties start with a lowercase letter. So whenever we create a table, the table's name gets capitalized.

### Dropping collections

To drop a Weaviate table, use this command:

```sql theme={null}
DROP TABLE weaviate_datasource.tablename;
```

### Querying and selecting

To query the database using a search vector, you can use `search_vector` or `embeddings` in the `WHERE` clause:

```sql theme={null}
SELECT * FROM weaviate_datasource.test
WHERE search_vector = '[3.0, 1.0, 2.0, 4.5]'
LIMIT 10;
```

Here is a basic query:

```sql theme={null}
SELECT * FROM weaviate_datasource.test;
```

You can use the `WHERE` clause on dynamic fields as in normal SQL:

```sql theme={null}
SELECT * FROM weaviate_datasource.createtest
WHERE category = "science";
```

### Deleting records

You can delete entries using `DELETE` just like in SQL.

```sql theme={null}
DELETE FROM weaviate_datasource.test
WHERE id IN (1, 2, 3);
```

Note that `UPDATE` is not supported by the MindsDB vector database integration.

# MindsDB, an AI Data Solution
Source: https://docs.mindsdb.com/mindsdb

MindsDB enables humans, AI, agents, and applications to get highly accurate answers across sprawling, large-scale data sources.

## Core Philosophy

MindsDB is built around three fundamental capabilities that enable seamless integration, organization, and utilization of data.

Connect data from [hundreds of data sources](/integrations/data-overview) that integrate with MindsDB, including databases, data warehouses, applications, and vector databases. Learn more [here](/mindsdb-connect).

Unify and organize data from one or multiple (structured and unstructured) data sources, by creating [knowledge bases](/mindsdb_sql/knowledge_bases/overview), [views](/mindsdb_sql/sql/create/view) and [jobs](/mindsdb_sql/sql/create/jobs). Learn more [here](/mindsdb-unify).

Generate accurate, context-aware responses from unified data using [agents](/mindsdb_sql/agents/agent) or [MCP API](/mcp/overview), making insights easily accessible across applications and teams. Learn more [here](/mindsdb-respond).

## Install MindsDB

MindsDB is an open-source server that can be deployed anywhere, including local machines and clouds, and customized to fit the purpose.

* Use [MindsDB via Docker Desktop](/setup/self-hosted/docker-desktop). This is the fastest and recommended way to get started.
* Use [MindsDB via Docker](/setup/self-hosted/docker). This provides greater flexibility in customizing the MindsDB instance by rebuilding Docker images.
* Use [MindsDB via AWS Marketplace](/setup/cloud/aws-marketplace). This enables running MindsDB in the cloud.
* Use [MindsDB via PyPI](/contribute/install). This option enables contributions to MindsDB.

# Connect
Source: https://docs.mindsdb.com/mindsdb-connect

MindsDB enables connecting data from various data sources and operating on data without moving it from its source. Granting MindsDB access to data is the foundation for all other capabilities.

* **Broad integration support**
Seamlessly connect to databases, applications, and more. * **Real-time data access**
Work with the most up-to-date data without delays from batch processing. * **No data movement required**
Operate directly on data at the source. No copying, syncing, or ETL needed.

This documentation includes the following content. These are all the data sources that can be connected to MindsDB. Use MindsDB's SQL Editor or connect MindsDB to any SQL client. Use SQL to connect data to MindsDB.

# MindsDB as a Federated Query Engine
Source: https://docs.mindsdb.com/mindsdb-fqe

MindsDB supports federated querying, enabling users to access and analyze data across a wide variety of structured and unstructured data sources using SQL.

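For example, a single statement can join tables that live in different data sources. Here is a minimal sketch, assuming two data sources named `postgres_conn` and `mysql_conn` have already been connected with `CREATE DATABASE`, and that they contain `customers` and `orders` tables, respectively:

```sql theme={null}
-- Each table is resolved against its own underlying engine
SELECT c.name, SUM(o.amount) AS total_spent
FROM postgres_conn.customers AS c
JOIN mysql_conn.orders AS o
    ON c.id = o.customer_id
GROUP BY c.name;
```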
## How Query Pushdown Works in MindsDB MindsDB acts as a federated query engine by translating and pushing down SQL queries to the native engines of connected data sources. Rather than retrieving data and processing queries within MindsDB, it delegates computation to the underlying data sources. This “pushdown” approach ensures: * High performance: Queries leverage the indexing and processing capabilities of the native engines. * Low resource usage: MindsDB avoids executing resource-heavy and high-latency operations within the query engine, preventing bottlenecks in CPU, memory, or network. ## Query Translation Limits Each connected data source has its own SQL dialect, features, and constraints. While MindsDB SQL provides a unified interface, not all SQL expressions or data types can be translated across every database engine. In cases where a native data type or expression is not supported by the underlying engine: * The query is passed from MindsDB to the data source in its current form, with unsupported data types handled as strings. * If the data source does not support the syntax, it may return an error. * Errors originating from the underlying data source are passed through to the user to provide the most accurate context. ## Cross-Database Join Limits MindsDB allows joining tables across disparate data sources. However, cross-database joins introduce complexity: * Pushdown can occur partially, not for all joined data sources. * Join conditions for a particular data source must be executable by its underlying database engine. ## Recap MindsDB’s federated query engine enables seamless integration with diverse data systems, but effective use requires understanding the limitations of SQL translation and pushdown: * Pushdown is preferred to optimize performance and avoid resource strain. * Not all SQL constructs are translatable, especially for vector stores or non-relational systems. * Errors may occur when a connected data source cannot parse the generated query. * Workarounds include query decomposition, using simpler expressions, and avoiding unsupported joins or vector logic. Understanding these nuances helps users debug query errors more effectively and make full use of MindsDB’s federated query capabilities. # Navigating the MindsDB GUI Source: https://docs.mindsdb.com/mindsdb-gui MindsDB offers a user-friendly graphical interface that allows users to execute SQL commands, view their outputs, and easily navigate connected data sources, projects, and their contents. Let's explore the features and usage of the MindsDB editor. ## Accessing the MindsDB GUI Editor Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).

## Exploring the MindsDB GUI Editor ### Query Editor This is the primary component where users can input SQL commands and queries. It provides a code editor environment where users can write, edit, and execute SQL statements. It is located in the top center of the MindsDB GUI.
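For example, a simple first statement to run in the editor lists all data sources connected to your MindsDB instance:

```sql theme={null}
SHOW DATABASES;
```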

You can open multiple query editor tabs by clicking the plus button next to the current tab, like this:

### Results Viewer Once a query is executed, the results viewer displays the output of the query. It presents the results in a tabular format, showing rows and columns of data. It is located in the bottom center of the MindsDB GUI.

MindsDB supports additional features such as the following: 1. The [Data Insights](/sql/data-insights) feature provides useful data visualization charts. 2. The Export feature lets you export the query output as a CSV or Markdown file. ### Object Explorer The object explorer provides an overview of the projects, models, views, connected data sources, and tables.

Users can navigate through the available objects by expanding the tree structure items. Upon hovering over the tables, you can query their content using the provided `SELECT` statement, as below.
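The generated statement is a plain `SELECT` over the hovered table, similar to the following sketch (assuming a connected data source named `postgres_conn` with a `customers` table):

```sql theme={null}
SELECT * FROM postgres_conn.customers LIMIT 10;
```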

### Model Progress Bar MindsDB provides a custom SQL statement to create and deploy models as virtual tables. Upon executing the [`CREATE MODEL`](/sql/create/model) statement, you can monitor the training progress at the bottom-left corner below the object explorer.
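As a minimal sketch of such a statement, assuming a connected data source named `postgres_conn` that contains a `home_rentals` table, a model can be created as follows:

```sql theme={null}
-- Train a model on the source data and predict the rental_price column
CREATE MODEL mindsdb.rental_price_model
FROM postgres_conn (SELECT * FROM home_rentals)
PREDICT rental_price;
```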

Once the model is ready, its status updates to complete.
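You can also check the training status with a query instead of the progress bar; a sketch, assuming the hypothetical `rental_price_model` from above:

```sql theme={null}
SELECT name, status
FROM mindsdb.models
WHERE name = 'rental_price_model';
```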

### Add New Data Sources

You can connect a data source to MindsDB by clicking the `Add` button and choosing `New Datasource`. It takes you to a page that lists all available data sources, including databases, data warehouses, applications, and more. Here, you can search for a data source you want to connect to and follow the instructions.

For more information, visit the **Data Sources** section of the docs.

### Upload Files

You can upload a file to MindsDB by clicking the `Add` button and choosing `Upload File`. It takes you to a form where you can upload a file and give it a name.

For more information, visit [our docs here](/sql/create/file).

### Upload Custom Models

MindsDB offers a way to upload your custom model in the form of Python code and incorporate it into the MindsDB ecosystem. You can do that by clicking the `Add` button and choosing `Upload custom model`.

For more information, visit [our docs here](/custom-model/byom).

# Naming Standards for MindsDB Objects
Source: https://docs.mindsdb.com/mindsdb-objects

MindsDB allows you to create and manage a variety of entities within its ecosystem. All MindsDB objects follow the same naming conventions to ensure consistency and compatibility across the platform.

## MindsDB Entities

The following entities can be created in MindsDB:

* Databases → [CREATE DATABASE](https://docs.mindsdb.com/mindsdb_sql/sql/create/database)
* Knowledge Bases (KBs) → [CREATE KNOWLEDGE\_BASE](https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/create)
* Tables → [CREATE TABLE](https://docs.mindsdb.com/mindsdb_sql/sql/create/table)
* Views → [CREATE VIEW](https://docs.mindsdb.com/mindsdb_sql/sql/create/view)
* Projects → [CREATE PROJECT](https://docs.mindsdb.com/mindsdb_sql/sql/create/project)
* Jobs → [CREATE JOB](https://docs.mindsdb.com/mindsdb_sql/sql/create/jobs)
* Triggers → [CREATE TRIGGER](https://docs.mindsdb.com/mindsdb_sql/sql/create/trigger)
* Agents → [CREATE AGENT](https://docs.mindsdb.com/mindsdb_sql/agents/agent_syntax)

## General Naming Rules

When creating these entities, the following conventions apply:

* **Case-insensitive names**

Object names are not sensitive to letter casing. For example:

```sql theme={null}
CREATE VIEW my_view (...); -- creates "my_view"
CREATE VIEW My_View (...); -- also creates "my_view"
CREATE VIEW MY_VIEW (...); -- also creates "my_view"
```

All names are automatically converted to lowercase.

* **Allowed characters**

Lowercase letters (`a–z`), numbers (`0–9`), and underscores (`_`). Example:

```sql theme={null}
CREATE AGENT my_agent345 (...); -- creates "my_agent345"
```

* **Special characters**

If you need special characters or spaces in object names, enclose them in backticks.

```sql theme={null}
CREATE VIEW `my view` (...); -- creates "my view"
CREATE VIEW `my-view!` (...); -- creates "my-view!"
```

However, names inside backticks must be lowercase. Using uppercase letters will result in an error because all object names must be in lowercase letters.

```sql theme={null}
CREATE VIEW `My View` (...); -- error
```

When working with entities from a data source connected to MindsDB, their original names are preserved and are not subject to MindsDB naming rules.
For example, if you connect a Snowflake data source that contains a table named `ANALYTICS_101` with a column named `Date_Time`, you must reference them exactly as they appear in the source, utilizing backticks, as shown below: ```sql theme={null} SELECT `Date_Time` FROM snowflake_data.`ANALYTICS_101`; ``` ## Backward Compatibility Older objects created with uppercase letters are still supported for backward compatibility. To reference them, wrap the name in backticks. ```sql theme={null} SELECT * FROM `MyView`; -- selects from “MyView” DROP VIEW `MyView`; -- deletes “MyView” ``` You cannot create new objects with uppercase letters. For example: ```sql theme={null} CREATE VIEW `MyView` (...); -- error ``` ## Examples Here are some practical examples: ### Databases Note that when enclosing the object name in backticks, it preserves the case-sensitivity and special characters included in the name. Otherwise, the upper-case letters are automatically converted to lower-case letters. See the usage examples below. ```sql theme={null} CREATE DATABASE my_database WITH …; -- creates my_database SELECT * FROM my_database.table_name; -- selects from my_database DROP DATABASE my_database; -- drops my_database CREATE DATABASE MY_DATABASE WITH …; -- creates my_database (note that upper-case letters are converted to lower-case letters) SELECT * FROM my_database.table_name; -- selects from my_database SELECT * FROM MY_DATABASE.table_name; -- selects from my_database DROP DATABASE MY_DATABASE; -- drops my_database CREATE DATABASE `My-database` WITH …; -- creates My-database (note that the name must be enclosed in backticks because it contains a special character) SELECT * FROM `My-database`.table_name; -- selects from My-database DROP DATABASE `My-database`; -- drops My-database ``` ```sql theme={null} -- this works CREATE DATABASE demodata WITH …; SELECT * FROM demodata.table_name; SELECT * FROM `demodata`.table_name; DROP DATABASE demodata; -- this works and converts all letters to lower-case CREATE DATABASE demoData WITH …; SELECT * FROM demoData ... DROP DATABASE demoData; -- this works and keeps upper/lower-case letters because the name is enclosed in backticks CREATE DATABASE `DemoData` WITH …; SELECT * FROM `DemoData` ... DROP DATABASE `DemoData` ... 
```

```sql theme={null}
CREATE DATABASE DemoData WITH …; -- creates demodata
CREATE DATABASE `DemoData` WITH …; -- cannot create DemoData because demodata already exists
DROP DATABASE `DemoData`; -- cannot drop DemoData because DemoData does not exist
DROP DATABASE DemoData; -- drops demodata

CREATE DATABASE `DemoData` WITH …; -- creates DemoData
CREATE DATABASE demodata WITH …; -- cannot create demodata because DemoData already exists
DROP DATABASE demodata; -- cannot drop demodata because demodata does not exist
DROP DATABASE `DemoData`; -- drops DemoData
```

```sql theme={null}
CREATE DATABASE demodata WITH …; -- creates demodata
SELECT * FROM DEMODATA.table_name; -- selects from demodata, because DEMODATA is converted to demodata
DROP DATABASE demodata; -- drops demodata

CREATE DATABASE `DemoData` WITH …; -- creates DemoData
SELECT * FROM demodata.table_name; -- cannot select from demodata
SELECT * FROM `DemoData`.table_name; -- selects from DemoData
DROP DATABASE demodata; -- cannot drop demodata because demodata does not exist
DROP DATABASE `DemoData`; -- drops DemoData

CREATE DATABASE `Dèmo data 2` WITH …;
SELECT * FROM `Dèmo data 2`.table_name;
DROP DATABASE `Dèmo data 2`;
```

### Views

```sql theme={null}
CREATE VIEW my_view (...); -- creates "my_view"
CREATE VIEW My_View (...); -- also creates "my_view"
CREATE VIEW `my view` (...); -- creates "my view"
CREATE VIEW `My_View` (...); -- error
```

If an older object named `My_View` exists, you can still use it:

```sql theme={null}
SELECT * FROM `My_View`; -- selects from "My_View"
DROP VIEW `My_View`; -- deletes "My_View"
```

### Agents

```sql theme={null}
CREATE AGENT my_agent USING ...; -- creates "my_agent"
CREATE AGENT My_Agent USING ...; -- also creates "my_agent"
CREATE AGENT `my agent 1` USING ...; -- creates "my agent 1"
CREATE AGENT `My agent 1` USING ...; -- error
```

If an older object named `My agent 1` exists, you can still use it:

```sql theme={null}
SELECT * FROM `My agent 1`; -- selects from "My agent 1"
DROP AGENT `My agent 1`; -- deletes "My agent 1"
```

# Respond
Source: https://docs.mindsdb.com/mindsdb-respond

MindsDB enables generating insightful and accurate responses from unified data using natural language. Whether answering questions, powering applications, or enabling automations, responses are context-aware and grounded in real-time data.

* **Natural language data queries**
Ask questions in natural language and receive precise answers. * **AI-powered insights**
Leverage integrated models to analyze, predict, and explain data in context. * **Actionable responses**
Drive decisions and automations directly from query results. This documentation includes the following content. Deploy agents specialized in answering questions over connected and unified data. Connect to MindsDB through MCP (Model Context Protocol) for seamless interaction. # Unify Source: https://docs.mindsdb.com/mindsdb-unify MindsDB enables unifying data from structured and unstructured data sources into a single, queryable interface. This unified view allows seamless querying and model-building across all data without consolidation into one system. * **Federated query engine**
Query across multiple data sources as if they were a single database. * **Structured and unstructured data support**
Unify relational data, documents, vector data, and more in one place. * **No data transformation required**
Use data in its native format without the need for preprocessing.

This documentation includes the following content. Index and organize unstructured data for efficient retrieval. Simplify data access by creating unified views across different sources. Organize views, knowledge bases, and models into projects. Operate on data using functions. Schedule tasks with jobs. Set up triggering events on data.

# How Agents Work
Source: https://docs.mindsdb.com/mindsdb_sql/agents/agent

Agents enable conversation with data, including structured and unstructured data connected to MindsDB.

Connect your data to MindsDB by [connecting databases or applications](/integrations/data-overview) or [uploading files](/mindsdb_sql/sql/create/file). Users can opt to use [knowledge bases](/mindsdb_sql/knowledge_bases/overview) to store and retrieve data efficiently.

Create an agent, passing the connected data and defining the underlying model.

```sql theme={null}
CREATE AGENT my_agent
USING
    model = {
        "provider": "openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123"
    },
    data = {
         "knowledge_bases": ["mindsdb.sales_kb", "mindsdb.orders_kb"],
         "tables": ["postgres_conn.customers", "mysql_conn.products"]
    },
    prompt_template='
        mindsdb.sales_kb stores sales analytics data
        mindsdb.orders_kb stores order data
        postgres_conn.customers stores customers data
        mysql_conn.products stores products data
    ';
```

Query an agent and ask questions over the connected data.

```sql theme={null}
SELECT answer
FROM my_agent
WHERE question = 'What is the average number of orders per customer?';
```

Follow [this doc page to learn more about the usage of agents](/mindsdb_sql/agents/agent_syntax).

# How to Chat with Agents
Source: https://docs.mindsdb.com/mindsdb_sql/agents/agent_gui

Agents enable conversation with data, including structured and unstructured data connected to MindsDB.

MindsDB provides a chat interface that enables users to chat with their data.

Select an agent from the list of existing agents, or create one if none exists yet.

Now the chat interface is connected to this agent via [Agent2Agent Protocol](https://google.github.io/A2A/) and users can chat with the data connected to this agent.

# How to Use Agents
Source: https://docs.mindsdb.com/mindsdb_sql/agents/agent_syntax

Agents enable conversation with data, including structured and unstructured data connected to MindsDB.

## `CREATE AGENT` Syntax

Here is the syntax for creating an agent:

```sql theme={null}
CREATE AGENT my_agent
USING
    model = {
        "provider": "openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "http://example.com",
        "api_version": "2024-02-01"
    },
    data = {
         "knowledge_bases": ["project_name.kb_name", ...],
         "tables": ["datasource_conn_name.table_name", ...]
    },
    prompt_template='describe data',
    timeout=10;
```

It creates an agent that uses the defined model and has access to the connected data. You can verify that the agent was created as follows:

```sql theme={null}
SHOW AGENTS
WHERE name = 'my_agent';
```

Note that you can insert all tables from a connected data source and all knowledge bases from a project using the `*` syntax.

```sql theme={null}
...
data = {
     "knowledge_bases": ["project_name.*", ...],
     "tables": ["datasource_conn_name.*", ...]
},
...
```

### `model`

This parameter defines the underlying language model, including:

* `provider`
  It is a required parameter. It defines the model provider from the list below.
* `model_name`
  It is a required parameter. It defines the model name from the list below.
* `api_key`
  It is an optional parameter (applicable to selected providers), which stores the API key to access the model. Users can provide it either in this `api_key` parameter, or using [environment variables](/mindsdb_sql/functions/from_env).
* `base_url`
  It is an optional parameter (applicable to selected providers), which stores the base URL for accessing the model. It is the root URL used to send API requests.
* `api_version`
  It is an optional parameter (applicable to selected providers), which defines the API version.

The available models and providers include the following.

Available models:

* claude-3-opus-20240229
* claude-3-sonnet-20240229
* claude-3-haiku-20240307
* claude-2.1
* claude-2.0
* claude-instant-1.2

Available models include all models accessible from Bedrock.

Note that in order to use Bedrock as a model provider, you should ensure the following packages are installed: `langchain_aws` and `transformers`.

The following parameters are specific to this provider:

* `aws_region_name` is a required parameter.
* `aws_access_key_id` is a required parameter.
* `aws_secret_access_key` is a required parameter.
* `aws_session_token` is an optional parameter. It may be required depending on the AWS permissions setup.
Available models:

* gemini-2.5-pro-preview-03-25
* gemini-2.0-flash
* gemini-2.0-flash-lite
* gemini-1.5-flash
* gemini-1.5-flash-8b
* gemini-1.5-pro

Available models:

* gemma
* llama2
* mistral
* mixtral
* llava
* neural-chat
* codellama
* dolphin-mixtral
* qwen
* llama2-uncensored
* mistral-openorca
* deepseek-coder
* nous-hermes2
* phi
* orca-mini
* dolphin-mistral
* wizard-vicuna-uncensored
* vicuna
* tinydolphin
* llama2-chinese
* openhermes
* zephyr
* nomic-embed-text
* tinyllama
* openchat
* wizardcoder
* phind-codellama
* starcoder
* yi
* orca2
* falcon
* starcoder2
* wizard-math
* dolphin-phi
* nous-hermes
* starling-lm
* stable-code
* medllama2
* bakllava
* codeup
* wizardlm-uncensored
* solar
* everythinglm
* sqlcoder
* nous-hermes2-mixtral
* stable-beluga
* yarn-mistral
* samantha-mistral
* stablelm2
* meditron
* stablelm-zephyr
* magicoder
* yarn-llama2
* wizard-vicuna
* llama-pro
* deepseek-llm
* codebooga
* mistrallite
* dolphincoder
* nexusraven
* open-orca-platypus2
* all-minilm
* goliath
* notux
* alfred
* megadolphin
* xwinlm
* wizardlm
* duckdb-nsql
* notus

Available models:

* gpt-3.5-turbo
* gpt-3.5-turbo-16k
* gpt-3.5-turbo-instruct
* gpt-4
* gpt-4-32k
* gpt-4-1106-preview
* gpt-4-0125-preview
* gpt-4.1
* gpt-4.1-mini
* gpt-4o
* o4-mini
* o3-mini
* o1-mini

Available models:

* microsoft/phi-3-mini-4k-instruct
* mistralai/mistral-7b-instruct-v0.2
* writer/palmyra-med-70b
* mistralai/mistral-large
* mistralai/codestral-22b-instruct-v0.1
* nvidia/llama3-chatqa-1.5-70b
* upstage/solar-10.7b-instruct
* google/gemma-2-9b-it
* adept/fuyu-8b
* google/gemma-2b
* databricks/dbrx-instruct
* meta/llama-3\_1-8b-instruct
* microsoft/phi-3-medium-128k-instruct
* 01-ai/yi-large
* nvidia/neva-22b
* meta/llama-3\_1-70b-instruct
* google/codegemma-7b
* google/recurrentgemma-2b
* google/gemma-2-27b-it
* deepseek-ai/deepseek-coder-6.7b-instruct
* mediatek/breeze-7b-instruct
* microsoft/kosmos-2
* microsoft/phi-3-mini-128k-instruct
* nvidia/llama3-chatqa-1.5-8b
* writer/palmyra-med-70b-32k
* google/deplot
* meta/llama-3\_1-405b-instruct
* aisingapore/sea-lion-7b-instruct
* liuhaotian/llava-v1.6-mistral-7b
* microsoft/phi-3-small-8k-instruct
* meta/codellama-70b
* liuhaotian/llava-v1.6-34b
* nv-mistralai/mistral-nemo-12b-instruct
* microsoft/phi-3-medium-4k-instruct
* seallms/seallm-7b-v2.5
* mistralai/mixtral-8x7b-instruct-v0.1
* mistralai/mistral-7b-instruct-v0.3
* google/paligemma
* google/gemma-7b
* mistralai/mixtral-8x22b-instruct-v0.1
* google/codegemma-1.1-7b
* nvidia/nemotron-4-340b-instruct
* meta/llama3-70b-instruct
* microsoft/phi-3-small-128k-instruct
* ibm/granite-8b-code-instruct
* meta/llama3-8b-instruct
* snowflake/arctic
* microsoft/phi-3-vision-128k-instruct
* meta/llama2-70b
* ibm/granite-34b-code-instruct

Available models:

* palmyra-x5
* palmyra-x4

Users can define the model for the agent by choosing one of the following options.

**Option 1.** Use the `model` parameter to define the specification.

```sql theme={null}
CREATE AGENT my_agent
USING
    model = {
        "provider": "openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "https://example.com/",
        "api_version": "2024-02-01"
    },
    ...
```

**Option 2.** Define the default model in the [MindsDB configuration file](/setup/custom-config). If you define `default_llm` in the configuration file, you do not need to provide the `model` parameter when creating an agent. If you provide both, then the values from the `model` parameter are used.

You can define the default models in the Settings of the MindsDB Editor GUI.
```bash theme={null}
"default_llm": {
    "provider": "openai",
    "model_name" : "gpt-4o",
    "api_key": "sk-abc123",
    "base_url": "https://example.com/",
    "api_version": "2024-02-01"
}
```

### `data`

This parameter stores data connected to the agent, including knowledge bases and data sources connected to MindsDB. The following parameters store the list of connected data.

* `knowledge_bases` stores the list of [knowledge bases](/mindsdb_sql/knowledge_bases/overview) to be used by the agent.
* `tables` stores the list of tables from data sources connected to MindsDB.

### `prompt_template`

This parameter stores instructions for the agent. It is recommended to provide a description of the data sources listed in the `knowledge_bases` and `tables` parameters to help the agent locate relevant data for answering questions.

### `timeout`

This parameter defines the maximum time the agent can take to return an answer. For example, when the `timeout` parameter is set to 10, the agent has 10 seconds to return an answer. If the agent takes longer than 10 seconds, it aborts the process and returns an answer indicating its failure to respond within the defined time interval.

## `SELECT FROM AGENT` Syntax

Query an agent to generate responses to questions.

```sql theme={null}
SELECT answer
FROM my_agent
WHERE question = 'What is the average number of orders per customer?';
```

You can redefine the agent's parameters at query time, as shown below.

```sql theme={null}
SELECT answer
FROM my_agent
WHERE question = 'What is the average number of orders per customer?'
USING
    model = {
        "provider": "openai",
        "model_name" : "gpt-4.1",
        "api_key": "sk-abc123"
    },
    data = {
         "knowledge_bases": ["project_name.kb_name", ...],
         "tables": ["datasource_conn_name.table_name", ...]
    },
    prompt_template='describe data',
    timeout=10;
```

The `USING` clause may contain any combination of parameters from the `CREATE AGENT` command, depending on which parameters users want to update for the query. For example, users may want to check the performance of other models to decide which model works better for their use case.

```sql theme={null}
SELECT answer
FROM my_agent
WHERE question = 'What is the average number of orders per customer?'
USING
    model = {
        "provider": "google",
        "model_name" : "gemini-2.5-flash",
        "api_key": "ABc123"
    };
```

## `ALTER AGENT` Syntax

Update existing agents with new data, model, or prompt.

```sql theme={null}
ALTER AGENT my_agent
USING
    model = {
        "provider": "openai",
        "model_name" : "gpt-4.1",
        "api_key": "sk-abc123",
        "base_url": "http://example.com",
        "api_version": "2024-02-01"
    },
    data = {
         "knowledge_bases": ["project_name.kb_name", ...],
         "tables": ["datasource_conn_name.table_name", ...]
    },
    prompt_template='describe data';
```

Note that all parameters are optional. Users can update any combination of parameters. See detailed descriptions of parameters in the [`CREATE AGENT` section](/mindsdb_sql/agents/agent_syntax#create-agent-syntax).

Here is how to connect new data to an agent.

```sql theme={null}
ALTER AGENT my_agent
USING
    data = {
         "knowledge_bases": ["mindsdb.sales_kb"],
         "tables": ["mysql_db.car_sales", "mysql_db.car_info"]
    };
```

And here is how to update a model used by the agent.
```sql theme={null}
ALTER AGENT my_agent
USING
    model = {
        "provider": "openai",
        "model_name" : "gpt-4.1",
        "api_key": "sk-abc123"
    };
```

## `DROP AGENT` Syntax

Here is the syntax for deleting an agent:

```sql theme={null}
DROP AGENT my_agent;
```

# MariaDB SkySQL Setup Guide with MindsDB
Source: https://docs.mindsdb.com/mindsdb_sql/connect/connect-mariadb-skysql

Find more information on MariaDB SkySQL [here](https://cloud.MariaDB.com/).

## 1. Select your service for MindsDB

If you haven't already, identify the service to be enabled with MindsDB and make sure it is running. Otherwise, skip to step 2.

## 2. Add MindsDB to your service Allowlist

Access to MariaDB SkySQL services is [restricted on a per-service basis](https://mariadb.com/products/skysql/docs/security/firewalls/ip-allowlist-services/). Add the following IP addresses to allow MindsDB to connect to your MariaDB service. Do this by clicking on the cog icon and navigating to Security Access. In the dialog, input as prompted – one by one – the following IPs:

```
18.220.205.95
3.19.152.46
52.14.91.162
```

## 3. Download your service .pem file

A [certificate authority chain](https://mariadb.com/products/skysql/docs/connect/connection-parameters-portal/#certificate-authority-chain) (.pem file) must be provided for proper TLS certificate validation. From your selected service, click on the world globe icon (Connect to service). In the Login Credentials section, click Download. The `aws_skysql_chain.pem` file will download onto your machine.

## 4. Publicly Expose your service .pem File

Select secure storage for the `aws_skysql_chain.pem` file that allows a working public URL or local path. For example, you can store it in an S3 bucket.

## 5. Link MindsDB to your MariaDB SkySQL Service

To print the query template, go to the MindsDB Editor, add a new data source from the Connect tab, and choose MariaDB SkySQL from the list. Fill in the values and run the query to complete the setup. Here are the code snippets:

```sql Template theme={null}
CREATE DATABASE maria_datasource   --- display name for the database
WITH ENGINE = 'MariaDB',           --- name of the MindsDB handler
PARAMETERS = {
    "host": " ",                   --- host IP address or URL
    "port": ,                      --- port used to make TCP/IP connection
    "database": " ",               --- database name
    "user": " ",                   --- database user
    "password": " ",               --- database password
    "ssl": True/False,             --- optional, the `ssl` parameter value indicates whether SSL is enabled (`True`) or disabled (`False`)
    "ssl_ca": {                    --- optional, SSL Certificate Authority
        "path": " "                --- either "path" or "url"
    },
    "ssl_cert": {                  --- optional, SSL certificates
        "url": " "                 --- either "path" or "url"
    },
    "ssl_key": {                   --- optional, SSL keys
        "path": " "                --- either "path" or "url"
    }
};
```

```sql Example for MariaDB SkySQL Service theme={null}
CREATE DATABASE skysql_datasource
WITH ENGINE = 'MariaDB',
PARAMETERS = {
    "host": "mindsdbtest.mdb0002956.db1.skysql.net",
    "port": 5001,
    "database": "mindsdb_data",
    "user": "DB00007539",
    "password": "password",
    --- here, the SSL certificate is required
    "ssl_ca": {
        "url": "https://mindsdb-web-builds.s3.amazonaws.com/aws_skysql_chain.pem"
    }
};
```

## What's Next? Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB. To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples. Have fun! # MindsDB and DBeaver Source: https://docs.mindsdb.com/mindsdb_sql/connect/dbeaver DBeaver is a database tool that allows you to connect to and work with various database engines. You can download it [here](https://dbeaver.io/). ## Data Setup First, create a new database connection in DBeaver by clicking the icon, as shown below.

Next, choose the MySQL database engine and click the *Next* button. If you have multiple `MySQL` options, choose the `Driver for MySQL 8 and later`.

Now it's time to fill in the connection details.

Use the following parameters:

* `127.0.0.1` or `localhost` for the host name. If you run MindsDB in the cloud, specify the host name accordingly.
* `47335` for the port, which is the port of the MySQL API exposed by MindsDB. Learn more about [available APIs here](/setup/environment-vars#mindsdb-apis).
* `mindsdb` for the database name.
* `mindsdb` for the user name, unless specified differently in the [`config.json` file](/setup/custom-config#auth).
* An empty password, unless specified differently in the [`config.json` file](/setup/custom-config#auth).

Now we are ready to test the connection.

## Testing the Connection

Click on the `Test Connection...` button to check if all the provided data allows you to connect to MindsDB. On success, you should see the message, as below.

## Let's Run Some Queries To finally make sure that our MindsDB database connection works, let's run some queries. ```sql theme={null} SHOW FULL DATABASES; ``` On execution, we get: ```sql theme={null} +----------------------+---------+--------+ | Database | TYPE | ENGINE | +----------------------+---------+--------+ | information_schema | system | [NULL] | | mindsdb | project | [NULL] | | files | data | files | +----------------------+---------+--------+ ``` Here is how it looks in DBeaver:

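Any MindsDB SQL statement can be executed from DBeaver in the same way. For example, here is a sketch of listing the models in the default `mindsdb` project, assuming a default local setup:

```sql theme={null}
-- List all models stored in the mindsdb project
SELECT *
FROM mindsdb.models;
```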
How to [whitelist MindsDB Cloud IP address](/faqs/whitelist-ips)?

## What's Next?

Now that you are all set, we recommend you check out our [Tutorials](/sql/tutorials/house-sales-forecasting) section, where you'll find various examples of regression, classification, and time series predictions with MindsDB, or the [Community Tutorials](/tutorials) list.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and Deepnote

Source: https://docs.mindsdb.com/mindsdb_sql/connect/deepnote

We have worked with the team at Deepnote to build a native integration with Deepnote notebooks. Please check:

* [Deepnote Demo Guide](https://deepnote.com/project/Machine-Learning-With-SQL-8GDF7bc7SzKlhBLorqoIcw/%2Fmindsdb_demo.ipynb)
* [Deepnote Integration Docs](https://docs.deepnote.com/integrations/mindsdb)

## What's Next?

Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and Grafana

Source: https://docs.mindsdb.com/mindsdb_sql/connect/grafana

[Grafana](https://grafana.com/) is an open-source analytics and interactive visualization web application that allows users to ingest data from various sources, query this data, and display it on customizable charts for easy analysis.

## How to Connect

To begin, set up Grafana by following one of the methods outlined in the [Grafana Installation Documentation](https://grafana.com/docs/grafana/latest/setup-grafana/installation/#supported-operating-systems). Once Grafana is successfully set up in your environment, navigate to the Connections section, click on Add new connection, and select the MySQL plugin, as shown below.

Now it's time to fill in the connection details.

There are three connection options, as below. To connect to your local MindsDB, use the following connection details:

```
Host: `127.0.0.1:47335`
Username: `mindsdb`
Password: *leave it empty*
Database: *leave it empty*
```

Now we are ready to Save & test the connection.

## Testing the Connection

Click on the `Save & test` button to check if all the provided data allows you to connect to MindsDB. On success, you should see the message, as below.

## Examples ### Querying To verify the functionality of our MindsDB database connection, you can query data in the Explore view. Use the text edit mode to compose your queries. ```sql theme={null} SHOW FULL DATABASES; ``` On execution, we get:

### Visual Query Builder

Now you can build a dashboard with a MindsDB database connection.

Example query:

```sql theme={null}
CREATE DATABASE mysql_demo_db
WITH ENGINE = "mysql",
PARAMETERS = {
    "user": "user",
    "password": "MindsDBUser123!",
    "host": "samples.mindsdb.com",
    "port": "3306",
    "database": "public"
};

SELECT *
FROM mysql_demo_db.air_passengers;
```

On execution, we get:

How to [whitelist MindsDB Cloud IP address](/faqs/whitelist-ips)?

## What's Next?

Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and Jupyter Notebooks

Source: https://docs.mindsdb.com/mindsdb_sql/connect/jupysql

Jupysql is a full SQL client for Jupyter. It allows you to run SQL and plot large datasets in Jupyter via the %sql and %%sql magics. It also allows users to plot the data directly from the database (via the %sqlplot magic). Jupysql facilitates working with databases and Jupyter. You can download it [here](https://github.com/ploomber/jupysql) or run `pip install jupysql`.

Alternatively, you can interact with MindsDB directly from the [MySQL CLI](/connect/mysql-client/) or [Postgres CLI](/connect/postgres-client/).

## How to Connect

#### Prerequisites:

* Make sure you have *jupysql* installed: To install it, run `pip install jupysql`
* Make sure you have *pymysql* installed: To install it, run `pip install pymysql`

You can easily verify the installation of jupysql by running this code:

```python theme={null}
%load_ext sql
```

This command loads the package and allows you to run cell magics on top of Jupyter. And for pymysql, validate by running this command:

```python theme={null}
import pymysql
```

Please follow the instructions below to connect to your MindsDB via Jupysql and Jupyter. You can use the Python code below to connect your Jupyter notebook (or lab) to a local MindsDB database (via Jupysql).

Load the extension:

```python theme={null}
%load_ext sql
```

Connect to your DB:

```python theme={null}
%sql mysql+pymysql://mindsdb:@127.0.0.1:47335/mindsdb
```

Test the connection by listing the existing tables (pure SQL):

```python theme={null}
%sql show tables
```

Please note that we use the following connection details:

* Username is `mindsdb`
* Password is left empty
* Host is `127.0.0.1`
* Port is `47335`
* Database name is `mindsdb`

*Docker* - connecting to MindsDB running in Docker might require a different port.

Create a database connection and execute the code above. On success, only the last command, which lists the tables, produces output. The expected output is:

```bash theme={null}
*  mysql+pymysql://mindsdb:***@127.0.0.1:47335/mindsdb
2 rows affected.
Tables_in_mindsdb
models
```

## What's Next?

Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and Metabase

Source: https://docs.mindsdb.com/mindsdb_sql/connect/metabase

Metabase is open-source software that facilitates data analysis. It lets you visualize your data easily and intuitively. Since MindsDB supports the MySQL binary protocol, you can connect it to Metabase, create and train models, and visualize their forecasts. For more information, visit [Metabase](https://www.metabase.com/).
## Setup

### MindsDB

Install MindsDB locally via [Docker](/setup/self-hosted/docker) or [Docker Desktop](/setup/self-hosted/docker-desktop).

### Metabase

Now, let's set up Metabase by following one of the approaches presented on [the Metabase Open Source Edition page](https://www.metabase.com/start/oss/). Here, we use the [.jar approach](https://www.metabase.com/docs/latest/installation-and-operation/running-the-metabase-jar-file.html) for Metabase.

## How to Connect

Follow the steps below to connect your MindsDB to Metabase.

1. Open your Metabase and navigate to the *Admin settings* by clicking the cog in the bottom left corner.
2. Once there, click on *Databases* in the top navigation bar.
3. Click on *Add database* in the top right corner.
4. Fill in the form using the following data:

```text theme={null}
Database type: `MySQL`
Display name: `MindsDB`
Host: `localhost`
Port: `47335`
Database name: `mindsdb`
Username: `mindsdb`
Password: *leave it empty*
```

5. Click on *Save*. Now you're connected!

## Example

Now that the connection between MindsDB and Metabase is established, let's go through some examples. Most of the SQL statements that you usually run in your [MindsDB SQL Editor](/connect/mindsdb_editor/) can be run in Metabase as well.

Let's start with something easy. On your Metabase's home page, click on *New > SQL query* in the top right corner and then select your MindsDB database. Let's execute the following command in the editor.

```sql theme={null}
SHOW TABLES;
```

On execution, we get:

Please note that creating a [database connection](/sql/tutorials/home-rentals/#connecting-the-data) using the `CREATE DATABASE` statement fails because JDBC uses curly braces (`{}`) as escape sequences.

```sql theme={null}
CREATE DATABASE example_db
WITH ENGINE = "postgres",
PARAMETERS = {
    "user": "demo_user",
    "password": "demo_password",
    "host": "samples.mindsdb.com",
    "port": "5432",
    "database": "demo"
};
```

On execution, we get:

You can overcome this issue by using the [MindsDB SQL Editor](/connect/mindsdb_editor/) to create a database. Now, getting back to Metabase, let's run some queries on the database created with the help of the [MindsDB SQL Editor](/connect/mindsdb_editor/).

```sql theme={null}
SELECT *
FROM example_db.demo_data.home_rentals
LIMIT 10;
```

On execution, we get:

## What's Next?

Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB SQL Editor

Source: https://docs.mindsdb.com/mindsdb_sql/connect/mindsdb_editor

MindsDB provides a SQL Editor, so you don't need to download additional SQL clients to connect to MindsDB.

## How to Use the MindsDB SQL Editor

There are two ways you can use the Editor, as below.

After setting up MindsDB using [Docker](/setup/self-hosted/docker), or pip on [Linux](/setup/self-hosted/pip/linux)/[Windows](/setup/self-hosted/pip/windows)/[MacOS](/setup/self-hosted/pip/macos), or pip via [source code](/setup/self-hosted/pip/source), go to your terminal and execute the following:

```bash theme={null}
python -m mindsdb
```

On execution, we get:

```bash theme={null}
...
2022-05-06 14:07:04,599 - INFO - - GUI available at http://127.0.0.1:47334/
...
```

Immediately after, your browser automatically opens the MindsDB SQL Editor. If it doesn't, visit the URL [`http://127.0.0.1:47334/`](http://127.0.0.1:47334/) in your browser of preference.

Here is a sneak peek of the MindsDB SQL Editor:

## What's Next?

Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and MySQL CLI

Source: https://docs.mindsdb.com/mindsdb_sql/connect/mysql-client

MindsDB provides a powerful MySQL API that allows users to connect to it using the MySQL Command Line Client. Please note that connecting to MindsDB's MySQL API is the same as connecting to a MySQL database. Find more information on MySQL CLI [here](https://dev.mysql.com/doc/refman/8.0/en/mysql.html).

By default, MindsDB starts the `http` and `mysql` APIs. You can define which APIs to start using the `api` flag as below.

```bash theme={null}
python -m mindsdb --api http,mysql,postgres
```

If you want to start MindsDB without the graphical user interface (GUI), use the `--no_studio` flag as below.

```bash theme={null}
python -m mindsdb --no_studio
```

## How to Connect

To connect to MindsDB via the MySQL CLI, use the `mysql` client program:

```bash theme={null}
mysql -h [hostname] --port [TCP/IP port number] -u [user] -p [password]
```

Here is the command that allows you to connect to MindsDB.

```bash theme={null}
mysql -h 127.0.0.1 --port 47335 -u mindsdb
```

On execution, we get:

```bash theme={null}
Welcome to the MariaDB monitor. Commands end with ";" or "\g".
Server version: 5.7.1-MindsDB-1.0 (MindsDB)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MySQL [(none)]>
```

## What's Next?

Now that you are all set, we recommend you check out our [Use Cases](/use-cases/overview) section, where you'll find various examples of regression, classification, time series, and NLP predictions with MindsDB.
To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **MindsDB SQL** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and SQL Alchemy

Source: https://docs.mindsdb.com/mindsdb_sql/connect/sql-alchemy

SQL Alchemy is a Python SQL toolkit that provides object-relational mapping features for the Python programming language. SQL Alchemy facilitates working with databases and Python. You can download it [here](https://www.sqlalchemy.org/) or run `pip install sqlalchemy`.

Alternatively, you can interact with MindsDB directly from the [MySQL CLI](/connect/mysql-client/) or [Postgres CLI](/connect/postgres-client/).

## How to Connect

Please follow the instructions below to connect your MindsDB to SQL Alchemy.

You can use the Python code below to connect your MindsDB database to SQL Alchemy. Make sure you have the *pymysql* module installed before executing the Python code. To install it, run the `pip install pymysql` command.

```python theme={null}
from sqlalchemy import create_engine

user = 'mindsdb'
password = ''
host = '127.0.0.1'
port = 47335
database = ''

def get_connection():
    return create_engine(
        url="mysql+pymysql://{0}:{1}@{2}:{3}/{4}".format(user, password, host, port, database)
    )

if __name__ == '__main__':
    try:
        engine = get_connection()
        engine.connect()
        print(f"Connection to the {host} for user {user} created successfully.")
    except Exception as ex:
        print("Connection could not be made due to the following error: \n", ex)
```

Please note that we use the following connection details:

* Username is `mindsdb`
* Password is left empty
* Host is `127.0.0.1`
* Port is `47335`
* Database name is left empty

To create a database connection, execute the code above. On success, the following output is expected:

```bash theme={null}
Connection to the 127.0.0.1 for user mindsdb created successfully.
```
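With the engine in place, you can run MindsDB SQL statements through a connection. Here is a minimal sketch, assuming the connection details and the `get_connection()` function defined above:

```python theme={null}
from sqlalchemy import text

engine = get_connection()

# Open a connection and list the tables available in MindsDB
with engine.connect() as connection:
    result = connection.execute(text("SHOW TABLES"))
    for row in result:
        print(row)
```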
The SQLAlchemy `create_engine` is lazy. This means that any error made when entering the connection details goes undetected until the connection is actually used, for example, when calling the `execute` method to run SQL commands.

## What's Next?

Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB.

To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples.

Have fun!

# MindsDB and Tableau

Source: https://docs.mindsdb.com/mindsdb_sql/connect/tableau

Tableau lets you visualize your data easily and intuitively. Now that MindsDB supports the MySQL binary protocol, you can connect it to Tableau and see the forecasts.

## How to Connect

Follow the steps below to connect your MindsDB to Tableau.

First, create a new workbook in Tableau and open the *Connectors* tab in the *Connect to Data* window.

Next, choose *MySQL* and provide the details of your MindsDB connection, such as the IP, port, and database name. Optionally, you can provide a username and password. Then, click *Sign In*.

Here are the connection parameters:

```text theme={null}
Host: `localhost`
Port: `47335`
Database name: `mindsdb`
Username: `mindsdb`
Password: *leave it empty*
```

You can [set up the authentication with user and password in the config file](/setup/custom-config#auth).

Now you're connected!

## Overview of MindsDB in Tableau

The content of your MindsDB is visible in the right-side pane.

All the predictors are listed under the *Table* section. You can also switch between the integrations, such as *mindsdb* or *files*, in the *Database* section using the drop-down.

Now, let's run some examples! ## Examples ### Example 1 Previewing one of the tables from the *mysql* integration:

### Example 2 There is one technical limitation. Namely, we cannot join tables from different databases/integrations in Tableau. To overcome this challenge, you can use either views or custom SQL queries. * Previewing a view that joins a data table with a predictor table:

* Using a custom SQL query by clicking the *New Custom SQL* button in the right-side pane:

## What's Next? Now that you are all set, we recommend you check out our **Tutorials** and **Community Tutorials** sections, where you'll find various examples of regression, classification, and time series predictions with MindsDB. To learn more about MindsDB itself, follow the guide on [MindsDB database structure](/sql/table-structure/). Also, don't miss out on the remaining pages from the **SQL API** section, as they explain a common SQL syntax with examples. **From Our Community** Check out the articles and video guides created by our community: * Article on [Predicting & Visualizing Hourly Electricity Demand in the US with MindsDB and Tableau](https://teslimodus.medium.com/predicting-visualizing-hourly-electricity-demand-in-the-us-with-mindsdb-and-tableau-126d1c74d860) by [Teslim Odumuyiwa](https://teslimodus.medium.com/) * Article on [Predicting & Visualizing Petroleum Production with MindsDB and Tableau](https://dev.to/tesprogram/predicting-visualizing-petroleum-production-with-mindsdb-and-tableau-373f) by [Teslim Odumuyiwa](https://github.com/Tes-program) * Article on [Predicting & Visualizing Gas Prices with MindsDB and Tableau](https://dev.to/tesprogram/predicting-visualizing-gas-prices-with-mindsdb-and-tableau-d1p) by [Teslim Odumuyiwa](https://github.com/Tes-program) * Article on [How To Visualize MindsDB Predictions with Tableau](https://dev.to/ephraimx/how-to-visualize-mindsdb-predictions-with-tableau-2bpd) by [Ephraimx](https://dev.to/ephraimx) * Video guide on [Connecting MindsDB to Tableau](https://www.youtube.com/watch?v=eUiBVrm85v4) by [Alissa Troiano](https://github.com/alissatroiano) * Video guide on [Visualizing prediction result in Tableau](https://youtu.be/4aio-8kNbOo) by [Teslim Odumuyiwa](https://github.com/Tes-program) Have fun! # Bring Your Own Function Source: https://docs.mindsdb.com/mindsdb_sql/functions/custom_functions Custom functions provide advanced means of manipulating data. Users can upload custom functions written in Python to MindsDB and apply them to data. ## How It Works You can upload your custom functions via the MindsDB editor by clicking `Add` and `Upload custom functions`, like this:

Here is the form that needs to be filled out in order to bring your custom functions to MindsDB:

Let's briefly go over the files that need to be uploaded:

* The Python file stores an implementation of your custom functions. Here is the sample format:

```py theme={null}
def function_name_1(a:type, b:type) -> type:
    return x

def function_name_2(a:type, b:type, c:type) -> type:
    return x
```

Note that if the input and output types are not set, then `str` is used by default.

```py theme={null}
def add_integers(a:int, b:int) -> int:
    return a+b
```

* The optional requirements file, or `requirements.txt`, stores all dependencies along with their versions. Here is the sample format:

```sql theme={null}
dependency_package_1 == version
dependency_package_2 >= version
dependency_package_3 >= version, < version
...
```

```sql theme={null}
pandas
scikit-learn
```

Once you upload the above files, please provide the name for a storage collection.

Let's look at an example.

## Example

We upload the custom functions, as below:

Here we upload the `functions.py` file that stores an implementation of the functions and the `requirements.txt` file that stores all the dependencies. We named the storage collection `custom_functions`.

Now we can use the functions as below:

```sql theme={null}
SELECT functions.add_integers(sqft, 1) AS added_one, sqft
FROM example_db.home_rentals
LIMIT 1;
```

Here is the output:

```sql theme={null}
+-----------+------+
| added_one | sqft |
+-----------+------+
| 918       | 917  |
+-----------+------+
```

# The FROM_ENV() Function

Source: https://docs.mindsdb.com/mindsdb_sql/functions/from_env

MindsDB provides the `FROM_ENV()` function that lets users pull values from environment variables into MindsDB.

## Usage

Here is how to use the `FROM_ENV()` function.

```sql theme={null}
FROM_ENV("MDB_MY_ENV_VAR")
```

Note that due to security concerns, **only the environment variables with names starting with `MDB_` can be extracted with the `from_env()` function**.

Learn more about [MindsDB variables here](/mindsdb_sql/functions/variables).

# The LLM() Function

Source: https://docs.mindsdb.com/mindsdb_sql/functions/llm_function

MindsDB provides the `LLM()` function that lets users incorporate LLM-generated output directly into data queries.

## Prerequisites

The `LLM()` function requires a large language model, which can be defined in the following ways:

* By setting the `default_llm` parameter in the [MindsDB configuration file](/setup/custom-config#default-llm).
* By saving the default model in the MindsDB Editor under Settings.
* By defining the environment variables as below, choosing one of the available model providers.

Here are the environment variables for the OpenAI provider:

```
LLM_FUNCTION_MODEL_NAME
LLM_FUNCTION_TEMPERATURE
LLM_FUNCTION_MAX_RETRIES
LLM_FUNCTION_MAX_TOKENS
LLM_FUNCTION_BASE_URL
OPENAI_API_KEY
LLM_FUNCTION_API_ORGANIZATION
LLM_FUNCTION_REQUEST_TIMEOUT
```

Note that the values stored in the environment variables are specific for each provider.

Here are the environment variables for the Anthropic provider:

```
LLM_FUNCTION_MODEL_NAME
LLM_FUNCTION_TEMPERATURE
LLM_FUNCTION_MAX_TOKENS
LLM_FUNCTION_TOP_P
LLM_FUNCTION_TOP_K
LLM_FUNCTION_DEFAULT_REQUEST_TIMEOUT
LLM_FUNCTION_API_KEY
LLM_FUNCTION_BASE_URL
```

Note that the values stored in the environment variables are specific for each provider.

Here are the environment variables for the LiteLLM provider:

```
LLM_FUNCTION_MODEL_NAME
LLM_FUNCTION_TEMPERATURE
LLM_FUNCTION_API_BASE
LLM_FUNCTION_MAX_RETRIES
LLM_FUNCTION_MAX_TOKENS
LLM_FUNCTION_TOP_P
LLM_FUNCTION_TOP_K
```

Note that the values stored in the environment variables are specific for each provider.

Here are the environment variables for the Ollama provider:

```
LLM_FUNCTION_BASE_URL
LLM_FUNCTION_MODEL_NAME
LLM_FUNCTION_TEMPERATURE
LLM_FUNCTION_TOP_P
LLM_FUNCTION_TOP_K
LLM_FUNCTION_REQUEST_TIMEOUT
LLM_FUNCTION_FORMAT
LLM_FUNCTION_HEADERS
LLM_FUNCTION_NUM_PREDICT
LLM_FUNCTION_NUM_CTX
LLM_FUNCTION_NUM_GPU
LLM_FUNCTION_REPEAT_PENALTY
LLM_FUNCTION_STOP
LLM_FUNCTION_TEMPLATE
```

Note that the values stored in the environment variables are specific for each provider.
Here are the environment variables for the Nvidia NIMs provider:

```
LLM_FUNCTION_BASE_URL
LLM_FUNCTION_MODEL_NAME
LLM_FUNCTION_TEMPERATURE
LLM_FUNCTION_TOP_P
LLM_FUNCTION_REQUEST_TIMEOUT
LLM_FUNCTION_FORMAT
LLM_FUNCTION_HEADERS
LLM_FUNCTION_NUM_PREDICT
LLM_FUNCTION_NUM_CTX
LLM_FUNCTION_NUM_GPU
LLM_FUNCTION_REPEAT_PENALTY
LLM_FUNCTION_STOP
LLM_FUNCTION_TEMPLATE
LLM_FUNCTION_NVIDIA_API_KEY
```

Note that the values stored in the environment variables are specific for each provider.

**OpenAI-compatible model providers** can be used like OpenAI models. There are a number of OpenAI-compatible model providers, including OpenRouter and vLLM. To use models via these providers, users need to define the base URL and the API key of the provider.

Here is an example of using OpenRouter.

```
LLM_FUNCTION_MODEL_NAME = "mistralai/devstral-small-2505"
LLM_FUNCTION_BASE_URL = "https://openrouter.ai/api/v1"
OPENAI_API_KEY = "openrouter-api-key"
```

## Usage

You can use the `LLM()` function to simply ask a question and get an answer.

```sql theme={null}
SELECT LLM('How many planets are there in the solar system?');
```

Here is the output:

```sql theme={null}
+------------------------------------------+
| llm                                      |
+------------------------------------------+
| There are 8 planets in the solar system. |
+------------------------------------------+
```

Moreover, you can use the `LLM()` function with your data to swiftly complete tasks such as text generation or summarization.

```sql theme={null}
SELECT
    comment,
    LLM('Describe the comment''s category in one word: ' || comment) AS category
FROM example_db.user_comments;
```

Here is the output:

```sql theme={null}
+--------------------------+----------+
| comment                  | category |
+--------------------------+----------+
| I hate tacos             | Dislike  |
| I want to dance          | Desire   |
| Baking is not a big deal | Opinion  |
+--------------------------+----------+
```

# Standard Functions

Source: https://docs.mindsdb.com/mindsdb_sql/functions/standard-functions

MindsDB supports standard SQL functions via DuckDB and MySQL engines.

## DuckDB Functions

MindsDB executes functions on the underlying DuckDB engine. Therefore, [all DuckDB functions](https://duckdb.org/docs/stable/sql/functions/overview) are supported within MindsDB out of the box.
* [Aggregate Functions](https://duckdb.org/docs/stable/sql/functions/aggregates) * [Array Functions](https://duckdb.org/docs/stable/sql/functions/array) * [Bitstring Functions](https://duckdb.org/docs/stable/sql/functions/bitstring) * [Blob Functions](https://duckdb.org/docs/stable/sql/functions/blob) * [Date Format Functions](https://duckdb.org/docs/stable/sql/functions/dateformat) * [Date Functions](https://duckdb.org/docs/stable/sql/functions/date) * [Date Part Functions](https://duckdb.org/docs/stable/sql/functions/datepart) * [Enum Functions](https://duckdb.org/docs/stable/sql/functions/enum) * [Interval Functions](https://duckdb.org/docs/stable/sql/functions/interval) * [Lambda Functions](https://duckdb.org/docs/stable/sql/functions/lambda) * [List Functions](https://duckdb.org/docs/stable/sql/functions/list) * [Map Functions](https://duckdb.org/docs/stable/sql/functions/map) * [Nested Functions](https://duckdb.org/docs/stable/sql/functions/nested) * [Numeric Functions](https://duckdb.org/docs/stable/sql/functions/numeric) * [Pattern Matching](https://duckdb.org/docs/stable/sql/functions/pattern_matching) * [Regular Expressions](https://duckdb.org/docs/stable/sql/functions/regular_expressions) * [Struct Functions](https://duckdb.org/docs/stable/sql/functions/struct) * [Text Functions](https://duckdb.org/docs/stable/sql/functions/text) * [Time Functions](https://duckdb.org/docs/stable/sql/functions/time) * [Timestamp Functions](https://duckdb.org/docs/stable/sql/functions/timestamp) * [Timestamp with Time Zone Functions](https://duckdb.org/docs/stable/sql/functions/timestamptz) * [Union Functions](https://duckdb.org/docs/stable/sql/functions/union) * [Utility Functions](https://duckdb.org/docs/stable/sql/functions/utility) * [Window Functions](https://duckdb.org/docs/stable/sql/functions/window_functions) ## MySQL Functions MindsDB executes MySQL-style functions on the underlying DuckDB engine. The following functions have been adapted to MySQL-style functions. 
String functions: * [`CHAR`](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_char) * [`FORMAT`](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_format) * [`INSTR`](https://dev.mysql.com/doc/refman/8.0/en/string-functions.html#function_instr) * [`LENGTH`](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_length) * [`LOCATE`](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_locate) * [`SUBSTRING_INDEX`](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_substring-index) * [`UNHEX`](https://dev.mysql.com/doc/refman/8.4/en/string-functions.html#function_unhex) Date and time functions: * [`ADDDATE`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_adddate) * [`ADDTIME`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_addtime) * [`CONVERT_TZ`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_convert-tz) * [`CURDATE`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_curdate) * [`CURTIME`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_curtime) * [`DATE_ADD`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_date-add) * [`DATE_FORMAT`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_date-format) * [`DATE_SUB`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_date-sub) * [`DATEDIFF`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_datediff) * [`DAYNAME`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_dayname) * [`DAYOFMONTH`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_dayofmonth) * [`DAYOFWEEK`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_dayofweek) * [`DAYOFYEAR`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_dayofyear) * [`EXTRACT`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_extract) * [`FROM_DAYS`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_from-days) * [`FROM_UNIXTIME`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_from-unixtime) * [`GET_FORMAT`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_get-format) * [`TIMESTAMPDIFF`](https://dev.mysql.com/doc/refman/8.4/en/date-and-time-functions.html#function_timestampdiff) Other functions: * [`REGEXP_SUBSTR`](https://dev.mysql.com/doc/refman/8.4/en/regexp.html#function_regexp-substr) * [`SHA2`](https://dev.mysql.com/doc/refman/8.4/en/encryption-functions.html#function_sha2) # The TO_MARKDOWN() Function Source: https://docs.mindsdb.com/mindsdb_sql/functions/to_markdown_function MindsDB provides the `TO_MARKDOWN()` function that lets users extract the content of their documents in markdown by simply specifying the document path or URL. This function is especially useful for passing the extracted content of documents through LLMs or for storing them in a [Knowledge Base](/mindsdb_sql/agents/knowledge-bases). ## Configuration The `TO_MARKDOWN()` function supports different file formats and methods of passing documents into it, as well as an LLM required for processing documents. ### Supported File Formats The `TO_MARKDOWN()` function supports PDF, XML, and Nessus file formats. 
The documents can be provided from URLs, file storage, or Amazon S3 storage.

### Supported LLMs

The `TO_MARKDOWN()` function requires an LLM to process the document content into the Markdown format. The supported LLM providers include:

* OpenAI
* Azure OpenAI
* Google

The model you select must support multi-modal inputs, that is, both images and text. For example, OpenAI's gpt-4o is a supported multi-modal model.

Users can provide an LLM using one of the methods below:

1. Set the default model in the Settings of the MindsDB Editor.
2. Set the default model in the [MindsDB configuration file](/setup/custom-config#default-llm).
3. Use the environment variables defined below to set an LLM specifically for the `TO_MARKDOWN()` function.

The `TO_MARKDOWN_FUNCTION_PROVIDER` environment variable defines the selected provider, which is one of `openai`, `azure_openai`, or `google`.

Here are the environment variables for the OpenAI provider:

```
TO_MARKDOWN_FUNCTION_API_KEY (required)
TO_MARKDOWN_FUNCTION_MODEL_NAME
TO_MARKDOWN_FUNCTION_TEMPERATURE
TO_MARKDOWN_FUNCTION_MAX_RETRIES
TO_MARKDOWN_FUNCTION_MAX_TOKENS
TO_MARKDOWN_FUNCTION_BASE_URL
TO_MARKDOWN_FUNCTION_API_ORGANIZATION
TO_MARKDOWN_FUNCTION_REQUEST_TIMEOUT
```

Here are the environment variables for the Azure OpenAI provider:

```
TO_MARKDOWN_FUNCTION_API_KEY (required)
TO_MARKDOWN_FUNCTION_BASE_URL (required)
TO_MARKDOWN_FUNCTION_API_VERSION (required)
TO_MARKDOWN_FUNCTION_MODEL_NAME
TO_MARKDOWN_FUNCTION_TEMPERATURE
TO_MARKDOWN_FUNCTION_MAX_RETRIES
TO_MARKDOWN_FUNCTION_MAX_TOKENS
TO_MARKDOWN_FUNCTION_API_ORGANIZATION
TO_MARKDOWN_FUNCTION_REQUEST_TIMEOUT
```

Here are the environment variables for the Google provider:

```
TO_MARKDOWN_FUNCTION_API_KEY
TO_MARKDOWN_FUNCTION_MODEL_NAME
TO_MARKDOWN_FUNCTION_TEMPERATURE
TO_MARKDOWN_FUNCTION_MAX_TOKENS
TO_MARKDOWN_FUNCTION_REQUEST_TIMEOUT
```

## Usage

You can use the `TO_MARKDOWN()` function to extract the content of your documents in markdown format. The arguments for this function are:

* `file_path_or_url`: The path or URL of the document you want to extract content from.

The following example shows how to use the `TO_MARKDOWN()` function with a PDF document from [Amazon S3 storage connected to MindsDB](/integrations/data-integrations/amazon-s3).

```sql theme={null}
SELECT TO_MARKDOWN(public_url)
FROM s3_datasource.files;
```

Here are the steps for passing files from Amazon S3 into TO\_MARKDOWN().

1. Connect Amazon S3 to MindsDB following [this instruction](/integrations/data-integrations/amazon-s3).
2. The `public_url` of the file is generated in the `s3_datasource.files` table upon connecting the Amazon S3 data source to MindsDB.
3. Upon running the above query, the `public_url` of the file is selected from the `s3_datasource.files` table.

The following example shows how to use the `TO_MARKDOWN()` function with a PDF document from a URL.
```sql theme={null} SELECT TO_MARKDOWN('https://www.princexml.com/howcome/2016/samples/invoice/index.pdf'); ``` Here is the output: ````sql theme={null} +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | to_markdown | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ | ```markdown | | # Invoice | | | | YesLogic Pty. Ltd. | | 7 / 39 Bouverie St | | Carlton VIC 3053 | | Australia | | | | www.yeslogic.com | | ABN 32 101 193 560 | | | | Customer Name | | Street | | Postcode City | | Country | | | | Invoice date: | Nov 26, 2016 | | --- | --- | | Invoice number: | 161126 | | Payment due: | 30 days after invoice date | | | | | Description | From | Until | Amount | | | |---------------------------|-------------|-------------|------------| | | | Prince Upgrades & Support | Nov 26, 2016 | Nov 26, 2017 | USD $950.00 | | | | Total | | | USD $950.00 | | | | | Please transfer amount to: | | | | Bank account name: | Yes Logic Pty Ltd | | --- | --- | | Name of Bank: | Commonwealth Bank of Australia (CBA) | | Bank State Branch (BSB): | 063010 | | Bank State Branch (BSB): | 063010 | | Bank State Branch (BSB): | 063019 | | Bank account number: | 13201652 | | Bank SWIFT code: | CTBAAU2S | | Bank address: | 231 Swanston St, Melbourne, VIC 3000, Australia | | | | The BSB number identifies a branch of a financial institution in Australia. When transferring money to Australia, the BSB number is used together with the bank account number and the SWIFT code. Australian banks do not use IBAN numbers. | | | | www.yeslogic.com | | ``` | +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+ ```` The content of each PDF page is intelligently extracted by first assessing how visually complex the page is. Based on this assessment, the system decides whether traditional text parsing is sufficient or if the page should be processed using an LLM. ### Usage with Knowledge Bases You can also use the `TO_MARKDOWN()` function to extract content from documents and store it in a [Knowledge Base](/mindsdb_sql/agents/knowledge-bases). This is particularly useful for creating a Knowledge Base from a collection of documents. ```sql theme={null} INSERT INTO my_kb ( SELECT HASH('https://www.princexml.com/howcome/2016/samples/invoice/index.pdf') as id, TO_MARKDOWN('https://www.princexml.com/howcome/2016/samples/invoice/index.pdf') as content ) ``` # Variables Source: https://docs.mindsdb.com/mindsdb_sql/functions/variables MindsDB supports the usage of variables. Users can save values of API keys or other frequently used values and pass them as variables when creating knowledge bases, agents, or other MindsDB object. ## Usage Here is how to create variables in MindsDB. * Create variables using `SET` and save values either using the [`from_env()` function](/mindsdb_sql/functions/from_env) or directly. ```sql theme={null} SET @my_env_var = from_env("MDB_MY_ENV_VAR") SET @my_value = "123456" ``` * Use variables to pass parameters when creating objects in MindsDB. 
Here is an example for [knowledge bases](/mindsdb_sql/knowledge_bases/overview).

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    embedding_model = {
        "provider": "openai",
        "model_name" : "text-embedding-3-large",
        "api_key": @my_env_var
    },
    ...;
```

# How to Alter Existing Knowledge Bases

Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/alter

The `ALTER KNOWLEDGE_BASE` command enables users to modify the configuration of an existing knowledge base without the need to recreate it. This document lists the parameters that can be altered and explains the process and its effect on the existing knowledge base.

## `ALTER KNOWLEDGE_BASE` Syntax

Here is the syntax used to alter an existing knowledge base.

```sql theme={null}
ALTER KNOWLEDGE_BASE kb_name
USING parameter_name = parameter_value, ...;
```

The following parameters can be altered:

* `embedding_model`

Users can alter only the API key of the provider used for the embedding model. Users cannot alter the provider and the model itself, because that would be incompatible with the already embedded content stored in the knowledge base.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    embedding_model = {
        'api_key': 'new-api-key'
    };
```

Upon altering the API key of the embedding model's provider, ensure that the new API key has access to the same embedding model so that the knowledge base can continue to function without issues.

* `reranking_model`

Users can turn off reranking by setting `reranking_model = false`, or change the provider, API key, and model used for reranking.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    reranking_model = {
        'provider': 'new_provider',
        'model_name': 'new_model',
        'api_key': 'new-api-key'
    };

ALTER KNOWLEDGE_BASE my_kb
USING
    reranking_model = false;
```

Upon updating the reranking model, the knowledge base will use the newly defined reranking model when reranking results, provided that reranking is turned on.

* `content_columns`

Users can change the content columns.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    content_columns = ['content_col1', 'content_col2', ...];
```

Upon changing the content columns, all the previously inserted content stays unchanged. The knowledge base will now embed content from the columns defined in the most recent call to `ALTER KNOWLEDGE_BASE`.

* `metadata_columns`

Users can change the metadata columns, overriding the existing metadata columns.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    metadata_columns = ['metadata_col1', 'metadata_col2', ...];
```

Upon changing the metadata columns:

* All metadata fields are stored in the knowledge base. No data is removed.
* Users can filter only by metadata fields defined in the most recent call to `ALTER KNOWLEDGE_BASE`.
* To be able to filter by all metadata fields, include them in the list as below.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    metadata_columns = ['existing_metadata_fields', ..., 'new_metadata_fields', ...];
```

* `id_column`

Users can change the ID column.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    id_column = 'my_id';
```

Upon changing the ID column, users must keep in mind that inserting data with an already existing ID value will update the existing row rather than create a new one.

* `storage`

Users cannot update the underlying vector database of an existing knowledge base.

* `preprocessing`

Users can modify the [`preprocessing` parameters as defined here](/mindsdb_sql/knowledge_bases/insert_data#chunking-data), as shown in the sketch below.
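For example, here is a sketch of adjusting the chunking configuration. The parameter names follow the chunking options described in the linked section; the exact keys and values shown are illustrative.

```sql theme={null}
ALTER KNOWLEDGE_BASE my_kb
USING
    preprocessing = {
        "text_chunking_config": {
            "chunk_size": 2000,
            "chunk_overlap": 200
        }
    };
```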
# How to Create Knowledge Bases Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/create A knowledge base is an advanced system that organizes information based on semantic meaning rather than simple keyword matching. It integrates embedding models, reranking models, and vector stores to enable context-aware data retrieval. ## `CREATE KNOWLEDGE_BASE` Syntax Here is the syntax for creating a knowledge base: ```sql theme={null} CREATE KNOWLEDGE_BASE my_kb USING embedding_model = { "provider": "openai", "model_name" : "text-embedding-3-large", "api_key": "sk-..." }, reranking_model = { "provider": "openai", "model_name": "gpt-4o", "api_key": "sk-..." }, storage = my_vector_store.storage_table, metadata_columns = ['date', 'creator', ...], content_columns = ['review', 'content', ...], id_column = 'id'; ``` Upon execution, it registers `my_kb` and associates the specified models and storage. `my_kb` is a unique identifier of the knowledge base within MindsDB. Here is how to list all knowledge bases: ```sql theme={null} SHOW KNOWLEDGE_BASES; ``` Users can use the variables and the [`from_env()` function](/mindsdb_sql/functions/from_env) to pass parameters when creating knowledge bases. As MindsDB stores objects, such as models or knowledge bases, inside [projects](/mindsdb_sql/sql/create/project), you can create a knowledge base inside a custom project. ```sql theme={null} CREATE PROJECT my_project; CREATE KNOWLEDGE_BASE my_project.my_kb USING ... ``` ### Supported LLMs Below is the list of all language models supported for the `embedding_model` and `reranking_model` parameters. #### `provider = 'openai'` This provider is supported for both `embedding_model` and `reranking_model`. Users can define the default embedding and reranking models from OpenAI in Settings of the MindsDB GUI. Furthermore, users can select `Custom OpenAI API` from the dropdown and use models from any OpenAI-compatible API. When choosing `openai` as the model provider, users should define the following model parameters. * `model_name` stores the name of the OpenAI model to be used. * `api_key` stores the OpenAI API key. Learn more about the [OpenAI integration with MindsDB here](/integrations/ai-engines/openai). #### `provider = 'openai_azure'` This provider is supported for both `embedding_model` and `reranking_model`. Users can define the default embedding and reranking models from Azure OpenAI in Settings of the MindsDB GUI. When choosing `openai_azure` as the model provider, users should define the following model parameters. * `model_name` stores the name of the OpenAI model to be used. * `api_key` stores the OpenAI API key. * `base_url` stores the base URL of the Azure instance. * `api_version` stores the version of the Azure instance. Users need to log in to their Azure OpenAI instance to retrieve all relevant parameter values. Next, click on `Explore Azure AI Foundry portal` and go to `Models + endpoints`. Select the model and copy the parameter values. #### `provider = 'google'` This provider is supported for both `embedding_model` and `reranking_model`. Users can define the default embedding and reranking models from Google in Settings of the MindsDB GUI. When choosing `google` as the model provider, users should define the following model parameters. * `model_name` stores the name of the Google model to be used. * `api_key` stores the Google API key. Learn more about the [Google Gemini integration with MindsDB here](/integrations/ai-engines/google_gemini). 
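For example, here is a sketch of a knowledge base that uses Google models for both embedding and reranking. The model names and API key are illustrative.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    embedding_model = {
        "provider": "google",
        "model_name" : "text-embedding-004",
        "api_key": "AIza..."
    },
    reranking_model = {
        "provider": "google",
        "model_name" : "gemini-2.0-flash",
        "api_key": "AIza..."
    };
```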
#### `provider = 'bedrock'`

This provider is supported for both `embedding_model` and `reranking_model`.

When choosing `bedrock` as the model provider, users should define the following model parameters.

* `model_name` stores the name of the model available via Amazon Bedrock.
* `aws_access_key_id` stores a unique identifier associated with your AWS account, used to identify the user or application making requests to AWS.
* `aws_region_name` stores the name of the AWS region you want to send your requests to (e.g., `"us-west-2"`).
* `aws_secret_access_key` stores the secret key associated with your AWS access key ID. It is used to sign your requests securely.
* `aws_session_token` is an optional parameter that stores a temporary token used for short-term security credentials when using AWS Identity and Access Management (IAM) roles or temporary credentials.

#### `provider = 'snowflake'`

This provider is supported for both `embedding_model` and `reranking_model`.

When choosing `snowflake` as the model provider, users should choose one of the available models from [Snowflake Cortex AI](https://www.snowflake.com/en/product/features/cortex/) and define the following model parameters.

* `model_name` stores the name of the model available via Snowflake Cortex AI.
* `api_key` stores the Snowflake Cortex AI API key.
* `account_id` stores the Snowflake account ID.

Follow the below steps to generate the API key.

1. Generate a key pair according to [this instruction](https://docs.snowflake.com/en/user-guide/key-pair-auth) as below.

* Execute these commands in the console:

```bash theme={null}
# generate private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt

# generate public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
```

* Save the public key, that is, the content of rsa\_key.pub, into your database user:

```sql theme={null}
ALTER USER my_user SET RSA_PUBLIC_KEY = '<public_key_content>'
```

2. Verify the key pair with the database user.

* Install `snowsql` following [this instruction](https://docs.snowflake.com/en/user-guide/snowsql-install-config).
* Execute this command in the console:

```bash theme={null}
snowsql -a <account_identifier> -u my_user --private-key-path rsa_key.p8
```

3. Generate the JWT token.

* Download the Python script from [Snowflake's Developer Guide for Authentication](https://docs.snowflake.com/en/developer-guide/sql-api/authenticating). Here is a [direct download link](https://docs.snowflake.com/en/_downloads/aeb84cdfe91dcfbd889465403b875515/sql-api-generate-jwt.py).
* Ensure that the PyJWT module, which is required for running the script, is installed.
* Run the script using this command:

```bash theme={null}
sql-api-generate-jwt.py --account <account_identifier> --user my_user --private_key_file_path rsa_key.p8
```

This command returns the JWT token, which is used in the `api_key` parameter for the `snowflake` provider.

#### `provider = 'ollama'`

This provider is supported for both `embedding_model` and `reranking_model`.

Users can define the default embedding and reranking models from Ollama in Settings of the MindsDB GUI.

When choosing `ollama` as the model provider, users should define the following model parameters.

* `model_name` stores the name of the model to be used.
* `base_url` stores the base URL of the Ollama instance.

### `embedding_model`

The embedding model is a required component of the knowledge base. It stores specifications of the embedding model to be used.

Users can define the embedding model choosing one of the following options.
**Option 1.** Use the `embedding_model` parameter to define the specification.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    embedding_model = {
        "provider": "azure_openai",
        "model_name" : "text-embedding-3-large",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01"
    },
    ...
```

**Option 2.** Define the default embedding model in the [MindsDB configuration file](/setup/custom-config).

You can define the default models in the Settings of the MindsDB Editor GUI.

Note that if you define [`default_embedding_model` in the configuration file](/setup/custom-config#default_embedding_model), you do not need to provide the `embedding_model` parameter when creating a knowledge base. If you provide both, then the values from the `embedding_model` parameter are used.

When using `default_embedding_model` from the configuration file, the knowledge base saves this model internally. Therefore, changing `default_embedding_model` in the configuration file to a different one after the knowledge base is created does not affect the already created knowledge bases.

```bash theme={null}
"default_embedding_model": {
    "provider": "azure_openai",
    "model_name" : "text-embedding-3-large",
    "api_key": "sk-abc123",
    "base_url": "https://ai-6689.openai.azure.com/",
    "api_version": "2024-02-01"
}
```

The embedding model specification includes:

* `provider`
It is a required parameter. It defines the model provider.

* `model_name`
It is a required parameter. It defines the embedding model name as specified by the provider.

* `api_key`
The API key is required to access the embedding model assigned to a knowledge base. Users can provide it either in this `api_key` parameter, or in the `OPENAI_API_KEY` environment variable for `"provider": "openai"` and the `AZURE_OPENAI_API_KEY` environment variable for `"provider": "azure_openai"`.

* `base_url`
It is an optional parameter, which defaults to `https://api.openai.com/v1/`. It is a required parameter when using the `azure_openai` provider. It is the root URL used to send API requests.

* `api_version`
It is an optional parameter. It is a required parameter when using the `azure_openai` provider. It defines the API version.

### `reranking_model`

The reranking model is an optional component of the knowledge base. It stores specifications of the reranking model to be used.

Users can disable reranking features of knowledge bases by setting this parameter to `false`.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    reranking_model = false,
    ...
```

Users can enable reranking features of knowledge bases by defining the reranking model choosing one of the following options.

**Option 1.** Use the `reranking_model` parameter to define the specification.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    reranking_model = {
        "provider": "azure_openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01",
        "method": "multi-class"
    },
    ...
```

**Option 2.** Define the default reranking model in the [MindsDB configuration file](/setup/custom-config).

You can define the default models in the Settings of the MindsDB Editor GUI.

Note that if you define [`default_reranking_model` in the configuration file](/setup/custom-config#default-reranking-model), you do not need to provide the `reranking_model` parameter when creating a knowledge base. If you provide both, then the values from the `reranking_model` parameter are used.
When using `default_reranking_model` from the configuration file, the knowledge base saves this model internally. Therefore, changing `default_reranking_model` in the configuration file to a different one after the knowledge base is created does not affect the already created knowledge bases.

```bash theme={null}
"default_reranking_model": {
    "provider": "azure_openai",
    "model_name" : "gpt-4o",
    "api_key": "sk-abc123",
    "base_url": "https://ai-6689.openai.azure.com/",
    "api_version": "2024-02-01",
    "method": "multi-class"
}
```

The reranking model specification includes:

* `provider`
It is a required parameter. It defines the model provider as listed in [supported LLMs](/mindsdb_sql/knowledge_bases/create#supported-llms).

* `model_name`
It is a required parameter. It defines the reranking model name as specified by the provider.

* `api_key`
The API key is required to access the reranking model assigned to a knowledge base. Users can provide it either in this `api_key` parameter, or in the `OPENAI_API_KEY` environment variable for `"provider": "openai"` and the `AZURE_OPENAI_API_KEY` environment variable for `"provider": "azure_openai"`.

* `base_url`
It is an optional parameter, which defaults to `https://api.openai.com/v1/`. It is a required parameter when using the `azure_openai` provider. It is the root URL used to send API requests.

* `api_version`
It is an optional parameter. It is a required parameter when using the `azure_openai` provider. It defines the API version.

* `method`
It is an optional parameter. It defines the method used to calculate the relevance of the output rows. The available options include `multi-class` and `binary`. It defaults to `multi-class`.

**Reranking Method**

The `multi-class` reranking method classifies each document chunk (that meets any specified metadata filtering conditions) into one of four relevance classes:

1. Not relevant with class weight of 0.25.
2. Slightly relevant with class weight of 0.5.
3. Moderately relevant with class weight of 0.75.
4. Highly relevant with class weight of 1.

The overall `relevance_score` of a document is calculated as the sum of each chunk's class weight multiplied by its class probability (from the model's logprob output).

The `binary` reranking method simplifies classification by determining whether a document is relevant or not, without intermediate relevance levels. With this method, the overall `relevance_score` of a document is calculated based on the model log probability.

### `storage`

The vector store is a required component of the knowledge base. It stores data in the form of embeddings.

It is optional for users to provide the `storage` parameter. If not provided, the default ChromaDB is created when creating a knowledge base.

The available options include either [PGVector](/integrations/vector-db-integrations/pgvector) or [ChromaDB](/integrations/vector-db-integrations/chromadb). It is recommended to use PGVector version 0.8.0 or higher for better performance.

If the `storage` parameter is not provided, the system creates the default ChromaDB vector database called `<kb_name>_chromadb` with the default table called `default_collection` that stores the embedded data. This default ChromaDB vector database is stored in MindsDB's storage.

In order to provide the storage vector database, it is required to connect it to MindsDB beforehand. Here is an example for [PGVector](/integrations/vector-db-integrations/pgvector).
```sql theme={null}
CREATE DATABASE my_pgvector
WITH ENGINE = 'pgvector',
PARAMETERS = {
    "host": "127.0.0.1",
    "port": 5432,
    "database": "postgres",
    "user": "user",
    "password": "password",
    "distance": "cosine"
};

CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    storage = my_pgvector.storage_table,
    ...
```

Note that you do not need to create `storage_table` beforehand; it is created automatically when the knowledge base is created.

### `metadata_columns`

The data inserted into the knowledge base can be classified as metadata, which enables users to filter the search results using defined data fields. Note that source data column(s) included in `metadata_columns` cannot be used in `content_columns`, and vice versa.

This parameter is an array of strings that lists column names from the source data to be used as metadata. If not provided, all inserted columns (except for columns defined as `id_column` and `content_columns`) are considered metadata columns.

Here is an example of usage. A user wants to store the following data in a knowledge base.

```sql theme={null}
+----------+-------------------+------------------------+
| order_id | product           | notes                  |
+----------+-------------------+------------------------+
| A1B      | Wireless Mouse    | Request color: black   |
| 3XZ      | Bluetooth Speaker | Gift wrap requested    |
| Q7P      | Laptop Stand      | Prefer aluminum finish |
+----------+-------------------+------------------------+
```

Go to the *Complete Example* section below to find out how to access this sample data.

The `product` column can be used as metadata to enable metadata filtering.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    metadata_columns = ['product'],
    ...
```

### `content_columns`

The data inserted into the knowledge base can be classified as content, which is embedded by the embedding model and stored in the underlying vector store. Note that source data column(s) included in `content_columns` cannot be used in `metadata_columns`, and vice versa.

This parameter is an array of strings that lists column names from the source data to be used as content and processed into embeddings. If not provided, a column named `content` is expected by default when inserting data into the knowledge base.

Here is an example of usage. A user wants to store the following data in a knowledge base.

```sql theme={null}
+----------+-------------------+------------------------+
| order_id | product           | notes                  |
+----------+-------------------+------------------------+
| A1B      | Wireless Mouse    | Request color: black   |
| 3XZ      | Bluetooth Speaker | Gift wrap requested    |
| Q7P      | Laptop Stand      | Prefer aluminum finish |
+----------+-------------------+------------------------+
```

Go to the *Complete Example* section below to find out how to access this sample data.

The `notes` column can be used as content.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    content_columns = ['notes'],
    ...
```

### `id_column`

The ID column uniquely identifies each source data row in the knowledge base. It is an optional parameter.

If provided, this parameter is a string that contains the source data ID column name. If not provided, the ID is generated from the hash of the content columns.

Here is an example of usage. A user wants to store the following data in a knowledge base.
```sql theme={null}
+----------+-------------------+------------------------+
| order_id | product           | notes                  |
+----------+-------------------+------------------------+
| A1B      | Wireless Mouse    | Request color: black   |
| 3XZ      | Bluetooth Speaker | Gift wrap requested    |
| Q7P      | Laptop Stand      | Prefer aluminum finish |
+----------+-------------------+------------------------+
```

Go to the *Complete Example* section below to find out how to access this sample data.

The `order_id` column can be used as the ID.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    id_column = 'order_id',
    ...
```

Note that if a source data row is chunked into multiple chunks by the knowledge base (to optimize storage), the resulting rows in the knowledge base share the same ID value, which identifies chunks that come from the same source data row.

**Available options for the ID column values**

* User-Defined ID Column:
  When users define the `id_column` parameter, the values from the provided source data column are used to identify source data rows within the knowledge base.

* User-Generated ID Column:
  When users do not have a column that uniquely identifies each row in their source data, they can generate the ID column values when inserting data into the knowledge base, using functions like `HASH()` or `ROW_NUMBER()`.

```sql theme={null}
INSERT INTO my_kb (
    SELECT ROW_NUMBER() OVER (ORDER BY order_id) AS id, *
    FROM sample_data.orders
);
```

* Default ID Column:
  If the `id_column` parameter is not defined, its default values are built from the hash of the content columns.
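To see how IDs map to chunks, you can query the knowledge base directly. The sketch below assumes the search results expose `id`, `chunk_id`, and `chunk_content` columns (`id` and `chunk_content` appear elsewhere in these docs; the `chunk_id` column name is an assumption here); every chunk produced from the same source row carries the same `id` value.

```sql theme={null}
-- Hedged sketch: inspect how chunks map back to source rows.
-- Assumes the result set exposes id, chunk_id, and chunk_content columns;
-- chunks originating from the same source row share one id value.
SELECT id, chunk_id, chunk_content
FROM my_kb
WHERE content = 'aluminum finish';
```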
### Example

Here is a sample knowledge base that will be used in the examples that follow.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    embedding_model = {
        "provider": "openai",
        "model_name" : "text-embedding-3-large",
        "api_key": "sk-abc123"
    },
    reranking_model = {
        "provider": "openai",
        "model_name": "gpt-4o",
        "api_key": "sk-abc123"
    },
    metadata_columns = ['product'],
    content_columns = ['notes'],
    id_column = 'order_id';
```

## `DESCRIBE KNOWLEDGE_BASE` Syntax

Users can get details about the knowledge base using the `DESCRIBE KNOWLEDGE_BASE` command.

```sql theme={null}
DESCRIBE KNOWLEDGE_BASE my_kb;
```

Here is the sample output:

```sql theme={null}
+-------+---------+--------+-----------------------------------+------------------------------------------------------------------------------+-------------------+--------------------+----------------+--------+----------+
| NAME  | PROJECT | MODEL  | STORAGE                           | PARAMS                                                                         | INSERT_STARTED_AT | INSERT_FINISHED_AT | PROCESSED_ROWS | ERROR  | QUERY_ID |
+-------+---------+--------+-----------------------------------+------------------------------------------------------------------------------+-------------------+--------------------+----------------+--------+----------+
| my_kb | mindsdb | [NULL] | my_kb_chromadb.default_collection | {"embedding_model": {"provider": "openai", "model_name":                      | [NULL]            | [NULL]             | [NULL]         | [NULL] | [NULL]   |
|       |         |        |                                   | "text-embedding-ada-002", "api_key": "sk-xxx"}, "reranking_model":            |                   |                    |                |        |          |
|       |         |        |                                   | {"provider": "openai", "model_name": "gpt-4o", "api_key": "sk-xxx"},          |                   |                    |                |        |          |
|       |         |        |                                   | "default_vector_storage": "my_kb_chromadb"}                                   |                   |                    |                |        |          |
+-------+---------+--------+-----------------------------------+------------------------------------------------------------------------------+-------------------+--------------------+----------------+--------+----------+
```

## `DROP KNOWLEDGE_BASE` Syntax

Here is the syntax for deleting a knowledge base:

```sql theme={null}
DROP KNOWLEDGE_BASE my_kb;
```

Upon execution, it removes the knowledge base along with all of its content.

# How to Evaluate Knowledge Bases

Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/evaluate

Evaluating knowledge bases verifies how accurate and relevant the data returned by the knowledge base is.

## `EVALUATE KNOWLEDGE_BASE` Syntax

With the `EVALUATE KNOWLEDGE_BASE` command, users can evaluate the relevancy and accuracy of the documents and data returned by the knowledge base.

Below is the complete syntax that includes both required and optional parameters.
```sql theme={null}
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'doc_id',
    generate_data = {
        'from_sql': 'SELECT id, content FROM my_datasource.my_table',
        'count': 100
    },
    evaluate = false,
    llm = {
        'provider': 'openai',
        'api_key':'sk-xxx',
        'model_name':'gpt-4'
    },
    save_to = my_datasource.my_result_table;
```

### `test_table`

This is a required parameter that stores the name of the table from one of the data sources connected to MindsDB. For example, `test_table = my_datasource.my_test_table` defines a table named `my_test_table` from a data source named `my_datasource`.

This test table stores test data, commonly in the form of questions and answers. Its content depends on the `version` parameter defined below.

Users can provide their own test data or have the test data generated by the `EVALUATE KNOWLEDGE_BASE` command, which is performed when setting the `generate_data` parameter defined below.

### `version`

This is an optional parameter that defines the version of the evaluator. If not defined, its default value is `doc_id`.

* `version = 'doc_id'`\
  The evaluator checks whether the document ID returned by the knowledge base matches the expected document ID as defined in the test table.

* `version = 'llm_relevancy'`\
  The evaluator uses a language model to rank and evaluate responses from the knowledge base.

### `generate_data`

This is an optional parameter used to configure the test data generation, which is saved into the table defined in the `test_table` parameter. If not defined, its default value is `false`, meaning that no test data is generated.

Available values are as follows:

* A dictionary containing the following values:

  * `from_sql` defines the SQL query that fetches the test data. For example, `'from_sql': 'SELECT id, content FROM my_datasource.my_table'`. If not defined, it fetches test data from the knowledge base on which the `EVALUATE` command is executed: `SELECT chunk_content, id FROM my_kb`.
  * `count` defines the size of the test dataset. For example, `'count': 100`. Its default value is 20.

  When providing the `from_sql` parameter, it requires specific column names as follows:

  * With `version = 'doc_id'`, the `from_sql` parameter should contain a query that returns the `id` and `content` columns, like this: `'from_sql': 'SELECT id_column_name AS id, content_column_names AS content FROM my_datasource.my_table'`
  * With `version = 'llm_relevancy'`, the `from_sql` parameter should contain a query that returns the `content` column, like this: `'from_sql': 'SELECT content_column_names AS content FROM my_datasource.my_table'`

* A value of `true`, such as `generate_data = true`, which implies that default values for `from_sql` and `count` will be used.

### `evaluate`

This is an optional parameter that defines whether to evaluate the knowledge base. If not defined, its default value is `true`. Users can set it to `false`, as in `evaluate = false`, in order to generate test data into the test table without running the evaluator.

### `llm`

This is an optional parameter that defines a language model to be used for evaluations when `version` is set to `llm_relevancy`. If not defined, its default value is the [`reranking_model` defined with the knowledge base](/mindsdb_sql/knowledge_bases/create#reranking-model). Users can define it with the `EVALUATE KNOWLEDGE_BASE` command in the same manner.

```sql theme={null}
EVALUATE KNOWLEDGE_BASE my_kb
USING
    ...
    llm = {
        "provider": "azure_openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01",
        "method": "multi-class"
    },
    ...
```

### `save_to`

This is an optional parameter that stores the name of the table from one of the data sources connected to MindsDB. For example, `save_to = my_datasource.my_result_table` defines a table named `my_result_table` from a data source named `my_datasource`. If not defined, the results are not saved into a table.

This table is used to save the evaluation results. By default, evaluation results are returned after executing the `EVALUATE KNOWLEDGE_BASE` statement.

### Evaluation Results

When using `version = 'doc_id'`, the following columns are included in the evaluation results:

* `total` stores the total number of questions.
* `total_found` stores the number of questions to which the knowledge base provided correct answers.
* `retrieved_in_top_10` stores the number of questions to which the knowledge base provided correct answers within the top 10 results.
* `cumulative_recall` stores data that can be used to create a chart.
* `avg_query_time` stores the execution time of a search query of the knowledge base.
* `name` stores the knowledge base name.
* `created_at` stores the timestamp when the evaluation was created.

When using `version = 'llm_relevancy'`, the following columns are included in the evaluation results:

* `avg_relevancy` stores the average relevancy.
* `avg_relevance_score_by_k` stores the average relevancy at k.
* `avg_first_relevant_position` stores the average first relevant position.
* `mean_mrr` stores the Mean Reciprocal Rank (MRR).
* `hit_at_k` stores the Hit\@k value.
* `bin_precision_at_k` stores the Binary Precision\@k.
* `avg_entropy` stores the average relevance score entropy.
* `avg_ndcg` stores the average nDCG.
* `avg_query_time` stores the execution time of a search query of the knowledge base.
* `name` stores the knowledge base name.
* `created_at` stores the timestamp when the evaluation was created.

# How to Use Knowledge Bases

Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/examples

This section contains examples of usage of knowledge bases.

### Sales Data

Here is the data that will be inserted into the knowledge base.

```sql theme={null}
+----------+-------------------+------------------------+
| order_id | product           | notes                  |
+----------+-------------------+------------------------+
| A1B      | Wireless Mouse    | Request color: black   |
| 3XZ      | Bluetooth Speaker | Gift wrap requested    |
| Q7P      | Laptop Stand      | Prefer aluminum finish |
+----------+-------------------+------------------------+
```

You can access this sample data as below:

```sql theme={null}
CREATE DATABASE sample_data
WITH ENGINE = 'postgres',
PARAMETERS = {
    "user": "demo_user",
    "password": "demo_password",
    "host": "samples.mindsdb.com",
    "port": "5432",
    "database": "demo",
    "schema": "demo_data"
};

SELECT *
FROM sample_data.orders;
```

Here is how to create a knowledge base specifically for this data.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    embedding_model = {
        "provider": "openai",
        "model_name" : "text-embedding-3-large",
        "api_key": "sk-abc123"
    },
    reranking_model = {
        "provider": "openai",
        "model_name": "gpt-4o",
        "api_key": "sk-abc123"
    },
    metadata_columns = ['product'],
    content_columns = ['notes'],
    id_column = 'order_id';
```

Here is how to insert the data.
```sql theme={null}
INSERT INTO my_kb
SELECT order_id, product, notes
FROM sample_data.orders;
```

Here is how to query the knowledge base.

```sql theme={null}
SELECT *
FROM my_kb
WHERE product = 'Wireless Mouse'
AND content = 'color'
AND relevance > 0.5;
```

### Financial Data

You can access the sample data as below:

```sql theme={null}
CREATE DATABASE sample_data
WITH ENGINE = 'postgres',
PARAMETERS = {
    "user": "demo_user",
    "password": "demo_password",
    "host": "samples.mindsdb.com",
    "port": "5432",
    "database": "demo",
    "schema": "demo_data"
};

SELECT *
FROM sample_data.financial_headlines;
```

Here is how to create a knowledge base specifically for this data.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    embedding_model = {
        "provider": "openai",
        "model_name" : "text-embedding-3-large",
        "api_key": "sk-xxx"
    },
    reranking_model = {
        "provider": "openai",
        "model_name": "gpt-4o",
        "api_key": "sk-xxx"
    },
    metadata_columns = ['sentiment_labelled'],
    content_columns = ['headline'];
```

Here is how to insert the data.

```sql theme={null}
INSERT INTO my_kb
SELECT *
FROM sample_data.financial_headlines
USING
    batch_size = 500,
    threads = 10;
```

Here is how to query the knowledge base.

* Query without a defined `LIMIT`

```sql theme={null}
SELECT *
FROM my_kb
WHERE content = 'investors';
```

This query returns 10 rows, as the default `LIMIT` is 10.

* Query with a defined `LIMIT`

```sql theme={null}
SELECT *
FROM my_kb
WHERE content = 'investors'
LIMIT 20;
```

This query returns 20 rows, as the user-defined `LIMIT` is set to 20.

* Query with a defined `LIMIT` and `relevance`

```sql theme={null}
SELECT *
FROM my_kb
WHERE content = 'investors'
AND relevance >= 0.8
LIMIT 20;
```

This query may return 20 or fewer rows, depending on whether the relevance scores of the rows match the user-defined condition. A combined example that also uses metadata filtering follows this list.

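Since `sentiment_labelled` was defined as a metadata column for this knowledge base, metadata filtering can be combined with semantic search and relevance filtering in a single query. This is a minimal sketch; the `'negative'` label value is an assumption about the sample `financial_headlines` data.

```sql theme={null}
-- Combine metadata filtering (sentiment_labelled), semantic search (content),
-- and relevance filtering in one query.
-- The 'negative' label value is an assumption about the sample data.
SELECT *
FROM my_kb
WHERE sentiment_labelled = 'negative'
AND content = 'investors'
AND relevance >= 0.8
LIMIT 20;
```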
# How to Hybrid Search Knowledge Bases

Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/hybrid_search

Knowledge bases support two primary search methods: [semantic search](/mindsdb_sql/knowledge_bases/query#semantic-search) and [metadata/keyword search](/mindsdb_sql/knowledge_bases/query#metadata-filtering). Each method has its strengths and ideal use cases.

Semantic similarity search uses vector embeddings to retrieve content that is semantically related to a given query. This is especially powerful when users are searching for concepts, ideas, or questions expressed in natural language.

However, semantic search may fall short when users are looking for specific keywords, such as acronyms, internal terminology, or custom identifiers. These types of terms are often not well-represented in the embedding model's training data. As a result, embedding-based semantic search might entirely miss results that do contain the exact keyword.

To address this gap, knowledge bases offer hybrid search, which combines the best of both worlds: semantic similarity and exact keyword matching. Hybrid search ensures that results relevant by meaning and results matching specific terms are both considered and ranked appropriately.

## Enabling Hybrid Search

To use hybrid search, you first need to [create a knowledge base](/mindsdb_sql/knowledge_bases/create) and [insert data into it](/mindsdb_sql/knowledge_bases/insert_data).

Hybrid search can be enabled at the time of querying the knowledge base by specifying the appropriate configuration options, as shown below.

```sql theme={null}
SELECT *
FROM my_kb
WHERE content = 'ACME-213'
AND hybrid_search_alpha = 0.8;
```

The `hybrid_search_alpha` parameter enables hybrid search functionality and allows you to control the balance between semantic and keyword relevance, with values ranging from 0 (more importance on keyword relevance) to 1 (more importance on semantic relevance) and a default value of 0.5.

Alternatively, you can use the `hybrid_search` parameter and set it to `true` in order to enable hybrid search with the default `hybrid_search_alpha = 0.5`.

Note that hybrid search works only on knowledge bases that use PGVector as a [storage](/mindsdb_sql/knowledge_bases/create#storage). Ensure to [install the PGVector handler to connect it to MindsDB](/integrations/vector-db-integrations/pgvector#usage).

Knowledge bases provide optional [reranking features](/mindsdb_sql/knowledge_bases/create#reranking-model) that users can decide to use in specific use cases. When the reranker is available, it is used to rerank results from both the full-text index search and the embedding-based semantic search. It estimates the relevance of each document and orders them from most to least relevant.

However, users can disable the reranker using `reranking = false`, which might be desirable for performance reasons or specific use cases. When reranking is disabled, the system still needs to combine the two search result sets. In this case, the final ranking of each document is computed as a weighted average of the embedding similarity score and the [BM25](https://en.wikipedia.org/wiki/Okapi_BM25) keyword relevance score from the full-text search. This weighted combination is the so-called alpha reranking.

**Relevance-Based Document Selection for Reranking**

When retrieving documents from the full-text index, there is a practical limit on how many documents can be passed to the reranker, since reranking is typically computationally expensive.
To ensure that only the most promising candidates are selected for reranking, we apply relevance heuristics during the keyword search stage. One widely used heuristic is BM25, a ranking function that scores documents based on their keyword relevance to the user query. BM25 considers both the frequency of a keyword within a document and how common that keyword is across the entire corpus.

By scoring documents using BM25, the system can prioritize more relevant matches and limit reranker input to a smaller, high-quality subset of documents. This helps achieve a balance between performance and retrieval accuracy in hybrid search.

## Implementation of Hybrid Search

Hybrid search in knowledge bases combines semantic similarity and keyword-based search methods into a unified search mechanism. The diagram below illustrates the hybrid search process.

When a user submits a query, it is simultaneously routed through two parallel search mechanisms: an embedding-based semantic search (left) and a full-text keyword search (right). Below is a breakdown of how hybrid search works under the hood:

* **Semantic Search** (path on the left)

  It takes place in parallel with the keyword search. Semantic search starts by embedding the search query and searching it against the content of the knowledge base. This results in a set of relevant documents.

* **Keyword Search** (path on the right)

  It takes place in parallel with the semantic search. The system performs a keyword-based search, using one or more keywords provided in the search query, over the content of the knowledge base.

  To ensure performance, especially at scale, when dealing with millions of documents, we rely on a full-text indexing system. This index is typically built as an inverted index, mapping keywords to the documents in which they appear. It allows for efficient lookups and rapid retrieval of all entries that contain the given terms.

  **Storage of Full-Text Index**

  Just as embeddings are stored to support semantic similarity search, a full-text index must also be stored to enable efficient keyword-based retrieval. This index serves as the foundation for fast and scalable full-text search and is tightly integrated with the knowledge base. Each knowledge base maintains its own dedicated full-text index, built and updated as documents are ingested or modified. Maintaining this index alongside the stored embeddings ensures that both semantic and keyword search capabilities are always available and performant, forming the backbone of hybrid search.

  This step ensures that exact matches, like specific acronyms, ticket numbers, or product identifiers, can be found quickly, even if the semantic model would not have surfaced them.

* **Combining Results**

  At this step, results from both searches are merged. Semantic search returned documents similar in meaning to the user's query using embeddings, while keyword search returned documents containing the keywords extracted from the user's query. This complete result set is passed to the reranker.

* **Reranking**

  The results are reranked, considering relevance scores from both search types, and ordered accordingly. There are two mechanisms for reranking the results:

  * Using the reranking model of the knowledge base

    If the knowledge base was created with a reranking model provided, the hybrid search uses it to rerank the result set.

    ```sql theme={null}
    SELECT *
    FROM my_kb
    WHERE content = 'ACME-213'
    AND hybrid_search = true; -- here, hybrid_search_alpha = 0.5
    ```

    In this query, the hybrid search uses the reranking features enabled with the knowledge base.

  * Using the alpha reranking that can be further customized for hybrid search

    Users can opt for the alpha reranking that can be customized specifically for hybrid search. By setting the `hybrid_search_alpha` parameter to any value between 0 and 1, users can give importance to results from the keyword search (if the value is closer to 0) or the semantic search (if the value is closer to 1).

    ```sql theme={null}
    SELECT *
    FROM my_kb
    WHERE content = 'ACME-213'
    AND hybrid_search_alpha = 0.4
    AND reranking = false;
    ```

    This query uses hybrid search with emphasis on results from the keyword search.
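To make the alpha weighting concrete, here is a conceptual sketch in plain SQL (runnable in, for example, PostgreSQL; not MindsDB syntax) of how the final score could be combined when `reranking = false`. The column names and score values are illustrative assumptions; the docs state only that the final ranking is a weighted average of the embedding similarity score and the BM25 keyword relevance score.

```sql theme={null}
-- Conceptual sketch (plain PostgreSQL, not MindsDB syntax) of alpha-weighted
-- score combination with hybrid_search_alpha = 0.4 and reranking = false.
-- Scores and column names are illustrative assumptions.
WITH results (id, semantic_score, bm25_score) AS (
    VALUES ('A1B', 0.91, 0.10),
           ('3XZ', 0.40, 0.95),
           ('Q7P', 0.55, 0.60)
)
SELECT id,
       0.4 * semantic_score + (1 - 0.4) * bm25_score AS combined_score
FROM results
ORDER BY combined_score DESC;
```

With alpha at 0.4, the keyword (BM25) score carries slightly more weight, so exact-match documents rank above purely semantic matches.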
Overall, the reranker ensures that highly relevant keyword matches appear alongside semantically similar results, offering users a balanced and accurate response.

# How to Insert Data into Knowledge Bases

Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/insert_data

Knowledge Bases (KBs) organize data from across data sources, including databases, files, documents, and webpages, enabling efficient search capabilities.

Here is what happens to data when it is inserted into the knowledge base.

Upon inserting data into the knowledge base, the data is split into chunks, transformed into embedding representations to enhance the search capabilities, and stored in a vector database.

## `INSERT INTO` Syntax

Here is the syntax for inserting data into a knowledge base:

```sql theme={null}
INSERT INTO my_kb
SELECT order_id, product, notes
FROM sample_data.orders;
```

Upon execution, it inserts data into a knowledge base, using the embedding model to embed it into vectors before inserting it into the underlying vector database.

The status of the `INSERT INTO` command is logged in the `information_schema.queries` table with the timestamp when it was run, and can be queried as follows:

```sql theme={null}
SELECT *
FROM information_schema.queries;
```

To speed up data insertion, you can use these performance optimization flags:

**Skip duplicate checking (kb\_no\_upsert)**

```sql theme={null}
INSERT INTO my_kb
SELECT * FROM table_name
USING kb_no_upsert = true;
```

This skips all duplicate checking and directly inserts data. Use it only when the knowledge base is empty (initial data load).

**Skip existing items (kb\_skip\_existing)**

```sql theme={null}
INSERT INTO my_kb
SELECT * FROM table_name
USING kb_skip_existing = true;
```

This checks for existing items and skips them entirely, including avoiding embedding calculation for existing content. It is more efficient than an upsert when you only want to insert new items.

**Handling duplicate data while inserting into the knowledge base**

Knowledge bases uniquely identify data rows using an ID column, which prevents inserting duplicate data, as described below.

* **Case 1: Inserting data into the knowledge base without the `id_column` defined.**

  When users do not define the `id_column` during the creation of a knowledge base, MindsDB generates the ID for each row using a hash of the content columns, as [explained here](/mindsdb_sql/knowledge_bases/create#id-column).

  **Example:**

  If two rows have exactly the same content in the content columns, their hash (and thus their generated ID) will be the same. Note that duplicate rows are skipped and not inserted. Since both rows in the below table have the same content, only one row will be inserted.

  | name  | age |
  | ----- | --- |
  | Alice | 25  |
  | Alice | 25  |

* **Case 2: Inserting data into the knowledge base with the `id_column` defined.**

  When users define the `id_column` during the creation of a knowledge base, the knowledge base uses that column's values as the row ID.

  **Example:**

  If the `id_column` has duplicate values, the knowledge base skips the duplicate row(s) during the insert. The second row in the below table has the same `id` as the first row, so only one of these rows is inserted.

  | id | name  | age |
  | -- | ----- | --- |
  | 1  | Alice | 25  |
  | 1  | Bob   | 30  |

**Best practice**

Ensure the `id_column` uniquely identifies each row to avoid unintentional data loss due to duplicate ID skipping.

**Performance optimization for duplicate handling**

For better performance when handling duplicates, you can use:

* `kb_skip_existing = true`: Checks for existing IDs and skips them completely (no embedding calculation, more efficient).
* `kb_no_upsert = true`: Skips duplicate checking entirely (fastest, use only for the initial load into an empty KB).

### Update Existing Data

In order to update existing data in the knowledge base, insert data with the ID of the row that you want to update and the updated content.

Here is an example of usage. A knowledge base stores the following data.
```sql theme={null}
+----------+-------------------+------------------------+
| order_id | product           | notes                  |
+----------+-------------------+------------------------+
| A1B      | Wireless Mouse    | Request color: black   |
| 3XZ      | Bluetooth Speaker | Gift wrap requested    |
| Q7P      | Laptop Stand      | Prefer aluminum finish |
+----------+-------------------+------------------------+
```

A user updated `Laptop Stand` to `Aluminum Laptop Stand`.

```sql theme={null}
+----------+-----------------------+------------------------+
| order_id | product               | notes                  |
+----------+-----------------------+------------------------+
| A1B      | Wireless Mouse        | Request color: black   |
| 3XZ      | Bluetooth Speaker     | Gift wrap requested    |
| Q7P      | Aluminum Laptop Stand | Prefer aluminum finish |
+----------+-----------------------+------------------------+
```

Go to the *Complete Example* section below to find out how to access this sample data.

Here is how to propagate this change into the knowledge base.

```sql theme={null}
INSERT INTO my_kb
SELECT order_id, product, notes
FROM sample_data.orders
WHERE order_id = 'Q7P';
```

The knowledge base matches the ID value to the existing one and updates the data if required.

### Insert Data using Partitions

In order to optimize the performance of data insertion into the knowledge base, users can set up partitions and threads to insert batches of data in parallel. This also enables tracking the progress of the data insertion process, including cancelling and resuming it if required.

Here is an example.

```sql theme={null}
INSERT INTO my_kb
SELECT order_id, product, notes
FROM sample_data.orders
USING
    batch_size = 200,
    track_column = order_id,
    threads = 10,
    error = 'skip';
```

The parameters include the following:

* `batch_size` defines the number of rows fetched per iteration to optimize data extraction from the source. It defaults to 1000.

* `threads` defines threads for running partitions. Note that if the [ML task queue](/setup/custom-config#overview-of-config-parameters) is enabled, threads are used automatically. The available values for `threads` are:
  * a number of threads to be used, for example, `threads = 10`,
  * a boolean value that defines whether to enable threads, setting `threads = true`, or disable threads, setting `threads = false`.

* `track_column` defines the column used for sorting data before partitioning.

* `error` defines the error processing options. The available values include `raise`, used to raise errors as they come, and `skip`, used to suppress errors. It defaults to `raise` if not provided.

After executing the `INSERT INTO` statement with the above parameters, users can view the data insertion progress by querying the `information_schema.queries` table.

```sql theme={null}
SELECT *
FROM information_schema.queries;
```

Users can cancel the data insertion process using the process ID from the `information_schema.queries` table.

```sql theme={null}
SELECT query_cancel(1);
```

Note that canceling the query will not remove the already inserted data.

Users can resume the data insertion process using the process ID from the `information_schema.queries` table.

```sql theme={null}
SELECT query_resume(1);
```

### Chunking Data

Upon inserting data into the knowledge base, data chunking is performed in order to optimize the storage and search of data. Each chunk is identified by a chunk ID that combines the source row ID, the chunk's number out of the total number of chunks, and the chunk's character range within the content.

#### Text

Users can opt for defining the chunking parameters when creating a knowledge base.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    preprocessing = {
        "text_chunking_config" : {
            "chunk_size": 2000,
            "chunk_overlap": 200
        }
    },
    ...;
```

The `chunk_size` parameter defines the size of the chunk as the number of characters. The `chunk_overlap` parameter defines the number of characters that should overlap between subsequent chunks.

#### JSON

Users can opt for defining the chunking parameters specifically for JSON data.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    preprocessing = {
        "type": "json_chunking",
        "json_chunking_config" : {
            ...
        }
    },
    ...;
```

When the `type` of chunking is set to `json_chunking`, users can configure it by setting the following parameter values in the `json_chunking_config` parameter:

* `flatten_nested`\
  It is of the `bool` data type with the default value of `True`.\
  It defines whether to flatten nested JSON structures.

* `include_metadata`\
  It is of the `bool` data type with the default value of `True`.\
  It defines whether to include original metadata in chunks.

* `chunk_by_object`\
  It is of the `bool` data type with the default value of `True`.\
  It defines whether to chunk by top-level objects (`True`) or create a single document (`False`).

* `exclude_fields`\
  It is of the `List[str]` data type with the default value of an empty list.\
  It defines the list of fields to exclude from chunking.

* `include_fields`\
  It is of the `List[str]` data type with the default value of an empty list.\
  It defines the list of fields to include in chunking (if empty, all fields except excluded ones are included).

* `metadata_fields`\
  It is of the `List[str]` data type with the default value of an empty list.\
  It defines the list of fields to extract into metadata for filtering (can include nested fields using dot notation). If empty, all primitive fields will be extracted (top-level fields if available, otherwise all primitive fields in the flattened structure).

* `extract_all_primitives`\
  It is of the `bool` data type with the default value of `False`.\
  It defines whether to extract all primitive values (strings, numbers, booleans) into metadata.

* `nested_delimiter`\
  It is of the `str` data type with the default value of `"."`.\
  It defines the delimiter for flattened nested field names.

* `content_column`\
  It is of the `str` data type with the default value of `"content"`.\
  It defines the name of the content column for chunk ID generation.

### Underlying Vector Store

Each knowledge base has an underlying vector store that stores the data inserted into the knowledge base in the form of embeddings. Users can query the underlying vector store as follows.

* KB with the default ChromaDB vector store (named after the knowledge base, for example, `my_kb_chromadb` with its `default_collection` table):

```sql theme={null}
SELECT id, content, metadata, embeddings
FROM my_kb_chromadb.default_collection;
```

* KB with a user-defined vector store (either [PGVector](/integrations/vector-db-integrations/pgvector) or [ChromaDB](/integrations/vector-db-integrations/chromadb)):

```sql theme={null}
SELECT id, content, metadata, embeddings
FROM <vector_store_connection>.<storage_table>;
```

### Example

Here, data is inserted into the sample knowledge base created in the previous **Example** section.

```sql theme={null}
INSERT INTO my_kb
SELECT order_id, product, notes
FROM sample_data.orders;
```

When inserting into a knowledge base where the `content_columns` parameter was not specified, the column storing content must be aliased `AS content`, as below.

```sql theme={null}
CREATE KNOWLEDGE_BASE my_kb
USING
    ...
    id_column = 'order_id',
    ...
```

```sql theme={null}
INSERT INTO my_kb
SELECT order_id, notes AS content
FROM sample_data.orders;
```

## `DELETE FROM` Syntax

Here is the syntax for deleting from a knowledge base:

```sql theme={null}
DELETE FROM my_kb
WHERE id = 'A1B';
```

Upon execution, it identifies matching records based on the user-defined condition and removes all associated data (metadata, content, chunks, embeddings) for matching records from the KB's storage.

## `CREATE INDEX ON KNOWLEDGE_BASE` Syntax

Users can create an index on the knowledge base to speed up the search operations.

```sql theme={null}
CREATE INDEX ON KNOWLEDGE_BASE my_kb;
```

Note that this feature works only when PGVector is used as the [storage of the knowledge base](/mindsdb_sql/knowledge_bases/create#storage), as ChromaDB provides the index features by default.

Upon executing this statement, an index is created on the knowledge base's underlying vector store. This is essentially a database index created on the vector database.

Note that while an index improves the performance of querying the knowledge base, it may slow down subsequent insert operations. Therefore, it is recommended to insert bulk data into the knowledge base before creating the index.

# How Knowledge Bases Work

Source: https://docs.mindsdb.com/mindsdb_sql/knowledge_bases/overview

A knowledge base is an advanced AI-table that organizes information based on semantic meaning rather than simple keyword matching. It integrates embedding models, reranking models, and vector stores to enable context-aware data retrieval. By performing semantic reasoning across multiple data points, a knowledge base delivers deeper insights and more accurate responses, making it a powerful tool for intelligent data access.