Documentation Index
Fetch the complete documentation index at: https://docs.mindsdb.com/llms.txt
Use this file to discover all available pages before exploring further.
In this section, you’ll find how to add new integrations/databases to MindsDB.
PrerequisiteYou should have the latest version of the MindsDB repository installed locally. Follow this guide to learn how to install MindsDB for development.
What are Database Handlers?
Database handlers act as a bridge to any database. You use database handlers to create databases using the CREATE DATABASE command. So you can reach data from any database that has its handler implemented within MindsDB.
Creating a Database Handler
You can create your own database handler within MindsDB by inheriting from the DatabaseHandler class.
By providing the implementation for some or all of the methods contained in the DatabaseHandler class, you can connect with the database of your choice.
Core Methods
Apart from the __init__() method, there are seven core methods that must be implemented. We recommend checking actual examples in the codebase to get an idea of what goes into each of these methods, as they can change a bit depending on the nature of the system being integrated.
Let’s review the purpose of each method.
| Method | Purpose |
|---|
connect() | It performs the necessary steps to connect to the underlying system. |
disconnect() | It gracefully closes connections established in the connect() method. |
check_connection() | It evaluates if the connection is alive and healthy. This method is called frequently. |
native_query() | It parses any native statement string and acts upon it (for example, raw SQL commands). |
query() | It takes a parsed SQL command in the form of an abstract syntax tree and executes it. |
get_tables() | It lists and returns all the available tables. Each handler decides what a table means for the underlying system when interacting with it from the data layer. Typically, these are actual tables. |
get_columns() | It returns columns of a table registered in the handler with the respective data type. |
Authors can opt for adding private methods, new files and folders, or any combination of these to structure all the necessary work that will enable the core methods to work as intended.
Other Common MethodsUnder the mindsdb.integrations.libs.utils library, contributors can find various methods that may be useful while implementing new handlers.For response formatting, use the following classes from mindsdb.integrations.libs.response:The legacy HandlerResponse class is deprecated. Use TableResponse, OkResponse, or ErrorResponse instead.
Implementation
Each database handler should inherit from the DatabaseHandler class.
Here is a step-by-step guide:
-
Setting the
name class property:
MindsDB uses it internally as the name of the handler.
For example, the CREATE DATABASE statement uses the handler’s name.
CREATE DATABASE integration_name
WITH ENGINE = 'postgres', --- here, the handler's name is `postgres`
PARAMETERS = {
'host': '127.0.0.1',
'user': 'root',
'password': 'password'
};
-
Implementing the
__init__() method:
This method initializes the handler. The connection_data argument contains the PARAMETERS from the CREATE DATABASE statement, such as user, password, etc.
def __init__(self, name: str, connection_data: Optional[dict]):
""" constructor
Args:
name (str): the handler name
"""
-
Implementing the
connect() method:
The connect() method sets up the connection.
def connect(self) -> HandlerStatusResponse:
""" Set up any connections required by the handler
Should return the output of check_connection() method after attempting
connection. Should switch self.is_connected.
Returns:
HandlerStatusResponse
"""
-
Implementing the
disconnect() method:
The disconnect() method closes the existing connection.
def disconnect(self):
""" Close any existing connections
Should switch self.is_connected.
"""
-
Implementing the
check_connection() method:
The check_connection() method performs the health check for the connection.
def check_connection(self) -> HandlerStatusResponse:
""" Check connection to the handler
Returns:
HandlerStatusResponse
"""
-
Implementing the
native_query() method:
The native_query() method runs commands of the native database language.
def native_query(self, query: Any) -> TableResponse | OkResponse | ErrorResponse:
"""Receive raw query and act upon it somehow.
Args:
query (Any): query in native format (str for sql databases,
etc)
Returns:
TableResponse | OkResponse | ErrorResponse
"""
-
Implementing the
query() method:
The query method runs parsed SQL commands.
def query(self, query: ASTNode) -> TableResponse | OkResponse | ErrorResponse:
"""Receive query as AST (abstract syntax tree) and act upon it somehow.
Args:
query (ASTNode): sql query represented as AST. May be any kind
of query: SELECT, INSERT, DELETE, etc
Returns:
TableResponse | OkResponse | ErrorResponse
"""
-
Implementing the
get_tables() method:
The get_tables() method lists all the available tables.
def get_tables(self) -> TableResponse | ErrorResponse:
""" Return list of entities
Return a list of entities that will be accessible as tables.
Returns:
TableResponse | ErrorResponse: should have the same columns as information_schema.tables
(https://dev.mysql.com/doc/refman/8.0/en/information-schema-tables-table.html)
Column 'TABLE_NAME' is mandatory, other is optional.
"""
-
Implementing the
get_columns() method:
The get_columns() method lists all columns of a specified table.
def get_columns(self, table_name: str) -> TableResponse | ErrorResponse:
""" Returns a list of entity columns
Args:
table_name (str): name of one of tables returned by self.get_tables()
Returns:
TableResponse | ErrorResponse: data should have the same columns as information_schema.columns
(https://dev.mysql.com/doc/refman/8.0/en/information-schema-columns-table.html)
Column 'COLUMN_NAME' is mandatory, other is optional. Highly
recommended to define also 'DATA_TYPE': it should be one of
python data types (by default it is str).
"""
Response Classes
The data-returning methods (native_query(), query(), get_tables(), get_columns()) should return one of the following response classes from mindsdb.integrations.libs.response:
| Response Class | Use Case | Key Attributes |
|---|
TableResponse | Queries that return data (SELECT, SHOW, etc.) | data, data_generator, columns, affected_rows |
OkResponse | Successful operations without data (CREATE, DROP, INSERT, UPDATE, DELETE) | affected_rows |
ErrorResponse | Error cases | error_code, error_message, is_expected_error |
TableResponse
TableResponse is used when returning data from queries. It supports two modes of data delivery:
- Immediate data: Pass all data at once via the
data parameter (pandas DataFrame)
- Streaming data: Pass a generator via the
data_generator parameter for lazy loading
from mindsdb.integrations.libs.response import TableResponse, OkResponse, ErrorResponse
# Immediate data response
def native_query(self, query: str) -> TableResponse:
result = self.execute_query(query)
df = pd.DataFrame(result)
return TableResponse(data=df)
# Streaming data response (for large datasets)
def native_query(self, query: str) -> TableResponse:
def data_generator():
cursor = self.connection.cursor()
cursor.execute(query)
while batch := cursor.fetchmany(size=1000):
yield pd.DataFrame(batch)
return TableResponse(data_generator=data_generator())
OkResponse
OkResponse is used for operations that don’t return data:
def native_query(self, query: str) -> OkResponse:
cursor = self.connection.cursor()
cursor.execute(query)
self.connection.commit()
return OkResponse(affected_rows=cursor.rowcount)
ErrorResponse
ErrorResponse is used to report errors:
def native_query(self, query: str) -> ErrorResponse:
try:
# ... execute query
except DatabaseError as e:
return ErrorResponse(
error_code=e.code,
error_message=str(e),
is_expected_error=True # Set to True for user errors (syntax, permissions, etc.)
)
Streaming Support
For handlers that deal with large datasets, implementing streaming support is recommended. This allows data to be returned in chunks rather than loading everything into memory at once.
To enable streaming:
-
Set the
stream_response class attribute to True:
class MyDatabaseHandler(DatabaseHandler):
name = "mydatabase"
stream_response = True # Indicates that handler can return data as a generator
-
Implement
native_query() to return a TableResponse with a data_generator:
def native_query(self, query: str, stream: bool = True) -> TableResponse | OkResponse | ErrorResponse:
if stream:
return self._execute_streaming(query)
else:
return self._execute_immediate(query)
def _execute_streaming(self, query: str) -> TableResponse:
"""Execute query and return results as a stream."""
cursor = self.connection.cursor(name="server_side_cursor")
cursor.execute(query)
columns = [Column(name=col.name, type=col.type) for col in cursor.description]
def generate_data():
while batch := cursor.fetchmany(size=1000):
yield pd.DataFrame(batch, columns=[c.name for c in columns])
return TableResponse(columns=columns, data_generator=generate_data())
Exporting the connection_args Dictionary
The connection_args dictionary contains all of the arguments used to establish the connection along with their descriptions, types, labels, and whether they are required or not.
The connection_args dictionary should be stored in the connection_args.py file inside the handler folder.
The connection_args dictionary is stored in a separate file in order to be able to hide sensitive information such as passwords or API keys.By default, when querying for connection_data from the information_schema.databases table, all sensitive information is hidden. To unhide it, use this command:
Here is an example of the connection_args.py file from the MySQL handler where the password value is set to hidden with 'secret': True.
from collections import OrderedDict
from mindsdb.integrations.libs.const import HANDLER_CONNECTION_ARG_TYPE as ARG_TYPE
connection_args = OrderedDict(
url={
'type': ARG_TYPE.STR,
'description': 'The URI-Like connection string to the MySQL server. If provided, it will override the other connection arguments.',
'required': False,
'label': 'URL'
},
user={
'type': ARG_TYPE.STR,
'description': 'The user name used to authenticate with the MySQL server.',
'required': True,
'label': 'User'
},
password={
'type': ARG_TYPE.PWD,
'description': 'The password to authenticate the user with the MySQL server.',
'required': True,
'label': 'Password',
'secret': True
},
database={
'type': ARG_TYPE.STR,
'description': 'The database name to use when connecting with the MySQL server.',
'required': True,
'label': 'Database'
},
host={
'type': ARG_TYPE.STR,
'description': 'The host name or IP address of the MySQL server. NOTE: use \'127.0.0.1\' instead of \'localhost\' to connect to local server.',
'required': True,
'label': 'Host'
},
port={
'type': ARG_TYPE.INT,
'description': 'The TCP/IP port of the MySQL server. Must be an integer.',
'required': True,
'label': 'Port'
},
ssl={
'type': ARG_TYPE.BOOL,
'description': 'Set it to True to enable ssl.',
'required': False,
'label': 'ssl'
},
ssl_ca={
'type': ARG_TYPE.PATH,
'description': 'Path or URL of the Certificate Authority (CA) certificate file',
'required': False,
'label': 'ssl_ca'
},
ssl_cert={
'type': ARG_TYPE.PATH,
'description': 'Path name or URL of the server public key certificate file',
'required': False,
'label': 'ssl_cert'
},
ssl_key={
'type': ARG_TYPE.PATH,
'description': 'The path name or URL of the server private key file',
'required': False,
'label': 'ssl_key',
}
)
connection_args_example = OrderedDict(
host='127.0.0.1',
port=3306,
user='root',
password='password',
database='database'
)
Exporting All Required Variables
The following should be exported in the __init__.py file of the handler:
- The
Handler class.
- The
version of the handler.
- The
name of the handler.
- The
type of the handler, either DATA handler or ML handler.
- The
icon_path to the file with the database icon.
- The
title of the handler or a short description.
- The
description of the handler.
- The
connection_args dictionary with the connection arguments.
- The
connection_args_example dictionary with an example of the connection arguments.
- The
import_error message that is used if the import of the Handler class fails.
A few of these variables are defined in another file called __about__.py. This file is imported into the __init__.py file.
Here is an example of the __init__.py file for the MySQL handler.
from mindsdb.integrations.libs.const import HANDLER_TYPE
from .__about__ import __version__ as version, __description__ as description
from .connection_args import connection_args, connection_args_example
try:
from .mysql_handler import (
MySQLHandler as Handler,
connection_args_example,
connection_args
)
import_error = None
except Exception as e:
Handler = None
import_error = e
title = 'MySQL'
name = 'mysql'
type = HANDLER_TYPE.DATA
icon_path = 'icon.svg'
__all__ = [
'Handler', 'version', 'name', 'type', 'title', 'description',
'connection_args', 'connection_args_example', 'import_error', 'icon_path'
]
The __about__.py file for the same MySQL handler contains the following variables:
__title__ = 'MindsDB MySQL handler'
__package_name__ = 'mindsdb_mysql_handler'
__version__ = '0.0.1'
__description__ = "MindsDB handler for MySQL"
__author__ = 'MindsDB Inc'
__github__ = 'https://github.com/mindsdb/mindsdb'
__pypi__ = 'https://pypi.org/project/mindsdb/'
__license__ = 'MIT'
__copyright__ = 'Copyright 2022- mindsdb'
Exporting Requirements
In the case if the integration requires other packages to function correctly, list them in the requirements.txt file.
Create a text file named requirements.txt that stores all packages required for using the integration. Here is an example:
mysql-connector-python==9.1.0
...
Check out our Database Handlers!
To see some integration handlers that are currently in use, we encourage you to check out the following handlers inside the MindsDB repository:
And here are all the handlers available in the MindsDB repository.