AI Tables in MariaDB
Now, you can train machine learning models straight from the database by using MindsDB and MariaDB.
Prerequisite
You will need MindsDB version >= 2.0.0 and MariaDB installed:
Configuration
Default configuration
MindsDB will try to use the default configuration(hosts, ports, usernames) for each of the database integrations. If you want to extend that or you are using different parameters creata a new config.json file.
The avaiable configuration options are:
- api['http] -- This key is used for starting the MindsDB http server by providing:
- host(default 0.0.0.0.) - The mindsdb server address.
- port(default 47334) - The mindsdb server port.
- api['mysql'] -- This key is used for database integrations that works through MySQL protocol. The required keys are:
- user(default root).
- password(default empty).
- host(default localhost).
- port(default 47335).
- log -- The logging configuration:
- console_level - "INFO", "DEBUG", "ERROR".
- file - Location of the log file.
- file_level - "INFO", "DEBUG", "ERROR".
- folder logs - Directory of log files.
- format - Format of log message e.g "%(asctime)s - %(levelname)s - %(message)s".
- integrations -- This key specifies the integration type in this case
default_mariadb
. The required keys are:- user(default root) - The MariaDB user name.
- host(default localhost) - Connect to the MariaDB server on the given host.
- password - The password of the MariaDB account.
- type - Integration type(mariadb, postgresql, mysql, clickhouse).
- port(default 3306) - The TCP/IP port number to use for the connection.
- interface -- This key is used by MindsDB and provides the path to the directory where MindsDB shall save configuration and model files:
- datastore
- enabled(default false) - If not provided MindsDB will use default storage inside /var.
- storage_dir - Path to the storage directory for datastore.
- mindsdb_native
- enabled - If not provided mindsdb_native will use default storage inside /var.
- storage_dir - Path to the storage directory for datastore.
- datastore
Configuration example
{
"api": {
"http": {
"host": "0.0.0.0",
"port": "47334"
},
"mysql": {
"host": "127.0.0.1",
"password": "",
"port": "47335",
"user": "root"
}
},
"config_version": "1.3",
"debug": true,
"integrations": {
"default_mariadb": {
"enabled": true,
"host": "localhost",
"password": "password",
"port": 3306,
"type": "mariadb",
"user": "root"
}
},
"log": {
"level": {
"console": "DEBUG",
"file": "INFO"
}
},
"storage_dir": "/storage"
}
Install CONNECT Storage Engine
Also you need to install the CONNECT Storage Engine to access external local data. Checkout MariaDB docs on how to do that.
Start MindsDB
To start mindsdb run following command:
python3 -m mindsdb --api=mysql --config=config.json
Train new model
To train a new model, insert a new record inside the mindsdb.predictors table as:
INSERT INTO
mindsdb.predictors(name, predict, select_data_query, training_options)
VALUES
('used_cars_model', 'price', 'SELECT * FROM test.UsedCarsData', "option,value");
- name (string) -- The name of the predictor.
- predict (string) -- The feature you want to predict, in this example price. To predict multiple featurs include a comma separated string e.g 'price,year'.
- select_data_query (string) -- The SELECT query that will ingest the data to train the model.
- training_options (JSON as comma separated string) -- optional value that contains additional training parameters. For a full list of the parameters check the PredictorInterface.
Query the model
To query the model and get the predictions SELECT the target variable, confidence and explanation for that prediction.
SELECT
price AS predicted,
price_confidence AS confidence,
price_explain AS info
FROM
mindsdb.used_cars_model
WHERE
model = "A6"
AND mileage = 36203
AND transmission = "Automatic"
AND fuelType = "Diesel"
AND mpg = "64.2"
AND engineSize = 2
AND year = 2016
AND tax = 20;
price | predicted | info |
---|---|---|
13111 | 0.9921 | Check JSON bellow |
{
"predicted_value": 13111,
"confidence": 0.9921,
"prediction_quality": "very confident",
"confidence_interval": [10792, 32749],
"important_missing_information": [],
"confidence_composition": {
"Model": 0.009,
"year": 0.013
},
"extra_insights": {
"if_missing": [{
"Model": 12962
}, {
"year": 12137
}, {
"transmission": 2136
}, {
"mileage": 22706
}, {
"fuelType": 7134
}, {
"tax": 13210
}, {
"mpg": 27409
}, {
"engineSize": 13111
}]
}
}
Delete the model
To delete the predictor that you have previously created, you need to delete it from mindsdb.predictors
table. The name should be equal to name added in the INSERT statment while creating the predictor, e.g:
DELETE FROM mindsdb.predictors WHERE name='used_cars_model'
Train and predict multiple features
You can train a model that will predict multiple features by adding a comma separated features values in the predict column. e.g to predict the price
and a year
:
INSERT INTO
mindsdb.predictors(name, predict, select_data_query, training_options)
VALUES
('used_cars_model', 'price,year', 'SELECT * FROM test.UsedCarsData', "option,value"});
select_data_query
:
SELECT
price AS predicted,
FROM
mindsdb.used_cars_model
WHERE
select_data_query='SELECT year FROM price_data';
The requirements to query with select_data_query
are:
- It must be a valid SQL statement
- It must return columns with names the same as predictor fields.
If you want to follow along with a tutorial check out AI Tables in MariaDB tutorial.