Skip to main content
The Bring Your Own Model (BYOM) feature lets you upload your own models in the form of Python code and use them within MindsDB.

How It Works

You can upload your custom model via the MindsDB editor by clicking Add and Upload custom model, like this:

Here is the form that needs to be filled out in order to bring your model to MindsDB:

Let’s briefly go over the files that need to be uploaded:
  • The Python file stores an implementation of your model. It should contain the class with the implementation for the train and predict methods. Here is the sample format:
    class CustomPredictor():
    
        def train(self, df, target_col, args=None):
                <implementation goes here>
                return ''
    
        def predict(self, df):
                <implementation goes here>
                return df
    
import os
import pandas as pd

from sklearn.cross_decomposition import PLSRegression
from sklearn import preprocessing

class CustomPredictor():

    def train(self, df, target_col, args=None):
        print(args, '1111')

        self.target_col = target_col
        y = df[self.target_col]
        x = df.drop(columns=self.target_col)
        x_cols = list(x.columns)

        x_scaler = preprocessing.StandardScaler().fit(x)
        y_scaler = preprocessing.StandardScaler().fit(y.values.reshape(-1, 1))

        xs = x_scaler.transform(x)
        ys = y_scaler.transform(y.values.reshape(-1, 1))

        pls = PLSRegression(n_components=1)
        pls.fit(xs, ys)

        T = pls.x_scores_
        W = pls.x_weights_
        P = pls.x_loadings_
        R = pls.x_rotations_

        self.x_cols = x_cols
        self.x_scaler = x_scaler
        self.P = P

        def calc_limit(df):
            res = None
            for column in df.columns:
                if column == self.target_col: continue
                tbl = df.groupby(self.target_col).agg({column: ['mean', 'min', 'max', 'std']})
                tbl.columns = tbl.columns.get_level_values(1)
                tbl['name'] = column
                tbl['std'] = tbl['std'].fillna(0)
                tbl['lower'] = tbl['mean'] - 3 * tbl['std']
                tbl['upper'] = tbl['mean'] + 3 * tbl['std']
                tbl['lower'] = tbl[["lower", "min"]].max(axis=1)  # lower >= min
                tbl['upper'] = tbl[["upper", "max"]].min(axis=1)  # upper <= max
                tbl = tbl[['name', 'lower', 'mean', 'upper']]
                try:
                    res = pd.concat([res, tbl])
                except:
                    res = tbl
            return res

        trdf = pd.DataFrame()
        trdf[self.target_col] = y.values
        trdf['T1'] = T.squeeze()
        limit = calc_limit(trdf).reset_index()

        self.limit = limit

        return "Trained predictor ready to be stored"

    def predict(self, df):

        yt = df[self.target_col].values
        xt = df[self.x_cols]

        xt = self.x_scaler.transform(xt)

        excess_cols = list(set(df.columns) - set(self.x_cols))

        pred_df = df[excess_cols].copy()

        pred_df[self.target_col] = yt
        pred_df['T1'] = (xt @ self.P).squeeze()

        pred_df = pd.merge(pred_df, self.limit[[self.target_col, 'lower', 'upper']], how='left', on=self.target_col)

        return pred_df
  • The optional requirements file, or requirements.txt, stores all dependencies along with their versions. Here is the sample format:
    dependency_package_1 == version
    dependency_package_2 >= version
    dependency_package_3 >= version, < version
    ...
    
pandas
scikit-learn
Once you upload the above files, please provide an engine name. Please note that your custom model is uploaded to MindsDB as an engine. Then you can use this engine to create a model.

Configuration

The BYOM feature can be configured with the following environment variables:
  • MINDSDB_BYOM_ENABLED This environment variable defines whether the BYOM feature is enabled (MINDSDB_BYOM_ENABLED=true) or disabled (MINDSDB_BYOM_ENABLED=false). Note that when running MindsDB locally, it is enabled by default.
  • MINDSDB_BYOM_DEFAULT_TYPE This environment variable defines the modes of operation of the BYOM feature.
    • MINDSDB_BYOM_DEFAULT_TYPE=venv
      When using the venv mode, MindsDB creates a virtual environment and installs in it the packages listed in the requirements.txt file. This virtual environment is dedicated for the custom model. Note that when running MindsDB locally, it is the default mode.
    • MINDSDB_BYOM_DEFAULT_TYPE=inhouse
      When using the inhouse mode, there is no dedicated virtual environment for the custom model. It uses the environment of MindsDB, therefore, the requirements.txt file is not used with this mode.
  • MINDSDB_BYOM_INHOUSE_ENABLED This environment variable defines whether the inhouse mode is enabled (MINDSDB_BYOM_INHOUSE_ENABLED=true) or disabled (MINDSDB_BYOM_INHOUSE_ENABLED=false). Note that when running MindsDB locally, it is enabled by default.

Example

We upload the custom model, as below:

Here we upload the model.py file that stores an implementation of the model and the requirements.txt file that stores all the dependencies. Once the model is uploaded, it becomes an ML engine within MindsDB. Now we use this custom_model_engine to create a model as follows:
CREATE MODEL custom_model
FROM my_integration
    (SELECT * FROM my_table)
PREDICT target
USING
    ENGINE = 'custom_model_engine';
Let’s query for predictions by joining the custom model with the data table.
SELECT input.feature_column, model_target_column
FROM my_integration.my_table as input
JOIN custom_model as model;
Check out the BYOM handler folder to see the implementation details.
I