# Bring Your Own Model

The Bring Your Own Model (BYOM) feature lets you upload your own models in the form of Python code and use them within MindsDB.

## How It Works

You can upload your custom model via the MindsDB editor by clicking `Add` and then `Upload custom model`:

<p align="center">
  <img src="https://mintcdn.com/mindsdb/PepdSPcGoBKUq1N5/assets/byom_upload_custom_model.png?fit=max&auto=format&n=PepdSPcGoBKUq1N5&q=85&s=2cd24a57cd41e06b55499c540b879966" width="1784" height="894" data-path="assets/byom_upload_custom_model.png" />
</p>

Fill out the following form to bring your model to MindsDB:

<p align="center">
  <img src="https://mintcdn.com/mindsdb/PepdSPcGoBKUq1N5/assets/byom_empty_form.png?fit=max&auto=format&n=PepdSPcGoBKUq1N5&q=85&s=c50b62c350c9bf9ef584a48632497e3e" width="3204" height="1818" data-path="assets/byom_empty_form.png" />
</p>

Let's briefly go over the files that need to be uploaded:

* The Python file stores the implementation of your model. It should define a class that implements the `train` and `predict` methods. Here is the sample format:

  ```py theme={null}
  class CustomPredictor():

      def train(self, df, target_col, args=None):
          # <implementation goes here>
          return ''

      def predict(self, df):
          # <implementation goes here>
          return df
  ```

<Accordion title="Example">
  ```py theme={null}
  import pandas as pd

  from sklearn.cross_decomposition import PLSRegression
  from sklearn import preprocessing

  class CustomPredictor():

      def train(self, df, target_col, args=None):
          # `args` is not used in this example.

          self.target_col = target_col
          y = df[self.target_col]
          x = df.drop(columns=self.target_col)
          x_cols = list(x.columns)

          x_scaler = preprocessing.StandardScaler().fit(x)
          y_scaler = preprocessing.StandardScaler().fit(y.values.reshape(-1, 1))

          xs = x_scaler.transform(x)
          ys = y_scaler.transform(y.values.reshape(-1, 1))

          pls = PLSRegression(n_components=1)
          pls.fit(xs, ys)

          self.pls = pls
          self.y_scaler = y_scaler

          # Latent scores and loadings of the fitted PLS model.
          T = pls.x_scores_
          P = pls.x_loadings_

          self.x_cols = x_cols
          self.x_scaler = x_scaler
          self.P = P

          def calc_limit(df):
              # Per-target summary of each column: mean, plus lower/upper limits
              # at three standard deviations, clipped to the observed min/max.
              frames = []
              for column in df.columns:
                  if column == self.target_col:
                      continue
                  tbl = df.groupby(self.target_col).agg({column: ['mean', 'min', 'max', 'std']})
                  tbl.columns = tbl.columns.get_level_values(1)
                  tbl['name'] = column
                  tbl['std'] = tbl['std'].fillna(0)
                  tbl['lower'] = tbl['mean'] - 3 * tbl['std']
                  tbl['upper'] = tbl['mean'] + 3 * tbl['std']
                  tbl['lower'] = tbl[["lower", "min"]].max(axis=1)  # lower >= min
                  tbl['upper'] = tbl[["upper", "max"]].min(axis=1)  # upper <= max
                  frames.append(tbl[['name', 'lower', 'mean', 'upper']])
              return pd.concat(frames) if frames else None

          trdf = pd.DataFrame()
          trdf[self.target_col] = y.values
          trdf['T1'] = T.squeeze()
          limit = calc_limit(trdf).reset_index()

          self.limit = limit

          return "Trained predictor ready to be stored"

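      def predict(self, df):
          # Scale the feature columns the model was trained on.
          xt = self.x_scaler.transform(df[self.x_cols])

          # Carry over any input columns that are not model features.
          excess_cols = list(set(df.columns) - set(self.x_cols))
          pred_df = df[excess_cols].copy()
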
          ys_pred = self.pls.predict(xt)
          y_pred = self.y_scaler.inverse_transform(ys_pred).ravel()
          pred_df[self.target_col] = y_pred

          pred_df['T1'] = (xt @ self.P).squeeze()
          return pred_df
  ```
</Accordion>
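
Before uploading, it can help to check locally that your class has the shape described above. The following is a minimal sketch, not MindsDB's own validation: `check_byom_class` is a hypothetical helper, and it assumes the parameter names from the sample format (`df`, `target_col`) are the ones your class should use.

```python
import inspect


def check_byom_class(cls):
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = []
    # Leading parameters taken from the sample format in this document.
    expected = {
        "train": ["self", "df", "target_col"],
        "predict": ["self", "df"],
    }
    for name, required in expected.items():
        method = getattr(cls, name, None)
        if not callable(method):
            problems.append(f"missing method: {name}")
            continue
        params = list(inspect.signature(method).parameters)
        if params[:len(required)] != required:
            problems.append(
                f"{name} should start with parameters {required}, got {params}"
            )
    return problems


class CustomPredictor:
    def train(self, df, target_col, args=None):
        return "trained"

    def predict(self, df):
        return df


class Broken:
    # `train` lacks `target_col`, and `predict` is missing entirely.
    def train(self, df):
        return ""


print(check_byom_class(CustomPredictor))
print(check_byom_class(Broken))
```

This only inspects method names and signatures. It does not run `train` or `predict`, so a class that passes the check can still fail at training time.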

* The optional requirements file, or `requirements.txt`, stores all dependencies along with their versions. Here is the sample format:

  ```text theme={null}
  dependency_package_1 == version
  dependency_package_2 >= version
  dependency_package_3 >= version, < version
  ...
  ```

<Accordion title="Example">
  ```text theme={null}
  pandas
  scikit-learn
  ```
</Accordion>
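
If you want the requirements file to pin the versions you developed against, one possible approach is to read them from your current Python environment. This is a sketch under that assumption; `pin_requirements` is a hypothetical helper and not part of MindsDB.

```python
from importlib import metadata


def pin_requirements(packages):
    """Return requirements lines, pinned to the installed version where one is found."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name} == {metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Not installed in this environment: leave the dependency unpinned.
            lines.append(name)
    return lines


# Package names taken from the example requirements file above.
for line in pin_requirements(["pandas", "scikit-learn"]):
    print(line)
```

The printed lines can be saved as `requirements.txt`. Packages that are not installed locally are listed without a version, matching the unpinned example above.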

Once you have uploaded the files above, provide an engine name.

MindsDB registers your custom model as an ML engine. You can then use this engine to create a model.

<p align="center">
  <img src="https://mintcdn.com/mindsdb/tkxKy44mj_2VlYcf/assets/byom_diagram.png?fit=max&auto=format&n=tkxKy44mj_2VlYcf&q=85&s=12e1e8bfac5f8a666f2456adcbddfc85" width="1501" height="232" data-path="assets/byom_diagram.png" />
</p>

## Configuration

The BYOM feature can be configured with the following environment variables:

* `MINDSDB_BYOM_ENABLED`

  This environment variable defines whether the BYOM feature is enabled (`MINDSDB_BYOM_ENABLED=true`) or disabled (`MINDSDB_BYOM_ENABLED=false`). Note that by default, it is disabled.

  Alternatively, you can enable it in the MindsDB configuration file:

  ```json theme={null}
  {
    "byom": {
      "enabled": true
    }
  }
  ```

* `MINDSDB_BYOM_DEFAULT_TYPE`

  This environment variable defines the default mode of operation of the BYOM feature.

  * `MINDSDB_BYOM_DEFAULT_TYPE=venv`<br />
    When using the `venv` mode, MindsDB creates a virtual environment and installs the packages listed in the `requirements.txt` file into it. This virtual environment is dedicated to the custom model. It is the default mode when running MindsDB locally.

  * `MINDSDB_BYOM_DEFAULT_TYPE=inhouse`<br />
    When using the `inhouse` mode, there is no dedicated virtual environment for the custom model. The model runs in MindsDB's own environment, so the `requirements.txt` file is not used in this mode.

* `MINDSDB_BYOM_INHOUSE_ENABLED`

  This environment variable defines whether the `inhouse` mode is enabled (`MINDSDB_BYOM_INHOUSE_ENABLED=true`) or disabled (`MINDSDB_BYOM_INHOUSE_ENABLED=false`). Note that when running MindsDB locally, it is enabled by default.
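
The three variables above can be assembled in one place when you start MindsDB from a script. The following is a minimal sketch; `byom_env` is a hypothetical helper, and the only values it uses are the ones listed above (`true`/`false`, `venv`/`inhouse`).

```python
import os

VALID_TYPES = {"venv", "inhouse"}


def byom_env(enabled=True, default_type="venv", inhouse_enabled=False):
    """Build the BYOM environment variables described above as a dict of strings."""
    if default_type not in VALID_TYPES:
        raise ValueError(f"default_type must be one of {sorted(VALID_TYPES)}")
    return {
        "MINDSDB_BYOM_ENABLED": "true" if enabled else "false",
        "MINDSDB_BYOM_DEFAULT_TYPE": default_type,
        "MINDSDB_BYOM_INHOUSE_ENABLED": "true" if inhouse_enabled else "false",
    }


# Overlay the BYOM settings on the current environment.
launch_env = {**os.environ, **byom_env(enabled=True, default_type="venv")}
print({k: v for k, v in launch_env.items() if k.startswith("MINDSDB_BYOM_")})
```

The resulting `launch_env` could be passed as the `env=` argument of `subprocess.run` when launching MindsDB. How you start MindsDB itself is outside the scope of this sketch.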

## Example

We upload the custom model as shown below:

<p align="center">
  <img src="https://mintcdn.com/mindsdb/PepdSPcGoBKUq1N5/assets/byom_form.png?fit=max&auto=format&n=PepdSPcGoBKUq1N5&q=85&s=62543881b7deea0c7f26b57222b18ce2" width="3312" height="1816" data-path="assets/byom_form.png" />
</p>

Here we upload the `model.py` file that stores an implementation of the model and the `requirements.txt` file that stores all the dependencies.

Once the model is uploaded, it becomes an ML engine within MindsDB. Now we use this `custom_model_engine` to create a model as follows:

```sql theme={null}
CREATE MODEL custom_model
FROM my_integration
    (SELECT * FROM my_table)
PREDICT target
USING
    ENGINE = 'custom_model_engine';
```

Let's query for predictions by joining the custom model with the data table. When querying for predictions, do not select the target column from the `input` table.

```sql theme={null}
SELECT
  input.feature_column,
  model.target AS predicted_target
FROM my_integration.my_table AS input
JOIN custom_model AS model;
```
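
The same statements can also be sent from a script instead of the editor. The sketch below builds requests for MindsDB's HTTP SQL endpoint using only the Python standard library. It assumes a local instance reachable at `http://127.0.0.1:47334` with the `/api/sql/query` endpoint; adjust both if your setup differs. `build_sql_request` and `run_sql` are hypothetical helper names.

```python
import json
import urllib.request

# Assumption: a local MindsDB instance exposing its HTTP API at this address.
BASE_URL = "http://127.0.0.1:47334"

CREATE_MODEL = """
CREATE MODEL custom_model
FROM my_integration
    (SELECT * FROM my_table)
PREDICT target
USING
    ENGINE = 'custom_model_engine';
"""


def build_sql_request(sql, base_url=BASE_URL):
    """Build a POST request carrying one SQL statement as JSON."""
    payload = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/sql/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_sql(sql, base_url=BASE_URL):
    """Send the statement and return the decoded JSON response."""
    with urllib.request.urlopen(build_sql_request(sql, base_url)) as resp:
        return json.load(resp)


# Building the request does not contact the server.
request = build_sql_request(CREATE_MODEL)
print(request.get_method(), request.full_url)
```

Calling `run_sql(CREATE_MODEL)` sends the statement, so it needs a running MindsDB instance with the `custom_model_engine` engine already uploaded.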

<Info>
  Check out the [BYOM handler folder](https://github.com/mindsdb/mindsdb/tree/main/mindsdb/integrations/handlers/byom_handler) to see the implementation details.
</Info>
