> ## Documentation Index
> Fetch the complete documentation index at: https://docs.mindsdb.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Extend the Default MindsDB Configuration

To follow this guide, install MindsDB locally via [Docker](/setup/self-hosted/docker-desktop) or [PyPI](/setup/self-hosted/pip/source).

## Starting MindsDB with Extended Configuration

Start MindsDB locally with your custom configuration by providing a path to the `config.json` file that stores custom config parameters listed in this section.

<CodeGroup>
  ```bash Docker theme={null}
  docker run --name mindsdb_container -e MINDSDB_CONFIG_PATH=/Users/username/path/config.json -e MINDSDB_APIS=http,mysql -p 47334:47334 -p 47335:47335 mindsdb/mindsdb
  ```

  ```bash Python theme={null}
  python -m mindsdb --api=http,mysql --config=/path-to-the-extended-config-file/config.json
  ```
</CodeGroup>

### Available Config Parameters

Below are all of the custom configuration parameters that should be set according to your requirements and saved into the `config.json` file.

#### `permanent_storage`

```bash theme={null}
{
    "permanent_storage": {
        "location": "absent",
        "bucket": "s3_bucket_name" # optional, used only if "location": "s3"
    },
```

The `permanent_storage` parameter defines where MindsDB stores copies of user files, such as uploaded files, models, and tab content. MindsDB checks the `permanent_storage` location to access the latest version of a file and updates it as needed.

The `location` specifies the storage type.

* `absent` (default): Disables permanent storage and is recommended to use when MindsDB is running locally.
* `local`: Stores files in a local directory defined with `config['paths']['storage']`.
* `s3`: Stores files in an Amazon S3 bucket. This option requires the `bucket` parameter that specifies the name of the S3 bucket where files will be stored.

If this parameter is not set, the path is determined by the `MINDSDB_STORAGE_DIR` environment variable. MindsDB defaults to creating a `mindsdb` folder in the operating system user's home directory.

#### `paths`

```bash theme={null}
    "paths": {
        "root": "/home/mindsdb/var", # optional (alternatively, it can be defined in the MINDSDB_STORAGE_DIR environment variable)
        "content": "/home/mindsdb/var/content", # optional
        "storage": "/home/mindsdb/var/storage", # optional
        "static": "/home/mindsdb/var/static", # optional
        "tmp": "/home/mindsdb/var/tmp", # optional
        "cache": "/home/mindsdb/var/cache", # optional
        "locks": "/home/mindsdb/var/locks", # optional
    },
```

The `paths` parameter allows users to redefine the file paths for various groups of MindsDB files. If only the `root` path is defined, all other folders will be created within that directory. If this parameter is absent, the value is determined by the `MINDSDB_STORAGE_DIR` environment variable.

The `root` parameter defines the base directory for storing all MindsDB files, including models, uploaded files, tab content, and the internal SQLite database (if running locally).

The `content` parameter specifies the directory where user-related files are stored, such as uploaded files, created models, and tab content. The internal SQLite database (if running locally) is stored in the `root` directory instead.

If the `['permanent_storage']['location']` is set to `'local'`, then the `storage` parameter is used to store copies of user files.

The `static` parameter is used to store files for the graphical user interface (GUI) when MindsDB is run locally.

The `tmp` parameter designates a directory for temporary files. Note that the operating system’s default temporary directory may also be used for some temporary files.

If the `['cache']['type']` is set to `'local'`, then the `cache` parameter defines the location for storing cached files for the most recent predictions. For example, if a model is queried with identical input, the result will be stored in the cache and returned directly on subsequent queries, instead of recalculating the prediction.

The `locks` parameter is used to store lock files to prevent race conditions when the `content` folder is shared among multiple applications. This directory helps ensure that file access is managed properly using `fcntl` locks. Note that this is not applicable for Windows OS.

#### `auth`

```bash theme={null}
    "auth":{
        "http_auth_type": "session" | "token"| "session_or_token",
        "http_auth_enabled": true,
        "username": "username",
        "password": "password"
    },
```

The `auth` parameter controls the authentication settings for APIs in MindsDB.

Users can define the authentication type by setting the `http_auth_type` parameter to one of the following values:

* `session_or_token` is the default value. When a user logs in to MindsDB, the session cookie is set and the token is returned in the response. To use the MindsDB API, users can utilize either one or both of these methods.

* `session` sets the session cookie when a user logs in. The session lifetime can be set with the `http_permanent_session_lifetime` parameter.

* `token` returns the token in the response, which is valid indefinitely.

The authentication type can also be set via the `MINDSDB_HTTP_AUTH_TYPE` environment variable with the same values as defined above.

If the `http_auth_enabled` parameter is set to `true`, then the `username` and `password` parameters are required. Otherwise these are optional.

#### `gui`

```bash theme={null}
    "gui": {
        "autoupdate": true,
        "open_on_start": true
    },
```

The `gui` parameter controls the behavior of the MindsDB graphical user interface (GUI) updates.

The `autoupdate` parameter defines whether MindsDB automatically checks for and updates the GUI to the latest version when the application starts. If set to `true`, MindsDB will attempt to fetch the latest available version of the GUI. If set to `False`, MindsDB will not try to update the GUI on startup.

The `open_on_start` parameter defines whether MindsDB automatically opens the GUI on start. If set to `true`, MindsDB will open the GUI automatically. If set to `False`, MindsDB will not open the GUI on startup.

#### `api`

```bash theme={null}
    "api": {
        "http": {
            "host": "127.0.0.1",
            "port": "47334",
            "restart_on_failure": true,
            "max_restart_count": 1,
            "max_restart_interval_seconds": 60,
            "a2wsgi": {
                "workers": 15,
                "send_queue_size": 10
            }
        },
        "mysql": {
            "host": "127.0.0.1",
            "port": "47335",
            "database": "mindsdb",
            "ssl": true,
            "restart_on_failure": true,
            "max_restart_count": 1,
            "max_restart_interval_seconds": 60
        },
    },
```

The `api` parameter contains the configuration settings for running MindsDB APIs.

Currently, the supported APIs are:

* `http`: Configures the HTTP API. It requires the `host` and `port` parameters. Alternatively, configure HTTP authentication for your MindsDB instance by setting the environment variables `MINDSDB_USERNAME` and `MINDSDB_PASSWORD` before starting MindsDB, which is a recommended way for the production systems.
* `mysql`: Configures the MySQL API. It requires the `host` and `port` parameters and additionally the `database` and `ssl` parameters.

<AccordionGroup>
  <Accordion title="HTTP API">
    Connection parameters for the HTTP API include:

    * `host`: Specifies the IP address or hostname where the API should run. For example, `"127.0.0.1"` indicates the API will run locally.

    * `port`: Defines the port number on which the API will listen for incoming requests. The default ports are `47334` for HTTP, and `47335` for MySQL.

    * `restart_on_failure`: If it is set to `true` (and `max_restart_count` is not reached), the restart of MindsDB will be attempted after the MindsDB process was killed - with code 9 on Linux and MacOS, or for any reason on Windows.

    * `max_restart_count`: This defines how many times the restart attempts can be made. Note that 0 stands for no limit.

    * `max_restart_interval_seconds`: This defines the time limit during which there can be no more than `max_restart_count` restart attempts. Note that 0 stands for no time limit, which means there would be a maximum of `max_restart_count` restart attempts allowed.

          <Note>
            Here is a usage example of the restart features:

            Assume the following values:

            * max\_restart\_count = 2
            * max\_restart\_interval\_seconds = 30 seconds

            Assume the following scenario:

            * MindsDB fails at 1000s of its work - the restart attempt succeeds as there were no restarts in the past 30 seconds.
            * MindsDB fails at 1010s of its work - the restart attempt succeeds as there was only 1 restart (at 1000s) in the past 30 seconds.
            * MindsDB fails at 1020s of its work - the restart attempt fails as there were already max\_restart\_count=2 restarts (at 1000s and 1010s) in the past 30 seconds.
            * MindsDB fails at 1031s of its work - the restart attempt succeeds as there was only 1 restart (at 1010s) in the past 30 seconds.
          </Note>

    * `a2wsgi` is an WSGI wrapper with the following parameters: `workers` defines the number of requests that can be processed in parallel, and `send_queue_size` defines the buffer size.
  </Accordion>

  <Accordion title="MySQL API">
    Connection parameters for the MySQL API include:

    * `host`: Specifies the IP address or hostname where the API should run. For example, `"127.0.0.1"` indicates the API will run locally.
    * `port`: Defines the port number on which the API will listen for incoming requests. The default ports are `47334` for HTTP, and `47335` for MySQL.
    * `database`: Specifies the name of the database that MindsDB uses. Users must connect to this database to interact with MindsDB through the respective API.
    * `ssl`: Indicates whether SSL support is enabled for the MySQL API.
    * `restart_on_failure`: If it is set to `true` (and `max_restart_count` is not reached), the restart of MindsDB will be attempted after the MindsDB process was killed - with code 9 on Linux and MacOS, or for any reason on Windows.
    * `max_restart_count`: This defines how many times the restart attempts can be made. Note that 0 stands for no limit.
    * `max_restart_interval_seconds`: This defines the time limit during which there can be no more than `max_restart_count` restart attempts. Note that 0 stands for no time limit, which means there would be a maximum of `max_restart_count` restart attempts allowed.
  </Accordion>

  <Accordion title="MCP API">
    The `mcp` section configures the [MCP server](/model-context-protocol/usage).

    ```json theme={null}
    "api": {
        "mcp": {
            "cors": {
                "enabled": true,
                "allow_origins": [],
                "allow_origin_regex": "https?://(localhost|127\\.0\\.0\\.1)(:\\d+)?",
                "allow_headers": ["*"]
            },
            "rate_limit": {
                "enabled": false,
                "requests_per_minute": 60
            },
            "dns_rebinding_protection": false
        }
    }
    ```

    * `cors.enabled`: Enables CORS headers on MCP endpoints. Can also be set via `MINDSDB_MCP_CORS_ENABLED`.
    * `cors.allow_origins`: List of allowed origins. Can also be set via `MINDSDB_MCP_ALLOW_ORIGINS` (comma-separated).
    * `cors.allow_origin_regex`: Regex pattern for allowed origins. Can also be set via `MINDSDB_MCP_ALLOW_ORIGIN_REGEXP`.
    * `cors.allow_headers`: List of allowed request headers. Can also be set via `MINDSDB_MCP_ALLOW_HEADERS` (comma-separated).
    * `rate_limit.enabled`: Enables per-IP rate limiting. Can also be set via `MINDSDB_MCP_RATE_LIMIT_ENABLED`.
    * `rate_limit.requests_per_minute`: Maximum number of requests per minute per IP. Can also be set via `MINDSDB_MCP_RATE_LIMIT_RPM`.
    * `dns_rebinding_protection`: When `true`, the MCP transport validates the `Host` header against a list of known-safe hosts to prevent DNS rebinding attacks. Disabled by default (`false`). Enable it when running MindsDB locally and you want to restrict MCP access to `localhost` only. Can also be set via `MINDSDB_MCP_DNS_REBINDING_PROTECTION`.
  </Accordion>
</AccordionGroup>

#### `cache`

```bash theme={null}
    "cache": {
        "type": "local",
        "connection": "redis://localhost:6379" # optional, used only if "type": "redis"
    },
```

The `cache` parameter controls how MindsDB stores the results of recent predictions to avoid recalculating them if the same query is run again. Note that recent predictions are cached for ML models, but not in the case of large language models (LLMs), like OpenAI.

The `type` parameter specifies the type of caching mechanism to use for storing prediction results.

* `none`: Disables caching. No prediction results are stored.
* `local` (default): Stores prediction results in the `cache` folder (as defined in the `paths` configuration). This is useful for repeated queries where the result doesn't change.
* `redis`: Stores prediction results in a Redis instance. This option requires the `connection` parameter, which specifies the Redis connection string.

The `connection` parameter is required only if the `type` parameter is set to `redis`. It stores the Redis connection string.

#### `logging`

```bash theme={null}
    "logging": {
        "handlers": {
            "console": {
                "enabled": true,
                "formatter": "default", # optional, available values include default and json
                "level": "INFO" # optional (alternatively, it can be defined in the MINDSDB_CONSOLE_LOG_LEVEL environment variable)
            },
            "file": {
                "enabled": False,
                "level": "INFO", # optional (alternatively, it can be defined in the MINDSDB_FILE_LOG_LEVEL environment variable)
                "filename": "app.log",
                "maxBytes": 524288, # 0.5 Mb
                "backupCount": 3
            }
        }
    },
```

The above parameters are implemented based on [Python's Logging Dictionary Schema](https://docs.python.org/3/library/logging.config.html#logging-config-dictschema).

The `logging` parameter defines the details of output logging, including the logging levels.

The `handler` parameter provides handlers used for logging into streams and files.

* `console`: This parameter defines the setup for saving logs into a stream.

  * If the `enabled` parameter is set to `true`, then the logging output is saved into a stream.
  * Users can define the `formatter` parameter that configures the format of the logs, where the available values include `default` and `json`.
  * Users can also define the logging level in the `level` parameter or in the `MINDSDB_CONSOLE_LOG_LEVEL` environment variable - one of `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`.

* `file`: This parameter defines the setup for saving logs into a file.

  * If the `enabled` parameter is set to `true`, then the logging output is saved into a file.
  * Users can define the logging level in the `level` parameter or in the `MINDSDB_FILE_LOG_LEVEL` environment variable - one of `DEBUG`, `INFO`, `WARNING`, `ERROR`, `CRITICAL`.
  * Additionally, the `filename` parameter stores the name of the file that contains logs.
  * And the `maxBytes` and `backupCount` parameters determine the rollover process of the file - that is, if the file reached the size of `maxBytes`, then the file is closed and a new file is opened, where the number of files is defined by the `backupCount` parameter.

#### `ml_task_queue`

```bash theme={null}
    "ml_task_queue": {
        "type": "local",
        "host": "localhost", # optional, used only if "type": "redis"
        "port": 6379, # optional, used only if "type": "redis"
        "db": 0, # optional, used only if "type": "redis"
        "username": "username", # optional, used only if "type": "redis"
        "password": "password" # optional, used only if "type": "redis"
    },
```

The `ml_task_queue` parameter manages the queueing system for machine learning tasks in MindsDB. ML tasks include operations such as creating, training, predicting, fine-tuning, and retraining models. These tasks can be resource-intensive, and running multiple ML tasks simultaneously may lead to Out of Memory (OOM) errors or performance degradation. To address this, MindsDB uses a task queue to control task execution and optimize resource utilization.

The `type` parameter defines the type of task queue to use.

* `local`: Tasks are processed immediately as they appear, without a queue. This is suitable for environments where resource constraints are not a concern.
* `redis`: Tasks are added to a Redis-based queue, and consumer process (which is run with `--ml_task_consumer`) ensures that tasks are executed only when sufficient resources are available.
  * Using a Redis queue requires additional configuration such as the `host`, `port`, `db`, `username`, and `password` parameters.
  * To use the Redis queue, start MindsDB with the following command to initiate a queue consumer process: `python3 -m mindsdb --ml_task_queue_consumer`. This process will monitor the queue and fetch tasks for execution only when sufficient resources are available.

#### `url_file_upload`

```bash theme={null}
   "url_file_upload": {
           "enabled": true,
           "allowed_origins": ["https://example.com"],
           "disallowed_origins": ["http://example.com"]
    }
```

The `url_file_upload` parameter restricts file uploads to trusted sources by specifying a list of allowed domains. This ensures that users can only upload files from the defined sources, such as S3 or Google Drive.

The `enabled` flag turns this feature on (`true`) or off (`false`).

The `allowed_origins` parameter lists allowed domains. If left empty, then any domain is allowed.

The `disallowed_origins` parameter lists domains that are not allowed. If left empty, then there are no restricted domains.

#### `web_crawling_allowed_sites`

```bash theme={null}
    "web_crawling_allowed_sites": [],
```

The `web_crawling_allowed_sites` parameter restricts web crawling operations to a specified list of allowed IPs or web addresses. This ensures that the application only accesses pre-approved and safe URLs (`"web_crawling_allowed_sites": ["https://example.com", "https://api.mysite.com"]`).

If left empty (`[]`), the application allows access to all URLs by default (marked with a wildcard in the open-source version).

#### `default_llm`

```bash theme={null}
    "default_llm": {
        "provider": "azure_openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01",
        "method": "multi-class"
    }
```

The `default_llm` parameter specifies the default LLM that will be used with the [`LLM()` function](/mindsdb_sql/functions/llm_function), the [`TO_MARKDOWN()` function](/mindsdb_sql/functions/to_markdown_function), and as a default model for [agents](/mindsdb_sql/agents/agent).

#### `default_embedding_model`

```bash theme={null}
    "default_embedding_model": {
        "provider": "azure_openai",
        "model_name" : "text-embedding-3-large",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01"
    }
}
```

The `default_embedding_model` parameter specifies the default embedding model used with knowledge bases. Learn more about the parameters following the [documentation of the `embedding_model` of knowledge bases](/mindsdb_sql/knowledge_bases/create#embedding-model).

#### `default_reranking_model`

```bash theme={null}
    "default_reranking_model": {
        "provider": "azure_openai",
        "model_name" : "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01",
        "method": "multi-class"
    }
```

The `default_reranking_model` parameter specifies the default reranking model used with knowledge bases. Learn more about the parameters following the [documentation of the `reranking_model` of knowledge bases](/mindsdb_sql/knowledge_bases/create#reranking-model).

#### `data_catalog`

```bash theme={null}
{
    "data_catalog": {
        "enabled": true
    }
}
```

This parameter enables the [data catalog](/data_catalog/overview).

### Example

First, create a `config.json` file.

```bash theme={null}
{
    "permanent_storage": {
        "location": "absent"
    },
    "paths": {
        "root": "/path/to/root/location"
    },
    "auth":{
        "http_auth_enabled": true,
        "username": "username",
        "password": "password"
    },
    "gui": {
        "autoupdate": true
    },
    "api": {
        "http": {
            "host": "127.0.0.1",
            "port": "47334",
            "restart_on_failure": true,
            "max_restart_count": 1,
            "max_restart_interval_seconds": 60
        },
        "mysql": {
            "host": "127.0.0.1",
            "port": "47335",
            "database": "mindsdb",
            "ssl": true,
            "restart_on_failure": true,
            "max_restart_count": 1,
            "max_restart_interval_seconds": 60
        }
    },
    "cache": {
        "type": "local"
    },
    "logging": {
        "handlers": {
            "console": {
                "enabled": true,
                "formatter": "default",
                "level": "INFO"
            },
            "file": {
                "enabled": false,
                "level": "INFO",
                "filename": "app.log",
                "maxBytes": 524288,
                "backupCount": 3
            }
        }
    },
    "ml_task_queue": {
        "type": "local"
    },
    "url_file_upload": {
           "enabled": true,
           "allowed_origins": ["https://example.com"],
           "disallowed_origins": ["http://example.com"]
    },
    "web_crawling_allowed_sites": []
}
```

Next, start MindsDB providing this `config.json` file.

```bash theme={null}
python -m mindsdb --config=/path-to-the-extended-config-file/config.json
```

## Modifying Config Values

Users can modify config values by directly editing the `config.json` file they created.
