# How to Evaluate Knowledge Bases

Evaluating a knowledge base verifies how accurate and relevant the data it returns is.

## `EVALUATE KNOWLEDGE_BASE` Syntax

With the `EVALUATE KNOWLEDGE_BASE` command, users can evaluate the relevancy and accuracy of the documents and data returned by the knowledge base.

Below is the complete syntax that includes both required and optional parameters.

```sql
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'doc_id',
    generate_data = {
        'from_sql': 'SELECT id, content FROM my_datasource.my_table',
        'count': 100
    },
    evaluate = false,
    llm = {
        'provider': 'openai',
        'api_key': 'sk-xxx',
        'model_name': 'gpt-4'
    },
    save_to = my_datasource.my_result_table;
```

### `test_table`

This is a required parameter that specifies the name of a table in one of the data sources connected to MindsDB. For example, `test_table = my_datasource.my_test_table` refers to a table named `my_test_table` in a data source named `my_datasource`.

The test table stores test data, typically in the form of questions and answers. Its content depends on the `version` parameter defined below.

Users can provide their own test data or have the `EVALUATE KNOWLEDGE_BASE` command generate it by setting the `generate_data` parameter defined below.

### `version`

This is an optional parameter that defines the version of the evaluator. If not defined, its default value is `doc_id`.

* `version = 'doc_id'`
  The evaluator checks whether the document ID returned by the knowledge base matches the expected document ID defined in the test table.

* `version = 'llm_relevancy'`
  The evaluator uses a language model to rank and evaluate responses from the knowledge base.
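As a minimal sketch, the two evaluator versions could be invoked as follows. Both statements reuse the placeholder names `my_kb` and `my_datasource.my_test_table` from the syntax above and assume the test table already holds suitable test data. The second statement also assumes the knowledge base was created with a `reranking_model`, so no `llm` parameter is passed.

```sql
-- Document ID matching (the default version, stated explicitly here).
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'doc_id';

-- LLM-based relevancy; assumes my_kb defines a reranking_model.
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'llm_relevancy';
```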

### `generate_data`

This is an optional parameter that configures test data generation. The generated data is saved into the table defined in the `test_table` parameter. If not defined, its default value is `false`, meaning that no test data is generated.

Available values are as follows:

* A dictionary containing the following values:

  * `from_sql` defines the SQL query that fetches the test data. For example, `'from_sql': 'SELECT id, content FROM my_datasource.my_table'`. If not defined, it fetches test data from the knowledge base on which the `EVALUATE` command is executed: `SELECT chunk_content, id FROM my_kb`.
  * `count` defines the size of the test dataset. For example, `'count': 100`. Its default value is 20.

  <Note>
    When the `from_sql` parameter is provided, its query must return specific column names:

    * With `version = 'doc_id'`, the `from_sql` parameter should contain a query that returns the `id` and `content` columns, like this: `'from_sql': 'SELECT id_column_name AS id, content_column_names AS content FROM my_datasource.my_table'`

    * With `version = 'llm_relevancy'`, the `from_sql` parameter should contain a query that returns the `content` column, like this: `'from_sql': 'SELECT content_column_names AS content FROM my_datasource.my_table'`
  </Note>

* A value of `true`, such as `generate_data = true`, which implies that default values for `from_sql` and `count` will be used.
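The column aliasing described in the note might look like the sketch below. The source table `my_datasource.articles` and its columns `article_id` and `body` are hypothetical names chosen for illustration; only the `id` and `content` aliases come from the requirements above.

```sql
-- Generate 50 test rows from a hypothetical articles table.
-- article_id and body are assumed column names, aliased to the
-- id and content columns that version = 'doc_id' expects.
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    version = 'doc_id',
    generate_data = {
        'from_sql': 'SELECT article_id AS id, body AS content FROM my_datasource.articles',
        'count': 50
    };
```

Because `evaluate` is not set, this statement would also run the evaluator on the generated data.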

### `evaluate`

This is an optional parameter that defines whether to evaluate the knowledge base. If not defined, its default value is `true`.

Set `evaluate = false` to generate test data into the test table without running the evaluator.
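One possible two-step workflow, sketched with the placeholder names used on this page, is to generate the test data first, review it, and run the evaluation later against the same table.

```sql
-- Step 1: generate test data with default settings, without evaluating.
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    generate_data = true,
    evaluate = false;

-- Step 2: evaluate later using the previously generated test data.
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table;
```

Splitting the steps leaves room to inspect or edit the generated questions before they are used for scoring.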

### `llm`

This is an optional parameter that defines the language model used for evaluations. It applies only when `version` is set to `llm_relevancy`.

If not defined, its default value is the [`reranking_model` defined with the knowledge base](/mindsdb_sql/knowledge_bases/create#reranking-model).

Alternatively, users can define it in the `EVALUATE KNOWLEDGE_BASE` command using the same parameters as the `reranking_model`.

```sql
EVALUATE KNOWLEDGE_BASE my_kb
USING
    ...
    llm = {
        "provider": "azure_openai",
        "model_name": "gpt-4o",
        "api_key": "sk-abc123",
        "base_url": "https://ai-6689.openai.azure.com/",
        "api_version": "2024-02-01",
        "method": "multi-class"
    },
    ...
```

### `save_to`

This is an optional parameter that specifies the table where the evaluation results are saved. The table belongs to one of the data sources connected to MindsDB. For example, `save_to = my_datasource.my_result_table` refers to a table named `my_result_table` in the data source named `my_datasource`. If not defined, the results are not saved into a table.

By default, evaluation results are returned after executing the `EVALUATE KNOWLEDGE_BASE` statement.
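A short sketch of saving the results and reading them back, using the placeholder names from this page. It assumes the connected data source allows MindsDB to write to `my_result_table`.

```sql
-- Run the evaluation and persist the results.
EVALUATE KNOWLEDGE_BASE my_kb
USING
    test_table = my_datasource.my_test_table,
    save_to = my_datasource.my_result_table;

-- Inspect the saved evaluation results.
SELECT *
FROM my_datasource.my_result_table;
```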

### Evaluation Results

When using `version = 'doc_id'`, the following columns are included in the evaluation results:

* `total` stores the total number of questions.
* `total_found` stores the number of questions to which the knowledge base provided correct answers.
* `retrieved_in_top_10` stores the number of questions for which the correct answer was retrieved within the top 10 results.
* `cumulative_recall` stores cumulative recall data that can be used to create a chart.
* `avg_query_time` stores the average execution time of a search query against the knowledge base.
* `name` stores the knowledge base name.
* `created_at` stores the timestamp when the evaluation was created.
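If the results were saved with `save_to`, the counts above could be turned into ratios with a query along these lines. This is a sketch only: it assumes the saved table exposes the columns listed above as numeric values, that the underlying data source supports `NULLIF`, and it reuses the placeholder table `my_datasource.my_result_table`. The aliases `found_ratio` and `top_10_ratio` are illustrative names.

```sql
-- Share of test questions answered correctly, per evaluation run.
-- NULLIF avoids a division-by-zero error when total is 0.
SELECT
    name,
    created_at,
    total_found * 1.0 / NULLIF(total, 0) AS found_ratio,
    retrieved_in_top_10 * 1.0 / NULLIF(total, 0) AS top_10_ratio
FROM my_datasource.my_result_table
ORDER BY created_at DESC;
```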

When using `version = 'llm_relevancy'`, the following columns are included in the evaluation results:

* `avg_relevancy` stores the average relevancy.
* `avg_relevance_score_by_k` stores the average relevancy at k.
* `avg_first_relevant_position` stores the average first relevant position.
* `mean_mrr` stores the Mean Reciprocal Rank (MRR).
* `hit_at_k` stores the Hit\@k value.
* `bin_precision_at_k` stores the Binary Precision\@k.
* `avg_entropy` stores the average relevance score entropy.
* `avg_ndcg` stores the average nDCG.
* `avg_query_time` stores the average execution time of a search query against the knowledge base.
* `name` stores the knowledge base name.
* `created_at` stores the timestamp when the evaluation was created.
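When several `llm_relevancy` runs have been saved with `save_to`, a query like the following could be used to compare them over time. It assumes the saved table contains the columns listed above and reuses the placeholder table `my_datasource.my_result_table`.

```sql
-- Compare headline relevancy metrics across saved evaluation runs.
SELECT
    name,
    created_at,
    avg_relevancy,
    mean_mrr,
    avg_ndcg,
    avg_query_time
FROM my_datasource.my_result_table
ORDER BY created_at;
```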
