Extract JSON Values from JSON Data

The json_extract() function extracts values from the JSON data passed as its argument.

To show how it works, we use the home rentals example. The model returns the predicted value and the explanation of the prediction in the form of JSON data.

SELECT rental_price, rental_price_explain
FROM mindsdb.home_rentals_model
WHERE sqft = 823
AND location='good'
AND neighborhood='downtown'
AND days_on_market=10;

On execution, we get:

+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| rental_price | rental_price_explain                                                                                                                          |
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| 1580         | {"predicted_value": 1580, "confidence": 0.99, "anomaly": null, "truth": null, "confidence_lower_bound": 1490, "confidence_upper_bound": 1670} |
+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+

Now, if we want to see the confidence value only, we can use the json_extract() function as below.

SELECT rental_price, json_extract(rental_price_explain, '$.confidence') AS confidence
FROM mindsdb.home_rentals_model
WHERE sqft = 823
AND location='good'
AND neighborhood='downtown'
AND days_on_market=10;

On execution, we get:

+--------------+------------+
| rental_price | confidence |
+--------------+------------+
| 1580         | 0.99       |
+--------------+------------+

Extract JSON from Text Data

In this example, we use the OpenAI model to extract data in a predefined JSON format from the input text data.

Default Model

When you create an OpenAI model in MindsDB, it uses the gpt-3.5-turbo model by default. But you can use the gpt-4 model as well by passing it to the model-name parameter.

Let’s create an OpenAI model.

Before creating an OpenAI model, please create an engine, providing your OpenAI API key:

CREATE ML_ENGINE openai_engine
FROM openai
USING
    openai_api_key = 'your-openai-api-key';
CREATE MODEL mindsdb.nlp_model
PREDICT json
USING
    engine = 'openai_engine',
    json_struct = {
        'rental_price': 'rental price',
        'location': 'location',
        'nob': 'number of bathrooms'
    },
    input_text = 'sentence';

We pass three parameters as follows:

  1. The engine parameter ensures we use the OpenAI engine.
  2. The json_struct parameter stores a predefined JSON structure used for the output.
  3. The input_text parameter contains the name of the column that stores input text.

Now we can query the model, passing the input text stored in the sentence column.

SELECT json
FROM mindsdb.nlp_model
WHERE sentence = 'Amazing 3 bedroom apartment located at the heart of Manhattan, has one full bathrooms and one toilet room for just 3000 a month.';

On execution, we get:

+----------------------------------------------------------+
| json                                                     |
+----------------------------------------------------------+
| {"location":"Manhattan","nob":"1","rental_price":"3000"} |
+----------------------------------------------------------+