SELECT Statement
Description
The SELECT
statement fetches predictions from the model table. The data is returned on the fly and the result set is not persisted.
But there are ways to save predictions data! You can save your predictions as a view using the CREATE VIEW
statement. Please note that a view is a saved query and does not store data like a table. Another way is to insert your predictions into a table using the INSERT INTO
statement.
Syntax
Single Prediction
Here is the syntax for fetching a single prediction from the model table:
SELECT target_name, target_name_explain
FROM mindsdb.predictor_name
WHERE column_name = value
AND column_name = value;
Grammar Matters
Here are some points to keep in mind while writing queries in MindsDB:
1. The column_name = value
pairs may be joined by AND
or OR
keywords.
2. Do not use any quotations for numerical values.
3. Use single quotes for strings.
4. The tables and column names are case sensitive.
On execution, we get:
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| target_name | target_name_explain |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| predicted_value | {"predicted_value": 4394, "confidence": 0.99, "anomaly": null, "truth": null, "confidence_lower_bound": 4313, "confidence_upper_bound": 4475} |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
Where:
Name | Description |
---|---|
target_name | Name of the column to be predicted. |
target_name_explain | Object of the JSON type that contains the predicted_value and additional information such as confidence , anomaly , truth , confidence_lower_bound , confidence_upper_bound . |
predictor_name | Name of the model used to make the prediction. |
WHERE column_name = value AND ... | WHERE clause used to pass the data values for which the prediction is made. |
Bulk Predictions
Here is the syntax for making predictions in bulk by joining the data source table with the model table:
SELECT m.target_name, t.column_name, ...
FROM integration_name.table_name AS t
JOIN mindsdb.predictor_name AS m;
On execution, we get:
+----------------------+-------------+
| target_name | column_name |
+----------------------+-------------+
| predicted_value1 | value1 |
| predicted_value2 | value2 |
| predicted_value3 | value3 |
| predicted_value4 | value4 |
| predicted_value5 | value5 |
+----------------------+-------------+
Where:
Name | Description |
---|---|
m.target_name | Name of the column to be predicted. The m. in front indicates that this column comes from the mindsdb.predictor_name table. |
t.column_name, ... | Columns from the data source table (integration_name.table_name ) that you want to see in the output. |
integration_name.table_name | Data source table that is joined with the model table (mindsdb.predictor_name ). |
predictor_name | Name of the model used to make predictions. |
Please note that in the case of bulk predictions, we do not pass the data values for which the prediction is made. It is because bulk predictions use all data available in the data source table.
Example
Single Prediction
Let’s predict the rental_price
value using the home_rentals_model
model for the property having sqft=823
, location='good'
, neighborhood='downtown'
, and days_on_market=10
.
SELECT sqft, location, neighborhood, days_on_market, rental_price, rental_price_explain
FROM mindsdb.home_rentals_model1
WHERE sqft=823
AND location='good'
AND neighborhood='downtown'
AND days_on_market=10;
On execution, we get:
+-------+----------+--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| sqft | location | neighborhood | days_on_market | rental_price | rental_price_explain |
+-------+----------+--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| 823 | good | downtown | 10 | 4394 | {"predicted_value": 4394, "confidence": 0.99, "anomaly": null, "truth": null, "confidence_lower_bound": 4313, "confidence_upper_bound": 4475} |
+-------+----------+--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
Bulk Predictions
Now let’s make bulk predictions to predict the rental_price
value using the home_rentals_model
model joined with the data source table.
SELECT t.sqft, t.location, t.neighborhood, t.days_on_market, t.rental_price AS real_price,
m.rental_price AS predicted_rental_price
FROM example_db.demo_data.home_rentals AS t
JOIN mindsdb.home_rentals_model AS m
LIMIT 5;
On execution, we get:
+-------+----------+-----------------+----------------+--------------+-----------------------------+
| sqft | location | neighborhood | days_on_market | real_price | predicted_rental_price |
+-------+----------+-----------------+----------------+--------------+-----------------------------+
| 917 | great | berkeley_hills | 13 | 3901 | 3886 |
| 194 | great | berkeley_hills | 10 | 2042 | 2007 |
| 543 | poor | westbrae | 18 | 1871 | 1865 |
| 503 | good | downtown | 10 | 3026 | 3020 |
| 1066 | good | thowsand_oaks | 13 | 4774 | 4748 |
+-------+----------+-----------------+----------------+--------------+-----------------------------+
Select from integration
Simple select
In this example query contains only tables from one integration and therefore will be sent to integration database (integration name will be cut from table name)
SELECT location, max(sqft)
FROM example_db.demo_data.home_rentals
GROUP BY location
LIMIT 5;
Raw select from integration
It is also possible to send raw query to integration. It can be useful when query to integration can not be parsed as sql
Syntax:
SELECT ... FROM integration_name ( query goes here )
Example of select from mongo integration using mongo query
SELECT * FROM mongo (
db.house_sales2.find().limit(1)
)
Complex queries
- Subselect on data from integration.
It can be useful in cases when integration engine doesn’t support some functions, for example grouping. In that case all data from raw select are passed to mindsdb and then subselect performs on them inside mindsdb
SELECT type, max(bedrooms), last(MA)
FROM mongo (
db.house_sales2.find().limit(300)
) GROUP BY 1
- Unions
It is possible to use UNION
/ UNION ALL
operators.
It this case every subselect from union will be fetched and merged to one result-set on mindsdb side
SELECT
data.time as date, data.target
FROM datasource.table_name as data
UNION ALL
SELECT
model.time as date, model.target as target
FROM mindsdb.model as model
JOIN datasource.table_name as t
WHERE t.time > LATEST AND t.group = 'value';