Starter Example

This is a basic example of using mindsdb_native to predict real estate rental prices for an area. If you prefer to follow along visually, watch the video below:

Goal

The goal is to be able to predict the best rental_price for new properties given the information that we have in home_rentals.csv.

Learning

from mindsdb import Predictor

# We tell the Predictor what column or key we want to learn and from what data
Predictor(name='real_estate_model').learn(
    from_data="https://s3.eu-west-2.amazonaws.com/mindsdb-example-data/home_rentals.csv", # the path to the file to learn from (can also be a URL)
    to_predict='rental_price', # the column we want to learn to predict given all the data in the file
)

Note: the from_data argument can be a path or URL to a JSON, CSV (or other delimiter-separated), or Excel file, or it can be a pandas DataFrame.

Predicting

from mindsdb import Predictor

# load the model trained above; the name must match the one passed to learn()
mdb = Predictor(name='real_estate_model')

# use the model to make predictions
result = mdb.predict(when_data={'number_of_rooms': 1, 'initial_price': 1222, 'sqft': 1190})

# The result is an array with one entry per data point (in this case only one); each entry holds the prediction, its confidence, and some additional information
print('The predicted price is between ${price} with {conf} confidence'.format(price=result[0].explanation['rental_price']['confidence_interval'], conf=result[0].explanation['rental_price']['confidence']))
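
Because the explanation is indexed like a nested dictionary, its fields can be pulled out with a small helper. The following is only a minimal sketch under stated assumptions: describe_prediction is a hypothetical helper (not part of MindsDB), confidence_interval is assumed to be a two-element (low, high) sequence, and the numbers in sample_explanation are placeholders rather than real model output.

```python
def describe_prediction(explanation, column):
    """Build a readable summary from an explanation-like mapping.

    Assumes explanation[column] has 'confidence_interval' (low, high)
    and 'confidence' keys, mirroring the print statement above.
    """
    details = explanation[column]
    low, high = details['confidence_interval']
    return 'The predicted {col} is between ${low} and ${high} with {conf} confidence'.format(
        col=column, low=low, high=high, conf=details['confidence'])


# Placeholder values for illustration only; not real model output.
sample_explanation = {
    'rental_price': {'confidence_interval': [1000, 1500], 'confidence': 0.9}
}
print(describe_prediction(sample_explanation, 'rental_price'))
```

With a real prediction, result[0].explanation would take the place of sample_explanation, provided it has the shape assumed here.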

Notes

About the Learning

The first thing we do is learn from the CSV file. In MindsDB, learning means letting it work out a neural network that best fits this data, then training and testing that model on the data we have.

When you run this script, note that it will start logging various information about the data and about the training process.

This information can be useful in allowing you to figure out which parts of your data are of low quality or might contain erroneous values.

About getting predictions from the model

Note the when_data argument. In this case we assume we only know the following:

  • 'number_of_rooms': 1
  • 'initial_price': 1222
  • 'sqft': 1190

As long as the columns you pass in when_data exist in the data the model learned from, the prediction will work (see the columns in home_rentals.csv).
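
A quick check before calling predict can catch misspelled column names. The following is a minimal sketch and not part of MindsDB: the helper unknown_columns is hypothetical, and the inline header lists only the columns named in this example, not the full contents of home_rentals.csv.

```python
import csv
import io


def unknown_columns(when_data, csv_file):
    """Return the when_data keys that are missing from the CSV header row."""
    header = next(csv.reader(csv_file))
    known = set(header)
    return sorted(key for key in when_data if key not in known)


# Stand-in header with only the columns named in this example;
# the real file may contain more columns.
sample_csv = io.StringIO('number_of_rooms,initial_price,sqft,rental_price\n')

when_data = {'number_of_rooms': 1, 'initial_price': 1222, 'sqft': 1190}
print(unknown_columns(when_data, sample_csv))  # prints []
```

An empty list means every key in when_data matches a column in the header. Any names returned are worth correcting before asking for a prediction.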

Running online

You can follow this example on Google Colab.