Starter Example

This basic example uses MindsDB to predict real estate rental prices for an area.

Goal

The goal is to predict the best rental_price for new properties from the information in home_rentals.csv.

Learning

from mindsdb import Predictor

# We tell the Predictor what column or key we want to learn and from what data
Predictor(name='real_estate_model').learn(
    from_data="https://s3.eu-west-2.amazonaws.com/mindsdb-example-data/home_rentals.csv",  # path to the file to learn from (can also be a URL)
    to_predict='rental_price',  # the column to learn to predict from the rest of the data in the file
)

Note: the from_data argument can be a path or URL to a JSON, CSV (or other delimiter-separated), or Excel file, or a pandas DataFrame.
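Because from_data also accepts a pandas DataFrame, you can clean the data before learning from it. The sketch below is only an illustration: the helper names load_training_frame and learn_from_frame are hypothetical, and dropping rows without a rental_price is an assumed preprocessing step, not something MindsDB requires.

```python
import pandas as pd


def load_training_frame(csv_source):
    """Read the rentals CSV (path, URL, or file-like object) into a DataFrame.

    Rows with no rental_price are dropped; this is an assumed cleaning step
    chosen for illustration.
    """
    frame = pd.read_csv(csv_source)
    return frame.dropna(subset=["rental_price"])


def learn_from_frame(frame):
    """Train the example model from a DataFrame instead of a file path."""
    # Imported here so load_training_frame can be used without mindsdb installed.
    from mindsdb import Predictor

    Predictor(name="real_estate_model").learn(
        from_data=frame,            # a pandas DataFrame instead of a path or URL
        to_predict="rental_price",  # same target column as in the example above
    )
```

Calling learn_from_frame(load_training_frame("home_rentals.csv")) would then train on the cleaned rows, assuming a local copy of the file.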

Predicting

from mindsdb import Predictor

# Load the model trained in the Learning step (the name must match the one used there)
mdb = Predictor(name='real_estate_model')

# Use the model to make predictions
# Note: use the `when_data` argument to pass a file with one or more rows instead of a Python dictionary
result = mdb.predict(when={'number_of_rooms': 1, 'initial_price': 1222, 'sqft': 1190})

# The result is an array with one entry per data point (here only one); each entry holds the prediction, its confidence, and some additional information
explanation = result[0].explanation['rental_price']
print('The predicted price interval is {price} with {conf} confidence'.format(price=explanation['confidence_interval'], conf=explanation['confidence']))
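The when_data argument mentioned in the comment above takes a file with one or more rows. A minimal sketch of how that might look follows. The helper names write_query_rows and predict_from_file are hypothetical, and iterating over the result is an assumption based on the description of it as an array.

```python
import csv


def write_query_rows(path, rows):
    """Write a list of dicts to a CSV file, one row per prediction wanted.

    Returns the column names used for the header.
    """
    fieldnames = sorted({key for row in rows for key in row})
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames


def predict_from_file(path):
    """Predict rental_price for every row in the CSV file at `path`."""
    # Imported here so write_query_rows can be used without mindsdb installed.
    from mindsdb import Predictor

    results = Predictor(name="real_estate_model").predict(when_data=path)
    # Assumption: the result can be iterated like the array it is described as.
    return [entry.explanation["rental_price"] for entry in results]
```

For example, write_query_rows("new_homes.csv", [{"number_of_rooms": 1, "sqft": 1190}, {"number_of_rooms": 2, "sqft": 800}]) followed by predict_from_file("new_homes.csv") would request two predictions in one call.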

Notes

About the Learning

The first step is to learn from the CSV file. In MindsDB, learning means letting it work out a neural network that can best learn from this data, then training and testing that model on the data we have (learn more in InsideMindsDB).

When you run this script, it logs information about the data and about the training process.

This information can help you figure out which parts of your data are of low quality or might contain erroneous values.

About getting predictions from the model

Note the when argument. In this case we assume that we only know the following:

  • 'number_of_rooms': 1
  • 'initial_price': 1222
  • 'sqft': 1190

As long as the columns you pass in the when argument exist in the data the model learned from, the prediction will work (see the columns in home_rentals.csv).
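One way to catch a misspelled column before calling predict is to compare the when keys against the CSV header. This is a minimal sketch with hypothetical helper names; the sample header lists only the columns named in this example, not the full set in home_rentals.csv.

```python
import csv
import io


def read_header(csv_text):
    """Return the column names from the first line of CSV text."""
    return next(csv.reader(io.StringIO(csv_text)))


def unknown_columns(when, header):
    """Return the keys of `when` that are not column names in `header`."""
    known = set(header)
    return sorted(key for key in when if key not in known)


# Illustrative header only; the real file has more columns.
sample_header = read_header("number_of_rooms,initial_price,sqft,rental_price\n")

when = {"number_of_rooms": 1, "initial_price": 1222, "sqft": 1190}
missing = unknown_columns(when, sample_header)
if missing:
    print("Columns not found in the training data:", missing)
```

With the values from this example nothing is printed, since all three keys appear in the header.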

Running online

You can follow this example on Google Colab.