
Medical Diagnosis

Industry: Health Care
Department: Health
Role: Business Executive/Physician
Data

Breast Cancer Wisconsin (Diagnostic) Data Set. Using the features in the breast cancer dataset, classify whether a tumor is malignant or benign.

diagnosis radius_mean texture_mean perimeter_mean area_mean smoothness_mean compactness_mean concavity_mean concave_points_mean symmetry_mean fractal_dimension_mean radius_se texture_se perimeter_se area_se smoothness_se compactness_se concavity_se concave points_se symmetry_se fractal_dimension_se radius_worst texture_worst perimeter_worst area_worst smoothness_worst compactness_worst concavity_worst concave points_worst symmetry_worst fractal_dimension_worst
0 17.99 10.38 122.8 1001 0.1184 0.2776 0.3001 0.1471 0.2419 0.07871 1.095 0.9053 8.589 153.4 0.006399 0.04904 0.05373 0.01587 0.03003 0.006193 25.38 17.33 184.6 2019 0.1622 0.6656 0.7119 0.2654 0.4601 0.1189
0 20.57 17.77 132.9 1326 0.08474 0.07864 0.0869 0.07017 0.1812 0.05667 0.5435 0.7339 3.398 74.08 0.005225 0.01308 0.0186 0.0134 0.01389 0.003532 24.99 23.41 158.8 1956 0.1238 0.1866 0.2416 0.186 0.275 0.08902
0 19.69 21.25 130 1203 0.1096 0.1599 0.1974 0.1279 0.2069 0.05999 0.7456 0.7869 4.585 94.03 0.00615 0.04006 0.03832 0.02058 0.0225 0.004571 23.57 25.53 152.5 1709 0.1444 0.4245 0.4504 0.243 0.3613 0.08758
0 11.42 20.38 77.58 386.1 0.1425 0.2839 0.2414 0.1052 0.2597 0.09744 0.4956 1.156 3.445 27.23 0.00911 0.07458 0.05661 0.01867 0.05963 0.009208 14.91 26.5 98.87 567.7 0.2098 0.8663 0.6869 0.2575 0.6638 0.173
0 20.29 14.34 135.1 1297 0.1003 0.1328 0.198 0.1043 0.1809 0.05883 0.7572 0.7813 5.438 94.44 0.01149 0.02461 0.05688 0.01885 0.01756 0.005115 22.54 16.67 152.2 1575 0.1374 0.205 0.4 0.1625 0.2364 0.07678
0 12.45 15.7 82.57 477.1 0.1278 0.17 0.1578 0.08089 0.2087 0.07613 0.3345 0.8902 2.217 27.19 0.00751 0.03345 0.03672 0.01137 0.02165 0.005082 15.47 23.75 103.4 741.6 0.1791 0.5249 0.5355 0.1741 0.3985 0.1244
0 18.25 19.98 119.6 1040 0.09463 0.109 0.1127 0.074 0.1794 0.05742 0.4467 0.7732 3.18 53.91 0.004314 0.01382 0.02254 0.01039 0.01369 0.002179 22.88 27.66 153.2 1606 0.1442 0.2576 0.3784 0.1932 0.3063 0.08368
Feature information:
1. id: ID number
2. diagnosis: the diagnosis of breast tissues (M = malignant, B = benign)
3. radius_mean: mean of distances from center to points on the perimeter
4. texture_mean: standard deviation of gray-scale values
5. perimeter_mean: mean size of the core tumor
6. area_mean
7. smoothness_mean: mean of local variation in radius lengths
8. compactness_mean: mean of perimeter^2 / area - 1.0
9. concavity_mean: mean of severity of concave portions of the contour
10. concave points_mean: mean for number of concave portions of the contour
11. symmetry_mean
12. fractal_dimension_mean: mean for "coastline approximation" - 1
13. radius_se: standard error for the mean of distances from center to points on the perimeter
14. texture_se: standard error for standard deviation of gray-scale values
15. perimeter_se
16. area_se
17. smoothness_se: standard error for local variation in radius lengths
18. compactness_se: standard error for perimeter^2 / area - 1.0
19. concavity_se: standard error for severity of concave portions of the contour
20. concave points_se: standard error for number of concave portions of the contour
21. symmetry_se
22. fractal_dimension_se: standard error for "coastline approximation" - 1
23. radius_worst: "worst" or largest mean value for mean of distances from center to points on the perimeter
24. texture_worst: "worst" or largest mean value for standard deviation of gray-scale values
25. perimeter_worst
26. area_worst
27. smoothness_worst: "worst" or largest mean value for local variation in radius lengths
28. compactness_worst: "worst" or largest mean value for perimeter^2 / area - 1.0
29. concavity_worst: "worst" or largest mean value for severity of concave portions of the contour
30. concave points_worst: "worst" or largest mean value for number of concave portions of the contour
31. symmetry_worst
32. fractal_dimension_worst: "worst" or largest mean value for "coastline approximation" - 1
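
The feature list follows a regular pattern: ten base measurements, each reported as a mean, a standard error (se), and a "worst" value, which gives 30 feature columns besides id and diagnosis. A minimal sketch of that naming scheme is shown here; the helper name `feature_columns` is hypothetical, and the spelling `concave points` (with a space) is an assumption taken from the feature list, since the table header also shows `concave_points_mean`.

```python
# The ten base measurements named in the feature list.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal_dimension",
]

# Each measurement is reported three ways.
STATISTICS = ["mean", "se", "worst"]


def feature_columns():
    """Return the feature column names, grouped by statistic (hypothetical helper)."""
    return [f"{base}_{stat}" for stat in STATISTICS for base in BASE_FEATURES]


columns = feature_columns()
print(len(columns))   # 10 measurements x 3 statistics
print(columns[:3])
```

A list like this can be handy for checking that a loaded CSV has the expected columns before training.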

MindsDB Code example

import mindsdb
import sys
import pandas as pd
from sklearn.metrics import balanced_accuracy_score


def run(sample):
    # Assumption: the backend is Lightwood, as listed in the accuracy table.
    backend = 'lightwood'

    mdb = mindsdb.Predictor(name='cancer_model')

    # Train a model that predicts the 'diagnosis' column.
    mdb.learn(from_data='processed_data/train.csv', to_predict='diagnosis')

    test_df = pd.read_csv('processed_data/test.csv')
    predictions = mdb.predict(when_data='processed_data/test.csv')

    # Compare labels as strings so predicted and true values share a type.
    results = [str(x['diagnosis']) for x in predictions]
    real = [str(x) for x in test_df['diagnosis']]

    accuracy = balanced_accuracy_score(real, results)

    return {
        'accuracy': accuracy,
        'accuracy_function': 'balanced_accuracy_score',
        'backend': backend,
    }


if __name__ == '__main__':
    # Any non-empty argument enables sample mode; run() does not use the flag yet.
    sample = bool(sys.argv[1]) if len(sys.argv) > 1 else False
    result = run(sample)
    print(result)
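
The script scores predictions with scikit-learn's `balanced_accuracy_score`, which averages the recall of each class so that the larger class does not dominate the result. The sketch below reimplements that idea in plain Python to show the arithmetic; the function name `balanced_accuracy` and the label lists are hypothetical illustrations, not data from the dataset.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical labels: eight samples of class '0' and two of class '1'.
real = ['0'] * 8 + ['1'] * 2
results = ['0'] * 8 + ['0', '1']  # one of the two '1' samples is missed

plain = sum(r == p for r, p in zip(real, results)) / len(real)
print(plain)                             # 0.9
print(balanced_accuracy(real, results))  # 0.75
```

Plain accuracy looks high here because most samples belong to class '0', while the balanced score drops to 0.75 because half of the '1' samples were missed.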

MindsDB accuracy

Accuracy            Backend    Last run
0.9666666666666667  Lightwood  17 April 2020