Welcome to M.A.S.P.!
(Morbus Alzheimerii Sistēma Prognōstica)


A state-of-the-art prediction system that determines if you have Alzheimer's disease



Why Predict Alzheimer's Disease?

This is a good place to get an initial sense of your situation with respect to Alzheimer's disease. The system uses an array of AI models to predict whether or not you have Alzheimer's. It was trained on this Kaggle dataset. The system takes inputs about your demographics and brain features (age, years of education, socioeconomic status, estimated total intracranial volume, atlas scaling factor, etc.) and outputs that you either have Alzheimer's or do not.


Alzheimer's disease is a progressive disease in which brain cell connections and the cells themselves degenerate and die, eventually destroying memory and other important mental functions. Memory loss and confusion are the main symptoms. No cure exists, but medications and management strategies may temporarily improve symptoms. When Alzheimer's is diagnosed early on, treatments are more likely to be effective!

Some notable statistics about Alzheimer's disease:

  • More than 6 million Americans of all ages have Alzheimer's.
  • About 1 in 9 people age 65 and older has Alzheimer's.
  • Almost 2 out of 3 Americans with Alzheimer's are women.
  • Deaths from Alzheimer's more than doubled between 2000 and 2019.
  • Source: Alzheimer's Association

Resources for Additional Information:


Exploration

    Pie Chart

    This pie chart supports the fact that women are more likely to get Alzheimer's disease.

    MMSE vs. Age Scatter Plot

    This scatter plot shows that more people with an MMSE score of 25 and below have Alzheimer's disease. It also shows that, in this dataset, fewer Alzheimer's cases appear after the age of 90.

    Age and Group Histogram

    This histogram shows that most Alzheimer's disease cases appear in people aged 65 years old through 85 years old.

    3D Scatter Plot

    This 3D scatter plot shows that, as eTIV goes up, ASF goes down and vice versa. Also, nWBV ranges from approximately 0.65 to 0.83.

    Violin Graph

    This violin graph shows that, in this dataset, any person with an MMSE value lower than 25 has Alzheimer's disease. In addition, any person with a CDR value of 1 or more has Alzheimer's disease, although a person can have a lower value and still be diagnosed with the disease. On the other hand, any person with a CDR score of 0 does not have Alzheimer's disease, and a person can have a CDR value of 0.5 and not be diagnosed with the disease.

    Heatmap

    This heatmap shows the correlations between all the variables used by the ML models to predict whether a patient has Alzheimer's disease. The closer a number is to 1, the more strongly the two variables behind that number rise and fall together. The closer a number is to -1, the more strongly one variable falls as the other rises. A number near 0 means there is little linear relationship. Pay special attention to the 'Outcome' variable, which records whether a patient has Alzheimer's disease. Clinical dementia rating (CDR) and mini mental state examination (MMSE) appear to be the most influential features on whether or not a person has Alzheimer's.
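As a rough illustration of the numbers behind a heatmap like this one, here is a minimal sketch that computes Pearson correlation coefficients in plain Python. It assumes the heatmap shows Pearson correlations; the `toy` values are made up for illustration and are not taken from the dataset, and the helper names `pearson` and `correlation_matrix` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Map every pair of column names to its correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up values for illustration only (not real patient data).
toy = {
    "MMSE": [30, 29, 27, 22, 18],
    "CDR": [0.0, 0.0, 0.5, 1.0, 2.0],
    "Outcome": [0, 0, 1, 1, 1],
}

matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

In this toy data, MMSE falls as CDR rises, so their coefficient is negative, while each variable correlates perfectly with itself.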

Machine Learning Models

We used several machine learning models, with varying results:

    K-Nearest Neighbors (KNN):
    KNN works by finding the distances between a new data point and the existing (training) data points. The new point is assigned the class that is most common among its closest training points. The number of training points it looks at is set by the specified number K. This model had a 64% accuracy.

    KNN Results
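The KNN idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code behind the results: the two-feature points, the label names, and the `knn_predict` helper are all made up.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, new_point, k=3):
    """Classify new_point by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to new_point with its label.
    distances = [
        (math.dist(point, new_point), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest points and count their labels.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up, already-scaled feature pairs for illustration only.
train_points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_labels = ["no_alzheimers"] * 3 + ["alzheimers"] * 3

print(knn_predict(train_points, train_labels, (1.1, 1.0)))
print(knn_predict(train_points, train_labels, (3.0, 3.0)))
```

Because KNN relies on raw distances, features on very different scales (for example age versus nWBV) would usually need to be rescaled first so that one feature does not dominate.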


    Support Vector Classifiers (SVC):
    SVC maps the data to a higher-dimensional space and then finds the hyperplane that separates the classes with the largest margin, that is, the greatest distance between the hyperplane and the nearest data points. This model had a 98% accuracy.

    SVC Results
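To show why mapping data to a higher-dimensional space helps, here is a small sketch in plain Python. It does not train an SVC: the one-dimensional data is made up, and the mapping `feature_map`, the weights, and the bias are hand-picked assumptions chosen so that a straight line separates the mapped points.

```python
def feature_map(x):
    """Map a 1D value into 2D by adding its square as a second coordinate."""
    return (x, x * x)

def linear_decision(point, weights, bias):
    """Signed score of a point relative to the hyperplane weights . point + bias = 0."""
    return sum(w * p for w, p in zip(weights, point)) + bias

# Made-up 1D data: the inner class sits between the two halves of the outer
# class, so no single cut point on the original line can separate them.
inner = [-0.5, 0.0, 0.5]        # class -1
outer = [-2.0, -1.5, 1.5, 2.0]  # class +1

# Hand-picked (not trained) hyperplane in the mapped space: x^2 = 1.25.
weights, bias = (0.0, 1.0), -1.25

inner_scores = [linear_decision(feature_map(x), weights, bias) for x in inner]
outer_scores = [linear_decision(feature_map(x), weights, bias) for x in outer]
print(inner_scores)
print(outer_scores)
```

Every inner point gets a negative score and every outer point a positive one. An actual SVC would learn the weights and bias from the data, choosing the hyperplane with the largest margin.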


    Random Forest Classifier (RFC):
    RFC creates many new datasets by randomly resampling the original training data. It builds a decision tree from each new dataset. It then collects a vote from each decision tree for which category a new data point should belong in, and the data point is placed in the category with the most votes. This model had a 100% accuracy.

    RFC Results
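The resample-then-vote idea can be sketched in plain Python as follows. This is a heavily simplified illustration, not the model behind the results: each "tree" here is a single-split stump on one randomly chosen feature, whereas a real random forest grows deeper trees. The data, label names, and helper functions are all made up.

```python
import random
from collections import Counter

def majority(labels):
    """Most common label in a list."""
    return Counter(labels).most_common(1)[0][0]

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) examples at random, with replacement."""
    indices = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in indices], [labels[i] for i in indices]

def train_stump(rows, labels, rng):
    """Fit a one-split tree (feature, threshold, left label, right label)."""
    feature = rng.randrange(len(rows[0]))
    best = None
    for threshold in sorted({row[feature] for row in rows}):
        left = [lab for row, lab in zip(rows, labels) if row[feature] <= threshold]
        right = [lab for row, lab in zip(rows, labels) if row[feature] > threshold]
        if not right:
            continue  # this threshold does not split the sample
        correct = left.count(majority(left)) + right.count(majority(right))
        if best is None or correct > best[0]:
            best = (correct, threshold, majority(left), majority(right))
    if best is None:  # every value identical: always predict the majority
        label = majority(labels)
        return feature, rows[0][feature], label, label
    return (feature,) + best[1:]

def train_forest(rows, labels, n_trees=25, seed=0):
    """Train n_trees stumps, each on its own bootstrap sample."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        sample_rows, sample_labels = bootstrap_sample(rows, labels, rng)
        forest.append(train_stump(sample_rows, sample_labels, rng))
    return forest

def forest_predict(forest, row):
    """Collect one vote per stump and return the most common label."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)

# Made-up feature pairs for illustration only.
rows = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (4.0, 5.0), (4.5, 5.5), (3.8, 4.9)]
labels = ["no_alzheimers"] * 3 + ["alzheimers"] * 3

forest = train_forest(rows, labels)
print(forest_predict(forest, (0.5, 1.0)))
print(forest_predict(forest, (5.0, 6.0)))
```

Voting across many trees trained on different resamples tends to smooth out the mistakes of any single tree.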


    Logistic Regression Classifier (LRC):
    LRC makes predictions based on the sigmoid function, which is an S-shaped curve that turns any number into a probability between 0 and 1. Although the model returns probabilities, the final output is a label assigned by comparing the probability with a threshold, which makes it a classification algorithm. This model had a 98% accuracy.

    LRC Results
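The probability-then-threshold step described above can be sketched in plain Python. The weights, bias, and feature values below are hypothetical numbers chosen for illustration; a trained model would learn its weights and bias from the data.

```python
import math

def sigmoid(z):
    """S-shaped curve that maps any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Probability of the positive class for one example."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Turn the probability into a 0/1 label by comparing it with a threshold."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical weights and bias for two made-up, already-scaled features.
weights, bias = (2.0, -1.0), 0.0

print(predict_proba((1.0, 0.5), weights, bias))
print(predict_label((1.0, 0.5), weights, bias))
print(predict_label((0.2, 1.5), weights, bias))
```

Raising the threshold above 0.5 makes the model more cautious about predicting the positive class, while lowering it makes the model more sensitive.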


Conclusions

We did not expect our models to be so accurate at predicting whether or not a person had Alzheimer's. We were pleasantly surprised to see such high accuracies, and further hyperparameter tuning helped make our models more generally applicable to other data. We expected RFC to be our best machine learning model for this classification system, so it was interesting that SVC and LRC were nearly as accurate. The high accuracies may partly reflect our small dataset (only 317 instances to train and test on). Demographic, socioeconomic, and brain features (such as estimated total intracranial volume and normalized whole brain volume) appear to be highly effective in determining whether people have Alzheimer's disease.

Such AI detection systems are easy to use and can help patients seek treatments early -- before Alzheimer's disease causes serious harm.

Conclusion Scores

Meet the Team!

We are a team of high schoolers.

Chetan Khairnar (Instructor): Short bio

Ashni Kumar (Product Manager): Short bio

Heath Fry (Machine Learning Engineer): Short bio

Francisco Apraiz (Data Scientist): Short bio

Jacob Hanson (Data Scientist): Short bio

Isaiah Johnson (Web Designer): Short bio