OSTEOPOROSIS FRACTURE PREDICTION

A comprehensive data science project assessing risk factors and predicting bone fractures in women with osteoporosis using various statistical models on the glow_bonemed dataset.

Collaboration: Waleed Amer & Nolan Dulude | Data Science Masters Program

Project Objective

Assess risk factors and predict bone fractures in women with osteoporosis using various statistical models. Compare model performance to identify the best predictive approach for clinical decision-making.

Dataset Overview

The analysis uses the glow_bonemed dataset, which contains clinical variables including BMI, age, prior fractures, bone medications, and demographic factors, to predict first-year fracture risk.

Methodology Workflow

1. DATA CLEANUP

Identify missing values and either impute them or omit the affected records

2. EDA

Explore dataset through visualizations to understand interactions

3. SIMPLE MODEL

Fit a simple logistic regression model informed by the EDA

4. CROSS VALIDATION

Train and test a more complex model with interaction terms using cross-validation

5. QDA MODEL

Use QDA to achieve improved model performance

6. RANDOM FOREST

Use Random Forest to further improve the performance metrics
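The cross-validation step in the workflow above can be sketched as follows. This is a minimal illustration of how k-fold splits are formed, not the project's actual code: the function name `k_fold_indices`, the choice of 5 folds, and the seed are all assumptions, since the source does not state how the folds were built.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; k and seed are assumed values.
    """
    indices = list(range(n_samples))
    # Shuffle once so each fold is a random subset of the rows.
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k nearly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


splits = list(k_fold_indices(n_samples=10, k=5))
for train_idx, test_idx in splits:
    print(sorted(test_idx), "held out;", len(train_idx), "rows used for training")
```

Each row is held out exactly once, so every performance estimate comes from data the model did not see during fitting.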

Model Performance Results

Random Forest AUROC: 0.98

QDA AUROC: 0.83

CV Interaction AUROC: 0.75

Simple Logistic AUROC: 0.57

Model	Sensitivity	Specificity	PPV	NPV	AUROC
CV Interaction Model	0.8027	0.5120	0.8315	0.4638	0.7463
QDA	0.9867	0.6800	0.9024	0.9444	0.8333
Random Forest	1.00	0.96	0.9868	1.00	0.9800
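The four threshold-based columns in the table all derive from a 2x2 confusion matrix. The sketch below shows the standard definitions. The counts are an assumption: the source does not report its confusion matrices, and these values were chosen only because they are consistent with the QDA row. The function name `classification_metrics` is likewise hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV, and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # true positives among all actual positives
        "specificity": tn / (tn + fp),  # true negatives among all actual negatives
        "ppv": tp / (tp + fp),          # true positives among all predicted positives
        "npv": tn / (tn + fn),          # true negatives among all predicted negatives
    }


# Assumed counts for illustration only; not taken from the project's output.
metrics = classification_metrics(tp=370, fp=40, tn=85, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on the class balance of the evaluated data, so they can shift when the model is applied to a population with a different fracture rate.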

Key Insights & Risk Factors

BMI & Age Impact

Higher BMI and age increase fracture risk, with odds ratios of 1.5 for BMI and 2.0 for age over 65

Smoking & Prior Fractures

Smokers and those with prior fractures have higher risks, with smoking increasing odds by 1.8 times and prior fractures by 2.5 times

Menopause & Family History

Premature menopause and maternal hip fractures increase risk, with odds ratios of 1.6 and 1.9, respectively

Bone Medication Adherence

Adherence to bone medications reduces fracture risk by 50%, as shown in our random forest model

Physical Function

Needing assistive devices to stand increases risk, with an odds ratio of 1.7. Strength and balance training is recommended

Model Performance

Random Forest outperformed all other models, likely because it can capture non-linear relationships and feature interactions
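The odds ratios quoted in this section multiply odds, not probabilities, which is easy to misread. The sketch below converts a logistic regression coefficient to an odds ratio and shows how an odds ratio shifts an absolute risk. The 20% baseline risk is an assumed value for illustration, and both function names are hypothetical.

```python
import math


def odds_ratio_from_coefficient(beta):
    """A logistic regression coefficient maps to an odds ratio via exp(beta)."""
    return math.exp(beta)


def apply_odds_ratio(baseline_prob, odds_ratio):
    """Convert a baseline probability to the probability implied by an odds ratio."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1 + new_odds)


# Assumed baseline risk of 0.20; the 2.5 odds ratio is the prior-fracture figure above.
risk_with_prior_fracture = apply_odds_ratio(0.20, 2.5)
print(f"{risk_with_prior_fracture:.3f}")
```

Under that assumed baseline, an odds ratio of 2.5 raises the risk from 0.20 to about 0.385, which is less than a 2.5-fold increase in probability.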

Conclusions & Clinical Implications

Model Performance

The Random Forest model achieved the highest AUROC of 0.98, substantially outperforming the simpler approaches. The simple logistic regression had weak predictive ability (AUROC 0.57, only slightly better than the chance level of 0.50), while the cross-validated interaction model (AUROC 0.75) and QDA (AUROC 0.83) showed progressive improvement.

Clinical Relevance

AUROC and sensitivity are crucial metrics in clinical contexts: AUROC summarizes how well a model separates the two classes across all classification thresholds, while high sensitivity minimizes false negatives, ensuring high-risk patients are identified for preventive interventions.
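To make the AUROC interpretation concrete, it equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The labels and scores are made-up toy values and the function name `auroc` is hypothetical; a real analysis would normally use a library routine instead.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data for illustration only; not drawn from glow_bonemed.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
example_auc = auroc(labels, scores)
print(f"{example_auc:.3f}")
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so the result is 8/9. A model that scores at random would land near 0.50, the reference point for judging the simple logistic model's 0.57.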

Future Directions

With additional time, the project would benefit from advanced feature engineering, external validation on diverse populations, and exploration of deep learning approaches for enhanced predictive accuracy.
