Sarah McDougall is a senior at Bucknell University, located in Lewisburg, Pennsylvania. She is majoring in Applied Mathematical Sciences with a concentration in statistics. Upon graduation, Sarah intends to pursue a career in biomedical data science / healthcare analytics.
Weekly Work Log
Week 1
- Orientation and REU expectations
- Data Science Boot Camp (Python packages, data visualization, and classification methods)
- Initial meeting with project mentor
- Talk: Good Research Practices
- Brainstormed project goals and milestones
- Completed research on opioid abuse/dependence and opioid use disorder (OUD)
Week 2
- Completed Responsible Conduct of Research (RCR) CITI Training
- Completed Medical College of Wisconsin (MCW) Biomedical Research CITI Training
- Talk: Technical Writing
- Completed research on statistical analyses of OUD
- Completed research on machine learning methods for predicting OUD, overdose readmissions
- Met with project group about preliminary work, next steps
Week 3
- Watched and read tutorials about scikit-learn Python package
- Completed research on machine learning methods for predicting OUD, overdose readmissions
- Met with project group to discuss datasets and exploratory data analysis
- Talk: Research Presentation by Dr. Praveen + Student Check-in
- Compiled BibTeX references
- Began data preprocessing on encounter and diagnosis datasets
Week 4
- Completed data wrangling and preprocessing for hospital readmission component of project
- Began Exploratory Data Analysis (EDA) for hospital readmission component of project
- Talk: Research Presentation by Dr. Bialkowski + Student Check-in
- Planned approach for data wrangling and preprocessing for OUD component of project
Week 5
- Completed data wrangling and preprocessing for OUD component of project
- Created mini presentation on contributions made toward project so far
- Talk: Good Presentations with Dr. Brylow
- Documented all preprocessing steps and narrowed down potential predictors for machine learning models
- Delivered mini presentation
- Conducted background research on opioid epidemic specifically in Wisconsin and Milwaukee County
Week 6
- Completed EDA for readmission and OUD study
- Gathered background information for final paper
- Talk: Data Ethics with Dr. Zimmer
- Conducted research on logistic regression in Python
- Conducted research on decision trees and random forest in Python
- Talk: Creating Posters with Dr. Brylow
- Tested logistic regression for both readmission and OUD study
- Completed additional background reading on imbalanced datasets, deep learning with EHR
Week 7
- Completed Random Forest, Support Vector Machine, and AdaBoost classification
- Researched hyperparameter tuning and feature selection for classification models
- Read additional papers on deep learning with EHR (specifically RNN, LSTM, GRU)
- Completed deep learning tutorials using TensorFlow and Keras
- Added fields to datasets to capture medications
Week 8
- Transformed data frames into the format required by the deep learning model
- Conducted research on RNN, LSTM, and GRU
- Retrained all models and captured prediction results in Excel sheet
- Talk: Grad School - How and Why?
- Created charts of summary statistics for final paper
- Wrote related work, data preprocessing, limitations, and future work sections of paper
Week 9
- Created data preprocessing charts, EDA tables, results table, ROC curve visualizations, confusion matrix visualizations
- Wrote background information, methods, and results sections of paper
- Talk: Industry Panel with Northwestern Mutual
- Added features to OUD dataset and reran all ML models
- Brainstormed information to include in poster and final presentation
Week 10
- Deliver final presentation
- Present final poster at virtual poster session
- Finish and submit final paper