Difference between revisions of "User:Smcdougall"
From REU@MU
Latest revision as of 20:18, 7 August 2020
About
Sarah McDougall is a senior at Bucknell University, located in Lewisburg, Pennsylvania. She is majoring in Applied Mathematical Sciences with a concentration in statistics. Upon graduation, Sarah intends to pursue a career in biomedical data science / healthcare analytics.
University Email: snm009@bucknell.edu
Weekly Work Log
Week 1
- Orientation and REU expectations
- Data Science Boot Camp (Python packages, data visualization, and classification methods)
- Initial meeting with project mentor
- Talk: Good Research Practices
- Brainstormed project goals and milestones
- Completed research on opioid abuse/dependence and opioid use disorder (OUD)
Week 2
- Completed Responsible Conduct of Research (RCR) CITI Training
- Completed Medical College of Wisconsin (MCW) Biomedical Research CITI Training
- Talk: Technical Writing
- Completed research on statistical analyses of OUD
- Completed research on machine learning methods for predicting OUD, overdose readmissions
- Met with project group about preliminary work, next steps
Week 3
- Watched and read tutorials about scikit-learn Python package
- Completed research on machine learning methods for predicting OUD, overdose readmissions
- Met with project group to discuss datasets and exploratory data analysis
- Talk: Research Presentation by Dr. Praveen + Student Check-in
- Compiled BibTeX references
- Began data preprocessing on encounter and diagnosis datasets
Week 4
- Completed data wrangling and preprocessing for hospital readmission component of project
- Began Exploratory Data Analysis (EDA) for hospital readmission component of project
- Talk: Research Presentation by Dr. Bialkowski + Student Check-in
- Planned approach for data wrangling and preprocessing for OUD component of project
Week 5
- Completed data wrangling and preprocessing for OUD component of project
- Created mini presentation on contributions made toward project so far
- Talk: Good Presentations with Dr. Brylow
- Recorded all preprocessing documentation and narrowed down potential predictors for machine learning models
- Delivered mini presentation
- Conducted background research on opioid epidemic specifically in Wisconsin and Milwaukee County
Week 6
- Completed EDA for readmission and OUD study
- Gathered background information for final paper
- Talk: Data Ethics with Dr. Zimmer
- Conducted research on logistic regression in Python
- Conducted research on decision trees and random forest in Python
- Talk: Creating Posters with Dr. Brylow
- Tested logistic regression for both readmission and OUD study
- Completed additional background reading on imbalanced datasets, deep learning with EHR
Week 7
- Completed Random Forest, Support Vector Machine, and AdaBoost classification
- Researched hyperparameter tuning and feature selection for classification models
- Read additional papers on deep learning with EHR (specifically RNN, LSTM, GRU)
- Completed deep learning tutorials using TensorFlow and Keras
- Added fields to datasets for capturing medications
Week 8
- Transformed data frames into a format suitable for the deep learning model
- Conducted research on RNN, LSTM, and GRU
- Retrained all models and captured prediction results in Excel sheet
- Talk: Grad School - How and Why?
- Created charts of summary statistics for final paper
- Wrote related work, data preprocessing, limitations, and future work sections of paper
Week 9
- Created data preprocessing charts, EDA tables, results table, ROC curve visualizations, and confusion matrix visualizations
- Wrote background information, methods, and results sections of paper
- Talk: Industry Panel with Northwestern Mutual
- Added features to OUD dataset and reran all ML models
- Brainstormed information to include in poster and final presentation
Week 10
- Deliver final presentation
- Present final poster at virtual poster session
- Finish and submit final paper