Difference between revisions of "User:Smcdougall"

From REU@MU
 
Latest revision as of 20:18, 7 August 2020

About

Sarah McDougall is a senior at Bucknell University, located in Lewisburg, Pennsylvania. She is majoring in Applied Mathematical Sciences with a concentration in statistics. Upon graduation, Sarah intends to pursue a career in biomedical data science / healthcare analytics.

University Email: snm009@bucknell.edu

Weekly Work Log

Week 1

  • Orientation and REU expectations
  • Data Science Boot Camp (Python packages, data visualization, and classification methods)
  • Initial meeting with project mentor
  • Talk: Good Research Practices
  • Brainstormed project goals and milestones
  • Completed research on opioid abuse/dependence and opioid use disorder (OUD)

Week 2

  • Completed Responsible Conduct of Research (RCR) CITI Training
  • Completed Medical College of Wisconsin (MCW) Biomedical Research CITI Training
  • Talk: Technical Writing
  • Completed research on statistical analyses of OUD
  • Completed research on machine learning methods for predicting OUD, overdose readmissions
  • Met with project group about preliminary work, next steps

Week 3

  • Watched and read tutorials about scikit-learn Python package
  • Completed research on machine learning methods for predicting OUD, overdose readmissions
  • Met with project group to discuss datasets and exploratory data analysis
  • Talk: Research Presentation by Dr. Praveen + Student Check-in
  • Compiled BibTeX references
  • Began data preprocessing on encounter and diagnosis datasets

Week 4

  • Completed data wrangling and preprocessing for hospital readmission component of project
  • Began Exploratory Data Analysis (EDA) for hospital readmission component of project
  • Talk: Research Presentation by Dr. Bialkowski + Student Check-in
  • Planned approach for data wrangling and preprocessing for OUD component of project

Week 5

  • Completed data wrangling and preprocessing for OUD component of project
  • Created mini presentation on contributions made toward project so far
  • Talk: Good Presentations with Dr. Brylow
  • Documented all preprocessing steps and narrowed down potential predictors for machine learning models
  • Delivered mini presentation
  • Conducted background research on the opioid epidemic, specifically in Wisconsin and Milwaukee County

Week 6

  • Completed EDA for readmission and OUD study
  • Gathered background information for final paper
  • Talk: Data Ethics with Dr. Zimmer
  • Conducted research on logistic regression in Python
  • Conducted research on decision trees and random forest in Python
  • Talk: Creating Posters with Dr. Brylow
  • Tested logistic regression for both readmission and OUD study
  • Completed additional background reading on imbalanced datasets, deep learning with EHR

Week 7

  • Completed Random Forest, Support Vector Machine, and AdaBoost classification
  • Researched hyperparameter tuning and feature selection for classification models
  • Read additional papers on deep learning with EHR (specifically RNN, LSTM, GRU)
  • Completed deep learning tutorials using TensorFlow and Keras
  • Added fields to datasets for capturing medications

Week 8

  • Transformed data frames into the format needed for the deep learning model
  • Conducted research on RNN, LSTM, and GRU
  • Retrained all models and captured prediction results in Excel sheet
  • Talk: Grad School - How and Why?
  • Created charts of summary statistics for final paper
  • Wrote related work, data preprocessing, limitations, and future work sections of paper

Week 9

  • Created data preprocessing charts, EDA tables, results table, ROC curve visualizations, confusion matrix visualizations
  • Wrote background information, methods, and results sections of paper
  • Talk: Industry Panel with Northwestern Mutual
  • Added features to OUD dataset and reran all ML models
  • Brainstormed information to include in poster and final presentation

Week 10

  • Deliver final presentation
  • Present final poster at virtual poster session
  • Finish and submit final paper