Revision as of 21:52, 1 August 2020 by Slogan

About Me

My name is Sarah Logan and I am a rising senior at Siena College, which is located in upstate New York. I am majoring in Applied Data Science with a concentration in Biology. For the Summer 2020 REU at Marquette University, I am working on a project entitled "Ethical and Privacy Concerns with Suicide Risk Prediction Algorithms" with Dr. Michael Zimmer.

Project Description

(From Marquette REU Home Page)

"Suicide is the tenth leading cause of death in the United States, with more than 40,000 deaths and over one million attempts estimated annually. Despite ongoing efforts to reduce the burden of suicide and suicidal behavior, suicide rates have been increasing (by more than 25% since 1999), and 50 years of research has failed to identify any robust predictors. Even though nearly half of all suicide decedents have contact with a healthcare professional in the month before their death, suicide risk is rarely detected in such cases. There is thus a valuable and largely untapped opportunity to help at-risk individuals who interact with the healthcare system shortly before their suicide attempt. New research has used machine learning methods to analyze electronic health record (EHR) data to successfully detect more than 1/3 of first-episode suicidal behavior cases, on average 3 years in advance, with at least 90% specificity. Given that these predictions are based only on the structured data elements in the EHR (medications, diagnostic codes, procedure codes, etc., extracted from Partners Research Patient Data Registry, RPDR), there exists a desire for improvement by integrating additional information, such as public records datasets containing financial, legal, life event and sociodemographic data. This project will explore the ethical and privacy implications of integrating publicly available “socioeconomic health attributes” to assist healthcare organizations with their suicide risk analytics and predictive modeling activities. The project will be multi-method, combining a conceptual investigation of ethical dimensions, a technical analysis of proposed algorithmic modeling, and a study of user perceptions about potential privacy threats of such efforts."

Work Log

Week 1

  • Attended REU Orientation
  • Mini Data Science Bootcamp
  • Attended good research practices talk by Dr. Brylow
  • Met with Dr. Zimmer to discuss planning of research goals and survey methodologies
  • Developed research questions
  • Began literature review of Big Data and data ethics
  • Read about factorial vignettes

Week 2

  • Responsible Conduct of Research Training with Dr. Brylow and Dr. Praveen
  • Completed CITI modules
  • Continued literature review of Big Data and data ethics
  • Looked at examples of surveys and how they analyzed their data
  • Met with Dr. Zimmer to discuss readings and survey logistics

Week 3

  • Examined the structure of previous privacy surveys and the accompanying data analysis
  • Explored the Qualtrics platform
  • Reviewed statistical analysis procedures
  • Researched articles related to online privacy concerns, privacy concerns in a healthcare setting, and privacy regulations
  • Wrote questions for our survey
  • Attended research talk by Dr. Praveen
  • Met with Dr. Zimmer to discuss survey questions
  • Revised survey questions

Week 4

  • Continued revising survey questions throughout the week
  • Read literature about different frameworks of understanding privacy
  • Met with Dr. Zimmer to discuss privacy readings and survey questions
  • Read literature about Big Data
  • Attended research talk by Dr. Bialkowski and REU group meeting
  • Entered survey questions into Qualtrics
  • Began working on mini presentation

Week 5

  • Made edits to the survey in Qualtrics
  • Met with Dr. Zimmer to discuss survey and presentation
  • REU group meeting on Good Presentations
  • Prepared for mini presentation
  • ESP/SPARK meeting
  • Attended and presented at the Mini Presentations REU meeting
  • Reviewed R
  • Examined data analysis of a previous survey conducted by Dr. Zimmer

Week 6

  • Presentation by Dr. Zimmer on data ethics
  • Set up outline for data analysis in Excel and R
  • Learned how to run cross-tabulations and chi-square tests in R
  • REU group meeting on creating posters
  • Read more privacy articles
  • Completed paperwork for MGH
  • Researched articles about suicide prediction factors

Week 7

  • Started writing Methods section of paper: Survey Instrument, Data Collection, and Data Analysis
  • Continued working on Excel spreadsheet and R script for data analysis
  • ESP/SPARK Meeting
  • REU Group meeting
  • Wrote down ideas for Introduction section
  • Reviewed some privacy articles to help with Introduction
  • Survey soft launch

Week 8

  • Used soft launch data to run through data analysis plan
  • Survey officially launched
  • Revised data analysis plan
  • Revised Methods and Introduction sections
  • REU Group meeting on graduate school
  • Reviewed data from survey to look for errors
  • Met with Dr. Zimmer to discuss survey data
  • Inputted frequency counts into Excel for every question

Week 9

  • Northwestern Mutual Data Science Institute Industry Panel
  • Split respondents into 'high' and 'low' concern groups and entered frequency counts for survey questions into Excel
  • Made bar charts for all questions for total respondents, high concern group, and low concern group
  • Summed the total responses to the data attribute concern questions, vignette concern and factor questions for total respondents, high concern group, and low concern group
  • Conducted chi-square tests of privacy concern against demographics and of privacy concern against vignette concern
  • Aggregated useful comments from survey
  • Made boxplots and conducted ANOVAs for demographic questions and privacy score
  • REU Group meeting
  • Met with Dr. Zimmer to discuss survey data and analysis
  • Created stacked bar charts to compare knowledge of data category with concern over data category
  • Broke down each vignette based on concern and compared how concern groups responded to the factor question of that vignette
  • Divided suicide prediction vignette based on concern level and analyzed how people in the no concern and some concern groups responded to the data attribute questions
  • Continued working on paper
  • Patient Focus Groups meeting
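
The cross-tabulation and chi-square steps in the log can be sketched as follows. This is a minimal illustration in Python, not the R script or Excel workbook used for the project. The function names `crosstab` and `chi_square` and the toy responses are hypothetical and are not survey data.

```python
from collections import Counter


def crosstab(rows, cols):
    """Count how often each (row label, column label) pair occurs."""
    counts = Counter(zip(rows, cols))
    row_labels = sorted(set(rows))
    col_labels = sorted(set(cols))
    table = [[counts[(r, c)] for c in col_labels] for r in row_labels]
    return row_labels, col_labels, table


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom.

    Assumes every row and every column of the table has at least one
    response, so that no expected count is zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up responses for illustration only: a concern group per respondent
# and that respondent's answer to one hypothetical vignette question.
concern_group = ["high", "high", "high", "low", "low", "low", "low", "high"]
vignette_answer = ["concerned", "concerned", "not concerned", "not concerned",
                   "not concerned", "concerned", "not concerned", "concerned"]

row_labels, col_labels, table = crosstab(concern_group, vignette_answer)
stat, dof = chi_square(table)
print(row_labels, col_labels, table)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic would then be compared against a chi-square distribution with the reported degrees of freedom to get a p-value; with expected counts as small as in this toy table, an exact test would usually be preferred.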