Difference between revisions of "User:ZFarahany"
From REU@MU
Latest revision as of 16:47, 7 August 2021
Zach Farahany's profile
About me: Hi, I'm Zach. I'm a Data Science and Computational Mathematics major at Marquette University. My research project is Predicting Risk of Opioid Use Disorder with a focus on the impact of Covid-19.
Contents
- 1 Work Log
- 2 Literature Summaries
- 2.1 Predicting Opioid Overdose Readmission and Opioid Use Disorder with Machine Learning
- 2.2 Predicting Opioid Use Disorder using Random Forest
- 2.3 A clash of epidemics: Impact of the COVID-19 pandemic response on opioid overdose
- 2.4 COVID-19 risk and outcomes in patients with substance use disorders: analyses from electronic health records in the United States
- 2.5 The Opioid Crisis: a Comprehensive Overview
- 2.6 A comprehensive review of COVID-19 characteristics
- 2.7 Artificial intelligence and machine learning to fight COVID-19
- 2.8 The Hidden Epidemic of Opioid Overdoses During the Coronavirus Disease 2019 Pandemic
- 2.9 Prediction Modeling Using EHR Data: Challenges, Strategies, and a Comparison of Machine Learning Approaches
- 2.10 Review of Covid-19 vaccine clinical trials – A puzzle with missing pieces
- 2.11 Psychological aspects of Covid-19
- 2.12 Predictors of COVID-19 Vaccine Hesitancy: Socio-demographics, Co-Morbidity, and Past Experience of Racial Discrimination
Work Log
Week 1
Tuesday
- Attended REU Orientation and learned beginner Python
Wednesday
- Learned data visualization and basic machine learning
Thursday
- Good research practices talk from Dr. Brylow
- Meeting with Dr. Praveen for project expectations
Friday
- Made personal webpage and completed research plan
Sunday
- Read "Predicting Opioid Overdose Readmission and Opioid Use Disorder with Machine Learning"
Week 2
Monday
- Attended data ethics talk from Dr. Brylow
- Read "Predicting Opioid Use Disorder using Random Forest" by Wadekar et al.
Tuesday
- Meeting w/ Dr. Praveen
Wednesday
- Completed Basic RCR CITI module
Thursday
- Completed Biomedical CITI module
- Read "A clash of epidemics: Impact of the COVID-19 pandemic response on opioid overdose" by Linas et al.
- Read "COVID-19 risk and outcomes in patients with substance use disorders: analyses from electronic health records in the United States" by Wang et al.
Friday
- Researched ML terms such as association rule, support, confidence, lift, conviction
- Read "The Opioid Crisis: a Comprehensive Overview"
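The association rule terms researched above (support, confidence, lift, conviction) can be sketched on a toy example. This is a minimal illustration using the standard textbook definitions; the transactions, item names, and function names are hypothetical and are not taken from the project data.

```python
import math
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical toy transactions: each set lists items recorded together.
transactions: List[FrozenSet[str]] = [
    frozenset({"opioid_rx", "anxiety"}),
    frozenset({"opioid_rx", "anxiety", "oud"}),
    frozenset({"opioid_rx", "oud"}),
    frozenset({"anxiety"}),
    frozenset({"opioid_rx", "anxiety", "oud"}),
]


def support(itemset: Iterable[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in data) / len(data)


def rule_metrics(antecedent: Iterable[str], consequent: Iterable[str],
                 data: List[FrozenSet[str]]) -> Dict[str, float]:
    """Metrics for the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    supp_a = support(a, data)
    supp_c = support(c, data)
    supp_ac = support(a | c, data)
    if supp_a == 0:
        raise ValueError("antecedent never occurs, so confidence is undefined")
    confidence = supp_ac / supp_a      # P(consequent | antecedent)
    lift = confidence / supp_c         # > 1 means positive association
    # Conviction is infinite when the rule never fails.
    conviction = math.inf if confidence == 1 else (1 - supp_c) / (1 - confidence)
    return {"support": supp_ac, "confidence": confidence,
            "lift": lift, "conviction": conviction}


metrics = rule_metrics({"opioid_rx", "anxiety"}, {"oud"}, transactions)
print(metrics)
```

A lift above 1 suggests the antecedent and consequent occur together more often than they would if independent; conviction compares how often the rule would fail under independence with how often it actually fails.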
Week 3
Monday
- Completed CITI Session 1-3
- Completed Assignment 1-3
Tuesday
- Read "A comprehensive review of COVID-19 characteristics"
- Meeting with Dr. Praveen
Wednesday
- Short presentation from Dr. Praveen
Thursday
- Read "The Hidden Epidemic of Opioid Overdoses During the Coronavirus Disease 2019 Pandemic"
Week 4
Monday
- Read "Prediction Modeling Using EHR Data: Challenges, Strategies, and a Comparison of Machine Learning Approaches"
- Read "Psychological aspects of Covid-19"
Tuesday
- Read "Review of Covid-19 vaccine clinical trials – A puzzle with missing pieces"
- Data lab meeting with Dr. Praveen
Wednesday
- Talk from a faculty member
- Trying to sort out Marquette VPN and remote access to a computer
Thursday
- Read "Predictors of COVID-19 Vaccine Hesitancy: Socio-demographics, Co-Morbidity, and Past Experience of Racial Discrimination"
Week 5
Monday
- Prepared presentation for Tuesday
- Attended talk on giving good research presentations from Dr. Brylow
Tuesday
- DataLab meeting
- Presented, received feedback
- Updated presentation after feedback
- Looking into Google Cloud services or Colab for computing power
- Left my computer downloading data overnight
Wednesday
- Purchased Colab Pro for the extra RAM needed
- Started to figure out how to preprocess data
Thursday
- More data preprocessing
Friday
- Loop-based preprocessing will not work; need other methods
- Looking into various pandas functions for preprocessing, such as groupby
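A rough sketch of the kind of loop-free preprocessing considered above, using pandas `crosstab` and `groupby`. The table layout, column names, and diagnosis codes are hypothetical placeholders; the actual project pipeline is not shown in this log.

```python
import pandas as pd

# Hypothetical long-format diagnosis table: one row per (patient, diagnosis).
diagnoses = pd.DataFrame({
    "patient_id": [1, 1, 2, 3, 3, 3],
    "dx_code": ["A", "B", "A", "C", "A", "A"],
})


def diagnosis_features(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per patient with a 0/1 column per diagnosis code."""
    # Count occurrences of each code per patient in one vectorized call.
    counts = pd.crosstab(df["patient_id"], df["dx_code"])
    # Turn counts into presence/absence flags.
    flags = (counts > 0).astype(int)
    flags.columns = [f"dx_{code}" for code in flags.columns]
    # Number of distinct codes per patient, aligned on the patient_id index.
    flags["n_diagnoses"] = df.groupby("patient_id")["dx_code"].nunique()
    return flags.reset_index()


features = diagnosis_features(diagnoses)
print(features)
```

Because the grouping runs inside pandas rather than in a Python `for` loop over patients, this style tends to scale much better on large record files.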
Week 6
- Used diagnosis file from SUD dataset for dummy data
- Sort by unique instead of using a loop
- Create diagnosis preprocessing pipeline that can be a template for all other preprocessing
Week 7
- Focus on social demographic preprocessing
- Very cumbersome preprocessing
- Also did vitals preprocessing
Week 8
- Focus on social demographic preprocessing
- Very cumbersome preprocessing
- Also did vitals preprocessing
Week 9
- Procedures and problem list preprocessing
Week 10
- Finish paper, poster, and presentation
Some information below may be improperly paraphrased from an article, so do not copy
Literature Summaries
Predicting Opioid Overdose Readmission and Opioid Use Disorder with Machine Learning
Objectives
- Use multiple machine learning models and multiple data types to predict the likelihood of hospital readmission following an opioid overdose and diagnosis of opioid use disorder after being prescribed an opioid
Useful Info
- AUC (area under the ROC curve) measures how well the model ranks positive cases above negative ones; 0.5 is chance and 1.0 is perfect
- Hospital info is meant to be anonymized, patient identifiers must be removed
- ICD-10 codes are used as identifiers of conditions such as Covid and OUD; the T40 codes cover poisoning by narcotics, including opioid overdose
- Various methods of cleaning and compiling hospital records into more useful data frames
- Various viable machine learning models that could be used on my data
- 10-fold cross-validation of machine learning models
- SMOTE used for class balancing
- Various limitations of hospital data, the data does not include non-registered opioid use or addiction
- "Black box" structure of machine learning models
- Deep learning models such as RNN, GRU, and LSTM
- Doctor AI used for EHR (Electronic Health Record) data
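To make the AUC note above concrete: AUC is normally defined as the area under the ROC curve, which equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The sketch below computes it directly from that pairwise definition and compares it with plain accuracy; the labels and scores are made-up illustrative values.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as P(score of random positive > score of random negative).

    Ties count as half a win. This is O(n_pos * n_neg), fine for a demo.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Share of cases classified correctly at a fixed threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Hypothetical labels (1 = positive outcome) and model scores.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.1]
print(auc(labels, scores), accuracy(labels, scores))
```

On this toy data the two numbers differ, which shows that AUC is a ranking measure rather than a rate of correct predictions at one threshold.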
Questions
- Why is SMOTE necessary?
- What is "gain"?
- How to fix "data noise"?
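On the SMOTE question above: when positive cases are rare, a model can reach high accuracy by always predicting the majority class, so the minority class is often oversampled before training. The following is a simplified sketch of the SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the reference implementation from any library; the function name, parameters, and data are hypothetical.

```python
import random
from typing import List, Sequence


def smote_like(minority: Sequence[Sequence[float]], n_new: int,
               k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create `n_new` synthetic minority points by linear interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic: List[List[float]] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean distance).
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class feature vectors.
minority_points = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0]]
new_points = smote_like(minority_points, n_new=4)
print(new_points)
```

Each synthetic point lies on a line segment between two real minority samples, so the new data stays inside the region the minority class already occupies.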
Predicting Opioid Use Disorder using Random Forest
Objectives
- To use Random Forest on a public dataset to make a predictive model for determining OUD diagnosis
Useful info
- Age of first marijuana use, mental illness status, and age, in that order, are the biggest predictors of OUD in this study
- Useful references for the existence of the opioid crisis
Questions
- What is downsampling?
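On the downsampling question above: downsampling (random undersampling) balances classes by discarding randomly chosen rows from the larger class until every class has as many rows as the smallest one. A minimal sketch, assuming records stored as dictionaries with a hypothetical `oud` label key:

```python
import random
from typing import Dict, List


def downsample(rows: List[Dict], label_key: str = "oud",
               seed: int = 0) -> List[Dict]:
    """Randomly drop rows so every class has as many rows as the smallest."""
    rng = random.Random(seed)
    by_class: Dict[object, List[Dict]] = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced: List[Dict] = []
    for group in by_class.values():
        # Sample without replacement from each class.
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


# Hypothetical imbalanced data: 8 negative rows, 2 positive rows.
records = [{"id": i, "oud": 0} for i in range(8)]
records += [{"id": 100 + i, "oud": 1} for i in range(2)]
balanced = downsample(records)
print(len(balanced), sum(r["oud"] for r in balanced))
```

The trade-off relative to oversampling is that information from the discarded majority rows is lost.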
A clash of epidemics: Impact of the COVID-19 pandemic response on opioid overdose
- Created a simple model (RESPOND: Researching Effective Strategies to Prevent Opioid Death) to show how social distancing could harm people with OUD and worsen the OUD population
- Discussed possible problems the pandemic poses for people with OUD, such as drug supply shortages, mental health problems from social isolation, relapses from lack of community, people not seeking medical help because of distancing, etc.
- Concluded that the pandemic will have a disproportionate effect on the OUD population because of compounding mental, physical, economic, and social problems
- Rough estimate of how much Covid harms the OUD population
COVID-19 risk and outcomes in patients with substance use disorders: analyses from electronic health records in the United States
Useful info
- OUD is the worst substance abuse problem in terms of the additional likelihood of getting Covid
- Opioids and Covid both weaken the respiratory system; opioid overdose deaths result from respiratory failure
- Other SUDs (substance use disorders) affect the cardiovascular, pulmonary, and metabolic systems, all of which are risk factors for Covid
- DSM-5 contains standardized definitions of OUD and other SUDs
- People with OUD often have comorbidities from their drug use, many of which are risk factors for Covid
- Use of MOUDs (medications for opioid use disorder) does not have a significant effect on the prevalence of Covid
- Among the factors analyzed, race had the strongest influence on the prevalence of Covid
- List of specific limitations of EHR data
The Opioid Crisis: a Comprehensive Overview
Useful info
- Comprehensive history of the opioid crisis
- Driving forces of the crisis
- Groups at risk (middle-aged women, pregnant women, veterans, children in sports)
- Comprehensive list of adverse events
- Discussion of legislation used to combat the epidemic, criminalization
A comprehensive review of COVID-19 characteristics
Useful info
- Virology of Covid
- Comparison with other coronaviruses throughout history
- Comprehensive symptomatology of Covid
- Areas of the body targeted by Covid
- Lung hemorrhages as the main cause of death
- Description of transmission, community-based
- Various treatments for Covid (outdated)
- Non-standard traditional medicine works against Covid? Qingfei Paidu? Ayurveda, Siddha, Unani?
Artificial intelligence and machine learning to fight COVID-19
Useful info
- Increased presence of ACE2 enzyme could lead to worse reactions to Covid
- Various ML and NN methods applied to different data types to better treat Covid
- Emphasized importance of universal databases for Covid data to optimize treatment etc.
The Hidden Epidemic of Opioid Overdoses During the Coronavirus Disease 2019 Pandemic
Useful info
- Regular opioid use produces respiratory depression in patients
- Anti-fentanyl vaccine is being used to block overdoses
- Description of fentanyl vaccine
- Social distancing has increased unwitnessed overdoses in Kentucky, which suffers from very high opioid use
- Highlights the stigmatization of addiction and how it reduces support for the fentanyl vaccine
- Stigmatization of hospitals keeps people from getting addiction treatment
Prediction Modeling Using EHR Data: Challenges, Strategies, and a Comparison of Machine Learning Approaches
- There can be difficulty in defining your outcome variables as in this study
- Operationalizing variables
- Some procedures for fixing missing data and explanation of how missing data arise in an EHR
- Various variable selection methods that may be appropriate for my project, such as AIC and BIC
- Explanation of SVM, boosting, and various feature selection techniques
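For the AIC and BIC criteria mentioned above, the standard definitions are AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of fitted parameters, n the number of observations, and ln L the maximized log-likelihood; lower values are preferred. The sketch below applies them to two hypothetical models; the log-likelihoods and sizes are invented for illustration and do not come from the paper.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Hypothetical candidates fitted to the same 200 observations.
n_obs = 200
candidates = {
    "small_model": {"log_likelihood": -120.0, "n_params": 3},
    "large_model": {"log_likelihood": -117.0, "n_params": 10},
}

for name, fit in candidates.items():
    a = aic(fit["log_likelihood"], fit["n_params"])
    b = bic(fit["log_likelihood"], fit["n_params"], n_obs)
    print(f"{name}: AIC={a:.1f} BIC={b:.1f}")
```

In this made-up case the larger model fits slightly better, but both criteria penalize its extra parameters enough to prefer the smaller model; BIC penalizes more heavily as n grows.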
Review of Covid-19 vaccine clinical trials – A puzzle with missing pieces
- Making a long-lasting vaccine is hard
- Further papers about mRNA vaccines
- The scientific community needs to convince people of the safety of the vaccine
- Shows that people with comorbidities do not know whether the vaccine is safe for them
- Link to clinical trials of people with comorbidities
- Clinical trials were ongoing at the time of this study for the elderly, pregnant women, and children
- Testing efficacy against the other variants; results for the UK variant do not seem promising
Psychological aspects of Covid-19
- List of risk factors for mental health problems, including frequent social media use and "misinformation, often aided by sensational popular media headlines and foci"
- Other populations at risk during the pandemic were healthcare workers, who obviously could be put under a lot of stress due to the pandemic
- List of various problems arising from excessive health anxiety
- Adjustment disorder in people struggling to adjust to the pandemic
- Some people have healthy psychological responses to the onset of the pandemic, such as increased community care
- Depressive disorder resulting from the pandemic
- Neurocognitive disorder resulting from catching Covid
- At-risk populations: caregivers of the elderly, people who lost family to the virus, parents
- Statistical increase in non-abusive violence in parent-child relationships, and increased bonding between parents and children
- Possible risks for new mothers, PMD, child-parent recognition with masks
- The "second wave" of the pandemic is mental health-related
Predictors of COVID-19 Vaccine Hesitancy: Socio-demographics, Co-Morbidity, and Past Experience of Racial Discrimination
- The most vaccine-hesitant group is African Americans
- A significant feature of vaccine hesitancy is having reported unfair treatment from authorities