Difference between revisions of "User:KonradJ"

From REU@MU
Revision as of 19:40, 8 July 2023

2023 Week 1: (May 30th - June 4th)

Tuesday (5/30)

  • Attended REU orientation
  • Met with peers and mentors

Wednesday (5/31)

  • Attended lecture led by Brylow about research papers and how to log progress properly
  • Met with mentor to talk about goals and milestones and shared research papers to read (will update tomorrow)

Thursday (6/1)

  • Sadly, most of my day was taken up by moving into my apartment, which limited how much I could work
  • Went over my first research paper with a surface level reading.
    • Assessment of Medical Reports Uncertainty through Topic Modeling and Machine Learning

Friday (6/2)

  • Read various research articles including the one from yesterday more thoroughly
    • Effects of Negation and Uncertainty Stratification on Text-Derived Patient Profile Similarity
    • Extracting Medical Information From Free-Text and Unstructured Patient-Generated Health Data Using Natural Language Processing Methods: Feasibility Study With Real-world Data
  • Took notes on what I read to write up later

Sunday (6/4)

  • Took more notes on the research papers I've already read
  • Began a lengthy research paper
    • Negation and uncertainty detection in clinical texts written in Spanish: a deep learning-based approach
  • Prepared Log for this upcoming week


2023 Week 2: (June 5th - June 11th)

Monday (6/5)

  • Met with group for federally mandated training.

Tuesday (6/6)

  • Met with peers and discussed where everyone was in their process
  • Finished the lengthy research paper from the other day
  • Read the last of my research papers
    • Challenges and opportunities beyond structured data in analysis of electronic health records
  • Got up to the first lab of a Coursera course recommended by my mentor

Wednesday (6/7)

  • Completed 2 Coursera labs
  • Met with peers and discussed milestones before hearing a lecture on technical writing and presenting from Dr. Brylow
  • Met with my mentor and talked about goals for this week
    • Reviewing the relevant data in the form of CSV files


Thursday (6/8)

  • Got to the end of the first section of the Coursera course and began the associated lab
  • Began looking at relevant files for the summer project
    • CSV files that contain patient data that we will use to practice uncertainty quantification strategies

Friday (6/9)

  • Dug deeper into research papers I had read to find the most pertinent files in the MIMIC-III dataset
    • The best file for us seems to be D_ICD_DIAGNOSES.csv, which has the ICD-9 code for each diagnosis
  • Continued online natural language processing course through Coursera
    • Naive Bayes, Bayes Rule, Laplacian smoothing, and log likelihood
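
The course topics listed above can be sketched roughly as follows. This is a minimal illustration of Naive Bayes with Laplacian (add-one) smoothing and log likelihoods, not the course's own code. The two-class setup and the names `train_log_likelihoods` and `score` are assumptions made for the example.

```python
import math
from collections import Counter


def train_log_likelihoods(pos_docs, neg_docs):
    """Return (log_prior, loglik) for a two-class Naive Bayes model.

    pos_docs and neg_docs are lists of token lists (hypothetical inputs).
    """
    pos_counts = Counter(tok for doc in pos_docs for tok in doc)
    neg_counts = Counter(tok for doc in neg_docs for tok in doc)
    vocab = set(pos_counts) | set(neg_counts)
    v = len(vocab)
    n_pos = sum(pos_counts.values())
    n_neg = sum(neg_counts.values())

    loglik = {}
    for tok in vocab:
        # Laplacian smoothing: add 1 to each count and V to each denominator
        # so that unseen words never produce a zero probability.
        p_pos = (pos_counts[tok] + 1) / (n_pos + v)
        p_neg = (neg_counts[tok] + 1) / (n_neg + v)
        loglik[tok] = math.log(p_pos / p_neg)

    # Log prior: ratio of document counts per class.
    log_prior = math.log(len(pos_docs) / len(neg_docs))
    return log_prior, loglik


def score(tokens, log_prior, loglik):
    """Sum of log prior and per-token log likelihoods; tokens outside the vocabulary add 0."""
    return log_prior + sum(loglik.get(tok, 0.0) for tok in tokens)
```

A score above zero favors the first class and a score below zero favors the second. Working in log space turns the product of probabilities from Bayes' rule into a sum, which avoids numerical underflow on long documents.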

Sunday (6/11)

  • Completed week 2 on Naive Bayes in Coursera
    • Got passing grade on week 2 quiz on material and completed all labs
  • Started Week 3 course on Vector Space Models
    • Completed a couple labs
  • Prepared log for upcoming week

2023 Week 3: (June 12th - June 16th)

Monday (6/12)

  • Completed Week 3 courses for the Coursera class I am taking
    • Vector Space Models
  • Dug more into the ICD-9 records

Tuesday (6/13)

  • Completed labs from week 1 and week 2 of the Coursera course
    • Logistic Regression and Naive Bayes

Wednesday (6/14)

  • Heard research talk from Dr. Praveen
  • Talked with peers about where they are in their projects
  • Continued with week 4 of the Coursera lectures

Thursday (6/15)

  • Finished week 4 of the Coursera lectures
    • Machine Translation and Document Search
  • Did the week 3 lab for Coursera
    • Vector Space Models

Friday (6/16)

  • Began messing with preprocessing functions with some columns of the dataset (MIMIC-III, D_ICD_DIAGNOSES)
    • Removed punctuation, made lowercase, tokenized, then stemmed the columns
  • Continued to read up on relevant information about preprocessing and word-to-vector strategies
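
The preprocessing order described above (remove punctuation, lowercase, tokenize, stem) could look roughly like the sketch below. It uses only the standard library. `toy_stem` is a hypothetical stand-in for a real stemmer such as a Porter stemmer, and the sample string is illustrative rather than taken from the dataset.

```python
import string


def toy_stem(token):
    """Hypothetical stand-in for a real stemmer: strips a few common suffixes."""
    for suffix in ("ing", "es", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, remove punctuation, split on whitespace, then stem each token."""
    text = text.lower()
    # Deleting punctuation joins hyphenated words, e.g. "non-specific" -> "nonspecific".
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()
    return [toy_stem(tok) for tok in tokens]
```

For example, `preprocess("Typhoid Fever, unspecified!")` returns `["typhoid", "fever", "unspecifi"]`. The truncated last token shows how aggressive suffix stripping can produce stems that are not real words.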

2023 Week 3: (June 19th - June 23rd)

Tuesday (6/20)

  • Re-read one of the more valuable research papers to get an idea of where to go from here after learning more about the relevant technologies and algorithms
  • Reviewed Coursera materials to better understand the steps in the process of natural language processing
  • Continued to tinker with preprocessing the data before going more in depth this week

Wednesday (6/21)

  • Listened to a great research lecture from Dr. Walt Bialkowski on using data science to predict food shortages at local pantries
  • Talked with peers about their projects after the lecture
  • Began creating some slides and reviewing materials for mini-presentation

Thursday (6/22)

  • Met with my mentor to talk about mini-presentation coming up next week as well as adjustments to my pre-processing methods and what to do next
  • Continued with creation of the mini-presentation and script to go along with it

Friday (6/23)

2023 Week 4: (June 26th - June 30th)

Monday (6/26)

  • Reviewed papers and other materials to write out the script for the presentation
  • Created some slides

Tuesday (6/27)

  • Wrote the script for talk
  • Finished presentation slides
  • Practiced until I felt comfortable with the material

Wednesday (6/28)

  • Gave presentation to my peers
  • Talked with them after about their projects and upcoming plans

Thursday (6/29)

  • Met with my mentor to establish game plan for this week and the upcoming weeks
  • Watched some videos on approaches to the BERT model and picked out some articles

Friday (6/30)

  • Read through articles with implementation notes and code
  • Started attempting to implement it myself
  • Found a new implementation I liked, but I am unsure how to split up the data, or whether I should split it at all before applying the model
  • Talked with mentor about slowing down and working on just the BOW model, since my approach is too big-picture right now

2023 Week 5: (July 3rd - July 7th)

Monday (7/3)

  • Reviewed two research papers for content and ideas on methodology
  • Worked to complete BOW models after slowing down and attempting to take it step by step
    • Had some problems with tokenization and stemming that I am attempting to iron out
  • Word count has also been a problem; trying to find out if it's in the preprocessing or the model itself
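
One way to narrow down whether a word-count problem comes from preprocessing or from the model is to build the bag-of-words (BOW) counts by hand and compare totals. The sketch below assumes documents are already lists of tokens. The names `build_vocab`, `bow_vector`, and `count_mismatch` are hypothetical and are not taken from the project code.

```python
from collections import Counter


def build_vocab(docs):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: i for i, tok in enumerate(vocab)}


def bow_vector(doc, vocab):
    """Count how often each vocabulary token appears in one document."""
    vec = [0] * len(vocab)
    for tok, n in Counter(doc).items():
        if tok in vocab:
            vec[vocab[tok]] = n
    return vec


def count_mismatch(doc, vec):
    """Tokens in the document that did not land in the vector (0 means none were lost)."""
    return len(doc) - sum(vec)
```

If `count_mismatch` is zero for every document but the token lists themselves look wrong, the issue is more likely in tokenization or stemming than in the counting step.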

Tuesday (7/4)

Wednesday (7/5)

  • Watched the rest of my peers' research presentations
  • Looked at some other tokenization methods to help with current preprocessing bug
  • Reviewed Coursera notes and videos to refresh on concepts

Thursday (7/6)

  • Reviewed more of the Coursera materials
  • Continued to look into different tokenizers

Friday (7/7)

  • Ran into some trouble with stemming, so I looked into different methods
  • Still haven't been able to deal with a tokenization error that affects BOW
  • Continued review of Coursera materials