User:KonradJ

2023 Week 1: (May 30th - June 4th)

Tuesday (5/30)

Attended REU orientation
Met with peers and mentors

Wednesday (5/31)

Attended lecture led by Brylow about research papers and how to log progress properly
Met with mentor to talk about goals and milestones and shared research papers to read (will update tomorrow)

Thursday (6/1)

Sadly, most of my day was taken up by moving into my apartment, which limited my ability to work
Did a surface-level reading of my first research paper
- Assessment of Medical Reports Uncertainty through Topic Modeling and Machine Learning

Friday (6/2)

Read various research articles, including a more thorough read of the one from yesterday
- Effects of Negation and Uncertainty Stratification on Text-Derived Patient Profile Similarity
- Extracting Medical Information From Free-Text and Unstructured Patient-Generated Health Data Using Natural Language Processing Methods: Feasibility Study With Real-world Data
Took notes on what I read

Sunday (6/4)

Took more notes on the research papers I've already read
Began a lengthy research paper
- Negation and uncertainty detection in clinical texts written in Spanish: a deep learning-based approach
Prepared log for the upcoming week

2023 Week 2: (June 5th - June 11th)

Monday (6/5)

Met with group for federally mandated training.

Tuesday (6/6)

Met with peers and discussed where everyone was in their process
Finished the lengthy research paper from the other day
Read the last of my research papers
- Challenges and opportunities beyond structured data in analysis of electronic health records
Got up to the first lab of a Coursera course recommended by my mentor

Wednesday (6/7)

Completed 2 Coursera labs
Met with peers and discussed milestones before hearing a lecture on technical writing and presenting from Dr. Brylow
Met with my mentor and talked about goals for this week
- Reviewing the relevant data in the form of CSV files

Thursday (6/8)

Got to the end of the first section of the Coursera course and began the associated lab
Began looking at relevant files for the summer project
- CSV files that contain patient data that we will use to practice uncertainty quantification strategies

Friday (6/9)

Dug deeper into research papers I had read to find the most pertinent files in the MIMIC-III dataset
- Seems like the best file for us would be D_ICD_DIAGNOSES.csv, which has ICD-9 codes for each diagnosis
Continued online natural language processing course through Coursera
- Naive Bayes, Bayes Rule, Laplacian smoothing, and log likelihood
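
A minimal sketch of how Laplacian (add-one) smoothing and the log likelihood fit together in a binary Naive Bayes classifier. The function names, the toy documents, and the "uncertain vs. certain" labels are all made up for illustration; they are not taken from the course labs or from the project data.

```python
import math
from collections import Counter


def train_naive_bayes(docs, labels):
    """Return (log_prior, log_likelihood) for a binary Naive Bayes model.

    docs   -- list of token lists
    labels -- list of 0/1 class labels, one per document
    """
    pos_counts, neg_counts = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        (pos_counts if label == 1 else neg_counts).update(tokens)

    vocab = set(pos_counts) | set(neg_counts)
    n_pos = sum(pos_counts.values())  # total tokens seen in class 1
    n_neg = sum(neg_counts.values())  # total tokens seen in class 0
    v = len(vocab)

    log_likelihood = {}
    for word in vocab:
        # Laplacian smoothing: add 1 to every count so no probability is 0.
        p_pos = (pos_counts[word] + 1) / (n_pos + v)
        p_neg = (neg_counts[word] + 1) / (n_neg + v)
        log_likelihood[word] = math.log(p_pos / p_neg)

    # Log prior: log ratio of document counts per class.
    d_pos = sum(labels)
    d_neg = len(labels) - d_pos
    log_prior = math.log(d_pos / d_neg)
    return log_prior, log_likelihood


def score(tokens, log_prior, log_likelihood):
    """Sum the log prior and per-word log likelihoods; > 0 favors class 1."""
    return log_prior + sum(log_likelihood.get(t, 0.0) for t in tokens)


# Toy corpus (hypothetical): 1 = uncertain statement, 0 = certain statement.
docs = [
    ["possible", "pneumonia"],
    ["no", "pneumonia"],
    ["possible", "fracture"],
    ["no", "fracture"],
]
labels = [1, 0, 1, 0]

log_prior, log_likelihood = train_naive_bayes(docs, labels)
uncertain_score = score(["possible", "infection"], log_prior, log_likelihood)
certain_score = score(["no", "infection"], log_prior, log_likelihood)
```

Words that never appeared in training (here "infection") contribute 0 to the score, so the sign is decided by "possible" or "no" alone.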

Sunday (6/11)

Completed week 2 on Naive Bayes in Coursera
- Got passing grade on week 2 quiz on material and completed all labs
Started Week 3 course on Vector Space Models
- Completed a couple labs
Prepared log for upcoming week

2023 Week 3: (June 12th - June 16th)

Monday (6/12)

Completed Week 3 courses for the Coursera class I am taking
- Vector Space Models
Dug more into the ICD-9 records

Tuesday (6/13)

Completed labs from week 1 and week 2 of Coursera course
- Logistic Regression and Naive Bayes

Wednesday (6/14)

Heard research talk from Dr. Praveen
Talked with peers about where they are in their projects
Continued with week 4 of Coursera lectures

Thursday (6/15)

Finished week 4 of Coursera lectures
- Machine Translation and Document Search
Did week 3 lab for Coursera
- Vector Space Models

Friday (6/16)

Began messing with preprocessing functions with some columns of the dataset (MIMIC-III, D_ICD_DIAGNOSES)
- Removed punctuation, made lowercase, tokenized, then stemmed the columns
Continued to read up on relevant information about preprocessing and word-to-vector strategies
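
The preprocessing steps from today (remove punctuation, lowercase, tokenize, stem) can be sketched as below. This is only an illustration using the Python standard library: the log does not say which libraries were used, `simple_stem` is a hypothetical stand-in for a real stemmer such as Porter's, and the sample title is invented in the style of a diagnosis description rather than copied from the dataset.

```python
import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

# Suffixes tried in order; a crude stand-in for a real stemming algorithm.
_SUFFIXES = ("ing", "ed", "es", "s")


def simple_stem(token):
    """Strip the first matching suffix, keeping at least 3 characters."""
    for suffix in _SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Remove punctuation, lowercase, split on whitespace, then stem."""
    cleaned = text.translate(_PUNCT_TABLE).lower()
    tokens = cleaned.split()
    return [simple_stem(t) for t in tokens]


# Hypothetical sample title, not taken from D_ICD_DIAGNOSES.csv.
tokens = preprocess("Closed fractures of ribs, unspecified.")
# tokens == ['clos', 'fractur', 'of', 'rib', 'unspecifi']
```

Stems like "clos" and "unspecifi" are not real words, which is normal for suffix-stripping stemmers; what matters is that inflected forms of the same word end up with the same stem.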

2023 Week 4: (June 19th - June 23rd)

Tuesday (6/20)

Re-read one of the more valuable research papers to get an idea of where to go from here after learning more about the relevant technologies and algorithms
Reviewed Coursera materials to better understand the steps in the process of natural language processing
Continued to tinker with preprocessing the data before I get more in depth this week

Wednesday (6/21)

Listened to a great research lecture from Dr. Walt Bialkowski on using data science to predict food shortages at local pantries
Talked with peers about their projects after the lecture
Began creating some slides and reviewing materials for mini-presentation

Thursday (6/22)

Met with my mentor to talk about mini-presentation coming up next week as well as adjustments to my pre-processing methods and what to do next
Continued with creation of the mini-presentation and script to go along with it

Friday (6/23)

2023 Week 5: (June 26th - June 30th)

Monday (6/26)

Reviewed papers and other materials for writing script out for presentation
Created some slides

Tuesday (6/27)

Wrote the script for talk
Finished presentation slides
Practiced until I felt comfortable with the material

Wednesday (6/28)

Gave presentation to my peers
Talked with them after about their projects and upcoming plans

Thursday (6/29)

Met with my mentor to establish game plan for this week and the upcoming weeks
Watched some videos on approaches to the BERT model and picked out some articles

Friday (6/30)

Read through articles with implementation notes and code
Started attempting an implementation myself
Found a new implementation I liked, but was confused about how to split up the data, or whether I should even be doing that before applying the model
Talked with mentor about slowing down and just working on the BOW model; I was thinking too big picture right now

2023 Week 6: (July 3rd - July 7th)

Monday (7/3)

Reviewed two research papers for content and ideas on methodology
Worked to complete BOW models after slowing down and attempting to take it step by step
- Had some problems with tokenization and stemming; attempting to iron those out
Word count has also been a problem; trying to find out if it's in the preprocessing or the model itself

Tuesday (7/4)

Wednesday (7/5)

Watched the rest of my peers' research presentations
Looked at some other tokenization methods to help with current preprocessing bug
Reviewed Coursera notes and videos to refresh on concepts

Thursday (7/6)

Reviewed more of the Coursera materials
Continued to look into different tokenizers

Friday (7/7)

Ran into some trouble with stemming, so looked into different methods.
Still haven't been able to deal with a tokenization error that affects BOW
Continued review of Coursera materials

2023 Week 7: (July 10th - July 14th)

Monday (7/10)

Tinkered with preprocessed text attempting to get it into vector form
Looked into alternatives to CountVectorizer()
Made a new notebook and tried different ways of preprocessing data when CountVectorizer() is involved

Tuesday (7/11)

Continued looking into vectorization methods that allow for stemming (hitting a bit of a roadblock)
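
One way to get a bag-of-words vectorization that allows for stemming is to stem inside the tokenizer, so the vocabulary is built from stems rather than raw words. The sketch below does this with only the standard library instead of CountVectorizer(); scikit-learn's CountVectorizer does accept a custom `tokenizer` callable, which is one place the same idea could be plugged in. The `stem` function is a hypothetical stand-in for a real stemmer, and the documents are invented examples.

```python
from collections import Counter


def stem(token):
    """Crude suffix stripper; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def tokenize(text):
    """Lowercase, split on whitespace, and stem each token."""
    return [stem(t) for t in text.lower().split()]


def bag_of_words(documents):
    """Return (vocabulary, vectors) where each vector holds stem counts."""
    tokenized = [tokenize(doc) for doc in documents]
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    index = {tok: i for i, tok in enumerate(vocabulary)}

    vectors = []
    for doc in tokenized:
        row = [0] * len(vocabulary)
        for tok, count in Counter(doc).items():
            row[index[tok]] = count
        vectors.append(row)
    return vocabulary, vectors


# Invented example documents, not taken from the dataset.
docs = [
    "chronic infection of lung",
    "chronic infections of lungs",
    "lung infection",
]
vocabulary, vectors = bag_of_words(docs)
# vocabulary == ['chronic', 'infection', 'lung', 'of']
# vectors    == [[1, 1, 1, 1], [1, 1, 1, 1], [0, 1, 1, 0]]
```

Because stemming happens before the vocabulary is built, the singular and plural documents map to the same count vector.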

Wednesday (7/12)

Learned about do's and don'ts of poster creation from Brylow
Talked to some peers about progress and what we want to do with our papers and posters
Looked into some poster templates and started preparing an outline for the poster content

Thursday (7/13)

Connected with a colleague of my mentor
- Sent me some useful materials I will review tomorrow
Called him and talked about BERT and my project

Friday (7/14)

Reviewed resources from mentor's colleague on YouTube
- Best BERT content I've found so far; it takes its time and breaks it down
Downloaded some notebook files from there and started tinkering with them in Jupyter
