User:Feidler
From REU@MU
Week 1
5/31
- Attended orientation
- Met with advisor (Dr. Michael Zimmer)
- Explored research papers related to Zuckerberg Files
6/1
- Completed CITI modules
- Reviewed Dr. Zimmer's paper
- Explored analytics tools
- Continued reviewing papers related to Zuckerberg Files
6/2
- Met with Dr. Zimmer for further discussion, followed by lunch
- Continued exploring potential tools
- Continued reviewing papers related to Zuckerberg Files
6/3
- Reviewed papers on quantitative textual analysis
- Explored WordStat tool and quanteda R package
Week 2
6/6
- Attended RCR training
- Watched tutorials on WordStat tool
6/7
- Read Yazeed Alhumaidan's dissertation methodology
- Met with mentor to discuss potential tools
- Explored possible tools for analysis and data cleaning
6/8
- Attended talk on technical writing
- Continued searching for possible tools
6/9
- Checked in with advisor
- Began the ordering process for WordStat
- Reviewed WordStat tutorials
6/10
- Installed trial version of WordStat
Week 3
6/13
- Obtained XML file of the archive
- Created archive account
- Began work on script to automate download and collection of transcripts
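A minimal sketch of how a script like this might pull transcript links out of the archive's XML file. The element names (`item`, `link`) and the example URLs are assumptions for illustration only; the actual structure of the archive XML is not recorded in this log.

```python
import xml.etree.ElementTree as ET


def extract_links(xml_text, tag="link"):
    """Return the text of every <tag> element (the tag name is hypothetical)."""
    root = ET.fromstring(xml_text)
    return [el.text.strip() for el in root.iter(tag) if el.text and el.text.strip()]


# Made-up sample standing in for the archive XML.
sample = """<archive>
  <item><title>Post A</title><link>https://example.org/a.pdf</link></item>
  <item><title>Post B</title><link>https://example.org/b.pdf</link></item>
</archive>"""

links = extract_links(sample)
for url in links:
    print(url)
```

Each collected URL could then be fetched in a loop (for example with `urllib.request.urlretrieve`), assuming the archive account permits scripted downloads.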
6/14
- Continued working on script
- Tested draft documents in WordStat
6/15
- Attended research presentation given by Dr. Madiraju
- Began debugging script
6/16
- Finished debugging script
- Organized all blog posts into one file for analysis
6/17
- Automated conversion of .pdf files to .txt
Week 4
6/20
- Cleaned the set of files to exclude interview and video transcripts
- Obtained official license for WordStat
6/21
- Ran initial analysis in WordStat
- Fine-tuned text processing
- Refined some transcript errors
6/22
- Attended research talk given by Dr. Bialkowski
- Finished cleaning typos/errors in transcripts
- Explored frequencies and topic extraction
6/23
- Explored dendrogram analysis
- Explored proximity plots
- Experimented with different aspects of preprocessing
6/24
- Explored link analysis
- Experimented with postprocessing of text
- Tested graphing for frequencies
Week 5
6/27
- Attended talk on presentation
- Produced graphs of initial findings
- Reviewed relevant literature for presentation
6/28
- Made PowerPoint for presentation
- Rehearsed presentation
- Revised relevant graphs
6/29
- Gave presentation of work so far
- Gathered creation dates of posts
- Began process of changing post creation dates
6/30
- Continued coding the change of creation dates
- Experimented with cluster mapping
7/1
- Creation dates successfully changed
- Began experimenting with crosstab tool
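One possible way to script the date change, sketched under the assumption that the goal was to stamp each file with its post's date. Note that `os.utime` sets the access and modification times, not the true creation time, which the Python standard library cannot set portably. The helper name and the example date are hypothetical.

```python
import os
import tempfile
from datetime import datetime


def set_file_date(path, when):
    """Set a file's access and modification times to the given datetime."""
    ts = when.timestamp()
    os.utime(path, (ts, ts))


# Demonstrate on a throwaway file with a made-up post date.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as f:
    path = f.name

set_file_date(path, datetime(2010, 5, 26))
mtime = datetime.fromtimestamp(os.path.getmtime(path))
os.remove(path)
print(mtime.date())
```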
Week 6
7/5
- Continued experimenting with crosstab
- Graphed frequencies over time
- Explored bubble plotting
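The graphing itself was done in WordStat; as a rough illustration of the underlying idea, the sketch below counts how often one term appears per month across dated documents. The function name, the tokenizing rule, and the sample documents are all made up for this example.

```python
import re
from collections import Counter, defaultdict


def term_counts_by_month(docs, term):
    """Count occurrences of `term` per YYYY-MM across (date, text) pairs."""
    counts = defaultdict(int)
    for date, text in docs:
        month = date[:7]  # "YYYY-MM-DD" -> "YYYY-MM"
        tokens = re.findall(r"[a-z']+", text.lower())
        counts[month] += Counter(tokens)[term.lower()]
    return dict(sorted(counts.items()))


# Invented sample documents, not taken from the archive.
docs = [
    ("2010-05-01", "Privacy matters. Privacy controls are simple."),
    ("2010-05-20", "We are sharing more."),
    ("2011-09-22", "Sharing and privacy go together."),
]

result = term_counts_by_month(docs, "privacy")
print(result)
```

Changing the slice on the date string (for example `date[:4]` for years) gives a different time interval for the same counts.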
7/6
- Attended student check-in
- Continued graphing, trying different time intervals
- Explored heatmap tool
7/7
- Explored deviation table
- Examined key words in context
7/8
- Gathered more transcripts from the archive (non-social-media sources)
- Continued graphing different clusters
- Continued examining key words in context
Week 7
7/11
- Converted new PDFs into .txt files
- Began graphing topics instead of words
7/12
- Continued graphing topics
- Began cleaning new transcripts
7/13
- Attended student check-in
- Continued cleaning transcripts
7/14
- Redid conversion of PDFs to .txt with new packages
- Continued cleaning transcripts
7/15
- Attended Dr. Zimmer's talk on Data Ethics
- Continued cleaning transcripts