The Zuckerberg Files: Analyzing the Discourse of Mark Zuckerberg

From REU@MU
Revision as of 15:23, 13 May 2021 by Praveen (Talk | contribs)


Title: The Zuckerberg Files: Analyzing the Discourse of Mark Zuckerberg

Mentor: Dr. Michael Zimmer

Approach:

Summary: The Zuckerberg Files is a digital archive of all public utterances of Facebook’s founder and CEO, Mark Zuckerberg. The archive includes over 1,200 text transcripts and 200 videos of Zuckerberg’s own words, drawn from his Facebook posts, media interviews, product launches, corporate events, congressional testimonies, and other public material. Curated by Dr. Michael Zimmer, The Zuckerberg Files has become a valuable tool for researchers seeking to download, analyze, and scrutinize Zuckerberg’s discourse on online privacy, algorithmic filtering, content moderation, the spread of misinformation, and other aspects of Facebook’s role in society. This project involves textual analysis of the data within The Zuckerberg Files, using both quantitative methods (clustering, topic modeling, natural language processing, etc.) and qualitative methods (thematic coding, discourse analysis, etc.) to better understand how Zuckerberg’s discourse varies over time and across topics. Other forms of analysis (visual, temporal, etc.) are also welcome.
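As a rough illustration of the simplest kind of quantitative analysis mentioned above, the sketch below counts how often a few tracked terms appear per year. It is only a starting point under stated assumptions: the function names, the (year, text) input shape, and the sample sentences are all hypothetical and are not taken from the archive.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def term_counts_by_year(records, terms):
    """Count how often each tracked term appears in each year.

    records: iterable of (year, text) pairs (hypothetical input shape).
    terms: iterable of lowercase single-word terms to track.
    """
    tracked = set(terms)
    counts = defaultdict(Counter)
    for year, text in records:
        for token in tokenize(text):
            if token in tracked:
                counts[year][token] += 1
    return {year: dict(counter) for year, counter in counts.items()}


# Invented placeholder sentences for illustration; not quotations from the archive.
sample = [
    (2010, "Privacy matters. We think about privacy and sharing."),
    (2018, "We did not do enough on misinformation and privacy."),
]
result = term_counts_by_year(sample, ["privacy", "misinformation"])
print(result)
```

Counts like these can then be normalized by transcript length or fed into clustering or topic-modeling tools in Python or R.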

Student Research Activities: The REU fellows will perform the following major tasks:
• Assist in the processing, standardization, and publication of textual data in the corpus, including transforming it to XML and other machine-readable formats.
• Utilize both quantitative and qualitative analytics tools, including packages from Python or R, as well as thematic coding using ATLAS.ti or NVivo.
• Assist in summarizing results for dissemination.
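The XML transformation task could look something like the following minimal sketch, which uses Python's standard xml.etree.ElementTree module. The element and attribute names (transcript, turn, speaker, title, date) are assumptions made for illustration; the archive's actual schema is not described here.

```python
import xml.etree.ElementTree as ET


def transcript_to_xml(title, date, speaker_turns):
    """Build an XML string from a list of (speaker, text) turns.

    The element and attribute names are hypothetical, not the archive's schema.
    """
    root = ET.Element("transcript", attrib={"title": title, "date": date})
    for speaker, text in speaker_turns:
        turn = ET.SubElement(root, "turn", attrib={"speaker": speaker})
        # Trim stray whitespace left over from the source text.
        turn.text = text.strip()
    # encoding="unicode" returns a str rather than bytes.
    return ET.tostring(root, encoding="unicode")


# Invented sample input for illustration only.
xml_text = transcript_to_xml(
    "Example interview",
    "2021-01-01",
    [("Interviewer", "  What is the topic? "), ("Guest", "Privacy & data.")],
)
print(xml_text)
```

ElementTree escapes special characters such as "&" on output, so the result stays well-formed without manual string handling.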

Student Background: Students need to have basic computing skills and introductory knowledge of social science and/or data science methodologies. Experience with R or natural language processing techniques is helpful but not necessary.