Assessing the Framing and Priorities of Data Science Programs

Mentor: Dr. Michael Zimmer

Approach: Using both quantitative and qualitative tools, a large corpus of text data will be analyzed to identify clusters, trends, and outliers based on various themes and metrics.

Summary: Hundreds of data science programs have sprouted across universities in the United States, each with its own disciplinary home, curricular focus, and learning outcomes. Some programs rely heavily on applied statistics, others have a business decision-support focus, and some include the study of the social and ethical implications of data science; countless other variations exist. This project uses both quantitative (clustering, topic modeling, natural language processing, etc.) and qualitative (thematic coding, content analysis, etc.) methods to analyze an existing dataset of text from over 600 data science curricular program websites. The project will seek to identify trends and outliers across the various programs, disciplines, and institutions included in the dataset.
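As a rough illustration of the quantitative side, the sketch below weights terms with TF-IDF and compares program descriptions by cosine similarity, a common first step before clustering. It is a minimal sketch only: the four descriptions and the helper names (`tfidf_vectors`, `cosine`, `most_similar`) are invented for illustration and are not taken from the project's dataset or tooling.

```python
import math
from collections import Counter

# Hypothetical program descriptions; NOT drawn from the project's dataset.
docs = {
    "program_a": "applied statistics regression inference probability",
    "program_b": "statistics probability inference modeling",
    "program_c": "business analytics decision support management",
    "program_d": "ethics privacy society data policy",
}


def tfidf_vectors(texts):
    """Return one {term: weight} dict per document using plain TF-IDF."""
    tokenized = {name: text.lower().split() for name, text in texts.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(
        term for tokens in tokenized.values() for term in set(tokens)
    )
    vectors = {}
    for name, tokens in tokenized.items():
        counts = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(name, vectors):
    """Return (other_name, similarity) for the closest other document."""
    others = (
        (other, cosine(vectors[name], vec))
        for other, vec in vectors.items()
        if other != name
    )
    return max(others, key=lambda pair: pair[1])


vectors = tfidf_vectors(docs)
print(most_similar("program_a", vectors))
```

In this toy corpus the two statistics-oriented descriptions share vocabulary and therefore pair up, while descriptions with no terms in common score zero. On a real corpus, tokenization, stop-word removal, and the choice of clustering or topic-modeling method would all need more care.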

Student Research Activities: The REU fellows will perform the following major tasks:
• Engage in data cleaning, analysis, and visualization of a large text corpus
• Use both quantitative and qualitative analysis tools, including Python or R packages and thematic coding in Atlas.ti
• Assist in summarizing results for dissemination

Student Background: Students need basic computing skills and introductory knowledge of social science and/or data science methodologies. Experience with R or natural language processing techniques is helpful but not necessary.