User:Carolinearnold
Revision as of 16:57, 8 June 2023 by Carolinearnold
Week 1: 5/30/23 - 6/2/23
Tuesday:
- Attended REU orientation.
- Toured Dr. Xu's lab and discussed his intentions for the project.
- Downloaded Windows Terminal and Python 3.11 to the lab computer.
- Installed OpenAI's command-line interface (CLI) to use in training a fine-tuned model.
- Completed Unit 1 and the first half of Unit 2 of RCR/RECR training as mandated by the NSF.
Wednesday:
- Attended Dr. Brylow's presentation on good research practices and the importance of keeping logs.
- Completed Units 2 and 3 of RCR/RECR training.
- Researched factors that influence perceived credibility in human and AI-generated communication.
- Examined methodologies used by other researchers to evaluate the degree of trust in unknown authors.
Thursday:
- Narrowed research to large language models (LLMs) and their capacity for social awareness.
- Attempted to connect with Alex Fischmann, who completed a related project under Dr. Xu's mentorship.
- Familiarized myself with PyCharm and began this tutorial for fine-tuning a model.
- Examined deliverables produced by Fischmann during the course of her project.
Friday:
- Explored case studies of fine-tuned LLMs and their applications.
- Began troubleshooting the fine-tuning process using this guide as a reference.
- Met with Dr. Xu to discuss a rough timeline of the project.
- Drafted a summary of research goals according to our proposed timeline.
Week 2: 6/5/23 - 6/9/23
Monday:
- Attended RCR training with Dr. Brylow.
Tuesday:
- Installed PyCharm 2022.1.4 and required packages for web scraping.
- Fixed warnings in Fischmann's web scraper.
- Adapted the aforementioned script to GoFundMe's medical crowdfunding homepage.
- Created a .csv file in which to store campaign data.
- Retrieved the following data: title, URL, description, organizer(s), and launch date.
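The storage step above could look roughly like the following. This is a minimal sketch using Python's standard csv module, not the actual scraper code: the column names and the helper names write_campaigns and read_campaigns are hypothetical, chosen only to match the fields listed above. Note that the csv module quotes any field containing a comma, so text with commas reads back unchanged.

```python
import csv

# Hypothetical column names, based on the fields listed in the log.
FIELDS = ["title", "url", "description", "organizers", "launch_date"]


def write_campaigns(path, campaigns):
    """Write a list of campaign dicts to a .csv file with a header row."""
    # newline="" lets the csv module manage line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        for campaign in campaigns:
            # Any field the scraper did not find is stored as "NA".
            writer.writerow({key: campaign.get(key, "NA") for key in FIELDS})


def read_campaigns(path):
    """Read the .csv file back into a list of dicts, one per campaign."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))
```

Calling write_campaigns("campaigns.csv", rows) followed by read_campaigns("campaigns.csv") should return the same text that was written, including any commas inside a description (the file name here is only an example).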
Wednesday:
- Attended Dr. Brylow's presentation on technical writing and effective research talks.
- Modified my script to retrieve the following data: amount raised, goal, beneficiary, and number of donations.
- Added code to store campaign data in the aforementioned .csv file.
- Implemented try/except blocks to record "NA" values in the event that data isn't found.
- To store campaign descriptions in the .csv file, I removed their commas, which alters the text and will interfere with fine-tuning later in the project. Possible solutions include using a different format (e.g., tab-separated .tsv), quoting fields that contain commas, or replacing commas with punctuation unlikely to appear elsewhere in the description.
- TODO: If a campaign was launched within the last week, the web scraper retrieves its launch date relative to the current time. For example, a campaign published yesterday returns "[launched] one day ago" rather than "[launched] June 7 2023." All dates need to be expressed in the latter form, OR campaigns launched within the last week need to be discarded.
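One possible approach to the date TODO above is to subtract the relative offset from the time of the scrape. The sketch below is only an illustration: the exact wording GoFundMe uses for relative dates is an assumption (patterns like "one day ago" or "3 hours ago"), and normalize_launch_date, NUMBER_WORDS, and RELATIVE are hypothetical names. Strings that do not look relative are returned unchanged, and relative strings that cannot be parsed become "NA".

```python
import re
from datetime import datetime, timedelta

# Assumption: small counts may be spelled out, as in "one day ago".
NUMBER_WORDS = {"a": 1, "an": 1, "one": 1, "two": 2, "three": 3,
                "four": 4, "five": 5, "six": 6, "seven": 7}

# Assumed shape of a relative date: "<count> <unit>(s) ago".
RELATIVE = re.compile(r"(\w+)\s+(minute|hour|day|week)s?\s+ago", re.IGNORECASE)


def normalize_launch_date(text, now=None):
    """Convert a relative date string to the form 'June 7 2023'."""
    now = now or datetime.now()
    match = RELATIVE.search(text)
    if match is None:
        return text  # not a relative date; leave it as scraped
    count_text = match.group(1).lower()
    unit = match.group(2).lower()
    if count_text.isdigit():
        count = int(count_text)
    elif count_text in NUMBER_WORDS:
        count = NUMBER_WORDS[count_text]
    else:
        return "NA"  # relative, but the count could not be parsed
    # timedelta accepts minutes=, hours=, days=, and weeks= keywords.
    launched = now - timedelta(**{unit + "s": count})
    return f"{launched.strftime('%B')} {launched.day} {launched.year}"


scrape_time = datetime(2023, 6, 8, 16, 57)
print(normalize_launch_date("one day ago", now=scrape_time))  # June 7 2023
```

Passing the scrape time explicitly as now keeps the conversion reproducible; if it is omitted, the current system time is used, so dates should be normalized in the same run as the scrape.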