User:Carolinearnold
== Week 1: 5/30/23 - 6/2/23 ==
'''Tuesday:'''
* Attended REU orientation.
* Toured Dr. Xu's lab and discussed his intentions for the project.
* Downloaded Windows Terminal and Python 3.11 to the lab computer.
* Installed OpenAI's command-line interface (CLI) to use in training a fine-tuned model.
* Completed Unit 1 and the first half of Unit 2 of RCR/RECR training as mandated by the NSF.
'''Wednesday:'''
* Attended Dr. Brylow's presentation on good research practices and the importance of keeping logs.
* Completed Units 2 and 3 of RCR/RECR training.
* Researched factors that influence perceived credibility in human and AI-generated communication.
* Examined methodologies used by other researchers to evaluate the degree of trust in unknown authors.
'''Thursday:'''
* Narrowed research to large language models (LLMs) and their capacity for social awareness.
* Attempted to connect with Alex Fischmann, who completed a related project under Dr. Xu's mentorship.
* Familiarized myself with PyCharm and began this [https://platform.openai.com/docs/guides/fine-tuning tutorial] for fine-tuning a model.
* Examined deliverables produced by Fischmann during the course of her project.
'''Friday:'''
* Explored case studies of fine-tuned LLMs and their applications.
* Began troubleshooting the fine-tuning process using this [https://towardsdatascience.com/unleashing-the-power-of-gpt-how-to-fine-tune-your-model-da35c90766c4 guide] as a reference.
* Met with Dr. Xu to discuss a rough timeline of the project.
* Drafted a [[Research Goals|summary]] of research goals according to our proposed timeline.

== Week 2: 6/5/23 - 6/9/23 ==
'''Monday:'''
* Attended RCR training with Dr. Brylow.
'''Tuesday:'''
* Installed PyCharm 2022.1.4 and required packages for web scraping.
* Fixed warnings in Fischmann's web scraper.
* Adapted the aforementioned script to GoFundMe's medical crowdfunding homepage.
* Created a .csv file in which to store campaign data.
* Retrieved the following data: title, URL, description, organizer(s), and launch date.
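The fields above map naturally onto a header row in the output file. A minimal sketch of the .csv setup, assuming these column names (the actual script's names and file path may differ):

```python
import csv

# Hypothetical column names matching the fields collected so far.
FIELDS = ["title", "url", "description", "organizers", "launch_date"]

def init_csv(path):
    """Create the output file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerow(FIELDS)

def append_campaign(path, campaign):
    """Append one campaign (a dict keyed by FIELDS) as a row."""
    with open(path, "a", newline="", encoding="utf-8") as f:
        csv.DictWriter(f, fieldnames=FIELDS).writerow(campaign)
```

Opening with `newline=""` and an explicit `utf-8` encoding avoids the blank-row and mis-encoding quirks the csv module is prone to on Windows.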
'''Wednesday:'''
* Attended Dr. Brylow's presentation on technical writing and effective research talks.
* Modified my script to retrieve the following data: amount raised, goal, beneficiary, and number of donations.
* Added code to store campaign data in the aforementioned .csv file.
** Implemented try/except blocks to record "NA" values in the event that data isn't found.
** To store campaign descriptions in a .csv file, I had to remove commas; this alteration will interfere with fine-tuning later in the project. Possible solutions include using a different file extension (e.g., .tsv) or replacing commas with punctuation unlikely to appear elsewhere in the description.
* Discovered the following bugs:
** If a campaign was launched within the last week, the launch date retrieved by the web scraper is expressed relative to the current time. For example, a campaign published yesterday will return "[launched] one day ago" as opposed to "[launched] June 7 2023." Work is needed to express all dates in the latter form OR throw out campaigns launched within the week.
** Long descriptions are hidden behind a "Read more" button on GoFundMe. My script appends "Read more" to the visible description instead of retrieving the hidden text. Code is needed to click the "Read more" button and retrieve the full description.
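The record-"NA"-on-failure pattern above can be factored into one small helper. A sketch, assuming each field is fetched by a zero-argument callable that raises when the page element is missing:

```python
def safe_get(fetch, default="NA"):
    """Return fetch(), or the default ("NA") if retrieval raises.

    `fetch` is any zero-argument callable, e.g. a lambda wrapping a
    scraper lookup that throws when the element isn't on the page.
    """
    try:
        return fetch()
    except Exception:
        return default
```

A call site might then look like `goal = safe_get(lambda: page.select_one(".goal").text)` (a hypothetical selector), keeping the per-field try/except boilerplate out of the main loop.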
'''Thursday:'''
* Fixed bugs in description and organizer retrieval.
* Attempted to reformat launch dates. I might have to throw out campaigns without proper dates attached.
* Replaced commas in campaign descriptions with semicolons. I should be able to revert to commas before fine-tuning my LLM. If I can't, semicolons are a decent substitute because they preserve tone and readability despite being grammatically incorrect.
* Fixed formatting issues in output.csv.
* Copied data from 504 campaigns to output.csv.
** Doesn't include number of donations due to a bug.
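The comma-to-semicolon substitution and its inverse are one-liners; sketching both makes the lossiness of the round trip explicit:

```python
def decommafy(description):
    """Replace commas with semicolons so the text can't split a CSV field."""
    return description.replace(",", ";")

def recommafy(description):
    """Best-effort inverse, for restoring descriptions before fine-tuning.

    Only safe if the original text contained no semicolons of its own;
    otherwise those are silently turned into commas too.
    """
    return description.replace(";", ",")
```

Worth noting: Python's csv module quotes fields that contain commas by default (`QUOTE_MINIMAL`), so if the rest of the pipeline tolerates quoted fields, the substitution may turn out to be unnecessary.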
'''Friday:'''
* Called Dr. Xu to discuss my progress and appropriate next steps.

== Week 3: 6/12/23 - 6/16/23 ==
'''Monday:'''
* Reformatted launch dates for campaigns older than 24 hours.
** I think we could reasonably discard newer campaigns due to their relative infrequency.
* Fixed bug in number of donations.
* Scraped data from 1000 campaigns.
** ''n'' = 1000 is the maximum number of campaigns I can retrieve data from at once.
* Removed non-English campaigns (''n'' = 23) and campaigns launched within 24 hours (''n'' = 25) from the dataset.
** ''n'' = 952
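Reformatting the relative launch dates amounts to subtracting the offset from a reference time. A sketch, assuming only numeric "N day(s) ago" phrasing needs handling; worded forms like "one day ago" or "yesterday" fall through and the caller can discard those campaigns:

```python
import re
from datetime import datetime, timedelta

def absolute_launch_date(relative, now):
    """Turn e.g. '3 days ago' into 'June 5 2023' relative to `now`.

    Returns None for phrasings that don't match, so the caller can
    drop those campaigns instead of storing a relative date.
    """
    m = re.match(r"(\d+)\s+days?\s+ago", relative.strip())
    if not m:
        return None
    d = now - timedelta(days=int(m.group(1)))
    # Format day via f-string rather than %-d, which is not portable
    # to Windows strftime.
    return f"{d:%B} {d.day} {d.year}"
```

The output format matches the "June 7 2023" style the scraper already records for older campaigns.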
'''Tuesday:'''
* Stored data from the first ten campaigns in a separate .csv file.
* Wrote a script to rewrite campaign descriptions using GPT-3.
** Reached the maximum number of tokens provided by OpenAI.
** Each rewrite was appended to the output file as a new row rather than in line with the corresponding campaign. I attempted to fix this before I ran out of tokens.
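The row-alignment bug above comes down to keeping each rewrite in the same record as its source campaign. One way to structure that, sketched with a hypothetical `rewrite` column name, is to merge the rewrite into the original row dict before writing, so a `csv.DictWriter` emits one combined row per campaign:

```python
def attach_rewrites(rows, rewrites):
    """Pair each campaign row (a dict) with its rewrite, in order.

    Returns new dicts with an added 'rewrite' key instead of mutating
    the inputs; raises if the lists have drifted out of sync.
    """
    if len(rows) != len(rewrites):
        raise ValueError("each campaign needs exactly one rewrite")
    return [{**row, "rewrite": rw} for row, rw in zip(rows, rewrites)]
```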
'''Wednesday:'''
* Adapted Fischmann's script to analyze campaign descriptions according to the NRC Emotion Lexicon.
* Evaluated the relative strengths of NRC and LIWC using this [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9565755/ article] as a reference.
* Explored sentiment [https://nlp.stanford.edu/~socherr/EMNLP2013_RNTN.pdf analysis] as a potential method by which to compare campaigns.
'''Thursday:'''
* Researched the extent to which LLMs convey emotion when prompted to perform various tasks.
'''Friday:'''
* Reviewed how previous researchers have used NRC and LIWC in their work.
* Met with Dr. Xu to discuss my progress and appropriate next steps.
* Combined campaign data and NRC values into a single .csv file.

== Week 4: 6/19/23 - 6/23/23 ==
'''Monday:'''
* Added code to count the number of words in each description.
* Adapted my script to express NRC values as a ratio with respect to word count.
* Generated NRC ratios for all 952 descriptions.
* Searched for literature pertaining to the tone and word choice expressed in LLMs.
** Are there any patterns researchers have noticed thus far?
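Expressing NRC values as ratios normalizes lexicon hits for description length, so long and short campaigns are comparable. A minimal sketch with a two-word toy lexicon standing in for the real NRC Emotion Lexicon (which maps thousands of English words to eight emotions plus positive/negative sentiment):

```python
from collections import Counter

# Toy stand-in for the NRC Emotion Lexicon: word -> emotion categories.
TOY_LEXICON = {
    "hope": ["anticipation", "joy"],
    "surgery": ["fear"],
}

def nrc_ratios(description, lexicon=TOY_LEXICON):
    """Emotion-category hits per word of the description."""
    words = description.lower().split()
    counts = Counter()
    for w in words:
        for emotion in lexicon.get(w, []):
            counts[emotion] += 1
    n = len(words) or 1  # guard against empty descriptions
    return {emotion: c / n for emotion, c in counts.items()}
```

The real analysis would load the published lexicon file (and likely strip punctuation before matching); the ratio arithmetic is the same.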
'''Tuesday:'''
* Created a Tableau Public account.
* Purchased LIWC and a paid OpenAI subscription with Dr. Xu's permission.
** Discussed work to complete before my mini-presentation.
* Discovered corrupted punctuation in campaign data as a result of UTF-8 encoding issues.
* Fixed formatting issues in the AI-generated output file.
* Rebuilt my web scraper to preserve original punctuation.
** Due to the overwhelming number of encoding errors in the original dataset, I reran my web scraper to collect data from 1000 new campaigns.
'''Wednesday:'''
* Attended a research presentation by Dr. Bialkowski.
* Discussed NLP techniques and challenges with Amal Khan.
* Wrote a script to remove non-English campaigns (''n'' = 55) from the dataset using Google's Compact Language Detector.
** ''n'' = 945
* Wrote a script to remove campaigns launched within 24 hours (''n'' = 27) from the dataset.
** ''n'' = 918
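The non-English filter can be written against any detector with a text-to-language-code interface, which keeps the filter itself testable. A sketch with the detector injected; in practice it would wrap a Python binding for Google's Compact Language Detector such as pycld2 or gcld3 (check those bindings' docs for the exact return shape):

```python
def keep_english(rows, detect, field="description"):
    """Keep only rows whose description the detector labels 'en'.

    `detect` is any callable returning an ISO 639-1 language code, so
    the real CLD bindings can be swapped in without changing the filter.
    """
    return [row for row in rows if detect(row[field]) == "en"]
```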
'''Thursday:'''
* Prompted GPT-3 to rewrite each campaign description in the new dataset.
** Not all campaigns (''n'' = 19) replicated successfully. This may have to do with the content of these campaigns; ChatGPT refuses to engage with heavily politicized topics, so GPT-3 may be configured similarly. (For example, a campaign launched on behalf of a man injured while providing humanitarian aid in Ukraine wouldn't replicate.)
* Applied NRC and LIWC analyses to both sets of descriptions.
'''Friday:'''
* Fine-tuned GPT-3 from 918 original descriptions.
** While the model is complete, the output is inconsistent. Some replications hallucinate new details, while others include only sentence fragments.
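At the time, OpenAI's CLI fine-tuned the legacy GPT-3 base models from a JSONL file of prompt/completion pairs. A sketch of preparing that file from the scraped descriptions; the prompt wording, `###` separator, and `END` stop token here are illustrative conventions from the legacy fine-tuning guide, not the project's actual choices:

```python
import json

def to_finetune_jsonl(descriptions, path):
    """Write one {"prompt", "completion"} record per description.

    The legacy fine-tuning guide recommended a fixed separator ending
    each prompt, a leading space on each completion, and a stop
    sequence ending each completion.
    """
    with open(path, "w", encoding="utf-8") as f:
        for desc in descriptions:
            record = {
                "prompt": ("Write a medical crowdfunding campaign "
                           "description.\n\n###\n\n"),
                "completion": " " + desc + " END",
            }
            f.write(json.dumps(record) + "\n")
```

Inconsistent separators or missing stop sequences in this file are one plausible source of the fragment-only outputs noted above.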

== Week 5: 6/26/23 - 6/30/23 ==
'''Monday:'''
* Applied paired (human vs. AI-generated) t-tests to NRC/LIWC categories to test whether the difference in means for each category is statistically significant.
* Drew boxplots to visualize the distribution of statistically significant categories.
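For one NRC/LIWC category, the paired t-test reduces to the t-statistic of the per-campaign differences (human score minus AI score). A dependency-free sketch; in practice `scipy.stats.ttest_rel` computes the same statistic plus a p-value directly:

```python
import math
from statistics import mean, stdev

def paired_t(human_scores, ai_scores):
    """t-statistic for the paired differences human - AI.

    Compare against a t distribution with n - 1 degrees of freedom
    to obtain a p-value.
    """
    diffs = [h - a for h, a in zip(human_scores, ai_scores)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))
```

Running this per category gives the set of statistically significant categories that the boxplots above visualize.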
'''Tuesday:'''
* Met with Dr. Xu to discuss a rough outline of my presentation and strategies for effective delivery.
* Researched healthcare [https://www.debt.org/medical/hospital-surgery-costs/ costs] and the [https://www.washington.edu/news/2022/02/03/for-the-uninsured-crowdfunding-provides-little-help-in-paying-for-health-care-and-deepens-inequities/ pitfalls] of medical crowdfunding to provide context for my project.
* Wrote slides to accompany my presentation.
* Rehearsed my talking points.
'''Wednesday:'''
* Delivered my presentation.
* Watched others' presentations.
'''Thursday:'''
* Brainstormed hypotheses based on the results of my statistical analysis.
** Knowing that AI-generated campaigns are more likely to address the reader directly and adopt a more emotional tone, is the public more inclined to receive these messages positively?
'''Friday:'''
* Discussed survey design and logistics with Dr. Xu.
* Drafted survey questions in Qualtrics.
== Week 6: 7/3/23 - 7/7/23 ==
'''Monday:'''
* Joined a Zoom call with Dr. Xu and a Qualtrics representative to discuss the logistics of survey distribution.
** The Qualtrics representative didn't attend due to a miscommunication.
'''Thursday:'''
* Visited the Harley-Davidson Museum with other REU students.
'''Friday:'''
* Discussed a timeline for the remainder of the project with Dr. Xu.
** We hope to publish our research in a journal and/or present it at a conference.

== Week 7: 7/10/23 - 7/14/23 ==
'''Monday:'''
* Refined Qualtrics survey questions.
* Attempted to randomize campaign messages in the survey.
* Met with Dr. Xu to discuss a revision of our experiment.
** Instead of distributing all 900 human-AI campaign pairs, we can distribute a subset in order to reduce the effect of confounding variables.
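Selecting the survey subset can be done with a seeded random sample so the draw is reproducible across runs. A sketch, assuming pairs are drawn uniformly at random (the actual selection criteria were still under discussion):

```python
import random

def sample_pairs(pairs, k, seed=23):
    """Draw a reproducible random subset of k human-AI campaign pairs.

    Using a dedicated random.Random instance keeps the draw independent
    of any other use of the global random state.
    """
    return random.Random(seed).sample(pairs, k)
```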
'''Tuesday:'''
* Completed Human Subjects Research training as mandated by the IRB.
* Read [https://leeds-faculty.colorado.edu/dahe7472/OB%202022/glickson%202021.pdf this] article on trust in AI agents.
'''Wednesday:'''
*
'''Thursday:'''
*
'''Friday:'''
*
Latest revision as of 16:49, 13 July 2023