Difference between revisions of "User:Grberlstein"
From REU@MU
Revision as of 00:22, 18 June 2017
Griffin Berlstein
Nominally a person.
Readings
Background
Algorithmic Ethics
- Ethics of Algorithms
- Is There an Ethics of Algorithms?
- Toward an Ethics of Algorithms
- Quantifying Search Bias
- Understanding and Designing around Users' Interaction with Hidden Algorithms in Sociotechnical Systems
- The ethics of algorithms: Mapping the debate
Clustering and Data Science
- Algorithms for Clustering Data
- K-means Clustering in Python
Project Log For Summer 2017
Week One (5/30 - 6/2)
Day 1 (5/30)
- Attended REU orientation
- Obtained ID card and computer access
- Met with Dr. Guha and discussed broad ideas surrounding the project
Day 2 (5/31)
- Attended Library orientation
- Finished reading Ethics of Algorithms by Thijs Slot. This was the last of the pre-REU reading.
- Started reviewing the basics of Python
- Received crime data sets from Dr. Guha to review
Day 3 (6/1)
- Attended a meeting on proper research practices led by Dr. Factor
- Set up direct deposit
- Reviewed the basics of GitHub
- Continued to review Python
- Examined crime data and the various ways it was made publicly available
Day 4 (6/2)
- Moved mentor meeting to Wednesday due to a scheduling issue
- Started reading background information provided by Dr. Guha
- Set up Jupyter notebook and the various dependent libraries
- Created rough implementation of K-means clustering on random data
- Obtained card access to Dr. Guha's lab
- Posted rough, pre-discussion milestones
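A rough K-means implementation on random data, like the one mentioned above, might look something like the sketch below. It is not the code written that day: the function name, the iteration cap, and the uniform random starting centroids are all assumptions made for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two 2-D points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points, k, iterations=50, seed=0):
    """Cluster 2-D points with Lloyd's algorithm; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: start from k distinct points chosen uniformly at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append((
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                ))
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


rng = random.Random(1)
# Hypothetical input: 200 uniformly random points in the unit square.
random_points = [(rng.random(), rng.random()) for _ in range(200)]
centroids, labels = kmeans(random_points, k=3)
```

On uniformly random points there is no real cluster structure to find, so the result mostly shows that the assignment and update steps run and settle.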
Week Two (6/5 - 6/9)
Day 1 (6/5)
- Refined K-means implementation with the K-means++ seeding described in the Data Science Lab article
- Tested the algorithm on random Gaussian distributions, rather than random points
- Experimented with visual plotting of the algorithm using Seaborn and Matplotlib
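The K-means++ seeding step can be sketched as follows. This is a minimal illustration under stated assumptions rather than the refined implementation itself: the function name, the blob centers, and the spread of 0.5 are hypothetical, and only the standard library is used.

```python
import random


def kmeanspp_seeds(points, k, rng):
    """Choose k starting centroids with K-means++ (D^2-weighted) sampling."""
    seeds = [rng.choice(points)]
    while len(seeds) < k:
        # Squared distance from each point to its closest seed so far.
        d2 = [min((p[0] - s[0]) ** 2 + (p[1] - s[1]) ** 2 for s in seeds)
              for p in points]
        # Sample the next seed with probability proportional to d2, so
        # points far from every existing seed are favored.
        seeds.append(rng.choices(points, weights=d2, k=1)[0])
    return seeds


rng = random.Random(0)
# Hypothetical test data: three Gaussian blobs with made-up centers.
blob_centers = [(0.0, 0.0), (5.0, 5.0), (0.0, 5.0)]
points = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
          for cx, cy in blob_centers for _ in range(100)]
seeds = kmeanspp_seeds(points, k=3, rng=rng)
```

The returned seeds would replace the purely random starting centroids of a plain K-means run. A point already chosen as a seed has weight zero, so it cannot be picked twice.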
Day 2 (6/6)
- Attended RCR training
- Finished reading the relevant sections of Algorithms for Clustering Data
- Experimented with Scikit-learn's implementation of K-means
Day 3 (6/7)
- Met with Dr. Guha and discussed the immediate future
- Set the goal to produce an interactive crime map by next Wednesday
- Gathered data from website and began sorting
Day 4 (6/8)
- Created a script to aggregate the data from multiple spreadsheets into a single usable file
- Looked into potential libraries needed to create the interactive map
- Ran into issues with the format of the location data
- Converted the addresses in the data into latitude/longitude coordinates
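The aggregation script mentioned above could take roughly the following shape. This sketch assumes the spreadsheets were exported as CSV files that share one header row; the column names and sample rows are hypothetical placeholders, not the real crime data.

```python
import csv
import io


def merge_csv(sources, destination):
    """Append the rows of several CSV files into one, writing the header once."""
    writer = None
    rows_written = 0
    for source in sources:
        reader = csv.DictReader(source)
        for row in reader:
            if writer is None:
                # Assumption: every file shares the first file's columns.
                writer = csv.DictWriter(destination, fieldnames=reader.fieldnames)
                writer.writeheader()
            writer.writerow(row)
            rows_written += 1
    return rows_written


# Hypothetical in-memory stand-ins for two exported spreadsheets.
january = io.StringIO("date,offense,address\n2017-01-03,Theft,100 Example St\n")
february = io.StringIO("date,offense,address\n2017-02-11,Burglary,200 Sample Ave\n")
combined = io.StringIO()
total_rows = merge_csv([january, february], combined)
```

With files on disk, the same function would accept handles opened with `open(path, newline="")` in place of the `io.StringIO` objects.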
Day 5 (6/9)
- Found a publicly available shapefile of the city
- Set up the necessary scripts to display the file
- Ran into an issue with the points not being in the same coordinate system as the shapefile
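A coordinate system mismatch like this is usually resolved by projecting one data set into the other's system. The log does not say which systems were involved, so the sketch below only illustrates the idea with one common case, chosen as an assumption: converting WGS84 longitude/latitude to Web Mercator meters. In practice a projection library such as pyproj would normally handle this.

```python
import math

EARTH_RADIUS_M = 6378137.0  # WGS84 semi-major axis, used by Web Mercator


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project WGS84 longitude/latitude in degrees to Web Mercator meters."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


# Hypothetical point; the log does not give any coordinates.
x, y = lonlat_to_web_mercator(-90.0, 45.0)
```

Once every point passes through the same projection as the shapefile, both layers can be drawn on shared axes.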
Week Three (6/12 - 6/16)
Day 1 (6/12)
- Fixed point plotting to align with shapefile
- Added choropleth coloring by neighborhood
- Started reading The ethics of algorithms: Mapping the debate
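Choropleth coloring by neighborhood comes down to counting incidents per neighborhood and mapping each count to a color class. The sketch below assumes equal-width bins and a four-color palette; the neighborhood names and counts are hypothetical, and the actual map may have used a different classing scheme.

```python
from collections import Counter

# A light-to-dark sequential palette; the colors are illustrative choices.
PALETTE = ["#fee5d9", "#fcae91", "#fb6a4a", "#cb181d"]


def choropleth_colors(neighborhood_per_incident, palette=PALETTE):
    """Map each neighborhood to a color by its incident count (equal-width bins)."""
    counts = Counter(neighborhood_per_incident)
    low, high = min(counts.values()), max(counts.values())
    span = max(high - low, 1)  # avoid dividing by zero when all counts match
    colors = {}
    for name, count in counts.items():
        # Scale the count into [0, 1], then pick the matching palette slot.
        index = min(int((count - low) / span * len(palette)), len(palette) - 1)
        colors[name] = palette[index]
    return colors


# Hypothetical neighborhood labels, one entry per recorded incident.
incidents = ["North"] * 9 + ["Central"] * 5 + ["South"] * 1
colors = choropleth_colors(incidents)
```

Each polygon in the shapefile would then be filled with the color stored under its neighborhood name.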
Day 2 (6/13)
- Finished The ethics of algorithms: Mapping the debate
- Started reading Weapons of Math Destruction
- Started implementation of website from GitHub
- Established the needed dependencies to run a local instance of Jekyll
Day 3 (6/14)
- Finished website framework
- Uploaded initial map version
- Started on the second version of the map
Day 4 (6/15)
- Split the data into multiple sets
- Used K-means to cluster the data in a variety of ways
- Wrote a Python script to run K-means multiple times and aggregate the results for display
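Running K-means several times and aggregating the results could be sketched as below. The log does not say how the runs were combined, so this version makes an assumption: it runs once per random seed and collects every run, ordered by inertia (total squared distance to the assigned centroids). All function and field names are hypothetical.

```python
import random


def sq_dist(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans_once(points, k, seed, iterations=30):
    """One K-means run from a seeded random start; returns (labels, inertia)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # leave an empty cluster's centroid in place
                centroids[c] = (sum(p[0] for p in members) / len(members),
                                sum(p[1] for p in members) / len(members))
    # Re-assign against the final centroids so labels and inertia agree.
    labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
              for p in points]
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, inertia


def aggregate_runs(points, k, seeds):
    """Run K-means once per seed and collect the runs, lowest inertia first."""
    runs = []
    for seed in seeds:
        labels, inertia = kmeans_once(points, k, seed)
        runs.append({"seed": seed, "inertia": inertia, "labels": labels})
    return sorted(runs, key=lambda run: run["inertia"])


rng = random.Random(2)
# Hypothetical data: two Gaussian blobs with made-up centers.
points = ([(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(50)]
          + [(rng.gauss(8, 1), rng.gauss(8, 1)) for _ in range(50)])
runs = aggregate_runs(points, k=2, seeds=range(5))
best = runs[0]
```

Keeping every run, rather than only the best one, leaves room to display how the clustering changes from one start to another.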
Day 5 (6/16)
- Put modified data into D3 setup for the new map
- Tweaked basic settings
- Added ability to display different variations of K-means on the map
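One way to hand clustered data to a D3 map is to export a JSON file with one record per point and one field per K-means variation. The sketch below assumes that layout; the field names (`lat`, `lon`, `k2`, `k3`), the coordinates, and the labels are hypothetical and are not taken from the project's data.

```python
import json


def to_d3_records(points, labelings):
    """Build JSON-ready records: one per point, with a cluster label per variation."""
    records = []
    for i, (lat, lon) in enumerate(points):
        record = {"lat": lat, "lon": lon}
        for variation, labels in labelings.items():
            record[variation] = labels[i]
        records.append(record)
    return records


# Hypothetical coordinates and K-means variations, for illustration only.
points = [(10.0, 20.0), (10.1, 20.1), (10.5, 20.4)]
labelings = {"k2": [0, 0, 1], "k3": [0, 1, 2]}
payload = json.dumps(to_d3_records(points, labelings), indent=2)
```

A D3 page could load such a file (for example with `d3.json`) and switch the displayed variation by reading a different field from each record.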