Difference between revisions of "User:Grberlstein"

From REU@MU
*[https://arxiv.org/pdf/1704.01347.pdf Quantifying Search Bias]
*[https://pdfs.semanticscholar.org/e092/65ed8eee4c7b35e3ebe53b5d75492b4628a2.pdf Understanding and Designing around Users' Interaction with Hidden Algorithms in Sociotechnical Systems]
*[http://journals.sagepub.com/doi/full/10.1177/2053951716679679 The ethics of algorithms: Mapping the debate]

=== Clustering and Data Science ===
*[http://homepages.inf.ed.ac.uk/rbf/BOOKS/JAIN/Clustering_Jain_Dubes.pdf Algorithms for Clustering Data]
*[https://datasciencelab.wordpress.com/tag/k-means/ K-means Clustering in Python]

=='''Week Three (6/12 - 6/16)'''==
==='''Day 1 (6/12)'''===
*Fixed point plotting to align with shapefile
*Added choropleth coloring by neighborhood
*Started reading [http://journals.sagepub.com/doi/full/10.1177/2053951716679679 The ethics of algorithms: Mapping the debate]
==='''Day 2 (6/13)'''===
*Finished [http://journals.sagepub.com/doi/full/10.1177/2053951716679679 The ethics of algorithms: Mapping the debate]
*Started reading ''Weapons of Math Destruction''
*Started implementation of website from GitHub
*Set up the dependencies needed to run a local instance of Jekyll
==='''Day 3 (6/14)'''===
*Finished website framework
*Uploaded initial map version
*Started on the second version of the map
==='''Day 4 (6/15)'''===
*Split the data into multiple sets
*Used K-means to cluster the data in a variety of ways
*Wrote a Python script to run K-means multiple times and aggregate the results for display
==='''Day 5 (6/16)'''===
*Put modified data into D3 setup for the new map
*Tweaked basic settings
*Added ability to display different variations of K-means on the map

Revision as of 00:22, 18 June 2017

Griffin Berlstein

Nominally a person.

Readings

Background

Algorithmic Ethics

Clustering and Data Science


Project Log For Summer 2017

Week One (5/30 - 6/2)

Day 1 (5/30)

  • Attended REU orientation
  • Obtained ID card and computer access
  • Met with Dr. Guha and discussed broad ideas surrounding the project

Day 2 (5/31)

  • Attended Library orientation
  • Finished reading Ethics of Algorithms by Thijs Slot. This was the last of the pre-REU reading.
  • Started reviewing the basics of Python
  • Given crime data sets to review by Dr. Guha

Day 3 (6/1)

  • Attended a meeting on proper research practices by Dr. Factor
  • Set up direct deposit
  • Reviewed the basics of GitHub
  • Continued to review Python
  • Examined crime data and the various ways it was made publicly available

Day 4 (6/2)

  • Moved mentor meeting to Wednesday due to scheduling issue
  • Started reading background information provided by Dr. Guha
  • Set up Jupyter notebook and the various dependent libraries
  • Created rough implementation of K-means clustering on random data
  • Obtained card access to Dr. Guha's lab
  • Posted rough, pre-discussion milestones

Week Two (6/5 - 6/9)

Day 1 (6/5)

  • Refined K-means implementation with the K-means++ seeding described in the Data Science Lab article
  • Tested the algorithm on random Gaussian distributions, rather than random points
  • Experimented with visual plotting of the algorithm using Seaborn and Matplotlib
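
The log does not include the implementation itself, so the following is only a minimal sketch of what K-means with K-means++ seeding looks like, tested on Gaussian samples as described above. The function names, the blob centers, and the fixed random seed are assumptions made for illustration; only the Python standard library is used (Python 3.8+ for `math.dist`).

```python
import math
import random


def kmeans_pp_seeds(points, k, rng):
    """Pick k initial centers with K-means++ seeding."""
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Squared distance from each point to its nearest chosen center.
        d2 = [min(math.dist(p, c) ** 2 for c in centers) for p in points]
        # Next center: sampled with probability proportional to d2.
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def kmeans(points, k, rng, max_iter=100):
    """Lloyd's algorithm started from K-means++ seeds."""
    centers = kmeans_pp_seeds(points, k, rng)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:  # no center moved: converged
            break
        centers = new_centers
    return centers, labels


rng = random.Random(0)
# Three hypothetical 2-D Gaussian blobs, 50 samples each.
true_means = [(0.0, 0.0), (10.0, 10.0), (-10.0, 10.0)]
points = [(rng.gauss(mx, 1.0), rng.gauss(my, 1.0))
          for mx, my in true_means for _ in range(50)]
centers, labels = kmeans(points, 3, rng)
```

Seeding from points that are far from the already chosen centers tends to spread the initial centers across the blobs, which is the reason for preferring it over uniformly random starting points.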

Day 2 (6/6)

  • Attended RCR training
  • Finished reading the relevant sections of Algorithms for Clustering Data
  • Experimented with Scikit-learn's implementation of K-means

Day 3 (6/7)

  • Met with Dr. Guha and discussed the immediate future
  • Set the goal to produce an interactive crime map by next Wednesday
  • Gathered data from website and began sorting

Day 4 (6/8)

  • Created a script to aggregate the data from multiple spreadsheets into a single usable file
  • Looked into potential libraries needed to create the interactive map
  • Ran into issues with the format of the data location
  • Converted the addresses in the data into latitude/longitude coordinates
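
The aggregation script is not shown in the log. As a rough sketch of the idea, assuming the spreadsheets were exported as CSV files, several sources can be merged into one file with the standard `csv` module. The column names and sample rows below are hypothetical, and in-memory streams stand in for real files.

```python
import csv
import io


def merge_csv_streams(streams):
    """Concatenate rows from several CSV sources.

    Returns the union of column names (in first-seen order) and all rows
    as dictionaries.
    """
    fieldnames = []
    rows = []
    for stream in streams:
        reader = csv.DictReader(stream)
        for name in reader.fieldnames or []:
            if name not in fieldnames:
                fieldnames.append(name)
        rows.extend(reader)
    return fieldnames, rows


def write_csv(fieldnames, rows, out):
    """Write merged rows; columns a row lacks are left empty."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical monthly exports; the second has an extra column.
jan = io.StringIO(
    "date,offense,address\n"
    "2017-01-03,Theft,100 Main St\n"
    "2017-01-09,Burglary,200 Oak Ave\n"
)
feb = io.StringIO(
    "date,offense,address,district\n"
    "2017-02-01,Theft,300 Elm St,5\n"
)

fieldnames, rows = merge_csv_streams([jan, feb])
out = io.StringIO()
write_csv(fieldnames, rows, out)
merged_text = out.getvalue()
```

With real files, each `io.StringIO` would be replaced by `open(path, newline="")`, as the `csv` module documentation recommends.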

Day 5 (6/9)

  • Found a publicly available shapefile of the city
  • Set up the necessary scripts to display the file
  • Ran into an issue with the points not being in the same coordinate system as the shapefile

Week Three (6/12 - 6/16)

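Day 1 (6/12)

  • Fixed point plotting to align with shapefile
  • Added choropleth coloring by neighborhood
  • Started reading The ethics of algorithms: Mapping the debate

Day 2 (6/13)

  • Finished The ethics of algorithms: Mapping the debate
  • Started reading Weapons of Math Destruction
  • Started implementation of website from GitHub
  • Set up the dependencies needed to run a local instance of Jekyll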

Day 3 (6/14)

  • Finished website framework
  • Uploaded initial map version
  • Started on the second version of the map

Day 4 (6/15)

  • Split the data into multiple sets
  • Used K-means to cluster the data in a variety of ways
  • Wrote a Python script to run K-means multiple times and aggregate the results for display
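
How the script aggregated its runs is not recorded here. One possible approach, offered purely as an assumption, is a co-association count: for every pair of points, record the fraction of runs in which they landed in the same cluster. This sidesteps the fact that cluster ids are arbitrary and may be permuted between runs. The label lists below are hypothetical stand-ins for the output of separate K-means runs.

```python
from itertools import combinations


def co_association(label_runs):
    """Fraction of runs in which each pair of points shares a cluster.

    label_runs: list of label lists, one per run, all of equal length.
    Returns a dict mapping (i, j) with i < j to a value in [0, 1].
    """
    n = len(label_runs[0])
    counts = {pair: 0 for pair in combinations(range(n), 2)}
    for labels in label_runs:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(label_runs) for pair, c in counts.items()}


# Hypothetical labels from three runs over five points.
runs = [
    [0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0],  # same partition as the first run, ids swapped
    [0, 0, 0, 1, 1],  # point 2 grouped with points 0 and 1 this time
]
agreement = co_association(runs)
```

Pairs with agreement near 1 are grouped consistently across runs, while pairs near 0.5 mark points whose assignment depends on the starting seeds, which is the kind of information a map could surface.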

Day 5 (6/16)

  • Put modified data into D3 setup for the new map
  • Tweaked basic settings
  • Added ability to display different variations of K-means on the map