Difference between revisions of "User:Cnapun"

From REU@MU
== Reading ==
Here are some of the papers I have read, skimmed, and partially read:

=== Forecasting with Deep Learning ===
* [https://arxiv.org/abs/1610.09460 "Building Energy Load Forecasting using Deep Neural Networks"]
** Discusses the direct application of seq2seq LSTMs to load forecasting
* [http://ieeexplore.ieee.org/document/6796853/ "Training Recurrent Networks by Evolino"]
** Describes the use of genetic algorithms to train RNNs
** Should offer better performance than Echo State Networks (ESNs)
** No description of how crossover and mutate operations work; I referred to [http://davidmontana.net/papers/hybrid.pdf "Neural Network Weight Selection Using Genetic Algorithms"] for a more complete explanation

=== Hybrid Methods ===
* [https://openreview.net/pdf?id=ByD6xlrFe "Hybrid Neural Networks Over Time Series For Trend Forecasting"]
* [http://www.sciencedirect.com/science/article/pii/S0925231201007020 "Time series forecasting using a hybrid ARIMA and neural network model"]
* [http://ieeexplore.ieee.org/document/5433249/ "Intelligent Hybrid Wavelet Models for Short-Term Load Forecasting"]
* [http://ieeexplore.ieee.org/document/5340640/ "Short-Term Load Forecasting: Similar Day-Based Wavelet Neural Networks"]

=== Review Papers ===
* [http://ieeexplore.ieee.org/document/7581373/ "Short-Term Load Forecasting Methods: A Review"]
* [https://arxiv.org/abs/1705.04378 "An overview and comparative analysis of Recurrent Neural Networks for Short Term Load Forecasting"], which gives a good, current overview of methods

=== LSTMs, Training, and Possible Improvements ===
* [https://arxiv.org/abs/1409.2329 "Recurrent Neural Network Regularization"]
* [https://arxiv.org/abs/1512.05287 "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks"]
* [https://arxiv.org/abs/1609.07959 "Multiplicative LSTM for sequence modelling"]
* [https://arxiv.org/abs/1610.09513 "Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences"]
* [https://arxiv.org/abs/1402.3511 "A Clockwork RNN"]
* [https://arxiv.org/abs/1511.01432 "Semi-supervised Sequence Learning"]

Latest revision as of 03:27, 11 August 2017

Personal Information

  • Case Western Reserve University Class of 2019
  • Applied Mathematics and Computer Science Major

Weekly Log

Week 0 (30 May - 2 Jun)

  • Met Dr. Povinelli 4 times: to discuss possible project topics, to get oriented with the lab, to decide on a project topic, and to establish weekly milestones
  • Obtained MU and MSCS account logins and ID card
  • Read most of "Learning Deep Architectures for AI"
  • Read section on Sequence Modeling (Ch 10) of The Deep Learning book
  • Read various other papers

Week 1 (5 Jun - 9 Jun)

  • Attended GasDay camp and learned about what GasDay does
  • Attended responsible conduct of research training
  • Continued to read papers
  • Decided to use TensorFlow and Keras for now
  • Got Anaconda, TensorFlow, and Keras installed on a couple of lab computers
  • Learned how to access customer data

Week 2 (12 Jun - 16 Jun)

  • Started playing with TensorFlow, mostly using the GEFCom2014-E dataset to allow me to continue working when not at the lab
  • Things I got working in TensorFlow:
    • n-layer sequence-to-sequence (seq2seq) model (encoder-decoder architecture)
    • Autoregressive seq2seq model (using slower ops)
    • Multilayer Perceptron
    • 1D ConvNet on weather inputs, in parallel with MLP to process load and time inputs
  • Will try ConvNet on all 3 inputs, feeding into RNN, with the seed state being the month, day, and day of week at the initial timestamp
  • Will try Clockwork RNN
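
The autoregressive idea in the list above can be sketched without any deep learning library. This is only an illustration of the feedback loop, not the TensorFlow model from this log: `autoregressive_forecast`, `mean_of_window`, and the `window` and `horizon` parameters are hypothetical names, and a real seq2seq decoder would also carry a recurrent hidden state from step to step.

```python
from typing import Callable, List, Sequence


def autoregressive_forecast(
    step: Callable[[Sequence[float]], float],
    history: Sequence[float],
    horizon: int,
    window: int,
) -> List[float]:
    """Roll a one-step-ahead model forward `horizon` steps.

    Each prediction is appended to the input window, so later steps are
    conditioned on earlier predictions rather than on observed values.
    """
    if window <= 0 or len(history) < window:
        raise ValueError("history must contain at least `window` values")
    context = list(history[-window:])
    forecast: List[float] = []
    for _ in range(horizon):
        y_hat = step(context)
        forecast.append(y_hat)
        # Slide the window: drop the oldest value, append the new prediction.
        context = context[1:] + [y_hat]
    return forecast


def mean_of_window(context: Sequence[float]) -> float:
    """Hypothetical stand-in for a trained network: predicts the window mean."""
    return sum(context) / len(context)


if __name__ == "__main__":
    print(autoregressive_forecast(mean_of_window, [1.0, 2.0, 3.0], horizon=2, window=2))
```

Because each step consumes the previous prediction, errors can compound over the forecast horizon, which is one reason to compare this setup against a decoder that emits the whole horizon at once.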

Week 3 (19 Jun - 23 Jun)

  • Reinstalled CUDA
  • Started training seq2seq models on GasDay (electric) data
  • Got TensorBoard to work
  • Trained AR LSTM on GasDay gas data

Week 4 (26 Jun - 30 Jun)

  • Continued to experiment with hyperparameters
  • Achieved good performance on both gas and electricity datasets
  • Prepared a presentation of my research for other REU students (slides)
  • Attempted to train a model with a highly non-convex objective; it failed miserably
  • Still can't figure out why Dropout makes everything worse

Week 5 (3 Jul - 7 Jul)

  • Continued to experiment with different models
  • Incorporated both long- and short-term dependencies into an encoder-decoder architecture
  • Presented my work so far, along with a general introduction to neural networks, to GasDay (slides)

Week 6 (10 Jul - 14 Jul)

  • Did not finish draft of paper by deadline
  • Worked on determining the best model
  • Reincorporated BatchNorm and Dropout with good results
  • Experimented with convolution layers, but they rapidly overfit data

Week 7 (17 Jul - 21 Jul)

  • Started creating poster
  • Tried to evaluate models on test data, but results were poor
  • Spent the rest of the week trying to figure out why errors on test data were inconsistent with validation errors
    • This is probably because the model is trained on years 0-14, validated on years 14-18, and tested on years 18-20
    • The 4-year gap between the end of the training data and the start of the test data might cause issues
    • Tried using detrended data, but results did not improve significantly
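
A minimal sketch of the split and the detrending step described above, in plain Python. The year boundaries come from this log; everything else is an assumption: the half-open intervals, the `samples_per_year` parameter, the function names, and the choice of a linear trend fitted on the training years only (the log does not say how the data was detrended).

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], samples_per_year: int
) -> Tuple[List[float], List[float], List[float]]:
    """Split into train (years 0-14), validation (14-18), and test (18-20)."""
    train_end = 14 * samples_per_year
    val_end = 18 * samples_per_year
    test_end = 20 * samples_per_year
    if len(series) < test_end:
        raise ValueError("series is shorter than 20 years")
    return (
        list(series[:train_end]),
        list(series[train_end:val_end]),
        list(series[val_end:test_end]),
    )


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values[t] ~ intercept + slope * t."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return y_mean - slope * t_mean, slope


def detrend(
    values: Sequence[float], intercept: float, slope: float, start: int = 0
) -> List[float]:
    """Subtract the fitted line; `start` is the index of values[0] in the full series."""
    return [y - (intercept + slope * (start + i)) for i, y in enumerate(values)]


if __name__ == "__main__":
    # Synthetic series with a pure linear trend, two samples per year.
    series = [2.0 * t + 1.0 for t in range(40)]
    train, val, test = chronological_split(series, samples_per_year=2)
    intercept, slope = fit_linear_trend(train)  # fitted on training years only
    residuals = detrend(test, intercept, slope, start=len(train) + len(val))
    print(len(train), len(val), len(test), max(abs(r) for r in residuals))
```

Fitting the trend on the training years and extrapolating it across the validation years is exactly where a 4-year gap can hurt: any change in slope after year 14 shows up as a bias in the test residuals.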

Week 8 (24 Jul - 28 Jul)

  • Finished poster (pdf here)
  • Finalized models for gas forecasting. I decided not to use GasDay's electricity data at all.
  • Started running models on electric data from GEFCom2014 so that results can be reproduced.
  • Finished first draft of paper (which I may try to submit to a conference)
  • Presented my work to GasDay

Week 9 (31 Jul - 4 Aug)

  • Gave draft to Dr. Povinelli for feedback
  • Presented to REU students
  • Poster session
  • Continued to tune hyperparameters on GEFCom2014 data
  • Finished Responsible Conduct of Research training
