- 1 Personal Information
- 2 Weekly Log
- 3 Reading
- Case Western Reserve University Class of 2019
- Applied Mathematics and Computer Science Major
Week 0 (30 May - 2 Jun)
- Met Dr. Povinelli 4 times: to discuss possible project topics, to get oriented with the lab, to decide on a project topic, and to establish weekly milestones
- Obtained MU and MSCS account logins and ID card
- Read most of "Learning Deep Architectures for AI"
- Read the chapter on sequence modeling (Ch. 10) of the Deep Learning book
- Read various other papers
Week 1 (5 Jun - 9 Jun)
- Attended GasDay camp and learned about what GasDay does
- Attended responsible conduct of research training
- Continued to read papers
- Decided to use TensorFlow and Keras for now
- Got Anaconda, TensorFlow, and Keras installed on a couple of lab computers
- Learned how to access customer data
Week 2 (12 Jun - 16 Jun)
- Started playing with TensorFlow, mostly using the GEFCom2014-E dataset to allow me to continue working when not at the lab
- Things I got working in TensorFlow:
- n-layer sequence-to-sequence (seq2seq) model (encoder-decoder architecture)
- Autoregressive seq2seq model (using slower ops)
- Multilayer Perceptron
- 1D ConvNet on weather inputs, in parallel with MLP to process load and time inputs
- Will try ConvNet on all 3 inputs, feeding into RNN, with the seed state being the month, day, and day of week at the initial timestamp
- Will try Clockwork RNN
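The seq2seq and autoregressive setups listed above can be sketched without any deep learning library. This is only a minimal illustration, not the code used in the lab: `make_windows`, `autoregressive_forecast`, and the `persistence_step` stand-in model are hypothetical names, and the window lengths are arbitrary. It shows how a load series might be sliced into encoder-input/decoder-target pairs, and how an autoregressive decoder feeds each prediction back in as the next input.

```python
from typing import Callable, List, Sequence, Tuple


def make_windows(series: Sequence[float], n_in: int, n_out: int
                 ) -> List[Tuple[List[float], List[float]]]:
    """Slice a series into (encoder input, decoder target) pairs."""
    pairs = []
    for start in range(len(series) - n_in - n_out + 1):
        enc = list(series[start:start + n_in])
        dec = list(series[start + n_in:start + n_in + n_out])
        pairs.append((enc, dec))
    return pairs


def autoregressive_forecast(history: Sequence[float],
                            step: Callable[[List[float]], float],
                            horizon: int) -> List[float]:
    """Roll a one-step model forward, feeding each prediction back in."""
    context = list(history)
    forecast = []
    for _ in range(horizon):
        y_hat = step(context)
        forecast.append(y_hat)
        # Drop the oldest value and append the prediction,
        # so the context keeps a fixed length.
        context = context[1:] + [y_hat]
    return forecast


def persistence_step(context: List[float]) -> float:
    """Stand-in for a trained network (assumption): repeat the last value."""
    return context[-1]


# Toy series standing in for hourly load values.
load = [float(v) for v in range(10)]
pairs = make_windows(load, n_in=4, n_out=2)
forecast = autoregressive_forecast(load[-4:], persistence_step, horizon=3)
```

In a real model, `persistence_step` would be replaced by a call to the trained network. The feedback loop is also why the autoregressive variant tends to be slower: each output step has to wait for the previous one.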
Week 3 (19 Jun - 23 Jun)
- Reinstalled CUDA
- Started training seq2seq models on GasDay (electric) data
- Got TensorBoard to work
- Trained AR LSTM on GasDay gas data
Week 4 (26 Jun - 30 Jun)
- Continued to experiment with hyperparameters
- Achieved good performance on both gas and electricity datasets
- Prepared a presentation of my research for other REU students (slides)
- Attempted to train a model with an objective that was not at all convex; it failed miserably
- Still can't figure out why Dropout makes everything worse
Week 5 (3 Jul - 7 Jul)
- Continued to experiment with different models
- Incorporated both long- and short-term dependencies into an encoder-decoder architecture
- Presented my work so far, along with a general introduction to neural networks, to GasDay (slides)
Week 6 (10 Jul - 14 Jul)
- Did not finish draft of paper by deadline
- Worked on determining the best model
- Reincorporated BatchNorm and Dropout with good results
- Experimented with convolution layers, but they rapidly overfit data
Week 7 (17 Jul - 21 Jul)
- Started creating poster
- Tried to evaluate models on test data, but results were highly subpar
- Spent the rest of the week trying to figure out why errors on test data were inconsistent with validation errors
- It is probably because the model is trained on years 0-14, validated on years 14-18, and tested on years 18-20
- The 4-year gap between the end of the training data and the start of the test data might cause issues
- Tried using detrended data, but results did not improve significantly
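The split and the detrending attempt described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure actually used: the function names are hypothetical, the data is synthetic, and the linear trend is assumed to be fit on the training years only and then extrapolated across the validation gap to the test years.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (time in years, value)


def chronological_split(years: Sequence[float], values: Sequence[float],
                        train_end: float = 14.0, val_end: float = 18.0
                        ) -> Tuple[List[Point], List[Point], List[Point]]:
    """Split by time: [0, train_end), [train_end, val_end), and the rest."""
    train, val, test = [], [], []
    for t, v in zip(years, values):
        if t < train_end:
            train.append((t, v))
        elif t < val_end:
            val.append((t, v))
        else:
            test.append((t, v))
    return train, val, test


def fit_linear_trend(points: Sequence[Point]) -> Tuple[float, float]:
    """Ordinary least-squares line through the points: (slope, intercept)."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def detrend(points: Sequence[Point], slope: float, intercept: float
            ) -> List[Point]:
    """Subtract the fitted line from each value."""
    return [(t, v - (slope * t + intercept)) for t, v in points]


# Synthetic yearly samples with a pure upward trend: v = 2 t + 5.
years = [float(y) for y in range(20)]
values = [2.0 * y + 5.0 for y in years]

train, val, test = chronological_split(years, values)
slope, intercept = fit_linear_trend(train)  # fit on training data only
test_detrended = detrend(test, slope, intercept)
```

With real data the trend estimated on years 0-14 has to be extrapolated at least four years ahead before it reaches the test period, so any change in the trend during the validation years shows up as extra test error.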
Week 8 (24 Jul - 28 Jul)
- Finished poster (pdf here)
- Finalized models for gas forecasting. I decided to not use GasDay's electricity data at all.
- Started running models on electric data from GEFCom2014 so that the results can be reproduced.
- Finished first draft of paper (which I may try to submit to a conference)
- Presented my work to GasDay
Week 9 (31 Jul - 4 Aug)
- Gave draft to Dr. Povinelli for feedback
- Presented to REU students
- Poster session
- Continued to tune hyperparameters on GEFCom2014 data
- Finished Responsible Conduct of Research training
Here are some of the papers I have read, skimmed, and partially read:
Forecasting with Deep Learning
- "Building Energy Load Forecasting using Deep Neural Networks"
- Discusses the direct application of seq2seq LSTMs to load forecasting
- "Training Recurrent Networks by Evolino"
- Describes the use of genetic algorithms to train RNNs
- Should offer better performance than Echo State Networks (ESNs)
- No description of how crossover and mutate operations work; I referred to "Neural Network Weight Selection Using Genetic Algorithms" for a more complete explanation
- "Hybrid Neural Networks Over Time Series For Trend Forecasting"
- "Time series forecasting using a hybrid ARIMA and neural network model"
- "Intelligent Hybrid Wavelet Models for Short-Term Load Forecasting"
- "Short-Term Load Forecasting: Similar Day-Based Wavelet Neural Networks"
- "Short-Term Load Forecasting Methods: A Review"
- "An overview and comparative analysis of Recurrent Neural Networks for Short Term Load Forecasting", which gives a good, current overview of methods
LSTMs, Training, and Possible Improvements
- "Recurrent Neural Network Regularization"
- "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks"
- "Multiplicative LSTM for sequence modelling"
- "Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences"
- "A Clockwork RNN"
- "Semi-supervised Sequence Learning"