User:Cnapun
From REU@MU
Personal Information
- Case Western Reserve University Class of 2019
- Applied Mathematics and Computer Science Major
Weekly Log
Week 0 (30 May - 2 Jun)
- Met Dr. Povinelli 4 times: to discuss possible project topics, to get oriented with the lab, to decide on a project topic, and to establish weekly milestones
- Obtained MU and MSCS account logins and ID card
- Read most of "Learning Deep Architectures for AI"
- Read the Sequence Modeling chapter (Ch. 10) of the Deep Learning book
- Read various other papers
Week 1 (5 Jun - 9 Jun)
- Attended GasDay camp and learned about what GasDay does
- Attended responsible conduct of research training
- Continued to read papers
- Decided to use TensorFlow and Keras for now
- Got Anaconda, TensorFlow, and Keras installed on a couple of lab computers
- Learned how to access customer data
Week 2 (12 Jun - 16 Jun)
- Started playing with TensorFlow, mostly using the GEFCom2014-E dataset to allow me to continue working when not at the lab
- Things I got working in TensorFlow:
  - n-layer sequence-to-sequence (seq2seq) model (encoder-decoder architecture; see the sketch after this list)
  - Autoregressive seq2seq model (using slower ops)
  - Multilayer perceptron (MLP)
  - 1D ConvNet on weather inputs, in parallel with an MLP to process load and time inputs
- Will try a ConvNet on all three inputs, feeding into an RNN whose initial state is seeded with the month, day, and day of week at the initial timestamp
- Will try a Clockwork RNN
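Below is a minimal sketch of the encoder-decoder setup, assuming hourly data, a one-week (168-step) history window, a 24-step forecast horizon, and known future covariates as decoder inputs; the shapes, layer sizes, and input choices are illustrative, not the exact model I trained.

```python
# Minimal encoder-decoder (seq2seq) sketch in Keras. The 168-step history,
# 24-step horizon, feature counts, and hidden size are all assumptions.
from tensorflow.keras import layers, models

HIST_LEN, HORIZON, N_FEAT, HIDDEN = 168, 24, 3, 64

# Encoder: read a week of history (e.g. load, temperature, hour-of-day).
enc_in = layers.Input(shape=(HIST_LEN, N_FEAT), name="history")
_, state_h, state_c = layers.LSTM(HIDDEN, return_state=True)(enc_in)

# Decoder: unroll over the forecast horizon, seeded with the encoder's final
# state; its inputs here are known future covariates (e.g. forecast temperature).
dec_in = layers.Input(shape=(HORIZON, 1), name="future_covariates")
dec_seq = layers.LSTM(HIDDEN, return_sequences=True)(
    dec_in, initial_state=[state_h, state_c])
forecast = layers.TimeDistributed(layers.Dense(1))(dec_seq)

model = models.Model([enc_in, dec_in], forecast)
model.compile(optimizer="adam", loss="mse")
```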
Week 3 (19 Jun - 23 Jun)
- Reinstalled CUDA
- Started training seq2seq models on GasDay (electric) data
- Got TensorBoard to work
- Trained an autoregressive (AR) LSTM on GasDay gas data (a rough decoding sketch follows)
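For reference, a hedged sketch of what "autoregressive" means here: a one-step-ahead model is applied repeatedly, with each prediction fed back into the input window. `one_step_model` is a hypothetical stand-in and the covariate handling is an assumption; the Python-level loop is also why this is slower than a single batched RNN call.

```python
# Hedged sketch of autoregressive forecasting with a one-step-ahead model.
# `one_step_model` is a hypothetical Keras model mapping a (window, features)
# history to the next load value; nothing here is GasDay code.
import numpy as np

def autoregressive_forecast(one_step_model, history, horizon):
    """history: (window, n_features) array whose first column is the load."""
    window = history.copy()
    preds = []
    for _ in range(horizon):
        yhat = one_step_model.predict(window[np.newaxis, ...])[0, 0]
        preds.append(yhat)
        # Slide the window forward, inserting the prediction as the new
        # "observed" load; other covariates are persisted for simplicity.
        next_row = window[-1].copy()
        next_row[0] = yhat
        window = np.vstack([window[1:], next_row])
    return np.array(preds)
```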
Week 4 (26 Jun - 30 Jun)
- Continued to experiment with hyperparameters
- Achieved good performance on both gas and electricity datasets
- Prepared a presentation of my research for the other REU students (slides: RecurrentNeuralNetworksforEnergyForecasting.pdf)
- Attempted to train a model with an objective that was not at all convex; it failed miserably
- Still can't figure out why dropout makes everything worse (one possible culprit is sketched below)
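One possible culprit, per the Gal & Ghahramani paper in the Reading section below: a plain Dropout layer samples a fresh mask at every timestep, and that noise compounds across a long sequence. A hedged sketch of the two placements in Keras (all sizes illustrative); Keras's built-in `dropout`/`recurrent_dropout` arguments reuse one mask across time, following that paper.

```python
# Naive vs. variational dropout around LSTMs; all sizes are illustrative.
from tensorflow.keras import layers, models

inp = layers.Input(shape=(168, 3))

# Naive: a Dropout layer between recurrent layers samples an independent
# mask at every timestep of the sequence.
x = layers.LSTM(64, return_sequences=True)(inp)
x = layers.Dropout(0.3)(x)
naive_out = layers.Dense(24)(layers.LSTM(64)(x))

# Variational: per-layer dropout with a single mask reused across time,
# applied to both the inputs and the recurrent state.
y = layers.LSTM(64, return_sequences=True,
                dropout=0.3, recurrent_dropout=0.3)(inp)
var_out = layers.Dense(24)(
    layers.LSTM(64, dropout=0.3, recurrent_dropout=0.3)(y))

naive_model = models.Model(inp, naive_out)
variational_model = models.Model(inp, var_out)
```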
Week 5 (3 Jul - 7 Jul)
- Continued to experiment with different models
- Incorporated both long- and short-term dependencies into an encoder-decoder architecture
- Gave a presentation to GasDay about my work and about neural networks in general (slides: RecurrentNeuralNetworksforEnergyForecasting(GasDay).pdf)
Week 6 (10 Jul - 14 Jul)
- Did not finish the draft of the paper by the deadline
- Worked on determining the best model
- Reincorporated BatchNorm and dropout, with good results
- Experimented with convolutional layers, but they rapidly overfit the data (a sketch of the parallel ConvNet + MLP setup is below)
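For context, a hedged sketch of the parallel ConvNet + MLP arrangement from Week 2, with the BatchNorm and dropout placements mentioned above; all shapes, channel counts, and layer sizes are assumptions, not the final model.

```python
# Hedged sketch: 1D ConvNet over weather sequences merged with an MLP over
# flat load/time features. Every size and input choice here is an assumption.
from tensorflow.keras import layers, models

weather_in = layers.Input(shape=(168, 2))   # e.g. temperature, humidity
x = layers.Conv1D(32, kernel_size=5, activation="relu")(weather_in)
x = layers.BatchNormalization()(x)
x = layers.MaxPooling1D(2)(x)
x = layers.Conv1D(32, kernel_size=3, activation="relu")(x)
x = layers.GlobalAveragePooling1D()(x)

flat_in = layers.Input(shape=(32,))         # recent load + calendar features
y = layers.Dense(64, activation="relu")(flat_in)
y = layers.Dropout(0.3)(y)

merged = layers.concatenate([x, y])
out = layers.Dense(24)(merged)              # 24-hour-ahead forecast
model = models.Model([weather_in, flat_in], out)
model.compile(optimizer="adam", loss="mse")
```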
Week 7 (17 Jul - 21 Jul)
- Started creating poster
- Tried to evaluate models on test data, but the results were highly subpar
- Spent the rest of the week trying to figure out why errors on the test data were inconsistent with validation errors
  - It is probably because the model is trained on years 0-14 of the data, validated on years 14-18, and tested on years 18-20
  - The four-year gap between the training and test periods might cause issues
  - Tried using detrended data, but the results did not improve significantly (see the split/detrend sketch below)
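A minimal sketch of the chronological split and a simple linear detrend, assuming hourly data; the year boundaries follow the note above, but the detrend method is illustrative. The trend is fit on the training span only, so the test-period trend is not leaked.

```python
# Chronological split by year index and a train-fitted linear detrend.
# `years_per_step` assumes hourly data; everything else is a sketch.
import numpy as np

def chronological_split(series, hours_per_year=8760):
    """Split an hourly series: train years 0-14, val 14-18, test 18-20."""
    train = series[: 14 * hours_per_year]
    val = series[14 * hours_per_year : 18 * hours_per_year]
    test = series[18 * hours_per_year : 20 * hours_per_year]
    return train, val, test

def linear_detrend(train, other):
    """Fit a linear trend on the training span only, remove it from both.
    Fitting on train alone avoids leaking the test-period trend."""
    t_train = np.arange(len(train))
    slope, intercept = np.polyfit(t_train, train, deg=1)
    t_other = np.arange(len(train), len(train) + len(other))
    return (train - (slope * t_train + intercept),
            other - (slope * t_other + intercept))
```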
Week 8 (24 Jul - 28 Jul)
- Finished poster (PDF: Poster-2017-Pancha.pdf)
- Finalized models for gas forecasting; I decided not to use GasDay's electricity data at all
- Started running models on electric data from GEFCom2014 so that the results can be reproduced
- Finished first draft of paper (which I may try to submit to a conference)
- Presented my work to GasDay
Week 9 (31 Jul - 4 Aug)
- Gave draft to Dr. Povinelli for feedback
- Presented to REU students
- Poster session
- Continued to tune hyperparameters on GEFCom data
- Finished Responsible Conduct of Research training
Reading
Here are some of the papers I have read, skimmed, or partially read:
Forecasting with Deep Learning
- "Building Energy Load Forecasting using Deep Neural Networks"
- Discusses the direct application of seq2seq LSTMs to load forecasting
- "Training Recurrent Networks by Evolino"
- Describes the use of genetic algorithms to train RNNs
- Should offer better performance than Echo State Networks (ESNs)
- No description of how crossover and mutate operations work; I referred to "Neural Network Weight Selection Using Genetic Algorithms" for a more complete explanation
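Since the Evolino paper leaves the operators unspecified, here is a toy sketch of crossover and mutation over flattened network weight vectors in the spirit of the Montana paper; the uniform crossover and Gaussian mutation choices are my assumptions, not either paper's exact scheme.

```python
# Toy genetic operators over flattened weight vectors (numpy arrays).
import numpy as np

rng = np.random.default_rng(0)

def crossover(parent_a, parent_b):
    """Uniform crossover: each weight is inherited from a random parent."""
    mask = rng.random(parent_a.shape) < 0.5
    return np.where(mask, parent_a, parent_b)

def mutate(weights, rate=0.01, scale=0.1):
    """Perturb a random subset of weights with Gaussian noise."""
    mask = rng.random(weights.shape) < rate
    return weights + mask * rng.normal(0.0, scale, size=weights.shape)

def next_generation(population, fitness):
    """Keep the fitter half, refill with mutated crossovers of survivors."""
    order = np.argsort(fitness)[::-1]          # higher fitness is better
    survivors = [population[i] for i in order[: len(population) // 2]]
    children = []
    while len(survivors) + len(children) < len(population):
        i, j = rng.choice(len(survivors), size=2, replace=False)
        children.append(mutate(crossover(survivors[i], survivors[j])))
    return survivors + children
```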
Hybrid Methods
- "Hybrid Neural Networks Over Time Series For Trend Forecasting"
- "Time series forecasting using a hybrid ARIMA and neural network model"
- "Intelligent Hybrid Wavelet Models for Short-Term Load Forecasting"
- "Short-Term Load Forecasting: Similar Day-Based Wavelet Neural Networks"
Review Papers
- "Short-Term Load Forecasting Methods: A Review"
- "An overview and comparative analysis of Recurrent Neural Networks for Short Term Load Forecasting", which gives a good, current overview of methods
LSTMs, Training, and Possible Improvements
- "Recurrent Neural Network Regularization"
- "A Theoretically Grounded Application of Dropout in Recurrent Neural Networks"
- "Multiplicative LSTM for sequence modelling"
- "Phased LSTM: Accelerating Recurrent Network Training for Long or Event-based Sequences"
- "A Clockwork RNN"
- "Semi-supervised Sequence Learning"