Difference between revisions of "User:Abby.Martin"
From REU@MU
Revision as of 16:31, 23 June 2015
Project
Mentor: Dr. Richard Povinelli
I will be researching the application and accuracy of linear regression model trees. I aim to test how effective this method is for electric load forecasting. I also plan to compare it with other forecasting methods that have been tried.
Goals & Milestones
- Test the influence of linear regression model trees on the accuracy of electrical use forecasting.
- Determine whether linear regression model trees are a better method of electric load forecasting than other methods.
- Create a linear regression model tree using MATLAB or WEKA for use in the electric portion of GasDay, and apply the methods found or created to data from the GasDay lab.
- Research linear regression model trees and electrical usage.
- Continue research on linear regression model trees, electrical usage, WEKA and MATLAB. Start creating methods for forecasting electrical use using linear regression model trees.
- Continue research on linear regression model trees and electrical usage. Also learn how to effectively use MATLAB and WEKA.
- Test my linear regression model tree with real data and compare its effectiveness with that of other forecasting methods.
Weekly Goals
Week One
- Read and research papers that address the following topics:
- Decision Trees
- Machine Learning
- Model Trees
- Linear Regression Model Trees
- Electric Load Forecasting and other methods that have been used
Week Two
- Continue reading about linear regression model trees
- Begin testing various datasets using the WEKA software
- Begin reading some of the source code and documentation to better understand WEKA
- Begin learning, using, and applying MATLAB
Week Three
- Test real data using the WEKA software to create various model trees.
- Read Data Mining: Practical Machine Learning Tools and Techniques by Ian H. Witten, Eibe Frank, and Mark A. Hall to better understand model trees and the WEKA software
- Become comfortable with the MATLAB software
Week Four
- Begin testing converted data using WEKA software
- Create various model trees and compare results
- Prepare mini presentation
Week Five
- Give mini presentation
- Continue testing and comparing model trees
- Create other forecasting models for comparison
Week Six
- Continue comparisons of models
- Begin writing paper and creating poster
Week Seven
- Continue research and comparisons
- Continue work on paper and poster
Week Eight
- Complete poster
- Continue work on paper
Week Nine
- Complete paper
- Prepare for final talk
Week Ten
- Finalize paper
- Poster session
- Formal talk
Weekly Log
Week One
- Orientation activities and forms
- Pre-REU Survey
- Attended GasDay Camp
- Met Dr. Povinelli and decided on research topic
- Read papers on my topic to discover:
- the definition and application of decision trees
- difference between classification trees, regression trees, and model trees
- machine learning and how trees split
Week Two
- Met with Dr. Povinelli to further discuss concepts and goals
- Dr. Povinelli explained:
- the "greedy" approach
- the various ways to determine the "best" variable and tree
- Dr. Povinelli suggested reading about the M5P model
- Read about the M5P model and learned:
- It chooses splits using a standard deviation reduction method
- It uses a smoothing method for leaves
- The article contained helpful pseudocode for understanding the process of creating a linear regression model tree
- Began working with the WEKA software to create linear regression model trees
- Read some of the source code from WEKA to understand the linear regression model tree creation process
- Began working with and learning MATLAB
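The standard deviation reduction idea noted above can be illustrated with a minimal sketch. It is not the WEKA M5P source: the function names `sdr` and `best_split` are hypothetical, and the use of the population standard deviation and of midpoints between sorted values as candidate thresholds are assumptions made for illustration. The sketch scores each candidate split of one numeric attribute by how much it lowers the weighted standard deviation of the target, then greedily keeps the best one.

```python
from statistics import pstdev

def sdr(parent, subsets):
    """Standard deviation reduction: sd(parent) minus the size-weighted
    sd of each subset. Population sd is an assumption of this sketch."""
    n = len(parent)
    weighted = sum(len(s) / n * pstdev(s) for s in subsets if s)
    return pstdev(parent) - weighted

def best_split(xs, ys):
    """Greedy search over one numeric attribute.
    Returns (threshold, reduction); threshold is None if no split helps."""
    pairs = sorted(zip(xs, ys))
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        gain = sdr(ys, [left, right])
        if gain > best[1]:
            best = (threshold, gain)
    return best

# Made-up data: two clusters of target values separated along x
xs = [1, 2, 3, 10, 11, 12]
ys = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]
threshold, gain = best_split(xs, ys)
```

A full model tree would repeat this search over every attribute, recurse on each side, and fit a linear regression at each leaf; those steps are left out here.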
Week Three
- Monday, June 15th
- Read Data Mining: Practical Machine Learning Tools and Techniques and gained a fuller, more comprehensive understanding of data mining, model trees, and the WEKA software
- Tuesday, June 16th
- Attended required talk on Good Presentations, Good Technical Writing, and the Difference Between Them
- Continued reading Data Mining: Practical Machine Learning Tools and Techniques
- Organized research and created goals for upcoming weeks
- Wednesday, June 17th
- Tested data using the WEKA software to compare linear regression model trees against standard linear regression
- Found a plugin that will give MATLAB the capability of generating linear regression model trees
- Began learning and practicing how to use MATLAB
- Learned about and implemented the M5PrimeLab plugin for MATLAB
- Thursday, June 18th
- Attended talk on Responsible Conduct of Research
- Did the interactive movie project that was assigned to accompany RCR training
- Successfully implemented the M5PrimeLab plugin to create linear regression model trees
- Friday, June 19th
- Continued reading Data Mining: Practical Machine Learning Tools and Techniques
- Used WEKA to compare the M5P model against standard linear regression, and to compare cross-validation against the percentage split technique
- Downloaded and began learning how to use the typesetting software LaTeX
- Met with Dr. Povinelli to discuss progress
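The two evaluation techniques compared this week differ in how they partition the data, which a short sketch can make concrete. This is an illustration only, not how WEKA implements either option: the function names are hypothetical, and the 66% training fraction, the fixed seed, and the round-robin fold assignment are assumptions. A percentage split holds out one test set once, while k-fold cross-validation rotates through k test sets so every instance is tested exactly once.

```python
import random

def percentage_split(n, train_fraction=0.66, seed=0):
    """Shuffle indices 0..n-1 and cut once into (train, test)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]

def k_fold_splits(n, k=10, seed=0):
    """Shuffle indices, deal them into k folds, and return a list of
    (train, test) pairs where each fold serves as the test set once."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f, fold in enumerate(folds) if f != i for j in fold], folds[i])
        for i in range(k)
    ]

# Example with 100 instances
train, test = percentage_split(100)
splits = k_fold_splits(100, k=10)
```

With a percentage split the error estimate depends on a single random cut; averaging over the k folds uses all of the data for testing at the cost of training k models.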
Week Four
- Monday, June 22nd
- Converted all MATLAB data files to arff format
- Began work on mini presentation
- Began testing data using WEKA to create model trees
- Tuesday, June 23rd
- Created and ran 10-fold cross-validation tests with a minimum of 10 instances at each leaf for each data set
- Thursday, June 25th
- Meet for lunch and discuss progress and challenges thus far
Week Five
- Thursday, July 2nd
- Mini Presentations
- Informal (though serious) description of what we have been doing, to receive feedback in preparation for the formal presentations in Week 10
Week Six
- Thursday, July 9th
- Meet for lunch and discuss progress and challenges thus far
Week Seven
- Thursday, July 16th
- Meet for lunch and discuss progress and challenges thus far
Week Eight
- Thursday, July 23rd
- Meet for lunch and discuss progress and challenges thus far
Week Nine
- Wednesday, July 29th
- Electronic version of poster due
- Thursday, July 30th
- Meet for lunch and discuss progress and challenges thus far
Week Ten
- Tuesday, August 4th
- Poster Session
- Wednesday, August 5th
- First half of the formal presentations
- Thursday, August 6th
- Second half of the formal presentations
- Friday, August 7th
- Post REU Survey and Final Instructions
- Research Papers Due