Thanks to everyone who joined the most recent OpenEEmeter Working Group Meeting. In this meeting, the group discussed the latest developments and optimizations for the OpenEEmeter in handling both solar and non-solar data. The meeting began with a recap of progress so far. Adam Scheer reviewed the development of the hourly model and associated documentation, noting significant progress and ongoing work to finalize and document new models. Travis Sikes then explained how changes in the model handle hourly data better, including the incorporation of solar irradiance (GHI) for more accurate modeling of solar PV customers, and how the new model uses an elastic net approach to optimize computational efficiency and incorporate various input data. There was a discussion of challenges such as temperature bias and how the model requires interpolation to handle missing data. The discussion concluded with a comparison of the new models against CalTRACK 2.0. The new model showed significant improvements in computational efficiency and better handling of solar data. The team also compared the idea of using two models, one for data without solar and one with solar included, versus a joint model. They concluded that the joint model effectively combines the capability of handling both solar and non-solar customers without needing two separate models. This greatly simplifies implementation and maintenance. The meeting ended with a discussion of next steps, including finalizing model validation, incorporating feedback, and updating and completing any necessary documentation. Watch the complete meeting below.
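To make the missing-data point concrete, here is a minimal sketch of why gaps matter for a model that works on whole days. It is not the OpenEEmeter implementation: the working group settled on a different interpolator, and plain linear interpolation is swapped in here only because it is easy to follow. The function names, the `max_gap` limit, and the sample data are all illustrative assumptions.

```python
from typing import List, Optional

HOURS_PER_DAY = 24


def fill_gaps_linear(values: List[Optional[float]], max_gap: int = 3) -> List[Optional[float]]:
    """Linearly interpolate interior runs of None that are at most max_gap long.

    Longer gaps, and gaps touching either end of the series, are left as None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # first index after the gap
        gap = end - start
        if start == 0 or end == n or gap > max_gap:
            continue  # no neighbor on one side, or gap too long to trust
        left, right = filled[start - 1], filled[end]
        for k in range(gap):
            frac = (k + 1) / (gap + 1)
            filled[start + k] = left + frac * (right - left)
    return filled


def complete_days(values: List[Optional[float]]) -> List[int]:
    """Return the indices of 24-hour days that have no missing hours."""
    n_days = len(values) // HOURS_PER_DAY
    return [
        d for d in range(n_days)
        if all(v is not None for v in values[d * HOURS_PER_DAY:(d + 1) * HOURS_PER_DAY])
    ]


# Two days of made-up hourly readings: one single-hour gap and one ten-hour gap.
raw = [float(h) for h in range(48)]
raw[5] = None
for h in range(30, 40):
    raw[h] = None

filled = fill_gaps_linear(raw, max_gap=3)
print(complete_days(raw))     # days usable before filling
print(complete_days(filled))  # days usable after filling short gaps
```

In this toy series, filling the single missing hour recovers the first day, while the ten-hour gap is left alone and the second day stays unusable.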
Thanks to everyone who joined the most recent OpenEEmeter Working Group Meeting. The meeting began with a discussion of the documentation that has been developed for all of OpenEEmeter. This documentation can be accessed at the OpenEEmeter GitHub site (https://openeemeter.github.io/eemeter/). Documentation is complete for the daily and billing models, and will soon be completed for the hourly model. Travis Sikes then gave a recap of previous meetings and addressed a question that came up regarding a change from the OpenEEmeter 3.0 hourly model, which produced output for each individual hour, to the OpenEEmeter 4.1 hourly model, which outputs 24 hours at a time. Travis explained that this is because with the elastic net approach, each day must have a complete set of features (data) or it must be excluded from the data set. Armin Aligholian followed up with a more detailed discussion of the 24-hour approach, explaining that it is both faster and yields improved prediction over the individual-hour approach. Armin then went into the question of how to fix bias (the difference between observed and modeled values). Fixing this bias is important, as it was slightly larger than in the previous model. Armin explained that there were two approaches to fixing this bias: binning by temperature and linearizing the temperature response. He then gave a detailed explanation of how temperature binning works. Armin then discussed a new module, clustering daily usage patterns, and the advantages of adding it. The main benefit of clustering is that it captures the behavior of each meter, speeds up the model, and reduces noise. Travis then discussed progress in hyperparameter optimization. The meeting concluded with a discussion of next steps, which include finishing the hyperparameter optimization and fully integrating the hourly model into OpenEEmeter. Watch the complete presentation below.

Thanks to everyone who joined the most recent OpenEEmeter technical working group meeting.
In this meeting, the team proposed a new naming convention, in which there would be an umbrella name encompassing EEmeter, EEweather, and OpenEEmeter as submodules, along with the Recurve project GRIDmeter, which would be donated to LF Energy. The advantage of this approach is that it would allow much more flexibility in the future to add additional features to the library, without making the library focused entirely around AMI meter-based savings. It would also reduce duplicative work in updating functions that exist in more than one place. After a recap of recent working group meetings, the discussion moved to recent work on interpolation, including changing the method to autocorrelation interpolation. The conversation then turned to population-level results, which have improved over the previous model, as well as bias in the model and approaches to fix it. The preferred approach to fixing this is binning by temperature, but there is a second option of linearizing the temperature response, which is not yet fully worked out and would require multiple fits. Next steps include fixing temperature bias (including determining if binning is sufficient or linearization is necessary, updating the objective function to include PNMBE, and reoptimizing hyperparameters) and moving the hourly model fully into the OpenEEmeter. Watch the full discussion below.

Thanks to everyone who joined the most recent OpenEEmeter Working Group Meeting. The June meeting began with a recap of the previous meeting, in which the team had discussed the many models that had been tested for the OpenEEmeter hourly model and how they landed on an elastic net model. The elastic net model is the least computationally expensive approach while providing significant benefits over the previous model.
This new model is approximately 11 times faster than the previous version of OpenEEmeter and can take GHI (solar irradiance) and other supplemental data (such as pumping schedules) as new model inputs. Adam Scheer also reiterated that this refers to OpenEEmeter 4.1; while a more flexible model is also available, it is not appropriate to market it as OpenEEmeter 4.1 if it incorporates variables that have not been tested or validated by the working group. Current goals are to finalize the hourly model (tuning hyperparameters and translating the R&D code into final code), make sure the model is compatible with the revamped API, and fix any bugs that crop up through testing. Travis Sikes then discussed the model's new approach to interpolation. This change is important because, while the earlier model was based on individual hours, the new model takes input and produces output 24 hours at a time. This means that missing data must be interpolated. Travis explained the different types of interpolation (univariate vs. multivariate, linear, cubic, nearest data, etc.) and why the working group has settled on a multivariate RBF interpolator. This led to a detailed Q&A and discussion of why this approach was chosen, what data sufficiency was required, and other topics. The discussion then moved on to the model's ability to incorporate supplemental data. Travis explained that the new model has the same input requirements as the old model with the addition of solar irradiance, and then discussed how the model also has the option for supplemental data inputs, which can help in cases when limited data is available. As an example, Travis showed how using PV installation date as supplemental data for commercial buildings (of which only a limited number have solar PV) yields imperfect results but is a way to make the most of existing data.
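As a rough illustration of what such supplemental inputs could look like as model features, the sketch below builds two binary hourly series: a step indicator that switches on at a PV installation date, and a repeating indicator for a pumping schedule. The function names, dates, and schedule hours are hypothetical and are not taken from the OpenEEmeter code.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Set


def step_indicator(timestamps: Iterable[datetime], start: datetime) -> List[int]:
    """Return 1 from `start` onward and 0 before (e.g. a PV installation date)."""
    return [1 if t >= start else 0 for t in timestamps]


def schedule_indicator(timestamps: Iterable[datetime], on_hours: Set[int]) -> List[int]:
    """Return 1 when the hour of day is in `on_hours` (e.g. a pumping schedule)."""
    return [1 if t.hour in on_hours else 0 for t in timestamps]


# Three days of hourly timestamps (made-up dates).
hours = [datetime(2024, 1, 1) + timedelta(hours=h) for h in range(72)]

# PV assumed installed at the start of the second day.
pv_flag = step_indicator(hours, datetime(2024, 1, 2))

# Pump assumed to run from 01:00 to 03:59 every day.
pump_flag = schedule_indicator(hours, on_hours={1, 2, 3})
```

Each list lines up hour for hour with the meter readings, so it can be added as one more input column alongside temperature and GHI.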
The conversation then moved on to a detailed presentation of recent efforts at hyperparameter optimization and population results. Next steps include much more analysis on population-level results, and fully incorporating the hourly model into the OpenEEmeter. Watch the full presentation below.

Thanks to everyone who joined the most recent OpenEEmeter meeting. The meeting began with a recap of recent work, including a reminder that the OpenEEmeter 4.0 daily model is completed and has been released. People who are interested in learning more about OpenEEmeter 4.0 can watch a recent webinar hosted by LF Energy. The meeting then went to a discussion led by Armin Aligholian about the most recent work the team has been performing on the hourly model. Armin discussed the process of transferring the recent R&D work on the OpenEEmeter hourly model to the OpenEEmeter API. The process involved several steps, including optimizing hyperparameters to avoid discrepancies between the R&D code and the OpenEEmeter, and debugging. The presentation led to a discussion clarifying hyperparameter optimization and ways to reduce the effect of anomalous subsamples. Armin then discussed future work, including considering commercial samples, optimizing hyperparameters for the temperature-only model, analyzing the abnormal behavior of subsamples, and finalizing the model and data classes in the OpenEEmeter. Next Meeting Scheduled: Tuesday, May 7, 2024. Watch the full presentation below.

Thanks to everyone who joined our recent OpenEEmeter Technical Working Group meeting on March 5th, 2024. Travis Sikes kicked off the meeting with an announcement of RetroMeter's use case review of OpenEEmeter 4.0 on Thursday, March 14th at 10am CST, in which they would present some of the work they've been doing adapting the OpenEEmeter for use cases in the U.K. and give the OpenEEmeter developer community an opportunity to provide feedback on the user experience with the new API and desired features.
Travis then announced the full public release of OpenEEmeter 4.0, now available via pip install. You can learn more about OpenEEmeter 4.0 from the recent Linux Foundation Energy webinar. The discussion then moved on to recent work on the hourly model. In the previous meeting, Armin Aligholian presented results showing the elastic net model outperforming XGBoost, AdaBoost, and other regression models usable within scikit-learn in terms of test error, computation time, and reduced overfitting. The elastic net had lower error on cloudy days and lower bias. In this meeting, Armin described how the team explored using an LSTM neural network architecture. While this approach showed some promise, the LSTM model was very computationally expensive, taking 14 minutes per meter on a CPU to achieve test error comparable to the elastic net. The elastic net model is 11x faster than the current OpenEEmeter model, with lower test error and less overfitting. The team also looked at incorporating supplemental data like EV charging and pump schedules. Adding this binary time series data as an input feature improved predictions of energy spikes by 40% in a worst-case scenario. Some key next steps are migrating the new elastic net model into the OpenEEmeter API, exploring adding NMBE to the loss function, analyzing performance on commercial buildings, and revisiting data sufficiency criteria in light of the new model structure. While the new architecture allows for easy incorporation of additional time series inputs, the group will need to be thoughtful about which inputs to allow in the base model to ensure quality and standards. Thanks again to Travis and Armin for leading the group through the latest results and analyses, and to everyone for the great questions and discussion. Next Meeting Scheduled: Tuesday, April 2, 2024. Watch the full presentation below.
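For readers wondering what adding NMBE to the loss function might mean in practice, here is a small sketch. The meeting did not specify a formula, so the sign convention for NMBE, the `bias_weight` parameter, and the `objective` function are assumptions made purely for illustration.

```python
import math
from typing import Sequence


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nmbe(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Normalized mean bias error: total (predicted - observed) over total observed.

    Sign conventions differ between references; this one is positive when the
    model over-predicts.
    """
    return sum(p - o for o, p in zip(obs, pred)) / sum(obs)


def objective(obs: Sequence[float], pred: Sequence[float], bias_weight: float = 1.0) -> float:
    """Hypothetical loss: mean-normalized RMSE plus a penalty on absolute NMBE."""
    mean_obs = sum(obs) / len(obs)
    return rmse(obs, pred) / mean_obs + bias_weight * abs(nmbe(obs, pred))


obs = [10.0, 10.0, 10.0, 10.0]
pred_unbiased = [11.0, 9.0, 11.0, 9.0]   # errors cancel out
pred_biased = [11.0, 11.0, 11.0, 11.0]   # always over-predicts

print(rmse(obs, pred_unbiased), rmse(obs, pred_biased))            # same RMSE
print(objective(obs, pred_unbiased), objective(obs, pred_biased))  # biased one scores worse
```

Both predictions have the same RMSE, so an RMSE-only loss cannot tell them apart; the bias term is what penalizes the model that consistently over-predicts.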
Thanks to everyone in the working group for all of your hard work and input on developing the LF Energy OpenEEmeter version 4.0. LF Energy OpenEEmeter measures the energy impacts of demand-side interventions in buildings. OpenEEmeter 4.0 provides enhanced performance of the daily model with dramatically reduced seasonal and weekend/weekday bias, along with increased computational efficiency. Among other benefits, OpenEEmeter 4.0:
On March 12, 2024, LF Energy hosted a webinar which explained in detail how OpenEEmeter 4.0 was developed and why these advances are important for measuring the energy impacts of demand-side interventions in buildings. Watch the full webinar below:

Travis led off the meeting with an explanation of why the group is integrating the written methods into the OpenEEmeter repository. The group then discussed the many functions (such as weather data) that are duplicated between EEweather and OpenEEmeter and the advantages of integrating them, including eliminating redundancies and making the process of updating much simpler while providing a more seamless experience for users. Travis then explained a number of simplifications that have been made in the OpenEEmeter 4.0 code, specifically including default data settings in the baseline class. As most users use these defaults, this will simplify the use of the software and offer more consistency and standardization, while still allowing more expert users to change defaults for their particular use cases. Travis also proposed some adjustments to data sufficiency requirements, such as removing the requirement that daily data for electric meters not have negative values, as this doesn't necessarily indicate an error (there are many solar users, for example, who may show negative consumption on a regular basis). Instead, a requirement would be added that non-electric meters cannot have negative values. Armin recapped the discussion from last week of the advantage of switching from CVRMSE to PNRMSE as a more reliable model performance measure, especially for solar customers. Armin then explained how the working group has been testing different models with combined data from different weather features (solar irradiance, humidity, and temperature) to determine which performs best.
The team found that an elastic net model performs better than the current hourly model and better than other models tested, including for computational speed. For future work, the team will continue to focus on the challenge of overfitting, load shape analysis for different seasons, and considering the potential of ensemble models. Next Meeting Scheduled: Tuesday, March 5, 2024. Watch the full presentation below.

Thanks to everyone who joined the most recent OpenEEmeter working group meeting. Travis Sikes led off this meeting with a recap of the last meeting, in which the goal was to explore how to incorporate a variety of additional data inputs into the OpenEEmeter, such as temperature, humidity, and especially solar irradiance, in addition to contextual time series (day of week and month) data. Travis pointed out that initially the team was using a stratified K-fold scheme within the baseline period for cross validation, but has moved on from that due to a concern about information leakage; instead, they are now using a rolling test/train approach to minimize model overfitting. Travis then reviewed the previous discussion, in which the group had discussed the need to move away from CVRMSE (Coefficient of Variation of Root Mean Squared Error) as a metric for calibrating models, since it doesn't work well for buildings with solar panels. The group discussed instead using PNRMSE (Percentile Normalized Root Mean Squared Error), which appears to correlate well with CVRMSE. Armin Aligholian then went into more detail on the switch from stratified sampling to the three-year rolling test/train framework. He went on to explain how the team was exploring the addition of GHI (solar irradiance) and its impact on model performance, specifically for solar customers. Moreover, CCI (cloud cover index) was used as a metric to analyze the importance of GHI specifically on more cloudy days.
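The idea behind a rolling test/train scheme can be sketched in a few lines: every test window comes strictly after the data used to train for it, which avoids the information leakage that shuffled K-fold splits can introduce in time series. The window sizes below (12 months of training, 3 months of testing, over three years of monthly indices) are illustrative assumptions, not the settings the team used.

```python
from typing import Iterator, Optional, Tuple


def rolling_splits(
    n_samples: int,
    train_size: int,
    test_size: int,
    step: Optional[int] = None,
) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges where each test window follows its training window."""
    if step is None:
        step = test_size  # by default, slide forward by one test window
    start = 0
    while start + train_size + test_size <= n_samples:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += step


# Three years of monthly data, indexed 0..35.
splits = list(rolling_splits(n_samples=36, train_size=12, test_size=3))
for train, test in splits:
    print(f"train months {train.start}-{train.stop - 1}, test months {test.start}-{test.stop - 1}")
```

Averaging the test error over all windows gives a more honest picture of out-of-sample performance than a single split.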
The meeting ended with a discussion of the need for more models in future work, including more work on neural network models, more input variation, as well as looking more closely at the impacts of cloud coverage, larger datasets for population analysis, and other factors. Next Meeting Scheduled: Tuesday, February 6th, 2024. Watch the full presentation below.

Thanks to everyone who attended the most recent OpenEEmeter working group meeting. The meeting began with a discussion by Jason Chulock of coming improvements in the OpenEEmeter 4.0 API, including consolidating usage between all three methods (hourly, daily, and billing) and making certain common configurations the default. Jason then laid out improvements around data sufficiency and methods compliance. The goal of these changes is to make the API more user-friendly and efficient. Travis Sikes then led a recap and discussion of the current issues and progress on the CalTRACK 2.0 model. Key concerns with CalTRACK 2.0 include its tendency to overfit, its incompleteness for solar PV customers, and its inflexibility in handling input data. Travis explained that the team would be using AMI measurements combined with weather, solar, and categorical data to enhance prediction accuracy. He then discussed evolving the cross-validation methodology from a static 24-hour window to a dynamic rolling test/train approach. There was a consensus on the need for a more robust error metric, suggesting a shift from CVRMSE to PNRMSE. Travis emphasized the need for commercial data to complete the test data sets. Looking ahead, next steps include the continued exploration of advanced modeling techniques like neural networks and the use of larger datasets for a more thorough population analysis. Next Meeting Scheduled: Tuesday, December 5th, 2023. Watch the full presentation below.
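To see why CVRMSE struggles with solar customers, consider the sketch below. CVRMSE divides RMSE by the mean of the observed data, which can sit near zero for a net-metered solar home. The PNRMSE version here assumes the normalizer is the interquartile range (75th minus 25th percentile) of the observed data; that percentile choice is an assumption for illustration, and the load values are made up.

```python
import math
import statistics
from typing import Sequence


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def cvrmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """RMSE normalized by the mean of the observed values."""
    return rmse(obs, pred) / statistics.fmean(obs)


def pnrmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """RMSE normalized by the interquartile range of the observed values (assumed definition)."""
    q1, _, q3 = statistics.quantiles(obs, n=4, method="inclusive")
    return rmse(obs, pred) / (q3 - q1)


errors = [1.0, -1.0, 1.0, -1.0]

# Same load shape, but the solar meter is shifted down so its mean is close to zero.
non_solar = [8.0, 10.0, 14.0, 12.0]
solar = [x - 10.5 for x in non_solar]

for name, obs in (("non-solar", non_solar), ("solar", solar)):
    pred = [o + e for o, e in zip(obs, errors)]
    print(name, round(cvrmse(obs, pred), 3), round(pnrmse(obs, pred), 3))
```

The absolute errors are identical for both meters, yet CVRMSE is far larger for the solar meter simply because its mean is small, while the range-based normalization gives the same value for both.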
The purpose of this blog is to provide a high-level overview of CalTrack progress.
For a deeper understanding or to provide input on technical aspects of CalTrack, refer to the GitHub issues page (https://github.com/CalTRACK-2/caltrack/issues).