Beyond simply flagging errors, the Recurve Platform actively tests and affirms compliance with the CalTRACK methods. The result is an audit log, called the CalTRACK Scorecard, which attests to full CalTRACK compliance for each metered site. The CalTRACK Scorecard is surfaced in the Platform for inspection by users and third parties. In addition to the modeling components included in the OpenEEMeter, the CalTRACK methods also address data handling and cleaning that should take place prior to calculating savings. This means that using the OpenEEMeter alone does not guarantee CalTRACK compliance. Furthermore, only a subset of the CalTRACK methods relates directly to the NMEC calculation itself; a large portion relates to checks on each individual project and meter, such as overall data quality, data sufficiency, project disqualification, and outlier detection. Errors that appear at scale within a complex application require a more sophisticated solution to avoid slow and costly manual audits. The CalTRACK Scorecard thus requires evidence of successful compliance tests and a detailed record describing any non-compliance (such as insufficient baseline period data for successful model creation), creating a clear and proactive record associated with each project and set of savings calculations. More than 60 checks are performed for each metered site.
Raw data is tested for duplicate and extreme values. Consumption values that are more than three interquartile ranges above the median usage are flagged as outliers and manually reviewed. Negative consumption values are also flagged for review, as they may indicate the unreported presence of net metering. Other data quality issues, including impossible dates and non-numeric consumption values, are also flagged.
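As an illustration only, the raw-data checks described above could be sketched as follows. This is not the Recurve Platform or OpenEEMeter implementation: the function name `flag_readings`, the flag labels, the input layout (a list of date-string and raw-value pairs), and the choice of the standard library's default quartile method are all assumptions made for the example.

```python
import math
from datetime import datetime
from statistics import median, quantiles


def flag_readings(readings, iqr_multiplier=3.0):
    """Return {row_index: [flag, ...]} for rows that need review.

    `readings` is a list of (date_string, raw_value) pairs, with dates
    written as YYYY-MM-DD. The layout and flag names are hypothetical.
    """
    flags = {}
    numeric = {}  # row index -> parsed consumption value
    seen_dates = set()

    for i, (date_text, raw_value) in enumerate(readings):
        row_flags = []

        # Impossible dates (for example February 30) fail to parse.
        try:
            day = datetime.strptime(date_text, "%Y-%m-%d").date()
        except (TypeError, ValueError):
            row_flags.append("impossible_date")
        else:
            if day in seen_dates:
                row_flags.append("duplicate")
            seen_dates.add(day)

        # Non-numeric consumption values fail the float conversion.
        try:
            value = float(raw_value)
            if not math.isfinite(value):
                raise ValueError("not a finite number")
        except (TypeError, ValueError):
            row_flags.append("non_numeric")
        else:
            numeric[i] = value
            if value < 0:
                # Possible unreported net metering.
                row_flags.append("negative")

        if row_flags:
            flags[i] = row_flags

    # Outlier rule from the text: more than three interquartile ranges
    # above the median usage. Skipped when there are too few values.
    if len(numeric) >= 4:
        q1, _, q3 = quantiles(numeric.values(), n=4)
        threshold = median(numeric.values()) + iqr_multiplier * (q3 - q1)
        for i, value in numeric.items():
            if value > threshold:
                flags.setdefault(i, []).append("outlier")

    return flags
```

In this sketch the flagged rows are only collected, not removed, which matches the description of outliers and negative values being routed to manual review rather than dropped automatically.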
Another major data issue relates to data sufficiency for modeling. CalTRACK tests confirmed that modeling outputs are sensitive to the lengths of the baseline and reporting periods. In general, for the billing and daily methods, consumption and temperature data should be sufficient to allow for a 365-day baseline period, and no more than 37 days (10%) of consumption and temperature data may be missing. For fitting baseline models using the hourly methods, no minimum baseline period length is required. However, baseline consumption data must be available for more than 90% of hours in the same calendar month as well as in each of the previous and following calendar months in the previous year. When dealing with billing data, off-cycle reads (spanning less than 25 days) are dropped from analysis, as they typically occur due to meter reading problems or changes in occupancy. The Scorecard also includes several modeling checks related to the success or failure of fitting the regression models. The CalTRACK data preparation and modeling guidelines include detailed information about all of the checks necessary for ensuring CalTRACK compliance.
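A minimal sketch of two of the sufficiency rules above, for the billing and daily methods, might look like the following. The function names, constants, and input layouts are hypothetical, and the sketch assumes that a day counts as missing when either its consumption or its temperature value is absent; the CalTRACK guidelines remain the authoritative statement of the rules.

```python
from datetime import date, timedelta

BASELINE_DAYS = 365           # baseline length for billing and daily methods
MAX_MISSING_DAYS = 37         # roughly 10% of the baseline period
MIN_BILLING_PERIOD_DAYS = 25  # shorter reads are treated as off-cycle


def drop_off_cycle_reads(bills):
    """Keep billing periods that span at least 25 days.

    `bills` is a list of (start_date, end_date, usage) tuples.
    """
    return [
        bill for bill in bills
        if (bill[1] - bill[0]).days >= MIN_BILLING_PERIOD_DAYS
    ]


def daily_baseline_is_sufficient(usage_days, temperature_days, baseline_end):
    """Check the missing-day limit over a 365-day baseline.

    `usage_days` and `temperature_days` are collections of dates that have
    valid data; `baseline_end` is the last day of the baseline (inclusive).
    Assumption: a day is missing if either value is absent.
    """
    window = {baseline_end - timedelta(days=n) for n in range(BASELINE_DAYS)}
    covered = set(usage_days) & set(temperature_days)
    missing = window - covered
    return len(missing) <= MAX_MISSING_DAYS
```

The hourly rule (more than 90% of hours available in the relevant calendar months) is left out of this sketch because it depends on how the comparison months are selected, which the guidelines specify in more detail.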