How to fill gaps in your sustainability data

A standard part of our work is the calculation of energy and carbon footprints. For an energy or carbon footprint, you need to collect sustainability activity data like electricity, natural gas, fuel consumption or waste.

In a perfect world, all required historical and current data would be available in an easily accessible form and would always be accurate. Unfortunately, as you may have experienced yourself, this is not always the case. In this blog post, we will show you three common ways to fill gaps in your sustainability data.

Problems with collecting sustainability data

Common problems with collecting sustainability data include the following:

  1. Incomplete time series: Data may only be available for a few months of the year, it may be available for one year but not another, or the most recent data may not yet be available.
  2. Outdated data: You may require a data set annually, but the data may only be available less frequently. An example of this is waste data based on audits, which are performed infrequently.
  3. Partial data: You may be able to get one data set easily, but not another, or you may only have data for one part of your organisation, but not another.
  4. Unreliable data: Data may be available, but with obvious inconsistencies.

Three common techniques to overcome sustainability data gaps

In this blog post, we will show you three ways to overcome sustainability data gaps:

  1. Interpolation
  2. Extrapolation
  3. Scaling

You need to carefully evaluate your specific circumstances and determine the best option for your particular case. You may also be able to use more than one method for a specific problem and then make a final decision as to what method gives you the best result.

Interpolation of sustainability data

You can estimate missing data in a time series by interpolating between the known data points on either side of the gap. The method for interpolation can be linear or more sophisticated. Linear interpolation means that you are drawing a straight line between the edges of your data gap. More sophisticated methods will allow you to account for more subtle features in your trend.
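
As an illustration, linear interpolation can be sketched in a few lines of Python. This is a minimal sketch rather than a definitive implementation: the function name interpolate_gaps and the monthly electricity figures are hypothetical, and missing periods are assumed to be marked as None. Gaps at the very start or end of the series are left untouched, because filling those would be extrapolation.

```python
def interpolate_gaps(values):
    """Fill None entries that lie between two known values with a straight line."""
    filled = list(values)
    # Positions of the periods for which data is available
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        # Constant change per period between the two edges of the gap
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = values[left] + step * (i - left)
    return filled


# Hypothetical monthly electricity consumption in kWh, with two missing months
monthly_kwh = [1200.0, 1100.0, None, None, 800.0, 750.0]
print(interpolate_gaps(monthly_kwh))
# [1200.0, 1100.0, 1000.0, 900.0, 800.0, 750.0]
```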

Figure 1: Using interpolation for data gaps

Please note that if your data fluctuates significantly, using interpolation will not give you the best result. It is good practice to compare interpolated estimates with surrogate/proxy data (see ‘Scaling’ section) as a quality control check.

Extrapolation of sustainability data

You will need to extrapolate your sustainability data to produce estimates for years after your last available data point and before new data is available. Extrapolation is similar to interpolation, but less is known about the trend.

Extrapolation can be conducted either forward (to predict future emissions or energy consumption) or backward (for instance, to estimate a base year). Trend extrapolation assumes that the trend observed during the period for which data is available remains constant over the period of extrapolation. If the trend is changing, you should consider using proxy data (see next section).

Figure 2: Using extrapolation for data gaps

When you use the simple linear method, you extend the trend line beyond the end of your known data. You can also use more sophisticated extrapolation methods to account for more subtle features in the data trend.
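
The simple linear method can be sketched as follows. One way to obtain the line, assumed here, is an ordinary least-squares fit through the known data points; other choices (such as extending the line through the last two points only) are equally possible. The function names and the annual emissions figures are hypothetical.

```python
def fit_line(years, values):
    """Least-squares straight line through the known data: returns (slope, intercept)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def extrapolate(years, values, target_year):
    """Extend the fitted line to a year outside the known data (forward or backward)."""
    slope, intercept = fit_line(years, values)
    return slope * target_year + intercept


# Hypothetical annual emissions in tonnes of CO2-e
years = [2016, 2017, 2018, 2019]
emissions = [1000.0, 1050.0, 1100.0, 1150.0]

print(extrapolate(years, emissions, 2021))  # forward projection: 1250.0
print(extrapolate(years, emissions, 2014))  # backward estimate, e.g. a base year: 900.0
```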

The longer the extrapolation projects into the future, the more uncertainty is introduced. However, it is better to have an estimate than to have none at all.

It is good practice to update projected graphs with real data as it becomes available and to subsequently update your projections.

Please note that extrapolation is not a good technique when the trend is not constant over time. In this case, you may consider using extrapolations based on surrogate data.

Scaling

Scaling works by applying a ratio of known data to your data gap. The ratio is called a ‘scaling factor’, and the known data is called surrogate or proxy data. Surrogate data is strongly correlated with the sustainability data you are estimating and is more readily available than the data you are missing.

For instance, emissions from transport are related to how many kilometres you travelled. Energy consumption in a building is related to how many people use the building. Emissions from wastewater are related to the population number.
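
To make the arithmetic concrete, here is a minimal sketch of scaling that uses the number of employees as the surrogate data. The function names, the two offices and all figures are hypothetical, and the sketch assumes that energy use is roughly proportional to headcount.

```python
def scaling_factor(known_activity, known_proxy):
    """Ratio of known activity data to the matching proxy data."""
    return known_activity / known_proxy


def estimate_from_proxy(factor, proxy_value):
    """Apply the scaling factor to the proxy value of the period or site with missing data."""
    return factor * proxy_value


# Hypothetical: office A used 120,000 kWh with 60 employees.
# Office B has 45 employees, but its electricity data is missing.
kwh_per_employee = scaling_factor(120_000.0, 60)
print(kwh_per_employee)                           # 2000.0 kWh per employee
print(estimate_from_proxy(kwh_per_employee, 45))  # 90000.0 kWh estimated for office B
```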

Figure 3: Using scaling for data gaps

In some cases, you may need to use regression analysis to identify the most suitable surrogate data. Using surrogate data can improve the accuracy of estimates developed by interpolation and extrapolation.
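
A simple way to screen candidate surrogate data is to check how strongly each candidate correlates with the activity data you do have. The sketch below uses the Pearson correlation coefficient as a simplified stand-in for a full regression analysis. The candidate names and all figures are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def best_surrogate(activity, candidates):
    """Name of the candidate whose correlation with the activity data is strongest."""
    return max(candidates, key=lambda name: abs(pearson_r(candidates[name], activity)))


# Hypothetical quarterly energy use (MWh) and two candidate proxies
energy_mwh = [100.0, 120.0, 140.0, 160.0]
candidates = {
    "operating_hours": [10.0, 12.0, 14.0, 16.0],
    "number_of_visitors": [5.0, 3.0, 6.0, 4.0],
}
print(best_surrogate(energy_mwh, candidates))  # operating_hours
```

A strong correlation over a handful of data points does not prove that the relationship will hold, so it is worth checking that the chosen surrogate also makes sense for your operations.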

Common scaling factors include:

  • number of employees, square metres, operating hours, or population (for community greenhouse gas inventories)
  • economic factors like production output, revenue, or GDP (for community greenhouse gas inventories)
  • weather-related factors like heating degree days or cooling degree days

Case example for extrapolation using scaling

One of our clients was evaluating the adoption of a science-based target. Given that a target is set some time in the future, they needed to find out how much carbon emissions would grow in the absence of abatement measures. Calculating this trend would show the size of the reduction task going forward.

We approached this task by following these steps:

  1. Extrapolation of the available historical greenhouse gas emissions into the future by applying an assumed year-on-year growth scaling factor.
  2. Refinement of the estimated trend by plotting known plant closures and other identified changes onto the time series.
  3. Application of estimated future emission factors. Since the grid is getting greener with new renewable energy projects feeding into it, the greenhouse gases associated with electricity consumption for the same underlying use reduce over time.
  4. Development of emission reduction scenarios. Once the baseline emissions growth was estimated, we developed emission reduction scenarios based on energy efficiency and renewable energy opportunities.
  5. Development of a graph to communicate the findings to the management team.
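
The first three steps above can be sketched in code. This is an illustrative simplification, not the model used in the project: it assumes that all emissions come from grid electricity, and the function name, growth rate, closure size and emission factors are hypothetical values chosen only to show the mechanics.

```python
def project_baseline(base_year, base_mwh, growth_rate, closures_mwh, grid_factors, end_year):
    """Project business-as-usual emissions (t CO2-e) for each year after the base year.

    closures_mwh: {year: electricity (MWh) that drops out from that year onward}
    grid_factors: {year: assumed grid emission factor in t CO2-e per MWh}
    """
    projection = {}
    consumption = base_mwh
    for year in range(base_year + 1, end_year + 1):
        consumption *= 1 + growth_rate               # step 1: assumed year-on-year growth
        consumption -= closures_mwh.get(year, 0.0)   # step 2: known plant closures
        projection[year] = consumption * grid_factors[year]  # step 3: greener grid
    return projection


# Hypothetical inputs
baseline = project_baseline(
    base_year=2020,
    base_mwh=10_000.0,
    growth_rate=0.02,
    closures_mwh={2022: 1_000.0},
    grid_factors={2021: 0.80, 2022: 0.78, 2023: 0.76},
    end_year=2023,
)
for year, tonnes in baseline.items():
    print(year, round(tonnes, 1))
```

Emission reduction scenarios (step 4) could then be expressed as adjustments to this baseline, for example by subtracting the expected savings from each energy efficiency or renewable energy opportunity.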

As a result of this extrapolation, our client was able to make an informed decision as to the ambition level of their target, as well as a suitable timeframe.

Conclusion

Choosing the right method depends on an assessment of the volatility of the sustainability data trend, whether surrogate data is available and adequate, and the length of time for which activity data is missing. If you need help with filling in data gaps, you should consider getting expert advice.

100% Renewables are experts in dealing with data gaps and projecting trends. If you need help with managing your data, please contact Barbara or Patrick.

Feel free to use an excerpt of this blog on your own site, newsletter, blog, etc. Just send us a copy or link and include the following text at the end of the excerpt: “This content is reprinted from 100% Renewables Pty Ltd’s blog.”