By Mohini Bariya, Energy Research Scientist, Molly Hickman, Data Scientist and Genevieve Flaspohler, Chief Data Officer at nLine
How can we estimate SAIDI (System Average Interruption Duration Index), the average power outage duration experienced across all customers served, from PowerWatch sensor measurements of only a subset of customers? This post describes nLine’s statistical approach to estimating SAIDI from such a dataset. The nLine method has several favorable statistical properties that make it well-suited to calculating SAIDI in the real world, where data is generally limited, a constraint that traditional SAIDI calculations do not account for.
SAIDI is one of a constellation of critical KPIs (key performance indicators) used by electric utilities, regulators, and investors to quantify the reliability of power grid infrastructure. These indicators inform crucial decision making on where money and effort should be directed to improve grid performance. SAIDI stands for “System Average Interruption Duration Index” and measures the total power outage time that an average customer experiences over a given period. For a period of interest $T$ and a total of $N$ customers in some region of interest, SAIDI is defined as:

$$\mathrm{SAIDI}_T = \frac{1}{N} \sum_{i=1}^{N} o_{i,T} \tag{1}$$

where $o_{i,T}$ is the total power outage time experienced by customer $i$ over period $T$. This metric communicates the average customer experience of grid failure duration. For example, a SAIDI of 5 hours for January 2022 means that customers experienced an average of 5 hours of outage in total across the month.
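As a minimal sketch of this definition, assuming complete per-customer outage totals are available, the calculation is just an average. The function name and the sample values below are hypothetical and only illustrate the arithmetic:

```python
def classical_saidi(outage_hours_per_customer):
    """Classical SAIDI: mean total outage time per customer over one period.

    Assumes the list holds one entry for EVERY customer in the region,
    which is exactly the complete data the definition requires.
    """
    if not outage_hours_per_customer:
        raise ValueError("need outage totals for at least one customer")
    return sum(outage_hours_per_customer) / len(outage_hours_per_customer)


# Hypothetical outage totals (hours) for four customers over one month.
example_outages = [3.0, 7.0, 5.0, 5.0]
print(classical_saidi(example_outages))  # 5.0
```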
SAIDI can be calculated over subsets of time (months, weeks, seasons) and space (an entire city, a district, a single distribution feeder) to reveal spatial and temporal trends and variation in grid performance, such as weather impacts, demand effects, or underperforming feeders. In this way, SAIDI provides information about outages that enables electric utilities and regulators to assess overall system performance, quantify the efficacy of upgrade projects, and identify regions for targeted interventions.
To compute SAIDI per its definition, for a given region and time period, an electric utility must know the total number of customers in the region and the total outage time experienced by each customer during the time period. This data is usually provided by devices such as smart meters or network switches, which identify outage start and end times for single or small groups of customers with high temporal resolution and fidelity. If the data is to be adequate for directly calculating SAIDI, these devices must be pervasive throughout the system. In large swathes of western Europe and North America, extensive and costly rollouts of networked grid equipment such as smart meters have provided utilities the granular and comprehensive data needed to compute SAIDI by equation (1) [Dunn, 2019].
The story in low- and middle-income countries (LMICs) is very different. In most LMICs, grid sensing, particularly at the distribution level, is very limited or nonexistent. When grid data fall short of comprehensive outage monitoring, SAIDI can only ever be estimated, never computed exactly. Utilities in LMICs work with what data is available to estimate SAIDI. This data often consists of customer calls (to estimate when a given outage began) and repair records (to estimate when a given outage ended), which are both noisy and biased: they depend heavily on the time of day an outage occurred, who at the reporting center took the call, the mood of the customer, and many other factors. Given the difficulty of using such data to compute SAIDI, a significant fraction of utilities in LMICs do not compute SAIDI at all [World Bank, 2020], and those that do may produce rough estimates that differ widely from customer perceptions of outage time [Kunaifi, 2017]. However, many international funders, such as the Millennium Challenge Corporation, require baseline SAIDI data to decide how to invest in grid development projects [MCC, 2021], and SAIDI is a very useful internal tool for prioritizing grid maintenance and upgrades. This puts utilities in LMICs in a challenging position: they are under internal and external pressure to provide SAIDI but often lack the necessary data and processes to do so accurately.
To address this challenge, we must move beyond the formal SAIDI equation of (1) that demands complete data—covering all outages at all customers—that numerous utilities lack. At nLine, we start with the premise that SAIDI must instead be estimated from the data and customer experiences that are available. nLine provides two key innovations for estimating SAIDI when comprehensive grid data is not available: (a) sparse deployments of PowerWatch sensors, which measure outages at customer connection points, and (b) an unbiased and consistent estimation approach for using limited customer experiences to estimate SAIDI. This post introduces and describes our estimation approach (b); our past and upcoming blog posts describe our sensor deployment methodology, projects in countries where we’ve deployed sensors, and how individual customer experiences are clustered to provide estimates of outage start times and durations (a).
nLine captures customer experiences with PowerWatch devices, which are deployed across the electric grid at customer connection points, as visualized in Figure 2. The PowerWatch devices record individual customer experiences of the grid (and through our clustering algorithm, we corroborate the outage information of individual sensors to provide a more robust and reliable picture of grid outages). For the remainder of this post, we abstract away how nLine determines when outages start and end from PowerWatch data (a); we simply assume that we have access to individual outage experiences. These individual outage experiences are exactly the elements that feed into the average in the classical SAIDI of (1).
However, nLine only observes the experiences of a subset of all grid customers. Even for those customers that are instrumented with a PowerWatch device, some moments of their grid experience may be unobserved due to occasional drops in sensor reporting or sensor failure. Lacking the data from all customers and all times in a period, we cannot directly compute the classical SAIDI equation using PowerWatch data. Instead, nLine extrapolates SAIDI from subsampled customer experiences.
That might sound abstruse, but this type of approach is very widespread and both familiar and natural to many of us. We know that the polling data, health statistics, or survey responses we see in the news could not have been obtained by asking every single person on earth for their political leanings, medical records, or personal preferences. Instead, they must have been obtained from a manageable subset of people whose results we assume reflect those of the wider population - this is exactly extrapolation from subsampled data.
Now, let’s return to our context. How should we estimate a SAIDI value from this subsampled outage data?
A simple approach (which we do not use) would be to directly plug the data we have into the classical equation of (1), glossing over the unsensed customers and unreported outages:

$$\widehat{\mathrm{SAIDI}}^{\,\text{naive}}_T = \frac{1}{n} \sum_{i=1}^{n} \hat{o}_{i,T} \tag{2}$$

This equation uses the captured outage time at the $n$ sensed customers to generate an estimate of SAIDI for all customers. The reported outage time experienced by customer $i$ during the period $T$ is denoted $\hat{o}_{i,T}$ and will be less than or equal to the true outage time (denoted $o_{i,T}$) experienced by the customer.
By directly using the reported outage time in place of the true outage time, this estimator is implicitly assuming that every PowerWatch device was sensing and reporting over the entire period. We know this is not always true. Some sensors, for one reason or another, may have been unable to communicate their data for parts of the period, e.g. because they lost cell connectivity. Others may have only been deployed for part of the period. This reality seriously compromises the results of the estimator.
Consider a simple example: suppose we have two customers in Ablekuma, whose data we will (ambitiously) use to estimate SAIDI for the district. There is Agnes, whose power was unfailingly monitored by a PowerWatch device for the whole month of January. Her complete PowerWatch data shows she experienced a total of hours of outage time in the month. Another customer, Kofi, was given a PowerWatch only in the second half of the month. His PowerWatch data records 2.5 hours of total outage time. Plugging this data into (2) would give us a SAIDI of:
This result sits uneasily. Intuitively, we can see that by treating Kofi (whose PowerWatch reported for only half the month) in just the same way as Agnes (whose PowerWatch reported for the whole month), we are under-counting the numerator (or over-counting the denominator), leading to a SAIDI underestimate. We know that Kofi likely experienced outages in the first half of the month that are absent from this equation, but this estimator implicitly assumes that power delivery was perfect when PowerWatch was not watching.
This example vividly illustrates the shortcomings of using the simple approach in (2) to estimate SAIDI from subsampled data. Fortunately, at nLine, we do not estimate SAIDI in this way. Instead, we employ a better estimator that accounts for the fact that not all sensors report all the time, and treats each hour sensed equally.
We now introduce our hero, the proportional (aka nLine) estimator:

$$\widehat{\mathrm{SAIDI}}_T = \frac{O_T}{R_T}\,|T| \tag{3}$$

where $O_T$ is the total duration of outages sensed by PowerWatch during period $T$, and $R_T$ is the total duration sensed by PowerWatch during period $T$, both summed over all sensors. For each sensor $i$, the sensed duration $r_{i,T}$ will be less than or equal to the period length $|T|$, and most likely less than it, for the reasons mentioned above (the sensor may have been deployed for only part of the period or perhaps it just wasn’t able to report for the whole period). In words, the proportional estimator takes the total observed outage time across all sensors during the period ($O_T$) as a proportion of the total sensed time ($R_T$), and multiplies that by the length of the period ($|T|$) to obtain the average time spent in outages. Outage time, powered time, reporting time and how these three values are used to estimate nLine’s SAIDI are illustrated in Figure 3, which provides visual intuition for this approach.
Let’s think of the experiences of customers over time as constituting an area of size $N \times |T|$ (the number of customers by the duration of the period). Then, the true SAIDI is the ratio of the grey area (total outage time) to the full yellow + grey area (total time) multiplied by the period duration ($|T|$). The proportional estimator observes a subset of the full area according to which and when customers are monitored (indicated by the purple boxes). It computes the same ratio of outage to total time but over the area it observes. This is multiplied by the full period duration to produce the estimator’s best guess of SAIDI.
Let’s use the nLine estimator in the same example involving Agnes and Kofi in Ablekuma. With Agnes’s PowerWatch reporting hours of outage time and Kofi’s reporting 2.5 hours, the total outage time recorded is hours. Agnes’s PowerWatch sensed the full month, while Kofi’s was monitoring for half the month, giving us a total sensed time of 1.5 months. Finally, the full period duration is 1 month. Plugging these numbers into the nLine estimator (3), we get a SAIDI of:
In effect, the nLine estimator extrapolates from the two weeks that we were observing the power at Kofi’s house. He experienced 2.5 hours without power in the two weeks that PowerWatch was watching. We don’t know what happened in the two weeks we did not observe — maybe he had fewer interruptions, or maybe he had more — but the estimator uses the information we did collect and guesses that his power reliability was the same during sensed and unsensed time periods.
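To make the contrast between the two estimators concrete, here is a small sketch of the Agnes and Kofi example. Kofi's 2.5 hours over half a month comes from the text; the 5.0 hours used for Agnes is an assumed, illustrative value, and the function names are hypothetical rather than nLine's actual code:

```python
HOURS_IN_JANUARY = 31 * 24  # period length in hours


def naive_saidi(outage_hours):
    """Naive estimator: average reported outage time per sensed customer."""
    return sum(outage_hours) / len(outage_hours)


def proportional_saidi(outage_hours, sensed_hours, period_hours):
    """Proportional estimator: outage share of sensed time, scaled to the period."""
    total_sensed = sum(sensed_hours)
    if total_sensed <= 0:
        raise ValueError("total sensed time must be positive")
    return sum(outage_hours) / total_sensed * period_hours


# Agnes: sensed all month; 5.0 h of outage is an ASSUMED illustrative value.
# Kofi: sensed for half the month; 2.5 h of outage, as in the text.
outage_hours = [5.0, 2.5]
sensed_hours = [HOURS_IN_JANUARY, HOURS_IN_JANUARY / 2]

naive = naive_saidi(outage_hours)
proportional = proportional_saidi(outage_hours, sensed_hours, HOURS_IN_JANUARY)

print(f"naive:        {naive:.2f} hours")
print(f"proportional: {proportional:.2f} hours")
```

Under these assumed numbers the naive estimate comes out lower than the proportional one, because Kofi's unobserved half-month is silently counted as outage-free.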
As this example shows, the proportional estimator gives a SAIDI value that intuitively feels more reasonable than the naive estimator of (2). This intuition is reflected in the mathematical properties of the estimator! This estimator has two very desirable statistical qualities. In the following, when we say “true SAIDI,” we mean the value we would obtain if we somehow had data on all customer outage experiences to calculate (1) directly.
- Unbiased [Appendix I]. The bias of an estimator tells us whether it tends to yield values that are higher or lower than the true value. The proportional estimator is unbiased, which means that it does not systematically overestimate or underestimate: on average, the estimate generated by the proportional estimator equals the true SAIDI. This claim comes with caveats: it assumes our sensor deployments are not themselves biased, i.e., we’re not systematically sampling from only those parts of Accra with unusually poor (or exceptionally good) power reliability. It also assumes that reporting rates are independent of outage rates. See the “Importance of Sampling” section for more discussion on this.
- Consistent [Appendix II]. Statistically, consistency means the following: if we think of the SAIDI estimate produced by the proportional estimator as a random variable, its distribution grows narrower and narrower about the true value as the quantity of data used in the estimator increases (as we observe more sensors over longer time). This means that as we use more data (more PowerWatch observations within the period) in each estimate, our SAIDI estimates will become increasingly close to the true SAIDI, and therefore to each other.
That the proportional estimator is both consistent and unbiased means that as the customer data used increases, our SAIDI estimate has a vanishing chance of ever being far off from the true SAIDI. This property is visualized in Figure 4: as observations are amassed, the distribution of the estimate grows ever tighter around the true parameter value.
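A toy simulation can illustrate both properties. This is only a sketch under stated assumptions, not nLine's data or code: each simulated customer has their own outage fraction drawn uniformly around an assumed population mean of 1%, and each sensor's reporting time is drawn independently of that fraction. All names and parameters are hypothetical.

```python
import random
import statistics

PERIOD_HOURS = 720.0          # a 30-day period
MEAN_OUTAGE_FRACTION = 0.01   # assumed population average, so true SAIDI is 7.2 h


def simulate_estimate(n_sensors, rng):
    """Proportional SAIDI estimate from one simulated sensor deployment."""
    total_outage = 0.0
    total_sensed = 0.0
    for _ in range(n_sensors):
        # Customer-specific outage fraction; its population mean is 0.01.
        outage_fraction = rng.uniform(0.0, 2 * MEAN_OUTAGE_FRACTION)
        # Reporting time is drawn independently of the outage fraction.
        sensed_hours = rng.uniform(100.0, PERIOD_HOURS)
        total_outage += outage_fraction * sensed_hours
        total_sensed += sensed_hours
    return total_outage / total_sensed * PERIOD_HOURS


def summarize(n_sensors, n_trials=300, seed=0):
    """Mean and spread of the estimate across many simulated deployments."""
    rng = random.Random(seed)
    estimates = [simulate_estimate(n_sensors, rng) for _ in range(n_trials)]
    return statistics.mean(estimates), statistics.pstdev(estimates)


for n in (5, 50, 500):
    mean, spread = summarize(n)
    print(f"{n:>3} sensors: mean estimate {mean:.2f} h, spread {spread:.2f} h")
```

In this setup the mean estimate stays near the true value of 7.2 hours at every sample size, while the spread shrinks as sensors are added, which is the behavior that unbiasedness and consistency describe.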
The proportional estimator extrapolates the observed outage statistics at sensed customers to unobserved times at both sensed and unsensed customers. We’ve already discussed the advantageous statistical properties of this estimator; however, those properties rely on a fundamental assumption that is necessary for the success of any sampling-based estimation method (which includes the majority of political surveys, medical studies, risk assessments, modeling methods, etc.): the statistics of our subsample must reflect the full population statistics.
Practically, this means that our sensor deployments should cover a slice of customers whose collective experience of reliability mirrors the reliability over all Accra. If we deploy sensors only in a neighborhood where a recent upgrade is resulting in exceptionally good reliability, our SAIDI estimate will extrapolate this experience to the entire city, producing a misleadingly rosy picture of reliability. Similarly, if we only deploy sensors on the feeder with the oldest, worst performing infrastructure, the extrapolation will lead to an overly pessimistic SAIDI.
The need for a representative sample holds not just for space, but also for time. If outages are minimal from 1-6 AM, and we only collect outage data in this period, extrapolating the resulting SAIDI to all times will produce a result far from the true SAIDI. This means our unbiasedness claim also relies on the assumption that outage and reporting rates are independent.
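A small arithmetic sketch shows why this independence matters. The numbers and the helper name are hypothetical: a single customer has 10 hours of true outage in a 720-hour period, and we compare a sensor whose reporting gaps are unrelated to outages with one that tends to go silent during outages.

```python
PERIOD_HOURS = 720.0
TRUE_OUTAGE_HOURS = 10.0
TRUE_POWERED_HOURS = PERIOD_HOURS - TRUE_OUTAGE_HOURS


def estimate_for_one_sensor(observed_outage_hours, sensed_hours):
    """Proportional estimate from a single sensor's observed data."""
    return observed_outage_hours / sensed_hours * PERIOD_HOURS


# Case 1: gaps unrelated to outages. The sensor misses half of everything,
# so outage and powered hours are thinned by the same factor.
independent_estimate = estimate_for_one_sensor(
    observed_outage_hours=0.5 * TRUE_OUTAGE_HOURS,
    sensed_hours=0.5 * PERIOD_HOURS,
)

# Case 2: gaps tied to outages. The sensor reports every powered hour but
# only half of the outage hours (for example, it loses connectivity).
dependent_estimate = estimate_for_one_sensor(
    observed_outage_hours=0.5 * TRUE_OUTAGE_HOURS,
    sensed_hours=TRUE_POWERED_HOURS + 0.5 * TRUE_OUTAGE_HOURS,
)

print(f"independent gaps: {independent_estimate:.2f} h")
print(f"outage-linked gaps: {dependent_estimate:.2f} h")
```

With independent gaps the estimate recovers the true 10 hours; when reporting drops are linked to outages, the same estimator lands at roughly half of that.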
Since our sensors collect data continually and have a high uptime, we have minimal concern that our temporal sampling is skewed. To avoid introducing bias through the customers we select to monitor, careful deployment design is of the utmost importance. We have deployed over 1,200 sensors in Accra, which we believe capture the full range of power experiences in Accra. There are PowerWatch devices in rural Mampong and metropolitan Accra, informal settlements and more affluent neighborhoods, households and businesses. However, we certainly have more sensors deployed in some parts of the Greater Accra Region than others, and it is possible that our sensors are not a perfectly representative sample; in fact, it’s certainly not perfect! Yet, almost all data collection endeavors are afflicted by the possibility of sampling bias, which is extremely challenging to determine, quantify and completely avoid.
To make sense and use of SAIDI, utilities would like to slice and dice it over time and space in several different ways. For example, to obtain a high-level, aggregate picture of system performance, a utility might be interested in a single SAIDI value covering their entire network over a full year. On the other hand, to understand the impacts of weather or demand variation on system performance, a utility may want SAIDI for each month of the year. And to inform where urgent maintenance or upgrades are needed, a utility could turn to SAIDI values at a district or individual distribution feeder level. These are referred to as distinct time and space groupings of SAIDI.
Many time and space SAIDI groupings have significant overlap in terms of the sensor experiences that determine them. For example, for a single network in the year 2021, the annual system-wide SAIDI contains the same set of sensor experiences as 12 monthly system-wide SAIDI values. However, for an arbitrary SAIDI estimator, it is not generally possible to obtain the annual SAIDI directly from the monthly values. This results in an added computational burden: to get SAIDI for the year, we have to start from scratch with the complete dataset of individual sensor experiences, as we are unable to leverage precomputed monthly SAIDI. Considering that we may want to provide lots and lots of different time and space groupings, this burden of re-computation can become significant.
In this context, the proportional estimator has convenient aggregation properties. For each time & space grouping, we don’t have to always restart from the full sensor-level data. Instead, we can build up a more aggregated SAIDI estimate from precomputed SAIDI values for smaller time and space groupings. To do so, we will need an additional piece of data along with the SAIDI values themselves: the aggregate reporting time that went into each SAIDI value.
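Let’s say that we have SAIDI for two non-overlapping time periods $T_1$ and $T_2$, and we are interested in SAIDI across the joined period $T_1 \cup T_2$. For example, $\mathrm{SAIDI}_{T_1}$ could be SAIDI for January to June and $\mathrm{SAIDI}_{T_2}$ could be SAIDI for July to December, and we may want SAIDI for the whole year. Then, given the total reporting times for each period, $R_{T_1}$ and $R_{T_2}$, we can obtain SAIDI for $T_1 \cup T_2$ from the finer resolution SAIDI directly as follows:

$$\mathrm{SAIDI}_{T_1 \cup T_2} = \frac{\mathrm{SAIDI}_{T_1}\,R_{T_1}/|T_1| + \mathrm{SAIDI}_{T_2}\,R_{T_2}/|T_2|}{R_{T_1} + R_{T_2}}\,\bigl(|T_1| + |T_2|\bigr) \tag{4}$$

Here $|T_1|$ and $|T_2|$ are the period lengths, and each term $\mathrm{SAIDI}_{T_k}\,R_{T_k}/|T_k|$ recovers the total observed outage time in period $T_k$.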
This allows us to easily aggregate SAIDI, for example going from daily SAIDI values to monthly SAIDIs and on to annual SAIDIs without returning to the full sensor-level dataset after the first step. We can perform spatial aggregation very similarly, aggregating up from site-level SAIDI to districts, cities, and countries.
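The aggregation step can be sketched as follows. This is a hypothetical illustration that assumes each precomputed SAIDI value is stored together with its total reporting time and its period length; the class and function names and the sample numbers are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SaidiSummary:
    saidi_hours: float      # proportional SAIDI estimate for the period
    reporting_hours: float  # total sensor reporting time within the period
    period_hours: float     # length of the period itself


def from_raw(outage_hours, reporting_hours, period_hours):
    """Build a summary directly from sensor-level totals."""
    saidi = outage_hours / reporting_hours * period_hours
    return SaidiSummary(saidi, reporting_hours, period_hours)


def combine(a, b):
    """Merge summaries for two non-overlapping periods without raw data."""
    # Recover each period's observed outage time from its summary.
    outage_a = a.saidi_hours * a.reporting_hours / a.period_hours
    outage_b = b.saidi_hours * b.reporting_hours / b.period_hours
    reporting = a.reporting_hours + b.reporting_hours
    period = a.period_hours + b.period_hours
    return SaidiSummary((outage_a + outage_b) / reporting * period, reporting, period)


# Hypothetical half-year totals: (outage sensor-hours, reporting sensor-hours, period hours).
first_half = from_raw(120.0, 40_000.0, 181 * 24)
second_half = from_raw(90.0, 50_000.0, 184 * 24)

full_year = combine(first_half, second_half)
direct = from_raw(120.0 + 90.0, 40_000.0 + 50_000.0, 365 * 24)

print(f"combined from halves: {full_year.saidi_hours:.3f} h")
print(f"computed from raw:    {direct.saidi_hours:.3f} h")
```

Because `combine` returns the same kind of summary it consumes, the same function can be applied repeatedly, for instance rolling days into months and months into a year.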
These nice properties of the proportional estimator mean that we can efficiently provide SAIDI for many time and space groupings, which greatly enriches the potential for insights, guidance, and operational value that a utility can obtain from this data.
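Recall the equation for our proportional estimator:

$$\widehat{\mathrm{SAIDI}}_T = \frac{\sum_{i=1}^{n} \hat{o}_{i,T}}{\sum_{i=1}^{n} r_{i,T}}\,|T| \tag{3}$$

where $\hat{o}_{i,T}$ is the outage duration observed by sensor $i$ during period $T$ and $r_{i,T}$ is that sensor's reporting time. We can replace the observed outage durations $\hat{o}_{i,T}$ with an observed outage proportion denoted $p_{i,T} = \hat{o}_{i,T} / r_{i,T}$: the fraction of its total reporting time that a sensor experienced an outage. The outage proportion is a number between $0$ and $1$, inclusive. Expressed in terms of outage proportions, the estimator equation is:

$$\widehat{\mathrm{SAIDI}}_T = |T|\,\frac{\sum_{i=1}^{n} r_{i,T}\,p_{i,T}}{\sum_{i=1}^{n} r_{i,T}} \tag{5}$$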
The bias of the estimator can be determined by finding its expectation. If the expectation of the estimator is equal to the value of interest, the estimator is unbiased. The following derives the expectation of the estimator in (5).
where we use the Law of Total Expectation to go from (6) to (7).
The final expression is indeed the target quantity: it is the average outage proportion for the population multiplied by the duration of the period of interest, which gives us the average total outage time experienced over the period. Therefore, the estimator is unbiased.
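One way to write out this expectation argument is sketched below. The notation is an assumption made for the sketch: $r_{i,T}$ is the reporting time of sensor $i$, $p_{i,T}$ is its observed outage proportion, $\mathbf{r} = (r_{1,T}, \dots, r_{n,T})$ collects the reporting times, and $\mu$ is the population-average outage proportion. The key assumption, matching the caveats in the main text, is that $E[p_{i,T} \mid \mathbf{r}] = \mu$, i.e. the sample is representative and reporting is independent of outages.

```latex
\begin{aligned}
E\!\left[\widehat{\mathrm{SAIDI}}_T\right]
  &= E\!\left[\,|T|\,\frac{\sum_{i} r_{i,T}\,p_{i,T}}{\sum_{i} r_{i,T}}\right] \\
  &= E\!\left[\,E\!\left[\,|T|\,\frac{\sum_{i} r_{i,T}\,p_{i,T}}{\sum_{i} r_{i,T}}
       \,\middle|\, \mathbf{r}\right]\right]
       && \text{(Law of Total Expectation)} \\
  &= |T|\;E\!\left[\frac{\sum_{i} r_{i,T}\,E\!\left[p_{i,T} \mid \mathbf{r}\right]}{\sum_{i} r_{i,T}}\right] \\
  &= |T|\;E\!\left[\frac{\mu \sum_{i} r_{i,T}}{\sum_{i} r_{i,T}}\right] \\
  &= |T|\,\mu .
\end{aligned}
```

The reporting times cancel in the last step only because the conditional mean of every $p_{i,T}$ is the same $\mu$; if reporting depended on outages, that step would fail and the estimator could be biased.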
To understand the consistency of the estimator, we wish to derive how its variance evolves in the limit as the number of samples grows (in this case as time goes to $\infty$). If the variance has a limit of $0$, the estimator is consistent. The following derives the variance of the estimator in (5).
This rewrite is enabled by the Law of Total Variance. Let’s simplify each of the terms separately, starting with the easier second term.
Now for the first term:
Let’s consider the limit of each of these terms as the observation time $r_{i,T}$ grows larger: $r_{i,T} \to \infty$.
will be some bounded, finite value. Similarly, will be bounded and finite.
$p_{i,T}$ is the ratio of the outage time to reporting time experienced by sensor $i$ over period $T$. It is therefore an average, and as the reporting time $r_{i,T}$ grows, the variance of this quantity will decrease if the underlying outage probability is constant. Therefore:
Now, for the covariance:
Plugging these results into (20), we can conclude:
which tells us that the estimator is consistent.