Causal Inference Using Bayesian Structural Time-Series Models

Investigating the effect of training activities on the volume of bugs reported by a software engineering team

Nick Cox
Towards Data Science


In this article, I provide a brief introduction to causal inference and why data scientists need it, and then present a practical example of how to apply the concept using the Python library Causal Impact.

Introduction to Causal Inference

Britannica defines causal inference with a useful example that we can all understand:

In a causal inference, one reasons to the conclusion that something is, or is likely to be, the cause of something else. For example, from the fact that one hears the sound of piano music, one may infer that someone is (or was) playing a piano. But although this conclusion may be likely, it is not certain, since the sounds could have been produced by an electronic synthesizer.

Causal inference is about determining the effect of an event or intervention on a desired outcome metric. It can also be thought of as determining whether a change in the outcome metric was caused by an event or intervention.

For example, 1) what was the effect of a marketing campaign (the intervention) on the sales of our products (the outcome), and 2) did the sales of our products increase because of a marketing campaign or was it because of a different reason? We can use causal inference to answer these questions.

Causal inference is commonly utilized when making decisions that can impact millions of people or involve millions of dollars, such as in healthcare, public policy, science, and business. In such cases, it is important for our analysis to be grounded in reliable statistics and not just casual glances at data and plots.

The challenge with analyzing the effects of an intervention is that we cannot directly observe how the series would have trended without that intervention. In our marketing example, we have a record of our sales after the campaign, but we do not know what the sales would have been without it.

This is where causal inference using Bayesian structural time-series models can help us. We can use such a model to predict what would have happened without the intervention, which is called the counterfactual. We can then compare the counterfactual with what we actually observed.

Causal Impact Library

In 2014, Google released CausalImpact, an R package for causal inference in time series. The Python Causal Impact library, which we use in our example below, is a full implementation of Google’s model with all functionality ported.

The implementation of the library is best explained by its author:

The main goal of the algorithm is to infer the expected effect a given intervention (or any action) had on some response variable by analyzing differences between expected and observed time series data.

Data is divided in two parts: the first one is what is known as the “pre-intervention” period and the concept of Bayesian Structural Time Series is used to fit a model that best explains what has been observed. The fitted model is used in the second part of data (“post-intervention” period) to forecast what the response would look like had the intervention not taken place. The inferences are based on the differences between observed response to the predicted one which yields the absolute and relative expected effect the intervention caused on data.
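To make this a little more concrete, the model behind CausalImpact (from Brodersen et al.’s paper) is a state-space model. In simplified form, the two core equations are:

$$y_t = Z_t^\top \alpha_t + \varepsilon_t, \qquad \varepsilon_t \sim \mathcal{N}(0, \sigma_t^2)$$

$$\alpha_{t+1} = T_t \alpha_t + R_t \eta_t, \qquad \eta_t \sim \mathcal{N}(0, Q_t)$$

Here $y_t$ is the observed response and the latent state $\alpha_t$ bundles components such as a local trend, seasonality, and a regression on the control series. The model is fitted on the pre-intervention period and then projected forward to produce the counterfactual for the post-intervention period.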

Causal Inference by Example

Here is the scenario we will work with:

  • A streaming service, WebFlix, delivers its content through several channels: iOS app, Android app, Roku app, Fire TV app and web browsers.
  • Each channel is managed by a different software engineering team.
  • The engineering teams track the number of bugs reported each week and monitor patterns.
  • Management of the Web team identified a worrying upward trend in the number of bugs reported in early 2020 and, in May 2020, provided training to that team alone to address the problem.

Following the training, the number of bugs reported for the Web team decreased and stabilized for the remainder of 2020. Was this decrease in bugs reported a result of the training provided or was there something else?

Our dataset contains weekly bug reporting for each of the software engineering teams. All code and data is available in this GitHub repo.

You can see in the below line plot the increase in bugs reported for the Web team in the first part of 2020. The red dashed line indicates the week in which training was provided.

This next line plot shows the trends of bugs reported by all of the software engineering teams. Bug reporting for all of the teams other than Web is fairly stable throughout the year, bouncing around within a consistent band.
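As a minimal sketch, plots like these could be produced as follows. The file and column names (date, team, bugs) are assumptions for illustration; check the dataset in the GitHub repo for the actual ones.

```python
import matplotlib.pyplot as plt
import pandas as pd

# File and column names are assumptions; adjust to the repo's dataset.
df = pd.read_csv("bugs.csv", parse_dates=["date"])

# One line per engineering team.
for team, grp in df.groupby("team"):
    plt.plot(grp["date"], grp["bugs"], label=team)

# Dashed line marking the week the Web team received training
# (the article places this in the last week of May 2020).
plt.axvline(pd.Timestamp("2020-05-31"), color="red", linestyle="--")
plt.xlabel("Week")
plt.ylabel("Bugs reported")
plt.legend()
plt.show()
```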

Based on the above line plots, we could make a preliminary conclusion that the training provided is a possible cause for the reduction in bugs reported for the Web team. However, to increase confidence in our conclusion we will utilize the Causal Impact library for our statistical analysis.

Before we are able to use Causal Impact, we need to transform our DataFrame into a wide format, so that each software engineering team has a column listing the number of bugs reported per week. We can use the pandas pivot_table function to do this. We also need to ensure that our date variable is set as the index and that the Web variable is moved to the first column of the DataFrame.

The model will use the bug reporting data for all of the software engineering teams to help us determine whether the specified intervention was the true cause of the decrease in bugs reported for the Web team.

Here is the code to reshape the data:
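(A sketch of that transformation; again, the file and column names are assumptions.)

```python
import pandas as pd

# File and column names are assumptions for illustration;
# check the dataset in the GitHub repo for the actual ones.
df = pd.read_csv("bugs.csv", parse_dates=["date"])

# Pivot to wide format: one column of weekly bug counts per team,
# indexed by week.
wide = df.pivot_table(index="date", columns="team", values="bugs")

# Causal Impact treats the first column as the response variable,
# so move the Web team's series to the front.
wide = wide[["Web"] + [c for c in wide.columns if c != "Web"]]

print(wide.head())
```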

Using and interpreting the results of the Causal Impact library is incredibly easy. We start by defining the period of time before the intervention (training was provided during the last week of May 2020) and the period after the intervention.

We then run the model by providing the wide DataFrame and the two periods we just defined. Once processing is complete, we can plot the results using the three available plot types: original, pointwise and cumulative:
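A sketch of this step, assuming the pycausalimpact API (CausalImpact(data, pre_period, post_period)) and illustrative boundary dates:

```python
from causalimpact import CausalImpact

# Training was provided in the last week of May 2020, so the
# pre-intervention period runs from the start of the data to the
# end of May, and the post-intervention period covers June to
# December. The exact dates below are illustrative and must match
# entries in the DataFrame's weekly index.
pre_period = ["2020-01-05", "2020-05-31"]
post_period = ["2020-06-07", "2020-12-27"]

ci = CausalImpact(wide, pre_period, post_period)

# By default, plot() renders all three panels: the original series
# with the counterfactual prediction, the pointwise effects, and
# the cumulative effect.
ci.plot()
```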

The first plot below shows the actual bugs reported for the Web software engineering team (y) versus the prediction for the same team (predicted), taking into consideration both the bugs reported in January to May 2020 for the Web team and the bugs reported throughout the year by the other software engineering teams.

It is evident from the plot that the prediction for the Web team from June to December 2020 sits above the actual reporting. In other words, the team reported fewer bugs than the model expected in the absence of any intervention. This further supports the conclusion that the training provided to the Web team in May 2020 was the cause of the reduction in bugs reported from June onwards.

The next plot shows the difference between the actual series and the predicted series, referred to as the point effects.

This final plot shows the cumulative effect, which is the running sum of the point effects over time.

The Causal Impact library can also provide us with numerical and statistical outputs for further analysis:
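In pycausalimpact, these are exposed through the summary() method:

```python
# Tabular summary: actual vs. predicted averages, absolute and
# relative effects, and their 95% credible intervals.
print(ci.summary())
```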

And finally, we can also produce a written report explaining the results of our analysis with just one line of code:
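```python
# A plain-language narrative interpretation of the results,
# generated by the library itself.
print(ci.summary(output="report"))
```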

This report confirms our earlier preliminary conclusions that the cause of the reduction in bugs reported for the Web software engineering team from June 2020 onwards was the training provided to the team in May 2020. The probability of obtaining this effect by chance is very small and therefore the causal effect can be considered statistically significant.

The intervention, which is the training provided, had an estimated effect of -21.03 bugs reported, with a 95% credible interval that excludes zero. Providing the training was a good decision because it had the desired effect.

Concluding Remarks

Causal inference is a difficult problem to solve, but one that data scientists are increasingly being asked to address. The introduction of libraries such as Causal Impact gives us a good tool for making headway on it.

As you saw in the bug reporting example, the library is super easy to use and the results can be quickly interpreted and understood. We can quickly gain confidence in any conclusions drawn and communicate our results to stakeholders.
