Find the Causes (Part 2 of 2)
- Niyi Ogunbiyi
- Feb 16
In the previous post, we explored methods to identify potential causes for our problem of interest and introduced causal diagrams to depict the relationships between these factors. In this post, we will look into estimating the impact of these causal factors on the problem at hand.
Once we've established the causal relationships using our causal diagram, the next step is to estimate the causal effect of a specific factor. To illustrate this process, let's revisit the BPIC 2017 challenge. The company involved hypothesized that contacting customers multiple times (referred to as "chasing") might deter them from accepting offers. However, two winning entries, from the professional and student categories, found this hypothesis unproven; one even suggested that chasing customers could be beneficial. This raises the question: does chasing customers positively or negatively influence offer acceptance, and to what extent?
To address this, we utilize our causal diagram (see below). In this diagram, the color pink represents a confounder (credit score), and the intervention denotes the causal factor whose effect we aim to estimate. This intervention is also known as the treatment—a term borrowed from epidemiology, where such techniques assess the impact of a treatment on patient outcomes. The outcome variable here is whether an offer is accepted. Additionally, there's an unobserved causal factor, indicating a variable we believe influences the outcome but for which we lack data—such as whether a customer received a loan offer from another company.
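To make the roles in the diagram concrete, here is a minimal sketch that encodes the relationships described above as a plain Python mapping. The variable names (`credit_score`, `chased`, `offer_accepted`, `competing_offer`) are illustrative labels chosen for this example, and the helper functions only look at direct edges, which is enough for a diagram this small but is a simplification for larger ones.

```python
# Causal diagram as a mapping from each cause to the variables it directly affects.
# Variable names are illustrative; the edges follow the description in the text.
causal_graph = {
    "credit_score": ["chased", "offer_accepted"],  # confounder
    "chased": ["offer_accepted"],                  # treatment -> outcome
    "competing_offer": ["offer_accepted"],         # unobserved causal factor
}
observed = {"credit_score", "chased", "offer_accepted"}


def confounders(graph, treatment, outcome):
    """Return variables with a direct edge into both the treatment and the outcome."""
    return sorted(
        cause
        for cause, effects in graph.items()
        if cause != treatment and treatment in effects and outcome in effects
    )


def unobserved_causes(graph, outcome, observed_vars):
    """Return direct causes of the outcome for which we have no data."""
    return sorted(
        cause
        for cause, effects in graph.items()
        if outcome in effects and cause not in observed_vars
    )


print(confounders(causal_graph, "chased", "offer_accepted"))
print(unobserved_causes(causal_graph, "offer_accepted", observed))
```

Writing the diagram down this way makes it explicit which variables need to be adjusted for (the confounders) and which assumptions cannot be checked against data (the unobserved causes).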

Estimating the effect of chasing on loan acceptance involves a three-step process:
1. Propensity Score Generation: Estimate the probability that a customer will be chased, given confounders such as credit score. This is done by fitting a logistic regression model that predicts whether a customer is chased.
2. Inverse Probability Weighting (IPW): Compute a weight for each case as the inverse of the probability of the treatment it actually received: 1/p for customers who were chased and 1/(1 - p) for customers who were not, where p is the propensity score. High IPWs mark cases whose treatment status was unexpected given the propensity score (for example, a customer who was unlikely to be chased but was), while low IPWs mark cases whose treatment status was expected.
3. Causal Effect Estimation: Use the IPWs as weights in a regression model of the outcome on the treatment to calculate the causal effect. In our example, the analysis suggests that chasing has a minimal negative impact on offer acceptance rates. Even at the most favorable end of the confidence interval, the effect remains slightly negative. This insight is valuable; for instance, if the company had implemented a policy to chase all clients, it could have led to unnecessary resource expenditure without improving acceptance rates.
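The three steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the analysis behind the results reported here: the data is synthetic (it is not the BPIC 2017 log), the true effect of -0.05 is a number picked for the simulation, credit score is the only confounder, and the logistic regression is fitted with a small hand-written Newton solver so the example needs nothing beyond the standard library. In practice you would more likely reach for a statistics library. Confidence intervals are left out for brevity.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate(n, true_effect, seed=0):
    """Synthetic cases in which credit score drives both chasing and acceptance."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        score = max(-2.0, min(2.0, rng.gauss(0.0, 1.0)))     # standardized credit score
        chased = 1 if rng.random() < sigmoid(-score) else 0  # low scores are chased more
        p_accept = 0.4 + 0.1 * score + true_effect * chased
        accepted = 1 if rng.random() < p_accept else 0
        rows.append((score, chased, accepted))
    return rows


def fit_propensity(scores, chased, iterations=25):
    """Logistic regression of chased on credit score, fitted by Newton's method."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, t in zip(scores, chased):
            p = sigmoid(b0 + b1 * x)
            g0 += t - p               # gradient of the log-likelihood
            g1 += (t - p) * x
            v = p * (1.0 - p)         # entries of the negative Hessian
            h00 += v
            h01 += v * x
            h11 += v * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def ipw_effect(rows):
    b0, b1 = fit_propensity([r[0] for r in rows], [r[1] for r in rows])
    sums = {0: [0.0, 0.0], 1: [0.0, 0.0]}  # per group: [sum of w * y, sum of w]
    for score, t, y in rows:
        p = sigmoid(b0 + b1 * score)                # step 1: propensity score
        w = 1.0 / p if t == 1 else 1.0 / (1.0 - p)  # step 2: inverse probability weight
        sums[t][0] += w * y
        sums[t][1] += w
    # Step 3: a weighted regression of the outcome on a binary treatment has a slope
    # equal to the difference between the two weighted group means.
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


def naive_effect(rows):
    """Unweighted difference in acceptance rates, ignoring the confounder."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


rows = simulate(20000, true_effect=-0.05)
naive = naive_effect(rows)
ipw = ipw_effect(rows)
print(f"naive difference: {naive:+.3f}")
print(f"IPW estimate:     {ipw:+.3f}")
```

Because the simulated customers with low credit scores are both chased more often and less likely to accept, the naive comparison overstates the harm of chasing. Reweighting each case by its IPW balances credit score across the two groups, so the weighted estimate should land much closer to the effect built into the simulation.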
Figure: Propensity Scores & Inverse Probability Weights
It's important to note that these conclusions rest on the assumption that our causal model is accurate. Validating these assumptions is essential, and we will explore methods to do so in the next post of this series.