Find the Causes (Part 1 of 2)

This article is the third installment in our series on diagnostic process mining. In our previous discussions, we emphasized the importance of accurately identifying the root causes of problems and the significance of properly defining these issues. Finding the causes consists of three key objectives:

Identifying Potential Causes: We will explore methods to uncover possible factors contributing to our problem of interest.
Understanding Causal Diagrams: We will delve into causal diagrams, which illustrate the relationships between various factors.
Estimating Effects: We will examine approaches to assess the impact of these causal factors on the problem at hand.

In this post, we will cover the first two objectives; estimating effects is left for Part 2.

As a brief reminder, this discussion pertains to the second step in a six-step diagnostic process mining workflow.

Defining Causality

To begin, let's clarify the concept of causality. When we assert that a variable X causes Y, we mean that a change in X leads to a change in Y or, more precisely, in the probability distribution of Y. For instance, consider X as the average case complexity score and Y as the average case cycle time. If a higher case complexity score results in a longer case cycle time, we would observe cycle time rising when case complexity increases and falling when it decreases. This relationship can be depicted with an arrow from case complexity to cycle time, indicating the direction of causality.

Conversely, if there is no causal link between two variables, say case complexity and Z (the average height of the team), altering case complexity would not affect Z: the team's average height stays the same however case complexity varies. In the diagram, no arrow is drawn between the two.
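
To make the distinction concrete, here is a minimal simulation sketch in Python. The equations and numbers in it (a 5-day baseline, 2 extra days per complexity point, a team height around 175 cm) are invented purely for illustration and do not come from any real event log. We set case complexity to a value of our choosing and check which averages move.

```python
import random
import statistics

def simulate_cycle_time(complexity, rng):
    """Hypothetical structural equation: cycle time (days) depends on complexity."""
    return 5.0 + 2.0 * complexity + rng.gauss(0, 1)

def simulate_team_height(rng):
    """Team height (cm) takes no input from complexity: no causal link."""
    return rng.gauss(175, 5)

def means_under_intervention(complexity, n=5000, seed=0):
    """Fix complexity at a chosen value and return (mean cycle time, mean height)."""
    # The same seed is reused for every call so that the random noise is
    # identical across runs and only the chosen complexity value differs.
    rng = random.Random(seed)
    cycle_times = [simulate_cycle_time(complexity, rng) for _ in range(n)]
    heights = [simulate_team_height(rng) for _ in range(n)]
    return statistics.mean(cycle_times), statistics.mean(heights)

low_cycle, low_height = means_under_intervention(complexity=1)
high_cycle, high_height = means_under_intervention(complexity=5)

print(f"Change in mean cycle time:  {high_cycle - low_cycle:.2f} days")
print(f"Change in mean team height: {high_height - low_height:.2f} cm")
```

In this toy setup, raising complexity shifts the average cycle time, while the average team height does not move at all. That is the pattern an arrow (or the absence of one) is meant to express.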

Identifying Causes

To identify the various causes of our problem of interest, a structured brainstorming session with Subject Matter Experts (SMEs) is recommended. The goal of this session is to compile a comprehensive list of potential factors contributing to the problem.

For illustration, let's revisit the problem statement we defined earlier: investigating the lower-than-expected loan acceptance rate for a Dutch financial services company using the BPIC 17 event log. Without access to SMEs from that company, we have independently developed a list of potential causes. This list includes factors such as monthly cost and whether the client was offered an amount less than they requested. For example, as monthly cost increases, the probability of acceptance might decrease. Similarly, if a client is offered less than they applied for, they may be less likely to accept the offer.

[Figure: Causal diagram for the declined offers problem]

If you're familiar with fishbone or Ishikawa diagrams, you might find them helpful in this step, as they categorize causes systematically. The essential outcome is a list of potential causal factors.

Constructing Causal Diagrams

Next, we define the causal relationships between these variables using a causal diagram. There are two simple rules to follow when drawing this diagram:

Direction of Causality: If we believe X causes Y, we draw an arrow from X to Y.
Avoiding Cycles: Cycles are not permitted; a variable cannot cause itself, either directly or through a chain of other variables. This keeps the diagram a Directed Acyclic Graph (DAG): every arrow has a direction, and no path leads from a variable back to itself.
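
If you keep your arrows as a list of (cause, effect) pairs, the second rule can be checked mechanically. The sketch below is one possible way to do it, using the graphlib module from the Python standard library (Python 3.9 or later). The variable names are my own labels for factors from the loan example in this post, and the feedback arrow added at the end is hypothetical, included only to show a violation.

```python
from graphlib import CycleError, TopologicalSorter  # standard library, Python 3.9+

def is_dag(edges):
    """Return True if the (cause, effect) pairs contain no directed cycle."""
    # TopologicalSorter expects a mapping of node -> set of its predecessors.
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(cause, set())
        predecessors.setdefault(effect, set()).add(cause)
    try:
        # Consuming the ordering forces the cycle check to run.
        tuple(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True

# Arrows taken from the loan example; the names are illustrative labels.
edges = [
    ("credit_score", "monthly_cost"),
    ("credit_score", "competing_offer"),
    ("credit_score", "chased"),
    ("chased", "offer_accepted"),
    ("monthly_cost", "offer_accepted"),
    ("offered_less_than_requested", "offer_accepted"),
]
print("Valid DAG:", is_dag(edges))

# Hypothetical feedback arrow that closes an indirect cycle:
# credit_score -> chased -> offer_accepted -> credit_score
with_feedback = edges + [("offer_accepted", "credit_score")]
print("Valid DAG with feedback arrow:", is_dag(with_feedback))
```

A check like this only confirms that the diagram is well formed. Whether each arrow is believable is still a question for the SMEs.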

As with identifying causal factors, creating your causal diagram in a structured brainstorming session is advisable. No special tools are required; you can sketch your diagram on a whiteboard or paper, or use software such as Visio. I often use a web application called DAGitty, which offers support and validation features. A tutorial on using DAGitty is available for those interested.
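
If you would rather type a diagram than draw it, DAGitty also accepts a plain-text model description of the form "dag { cause -> effect }". The helper below is a small sketch of my own (it is not part of DAGitty) that turns a list of arrows into that text so it can be pasted into the tool. The variable names are again illustrative labels for the loan example.

```python
def to_dagitty(edges):
    """Render (cause, effect) pairs as a DAGitty-style 'dag { ... }' text block."""
    lines = ["dag {"]
    for cause, effect in edges:
        lines.append(f"  {cause} -> {effect}")
    lines.append("}")
    return "\n".join(lines)

# Illustrative arrows from the loan example.
loan_edges = [
    ("credit_score", "monthly_cost"),
    ("credit_score", "chased"),
    ("chased", "offer_accepted"),
    ("monthly_cost", "offer_accepted"),
]
print(to_dagitty(loan_edges))
```

Keeping the arrows in a simple list like this also makes it easy to store the diagram alongside the rest of the analysis.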

Upon completing your causal diagram, you'll notice different colors or notations indicating various types of relationships. Before concluding this section, let's briefly discuss the three basic (or atomic) causal structures that form the foundation of more complex causal diagrams:

Common Cause (Confounder): This occurs when one causal factor influences two or more other factors. For example, a customer's credit score might affect both the likelihood that the client has a competing offer from another financial services provider and the monthly cost. A higher credit score could make a competing offer more likely and the monthly cost lower, while a lower score could do the opposite. As a result, competing offers and monthly cost may appear statistically correlated even though neither causes the other; the correlation comes from their common cause.
[Figure: Common Cause (Confounder)]
Mediator (Chain): In this structure, one variable influences another through an intermediary factor. For instance, credit score might affect the probability of a client accepting an offer through the variable "chased," which indicates whether the company repeatedly contacted the client to obtain missing documentation. The company might prioritize chasing clients with high credit scores, and being chased might in turn increase the likelihood of acceptance. An analogy from "The Book of Why" by Judea Pearl and Dana Mackenzie describes a fire causing smoke, which in turn triggers an alarm; without smoke, the alarm wouldn't be triggered.
[Figure: Mediator (Chain)]
Collider: This structure involves two otherwise unrelated variables that both cause a third variable. For example, an individual's level of experience might influence the quality of their output, and their workload could also affect the quality of their output. However, experience level and workload may not be correlated with each other.
[Figure: Collider]
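
The three structures can be sketched with simulated data. Everything in the snippet below is made up for illustration: the coefficients, the probabilities, and the sample sizes are assumptions of mine, not estimates from the BPIC 17 log. Each function generates data that follows one structure and returns a few summary numbers.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (ss_x * ss_y) ** 0.5

def confounder_demo(n=20000, seed=1):
    """credit_score -> competing_offer and credit_score -> monthly_cost."""
    rng = random.Random(seed)
    competing, cost = [], []
    for _ in range(n):
        credit = rng.gauss(0, 1)
        competing.append(0.8 * credit + rng.gauss(0, 1))
        cost.append(-0.8 * credit + rng.gauss(0, 1))
    # Neither variable feeds into the other, yet they end up correlated.
    return pearson(competing, cost)

def chain_demo(n=20000, seed=2):
    """fire -> smoke -> alarm: the alarm responds to smoke only."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        fire = rng.random() < 0.5
        smoke = rng.random() < (0.9 if fire else 0.1)
        alarm = rng.random() < (0.95 if smoke else 0.02)
        rows.append((fire, smoke, alarm))

    def alarm_rate(fire_value, smoke_value=None):
        picked = [a for f, s, a in rows
                  if f == fire_value and (smoke_value is None or s == smoke_value)]
        return sum(picked) / len(picked)

    gap_overall = alarm_rate(True) - alarm_rate(False)
    gap_without_smoke = alarm_rate(True, False) - alarm_rate(False, False)
    return gap_overall, gap_without_smoke

def collider_demo(n=20000, seed=3):
    """experience -> quality <- workload, with independent causes."""
    rng = random.Random(seed)
    experience, workload, quality = [], [], []
    for _ in range(n):
        e, w = rng.gauss(0, 1), rng.gauss(0, 1)
        experience.append(e)
        workload.append(w)
        quality.append(e - w + rng.gauss(0, 0.5))
    return (pearson(experience, workload),
            pearson(experience, quality),
            pearson(workload, quality))

print("Confounder, corr(competing offer, monthly cost):", round(confounder_demo(), 2))
print("Chain, alarm-rate gap (overall, without smoke):", chain_demo())
print("Collider, corr(exp, load), corr(exp, quality), corr(load, quality):", collider_demo())
```

In this toy data, the two children of the common cause are correlated without any arrow between them, fire stops mattering for the alarm once we look only at cases without smoke, and the two causes of the collider stay uncorrelated even though each is related to quality.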

Understanding these structures is crucial for accurately interpreting causal diagrams and the relationships they represent.

By following these steps—identifying potential causes, constructing causal diagrams, and understanding causal structures—you can effectively analyze and address complex problems within the framework of diagnostic process mining.

In the next blog post, we will discuss how to estimate the causal effect of each factor.

Change Enablers Ltd