A Process Mining Project

I’m writing this post two weeks after I concluded my first process mining project. Curiously, in the same week I started making the case study public (if you want a copy, send me an e-mail), a lot of noise (part 1 and part 2) appeared around process mining. Good, I thought. A lot of discussion, fear, anger and anxiety means that people are engaged, and it is a long way from the tiring, but still evolving, discussion around Adaptive Case Management.

I discovered process mining around March this year (I had never heard about it before), and I saw and felt that, for the first time since the days of re-engineering, there was something that could bring a revolution to business process management. Better still, it comes with no system lock-in, it is agnostic, and it stays far away from coined terms, acronyms and business method wars.

If process mining had no value, it would have been impossible, in so short a time, to get traction, demonstrate its power and deliver results.


Before you start your process mining project and jump into fact gathering, the first thing you need to do is define the scope.

What do you want to accomplish? Do you want to improve an end-to-end process? Do you want to focus on a particular interaction (collaboration with partners)? Do you want to understand how people collaborate? How people coordinate others? How they prioritize issues?

This is important because it will define the kind of business data you will need to collect and the sources of business data you will look for.

Choose the type of project

There are three basic project types:

  • Question driven. A question driven project delivers answers to predefined questions, but during the mining other questions may arise that are connected to the ones you first asked. Let me say clearly that one of the community's fears, that this kind of approach leads to a “never ending story”, is easily handled by experienced people, as in any other change management process. If you have problems dealing with it, buy the book Buy-In: Saving Your Good Idea from Getting Shot Down by John P. Kotter and help yourself. A question driven project asks something like: Why do we have communication problems among team members when they work together on a problem? Why do we fail to register complaints? Why is the back office a bottleneck for everything?
  • Outcome driven. An outcome driven project looks for the causes of bad or excellent performance across several dimensions, such as time, cost, FTE, etc.
  • Data driven. A data driven project looks for particular information that can help managers deepen their knowledge about execution.

The first type is the simplest project type you can carry out; the last one should be executed only when people are very familiar with the enterprise culture, the execution, the people, the data that is stored and the systems.

Get the data

One of the fears most people have is the data, especially business people who do not understand, or are afraid of looking at, database schemas (that is what the IT department is for, right?).

Getting the data is not a Business Intelligence saga.

Actually, very little data is needed to make the analysis.

This is an example from a purchasing process. The data provided does not belong to any organization.

So, as you can see, by default you need this data set:

  • Case ID – the unique identifier of the process instance; it aggregates all the information around a particular case.
  • Start timestamp – the timestamp when an activity started. Combined with the complete timestamp, it gives the activity time, an important performance dimension.
  • Complete timestamp – the timestamp when the activity stopped being executed. The complete timestamp, together with the start timestamp of the next activity, shows the waiting time from one activity to the next.
  • Activity – the name of the piece of work being performed. Some systems do not store activity names and use state machine transition codes instead, like 1 – Open, 2 – Register, 3 – Amend, etc. Again, your IT staff can help you make this easy to understand.
  • Resource – the person (or persons) that performed the activity. Why use persons? First, most systems do not have up-to-date roles. Second, if you are designing an ad-hoc process there is no predefined process model, so the only reference for how information flows and how things are done is the people data. Third, if one team performs better than another, you can only understand why if you know whom you are dealing with and whether you have the right people in place (or … not). One remark: if the process is executed by many people (more than the number you are used to managing), it can be cumbersome to look at the data, but it is much easier to do it in a structured way.
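To make the data set concrete, here is a minimal sketch in Python of what such an event log could look like and how the two time dimensions fall out of it. The rows, the case numbers, the people and the function names are all made up for illustration; they do not come from the case study.

```python
from collections import defaultdict
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Made-up purchasing events: (case_id, activity, resource, start, complete).
EVENT_LOG = [
    ("PO-1", "Create request", "Ana", "2011-10-03 09:00", "2011-10-03 09:20"),
    ("PO-1", "Approve request", "Rui", "2011-10-03 11:20", "2011-10-03 11:30"),
    ("PO-1", "Send order", "Ana", "2011-10-04 09:00", "2011-10-04 09:15"),
    ("PO-2", "Create request", "Eva", "2011-10-03 10:00", "2011-10-03 10:10"),
    ("PO-2", "Approve request", "Rui", "2011-10-03 10:40", "2011-10-03 10:45"),
]


def minutes(earlier, later):
    """Minutes elapsed between two timestamp strings."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds() / 60


def activity_and_waiting_times(log):
    """Return (case_id, activity, activity_minutes, waiting_minutes) rows."""
    cases = defaultdict(list)
    for event in log:
        cases[event[0]].append(event)
    rows = []
    for case_id, events in cases.items():
        events.sort(key=lambda e: datetime.strptime(e[3], FMT))
        for i, (_, activity, _, start, complete) in enumerate(events):
            # Waiting time: from this completion to the next activity's start.
            nxt = events[i + 1] if i + 1 < len(events) else None
            waiting = minutes(complete, nxt[3]) if nxt else None
            rows.append((case_id, activity, minutes(start, complete), waiting))
    return rows


for row in activity_and_waiting_times(EVENT_LOG):
    print(row)
```

The last activity of each case has no waiting time, so it is reported as None.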

Some remarks about data:

  • Process mining is all about the data. If there is no data, there is no mining.
  • If data quality is bad (incomplete, because people sometimes do not fill in important information in the systems you need to look at, or imprecise, meaning that the data you are looking at is not accurate), there are no miracles and you need to spend time fixing it.
  • The above attributes are the minimum set to understand action <> reaction, interaction and how things are done. Of course you can add more attributes like site, categories, products, but don’t let data greed start playing a role in the project; it will let you down.
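A quick check like the one below can tell you early how much fixing is ahead. It is only a sketch: the field names and the two rules (empty fields, a complete timestamp earlier than the start) are my assumptions about what "incomplete" and "imprecise" could mean in practice, and the sample rows are invented.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"
REQUIRED = ("case_id", "activity", "resource", "start", "complete")


def quality_issues(log):
    """Return (row_index, problem) pairs for rows that need fixing."""
    issues = []
    for index, event in enumerate(log):
        # Incomplete: a required attribute is absent or empty.
        missing = [field for field in REQUIRED if not event.get(field)]
        if missing:
            issues.append((index, "missing " + ", ".join(missing)))
            continue
        # Imprecise: an activity cannot complete before it starts.
        start = datetime.strptime(event["start"], FMT)
        complete = datetime.strptime(event["complete"], FMT)
        if complete < start:
            issues.append((index, "complete before start"))
    return issues


SAMPLE = [
    {"case_id": "PO-1", "activity": "Create request", "resource": "Ana",
     "start": "2011-10-03 09:00", "complete": "2011-10-03 09:20"},
    {"case_id": "PO-1", "activity": "Approve request", "resource": "",
     "start": "2011-10-03 11:20", "complete": "2011-10-03 11:30"},
    {"case_id": "PO-2", "activity": "Create request", "resource": "Eva",
     "start": "2011-10-03 10:30", "complete": "2011-10-03 10:10"},
]

print(quality_issues(SAMPLE))
```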

Discover the process

How much time do you take to discover a process? Six months? How much money and time do you spend to understand what is happening? Are you sure you are discovering the truth (or something in between)?

Now the time finally comes. You discover the process model. You can discover it using different dimensions. The most used is time, but you may be interested in building the social network to understand how people behave. The latter is important for ad-hoc processes, where everything is about how people work together.

There are a lot of things you can do with models, depending again on the type of project you are undertaking:
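If you wonder what "discovering" means mechanically, the simplest form I can sketch is a directly-follows count: for every case, look at which activity comes right after which, and count. Real mining tools do much more than this, and I am not claiming this is what any particular tool does; the log and the names below are invented.

```python
from collections import Counter, defaultdict

# Made-up log, already sorted by start timestamp: (case_id, activity).
EVENTS = [
    ("PO-1", "Create request"), ("PO-1", "Approve request"),
    ("PO-1", "Send order"),
    ("PO-2", "Create request"), ("PO-2", "Approve request"),
    ("PO-2", "Send order"),
    ("PO-3", "Create request"), ("PO-3", "Amend request"),
    ("PO-3", "Approve request"), ("PO-3", "Send order"),
]


def directly_follows(events):
    """Count how often activity b directly follows activity a in a case."""
    traces = defaultdict(list)
    for case_id, activity in events:
        traces[case_id].append(activity)
    edges = Counter()
    for trace in traces.values():
        # Pair each activity with its successor inside the same case.
        edges.update(zip(trace, trace[1:]))
    return edges


for (a, b), count in directly_follows(EVENTS).most_common():
    print(f"{a} -> {b}: {count}")
```

Each counted pair is an arrow in the process map, and the count tells you how thick to draw it.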

  • Conformance – check whether what people do is aligned with company policies;
  • Matching – check whether the discovered model equals another one defined inside the company;
  • Patterns, variants, social network – for ad-hoc processes this can be the most valuable information because it shows how people work and behave;
  • Finding – when you have no idea how the process is executed.
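For the variants and the social network, a rough sketch could look like this. A variant here is simply the sequence of activities a case went through, and the social network is reduced to counting handovers of work from one person to the next; both are simplifying assumptions of mine, and the log is made up.

```python
from collections import Counter, defaultdict

# Made-up log, sorted by time within each case: (case_id, activity, resource).
EVENTS = [
    ("PO-1", "Create request", "Ana"), ("PO-1", "Approve request", "Rui"),
    ("PO-1", "Send order", "Ana"),
    ("PO-2", "Create request", "Eva"), ("PO-2", "Approve request", "Rui"),
    ("PO-2", "Send order", "Ana"),
    ("PO-3", "Create request", "Ana"), ("PO-3", "Amend request", "Ana"),
    ("PO-3", "Approve request", "Rui"), ("PO-3", "Send order", "Ana"),
]


def variants_and_handovers(events):
    """Return (variant counts, handover-of-work counts) for the log."""
    cases = defaultdict(list)
    for case_id, activity, resource in events:
        cases[case_id].append((activity, resource))
    variants, handovers = Counter(), Counter()
    for steps in cases.values():
        # A variant is the ordered sequence of activities of one case.
        variants[tuple(activity for activity, _ in steps)] += 1
        # A handover happens when consecutive steps change hands.
        for (_, giver), (_, receiver) in zip(steps, steps[1:]):
            if giver != receiver:
                handovers[(giver, receiver)] += 1
    return variants, handovers


variants, handovers = variants_and_handovers(EVENTS)
for variant, count in variants.most_common():
    print(count, " > ".join(variant))
for (giver, receiver), count in handovers.most_common():
    print(f"{giver} hands over to {receiver}: {count}")
```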

A lovely non-BPMN process model

The discovered model is the base of all your work in the project because:

The model is like a memory of a concept that remains inside your brain. It is about being conscious of everything that belongs to the process you are connected to. Every action you perform to get answers, you link to the model; thus the model is everything.

Dig, Dig, Dig

Using filters you can get the answers you need to improve.

  • How do people manage their workforce? What do managers do when there are back orders or backlogs?
  • What are the reasoning actions when particular events kick in?
  • How do people balance their work across a time period?
  • What data is being accessed? Why do they need it? Is this the correct data to make the best or quickest decision?
  • How can we influence cases that are still running so they are executed faster/better?
  • Where are the bottlenecks? What are the slowest tasks? Where is the rework?
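As an illustration of the last question, here is a small sketch of a filter and a bottleneck ranking. It assumes that a "slow" case is one whose total time from first start to last complete exceeds a threshold, and that a bottleneck shows up as the longest average waiting time between two activities. The log, the 8-hour threshold and the function names are invented.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

FMT = "%Y-%m-%d %H:%M"

# Made-up log, sorted by start in each case: (case_id, activity, start, complete).
EVENTS = [
    ("PO-1", "Create request", "2011-10-03 09:00", "2011-10-03 09:20"),
    ("PO-1", "Approve request", "2011-10-03 11:20", "2011-10-03 11:30"),
    ("PO-1", "Send order", "2011-10-04 09:30", "2011-10-04 09:45"),
    ("PO-2", "Create request", "2011-10-03 10:00", "2011-10-03 10:10"),
    ("PO-2", "Approve request", "2011-10-03 10:40", "2011-10-03 10:45"),
    ("PO-2", "Send order", "2011-10-03 12:45", "2011-10-03 13:00"),
]


def hours(earlier, later):
    """Hours elapsed between two timestamp strings."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return delta.total_seconds() / 3600


def group_by_case(events):
    cases = defaultdict(list)
    for event in events:
        cases[event[0]].append(event)
    return cases


def slow_cases(events, max_hours):
    """Filter: keep only the cases that took longer than max_hours overall."""
    return {
        case_id: steps
        for case_id, steps in group_by_case(events).items()
        if hours(steps[0][2], steps[-1][3]) > max_hours
    }


def mean_waiting_hours(events):
    """Average waiting time for each pair of consecutive activities."""
    waits = defaultdict(list)
    for steps in group_by_case(events).values():
        for prev, nxt in zip(steps, steps[1:]):
            waits[(prev[1], nxt[1])].append(hours(prev[3], nxt[2]))
    return {pair: mean(values) for pair, values in waits.items()}


print("Slow cases:", list(slow_cases(EVENTS, max_hours=8)))
waiting = mean_waiting_hours(EVENTS)
print("Longest wait:", max(waiting, key=waiting.get))
```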

Filtering the data brings more detailed maps and brings awareness; you will react to the findings and build your own maps inside your head. Process mining is about constructing, structuring and rediscovering the knowledge about a process. Soon you will drive inside your own maps and seek answers. This is very far from the classic approaches, which look to me like process philosophy (people around a table talking, reflecting and building consensus until the next project meeting).

Replaying plays a key role because it reconstructs case execution as it actually happened. Replaying is like matching your own ideas against the static discovered model and feeling the process behavior. It increases and nurtures your consciousness.
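Mining tools usually show replay as an animation over the map. A much simpler, text-only sketch of the same idea is to walk each case step by step over the model and note where it leaves the expected path. The model below (a start activity, an end activity and a set of allowed steps) and the two cases are hypothetical.

```python
# Hypothetical reference model: allowed start, end and directly-follows steps.
START = "Create request"
END = "Send order"
ALLOWED = {
    ("Create request", "Approve request"),
    ("Create request", "Amend request"),
    ("Amend request", "Approve request"),
    ("Approve request", "Send order"),
}


def replay(trace):
    """Walk one case step by step; return the deviations found, if any."""
    if not trace:
        return ["empty trace"]
    deviations = []
    if trace[0] != START:
        deviations.append(f"starts with '{trace[0]}'")
    for a, b in zip(trace, trace[1:]):
        if (a, b) not in ALLOWED:
            deviations.append(f"'{a}' -> '{b}' is not in the model")
    if trace[-1] != END:
        deviations.append(f"ends with '{trace[-1]}'")
    return deviations


# Made-up cases: PO-1 follows the model, PO-4 skips the approval.
TRACES = {
    "PO-1": ["Create request", "Approve request", "Send order"],
    "PO-4": ["Create request", "Send order"],
}

for case_id, trace in TRACES.items():
    print(case_id, replay(trace) or "fits the model")
```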


Process mining is a proven technology, but it cannot be used alone. It must be combined with human reasoning. It can be applied to both structured and ad-hoc processes. It kills the long, never-ending effort of gathering information, so you jump immediately into improvement. You look at facts, not at people's assumptions; you see and feel the reality, and it’s fast and agile.

Interested in a short keynote? Check this link.
