How to Analyse Downtime: A Step-by-Step Framework for Maintenance Teams
If you run a production line, you feel downtime in your bones.
Lost output. Overtime to catch up. Operators standing around. Angry emails about missed orders.
Most plants collect downtime data somewhere – in a CMMS, SCADA, MES, or even in spreadsheets – but very few teams have a simple, repeatable downtime analysis process they run every week.
This article lays out a step-by-step framework for downtime analysis that any maintenance team can use. We will start from the data in your CMMS and finish with a focused action plan for reducing unplanned stoppages.
What Is Downtime Analysis (And Why It Matters)
Downtime analysis is the process of taking all the small and large stops on your equipment, organising them, and identifying:
- Which assets cause the most lost time
- Which failure modes or causes occur most often
- Where you can act to get the biggest win in the shortest time
Done well, downtime analysis gives you:
- A clear list of "bad actor" assets
- A ranked set of root causes to attack
- Evidence to justify maintenance spend and improvement projects
Done badly, it becomes just another report that nobody reads.
The goal of this framework is to make downtime analysis practical, fast, and repeatable, not perfect.
Step 1 – Define the Scope of Your Downtime Analysis
Before you touch the data, define the scope. Otherwise, you get lost in noise.
Decide on:
- Time window
  - Last 30 days
  - Last quarter
  - Last 12 weeks (to catch patterns)
- Area / line / asset group
  - One critical production line
  - All packaging lines
  - Utilities (boilers, compressors, chillers)
- Downtime types
  - Unplanned downtime only
  - Unplanned + changeovers
  - Include / exclude micro-stops (e.g. stops < 5 minutes)
A good starting point:
Analyse unplanned downtime on one critical line for the last 90 days.
That gives you enough data for patterns without becoming overwhelming.
Step 2 – Get the Right Data Out of Your CMMS
Next, you need a data set you can actually work with.
From your CMMS or downtime logging system, export at least the following fields:
- Asset / equipment ID
- Asset description
- Start time of downtime event
- End time of downtime event
- Duration (minutes)
- Downtime type (planned changeover, unplanned breakdown, minor stop, etc.)
- Cause code / failure code (if you have it)
- Free text description (operator or technician comments)
- Work order number (if applicable)
CSV or Excel is fine – the key is consistency.
If your system does not calculate duration, you can derive it as:
Duration (minutes) = End time – Start time
Make sure your export covers the time window and assets you chose in Step 1.
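If you work in a scripting tool rather than Excel, a few lines of pandas can handle the export and the duration calculation. This is a minimal sketch; the file name and column names ("Start time", "End time") are assumptions you will need to match to your own export.

```python
import pandas as pd

# Load the CMMS export. File name and column names are assumptions --
# rename them to match your own system's export.
df = pd.read_csv("downtime_export.csv", parse_dates=["Start time", "End time"])

# Derive duration in minutes where the system doesn't provide it
df["Duration (minutes)"] = (
    df["End time"] - df["Start time"]
).dt.total_seconds() / 60
```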
If you need more guidance on working with CMMS exports and data quality, see our Ultimate Guide to CMMS Data Analysis.
Step 3 – Clean and Standardise Your Downtime Data
Raw downtime data is messy. If you try to analyse it without cleaning, you will get junk.
Focus on a few simple cleaning rules (a pandas sketch of these rules follows below):
- Remove obvious errors
  - Negative durations
  - Durations of 0 minutes
  - Events longer than your shift length (unless they are genuine extended outages)
- Standardise timestamps
  - Ensure all timestamps are in the same time zone and format
  - Fix any obvious date errors (e.g. future dates, wrong year)
- Check asset names
  - Standardise common spelling issues
  - Merge duplicated assets (e.g. "Filler 1" vs "Filler-1")
- Filter to the scope
  - Keep only the line / asset group and timeframe you chose
You do not need perfect data. You just need it clean enough that your numbers are credible to your team.
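If you loaded the export with pandas in Step 2, the rules above translate into a short script. A rough sketch, continuing from that DataFrame; the shift length, line name, and 90-day window are assumptions to adjust to your own scope.

```python
import pandas as pd

# Continues from the DataFrame `df` loaded in the Step 2 sketch.
SHIFT_MINUTES = 12 * 60  # assumption: a 12-hour shift

# Remove obvious errors: negative or zero durations, and events longer
# than a shift (review those separately -- some may be genuine outages)
df = df[df["Duration (minutes)"] > 0]
df = df[df["Duration (minutes)"] <= SHIFT_MINUTES]

# Standardise timestamps: drop obviously wrong (future) dates
df = df[df["Start time"] <= pd.Timestamp.now()]

# Standardise asset names ("Filler 1" vs "Filler-1" -> "FILLER 1")
df["Asset"] = (
    df["Asset"].str.upper().str.replace("-", " ", regex=False).str.strip()
)

# Filter to the agreed scope: one line, last 90 days (both assumptions)
df = df[df["Line"] == "LINE 1"]
df = df[df["Start time"] >= pd.Timestamp.now() - pd.Timedelta(days=90)]
```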
Step 4 – Build a Simple Downtime Classification Structure
To make downtime analysis useful, you need a way of grouping events into meaningful buckets.
If you already have good cause codes in your CMMS, use them. If not, build a simple classification from what you have:
- High-level categories (examples)
  - Mechanical failure
  - Electrical / control failure
  - Changeovers
  - Cleaning / sanitation
  - Waiting on materials
  - Waiting on operators
  - Quality rework
  - External (power outage, upstream process, etc.)
- Sub-categories where useful
  - Mechanical failure → conveyor jams, bearing failures, gearbox failures
  - Electrical / control → sensor faults, PLC issues, drive trips
You can build this classification by:
- Reviewing the free-text descriptions
- Grouping similar issues
- Assigning each event a category and sub-category
Start simple and improve it over time. The aim is to be able to say:
"Most of our unplanned downtime comes from this category on these few assets."
Step 5 – Calculate the Core Downtime Metrics
With your cleaned and classified data, you can now calculate a few key metrics.
For each asset and each downtime category, calculate:
- Total downtime (minutes) – Sum of all durations.
- Number of events – How many times the downtime occurred.
- Average duration per event – Total downtime ÷ number of events.
- Mean Time Between Failures (MTBF) – Time in operation ÷ number of failures. Even a rough estimate (e.g. hours run per week) is useful.
- Mean Time To Repair (MTTR) – Total repair time ÷ number of failures. If you don't have detailed labour time, the average downtime duration is a reasonable approximation.
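In pandas, all of these roll up in a single aggregation. A sketch, continuing from the cleaned data; the operating-hours figure is a rough assumption you should replace with your own estimate of time in operation.

```python
# Core metrics per asset, using the cleaned DataFrame from Step 3.
# OPERATING_HOURS is an assumption -- e.g. 90 days x 16 running hours/day.
OPERATING_HOURS = 90 * 16

metrics = df.groupby("Asset")["Duration (minutes)"].agg(
    total_downtime="sum",
    events="count",
    avg_duration="mean",  # doubles as a rough MTTR in minutes
)
metrics["mtbf_hours"] = OPERATING_HOURS / metrics["events"]
metrics = metrics.sort_values("total_downtime", ascending=False)
print(metrics.head(10))
```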
You do not need to turn this into a complicated reliability study. The practical questions are:
- Which assets are causing the most lost minutes?
- Which failure modes happen most often?
- Where is the combination of frequency and duration highest?
Step 6 – Use Pareto Analysis to Find the "Vital Few" Losses
Now we apply the classic 80/20 principle.
Create Pareto charts (bar charts ordered from highest to lowest) for:
- Downtime by asset
  - X-axis: assets
  - Y-axis: total downtime minutes
- Downtime by category / cause code
  - X-axis: categories or failure modes
  - Y-axis: total downtime minutes
You will almost always see the same pattern:
- A small number of assets or causes account for a large share of downtime.
Those are your "vital few" bad actors.
For example, you might see:
- Filler 1 – 26% of unplanned downtime
- Labeller 2 – 18%
- Case packer 1 – 15%
This immediately focuses your improvement effort.
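If you are scripting the analysis, a Pareto chart is just a sorted bar chart with a cumulative-percentage line on a second axis. A minimal matplotlib sketch, reusing the `metrics` table from the Step 5 sketch:

```python
import matplotlib.pyplot as plt

# Pareto of downtime by asset, using `metrics` from the Step 5 sketch
top = metrics["total_downtime"].head(10)
cum_pct = top.cumsum() / metrics["total_downtime"].sum() * 100

fig, ax = plt.subplots()
ax.bar(top.index, top.values)
ax.set_xlabel("Asset")
ax.set_ylabel("Downtime (minutes)")
ax.tick_params(axis="x", rotation=45)

ax2 = ax.twinx()  # cumulative % line on a second axis
ax2.plot(top.index, cum_pct.values, color="tab:red", marker="o")
ax2.set_ylabel("Cumulative % of downtime")

ax.set_title("Unplanned downtime by asset (Pareto)")
fig.tight_layout()
plt.show()
```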
Modern AI tools can automatically detect these patterns in your data. If you're interested in how AI is transforming downtime pattern detection, see our guide on AI & Machine Learning in Maintenance.
Step 7 – Run Root Cause Analysis with the Team
Numbers tell you where the pain is. People tell you why it is happening.
Take your top 3–5 downtime causes or assets and run a quick root cause analysis session with:
- Maintenance technicians
- Operators from the line
- The planner / supervisor
Use simple tools:
- 5 Whys – keep asking "why?" until you reach a systemic cause.
- Cause-and-effect (fishbone) diagram – look at potential causes across methods, machines, materials, manpower, environment, and measurement.
Ground the discussion in real data:
- Show recent downtime events on the screen
- Read out free-text comments from work orders or logs
- Ask: "What is really going on here? What do we see when this happens?"
Aim to finish with:
- 2–3 clear root causes per major downtime item
- A shortlist of countermeasures that are realistic for your plant
Step 8 – Turn Insights into an Action Plan
Insight is useless without action.
For each major downtime cause or bad actor asset, define:
- Specific action – E.g. redesign a problematic chute, introduce a new inspection step, improve operator training, adjust PM frequency.
- Owner – One person accountable, not a committee.
- Due date – Realistic but tight enough to maintain momentum.
- Expected impact – Rough estimate of downtime reduction (e.g. "aim to cut these failures by 50%").
- Follow-up check – When will you review whether the action worked?
Capture these in a simple table or tracker. This becomes your downtime reduction backlog.
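As an illustration, one entry in the tracker might look like this (the asset, owner, and numbers are hypothetical):
- Specific action: Redesign the infeed chute on Filler 1 to stop carton jams
- Owner: Line 1 maintenance technician
- Due date: End of the month
- Expected impact: Cut chute-related stops by 50% (roughly 200 minutes/month)
- Follow-up check: Review jam counts at the weekly downtime meeting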
Step 9 – Make Downtime Review a Regular Ritual
One-off analyses are better than nothing, but the real gains come from a cadence.
Set up a regular rhythm such as:
- Weekly downtime review – 30–45 minutes
  - Review previous week's top downtime causes
  - Check progress on actions
  - Decide next 1–2 focus items
- Monthly deep dive – 60–90 minutes
  - Look at trends over the last 12 weeks
  - Validate that improvements are sticking
  - Decide if PM strategies or spare parts strategies need adjusting
Keep the visuals simple:
- A Pareto chart of downtime by asset
- A Pareto chart of downtime by cause
- A trend line of total unplanned downtime per week
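The weekly trend line is the easiest of the three to script. A sketch, continuing from the cleaned DataFrame; it assumes the "Start time" column from the Step 2 export.

```python
import matplotlib.pyplot as plt

# Weekly total of unplanned downtime, from the cleaned DataFrame `df`
weekly = (
    df.set_index("Start time")["Duration (minutes)"]
    .resample("W")
    .sum()
)
weekly.plot(marker="o", title="Unplanned downtime per week")
plt.ylabel("Downtime (minutes)")
plt.show()
```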
The measure of success is not the report itself; it is:
- Fewer breakdowns
- More stable output
- More predictable maintenance workload
Common Pitfalls in Downtime Analysis (and How to Avoid Them)
1. Drowning in Data, Starving for Insight
Pitfall: Exporting everything and building huge spreadsheets that nobody wants to touch.
Fix: Limit the scope. One line. One quarter. Unplanned downtime only. Build from there.
2. Poor Downtime Coding
Pitfall: Operators and techs select "Other" for half of all causes, or pick the first item in the list.
Fix:
- Simplify and train.
- Use a short, clear list of cause codes.
- Involve operators in designing the codes so they make sense.
3. Chasing the "Interesting" Instead of the "Impactful"
Pitfall: Spending time on rare but dramatic failures instead of the everyday small losses that quietly erode output.
Fix: Let the Pareto chart decide. Go where the minutes are, not where the noise is.
4. Treating Downtime Analysis as a One-Off Project
Pitfall: Doing a big analysis once, implementing a couple of fixes, and then going back to business as usual.
Fix: Build downtime review into your weekly routine. Short, consistent reviews beat big, infrequent projects.
Practical Downtime Analysis Checklist
You can use this checklist as a quick guide every time you run downtime analysis.
- Define scope
  - Time period chosen
  - Line / area selected
  - Downtime types included/excluded agreed
- Export data
  - CMMS / logging system export complete
  - Required fields present (asset, timestamps, duration, cause, description)
- Clean data
  - Obvious errors removed
  - Timestamps standardised
  - Asset names cleaned
  - Data filtered to agreed scope
- Classify events
  - High-level categories defined
  - Sub-categories added where needed
  - Major events assigned to categories
- Analyse
  - Total downtime by asset calculated
  - Total downtime by cause calculated
  - Pareto charts created
- Decide actions
  - Top 3–5 causes/assets selected
  - Root cause sessions held
  - Actions assigned with owners and due dates
- Review
  - Weekly review date booked
  - Progress vs previous week checked
  - Wins documented and shared with the team
How LeanReport Can Help
Everything in this framework is achievable with Excel – but it is slow and fragile.
Most maintenance teams do not have spare hours every week to:
- Clean messy CSV exports from multiple CMMS systems
- Build and maintain Pareto charts and trend graphs by hand
- Re-cut the data by asset, line, cause, and time period whenever someone asks a new question
LeanReport was built to make this process almost automatic:
- Upload your CMMS downtime export as a CSV
- LeanReport cleans, normalises, and structures the data
- You get ready-made downtime Pareto charts, bad actor lists, and trend graphs within minutes
- You can quickly filter by line, asset, cause, and timeframe without creating a new spreadsheet every time
Instead of spending your week preparing reports, you can spend it reducing downtime.
If you want to see what this looks like with your own data, upload a sample CSV or visit our How It Works page to learn more. Ready to start? Check out our pricing and begin your free trial today.
Frequently Asked Questions
What is the main goal of downtime analysis?
The main goal of downtime analysis is to identify the small number of assets and failure modes that cause most of your lost production time, so you can focus your maintenance and improvement effort where it will have the biggest impact.
How often should we analyse downtime?
A good rhythm is weekly for a short review of the last week's performance and monthly for a deeper look at trends. The key is consistency – small, regular reviews beat big, infrequent projects.
Do we need perfect data before we start?
No. You need data that is clean enough to be trusted, not perfect. Start with what you have, fix the biggest data quality issues, and improve your downtime coding as you go.
What tools do we need for downtime analysis?
At minimum, you need a CMMS or logging system that records downtime events, a way to export to CSV or Excel, and a tool (Excel, BI tool, or LeanReport) to aggregate and chart the data. Dedicated tools like LeanReport can save significant time by automating the cleaning and analysis steps.
How do we get operators to enter good downtime data?
Keep the process simple and explain why it matters. Use a short, clear list of cause codes, involve operators in designing them, and feed back the results in weekly meetings so people can see the impact of the data they enter.
What is Pareto analysis and why is it important for downtime?
Pareto analysis is a technique based on the 80/20 principle – it reveals that a small number of causes (typically 20%) account for the majority of the impact (80%). In downtime analysis, Pareto charts help you identify the "vital few" assets or failure modes that deserve immediate attention, rather than spreading effort thinly across all problems.
About the Author

Rhys Heaven-Smith
Founder & CEO at LeanReport.io
Rhys is the founder of LeanReport.io, with a background spanning marine engineering (10 years with the Royal New Zealand Navy), mechanical engineering in the process and manufacturing industries in Auckland, New Zealand, and now software engineering as a full-stack developer. He specialises in helping maintenance teams use AI and machine learning to turn their CMMS data into actionable insights.