In highly automated factories, production rarely comes to a complete halt. There are no sparks, alarms, or catastrophic breakdowns. Instead, machines pause briefly. A few seconds here. A minute there. These interruptions are so short that they often escape formal reporting, yet their cumulative effect can be devastating. Across large-scale manufacturing systems, these small pauses quietly erode productivity, reduce machine availability, and inflate operational costs.
These interruptions are known as micro-downtime events, or microfailures. They occur when machines momentarily stop to correct process deviations, adjust parameters, or address nonconformities before a product moves further down the line. Individually, they appear insignificant. Over weeks and months, however, they can translate into substantial losses in output and efficiency.
A recent study led by Mustapha Belmouadden presents a data-driven approach to addressing this invisible problem. Published in IEEE Access under the title “Real-Time Decision Support System for Dynamic Optimization in Multi-Product Process Manufacturing”, the research introduces a real-time decision support system that uses explainable artificial intelligence to predict and mitigate micro downtime in complex multi-product manufacturing environments. Conducted at Polytechnique Montréal in Canada, the study represents one of the most comprehensive attempts to integrate interpretable machine learning into the core of industrial decision-making.
The silent productivity drain inside factories
Modern manufacturing systems are designed for speed, precision, and reliability. Yet they operate under conditions of extreme variability. Machines process hundreds of product types, raw materials behave differently depending on environmental conditions, and operators must respond quickly to deviations without full visibility into the underlying causes.
Micro downtime often occurs when a product begins to deviate from its technical specifications. Operators intervene by adjusting process parameters, such as pressure or component length, frequently relying on experience and trial and error. While this expertise is invaluable, it is also time-consuming and inherently reactive. Each adjustment introduces a pause in production, even when the product is ultimately brought back into compliance.
Belmouadden and colleagues highlight that in many industrial contexts, up to a fifth of production cycles may involve non-compliance events that require corrective action. Importantly, most of these products are eventually corrected and pass final inspection. The cost is therefore not scrap or rework, but lost time and reduced machine availability. This distinction is critical, as it explains why micro downtime remains underaddressed despite its economic impact.
Why prediction matters more than correction
Traditional manufacturing quality systems primarily focus on detecting defects after they have occurred. Sensors, inspections, and quality gates are designed to identify non-conformities before products reach customers. While effective for quality assurance, these systems do little to prevent the disruptions that occur earlier in the process.
The research published in IEEE Access shifts the focus from detection to prediction. Instead of asking whether a product is defective after assembly, the proposed system predicts whether a product is likely to become non-compliant while it is still being assembled. This predictive approach opens the door to proactive adjustments that can prevent micro downtime before it occurs.
Crucially, the authors do not treat manufacturing as a static system. They recognise that production behaviour depends on sequences of events. What happened to the previous product, or the one before that, can influence the next. This temporal dimension is often overlooked in conventional models but proves essential for understanding how microfailures propagate through production lines.
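One common way to expose this kind of sequence dependence to a model is to attach the outcomes of the preceding products as "lag" features. The sketch below is illustrative only and is not taken from the study: the field names machine_id and nonconforming, and the helper add_lag_features, are hypothetical, and the records are assumed to arrive in production order.

```python
from collections import defaultdict, deque


def add_lag_features(records, n_lags=2):
    """Attach the outcomes of the previous n_lags products on the same machine.

    records: list of dicts, assumed sorted by production time.
    Field names ("machine_id", "nonconforming") are hypothetical.
    """
    # One bounded history per machine; old outcomes fall off automatically.
    history = defaultdict(lambda: deque(maxlen=n_lags))
    enriched = []
    for rec in records:
        past = history[rec["machine_id"]]
        recent = list(past)[::-1]  # most recent outcome first
        row = dict(rec)
        for k in range(1, n_lags + 1):
            # None marks "no earlier product seen yet on this machine".
            row[f"prev_{k}_nonconforming"] = (
                recent[k - 1] if k <= len(recent) else None
            )
        enriched.append(row)
        # Record the outcome only after building the row, so a product
        # never sees its own label as a feature.
        past.append(rec["nonconforming"])
    return enriched


if __name__ == "__main__":
    demo = [
        {"machine_id": "M1", "nonconforming": 0},
        {"machine_id": "M1", "nonconforming": 1},
        {"machine_id": "M2", "nonconforming": 0},
        {"machine_id": "M1", "nonconforming": 0},
    ]
    for row in add_lag_features(demo):
        print(row)
```

Keeping a separate history per machine matters here: the product that ran just before on a different machine says little about the state of this one.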
Real-world data at an industrial scale
One of the defining strengths of this study is its use of real-world industrial data. Rather than relying on laboratory experiments or simulations, the research analyses a full year of production data from an automated tire manufacturing facility. The dataset comprises approximately 3.5 million observations, encompassing over 500 distinct product types.
Each observation integrates heterogeneous data sources. These include product specifications, machine calibration parameters, production cycle times, environmental conditions such as temperature and humidity, and detailed records of delays and stoppages. The target variable is product conformity, defined by whether critical joint lengths remain within specified tolerances during assembly.
The scale and diversity of the data present significant challenges. Manufacturing datasets are typically high-dimensional, highly imbalanced, and composed of both numerical and categorical variables. Non-conforming products form a minority class, making accurate prediction difficult without careful modelling choices.
Designing machine learning for multi-product complexity
To address these challenges, the study adopts a structured modelling framework grounded in industrial expertise. Rather than building a single monolithic model, the researchers segment machines into homogeneous groups based on shared operational characteristics. This segmentation allows models to capture machine-specific behaviours while retaining generalisability across products.
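The article does not spell out how the machine groups are formed beyond "shared operational characteristics", so the following is only a minimal sketch of the general idea: machines with the same signature of attributes fall into one group, and one model would then be trained per group. The attribute names (machine_type, line) and the function segment_machines are hypothetical.

```python
from collections import defaultdict


def segment_machines(machines, keys):
    """Group machine IDs by a tuple of shared attributes.

    machines: list of dicts with a "machine_id" plus descriptive attributes.
    keys: attribute names that define a homogeneous group (assumed to be
          chosen with the help of process experts).
    """
    groups = defaultdict(list)
    for machine in machines:
        signature = tuple(machine[k] for k in keys)
        groups[signature].append(machine["machine_id"])
    return dict(groups)


if __name__ == "__main__":
    fleet = [
        {"machine_id": "M1", "machine_type": "A", "line": 1},
        {"machine_id": "M2", "machine_type": "A", "line": 1},
        {"machine_id": "M3", "machine_type": "B", "line": 2},
    ]
    # Each resulting group would get its own predictive model.
    for signature, ids in segment_machines(fleet, ["machine_type", "line"]).items():
        print(signature, ids)
```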
The study evaluates several machine learning algorithms, including Light Gradient Boosting Machine, CatBoost, XGBoost, and Long Short-Term Memory networks. Each model is trained using weighted samples to account for class imbalance and optimised using Bayesian hyperparameter tuning. Performance is assessed using recall, precision, and F1 score rather than accuracy alone, reflecting the importance of correctly identifying rare but critical non-conformities.
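To make the weighting and the choice of metrics concrete, here is a small sketch using only the Python standard library. The exact weighting scheme used in the study is not stated, so the inverse-frequency rule below (each class weighted by n / (number of classes × class count)) is an assumption chosen for illustration; the function names and the toy labels are hypothetical.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Inverse-frequency weights: rare classes get proportionally more weight."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return [n / (k * counts[y]) for y in labels]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the positive (non-conforming) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


if __name__ == "__main__":
    # Toy data: 8 conforming products (0) and 2 non-conforming ones (1).
    y_true = [0] * 8 + [1] * 2
    print(balanced_sample_weights(y_true))

    # A model that always predicts "conforming" looks accurate
    # but never finds a single non-conforming product.
    always_ok = [0] * 10
    accuracy = sum(t == p for t, p in zip(y_true, always_ok)) / len(y_true)
    print(accuracy, precision_recall_f1(y_true, always_ok))
```

The toy case shows why accuracy alone can mislead on imbalanced data: a model that never flags anything is right most of the time, while its recall on the minority class is zero.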
Human expertise meets Industry 4.0
The study positions artificial intelligence as a complement to human expertise rather than a replacement. Manufacturing operators possess deep tacit knowledge that cannot be fully encoded in algorithms. At the same time, they cannot process millions of data points in real time or detect subtle statistical patterns across production sequences.
By combining predictive modelling with explainability, the proposed system supports informed decision-making. Operators receive early warnings, contextual insights, and directional guidance while retaining control over final adjustments. This human-in-the-loop design addresses common concerns about automation and fosters trust in AI-driven systems.
From an Industry 4.0 perspective, the research illustrates how digitalisation, machine learning, and explainability can converge to enhance resilience and productivity. It also highlights the importance of designing AI systems that integrate seamlessly within existing workflows, rather than imposing radical changes.
Reference
Belmouadden, M., Dadouchi, C., and Pellerin, R. (2025). Real-time decision support system for dynamic optimization in multi-product process manufacturing. IEEE Access, 13, 53895–53908. https://doi.org/10.1109/ACCESS.2025.3553034
