Predictive Maintenance Modeler using XGBoost's Accelerated Failure Time Implementation. This is a survival analysis model (see explanation below), applied to the automotive or industrial context. This is a personal project inspired by a work project; please feel free to fork the repository or contact me if you have suggestions.
In many real-world settings we are interested in estimating the duration of time until an event occurs, whether that event be a mechanical failure in an automotive or industrial context, death in a biological sense, or a customer purchase of a good or service (churn). This type of modeling is known as survival analysis in statistics, reliability analysis in engineering, or duration analysis in economics.
Survival analysis seeks to model the time to a single event for an entity (e.g. a human or machine). However, in many real-life situations, entities can potentially experience multiple types of events: a person can die from cancer, a heart attack, or a car accident. When only one of these types of events can occur, they are referred to as competing events, in the sense that each event competes with the others to yield the observed event of interest, and the occurrence of one type of event precludes the occurrence of the others. Consequently, the probabilities of each event are referred to as competing risks.
The code in this repository was developed to model a single outcome for classical time-to-failure analysis. Competing events are currently not supported.
Fundamental to both classical survival and competing events models is the idea of censoring, i.e. the label is not fully observed. As explained in the XGBoost documentation, there are four kinds of censoring:
- Uncensored: The observation's true lifetime and event are observed.
- Left-censored: The observation's true lifetime is only partly observed, though the event is observed. For instance, you're provided with telemetry data for a machine that already has some wear, but no data from the beginning of its life. Over time, you see degradation and failure of the machine.
- Right-censored: The observation's true lifetime is not fully observed and no event is observed (though it is assumed to eventually happen). For example, you are provided telemetry data for machines' first 12 months of life. During that period some are observed to fail, while others do not. The observations with no observed failure are right-censored, as their true lifetimes lie to the right of the observed timeline, either because the study ended before an event or because the subject exited the study early (attrition).
- Interval-censored: A given observation exhibits both left- and right-censoring. This often happens in discrete-time panel data, where an observation is observed repeatedly at time1, time2, time3, etc. The Scania dataset is an example of this type of censoring.
Critically, it is assumed that censoring is uncorrelated with the outcome(s) of interest. That is, censored observations are assumed to have the same probability of experiencing a subsequent event as those observations whose event was observed (the so-called "non-informative censoring" assumption).
In automotive and manufacturing applications, we are frequently interested in anticipating failures before they occur. There are several approaches to this, sorted by increasing level of modeling sophistication:
- Reactive maintenance - perform maintenance only when a component breaks
- Preventative maintenance - perform maintenance on a fixed (time- or usage-based) schedule, regardless of the actual physical wear
- Condition-based maintenance - perform maintenance based on current wear indicators, though the actual remaining useful life (RUL) is unclear
- Predictive maintenance - perform maintenance using a forecast to estimate the RUL
Of these strategies, predictive maintenance is the most efficient, as it promises to replace parts or components just in time, optimizing their lifecycle.
The Python environment used for this project was created using the Conda package manager, which, if you don't already have it, can be downloaded here. This code has only been developed for local execution. It hasn't been configured for any cloud or Docker implementation, though this would be a logical extension.
This code and environment were developed on a Mac. If you also use macOS, you can create the same Python 3.11.8 environment simply by running:

```shell
conda env create -f environment.yaml
conda activate pdm_v2  # activates the environment
```

Windows users should instead create a fresh virtual environment:

```shell
conda create --name pdm_v2 python=3.12.2  # creates a new virtual environment
conda activate pdm_v2
```

and then install the following packages into the pdm_v2 virtual environment:
- conda install:
  - xgboost
  - hydra-core
  - pandas
  - scikit-learn
  - matplotlib
  - lifelines
  - optuna
  - scikit-survival
- pip install:
  - you can ignore these dependencies
This hasn't been tested on Windows, so you may need to experiment a bit.
Prior to running the code you must configure the training and data build file, which is done in ./configs/config.yaml. Currently, this is populated with the default values for the Scania dataset (explained below).
This config file contains the following arguments:
- `mode`: whether the modeler should be used for training or prediction (inference). Accepted values: [train, predict]
- `data`:
  - `data_path`: path to CSV training data. Code expects the CSV data type and will throw an error otherwise
  - `unit_identifier`: ID feature name
  - `time_identifier`: time feature name
  - `target_feature`: outcome feature name. Note - this feature is expected to be binary or boolean
  - `lag_length`: only applies to panel data. Number of previous time-series lags to add as new columns per feature. Time-series observations < `lag_length` will be filled as far back as possible, using the values for a given ID's first time period to fill the rest
  - `sampling_n`: maximum number of observations per ID to sample. This reduces the influence of statistical outliers (e.g. vehicles with many time-series observations). Set to 1 to convert panel data to a simple cross-section by keeping only the last observation per `unit_identifier`. Note - `sampling_n` must be >= `lag_length`
- `training_config`:
  - `test_size`: proportion of unique IDs to use in cross-validation. The test set is constructed assuming a panel structure, though this works equivalently for cross-sectional data
  - `hyperoptimize`: whether to hyperoptimize using Optuna. This employs the default Tree-structured Parzen Estimator
  - `n_trials`: number of hyperparameter trials to execute. Ignored if `hyperoptimize` is false
  - `save_model`: whether to save the trained XGBoost model as JSON at the end
- `predict_config`:
  - `model_path`: path to a gzipped directory containing previously-trained model artifacts
- `seed`: initializer value for the pseudo-random number generator
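For reference, a filled-in config.yaml following this schema might look like the sketch below. All values (paths, feature names, counts) are illustrative only, not the Scania defaults:

```yaml
mode: train

data:
  data_path: ./data/train.csv      # CSV only
  unit_identifier: vehicle_id
  time_identifier: time_step
  target_feature: failure          # binary/boolean outcome
  lag_length: 3
  sampling_n: 10                   # must be >= lag_length

training_config:
  test_size: 0.2
  hyperoptimize: true
  n_trials: 50
  save_model: true

predict_config:
  model_path: ./output/2024-01-01/12-00-00/model.tar.gz

seed: 42
```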
Note - don't alter the Hydra configuration information, as this would change the default destination for output logs and artifacts.
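The `lag_length` back-filling behavior described above can be sketched in pandas. This is a hypothetical illustration of the idea, not necessarily this repo's exact implementation; column names are made up:

```python
import pandas as pd

df = pd.DataFrame({
    "vehicle_id": [1, 1, 1, 2, 2],
    "time_step":  [1, 2, 3, 1, 2],
    "sensor":     [10.0, 12.0, 15.0, 7.0, 9.0],
})

lag_length = 2
# each unit's first observed value, used to back-fill early periods
first_value = df.groupby("vehicle_id")["sensor"].transform("first")
for k in range(1, lag_length + 1):
    lagged = df.groupby("vehicle_id")["sensor"].shift(k)
    df[f"sensor_lag{k}"] = lagged.fillna(first_value)
```

Each feature gains `lag_length` lagged copies per unit; rows earlier than the k-th period reuse the unit's first observation rather than holding missing values.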
Ensure the current working directory of your IDE is set to ./src, then run:

```shell
python -m pdm
```
Logging is streamed to the console during the train/predict procedure. At the end, a log file, figures, and model artifacts are output to a new folder with the datetime when training is initiated: ./output/<YYYY-MM-DD>/<HH-mm-SS>.
To see an example of this code applied to a multivariate time-series dataset from Scania trucks, see the scania example.