Prediction of Intermittent Demand Occurrence using Machine Learning

Demand forecasting plays a pivotal role in modern Supply Chain Management (SCM). It is an essential part of inventory planning and management and can be challenging. A major issue in demand forecasting is insufficient forecast accuracy: deviations between actual and predicted demand result in forecasting errors. This problem is exacerbated for slow-moving and intermittent demand items. Organizations commonly carry a large proportion of items that have small, irregular demand with long periods of zero demand, known as intermittent demand items. Demand for such items occurs sporadically and with considerable fluctuation in demand size. Forecasting intermittent demand entails predicting a demand series in which the time interval between demands is significantly greater than the unit forecast period, so the series contains multiple periods of no demand. The challenge with these items is that they need to be stocked and replenished at regular intervals irrespective of the demand cycle, which adds to the cost of holding inventory. Since the demand is not continuous, traditional forecasting models are unable to provide reliable estimates of the required inventory level and replenishment point, and forecast errors result in obsolescent stock or unfulfilled demand. This paper presents a simple yet powerful approach to demand forecasting and replenishment for such low-volume intermittent demand items, yielding a recommendation for a dynamic reorder point and thereby improving the inventory performance of these items. Currently, demand forecasts are generally based on past usage patterns. The rise of Artificial Intelligence/Machine Learning (AI/ML) has provided a strong alternative for forecasting intermittent demand.
The intention is to highlight that machine learning algorithms can be more efficient and accurate than traditional forecasting methods. As we move toward Industry 4.0, the digital supply chain is considered the most essential component of the value chain, in which inventory size is controlled and demand is predicted.


Introduction
SCM constitutes a significant portion of an organization's total cost and has a commensurate impact on the company's performance [1]. Good SCM can improve customer service, reduce operating and inventory costs, and bring significant financial gains to an organization. With increasing competition, companies face rising customer expectations: products need to be delivered to the customer at the right cost, to the correct place, at the precise time, and in the required quantity. It therefore becomes necessary for companies to invest in modern supply chain technologies and strategies to gain a competitive advantage over rivals.
Most spares and components used in Maintenance, Repair and Operations (MRO) are characterized by intermittent demand, meaning that they are not needed on a regular, consistent basis. This can make it difficult for manufacturers to accurately predict demand and maintain appropriate inventory levels. Because of this, manufacturers frequently maintain a sizable pool of spares and maintenance stock to prevent downtime caused by a lack of spares. This strategy can lead to high inventory carrying costs and obsolescence [1]. To overcome these challenges, manufacturers have started using inventory management strategies such as just-in-time (JIT) and Kanban systems, which aim to optimize inventory levels by aligning production and delivery with actual demand.
Organizations are also increasingly adopting innovative SCM technologies that use advanced planning and scheduling techniques to create an optimized schedule, increasing efficiency and maximizing output with limited resources by synchronizing supply with the demand period and reducing inventory. To achieve efficient inventory management, it is necessary to estimate the demand probability from historical demand in order to obtain a reliable demand estimate. Additionally, companies are using forecasting methods and demand planning techniques to improve the accuracy of their inventory replenishment decisions and to minimize the impact of lead-time variability. Various forecasting techniques are available to predict demand.

Figure 1. Intermittent Demand Pattern
Forecasting intermittent demand involves predicting a demand series in which the interval between demands is considerably greater than the unit time of the forecast period. This results in several periods with no demand [2]. Intermittent demand items may be requested only sporadically and are considered slow movers, but they can make up as much as 60% of the stock value [3]. Any uncertainty in demand prediction raises the safety stock requirement and thus ties up the capital needed to hold the inventory, along with the cost of storage space. A reliable demand forecast is required to support inventory sizing and the reorder point, because replenishing such items at regular intervals adds to the organization's costs: inventory carrying costs alone are around 15% to 30% of the value of the inventory [3].

Figure 2. Demand Pattern
In 1984, T. M. Williams suggested a technique to categorize demand patterns as "Smooth", "Slow-Moving", or "Sporadic" using variance partitioning, which the author defined as "partitioning the variance of demand during a lead time into its constituent causal parts" [5]. The purpose of this categorization was to apply a separate inventory management policy and the best-suited forecasting technique to each category of products. Classifying demand by pattern, as shown in Figure 2, generally gives valuable insights into future demand [6], but the stochastic pattern of intermittent demand items hinders accurate forecasting and planning.
To ensure that the most appropriate prediction method is applied to demand with a particular attribute, demand patterns are classified into four categories (erratic, smooth, lumpy, and intermittent) using two metrics: the Average Demand Interval (ADI) and the squared Coefficient of Variation (CV²). ADI is the average time between successive demand occurrences in the historical demand data and measures intermittence. CV² measures demand variability as the square of the ratio of the standard deviation of demand to the mean demand, and captures lumpiness [7]. Intermittent demand has a relatively high ADI, as seen in Figure 3.
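The ADI and CV² classification described above can be illustrated with a short sketch. This is a minimal example under stated assumptions rather than the procedure used in this paper: the function name `classify_demand` is hypothetical, ADI is approximated as the number of periods divided by the number of periods with demand, CV² is computed on non-zero demand sizes, and the cut-off values 1.32 and 0.49 are the commonly cited Syntetos-Boylan thresholds, which are not stated in this text.

```python
from statistics import mean, pstdev

def classify_demand(series, adi_cut=1.32, cv2_cut=0.49):
    """Return (ADI, CV2, label) for a list of per-period demand quantities.

    The cut-offs are assumed defaults, not values taken from this paper.
    """
    nonzero = [d for d in series if d > 0]
    if not nonzero:
        raise ValueError("series has no demand occurrences")
    # ADI: average number of periods per demand occurrence.
    adi = len(series) / len(nonzero)
    # CV2: squared ratio of standard deviation to mean of non-zero demand sizes.
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2
    if adi < adi_cut:
        label = "smooth" if cv2 < cv2_cut else "erratic"
    else:
        label = "intermittent" if cv2 < cv2_cut else "lumpy"
    return adi, cv2, label

# Twelve periods with three demand occurrences of similar size.
adi, cv2, label = classify_demand([0, 0, 5, 0, 0, 0, 4, 0, 0, 6, 0, 0])
print(adi, round(cv2, 4), label)
```

A series with long gaps but similar demand sizes lands in the intermittent quadrant, while the same gaps with widely varying sizes would be labeled lumpy.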

Figure 3. Demand Classification
Currently, most demand forecasting is done using time series methodologies such as Simple Moving Average (SMA), Exponential Smoothing and Simple Exponential Smoothing (SES), Autoregressive Integrated Moving Average (ARIMA), etc. However, these methods are not suited to predicting intermittent demand, as time series models do not take into account the underlying structure of intermittency in the demand and give more weight to recent demand [8]. In addition, the variation in demand size and period makes this type of demand difficult to forecast and results in severe errors, rendering classical statistical models unsuitable.
The main purpose of an inventory control system is to determine the triggering point for stock replenishment [4]. Most organizations can reduce their inventory size by using more efficient inventory control tools. To achieve efficient inventory management, it is necessary to estimate the demand probability from historical demand in order to obtain a reliable demand estimate. Most organizations today are migrating to advanced algorithms and predictive analytics that use historical data to predict future demand.

Related Work
J. D. Croston published "Forecasting and Stock Control for Intermittent Demands" in 1972 [9], where he concluded that statistical methods such as exponential smoothing are largely biased owing to large smoothing constants, and proposed an improvement to the forecasting system that aimed to reduce the error arising when there are multiple periods of no demand. Croston further introduced a new methodology for forecasting products with intermittent demand that have a univariate forecast profile, which came to be known as Croston's method. It is a modification of exponential smoothing adapted for intermittent demand products and is very useful in certain circumstances. Croston's method is considered the gold standard in intermittent demand forecasting, and the majority of research in this field is based on Croston's estimator, in which demand sizes and inter-demand intervals are each exponentially smoothed [10].
The so-called "count data forecasting" proposed by Croston in 1972 [9] treats the two variables separately, applying exponential smoothing to each, and estimates the future mean demand as the ratio of the demand size forecast to the demand interval forecast, which is more intuitive and accurate for intermittent demand data. The single exponential smoothing estimates of the average demand size and the demand interval are updated only after a demand occurs, as seen in Equations (1)-(3). Therefore, when no demand occurs, the forecast before and after the period remains constant, as seen in Equations (4)-(6). The ratio of the smoothed estimates of demand size and demand interval then gives the average demand per period.

Let
$D_t$ be the demand occurring at time $t$,
$Z_t$ be the estimate of the average non-zero demand size at time $t$,
$P_t$ be the estimate of the average interval between non-zero demands,
$q$ be the number of successive zero-demand periods, and
$F_t$ denote the estimate of the mean demand forecast.

Then, if $D_t > 0$,

$$Z_t = \alpha D_t + (1 - \alpha) Z_{t-1} \tag{1}$$
$$P_t = \alpha q + (1 - \alpha) P_{t-1} \tag{2}$$
$$F_t = \frac{Z_t}{P_t} \tag{3}$$

and if $D_t = 0$,

$$Z_t = Z_{t-1} \tag{4}$$
$$P_t = P_{t-1} \tag{5}$$
$$F_t = F_{t-1} \tag{6}$$

where $\alpha$ is the learning parameter that allocates importance to either the most recent observations or the historical ones ($0 < \alpha < 1$).

The most significant advantage of Croston's method over SES is its ability to estimate the period between demand occurrences, which is of great value for inventory optimization. Syntetos and Boylan, in their paper "On the bias of intermittent demand estimates", showed that Croston's method is biased [11]. Various modifications and adjustments of Croston's method have been proposed since, such as the Syntetos-Boylan Approximation (SBA), the Shale-Boylan-Johnston (SBJ) method, and the Teunter-Syntetos-Babai (TSB) method.
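Croston's update, as described earlier in this section, can be sketched in a few lines of Python. This is an illustrative sketch, not the implementation used by any cited author. Two choices are assumptions made here: the estimates are initialised from the first observed demand, and the counter `q` counts the periods since the last non-zero demand including the current one, so that it equals the inter-demand interval. The function name `croston` is hypothetical.

```python
def croston(demand, alpha=0.1):
    """Return the per-period demand forecast after each observation.

    Entries are None until the first non-zero demand has been seen.
    """
    z = None   # smoothed non-zero demand size
    p = None   # smoothed inter-demand interval
    q = 1      # periods since the last non-zero demand (assumed convention)
    forecasts = []
    for d in demand:
        if d > 0:
            if z is None:
                # Assumed initialisation: first demand size and interval.
                z, p = float(d), float(q)
            else:
                z = alpha * d + (1 - alpha) * z
                p = alpha * q + (1 - alpha) * p
            q = 1
        else:
            # No demand: estimates stay unchanged, only the counter grows.
            q += 1
        forecasts.append(None if z is None else z / p)
    return forecasts

print(croston([0, 0, 6, 0, 0, 3], alpha=0.5))
# [None, None, 2.0, 2.0, 2.0, 1.5]
```

The output shows the property discussed in the text: the forecast is a flat average demand per period that changes only when a demand occurs, and it says nothing about when the next demand will arrive.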
Nevertheless, owing to their underlying bias and limitations, all of these methods demonstrate mediocre accuracy across datasets. Croston's method and its variants outperformed traditional methods with only modest gains, and some studies even concluded that they perform worse [11] [12]. Also, since these techniques are built on simple exponential smoothing of the inter-demand interval and the demand size, they produce static forecasts that give only an average demand over the forecast period, which in turn burdens organizations with excessive inventory and associated costs. The special characteristics of intermittent demand increase the difficulty of the forecasting problem, and it has received limited academic attention toward further developing the available forecasting methods [13]. Therefore, even a slight improvement in the accuracy of intermittent demand prediction can translate into remarkable savings.

Material and Methods
Few forecasting methods have been developed specifically to deal with the problem of intermittent demand. The special characteristics of intermittent demand increase the difficulty of the forecasting problem, and it has received limited academic attention toward further developing the available forecasting methods. Most academicians and practitioners use advanced probabilistic models to forecast intermittent demand rather than predicting the exceptional occurrence of "peaks-over-threshold time series". These methods typically aim to reconstruct the distribution of the underlying phenomena [13]. Because of the randomness and nonlinearity of the intermittent demand distribution, traditional forecasting methods are not able to predict results with good accuracy.
Traditional forecasting methods presume the demand to be stationary, whereas intermittent demand data is non-stationary owing to periods of no demand and fluctuating demand sizes. Intermittent demand has two major uncertainties: variation in the demand size and variation in the demand interval. Accurate and reliable forecasting methods for intermittent demand can help deal with both uncertainties.
When traditional forecasting models were compared with Croston's method and its variants, most results indicated that Croston's method and its variants gave inconsistent results for intermittent demand. They outperformed traditional methods with modest gains, and some studies even concluded that Croston's method and its variants had inferior performance [2], [3]. AI/ML has gained a lot of attention in the recent past owing to improvements in computational power. Machine Learning (ML) based forecasting models have achieved initial success in improving forecast accuracy, including for intermittent demand forecasting. ML requires additional input data along with the historical demand values of the time series being forecasted, i.e., it works on a multivariate model rather than the univariate model employed by time series methods or Croston's method.
The motivation for this paper is to improve on Croston's method of intermittent demand forecasting by classifying the likelihood of the rare event of demand occurrence using AI/ML. As highlighted earlier, intermittent demand consists of two major components: the interval between demands and the rate of demand. The inter-demand interval is the duration between demands, i.e., the period between the previous and the current demand occurrence, which is generally greater than one for intermittent demand, and the demand rate is the demand value. The objective here is to create a classification model that predicts the inter-demand interval for intermittent demand more reliably and hence lowers the typical stock holding value while sustaining high customer service levels. Classifying demand occurrence will enable a business to control inventory size and reduce safety stock by forecasting a dynamic reorder point just before the demand is expected, which will overcome the limitations of Croston's method.

Naïve Bayes Classifier
The Naïve Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with a strong (naïve) assumption of independence between each pair of features [14], as shown in Figure 4. Naïve Bayes classifiers are among the simplest Bayesian network models and work well for both binary and multi-class classification. Naïve Bayes is among the most prevalent machine learning algorithms because of its simplicity and ease of use, and it sometimes outperforms even highly sophisticated classification methods, especially when little training data is available. Presuming that the features are independent given the class makes learning simpler. Even though independence is often a poor assumption, in practice Naïve Bayes frequently competes well with more sophisticated classifiers. A further strength of this method is its speed compared with other classification algorithms.
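To make the independence assumption concrete, the following is a minimal sketch of a categorical Naïve Bayes classifier written with the Python standard library only. It is an illustration, not the model trained in this paper: the class name `CategoricalNB`, the Laplace smoothing value, and the toy rows (a promotion flag and a campaign flag predicting whether demand occurs) are all assumptions introduced here.

```python
from collections import Counter, defaultdict
from math import log

class CategoricalNB:
    """Naive Bayes for categorical features with Laplace smoothing."""

    def fit(self, rows, labels, smoothing=1.0):
        self.smoothing = smoothing
        self.n = len(labels)
        self.class_counts = Counter(labels)
        self.counts = defaultdict(Counter)  # (feature index, class) -> value counts
        self.values = defaultdict(set)      # feature index -> values seen in training
        for row, y in zip(rows, labels):
            for j, v in enumerate(row):
                self.counts[(j, y)][v] += 1
                self.values[j].add(v)
        return self

    def log_posterior(self, row):
        scores = {}
        for y, cy in self.class_counts.items():
            score = log(cy / self.n)  # log prior of the class
            for j, v in enumerate(row):
                # Independence assumption: per-feature likelihoods simply multiply,
                # so their logarithms add.
                num = self.counts[(j, y)][v] + self.smoothing
                den = cy + self.smoothing * len(self.values[j])
                score += log(num / den)
            scores[y] = score
        return scores

    def predict(self, row):
        scores = self.log_posterior(row)
        return max(scores, key=scores.get)

# Hypothetical toy data: (promotion, campaign) -> 1 if demand occurred, else 0.
rows = [("promo", "campaign"), ("promo", "none"), ("none", "campaign"),
        ("none", "none"), ("none", "none"), ("promo", "campaign")]
labels = [1, 1, 0, 0, 0, 1]
model = CategoricalNB().fit(rows, labels)
print(model.predict(("promo", "campaign")), model.predict(("none", "none")))
# 1 0
```

Training reduces to counting, which is why the method is fast and remains usable on small datasets.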

Support Vector Machine (SVM) Classifier
SVM classifiers belong to a family of generalized linear classifiers that maximize predictive accuracy while avoiding overfitting. An SVM constructs a hyperplane, as seen in Figure 5, or a set of hyperplanes that divides the feature space so as to preserve the class division, keeping the objects of each class separated by the largest possible margin (m), which is twice the distance from the hyperplane to the nearest objects [16]. SVM uses a function known as the kernel to map a low-dimensional input space into a higher-dimensional space or, more specifically, to convert non-separable problems into separable ones by adding dimensions. In contrast to neural networks, which can easily overfit, SVM guards against overfitting. This makes SVM powerful, flexible, and accurate [16].
Artificial Neural Networks (ANNs) are a category of machine learning models intended to replicate the architecture and functioning of biological neurons in the brain. They are made up of layers of interconnected "neurons" that process and transmit information. The Multilayer Perceptron (MLP) is a category of feedforward ANN. An MLP contains many perceptrons organized into layers. It is composed of a minimum of three layers of nodes: an input layer, followed by a hidden layer, and then an output layer, as seen in Figure 6. Every node, with the exception of the input nodes, is a neuron that employs a nonlinear activation function. For training, MLP uses a supervised learning technique called backpropagation. Its multiple layers and nonlinear activation differentiate the MLP from a linear perceptron, and it can separate data that is not linearly separable [18].
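The claim that a hidden layer with a nonlinear activation can separate data that is not linearly separable can be checked on the classic XOR problem. The sketch below is a forward pass only: the weights are hand-picked for illustration rather than learned by backpropagation, the ReLU activation is an assumed choice, and the output node is left linear for simplicity. All names are hypothetical.

```python
def relu(x):
    return max(0.0, x)

def mlp_forward(x, hidden_weights, hidden_biases, out_weights, out_bias):
    """One hidden layer with ReLU activation, followed by a linear output node."""
    hidden = [relu(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_weights, hidden_biases)]
    return sum(w * h for w, h in zip(out_weights, hidden)) + out_bias

# Hand-picked (not trained) weights that reproduce XOR.
HIDDEN_W = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_B = [0.0, -1.0]
OUT_W = [1.0, -2.0]
OUT_B = 0.0

def xor_mlp(a, b):
    return mlp_forward([a, b], HIDDEN_W, HIDDEN_B, OUT_W, OUT_B)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_mlp(a, b))
```

No single linear boundary separates the XOR classes, yet two hidden nodes suffice once a nonlinearity is applied between the layers.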

Methodology
This paper tackles the business problem from an analytic perspective, leveraging modern analytical tools such as AI/ML and causal modelling. The advantage of ML in forecasting intermittent demand lies in its capability to learn a non-linear process without needing any assumption about the distribution [20]. ML can be used to generate forecasts for dynamic demand rates without presuming that the demand rate will remain constant in the future, and it can reflect the correlation between non-zero demand and the rate at which demand occurs between successive demand events, overcoming a limitation of Croston's method.
The classification of demand occurrence would enable the business to control inventory size and reduce safety stock by forecasting a dynamic reorder point just before the demand is expected. This can help save the cost of holding inventory and free up the capital needed to procure and store it, so that it can be used for more pressing needs.
The proposed method, shown in Figure 7, classifies the peaks in the demand to improve inventory control and eventually customer service levels, as opposed to Croston's method, which gives an average demand per period. The intention is to highlight that machine learning algorithms can be more efficient and accurate than traditional forecasting methods such as Croston's method for intermittent demand forecasting. Quantitative data on the factors influencing demand is used to train ML classifiers. The final objective is to see whether machine learning methods are able to capture the underlying factors in intermittent demand better than statistical methods.

Data
The dataset used for the analysis is intermittent demand data covering a period of 61 days and consisting of four columns: the day of the week on which the sale is made, the promotional discount on that day, whether a marketing campaign was run, and the sales made. The dataset is divided into training data and testing data for modelling. The training sample is used to build the model, and the testing sample is used to evaluate prediction reliability. An 80-20 split is performed on the dataset, i.e., 80% of the data is used for training and 20% for testing.
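As an illustration of the 80-20 split, the sketch below divides 61 daily rows into a training and a testing portion. The text does not say whether the split was random or chronological; a chronological split, which keeps the most recent days for testing, is assumed here because the data is a time series. The function name `chronological_split` is hypothetical.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split ordered rows into (train, test) without shuffling."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

days = list(range(1, 62))            # stand-in for the 61 daily records
train, test = chronological_split(days)
print(len(train), len(test))         # 48 13
```

Under this assumption the test set holds only 13 rows, so a single misclassified day changes the reported accuracy by about 7.7 percentage points.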

Feature Engineering
To improve forecast accuracy, additional features are developed to capture the distinctive attributes of the intermittent demand forecasting problem and help the algorithm learn the pattern better.
The following features are added:
1. Rolling window: A rolling window with a window size of 7 days was created.
2. Lag: 7 lags of the demand were created (lag1 = Dt-1, lag2 = Dt-2, lag3 = Dt-3, and so on).
3. Zero Cumulative: The count of periods in between the non-zero demand periods [21].
4. Rolling Mean: Average for a window of data for 3 and 6 months.
5. Demand Days: Cumulative sum of demand in the rolling window of 7 days.
6. No Demand Days: 7 - Demand Days.
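The features listed above could be derived from a plain list of daily demand values along the following lines. This is a sketch under stated assumptions, not the authors' pipeline: every feature at day t is computed only from days before t to avoid leaking the target, "Demand Days" is interpreted as the number of days with non-zero demand in the 7-day window (which is what makes "7 - Demand Days" meaningful), and the rolling mean uses the same 7-day window rather than the 3- and 6-month windows mentioned in the list. The function and key names are hypothetical.

```python
def build_features(demand, n_lags=7, window=7):
    """Return one feature dict per day t, using only demand observed before t."""
    rows = []
    start = max(n_lags, window)
    for t in range(start, len(demand)):
        past = demand[t - window:t]  # the `window` days immediately before t
        row = {f"lag{k}": demand[t - k] for k in range(1, n_lags + 1)}
        # Zero Cumulative: consecutive zero-demand days immediately before t.
        zeros = 0
        for d in reversed(demand[:t]):
            if d != 0:
                break
            zeros += 1
        row["zero_cumulative"] = zeros
        row["rolling_mean"] = sum(past) / window
        row["demand_days"] = sum(1 for d in past if d > 0)
        row["no_demand_days"] = window - row["demand_days"]
        # Classification target: does demand occur on day t?
        row["target"] = 1 if demand[t] > 0 else 0
        rows.append(row)
    return rows

features = build_features([0, 0, 3, 0, 0, 0, 0, 2, 0, 0])
print(features[0]["zero_cumulative"], features[0]["demand_days"], features[0]["target"])
# 4 1 1
```

Because the first `max(n_lags, window)` days have no complete history, they produce no feature rows, which further reduces the usable sample on a 61-day dataset.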

Model
After the additional features are created, we move forward with the data modelling. Quantitative data on the factors influencing demand is used to train ML classifiers. The three ML techniques discussed in the previous section, namely the Naïve Bayes classifier, the SVM classifier, and the NN-MLP classifier, were used to create classification models with and without the additional engineered features. The six models thus created were compared against each other for performance and accuracy.

Experiment and Result
Table 1 shows the performance of each of the ML models. The Naïve Bayes model achieved an accuracy of 77% without the additional engineered features, and the accuracy improved to 92% with feature engineering. The NN-MLP model achieved an accuracy of 92% without feature engineering; however, its accuracy deteriorated to 85% for the feature-engineered model, suggesting that the additional features did not enhance its classification results. The SVM model outperformed both of these models, with an accuracy of 92% without feature engineering that rose to 100% with feature engineering.
The forecast accuracy metrics of the demand classification models are evaluated with respect to their stock control implications. The above results show that the SVM classifier can forecast the occurrence of non-zero demand precisely with the help of additional features in the data. This approach can be very useful for adopting forecast-based inventory control: a dynamic reordering and replenishment trigger could be created based on the model prediction while ensuring that sufficient stock is available to cover all demand.
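One possible reading of such a replenishment trigger is sketched below. It is purely illustrative and not a procedure specified in this paper: it assumes a fixed supplier lead time in days, takes the classifier output as a list of 0/1 flags per future day, and places an order exactly one lead time ahead of each predicted demand day. The function name `plan_orders` and the lead time value are hypothetical.

```python
def plan_orders(predicted_occurrence, lead_time):
    """Map predicted demand days to order days placed `lead_time` days earlier.

    Returns (order_days, uncovered_days); a demand day is uncovered when it
    falls sooner than one lead time from the start of the horizon.
    """
    orders, uncovered = [], []
    for day, occurs in enumerate(predicted_occurrence):
        if not occurs:
            continue
        if day >= lead_time:
            orders.append(day - lead_time)
        else:
            uncovered.append(day)
    return orders, uncovered

# Demand predicted on days 3 and 6 of an 8-day horizon, 2-day lead time.
print(plan_orders([0, 0, 0, 1, 0, 0, 1, 0], lead_time=2))
# ([1, 4], [])
```

Days reported as uncovered would still need to be served from safety stock, which is one reason a sketch like this complements rather than replaces a stock policy.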

Conclusion
The proposed solution focuses on the broader use of machine learning models, and classification techniques in particular, to address the difficulty of forecasting intermittent demand by predicting the special event of demand occurrence. The likelihood of demand occurrence is forecasted using classifiers, and would subsequently be combined with the average demand per period obtained from the conventional Croston's method to give the demand size. The proposed method is easy, applicable, quick, and a robust substitute for more intricate statistical forecasting methods, and it avoids excess stock while ensuring that demand is met. We believe this proposition is intuitively appealing and well suited to highly uncertain situations such as intermittent demand forecasting. However, a limitation of this approach is that it relies on Croston's method for the demand-per-period estimate needed to predict the actual demand size.

Table 1. Model Results