# Probabilistic lead time forecasting

Accurate lead time estimates are critical for determining the amount of inventory needed to fulfill future demand. Lokad’s forecasting engine features a *lead time forecasting* mode that is specifically tailored for lead times. Indeed, lead times need to be forecast just as demand is, and they exhibit statistical patterns of their own, such as seasonality or day-of-the-week effects. The lead time forecasts produced by Lokad’s forecasting engine are *probabilistic*: they represent the expected probabilities of every single lead time duration, expressed in days. In this section, we detail the syntax used for computing lead time forecasts through Lokad.

## General syntax

The forecasting engine is accessed through a special kind of function, known as a *call function* in Envision terminology. The syntax is the following:

```
// "PO" stands for PurchaseOrders
Leadtime = forecast.leadtime(
  category: C1, C2, C3, C4
  hierarchy: H1, H2, H3, H4
  supplier: Supplier
  offset: 0
  present: (max(Orders.Date) by 1) + 1
  leadtimeDate: PO.Date
  leadtimeValue: PO.ReceivedDate - PO.Date + 1
  leadtimeSupplier: PO.Supplier)
```

Unlike regular functions, call functions have *named* arguments instead of *positional* arguments. Named arguments are better suited to complex functions because they make the source code much more readable, at the expense of a little extra verbosity. Otherwise, these arguments behave just like regular function arguments: any Envision expression can be passed as an argument.

The function returns a vector `Leadtime` that is of type *distribution* (see also Algebra of Distributions). Distributions are an advanced data type that represent functions $p: \mathbb{Z} \to \mathbb{R}$. More specifically, the forecasting engine returns *random variables* - that is - distributions that are *positive* and have a *mass* equal to 1. In the present case, $p(k)$ represents the lead time probability associated with $k$ days. Each item - in the Envision sense - becomes associated with its own distribution.
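To make the notion of a random variable concrete, here is a minimal Python sketch (not Envision, and not the engine's internal representation). It models one item's lead time distribution as a mapping from $k$ days to $p(k)$. The probabilities and the helper names `is_random_variable`, `probability_at_most` and `mean_lead_time` are hypothetical, chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical lead time distribution for a single item: p(k) for k days.
lead_time_pmf = {
    7: Fraction(1, 10),
    8: Fraction(3, 10),
    9: Fraction(4, 10),
    10: Fraction(2, 10),
}


def is_random_variable(pmf):
    """Check that no probability is negative and that the total mass is 1."""
    return all(p >= 0 for p in pmf.values()) and sum(pmf.values()) == 1


def probability_at_most(pmf, days):
    """Probability that the lead time is at most `days` days."""
    return sum(p for k, p in pmf.items() if k <= days)


def mean_lead_time(pmf):
    """Expected lead time, in days."""
    return sum(k * p for k, p in pmf.items())
```

With these illustrative numbers, the probability of receiving the goods within 9 days is 4/5, and the mean lead time is 8.7 days.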

The full `forecast.leadtime` syntax includes many arguments; however, only two of them are mandatory:

- `present`: a scalar date value
- `leadtimeDate`: a date vector with an *item* affinity

The `present` value is the date intended as the first day to be forecast, following the assumption that data is complete up to the day before. Indeed, some businesses may be closed on Sundays for example, and if the most recent date found in the dataset is a Saturday, there is an ambiguity as to whether the forecast should start on Sunday or Monday. In the illustrative syntax above, we use `(max(Orders.Date) by 1) + 1`, assuming that orders are observed every day, and that the input data is fresh from the day before.
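The rule "most recent observed date, plus one day" can be sketched in Python as follows. This is an illustration of the convention only, not Envision code. The order dates and the `present_date` helper are hypothetical.

```python
from datetime import date, timedelta


def present_date(order_dates):
    """First day to forecast: the day after the most recent observed date."""
    return max(order_dates) + timedelta(days=1)


# Hypothetical order history whose most recent date is a Saturday.
order_dates = [date(2024, 3, 14), date(2024, 3, 16), date(2024, 3, 15)]
present = present_date(order_dates)
```

Here the most recent order falls on Saturday, 16 March 2024, so `present` is Sunday, 17 March 2024. If the business is closed on Sundays and the forecast should start on Monday instead, `present` has to be set explicitly rather than derived from the data.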

The `leadtimeDate` and `leadtimeValue` are expected to belong to the same table, which exhibits an item affinity, that is `[Id, *]` in Envision terminology. The dates represent the starting days (inclusive) of the lead time observations. The values are expected to be expressed in days. Fractional days are not supported. This table contains the actual lead time history being forecast by the forecasting engine.
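As an illustration of how whole-day lead time values follow from a purchase order history, the Python sketch below mirrors the `PO.ReceivedDate - PO.Date + 1` expression from the syntax example. The purchase order rows and the `lead_time_days` helper are hypothetical.

```python
from datetime import date


def lead_time_days(order_date, received_date):
    """Lead time in whole days, mirroring `PO.ReceivedDate - PO.Date + 1`."""
    return (received_date - order_date).days + 1


# Hypothetical purchase order lines: (item id, order date, received date).
purchase_orders = [
    ("item-a", date(2024, 1, 2), date(2024, 1, 9)),
    ("item-a", date(2024, 2, 5), date(2024, 2, 15)),
    ("item-b", date(2024, 1, 10), date(2024, 1, 10)),
]

lead_times = [
    (item, ordered, lead_time_days(ordered, received))
    for item, ordered, received in purchase_orders
]
```

Each resulting row pairs a start date (the role of `leadtimeDate`) with an integer number of days (the role of `leadtimeValue`). With the `+ 1` convention, an order received on the day it was placed counts as a lead time of 1 day.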

When `leadtimeValue` is omitted, the successive durations in-between the `leadtimeDate` dates are used as lead time values instead. This behavior is intended to forecast the *ordering* lead time, which is typically performed separately from forecasting the *supply* lead time.
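The "successive durations" idea can be sketched in Python as plain differences between consecutive dates for one item. This is an assumption about the principle only. How the engine treats duplicate dates or boundary conventions is not specified here, and the `ordering_lead_times` helper is hypothetical.

```python
from datetime import date


def ordering_lead_times(dates):
    """Durations, in days, between consecutive dates once sorted."""
    ordered = sorted(dates)
    return [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical order dates for a single item, deliberately unsorted.
item_order_dates = [date(2024, 2, 6), date(2024, 1, 2), date(2024, 1, 16)]
intervals = ordering_lead_times(item_order_dates)
```

For these three dates the intervals are 14 and 21 days: a history of N dates yields N - 1 ordering lead time observations.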

Ideally, the history should be as long as possible, although in practice there are limited benefits in exceeding 5 years’ worth of lead times. The forecasting engine accommodates short and long lead time histories alike; when the history is long, older data points simply fade into statistical irrelevance.

Beyond these mandatory arguments, the accuracy of forecasts can be greatly improved by providing more data to the forecasting engine. The following sections explain this in more detail.

## Categories and hierarchy

Categories and hierarchy play a very similar role from the forecasting engine’s perspective: they help the forecasting engine cope with sparse historical data. See Forecasting with attributes.

## Supplier

The choice of supplier frequently plays an important part in the lead time estimation. Supplier data can be communicated to the forecasting engine in two ways that are complementary.

First, the history of lead time observations can be **categorized by supplier**. This is the purpose of the `leadtimeSupplier` argument. When this argument is present, the forecasting engine leverages this information to assess whether the lead times associated with a given supplier are correlated or not. Supplier information offers a granularity of information that may not be found through the simple item hierarchy or categories, because items may come from different suppliers over time.

Second, one can inform the forecasting engine as to which **supplier is to be considered for the next shipment**. This is the purpose of the `supplier` argument. When this argument is present, the forecasting engine can anticipate that an item may experience sudden lead time modifications, because the supplier has just been changed. Naturally, the forecasting engine may only leverage this information if the lead time history itself is properly categorized by supplier, as seen above.

## Lead time offsets

By default, the forecast begins at the `present` day. However, one might sometimes wish to have a forecast that starts at a later date. If the lead times are seasonal, the start date of the lead time might have a significant impact on its distribution. This case is handled through the `offset` argument. This optional argument expects a number vector from the *items* table. The values of this vector represent the offsets - expressed in days - to be considered for the lead time forecasts. For example, if an item has an offset value of 10, then the first day of the forecast lead time will be `present + 10`. When `offset` is omitted, all offsets are considered to be zero.
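The offset arithmetic can be illustrated with a short Python sketch. The item identifiers, offsets and the `forecast_start` helper are hypothetical, and items without an explicit offset default to zero, as described above.

```python
from datetime import date, timedelta


def forecast_start(present, offset_days=0):
    """First day of the forecast lead time: `present` plus the item's offset."""
    return present + timedelta(days=offset_days)


present = date(2024, 3, 17)

# Hypothetical per-item offsets, in days; missing items default to 0.
offsets = {"item-a": 10, "item-b": 0}

starts = {
    item: forecast_start(present, offsets.get(item, 0))
    for item in ("item-a", "item-b", "item-c")
}
```

Here `item-a` starts on 27 March 2024, while `item-b` and `item-c` (which has no offset) both start on `present`, 17 March 2024.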