# Time series

Most business operations of any given commerce can be accurately represented with time-series: historical sales, historical purchase orders, historical prices, etc. Since working with time-series is of primary importance for companies, Envision provides extensive native support for this type of scenario. In particular, Envision allows to aggregate data per day, week and month. Envision also supports more complex time-series analysis such as lagging data, or calculating moving averages. These features are illustrated and documented in this page.

## An illustrating example

Once more, we use our sample dataset to illustrate the capabilities of Envision. As this section represents a slightly more advanced usage of Envision, we suggest reading Making calculations and Aggregating data before proceeding to this section.

```
read "/sample/Lokad_Items.tsv" as Items[Id] with
Id : text
read "/sample/Lokad_Orders.tsv" as Orders expect [Id, Date] with
Id : text
Date : date
Quantity : number
NetAmount : number
read "/sample/Lokad_PurchaseOrders.tsv" as PO expect [Id, Date] with
Id : text
Date : date
Quantity : number
NetAmount : number
show label "Time-series calculations" a1f1 tomato
oend = max(Orders.date)
lastMonday = monday(oend)
Week.sold = sum(Orders.NetAmount)
when date >= oend - 52 * 7
show linechart "Weekly sales" a2f4 tomato unit:"$" with
Week.sold as "This year"
Week.sold[-52] as "Last year"
Week.ma = sum(Orders.NetAmount / 4) over [-3 .. 0]
when date >= oend - 52 * 7
show linechart "Weekly sales with 4 weeks moving average" a5f7 tomato unit:"$" with
Week.ma as "This year"
Week.ma[-52] as "Last year"
Day.cashFlow = sum(Orders.NetAmount) - sum(PO.NetAmount)
Day.balance = avg(Day.cashFlow) over [-13 .. 0]
when date >= lastMonday - 6 * 7 and date < lastMonday
show linechart "Cash flow over the last 6 weeks" a8f10 tomato unit:"$" with
Day.balance as "Balance"
PO.Price = PO.NetAmount / PO.Quantity
Orders.PurchasePrice = latest(PO.Price)
Orders.Cost = Orders.PurchasePrice * Orders.Quantity
Orders.Profit = Orders.NetAmount - Orders.Cost
Week.profitability = sum(Orders.Profit) / sum(Orders.Cost) default 1
when date >= lastMonday - 13 * 7 and date < lastMonday
show linechart "Profitability over the last 13 weeks" a11f13 tomato unit:"%" with
Week.profitability
```

After this script is run with the sample dataset, the following dashboard is obtained.

## Virtual calendar tables

Since daily, weekly and monthly aggregations are ubiquitous, Envision was designed to natively support these periodic calendar patterns. More specifically, Envision benefits from three virtual tables named `Day`

, `Week`

and `Month`

respectively. These tables are referred to as “virtual” because they do not have tabular file counterparts, or in other words, these tables only exist for the duration of the script’s execution. The script above leverages these virtual tables in order to display the line charts. Let’s review the relevant lines re-copied below.

```
oend = max(Orders.date)
lastMonday = monday(oend)
Week.sold = sum(Orders.NetAmount)
when date >= oend - 52 * 7
show linechart "Weekly sales" a2f4 tomato unit:"$" with
Week.sold as "This year"
Week.sold[-52] as "Last year"
```

At line 4, the content of the `Orders`

table is summed into the `Week`

table. There is a single value computed per week. At line 6, a filter is defined to exclude data that is more than 52 weeks old. Finally at lines 7 to 9, a line chart with two time-series displayed is defined. The second time-series `Week.sold[-52]`

comes with a lag operator, which is reviewed in the next section. This script can be easily modified for a daily or a monthly aggregation instead. For example, it is possible to add the following lines at the very end of the script:

```
Day.sold = sum(Orders.NetAmount)
show linechart "Daily sales" a14f16 tomato unit:"$" with
Day.sold
Month.sold = sum(Orders.NetAmount)
show linechart "Monthly sales" a17f19 tomato unit:"$" with
Month.sold
```

This block of code displays two more line charts, featuring time-series aggregated at the daily and monthly levels respectively. The variable `Day.sold`

can be interpreted as the content of the column named `sold`

within the `Day`

table, but it can also be interpreted as an equally spaced time-series of the “1 day” period – as opposed to the regular Envision tables that contain a `Date`

column such as `Orders`

and that can be interpreted as unevenly spaced time-series.

## Displaying lagged time-series

In time-series analysis, “lagging” refers to the operation which consists of moving time-wise the elements of the time-series. One of the most elementary purposes of time-series lagging consists of comparing two distinct periods of time. Envision supports a lag operator precisely targeted at dealing with such situations. Let’s revisit the previous code snippet.

```
Week.sold = sum(Orders.NetAmount)
when date >= oend - 52 * 7
show linechart "Weekly sales" a2f4 tomato unit:"$" with
Week.sold as "This year"
Week.sold[-52] as "Last year"
```

We have two time-series defined at lines 4 and 5. The first time-series `Week.sold`

is the original weekly aggregated sales. The second series comes with an extra suffix `[-52]`

. This suffix is the lag operator itself. Please remark that the argument must be a literal integer. In this example, the data that goes back 52 weeks in the past is moved forward and finally displayed in the line chart. When the lag operator is applied to the `Week`

table, then the lag argument is an integer expressed in weekly units. Similarly, the applicable units are days and months for the `Day`

and `Month`

tables respectively.

The lag operator benefits from the smart cooperative behavior of the filter operator `when`

. In fact, without this cooperative behavior, the filter `when`

would have already excluded all the data older than 52 weeks, and as a result, lagging the time-series of 52 weeks would have resulted in moving zeroes forward. Yet, as illustrated by the screenshot of the dashboard seen previously, the lag operator is correctly moving the actual one-year old data forward, and not merely the zeroes. This behavior is achieved through the built-in cooperation between the `when`

filter and the lag operator.

If you hover your mouse over the line chart within your dashboard, you should notice that dates and values get displayed. In particular, with the script at the beginning of this section, the dates reported for the time-series named “Last year” are indeed short of one year compared to the dates of the time-series named “This year”. However, it is important to note that the lag operator does not really preserve the original dates of the time-series. Instead, Envision simply uses a convention to make it work: if the lag operator is defined within the declaration of a line chart, then, and only then, the original dates remain preserved.

Let’s modify the script in order to apply the lag operator outside the linechart. This can be done by introducing a variable named `Week.lastYear`

.

```
Week.sold = sum(Orders.NetAmount)
Week.lastYear = Week.sold[-52]
when date >= oend - 52 * 7
show linechart "Weekly sales" a2f4 tomato unit:"$" with
Week.sold as "This year"
Week.lastYear as "Last year"//date display issue
```

If you run this modified script, and if you hover your mouse over the time-series dots, you should notice that both time-series are reported with the same dates. In this case, there is no non-ambiguous semantic for the dates to be displayed. For example, the `Week.lastYear`

time-series could have actually been interpreted as a year-to-year forecast, and as a result, having the same dates for the two time-series would have been what we were actually looking for in this case. In conclusion, if you wish to preserve the original dates in your line chart for side-by-side comparison of lagged time-series, then the lag operator should be defined within the `show`

statement.

## Aggregating data over a time window

In one of our previous guides, we have reviewed how to aggregate data with Envision. With time-series, there is another type of aggregation that is very desirable: the aggregation over a time window. The script from the beginning of this section illustrates how to compute a moving average of the sales over 4 weeks. The relevant lines of script are re-copied below for the sake of clarity.

```
Week.ma = sum(Orders.NetAmount / 4) over [-3 .. 0]
when date >= oend - 52 * 7
show linechart "Weekly sales with 4 weeks moving average" a5f7 tomato unit:"$" with
Week.ma as "This year"
Week.ma[-52] as "Last year"
```

At line 1, an aggregation with the `sum()`

aggregator is performed, and this aggregation comes with a statement starting with the keyword `over`

at the end. At line 2, data is filtered, specifically taking the last 52 weeks of data. Finally, at lines 3 to 5, two time-series are displayed within the line chart. Both time-series are “smoothed” as they are averaged over 4 weeks.

The operator `over`

is used to define the applicable time window, and is expected to be written as `[a .. b]`

where `a`

and `b`

are integers with `a`

being lower than or equal to `b`

. The unit used for `a`

and `b`

depends on the expression on the left side of the assignment. In the present case, we have the `Week`

table on the left side of the assignment, and as result, `-3`

and `0`

are expressed in weeks.

Gotcha: between the week of index -3 and the week of index 0, there are 4 weeks, not 3 weeks. As a result, we have the following weeks of indices -3, -2, -1 and 0.

The option `over`

can be used with all aggregators. When this option is used, the left side of the assignment is usually a virtual calendar table such as `Day`

, `Week`

and `Month`

. However, this is not a requirement, and any table can be used as long as it is indexed by a `Date`

. Also, by definition, when calendar tables are used, using `over [0 .. 0]`

gives the same results as the default aggregation:

```
Week.sold = sum(Orders.NetAmount)
// same result!
Week.same = sum(Orders.NetAmount) over [0 .. 0]
```

## More complex time-series aggregations

The syntax of Envision offers the possibility to perform more elaborate time-series calculations. For example, it is possible to compute time-series and then, to make further calculations based on those initial time-series. The third code block in the script at the top of this page illustrates this. The relevant lines of script are copied below.

```
Day.cashFlow = sum(Orders.NetAmount) - sum(PO.NetAmount)
Day.balance = avg(Day.cashFlow) over [-13 .. 0]
when date >= lastMonday - 6 * 7 and date < lastMonday
show linechart "Cash flow over the last 6 weeks" a8f10 tomato unit:"$" with
Day.balance as "Balance"
```

At line 1, the time-series `Day.cashFlow`

is defined as the difference between the total of sales and the total of purchases. At line 2, the time-series `Day.balance`

is calculated as the moving average over 14 days of `Day.cashFlow`

. At line 3, a filter with two specific conditions is defined: data should not be more than 7 weeks old – counting from last Monday, and data should not be “fresher” than last Monday. Here the use of the `monday()`

function ensures that the scope includes exactly 6 whole weeks. Finally, the time-series `Day.balance`

is displayed as the sole time-series of the line chart defined at lines 4 and 5.

The aggregation taking place at line 2 leverages the `over`

option that we have detailed in the previous section. Here, the time-series is averaged over a time window of 14 days – from the index -13 to index 0 there are 14 distinct indices. This aggregation is somewhat unlike the previous aggregations we have seen so far, because the table `Day`

appears both on the left and the right sides of the assignment, while the more usual aggregations that do not have an `over`

suffix typically aggregate data from one table into another table.

It is also possible to re-aggregate daily time-series into weekly time-series. The script below illustrates how this can be done to calculate a weekly cash flow instead of a daily one.

```
Day.cashFlow = sum(Orders.NetAmount) - sum(PO.NetAmount)
Week.balance = sum(Day.cashFlow / 2) over [-1 .. 0]
when date >= lastMonday - 6 * 7 and date < lastMonday
show linechart "Cash flow over the last 6 weeks" a8f10 tomato unit:"$" with
Week.balance as "Balance"
```

## Dealing with event-driven data

Event-driven data refers to a certain representation of the historical data that focuses on “changes” . For example, instead of having all the historical prices for every single day of the data history, it is much more practical to only collect the list of historical price changes: each price change gives a new price and a date, and it is assumed that the new price remains unchanged until a new price change is observed. Envision supports scenarios where the historical data is represented as a list of changes.

The sample dataset does not include a table with the historical prices, however, it is possible to approximate this data by looking at the past purchase transactions, and assuming that prices stay unchanged until the next transaction is observed. This is exactly what is taking place in the last block of script in the example at the start of this section. Let’s have a closer look at the relevant lines copied below.

```
PO.Price = PO.NetAmount / PO.Quantity
Orders.PurchasePrice = latest(PO.Price)
Orders.Cost = Orders.PurchasePrice * Orders.Quantity
Orders.Profit = Orders.NetAmount - Orders.Cost
Week.profitblty = sum(Orders.Profit) / sum(Orders.Cost) default 1
when date >= lastMon - 13 * 7 and date < lastMon
show linechart "Profitability over the last 13 weeks" a11f13 tomato unit:"%" with
Week.profitblty
```

At line 1, the purchase price per unit is computed for every line of the `PO`

table through a vector operation in the usual Envision style. At line 2, the `latest`

function is used, and this function has a rather specific behavior: for every line of the `Orders`

table – the table being assigned to – the `latest`

function looks up the most recent `PO.Price`

line available, no more recent that the `Orders`

line being considered, and copies this value on the left side of the assignment. At lines 3 and 4, we have more calculations involving vectors and aggregations. Finally, at line 5, a filter is defined to restrict the scope of operations to the last 13 whole weeks. Within this filter block, a line chart is displayed, leveraging the `Week.profitability`

time-series that has been previously calculated at line 4.

The `latest`

function is the specific ingredient that supports the calculation being considered here. This function is precisely intended to capture the semantics of an event stream, where a value is assumed to be constant until it is overridden by a new event. In particular, it is also possible to “densify” purchase prices, calculating a price for every single day of the history with:

```
Day.PurchasePrice = latest(PO.Price)
```

In practice, the `latest`

function can be used to deal with many situations such as stock-outs, promotions, product lifecycles, etc.