Making calculations

Envision offers the possibility to carry out just about any calculation that could otherwise be done with Excel. In this respect, the syntax to compute such calculations is similar to the syntax used for Excel formulas. Envision puts an emphasis on vector calculations. Vector-based operations are used to process many values at once rather than working with a single value at a time. Tables and vectors are covered in greater detail in this section, and should help you get started doing your own calculation with Envision.

In order to follow this page more effectively, we suggest you set up the sample dataset. We have not yet detailed the Envision capabilities relative to uploading the input data, but our examples will show you the lines that you need to put at the top of your script.

Tables and vectors

Envision’s data model revolves around tables and vectors. An Envision table is similar to the tables that exist in relational databases. From an Excel perspective, a table is a well-formed spreadsheet where the first line contains column headers and where the (many) lines below contain data that is correctly aligned with these headers. In an Envision script, the tables are also named and the table names are typically driven by the names of the underlying tabular files. Vectors are associated with the columns in the table and similarly, they are also named. Envision uses the term “vector” rather than “column” to emphasize that operations can be performed on all vector values at once, that is, on all the lines of the original table.

Let’s illustrate this idea with a few lines of script that can be applied to the sample dataset. Below, for every order line, we calculate the tax rate, or in other words, the ratio between the amount of tax and the net amount of tax being charged to the customer.

read "/sample/Lokad_Items.tsv" as Items[Id] with
  Id : text
  BuyPrice : number
  StockOnHand : number
  StockOnOrder : number

read "/sample/Lokad_Orders.tsv" as Orders expect [Id, Date] with
  Id : text
  Date : date
  NetAmount : number

read "/sample/Lokad_PurchaseOrders.tsv" as PO expect [Id, Date] with
  Id : text
  Date : date
  NetAmount : number

Orders.TaxRate = Orders.TaxAmount / Orders.NetAmount

Here Orders refers to the orders table, i.e. the table that contains the entire sales history, with every transaction represented with as many lines as there are items within the given transaction. The variable Orders.TaxAmount refers to the vector associated with the column named TaxAmount within the Orders table. You can make a mental note of the syntax which consists of using a dot (.) between the table name and the vector name, as this pattern is frequently used in Envision.

The operation that involves the equals sign = is called an assignment: the calculation takes place on the right side of the = sign, and the result gets assigned to left side of the statement. In the example above, we have a division operation taking place on the right side. Since neither Orders.TaxAmount nor Orders.NetAmount are actually defined anywhere in the script, Envision tries to upload this data directly from the input dataset. Since the Orders table of the table dataset contains the two columns NetAmount and TaxAmount, the script is executed successfully. Then, on the left side, we have Orders.TaxRate which gets assigned the newly computed tax rate. An assignment is the logical equivalent of creating a new column in Excel, named according to the assignment variable, that is, TaxRate in this instance.

In the following script snippets, for the sake of concision, we omit the read “/sample/Lokad_XYZ.tsv” lines from all the examples, as they are expected to always be included at the top of every script.

The syntax for calculations in Envision is similar to that used in Excel formulas. In the script below, we perform a series of (rather arbitrary) calculations to illustrate this syntax.

Orders.A = 42
Orders.B = 5 * (1 + Orders.A)
Orders.C = (Orders.A + Orders.B) * (1 + Orders.A)

Here, the script defines three vectors named respectively A, B and C, and all three vectors are attached to the Orders table. The first line is a simple assignment where the value 42 is assigned to Orders.A. However, as Orders.A is a vector, it is not just one value of 42, but one value for every line of the original table. Envision puts an emphasis on vectors and most of the operations on vectors take place on all their values at once.

Then, Orders is not the only table available in the sample dataset. For example, the sample dataset also contains a PO table, and very similar operations can be performed on this table as well.

PO.A = 42
PO.B = 5 * (1 + PO.A)
PO.C = PO.A + PO.B

Since we have just introduced a second table, it raises the following question: can Envision carry out operations between tables? The answer is yes, but it requires a tiny bit more effort. Let’s consider the following script:

Orders.A = 1
PO.A = Orders.A + 1 // WRONG!

Since the two tables Orders and PO have no reason to be aligned in any way - the two tables do not even contain the same number of lines – the semantics associated with such an operation would be very unclear. Thus, such an operation is not valid with Envision, and if an attempt is made to execute such a script, the execution fails and an error message is displayed.

Nevertheless, Envision provides abundant ways to combine data from distinct tables, and this aspect is covered in the following section.

Special status of dates

Operations are also typically linked to a specific date. Each line of the sales history comes with an applicable date, idem for the purchase order history, and this is also true for nearly all data that would qualify as historical data. Because of the importance of historical data in commerce, where practically all business operations can be described as a list of dated entries accounting for stock movements or payments, Envision makes a special case for the dates in a very commerce-driven way.

Any table can have a Date column in addition to its canonical Id column. When such a Date column is present, not only is the table indexed by the item identifiers, but also by dates. The indexation by dates is often practical, because when one seeks to apply a time-window of some kind to a calculation, the entire history - no matter what type of entries are involved - is expected to be filtered similarly.

In order to illustrate this, let’s get back to our cash flow calculation. Let’s assume that instead of computing values for the entire history, we decide to compute the cash flow per item when only taking the last year of data history. This can be done with:

oend = max(Orders.date)
when date > oend - 365
  Items.CashFlow = sum(Orders.NetAmount) - sum(PO.NetAmount)

The end variable is defined as the most recent date observed across the entire input dataset. Given that end is a date, this demonstrates that Envision offers the possibility to carry out date arithmetics, as shown in the script above. Adding the +1 convention to any date in Envision, results in adding one day to that particular date. Thus, by subtracting 365 days, we are moving back roughly one year in the past.

The script starts with a when filter which represents a condition that is imposed as being true for all the lines processed within a given block of script. As far as items are concerned, they are not indexed by date. Thus, the date filter has no effect on them, and all items continue to be present within the when block. However, both orders and purchase orders do have a Date column of their own, and, as a result, all the lines that do not satisfy the condition of the when filter get filtered from the block of script.

As a consequence, the sum() aggregation within the when block only processes the non-filtered lines, that is, the lines that are less than one year old. The same calculation could also have been carried out with intermediate variables, just like it was done initially with the full-history example.

oend = max(Orders.date)
when date > oend - 365
  Items.CashIn = sum(Orders.NetAmount)
  Items.CashOut = sum(PO.NetAmount)
  Items.CashFlow = Items.CashIn - Items.CashOut

With the example just above, it probably becomes a little clearer as to why we use the when statement at the start of the filtering block: all the lines within the block, noticeable by the two extra spaces at the beginning of lines 2, 3 and 4, undergo the same ambient filter for the dates.

This script also illustrates Envision’s capacity to re-align complex data coming from the other tables with the Items table. From an Excel perspective, it is as if one was capable of transforming the other sheets (for example, the order history) into columns using the master sheet, which contains the list of products. Envision makes this process a lot less tedious.