Iterating with 'each'

An each loop is a statement used to iterate over the lines of a table, executing the same set of operations once for each iteration. It is more complex but more flexible than for loops.

Because the each loop is an advanced feature, this article assumes in-depth knowledge of for loops and describes the each loop in terms of how it differs from them.

In terms of syntax, the only difference is that, in the loop header, the for Expr in Expr clause is replaced with each Table. For comparison, here is an each loop followed by the equivalent for loop:

table T = extend.range(10)
A = 0

T.M = each T scan auto
  keep A
  A = A + T.N
  return A

T.M = for N in T.N scan auto 
  keep A
  A = A + N
  return A

When given the choice between a for loop and an each loop, it is strongly recommended to use the for loop, as it is significantly simpler to understand. An each loop should be used only to achieve what is impossible with a for loop (see Reasons to use each below).

Table diagram

Both the each and the for loops execute their body once for every line of data in the iteration table. At the start of each iteration, the for loop loads the input data into scalar variables as described by the .. in .. pairs in its header. The each loop instead loads the input data based on the table diagram. All the tables defined in the script are classified into one of the following categories:

The iteration table appears after the each keyword. The body of the loop is executed once for every line in the iteration table.
An upstream table is any table from which one can broadcast (directly or indirectly) into the iteration table.
A downstream table is any table into which one can broadcast (directly or indirectly) from the iteration table. Downstream tables cannot be used in an each loop.
An upstream-cross table is any cross table where either component is the iteration table or an upstream table. Upstream-cross tables can only be used in an each loop under the following conditions:
1. The right table must be a full table.
2. It must be small, or have the iteration table as its left table.
Any other table is classified as a full table. Full tables can only be used in an each loop if they are small.

The position of a table T in the diagram determines what happens when reading a variable T.X, writing a variable T.X, or using into T:

| Diagram | Read variable | Write variable | `into` |
| --- | --- | --- | --- |
| Iteration table `Iteration` | Returns the scalar value on the line corresponding to the current iteration. `X = Iteration.X` | Writing to the iteration table is forbidden. | Broadcasting `into` the iteration table is forbidden. |
| Upstream table `Upstream` | Identifies the line that would be broadcast into the line of the iteration table that corresponds to the current iteration, and returns the scalar value on that line. `X = Upstream.X` | Allowed only for `keep` variables. Assigns a scalar to the line corresponding to the current iteration, leaving the rest unchanged. `Upstream.X = X` | Broadcasting `into` an upstream table is forbidden. |
| Upstream-cross table `UpstreamFull` | Identifies the line of the left table that corresponds to the current iteration, and returns the vector in the `Full` table that corresponds to that line. `Full.X = UpstreamFull.X` | Writing to an upstream-cross table is forbidden. | Broadcasting `into` an upstream-cross table is forbidden. |
| Full table | Normal behavior. | Normal behavior. | Normal behavior. |

Exercise

What is the classification of each of the tables below in each of the three each blocks?

read ".." as Category[category]
read ".." as Product[sku] expect [category]
read ".." as Channel[channel]
read ".." as Orders expect [channel, sku, date]

table CategoryWeek = cross(Category, Week)
table ProductWeek = cross(Product, Week)
table ChannelWeek = cross(Channel, Week)

Product.X = each Product
  // Here ?

Week.X = each Week
  // Here ? 

Category.X = each Category
  // Here ?

Answers

| Table | `each Product` | `each Week` | `each Category` |
| --- | --- | --- | --- |
| `Product` | Iteration | Full | Downstream |
| `Week` | Full | Iteration | Full |
| `Category` | Upstream | Full | Iteration |
| `Channel` | Full | Full | Full |
| `Orders` | Downstream | Downstream | Downstream |
| `CategoryWeek` | Upstream-Cross | Unavailable | Upstream-Cross |
| `ProductWeek` | Upstream-Cross | Unavailable | Unavailable |
| `ChannelWeek` | Unavailable | Unavailable | Unavailable |

`each .. scan` blocks

Like for .. scan, the each .. scan blocks allow you to keep values from one iteration to the next. However, this extra capability implies that the Envision runtime can’t parallelize the iterations of an each .. scan block. Thus, this variant should only be favored when values need to be kept from one iteration to the next. A simple example is given below:

table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

Best = 0

Obs.BestSoFar = each Obs scan Obs.Date
  keep Best
  NewBest = max(Best, Obs.Quantity)
  Best = NewBest
  return NewBest

show table "" a1b4 with Obs.Date, Obs.BestSoFar

In the above script, scan Obs.Date specifies the order in which the lines of the iteration table are traversed. The statement keep Best specifies that the variable Best must retain its value from one iteration to the next. Finally, Best = NewBest assigns a new value to the variable; this value is the one available on the next iteration.
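The same mechanism can carry any running quantity. As a minimal sketch, the script below accumulates a running total instead of a running maximum. It reuses the Obs table from the example above; the names Total and Obs.Cumulative are illustrative choices, not part of the original example.

```envision
table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

Total = 0 // illustrative name: quantity accumulated so far

Obs.Cumulative = each Obs scan Obs.Date
  keep Total
  Total = Total + Obs.Quantity // value carried over to the next line
  return Total

show table "" a1b4 with Obs.Date, Obs.Cumulative
```

Traversed by ascending date, the returned values would be 13, 24, 41, 59 and 75.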

By default, the lines of the iteration table are processed in ascending order. The desc option specifies descending order instead, as illustrated by:

table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

Best = 0

Obs.BestSoFar = each Obs scan Obs.Date desc
  keep Best
  NewBest = max(Best, Obs.Quantity)
  Best = NewBest
  return NewBest

show table "" a1b4 with Obs.Date, Obs.BestSoFar

The each .. scan block comes with a short series of syntactic constraints relative to the keep statements. The block requires at least one keep statement. All the keep statements must be made at the very beginning of the each .. scan block. The keep statements must refer to variables that have already been defined, prior to the each .. scan block. A variable marked with keep is modified by the execution of the each .. scan block. Its last value remains available after exiting the each .. scan block.

Variables marked with keep must be scalars, full-table vectors or upstream-table vectors. Vectors must belong to small tables so that they can be kept in memory.
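The sketch below illustrates two of the rules above: the keep variable is defined before the block, and its last value is still available once the block has ended. The names Total and T.Running are assumptions made for this illustration.

```envision
table T = extend.range(4)

Total = 0 // must exist before the block, since it is marked with keep

T.Running = each T scan auto
  keep Total // keep statements come first in the block
  Total = Total + T.N
  return Total

show table "running" a1b4 with T.N, T.Running
show scalar "final total" with Total // 10, the last value held by Total
```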

As a rule of thumb, user-defined processes should be preferred to each .. scan blocks whenever possible. The each .. scan block should be used when the logic grows too complex, or involves keeping non-scalar variables.

Return-less blocks

It may happen that an each .. block is introduced for the sole purpose of getting the last value held by a keep variable. In that case, the return statement may be omitted altogether, as illustrated by the following script:

table Currencies = with
  [| as Code |]
  [| "EUR"   |]
  [| "JPY"   |]
  [| "USD"   |]

Sep = ""
List = ""
each Currencies scan Currencies.Code
  keep Sep
  keep List
  List = "\{List}\{Sep}\{Currencies.Code}"
  Sep = ", "

show scalar "" with List

In the above script, the variable List is built through iterative concatenations. However, as only the final form is of interest, a return-less each .. block is used.

In practice, however, the above script could be rewritten in a simpler way by leveraging the built-in join aggregator, as illustrated by:

table Currencies = with
  [| as Code |]
  [| "EUR"   |]
  [| "JPY"   |]
  [| "USD"   |]

show scalar "" with join(Currencies.Code; ", ") sort Currencies.Code

`auto` ordering in `scan`

With the keyword auto, the scan follows the ordering of the primary dimension of the table being enumerated:

table T = extend.range(6)
x = 0
T.X = each T scan auto
  keep x
  x = T.N - x
  return x

show table "T" a1b5 with T.N, T.X

The above script is logically identical to the one below:

table T[t] = extend.range(6)
x = 0
T.X = each T scan t
  keep x
  x = T.N - x
  return x

show table "T" a1b5 with T.N, T.X

Any-order blocks

While persisting variables from one line to the next might be needed, the specific ordering might not matter. Envision provides a syntax to deal with those situations as illustrated by:

table Obs = with
  [| as X |]
  [| 42   |]
  [| 41   |]
  [| 45   |]

myMin = 1B
myMax = -(1B)
each Obs scan Obs.*
  keep myMin
  keep myMax
  myMin = min(myMin, Obs.X)
  myMax = max(myMax, Obs.X)

show summary "" a1b2 with myMin, myMax

In the above script, scan Obs.* indicates that the lines are traversed in an arbitrary order.

As a rule of thumb, this feature should be considered as fringe and used sparingly. Indeed, the Envision compiler does not rely on any proof that the ordering does not matter. Hence, if the ordering accidentally does matter, the ambiguity might be resolved in unpredictable ways by the Envision runtime.
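To make the risk concrete, here is a sketch of a block where the ordering does matter and where Obs.* should therefore not be used. The variable lastSeen is a hypothetical name introduced for this illustration.

```envision
table Obs = with
  [| as X |]
  [| 42   |]
  [| 41   |]
  [| 45   |]

lastSeen = 0
each Obs scan Obs.*
  keep lastSeen
  lastSeen = Obs.X // overwritten on every line: only the last visit survives

// The result depends on which line happens to be visited last,
// so it could be 42, 41 or 45.
show scalar "" with lastSeen
```

By contrast, the min and max updates of the previous script give the same result whatever the traversal order, which is what makes Obs.* safe there.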

`each .. when` blocks

Like for .. when blocks, each loops can be filtered. The each .. when block only executes its body on lines where the condition specified by when is true.

table T = extend.range(5)

s = 0
each T scan auto when T.N mod 2 == 1
  keep s
  s = s + T.N

show scalar "odd sum" with s // 9

In the above script, the filter when T.N mod 2 == 1 is applied to every line of the table T. It filters out every line where T.N is even.

The each .. when block cannot return a vector via the return keyword, as the filtered-out lines would have no value. Instead, variables marked as keep must be used to extract information from the iteration.
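As a sketch of this pattern, the script below extracts two pieces of information from a filtered iteration through keep variables: how many lines passed the filter, and the value found on the last of them. The names hits and lastHit are assumptions made for this illustration.

```envision
table T = extend.range(10)

hits = 0    // number of lines that pass the filter
lastHit = 0 // value of T.N on the most recent passing line

each T scan auto when T.N mod 3 == 0
  keep hits
  keep lastHit
  hits = hits + 1
  lastHit = T.N

show summary "" a1b2 with hits, lastHit // 3 and 9
```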

Reasons to use `each`

The following features are available in each loops, but not in for loops, and are thus proper reasons to use each:

Having a keep in an upstream table.
Reading values from upstream-cross tables.

If an each loop is using neither of the above, consider changing it to a for loop.
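For instance, the running-maximum script from the each .. scan section keeps only a scalar and reads only the iteration table, so it uses neither feature. A possible rewrite as a for loop is sketched below, assuming that for .. scan accepts the same Obs.Date ordering as each .. scan; the loop variable Qty is a name chosen for this illustration.

```envision
table Obs = with
  [| as Date,          as Quantity |]
  [| date(2021, 1, 1), 13          |]
  [| date(2021, 2, 1), 11          |]
  [| date(2021, 3, 1), 17          |]
  [| date(2021, 4, 1), 18          |]
  [| date(2021, 5, 1), 16          |]

Best = 0

// Qty receives Obs.Quantity for the current line
Obs.BestSoFar = for Qty in Obs.Quantity scan Obs.Date
  keep Best
  Best = max(Best, Qty)
  return Best

show table "" a1b4 with Obs.Date, Obs.BestSoFar
```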
