Iterating with 'for'

A for loop is a statement used to iterate over the lines of a table, executing the same set of operations once for each iteration. While Envision relies first and foremost on its relational algebra features to implicitly iterate over tables, the for loop is the most commonly used way to express an explicit iteration.

Table of contents

A for loop is written with a header describing the range of iteration and the order in which the range is traversed, and a body describing the operations to be repeated for every line in the range. The body can create output vectors using the return statement, or it can modify variables defined before the loop using the keep statement, or it can do both.

Kept = 0
table T = extend.range(10)
T.Returned = for N in T.N scan T.N
  keep Kept
  Kept = Kept + N
  return Kept

The code example above iterates over the numbers 1 to 10 and computes Kept the total sum of those numbers (55), and T.Returned the cumulative sum of those numbers (1, 3, 6…).

Iteration range with in

The range of iteration is defined by the Variable in T.Variable section of the header. The two names do not need to be the same (A in T.B is allowed), but it is usually recommended that they be kept the same for the sake of readability.

The left variable must be a scalar and the right variable must be a non-scalar vector. The for loop will execute once for every line in the right variable, and the left variable will contain, on every iteration, the corresponding value from the right variable.

The header may define multiple in pairs, separated by commas. The convention is to split them over several lines and align them vertically:

table Numbers = extend.range(10)
Numbers.NSquared = Numbers.N * 2

// Given for example purposes only. This is equivalent to: 
//   Numbers.Result = Numbers.N + Numbers.NSquared

Numbers.Result = for N        in Numbers.N,
                     NSquared in Numbers.NSquared
  return N + NSquared

If multiple pairs are used, all the iterated vectors should be in the same table, which will be the iteration table. If the iterated vectors are in different tables, Envision will attempt to find a common table, broadcast all the vectors into that table, and use that as the iteration table.

Iteration order with scan

The lines of the iteration table are traversed in an order specified using the scan keyword.

This is mandatory if the loop body uses keep to preserve the value of a variable from an iteration to the next (because in that case, changing the order of iteration will also change the result) ; to the contrary, the use of scan is forbidden if the loop body does not use keep, since the order of iteration is irrelevant when every iteration is independent from the others.

The scan columns must be in (or able to broadcast into) the iteration table.

See the documentation of scan for more information on how it is used to express an ordering.

The loop body

Producing outputs with return

The most common way of producing values in a for loop is to compute a scalar value during each iteration, and to combine those values into a vector in the iteration table, where each line of the vector contains the value that was produced in the corresponding table line.

The code example below defines a table of calendar ranges, and for each range, computes a boolean Ranges.Collision that is true if that range intersects any of the other ranges in the table.

table Ranges = with 
  [| date(2010, 01, 03) as Start, date(2010, 10, 12) as End |]
  [| date(2011, 07, 23)         , date(2012, 03, 01)        |]
  [| date(2010, 09, 27)         , date(2011, 05, 31)        |]

Ranges.Collision = for Start in Ranges.Start, 
                       End   in Ranges.End

  // (true, false, false) on the 1st iteration,
  // (false, true, false) on the 2nd,
  // (false, false, true) on the 3rd  
  Ranges.NotSameRange = (Start != Ranges.Start or End != Ranges.End)

  // (true, false, true)  on the 1st iteration,
  // (false, true, false) on the 2nd iteration, 
  // (true, false, true)  on the 3rd iteration
  Ranges.Intersects = max(Start, Ranges.Start) <= min(End, Ranges.End)

  // true  on the 1st iteration,
  // false on the 2nd iteration,
  // true  on the 3rd iteration
  return any(Ranges.NotSameRange and Ranges.Intersects)

show table "Ranges" with
  Ranges.Start
  Ranges.End 
  Ranges.Collision // (true, false, true)

Multiple returns are allowed, separated by commas:

table T = extend.range(10)

T.Log, T.Exp = for N in T.N 
  return (log(N), exp(N))

While this may look similar to the tuple syntax, it’s important to note that these are not tuples! Whereas the components of a tuple all need to be in the same table, multiple return values can be in different tables (in case of a cross-table return).

Modifying values with keep

The body of the for loop can start with any number of keep statements, indicating which variables keep their value from one iteration to the next.

The keep variables must be initialized before the loop. They remain available after the loop, and will contain the value that was assigned to them on the last iteration.

Conversely, inside a for loop, assigning a new value to a variable defined before the loop is forbidden unless that variable is mentioned in a keep. You must decide whether the change is intended to be kept from one iteration to the next (in which case, add a keep statement) or is just temporary (in which case create a new variable instead of changing an existing one).

Already1 = 0
Already2 = 0

for X in T.X scan auto 
  keep Already1

  // Allowed because of 'keep'
  Already1 = Already1 + X

  // Forbidden because of no 'keep'
  Already2 = Already2 + X

  // Allowed because 'NewVar' did not exist above
  NewVar = Already2 + X

Operations allowed in a loop body

Since every operation in the body of a for loop will be repeated once for every line in the iteration table, these operations have an oversized impact on the performance of the script. As such, the design of the Envision language restricts what operations are available in the body of a for loop.

The only statements allowed in a for loop are:

All other kinds of statements are excluded, including where, table, def, read, write, if, show and import.

Roadmap: we intend to support where, if and table comprehension yield statements [| ... |] in the future.

Envision will report uses of forbidden constructs.

The only expressions allowed in a for loop are:

Map expressions

Map operations include:

The example below invokes * (multiplication) on the vector U.N and the scalar N:

table T = extend.range(10)
table U = extend.range(10)
table TU = cross(T, U)
TU.X = for N in T.N 
  return U.N * N

Broadcasting between tables

A scalar can always be broadcast into any table. Non-scalar vectors can only be broadcast from a small table into another small table.

In the example of the previous section, scalar N is broadcast into table U in order to be multiplied by U.N.

In the example below, vector Colors.Score is broadcast into table Catalog:

table Catalog = with 
  [| as Label, as Color |]
  [| "Socks" , "red"    |]
  [| "Socks" , "blue"   |]
  [| "Shirt" , "red"    |]

table Colors[Color] = by Catalog.Color
Colors.Score = if Color == "red" then 2 else 1

Catalog.Total = 0
for C in Colors.Color
  keep Catalog.Total
  Catalog.Score = Colors.Score // Broadcast
  Catalog.Total = if C = Catalog.Color then Catalog.Total else
                  Catalog.Total + Catalog.Score

The secondary dimension used for the broadcast must have been defined outside the loop (and in the case of nested loops, outside the outermost loop) because Envision expects any broadcasting/aggregation/lookup index to be computed before the loop begins.

Lookups

It is possible to lookup from a small table into another small table or into the scalar table.

table Products[product] = with
  [| as product,   as Price |]
  [| "apple",      1.50     |]
  [| "pear",       1.30     |]
  [| "orange",     2.10     |]
  [| "clementine", 2.70     |]
 
table Selection = with
  [| as Choice |]
  [| "pear"    |]
  [| "orange"  |]
  [| "banana"  |] 

table T = extend.range(20)
T.Limit = T.N / 10 // from 0.00 to 2.00 by increments of 0.1

T.Available = for Limit in T.Limit 
  Selection.PriceOfSelection = Products.Price[Selection.Choice] 
  return count(Selection.PriceOfSelection <= Limit)

The key used for the lookup must have been defined outside the loop (and in the case of nested loops, outside the outermost loop) because Envision expects any broadcasting/aggregation/lookup index to be computed before the loop begins.

Aggregating between tables

It is always possible to aggregate from a small table into another small table or into the scalar table.

The example below aggregates using join from table T into the scalar table. This has the effect of visiting all nine pairs of lines from table T (the cartesian product, also known as the cross join in SQL), and at the same time guarantees that the amount of memory necessary to perform the operation is no greater than that needed to store table T.

table T = extend.range(3)
T.Pairs = for N in T.N do 
  return join("\{N},\{T.N}";" ") sort T.N
show table "Pairs" with 
  T.N
  T.Pairs

Output:

N Pairs
1 "1,1 1,2 1,3"
2 "2,1 2,2 2,3"
3 "3,1 3,2 3,3"

The aggregation keys (whether they have been explicitly provided with by and at or are implicit based on a secondary dimension) must have been defined outside the loop (and in the case of nested loops, outside the outermost loop) because Envision expects any broadcasting/aggregation/lookup index to be computed before the loop begins.

Advanced features

Cross-table return

By default, if a for loop iterates over table T and returns a scalar, the returned values will be collected in a vector in table T.

In addition, if a for loop iterates over table T and returns a vector in table U, then it is expected that the script previously defined a cross(T, U) table (not cross(U, T) !), and the returned values will be collected in a vector in that cross-table.

It is possible to return multiple scalars and vectors at the same time.

table T = extend.range(10)
table U = extend.range(15)
table V = extend.range(20)

table TU = cross(T, U)
table TV = cross(T, V)

T.A, TU.B, TV.C = for N in T.N
  A = N
  U.B = N * U.N
  V.C = N * V.N 
  return (A, U.B, V.C)

In this example:

Iteration filtering with when

By default, all lines of the iteration table are traversed. However, it is possible to restrict this to lines where a condition is true.

This condition is specified by adding when Expression after then scan option. The expression must be a scalar boolean that is computed from the in and keep variables:

table Movements = with
  [|  10 as Qty, week(2025, 1) as Time |]
  [| -10       , week(2025, 2)         |]
  [| -30       , week(2025, 3)         |]
  [| -20       , week(2025, 4)         |]
  [| 100       , week(2025, 5)         |]
  [| -30       , week(2025, 6)         |]

// Initial stock
Stock = 40

// Total dispatched
Dispatched = 0

for Qty in Movements.Qty scan Movements.Time when Stock > 0
  keep Stock
  keep Dispatched
  Served = min(Stock, -Qty)
  Dispatched = max(Dispatched, Dispatched + Served)
  Stock = Stock - Served

The above example to computes dispatched stock up to the first point when the stock reaches zero. The condition when Stock > 0 is evaluated on each iteration based on the current value of the Stock variable.

The condition could also depend on variables Dispatch (the other keep variable) or Qty (the iteration variable), but not on variable Served, because it is only defined in the body of the loop.

The use of when is forbidden on a loop that uses return, since in that case it needs to compute and return a value for every line of the iteration table.

Outer-join aggregation

An iteration variable can appear in the at of an aggregation, resulting in the equivalent of a left outer join. In the example below, the tables T and U are joined based on T.Group and U.Group, respectively, by using Group in T.Group and then at Group.

table T = extend.range(5)
table U = extend.range(4)

T.Group = if T.N < 3 then "Low" else 
          if T.N > 3 then "High" else "Middle"
U.Group = if U.N < 3 then "Low" else "High"

T.Pairs = for N in T.N, Group in T.Group 
  return join("\{N},\{U.N}";"  ") sort U.N by U.Group at Group 
                                  default "None"

show table "Pairs" with 
  T.N
  T.Group
  T.Pairs

Output:

N Group Pairs
1 Low "1,1 1,2"
2 Low "2,1 2,2"
3 Middle "None"
4 High "4,3 4,4"
5 High "5,3 5,4"

When the source table is a small table, this flags the table as partial, which will restrict its other uses in the same block.

When the source table is not a small table, in order to maintain a correct level of performance, Envision requires the iteration to be performed by ascending values of the at key:

For example, the script below is broken:

table T = extend.range(5)
table U = extend.range(10m)

T.Group1 = if T.N < 3 then "Low" else
           if T.N > 3 then "High" else "Middle"
T.Group2 = if T.N < 2 then "Low" else
           if T.N > 4 then "High" else "Middle"
U.Group = if U.N < 3 then "Low" else "High"

T.A, T.B = for N in T.N, Group1 in T.Group1, Group2 in T.Group2
  A = join("\{N},\{U.N}";" ") sort U.N by U.Group at Group1
  // Error: Cannot aggregate from non-small table 'U' 
  // because T.Group2 iteration order is incompatible with
  // pre-existing T.Group1 iteration order. 
  B = join("\{N},\{U.N}";" ") sort U.N by U.Group at Group2
  return (A, B)

Partial tables

Under some conditions, a small table can be marked as a partial table by the system. This happens when the system detects that attempting to compute a vector in that table in its entirety, on every iteration of the loop, can lead to significant performance degradation. When a table is marked as partial, it cannot be the destination of aggregations or scans.

There is currently no way to make a partial table non-partial.

User Contributed Notes
0 notes + add a note