# Iterating with 'for'

A `for`

loop is a statement used to iterate over the lines of a table, executing the same set of operations once for each iteration. While Envision relies first and foremost on its relational algebra features to implicitly iterate over tables, the `for`

loop is the most commonly used way to express an explicit iteration.

**Table of contents**

A `for`

loop is written with a **header** describing the range of iteration and the order in which the range is traversed, and a **body** describing the operations to be repeated for every line in the range. The body can create output vectors using the `return`

statement, or it can modify variables defined before the loop using the `keep`

statement, or it can do both.

```
Kept = 0
table T = extend.range(10)
T.Returned = for N in T.N scan T.N
keep Kept
Kept = Kept + N
return Kept
```

The code example above iterates over the numbers 1 to 10 and computes `Kept`

the total sum of those numbers (55), and `T.Returned`

the cumulative sum of those numbers (1, 3, 6…).

## Iteration range with `in`

The range of iteration is defined by the `Variable in T.Variable`

section of the header. The two names do not need to be the same (`A in T.B`

is allowed), but it is usually recommended that they be kept the same for the sake of readability.

The left variable must be a scalar and the right variable must be a non-scalar vector. The `for`

loop will execute once for every line in the right variable, and the left variable will contain, on every iteration, the corresponding value from the right variable.

The header may define multiple `in`

pairs, separated by commas. The convention is to split them over several lines and align them vertically:

```
table Numbers = extend.range(10)
Numbers.NSquared = Numbers.N * 2
// Given for example purposes only. This is equivalent to:
// Numbers.Result = Numbers.N + Numbers.NSquared
Numbers.Result = for N in Numbers.N,
NSquared in Numbers.NSquared
return N + NSquared
```

If multiple pairs are used, all the iterated vectors should be in the same table, which will be the **iteration table**. If the iterated vectors are in different tables, Envision will attempt to find a common table, broadcast all the vectors into that table, and use that as the iteration table.

### Iteration order with `scan`

The lines of the iteration table are traversed in an order specified using the `scan`

keyword.

This is *mandatory* if the loop body uses `keep`

to preserve the value of a variable from an iteration to the next (because in that case, changing the order of iteration will also change the result) ; to the contrary, the use of `scan`

is *forbidden* if the loop body does not use `keep`

, since the order of iteration is irrelevant when every iteration is independent from the others.

The `scan`

columns must be in (or able to broadcast into) the iteration table.

See the documentation of `scan`

for more information on how it is used to express an ordering.

## The loop body

### Producing outputs with `return`

The most common way of producing values in a `for`

loop is to compute a scalar value during each iteration, and to combine those values into a vector in the iteration table, where each line of the vector contains the value that was produced in the corresponding table line.

The code example below defines a table of calendar ranges, and for each range, computes a boolean `Ranges.Collision`

that is true if that range intersects any of the other ranges in the table.

```
table Ranges = with
[| date(2010, 01, 03) as Start, date(2010, 10, 12) as End |]
[| date(2011, 07, 23) , date(2012, 03, 01) |]
[| date(2010, 09, 27) , date(2011, 05, 31) |]
Ranges.Collision = for Start in Ranges.Start,
End in Ranges.End
// (true, false, false) on the 1st iteration,
// (false, true, false) on the 2nd,
// (false, false, true) on the 3rd
Ranges.NotSameRange = (Start != Ranges.Start or End != Ranges.End)
// (true, false, true) on the 1st iteration,
// (false, true, false) on the 2nd iteration,
// (true, false, true) on the 3rd iteration
Ranges.Intersects = max(Start, Ranges.Start) <= min(End, Ranges.End)
// true on the 1st iteration,
// false on the 2nd iteration,
// true on the 3rd iteration
return any(Ranges.NotSameRange and Ranges.Intersects)
show table "Ranges" with
Ranges.Start
Ranges.End
Ranges.Collision // (true, false, true)
```

Multiple returns are allowed, separated by commas:

```
table T = extend.range(10)
T.Log, T.Exp = for N in T.N
return (log(N), exp(N))
```

While this may look similar to the tuple syntax, it’s important to note that these are *not* tuples! Whereas the components of a tuple all need to be in the same table, multiple `return`

values can be in different tables (in case of a cross-table return).

### Modifying values with `keep`

The body of the `for`

loop can start with any number of `keep`

statements, indicating which variables keep their value from one iteration to the next.

The `keep`

variables must be initialized before the loop. They remain available after the loop, and will contain the value that was assigned to them on the last iteration.

Conversely, inside a `for`

loop, assigning a new value to a variable defined before the loop is forbidden unless that variable is mentioned in a `keep`

. You must decide whether the change is intended to be kept from one iteration to the next (in which case, add a `keep`

statement) or is just temporary (in which case create a new variable instead of changing an existing one).

```
Already1 = 0
Already2 = 0
for X in T.X scan auto
keep Already1
// Allowed because of 'keep'
Already1 = Already1 + X
// Forbidden because of no 'keep'
Already2 = Already2 + X
// Allowed because 'NewVar' did not exist above
NewVar = Already2 + X
```

### Operations allowed in a loop body

Since every operation in the body of a `for`

loop will be repeated once for every line in the iteration table, these operations have an oversized impact on the performance of the script. As such, the design of the Envision language restricts what operations are available in the body of a `for`

loop.

The only statements allowed in a `for`

loop are:

- inner loops (
`each`

,`for`

,`loop`

or`montecarlo`

), - assignment statements
`Variable = Expression`

, including the scoping`with`

assignment, - the
`keep`

and`return`

statements documented above, - the
`delete`

statement, - the
`expect table`

statement to constraint table size (though it is recommended to apply it before the loop for clarity)

All other kinds of statements are excluded, including `where`

, `table`

, `def`

, `read`

, `write`

, `if`

, `show`

and `import`

.

*Roadmap*: we intend to support `where`

, `if`

and table comprehension yield statements `[| ... |]`

in the future.

Envision will report uses of forbidden constructs.

The only expressions allowed in a `for`

loop are:

#### Map expressions

Map operations include:

- Calling a pure function
- Using a conditional expression or match
- Converting between enum and text
- Concatenation and text interpolation
- Constructing and deconstructing tuples

The example below invokes `*`

(multiplication) on the vector `U.N`

and the scalar `N`

:

```
table T = extend.range(10)
table U = extend.range(10)
table TU = cross(T, U)
TU.X = for N in T.N
return U.N * N
```

#### Broadcasting between tables

A scalar can always be broadcast into any table. Non-scalar vectors can only be broadcast from a small table into another small table.

In the example of the previous section, scalar `N`

is broadcast into table `U`

in order to be multiplied by `U.N`

.

In the example below, vector `Colors.Score`

is broadcast into table `Catalog`

:

```
table Catalog = with
[| as Label, as Color |]
[| "Socks" , "red" |]
[| "Socks" , "blue" |]
[| "Shirt" , "red" |]
table Colors[Color] = by Catalog.Color
Colors.Score = if Color == "red" then 2 else 1
Catalog.Total = 0
for C in Colors.Color
keep Catalog.Total
Catalog.Score = Colors.Score // Broadcast
Catalog.Total = if C = Catalog.Color then Catalog.Total else
Catalog.Total + Catalog.Score
```

The secondary dimension used for the broadcast must have been defined *outside the loop* (and in the case of nested loops, outside the *outermost loop*) because Envision expects any broadcasting/aggregation/lookup index to be computed before the loop begins.

#### Lookups

It is possible to lookup from a small table into another small table or into the scalar table.

```
table Products[product] = with
[| as product, as Price |]
[| "apple", 1.50 |]
[| "pear", 1.30 |]
[| "orange", 2.10 |]
[| "clementine", 2.70 |]
table Selection = with
[| as Choice |]
[| "pear" |]
[| "orange" |]
[| "banana" |]
table T = extend.range(20)
T.Limit = T.N / 10 // from 0.00 to 2.00 by increments of 0.1
T.Available = for Limit in T.Limit
Selection.PriceOfSelection = Products.Price[Selection.Choice]
return count(Selection.PriceOfSelection <= Limit)
```

The key used for the lookup must have been defined *outside the loop* (and in the case of nested loops, outside the *outermost loop*) because Envision expects any broadcasting/aggregation/lookup index to be computed before the loop begins.

#### Aggregating between tables

It is always possible to aggregate from a small table into another small table or into the scalar table.

The example below aggregates using `join`

from table `T`

into the scalar table. This has the effect of visiting all nine pairs of lines from table `T`

(the cartesian product, also known as the cross join in SQL), and at the same time guarantees that the amount of memory necessary to perform the operation is no greater than that needed to store table `T`

.

```
table T = extend.range(3)
T.Pairs = for N in T.N do
return join("\{N},\{T.N}";" ") sort T.N
show table "Pairs" with
T.N
T.Pairs
```

Output:

N | Pairs |
---|---|

1 | `"1,1 1,2 1,3"` |

2 | `"2,1 2,2 2,3"` |

3 | `"3,1 3,2 3,3"` |

The aggregation keys (whether they have been explicitly provided with `by`

and `at`

or are implicit based on a secondary dimension) must have been defined *outside the loop* (and in the case of nested loops, outside the *outermost loop*) because Envision expects any broadcasting/aggregation/lookup index to be computed before the loop begins.

## Advanced features

### Cross-table `return`

By default, if a `for`

loop iterates over table `T`

and returns a scalar, the returned values will be collected in a vector in table `T`

.

In addition, if a `for`

loop iterates over table `T`

and returns a vector in table `U`

, then it is expected that the script previously defined a `cross(T, U)`

table (not `cross(U, T)`

!), and the returned values will be collected in a vector in that cross-table.

It is possible to return multiple scalars and vectors at the same time.

```
table T = extend.range(10)
table U = extend.range(15)
table V = extend.range(20)
table TU = cross(T, U)
table TV = cross(T, V)
T.A, TU.B, TV.C = for N in T.N
A = N
U.B = N * U.N
V.C = N * V.N
return (A, U.B, V.C)
```

In this example:

- Vector
`T.A`

contains value (1, 2, .. 10). - Vector
`TU.B`

contains, for the first line of`T`

the range (1, 2, .. 15), for the second line the range (2, 4, .. 30), and so on. - Vector
`TV.C`

contains, for the first line of`T`

the range (1, 2, .. 20), for the second line the range (2, 4, .. 40), and so on.

### Iteration filtering with `when`

By default, all lines of the iteration table are traversed. However, it is possible to restrict this to lines where a condition is true.

This condition is specified by adding `when Expression`

after then `scan`

option. The expression must be a scalar boolean that is computed from the `in`

and `keep`

variables:

```
table Movements = with
[| 10 as Qty, week(2025, 1) as Time |]
[| -10 , week(2025, 2) |]
[| -30 , week(2025, 3) |]
[| -20 , week(2025, 4) |]
[| 100 , week(2025, 5) |]
[| -30 , week(2025, 6) |]
// Initial stock
Stock = 40
// Total dispatched
Dispatched = 0
for Qty in Movements.Qty scan Movements.Time when Stock > 0
keep Stock
keep Dispatched
Served = min(Stock, -Qty)
Dispatched = max(Dispatched, Dispatched + Served)
Stock = Stock - Served
```

The above example to computes dispatched stock up to the first point when the stock reaches zero. The condition `when Stock > 0`

is evaluated on each iteration based on the current value of the `Stock`

variable.

The condition could also depend on variables `Dispatch`

(the other `keep`

variable) or `Qty`

(the iteration variable), but not on variable `Served`

, because it is only defined in the body of the loop.

The use of `when`

is forbidden on a loop that uses `return`

, since in that case it needs to compute and return a value for every line of the iteration table.

### Outer-join aggregation

An iteration variable can appear in the `at`

of an aggregation, resulting in the equivalent of a left outer join. In the example below, the tables `T`

and `U`

are joined based on `T.Group`

and `U.Group`

, respectively, by using `Group in T.Group`

and then `at Group`

.

```
table T = extend.range(5)
table U = extend.range(4)
T.Group = if T.N < 3 then "Low" else
if T.N > 3 then "High" else "Middle"
U.Group = if U.N < 3 then "Low" else "High"
T.Pairs = for N in T.N, Group in T.Group
return join("\{N},\{U.N}";" ") sort U.N by U.Group at Group
default "None"
show table "Pairs" with
T.N
T.Group
T.Pairs
```

Output:

N | Group | Pairs |
---|---|---|

1 | Low | `"1,1 1,2"` |

2 | Low | `"2,1 2,2"` |

3 | Middle | `"None"` |

4 | High | `"4,3 4,4"` |

5 | High | `"5,3 5,4"` |

When the source table is a small table, this works without additional conditions.

When the source table is *not* a small table, in order to maintain a correct level of performance, Envision requires the iteration to be performed by ascending values of the `at`

key:

- If multiple outer-join aggregations (over non-small tables) are performed in the same loop, they should all use the same
`at`

key. - If the loop has a
`scan`

option, the`at`

should be equal to the scan keys (or be a prefix of the scan keys).

For example, the script below is broken:

```
table T = extend.range(5)
table U = extend.range(10m)
T.Group1 = if T.N < 3 then "Low" else
if T.N > 3 then "High" else "Middle"
T.Group2 = if T.N < 2 then "Low" else
if T.N > 4 then "High" else "Middle"
U.Group = if U.N < 3 then "Low" else "High"
T.A, T.B = for N in T.N, Group1 in T.Group1, Group2 in T.Group2
A = join("\{N},\{U.N}";" ") sort U.N by U.Group at Group1
// Error: Cannot aggregate from non-small table 'U'
// because T.Group2 iteration order is incompatible with
// pre-existing T.Group1 iteration order.
B = join("\{N},\{U.N}";" ") sort U.N by U.Group at Group2
return (A, B)
```