# Cross tables

A cross table represents the Cartesian product between two tables. Cross tables are typically intended for situations that involve multiple dimensions (e.g. stores and products, SKUs and days, etc). Source tables can broadcast into the cross table. Cross tables also provide an additional class of vector operations within Envision.

**Table of contents**

## Syntax overview

The keyword `cross`

is used to declare a cross table as illustrated by:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
RS.A = R.N
RS.B = S.N
show table "cross" a1b5 with RS.A, RS.B
```

In the above script, the two assignments of `RS.A`

and `RS.B`

, respectively originating from `R`

and `S`

work because the table `RS`

has `R`

and `S`

as two upstream tables. The table tile displays the 6=2x3 lines, which result from the Cartesian product between `R`

and `S`

.

The right source table within a cross should be a small table. This constraint ensures that large cross tables can be processed efficiently when performing complex operations, such as the ones allowed by the `each`

keyword.

A cross table has exactly two primary dimensions. Thus, the two source tables must be distinct and have one primary dimension each. The scalar table and cross tables cannot be used as source tables.

Every operation that expects two or more vectors in to be in the same table attempts at finding a common table between those vectors. A cross table, once declared, becomes a candidate for being this common table.

For example, a map operation performed between two independent tables operates over the cross table associated to this pair of tables:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
s = sum((1 into R) + (1 into S))
show scalar "sum" with s // 12
```

In the above script, the sum `(1 into R) + (1 into S)`

is broadcasted in the common table between `R`

and `S`

, that is, the cross table `RS`

. The resulting value, `2`

over all the lines of the table `RS`

is finally summed into `s`

, a scalar variable.

The table `RS`

must be declared for the above script to work. Indeed, Envision adopts a defensive design where combining within a single map operation two unrelated tables is illegal - unless the corresponding cross table is explicitly declared. This behavior prevents the accidental, and likely nonsensical, combination of values originating from unrelated tables.

A similar effect is obtained beyond the map operations, for example:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
c = sum(sum(1) by [R.N, S.N]) // RS as common table
show scalar "count" with c // 6
```

Or alternatively:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
show table "cross" a1b6 with R.N, S.N // RS as common table
```

A cross table can be aggregated into its source tables:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
RS.A = R.N
RS.B = S.N
R.C = sum(RS.A + RS.B)
S.D = sum(RS.A + RS.B)
show table "R" a1b5 with R.C
show table "S" c1d5 with S.D
```

In the above script, `R.C`

(resp. `S.D`

) is defined as an aggregation from the table `RS`

to the table `R`

(resp. `S`

).

**Advanced remark**: Cross tables in Envision can be seen as a stepping stone toward an extensive tensor comprehension. The tensor comprehension paradigm is the descendent of the array programming paradigm.

**Roadmap:** Cross tables are limited to two source tables, however we plan to lift this limitation in the future.

### Transposition

It is not possible to declare two cross tables that have the same dimensions *in the same order*. However, as the ordering of the source dimensions matters, it is possible to declare two cross tables that have the same source dimensions, but in a distinct order. It is also possible to broadcast a cross table into its transposed counterpart.

Let’s consider a transposed cross table:

```
table R = extend.range(3)
table S = extend.range(4)
table RS = cross(R, S)
table SR = cross(S, R)
RS.N = (R.N into RS) + S.N
SR.N = (R.N into SR) + S.N
show table "" a1d6 with R.N as "R", S.N as "S", RS.N as "RS", SR.N as "SR"
```

In the above script, the variable `R.N`

is explicitly broadcast first into the table `RS`

and second into the table `SR`

because there are two candidate common tables between `R`

and `S`

: both `RS`

and `SR`

are eligible. The broadcast indicates to Envision which cross table should be picked.

Indeed, the more direct approach does not compile:

```
table R = extend.range(3)
table S = extend.range(4)
table RS = cross(R, S)
table SR = cross(S, R)
RS.N = R.N + S.N // WRONG! Ambiguous common table between RS and SR
```

Simple assignments between a cross table and its transposed cross table are supported. As a result, the previous script can be rewritten as:

```
table R = extend.range(3)
table S = extend.range(4)
table RS = cross(R, S)
table SR = cross(S, R)
RS.N = (R.N into RS) + S.N
SR.N = RS.N // Transpose!
show table "" a1d6 with R.N as "R", S.N as "S", RS.N as "RS", SR.N as "SR"
```

The assignment `SR.N = RS.N`

is a transposition of the values held within the table `RS`

into the table `SR`

.

As data accesses are vastly more efficient whenever lookups happen through the left table (more on this in the following), if a series of operations have to be performed through “right” lookups, it’s better to transpose the cross table, and then operate via “left” lookups afterward.

## Filtering a cross table

Filtering a cross table is possible, however the filtering process itself removes the ‘cross’ property, as the table does not hold any more a complete Cartesian product of the source tables. Thus, filtering a cross table is performed by explicitly introducing the relevant table:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
RS.N = R.N + S.N
table RSf = where RS.N > 3
RSf.N = RS.N
show table "RSf" a1b4 with R.N, S.N, RSf.N // only 3 lines
```

In the above script, the table `RSf`

is defined as a filtered version of the cross table `RS`

. This table inherits the primary dimensions of `RS`

but those dimensions as inherited as secondary dimensions. As both `R`

and `S`

can broadcast into `RSf`

, the table `RSf`

is identified as the common table for the table tile.

## Filtering a source table

Filtering a source table of a cross table transitively applies a filter on the cross table itself. This filtering pattern is of primary interest because it preserves the ‘cross’ nature of the table:

```
table R = extend.range(2)
table S = extend.range(3)
table RS = cross(R, S)
where R.N == 1
RS.A = R.N
RS.B = S.N
show table "cross" a1b5 with RS.A, RS.B
```

In the above script, the source table `R`

is filtered, and as a result, the cross table `RS`

is also filtered. The table tile displays 3 lines instead of the 6 lines that would be expected if the table `RS`

were unfiltered. The table `RS`

is ‘cross’ when declared, and it remains ‘cross’ within the filter.

As filtering according to a source table do preserve the ‘cross’ nature of the cross table, all the operations over the cross table which are allowed *outside* the `where`

block remain allowed.

It is also possible to define a cross table within a filtered context:

```
table R = extend.range(2)
table S = extend.range(3)
where R.N == 1
table RS = cross(R, S)
RS.A = R.N
RS.B = S.N
show table "cross" a1b5 with RS.A, RS.B
```

When a cross table is defined in a filtered context, then this cross table is bound to this context. For example, in the above example, using the table `RS`

outside, after the end of the `where`

block, would not be allowed.

As a filtered declaration of a cross table is strictly scoped to the filtered block, it is possible to define another identically named table after the end of the `where`

block, as illustrated with:

```
table R = extend.range(2)
table S = extend.range(3)
where R.N == 1
table RS = cross(R, S)
RS.A = R.N
RS.B = S.N
show table "cross" a1b5 with RS.A, RS.B
table RS = extend.range(3)
show table "alt" c1d5 with RS.N
```

However, this behavior is specific to filtering blocks that *do impact* the cross table. Indeed, if a cross table is declared within a `where`

block that does not impact any of its source tables, then the scope does not apply, and the cross table can be used after the end of the `where`

block:

```
table R = extend.range(2)
table S = extend.range(3)
table Dummy = extend.range(4)
where Dummy.N == 1
table RS = cross(R, S)
RS.A = R.N
RS.B = S.N
show table "cross" a1b5 with RS.A, RS.B
```

In the above script, the table `RS`

is declared within a filtered scope, but this table is accessed after the end of the scope (i.e. the end of the `where`

block) because the filter does not touch `RS`

or its source tables.

## Transitive broadcasting

Broadcasting into the cross table is possible for any table that can already broadcast into one of the source tables of the broadcast itself. This mechanism is of interest, for example, for situations that involve multiple hierarchical levels. Let’s consider 3 tables `S1`

, `S2`

and `T`

where `T`

extends `S2`

, as a cross table, and where `S2`

extends `S1`

. Let’s see how values originating from `S1`

can be broadcast into `T`

:

```
table R = extend.range(2)
table S1 = extend.range(3)
table S2 = extend.range(S1.N)
table RS2 = cross(R, S2)
RS2.A = R.N
RS2.B = S1.N into S2
show table "cross" a1b5 with RS2.A, RS2.B
```

In the above script, the value `S1.N`

is first broadcast into the table `S2`

via the expression `S1.N into S2`

and then, the resulting expression is itself broadcast into `RS2`

via the assignment `RS2.B = ..`

.

In order to remove the need for expliciting the intermediate broadcast table, i.e. `S2`

in the above script, a dimension can be imported into the cross table itself:

```
table R = extend.range(2)
table S1[s1] = extend.range(3)
table S2 = extend.range(S1.N)
table RS2 = cross(R, S2)
RS2.s1 = S2.s1 // same as 's1 into S2'
RS2.A = R.N
RS2.B = S1.N
show table "cross" a1b5 with RS2.A, RS2.B
```

In the above script, once the dimension `s1`

has been imported in the cross table `RS2`

via `RS2.s1 = s1 into S2`

, there is no longer any need to specify `S2`

as the intermediate table. This allows the assignment `RS2.B = S1.N`

, which involves a direct broadcast from `S1`

to `RS2`

.

## Lookups

Lookups offer dictionary-like access to vectors by specifying the key associated with the table’s primary dimension. Both regular tables and cross tables support lookups. The lookup over a cross table specifies two keys to identify the line within the table:

```
table R[r] = with
[| as r |]
[| 1 |]
[| 2 |]
table S[s] = with
[| as s |]
[| 1 |]
[| 2 |]
[| 3 |]
table RS = cross(R, S)
RS.N = r + s
n = RS.N[r: 2, s: 3]
show scalar "n" with n
```

In the above script, two tables `R`

and `S`

are declared through their respective table comprehensions. Also, `r`

(resp. `s`

) is the primary dimension of the table `R`

(resp. `S`

) and is of type `number`

. A lookup is performed via the expression `RS.N[r: 2, s: 3]`

to extract a single scalar value from the vector `RS.N`

. The `r`

and `s`

prefixes match the names of the source tables’ dimensions `R`

and `S`

of the cross table `RS`

.

The lookup syntax for cross tables is similar to the one used for regular vector lookup. The dimensions are named to lift any ambiguity. While the ordering of the lookup arguments has no impact, It is recommended, for the sake of code clarity, to keep the dimensions in the order of the declaration of the source tables (as done above).

Lookups over vectors are also allowed, as illustrated by:

```
table R[r] = with
[| as r |]
[| 1 |]
[| 2 |]
table S[s] = with
[| as s |]
[| 1 |]
[| 2 |]
[| 3 |]
table RS = cross(R, S)
RS.N = r + s
table U = with
[| as I, as J |]
[| 2, 3 |]
[| 1, 2 |]
[| 0, 1 |] // no matching counterpart
U.N = RS.N[r: U.I, s: U.J]
show table "U" a1b4 with U.I, U.J, U.N
```

In the above script, the lookup `RS.N[r: U.I, s: U.J]`

is performed over the table `U`

, and it attempts to extract one value from the vector `RS.N`

for every line of the table `U`

. When there is no matching line to be found within the table `RS`

, as is the case for the last line of `U`

, then the lookup returns the default value for the type, i.e. 0 (zero) here as the type is `number`

.

In order to prevent unmatched lookups being replaced by default values, the `default fail`

option can be specified. This can be done by replacing the declaration line for `U.N`

by:

```
U.N = RS.N[r: U.I, s: U.J] default fail
```

Once this replacement is done, the above script fails as the last line of the table `U`

cannot be looked-up with the cross table `RS`

.

Lookups are expected to be infrequently needed in Envision. Default vector behaviors should cover most situations. However, lookups offer a greater degree of expressiveness, which can be required to cope with more complex calculations.

**Roadmap:** Lookups are read-only operations; they don’t yet have a *write* counterpart in Envision, although we plan to support such a write counterpart in the future.

### Partial lookups

If we look at a cross table as a matrix, a partial lookup can be seen as the selection of a column or of a row: given a column (resp. a row) identifier, the column vector (resp. the row vector) is obtained. The lookup is said to be “partial” - as in *partially explicit* - because one key is provided explicitly while the other key is provided implicitly:

```
table R[r] = with
[| as r |]
[| 1 |]
[| 2 |]
table S[s] = with
[| as s |]
[| 1 |]
[| 2 |]
[| 3 |]
table RS = cross(R, S)
RS.N = r + s
R.N = RS.N[s: 3]
S.N = RS.N[r: 2]
show table "R" a1b5 with R.N
show table "S" c1d5 with S.N
```

In the above script, the table `T`

is a cross between the tables `R`

and `S`

. The vector `R.N`

(resp. the vector `S.N`

) is obtained by performing a partial lookup over a line within the `S`

table (resp. the `R`

table). The explicit key belongs to the scalar table while the implicit one belongs to the table `R`

(resp. the `S`

table). As a result, the output of the lookup belongs to the table `R`

(resp. the table `S`

).

The common table used for the pairs of key can be expressed explicitly with a slightly more verbose version of the same script:

```
table R[r] = with
[| as r |]
[| 1 |]
[| 2 |]
table S[s] = with
[| as s |]
[| 1 |]
[| 2 |]
[| 3 |]
table RS = cross(R, S)
RS.N = r + s
R.N = RS.N[s: 3 into R]
S.N = RS.N[r: 2 into S]
show table "R" a1b5 with R.N
show table "S" c1d5 with S.N
```

From another perspective, we could say that the partial lookup is “underspecified”, and returns all the lines that match the constraint expressed by the key.

Performance-wise, a partial lookup over the right dimension is expected to be more efficient than one over the left dimension. In the above example, the calculation of `S.N`

(a left partial lookup) is structurally more efficient than the one of `R.N`

(a right partial lookup). Unless there is some structural twist involved (e.g. a shift of dimension), the right partial lookup usually involves a matrix transposition operation, while the left partial lookup does not.

**Advanced remark:** The Envision runtime implementation of cross tables can be expected to loosely similar to multidimensional arrays which become quite popular through the `ndarray`

object as found in the open source library NumPy. The left partial lookup is more efficient than the right one because it better preserves the locality of the data accesses.