by

The by keyword serves several purposes in Envision, typically related to grouping lines of an existing table.

’table T = by (expr)’, table creation

The by statement creates a table by grouping the lines of the target vector into group of identical values.

table Orders = with
  [| as Pid, as OrderDate, as Quantity |]
  [| "apple",  date(2022, 9, 13), 3 |]
  [| "orange", date(2022, 9, 14), 2 |]
  [| "apple",  date(2022, 9, 15), 7 |]
  [| "orange", date(2022, 9, 15), 2 |]
  [| "apple",  date(2022, 9, 16), 7 |]

table Products[product] = by Orders.Pid // new table 'Products'

show table "Products" with 
  product
  sum(Orders.Quantity) into Products

The statement also operates if a tuple is provided.

table Orders = with
  [| as Pid, as OrderDate, as Quantity |]
  [| "apple",  date(2022, 9, 13), 3 |]
  [| "orange", date(2022, 9, 14), 2 |]
  [| "apple",  date(2022, 9, 15), 7 |]
  [| "orange", date(2022, 9, 15), 2 |]
  [| "apple",  date(2022, 9, 16), 7 |]

table Sales[tu] = by [Orders.Pid, Orders.OrderDate] // new table 'Sales'

Sales.Product, Sales.OrderDate = tu

show table "Sales" with 
  Sales.Product
  Sales.OrderDate
  sum(Orders.Quantity) into Sales

By construction, if the new table T is created from table U, then U is upstream of T.

The created table may reuse the name of the source table. After the statement, that name refers to the grouped table, not to the original table anymore.

table Ext = extend.range(5)
Ext.M = Ext.N mod 2

table Ext[m] = by Ext.M

show table "Ext" with
  m

Grouping keys must support equality with good performance: text, number, boolean, date, week, month, flagset, all enums, and the internal types ordinal, double, and long.

Tuple variables can be used as grouping keys. When the tuple is the entire key, use it directly. When mixing it with other keys, unpack it with ....

table T = extend.range(5)
T.M2 = T.N mod 2
T.M3 = T.N mod 3
T.K = (T.M2, T.M3)

table U[tu] = by T.K
U.M2, U.M3 = tu

table V[tu2] = by [...T.K, T.N]
V.M2, V.M3, V.N = tu2

show table "U" with U.M2, U.M3

‘proc(..) by (expr)’, process option

The by option is used to control the grouping used by a process (aggregators being the most common processes in Envision).

table S = with // shipments
  [| as Destination, as Origin, as OrderDate,        as Quantity |]
  [| "USA",         "France",     date(2020, 4, 15), 3           |]
  [| "Germany",     "Spain",      date(2020, 4, 16), 6           |]
  [| "Germany",     "Italy",      date(2020, 4, 16), 2           |]
  [| "USA",         "France",     date(2020, 4, 17), 5           |]
  [| "Germany",     "Spain",      date(2020, 4, 18), 1           |]

// 'by' lets the 'sum' aggregation unfold using 'S.Destination' as the grouping.
// afterward, the results are broadcasted by into the table 'S'.
S.TotalOverTheDestination = sum(S.Quantity) by S.Destination

show table "Shipments" with  S.Destination, S.Origin, S.Quantity, S.TotalOverTheDestination
// Destination,Origin,Quantity,TotalOverTheDestination
// USA,France,3,8
// Germany,Spain,6,9
// Germany,Italy,2,9
// USA,France,5,8
// Germany,Spain,1,9

The at option can be used to supplement the by option, see the by .. at below.

The by option also works inside for and each bodies. In that case, the grouped result is computed for the current iteration and assigned to the target vector.

table Items[id] = with
  [| as id, as X, as Cat |]
  [| "1", 1, "A" |]
  [| "2", 2, "B" |]
  [| "3", 3, "A" |]
  [| "4", 4, "B" |]

expect table Items small 4

Items.B = for X in Items.X
  Items.N = join("\{X}-\{Items.X}"; " ") sort id by Items.Cat
  return join(Items.N; ";") sort id

When by .. at combines enums with text values, the text side is automatically promoted to the enum type, and non-matching text becomes null.

table Items = with
  [| as A, as B, as Letter |]
  [| "K", "K", "A" |]
  [| "I", "I", "B" |]

table enum EA = Items.A
Items.EA = enum<<EA>>(Items.A)
Items.Letter2 = same(Items.Letter) by Items.B at Items.EA

See also

User Contributed Notes
0 notes + add a note