by
The by keyword serves several purposes in Envision, typically related to grouping lines of an existing table.
’table T = by (expr)’, table creation
The by statement creates a table by grouping the lines of the target vector into group of identical values.
table Orders = with
[| as Pid, as OrderDate, as Quantity |]
[| "apple", date(2022, 9, 13), 3 |]
[| "orange", date(2022, 9, 14), 2 |]
[| "apple", date(2022, 9, 15), 7 |]
[| "orange", date(2022, 9, 15), 2 |]
[| "apple", date(2022, 9, 16), 7 |]
table Products[product] = by Orders.Pid // new table 'Products'
show table "Products" with
product
sum(Orders.Quantity) into Products
The statement also operates if a tuple is provided.
table Orders = with
[| as Pid, as OrderDate, as Quantity |]
[| "apple", date(2022, 9, 13), 3 |]
[| "orange", date(2022, 9, 14), 2 |]
[| "apple", date(2022, 9, 15), 7 |]
[| "orange", date(2022, 9, 15), 2 |]
[| "apple", date(2022, 9, 16), 7 |]
table Sales[tu] = by [Orders.Pid, Orders.OrderDate] // new table 'Sales'
Sales.Product, Sales.OrderDate = tu
show table "Sales" with
Sales.Product
Sales.OrderDate
sum(Orders.Quantity) into Sales
By construction, if the new table T is created from table U, then U is upstream of T.
The created table may reuse the name of the source table. After the statement, that name refers to the grouped table, not to the original table anymore.
table Ext = extend.range(5)
Ext.M = Ext.N mod 2
table Ext[m] = by Ext.M
show table "Ext" with
m
Grouping keys must support equality with good performance: text, number, boolean, date, week, month, flagset, all enums, and the internal types ordinal, double, and long.
Tuple variables can be used as grouping keys. When the tuple is the entire key, use it directly. When mixing it with other keys, unpack it with ....
table T = extend.range(5)
T.M2 = T.N mod 2
T.M3 = T.N mod 3
T.K = (T.M2, T.M3)
table U[tu] = by T.K
U.M2, U.M3 = tu
table V[tu2] = by [...T.K, T.N]
V.M2, V.M3, V.N = tu2
show table "U" with U.M2, U.M3
‘proc(..) by (expr)’, process option
The by option is used to control the grouping used by a process (aggregators being the most common processes in Envision).
table S = with // shipments
[| as Destination, as Origin, as OrderDate, as Quantity |]
[| "USA", "France", date(2020, 4, 15), 3 |]
[| "Germany", "Spain", date(2020, 4, 16), 6 |]
[| "Germany", "Italy", date(2020, 4, 16), 2 |]
[| "USA", "France", date(2020, 4, 17), 5 |]
[| "Germany", "Spain", date(2020, 4, 18), 1 |]
// 'by' lets the 'sum' aggregation unfold using 'S.Destination' as the grouping.
// afterward, the results are broadcasted by into the table 'S'.
S.TotalOverTheDestination = sum(S.Quantity) by S.Destination
show table "Shipments" with S.Destination, S.Origin, S.Quantity, S.TotalOverTheDestination
// Destination,Origin,Quantity,TotalOverTheDestination
// USA,France,3,8
// Germany,Spain,6,9
// Germany,Italy,2,9
// USA,France,5,8
// Germany,Spain,1,9
The at option can be used to supplement the by option, see the by .. at below.
The by option also works inside for and each bodies. In that case, the
grouped result is computed for the current iteration and assigned to the target
vector.
table Items[id] = with
[| as id, as X, as Cat |]
[| "1", 1, "A" |]
[| "2", 2, "B" |]
[| "3", 3, "A" |]
[| "4", 4, "B" |]
expect table Items small 4
Items.B = for X in Items.X
Items.N = join("\{X}-\{Items.X}"; " ") sort id by Items.Cat
return join(Items.N; ";") sort id
When by .. at combines enums with text values, the text side is automatically
promoted to the enum type, and non-matching text becomes null.
table Items = with
[| as A, as B, as Letter |]
[| "K", "K", "A" |]
[| "I", "I", "B" |]
table enum EA = Items.A
Items.EA = enum<<EA>>(Items.A)
Items.Letter2 = same(Items.Letter) by Items.B at Items.EA
See also
- Aggregating with an explicit output table
- by .. at
- single by (table creation)
- whichever by (table creation)