process

The contextual keyword process specifies the type of a user-defined function.

‘def process funname (params) with’, process definition

The modifier process indicates that the function processes its vector arguments sequentially while maintaining an internal state.

Call options such as by, at, sort, and scan are only available on process calls; pure functions do not accept call options. An option such as by partitions the work by group, while scan enforces sequential evaluation.

table T = with
  [| as N, as X |]
  [| 0,    1    |]
  [| 1,    2    |]
  [| 2,    -1   |]

def process sumOfSquares(x : number) with
  keep sum = 0
  sum = sum + x * x
  return sum

T.CumulativeSum = sumOfSquares(T.X) scan T.N
total = sumOfSquares(T.X) sort T.N

show table "" with T.N, T.X, T.CumulativeSum

// 6
show scalar "" with total

The first show statement results in the following table:

N    X    CumulativeSum
0    1    1
1    2    5
2    -1   6

In our example, the keywords used to invoke sumOfSquares are scan and sort. The scan keyword feeds sumOfSquares with T.X values according to the order of T.N, namely: first 1, then 2, and finally -1. As a result, we obtain a vector containing all the steps of computation:

  1. sum = 0 + 1 * 1 = 1
  2. sum = 1 + 2 * 2 = 5
  3. sum = 5 + (-1) * (-1) = 6

At each step, the internal state sum (introduced by the keep keyword) has a definite value: it is 0 at the beginning, 1 after the first step, 5 after the second step, and 6 after the third step.

Following the explanation above, the sort keyword acts in the same way as scan, but instead of returning a vector of results, it only returns the final value of sum, which is 6. Therefore, the total variable is not a vector but a mere scalar, which we print with show scalar.
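
For readers more familiar with general-purpose languages, the difference between scan and sort can be modeled roughly in Python. This is only an illustrative sketch and not Envision code: the helper names run_scan and run_sort are hypothetical, and the model assumes that lines are processed in ascending order of N.

```python
# Illustrative Python model of the example above (not Envision code).
# run_scan and run_sort are hypothetical helper names.

def sum_of_squares_step(state, x):
    """One step of the process: update the kept state with x."""
    return state + x * x

def run_scan(rows, order_key, value_key):
    """Model of 'scan': one value per row, listed in processing order."""
    state = 0  # plays the role of 'keep sum = 0'
    results = []
    for row in sorted(rows, key=lambda r: r[order_key]):
        state = sum_of_squares_step(state, row[value_key])
        results.append(state)
    return results

def run_sort(rows, order_key, value_key):
    """Model of 'sort': same traversal, but only the final state is returned."""
    state = 0
    for row in sorted(rows, key=lambda r: r[order_key]):
        state = sum_of_squares_step(state, row[value_key])
    return state

T = [{"N": 0, "X": 1}, {"N": 1, "X": 2}, {"N": 2, "X": -1}]
cumulative = run_scan(T, "N", "X")  # [1, 5, 6]
total = run_sort(T, "N", "X")       # 6
```

In this model, both helpers walk the rows in the same order; the only difference is whether every intermediate state or just the last one is kept.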

Every process call must specify an order with sort or scan (including scan auto). The compiler enforces ordering even if the process logic is commutative.

table T = extend.range(3)

def process runningCount(x : number) with
  keep c = 0
  c = c + 1
  return c

T.C = runningCount(T.N) scan auto

show table "" with T.N, T.C

Vector arguments can be comma-separated. For example, if we wanted to sum the squares of two numbers, we would define the function as follows:

table T = with
  [| as N, as X, as Y |]
  [| 0,    1,    4    |]
  [| 1,    2,    5    |]
  [| 2,    3,    6    |]

def process sumOfSquares(x : number, y : number) with
  keep sum = 0
  sum = sum + x * x + y * y
  return sum

total = sumOfSquares(T.X, T.Y) sort T.N

// 91
show scalar "" with total

The computation would then proceed as follows:

  1. sum = 0 + 1 * 1 + 4 * 4 = 17
  2. sum = 17 + 2 * 2 + 5 * 5 = 46
  3. sum = 46 + 3 * 3 + 6 * 6 = 91

When a by clause is specified, the process state is initialized once per group and reset between groups. With scan, the process emits a value for each line in each group.

table T = with
  [| as Group, as N |]
  [| "A", 1 |]
  [| "A", 2 |]
  [| "B", 3 |]

def process runningSum(x : number) with
  keep s = 0
  s = s + x
  return s

T.S = runningSum(T.N) by T.Group scan T.N

show table "" with T.Group, T.N, T.S
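
The per-group reset described above can be modeled in Python as follows. This is a rough sketch rather than Envision code: scan_by_group is a hypothetical helper, and it assumes that each group starts from the initial state 0 and that lines are visited in ascending order of N.

```python
# Illustrative Python model of 'by T.Group scan T.N' (not Envision code).
# scan_by_group is a hypothetical helper name.

def scan_by_group(rows, group_key, order_key, value_key):
    """Running sum with one independent state per group.

    Returns one value per row, in the original row order.
    """
    results = [None] * len(rows)
    states = {}  # group -> current state; a missing entry means 'keep s = 0'
    order = sorted(range(len(rows)), key=lambda i: rows[i][order_key])
    for i in order:
        group = rows[i][group_key]
        state = states.get(group, 0) + rows[i][value_key]
        states[group] = state
        results[i] = state
    return results

T = [
    {"Group": "A", "N": 1},
    {"Group": "A", "N": 2},
    {"Group": "B", "N": 3},
]
S = scan_by_group(T, "Group", "N", "N")  # [1, 3, 3]
```

Under these assumptions, group A accumulates 1 and then 3, while group B starts again from 0 and yields 3.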

If we wanted to initialize sum with a specific value, we would pass it as an additional argument after a semicolon (;):

table T = with
  [| as N, as X |]
  [| 0,    1    |]
  [| 1,    2    |]
  [| 2,    -1   |]

def process sumOfSquares(x : number; seed : number) with
  keep sum = seed
  sum = sum + x * x
  return sum

total = sumOfSquares(T.X; 5) sort T.N

// 11
show scalar "" with total

This will be computed as follows:

  1. sum = 5 + 1 * 1 = 6
  2. sum = 6 + 2 * 2 = 10
  3. sum = 10 + (-1) * (-1) = 11

Arguments passed after the semicolon can also be vectors of the grouped table. Such group arguments are aligned with the grouped table and remain constant for the duration of a group.

table Items = with
  [| as Product, as Color, as Sep |]
  [| "shirt", "pink", ", " |]
  [| "shirt", "white", ", " |]
  [| "socks", "green", " - " |]
  [| "socks", "yellow", " - " |]

table Products[Product] = by Items.Product
Products.Sep = same(Items.Sep)

def process joinColors(c : text; sep : text) with
  keep acc = ""
  if acc == ""
    acc = c
  else
    acc = "\{acc}\{sep}\{c}"
  return acc

Products.Colors =
  joinColors(Items.Color; Products.Sep)
  sort Items.Color

show table "" with Product, Products.Colors
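
The behavior of a group argument can be modeled in Python as below. This is a sketch under stated assumptions, not Envision code: join_colors is a hypothetical helper, the separators dictionary stands in for Products.Sep, and colors are assumed to be visited in ascending text order within each product.

```python
# Illustrative Python model of joinColors with a per-group separator
# (not Envision code). join_colors is a hypothetical helper name.

def join_colors(items, separators):
    """Concatenate colors per product, using that product's separator."""
    acc = {}  # product -> accumulated text; missing entry means 'keep acc = ""'
    for item in sorted(items, key=lambda it: it["Color"]):
        product = item["Product"]
        sep = separators[product]  # constant for the whole group
        current = acc.get(product, "")
        if current == "":
            acc[product] = item["Color"]
        else:
            acc[product] = f"{current}{sep}{item['Color']}"
    return acc

items = [
    {"Product": "shirt", "Color": "pink"},
    {"Product": "shirt", "Color": "white"},
    {"Product": "socks", "Color": "green"},
    {"Product": "socks", "Color": "yellow"},
]
separators = {"shirt": ", ", "socks": " - "}  # stands in for Products.Sep
colors = join_colors(items, separators)
# {'shirt': 'pink, white', 'socks': 'green - yellow'}
```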

Multiple vector arguments and multiple initialization arguments can be combined, as the following example demonstrates:

table T = with
  [| as N, as X, as Y |]
  [| 0,    1,    4    |]
  [| 1,    2,    5    |]
  [| 2,    3,    6    |]

def process sumOfSquares(x : number, y : number; seedX : number, seedY : number) with
  keep sum = seedX + seedY
  sum = sum + x * x + y * y
  return sum

total = sumOfSquares(T.X, T.Y; 5, 10) sort T.N

// 106
show scalar "" with total

The computation proceeds similarly to the previous examples:

  1. sum = 5 + 10 + 1 * 1 + 4 * 4 = 32
  2. sum = 32 + 2 * 2 + 5 * 5 = 61
  3. sum = 61 + 3 * 3 + 6 * 6 = 106
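
The arithmetic above can be double-checked with a small Python model. It is an illustrative sketch, not Envision code: the vector arguments are modeled as lists already sorted by N, and the arguments after the semicolon are modeled as plain parameters.

```python
# Illustrative Python model of sumOfSquares(T.X, T.Y; 5, 10) sort T.N
# (not Envision code).

def sum_of_squares(xs, ys, seed_x=0, seed_y=0):
    """Fold two vectors into a single sum of squares, starting from the seeds."""
    total = seed_x + seed_y  # plays the role of 'keep sum = seedX + seedY'
    for x, y in zip(xs, ys):
        total = total + x * x + y * y
    return total

X = [1, 2, 3]
Y = [4, 5, 6]
seeded = sum_of_squares(X, Y, 5, 10)  # 106
unseeded = sum_of_squares(X, Y)       # 91, as in the earlier two-vector example
```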

Finally, what happens if the vector arguments are empty? In this scenario, the process will return the default value of the data type. For example:

table T = with
  [| as N, as X |]
  [| 0,    1    |]
  [| 1,    2    |]
  [| 2,    -1   |]

def process sumOfSquares(x : number) with
  keep sum = 0
  sum = sum + x * x
  return sum

where T.X > 100
  total = sumOfSquares(T.X) sort T.N

// 0
show scalar "" with total

Since the filter where T.X > 100 excludes every line of T, the process receives no input and returns 0, the default value of the number type.

However, it is also possible to specify what the default return value must be. In the following example, the default return value is explicitly set to 42:

table T = with
  [| as N, as X |]
  [| 0,    1    |]
  [| 1,    2    |]
  [| 2,    -1   |]

def process sumOfSquares(x : number) default 42 with
  keep sum = 0
  sum = sum + x * x
  return sum

where T.X > 100
  total = sumOfSquares(T.X) sort T.N

// 42
show scalar "" with total
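
The empty-input behavior of the last two examples can be summarized with a short Python model. This is a sketch and not Envision code: run_sort is a hypothetical helper, and its default parameter stands in for the default clause of the process definition.

```python
# Illustrative Python model of the empty-input case (not Envision code).
# run_sort is a hypothetical helper name.

def run_sort(values, default=0):
    """Return the final state, or the default when there is nothing to process."""
    if not values:
        return default  # 0 models the default value of the number type
    state = 0
    for x in values:
        state = state + x * x
    return state

X = [1, 2, -1]
filtered = [x for x in X if x > 100]  # models 'where T.X > 100': no line remains

implicit_default = run_sort(filtered)              # 0
explicit_default = run_sort(filtered, default=42)  # 42
```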

All keep statements, including keep process, must appear before any non-keep statement in a process body.

Process instances

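A process instance, declared with keep process, separates state access from state updates. In the example below, reading acc accesses the current state of the instance, while calling acc(x) updates it with a new value.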

table T = with
  [| as N |]
  [| 2 |]
  [| 4 |]
  [| 8 |]

def process sumPlusOne(x : number) with
  keep process acc = sum(number)
  state = acc + 1
  updated = acc(x)
  return (state, updated)

T.State, T.Updated = sumPlusOne(T.N) scan T.N

show table "" with T.N, T.State, T.Updated
