extend.split

extend.split, function

def table extend.split(source: text, separator: text): table
def table extend.split(source: text, separator: text, unit: text): table

Splits source on one or more separator values and returns a table of tokens, each with its starting position in source. When unit is provided, a token that ends in a unit, with that unit preceded by a digit, is split further into a number token and a unit token.

Examples

table Phrases = with
  [| as Text |]
  [| "blue|berry" |]
  [| "fast red" |]

table Seps = with
  [| as Sep |]
  [| "|" |]
  [| " " |]

table Tokens max 1000 = extend.split(Phrases.Text, Seps.Sep)

show table "Tokens" with
  Phrases.Text
  Tokens.Token
  Tokens.Position

This produces the following table:

Text        Token  Position
blue|berry  blue   0
blue|berry  berry  5
fast red    fast   0
fast red    red    5

With the unit argument, tokens such as 12kg are split into a number and a unit:

table Phrases = with
  [| as Text |]
  [| "12kg" |]
  [| "5cm" |]

table Seps = with
  [| as Sep |]
  [| " " |]

table Units = with
  [| as Unit |]
  [| "kg" |]
  [| "cm" |]

table Tokens max 1000 = extend.split(Phrases.Text, Seps.Sep, Units.Unit)

show table "Units" with
  Phrases.Text
  Tokens.Token
  Tokens.Position

This produces the following table:

Text  Token  Position
12kg  12     0
12kg  kg     2
5cm   5      0
5cm   cm     1
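
Separators and units can also be combined in one call. The sketch below follows the same pattern as the two examples above; the phrase "3kg flour" and the tile name "Mixed" are illustrative choices, not taken from the original examples. Going by the description of the unit argument, 3kg should be split into 3 and kg, while flour has no unit suffix and should stay whole.

```envision
// Illustrative input: one phrase mixing a number+unit token and a plain word.
table Phrases = with
  [| as Text |]
  [| "3kg flour" |]

// A single space separator splits the phrase into "3kg" and "flour".
table Seps = with
  [| as Sep |]
  [| " " |]

// "kg" is the only unit; "3kg" ends in it and has a digit before it.
table Units = with
  [| as Unit |]
  [| "kg" |]

table Tokens max 1000 = extend.split(Phrases.Text, Seps.Sep, Units.Unit)

show table "Mixed" with
  Phrases.Text
  Tokens.Token
  Tokens.Position
```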

Remarks

Tables produced by extend.split default to a maximum size of 100 million lines unless an explicit max constraint is provided.

Errors

extend.split fails when any separator value is empty, when the separator table is empty, or when the separator table or the unit table exceeds 100 rows.
