schema
schema, keyword
Schemas bind a list of typed fields to a name or to a file path. They reduce repetition and keep file formats consistent across scripts. They can also be used to shape the output of union tables.
Path schemas
A path schema binds a file path to a list of fields. The path literal uses single quotes.
schema '/sample/products.csv' with
  Product : text
  Color : text
  Price : number

table Products = with
  [| as Product, as Color, as Price |]
  [| "shirt", "white,grey", 10.50 |]
  [| "pants", "blue", 15.00 |]
  [| "hat", "red", 5.25 |]

write Products as '/sample/products.csv'
The same schema can be used on the read side:
schema '/sample/products.csv' with
  Product : text
  Color : text
  Price : number

read '/sample/products.csv' as Products

show table "My Products" with
  Products.Product
  Products.Color
  Products.Price
Field documentation
Triple-slash comments are attached to fields and surfaced in the editor:
schema '/sample/products.csv' with
  /// The product identifier.
  Product : text
  /// The 3-letter color code.
  Color : text
  /// The VAT-included unit price.
  Price : number
Field renaming
Use read("Column Name") to bind a schema field to a raw column name:
schema '/sample/products.csv' with
  VAT : number = read("value added tax")

table Products = with
  [| as VAT |]
  [| 0.2 |]

write Products as '/sample/products.csv'
Field rebinding on write
Rebinding overrides the default binding for a path schema:
schema '/sample/products.csv' with
  Product : text
  Color : text
  Size : text
  Price : number

table Products = with
  [| as Product, as Color, as Price |]
  [| "shirt", "white,grey", 10.50 |]
  [| "pants", "blue", 15.00 |]

write Products as '/sample/products.csv' with
  Size = "XL"
  Color = uppercase(Products.Color)
Path schemas do not support isolated rebinding on read; use a named schema for read-side rebinding.
Named schemas
Named schemas define field lists without a path. They can be embedded inside path schemas or used directly in read and write blocks.
schema Products with
  Product : text
  Color : text
  Price : number

schema '/sample/products.csv' with
  schema Products
Named schemas can be used stand-alone:
schema Products with
  Product : text
  Color : text
  Price : number

read "/sample/products.csv" as Products with
  schema Products
Named schemas are incomplete by default: read/write blocks may add extra fields.
schema Products with
  Product : text
  Color : text

read "/sample/products.csv" as Products with
  schema Products
  Price : number
Composing schemas
Schemas can include other schemas. Duplicate field names are rejected.
schema JustProduct with
  Product : text

schema JustColor with
  Color : text

schema Products with
  schema JustProduct
  schema JustColor
  Price : number
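The rejection of duplicate field names can be illustrated with a minimal sketch. The schema names ColorOnly and ColorAndSize are hypothetical, and the exact compiler message is not given here; the only claim, taken from the rule above, is that the composition is expected to be rejected because Color would be declared twice.

```envision
// Hypothetical schemas: both declare a field named Color.
schema ColorOnly with
  Color : text

schema ColorAndSize with
  Color : text
  Size : text

// Expected to be rejected: Color is provided by both included schemas.
schema Products with
  schema ColorOnly
  schema ColorAndSize
```

Renaming one of the two fields, or dropping one of the inclusions, should presumably remove the conflict.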
Field rebinding on read
Named schemas can be rebound inside a read block:
schema PartialProducts with
  ProductId : text
  Color : text
  Size : text

read "/sample/products.csv" as Products with
  schema PartialProducts with
    ProductId = read("Product")
    Size = "extra large"
  Price : number
Field rebinding on write
Named schemas can also be rebound inside a write block:
schema PartialProducts with
  Product : text
  Color : text

table Products = with
  [| as Name, as Color, as Price |]
  [| "shirt", "white,grey", 10.50 |]

write Products as "/sample/products.csv" with
  schema PartialProducts with
    Product = Products.Name
  Price = Products.Price
Path literals and prefixing
Path literals are single-quoted. A prefix is injected with \{..} at the start of the path, as with the constant myFolder below.
schema '/sample/products.csv' with
  Product : text
  Color : text
  Price : number

const myFolder = '/sample'

read '\{myFolder}/products.csv' as Products
Path schema cloning
Cloning creates a new path schema that reuses the definition of an existing path schema:
schema '/production/products.csv' with
  Product : text
  Color : text
  Price : number

schema '/sample/products.csv' = '/production/products.csv'
Prefix cloning applies to all schemas under a folder:
schema '/production/products.csv' with
  Product : text
  Color : text
  Price : number

schema '/sample' = '/production'
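As a sketch of what prefix cloning is meant to enable, the script below reads a file under the cloned folder without redeclaring its fields. It only recombines constructs shown above; that the read resolves to the cloned schema is an assumption drawn from the description of prefix cloning, not a verified behavior.

```envision
schema '/production/products.csv' with
  Product : text
  Color : text
  Price : number

// Every path schema under /production is cloned under /sample.
schema '/sample' = '/production'

// Assumption: this read picks up the cloned schema for /sample/products.csv.
read '/sample/products.csv' as Products

show table "Sample Products" with
  Products.Product
  Products.Color
  Products.Price
```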
Parameterized paths and partitions
Path parameters allow partitioned reads and writes. This mechanism is intended for internal partitioned flows, not for raw multi-file extractions from third-party systems.
schema '/sample/products-\{Bucket}.csv' with
  Bucket : number
  Product : text
  Color : text
  Price : number

table Products = with
  [| as Product, as Color, as Price, as Bucket |]
  [| "shirt", "white", 10.50, 1 |]
  [| "pants", "blue", 15.00, 2 |]

write Products partitioned as '/sample/products-\{..}.csv'
Reading uses the same parameterized path:
schema '/sample/products-\{Bucket}.csv' with
  Bucket : number
  Product : text
  Color : text
  Price : number

read '/sample/products-\{..}.csv' as Products
Bounds restrict the captured range:
schema '/sample/products-\{Bucket}.csv' with
  Bucket : number
  Product : text
  Color : text
  Price : number

const lowerIncl = 1
const higherIncl = 2

read '/sample/products-\{lowerIncl..higherIncl}.csv' as Products
Path parameters can be of type text, number, date, week, or month.
A path can hold multiple parameters, provided that they are delimited by separators.
Path schema collisions are rejected at compile time.
Partitioned writes overwrite all captured files and delete those left empty. Files outside the bounded paths remain untouched.
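A path with two parameters might be declared as sketched below. The parameter names Bucket and Shard are hypothetical, and the use of "-" as the separator between them is an assumption. Only the schema declaration is shown, since the form of the matching read and write paths for several parameters is not documented above.

```envision
// Hypothetical layout: one file per (Bucket, Shard) pair.
// The "-" between the two parameters plays the role of the separator.
schema '/sample/products-\{Bucket}-\{Shard}.csv' with
  Bucket : number
  Shard : number
  Product : text
  Color : text
  Price : number
```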
Size caps
Use max on a path schema to cap the number of lines:
schema '/sample/products.csv' max 10 with
  Product : text
  Color : text
  Price : number
Enum downcast
Fields declared as text in a schema can be downcast to enums when reading:
schema '/sample/products.csv' with
  Product : text
  Color : text
  Price : number

read '/sample/products.csv' as Products with
  Color : table enum Colors
Aliasing on read
Aliases avoid name conflicts with schema fields:
schema '/sample/products.csv' with
  Color : text

read '/sample/products.csv' as Products with
  ColorAlias = Products.Color
Modules
Schemas can be exported from modules and reused:
// In "/sample/my-module"
export schema '/sample/products.csv' with
  Product : text
  Color : text
  Price : number

// In the importing script
import "/sample/my-module" as M

read '/sample/products.csv' as Products