read
The keyword read offers versatile mechanisms to read data from files.
table Products = with
  [| as Product, as Color, as Price |]
  [| "shirt", "white,grey", 10.50 |]
  [| "pants", "blue", 15.00 |]
  [| "hat", "red", 5.25 |]

write Products as "/sample/products.csv" with
  Product = Products.Product
  Color = Products.Color
  Price = Products.Price
Then, the file can be read back:
read "/sample/products.csv" as Products with
  Product : text
  Color : text
  Price : number

show table "My Products" with
  Products.Product
  Products.Color
  Products.Price
A single script is limited to 5000 input files.
All the read blocks must appear at the top of the Envision script, above any statement that cannot be processed at compile-time.
// only compile-time constants can be placed above 'read'
const path = "/mytable.csv"

read (path) as T with // parentheses are needed here
  ColumnA : text
  ColumnB : number
The order of fields inside a read block does not matter, and fields can be omitted
if they are not needed by the script.
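For instance, the fields can be listed in a different order than in the file, and unused fields can be dropped (a sketch against the products.csv file created above):

read "/sample/products.csv" as Products with
  Price : number
  Product : text
  // 'Color' is omitted, as it is not needed below

show table "Prices" with
  Products.Product
  Products.Price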
Table size constraints
max, min, and small can appear after the table name to enforce size limits for the loaded table. These are separate from the file pattern filters below.
When no explicit max is provided, read uses a default maximum of 1m lines.
This default can be overridden on the read statement or on the matching
path schema.
read "/sample/products.csv" as Products max 100 with
  Product : text
Calendar auto-unfiltering
When a column is declared as date and expect [date] is used, Envision auto-creates the Day, Week, and Month tables and unfilters the date dimension over the observed range.
read "/sample/orders.csv" as Orders expect [date] with
  "OrderDate" as date : date
  Quantity : number

Day.Qty = sum(Orders.Quantity)

show linechart "Daily Orders" with Day.Qty
If multiple tables are read with date, the unfiltered range spans the earliest to latest dates across all those tables.
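For instance, assuming a second file returns.csv (hypothetical) that also carries a date column, the Day table spans the union of the observed ranges:

read "/sample/orders.csv" as Orders expect [date] with
  "OrderDate" as date : date
  Quantity : number

read "/sample/returns.csv" as Returns expect [date] with
  "ReturnDate" as date : date
  Quantity : number

// 'Day' spans from the earliest to the latest date across both tables
Day.Net = sum(Orders.Quantity) - sum(Returns.Quantity)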
File patterns and filters
Wildcards can read multiple files into one table:
read "/sample/products-*.csv" as Products with
  Product : text
  Price : number
The max, min, and latest filters restrict the captured files:
read max 2 "/sample/products-*.csv" as Products with
  Product : text
  Price : number
max and min sort by file name, while latest sorts by last update time.
The count defaults to 1 when omitted.
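For instance, capturing only the single most recently updated file (a sketch; the count of latest defaults to 1 here):

read latest "/sample/products-*.csv" as Products with
  Product : text
  Price : number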
The wildcard * matches any character except /, and multiple wildcards
are allowed in the same pattern.
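For instance, with two wildcards (a sketch; the folder layout is hypothetical):

// matches e.g. '/sample/2023/products-1.csv'
read "/sample/*/products-*.csv" as Products with
  Product : text
  Price : number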
Pattern unions and column inference
Multiple read statements can target the same table. If the formats match,
later read blocks can omit columns and rely on inference.
read "/sample/products-1.csv" as Products with
  Product : text
  Price : number
read "/sample/products-*.csv" as Products
Column inference is the only ordering-sensitive aspect of read statements.
If a file is captured multiple times, its content is loaded multiple times.
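For instance, with overlapping patterns (a sketch), the file products-1.csv is captured by both statements, and its lines are therefore loaded twice:

read "/sample/products-1.csv" as Products with
  Product : text
  Price : number

// 'products-1.csv' also matches this pattern: its lines appear twice
read "/sample/products-*.csv" as Products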
Pattern expressions
The file pattern can be a compile-time constant expression:
const myFolder = "/sample"
const myPath = "\{myFolder}/products.csv"

read (myPath) as Products with
  Product : text
  Price : number
Parameterized paths
Parameterized paths require a matching path schema. The expression inside
\{...} must be a compile-time constant (literal, const, list, or bounds).
Use .. to match any value for the parameter.
schema '/clean/Orders\{OrderMonth}.ion' with
  OrderMonth : month
  Id : text

const m = month(2023, 7)

read '/clean/Orders\{m}.ion' as Orders
read '/clean/Orders\{..}.ion' as Orders
Constant column
A constant column can be introduced through an assignment at the column level within the read block:
read "/sample/products.csv" as Products with
  Product : text
  Color : text
  Price : number
  VAT : number = 0.2 // constant column

show table "My Products" with
  Products.Product
  Products.Color
  Products.Price
  Products.VAT
Renamed column
A column can be renamed through an assignment and the use of the read function at the column level within the read block:
read "/sample/products.csv" as Products with
  Product : text
  Colour : text = read("Color")
  Price : number

show table "My Products (UK)" with
  Products.Product
  Products.Colour
  Products.Price
The function read is intended to cope with raw column names that are not valid Envision variable names.
Reading by column position is also possible but brittle:
read "/sample/products.csv" as Products with
  Product : text = read(1)
  Price : number = read(2)
Read aliasing
A read alias introduces the same column under a new name, typically to bind primary dimensions.
read "/sample/products.csv" as Products[id] with
  Product : text
  Color : text
  Price : number
  id = Products.Product
Prefer a post-read assignment unless the alias is needed for dimensions or schemas.
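When no dimension or schema is involved, a plain post-read assignment achieves the same effect (a sketch):

read "/sample/products.csv" as Products with
  Product : text
  Color : text
  Price : number

Products.id = Products.Product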
split column-level option
The option split splits the content of a text cell according to the specified delimiter, resulting in multiple lines, one for each value obtained by the split.
| A | B |
|---|---|
| 1 | X,Y |
| 2 | Z |
Applying split:"," on B gives:
| A | B |
|---|---|
| 1 | X |
| 1 | Y |
| 2 | Z |
The script below creates a TSV file with 2 lines:
table T = with
  [| as A, as B |]
  [| 1, "X,Y" |]
  [| 2, "Z" |]

write T as "/sample/split.tsv" with
  A = T.A
  B = T.B
The script below then re-reads the TSV file, producing a table with 3 lines:
read "/sample/split.tsv" as T with
  A : number
  split:"," as B : text

show table "T" with T.A, T.B
The option split can only appear once within a read block.
Transpose column-level option
When multiple values are passed to the read function within a read block, the corresponding columns - as found in the file - are transposed into as many lines. The columns that are not transposed have their values repeated. For example, by transposing (B1, B2, B3) in:
| A | B1 | B2 | B3 |
|---|---|---|---|
| X | X1 | X2 | X3 |
| Y | Y1 | Y2 | Y3 |
We obtain:
| A | B |
|---|---|
| X | X1 |
| X | X2 |
| X | X3 |
| Y | Y1 |
| Y | Y2 |
| Y | Y3 |
The following script creates a TSV file with 2 lines:
table T = with
  [| as A, as B1, as B2, as B3 |]
  [| "x", "x1", "x2", "x3" |]
  [| "y", "y1", "y2", "y3" |]

write T as "/sample/transpose.tsv" with
  A = T.A
  B1 = T.B1
  B2 = T.B2
  B3 = T.B3
The script below then re-reads the TSV file, producing a table with 6 lines:
read "/sample/transpose.tsv" as T with
  A : text
  B : text = read("B1", "B2", "B3")

show table "T" with T.A, T.B
Discarded table
Prefixing the table name with an underscore (_) means that the table is discarded. However, the captured files are accessible from the Files built-in table.
read "/sample/*.csv" as _Samples
show table "Files" with
  Files.Alias // '_Samples'
  Files.Path
A table marked as discarded cannot be used in the script; as a consequence, it does not trigger a 'not used' error at compile time when it is, effectively, left unused.
Using a stand-alone discard symbol (_) is also possible:
read "/sample/*.csv" as _
show table "Files" with
  Files.Alias // '_'
  Files.Path
This feature is typically intended for scripts that analyze the presence or absence of files while ignoring their content.
Read options and formats
Table-level read options control parsing. Values must be literals.
read "/sample/products.csv" fileformat:"csv" quotes:false skip:0 separator:"," as Products with
  Product : text
  Price : number
Options include fileformat, skip, encoding, headers, quotes,
separator, spreadsheet, date, and number.
fileformat overrides extension-based format detection.
When overriding headers, pair headers with skip:1 to ignore the original
header line.
read "/manualfc" headers:"Id,M1,M2" skip:1 as ManualFC with
  "Id" as Id : text
  Values : number = read("M1", "M2")
For .xlsx files, the spreadsheet option selects a sheet by name
(case-insensitive). When omitted, the first sheet is read. An unknown sheet
name produces a runtime error.
read "/sample/items.xlsx" spreadsheet:"Sheet2" as Items with
  Id : number
  Label : text
File formats
Formats are detected by file extension. Supported formats include:
.csv, .tsv, .txt, .csv.gz, .tsv.gz, .xlsx, .ion, and .zip.
ZIP archives must contain exactly one file.
For flat text files, Envision auto-detects separators and number/date formats unless overridden. Header-free files are supported but fragile; prefer headers.
Accepted encoding values:
ibm861, ibm865, ISO-8859-1 (latin1), UTF-8, UTF-16, Windows-1250, Windows-1252
Date formats use tokens like d, dd, M, MM, MMM, MMMM, yy, yyyy.
Number formats use 0 and separators, for example "0,0" or "0.0".
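For instance, overriding both formats on the read statement (a sketch; the file events.csv and its columns are hypothetical):

// dates like '31/12/2023', numbers like '1,5' (decimal comma)
read "/sample/events.csv" date:"dd/MM/yyyy" number:"0,0" as Events with
  When : date
  Value : number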
Sanitizing input values
When input values are inconsistent, read them as text and parse in-script:
read "/sample/products.csv" as Products with
  Product : text
  SizeRaw : text = read("Size")

Products.IsSizeDefined, Products.Size =
  tryParseNumber(Products.SizeRaw, "", ",")
Filesystem atomicity
Reads observe a snapshot of the filesystem. Writes happen at script end and
are not visible during the same run. Two write blocks cannot target the same
path in a single script.
Datatype read/write support
The following table lists the data types that can be read or written, depending on the intended file format.
| Type | TSV read/write | Ionic read/write |
|---|---|---|
| boolean | yes | yes |
| date | yes | yes |
| embedding | no | yes |
| markdown | no | no |
| month | yes | yes |
| number | yes | yes |
| ranvar | no | yes |
| text | yes | yes |
| week | yes | yes |
| zedfunc | no | yes |
Only .ion supports ranvar, zedfunc, and embedding.