Files
Files, built-in table
The Files table contains the list of all the files that have been captured by the read statements in the Envision script. It has a primary dimension named file : ordinal.
This table is intended to support the design of data integrity checks. For example, it can be used to test files against conditions related to their expected size, or to pinpoint the origin file of an inconsistent line.
show table "My Files" with
  Files.Path
  Files.Age
  Files.ModifiedDate
  Files.ModifiedHour
  Files.ModifiedMinute
  Files.Alias
  Files.Bytes
  Files.Success
  Files.RawLines
  Files.BadLines
  Files.BadDates
  Files.BadNumbers
  Files.MissingValues
The fields are defined as follows:
Files.Path : text: the original path of the file.
Files.Age : number: the fractional age in hours, measured from the modified date of the file to the start of the run.
Files.ModifiedDate : date: the “last modified” date of the file.
Files.ModifiedHour : number: the “last modified” hour of the file, in the UTC+00 time zone.
Files.ModifiedMinute : number: the “last modified” minute of the file.
Files.Alias : text: the name of the table associated with the file.
Files.Bytes : number: the original file size, in bytes.
Files.Success : boolean: whether the file was successfully loaded.
Files.RawLines : number: the number of lines in the file, including those that were dropped (e.g. missing id or date values).
Files.BadLines : number: the number of lines dropped, so RawLines - BadLines is the number of lines actually processed.
Files.BadDates : number: the number of bad date errors.
Files.BadNumbers : number: the number of bad number errors.
Files.MissingValues : number: the number of missing value errors.
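As an illustration of how these fields can support an integrity check, here is a minimal sketch that flags files that look too small or that had lines dropped. The 1000-byte threshold and the tile titles are assumptions chosen for the example, not values prescribed by Envision. The read statement reuses the /sample/Orders*.tsv pattern found elsewhere on this page.

```envision
read "/sample/Orders*.tsv" as Orders with
  Quantity : number

// Hypothetical rule: a file is suspicious if it is smaller than
// 1000 bytes or if at least one of its lines was dropped.
where Files.Bytes < 1000 or Files.BadLines > 0
  show table "Suspicious files" with
    Files.Path
    Files.Bytes
    Files.RawLines
    Files.BadLines
    Files.RawLines - Files.BadLines as "Processed lines"
```

An empty table suggests that every captured file passed both conditions. The threshold should be adjusted to the expected size of the files at hand.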
The table Files is upstream of all the tables obtained through read statements, so its vectors can be broadcast into those tables. The following script illustrates this capability:
read "/sample/Orders*.tsv" as Orders with
  Quantity : number

Orders.Path = Files.Path // broadcast

where Orders.Quantity < 0
  show table "Files with negative order quantities" with
    Orders.Path
    group by Orders.Path
Read discards
The built-in table Files can be used to list files irrespective of their content, typically to check the presence or the absence of files. The read discard is a syntax intended to make more of the Files table.
read "/sample/product*" as _

show table "Files" a1b3 with
  Files.Path
  Files.Alias
In the above script, _ is used as a discard. The resulting table cannot be further used in the script. However, the name appears in the Files.Alias column.
read "/sample/product*" as _products
read "/sample/order*" as _orders

show table "Files" a1b3 with
  Files.Path
  Files.Alias
In the above script, the discards are named respectively _products and _orders. Those tables cannot be further used in the script, but their names are preserved through Files.Alias to identify the original capture pattern.
A read block that does not capture any file fails in Envision, unless the files are discarded. If a discard is used, then it is valid not to capture any file. This behavior can be used, for example, to verify that a folder is empty.
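The empty-folder check could be sketched as follows. The folder /sample/inbox/ and the discard name _inbox are hypothetical, introduced only for this illustration.

```envision
// Hypothetical folder: the discard keeps the script valid
// even when no file matches the pattern.
read "/sample/inbox/*" as _inbox

show table "Files found in /sample/inbox" a1b3 with
  Files.Path
  Files.Alias
  Files.Bytes
```

If the folder is empty, the script still runs and the table has no lines. Any file listed is one that should not be there.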