Files, special table
The Files table contains the list of all the files that have been captured by the read statements in the Envision script. It has a primary dimension named file : ordinal.
This table is intended to support the design of data integrity checks. For example, it can be used to test files against conditions related to their expected size, or to pinpoint the origin file of an inconsistent line.
show table "My Files" with
  Files.Path
  Files.Age
  Files.ModifiedDate
  Files.ModifiedHour
  Files.ModifiedMinute
  Files.Alias
  Files.Bytes
  Files.Success
  Files.RawLines
  Files.BadLines
  Files.BadDates
  Files.BadNumbers
  Files.MissingValues
The fields are defined as follows:
Files.Path : text: the original path of the file
Files.Age : number: the fractional age of the file, in hours, measured from its “last modified” date to the start of the run.
Files.ModifiedDate : date: the “last modified” date of the file.
Files.ModifiedHour : number: the “last modified” hour of the file, in the UTC+00 time zone.
Files.ModifiedMinute : number: the “last modified” minute of the file.
Files.Alias : text: the name of the table associated with the file.
Files.Bytes : number: the original file size, in bytes.
Files.Success : boolean: whether the file was successfully loaded.
Files.RawLines : number: the number of lines in the file, including those that were dropped (e.g. missing
Files.BadLines : number: the number of lines dropped, so RawLines - BadLines is the size of the actual file processed.
Files.BadDates : number: the number of bad date errors.
Files.BadNumbers : number: the number of bad number errors.
Files.MissingValues : number: the number of missing value errors.
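As an illustration of such an integrity check, the following minimal sketch lists the files that failed to load, had lines dropped, or look too small. The minBytes threshold is a hypothetical value chosen for the example, and the read pattern is borrowed from the sample script of this page.

```envision
read "/sample/Orders*.tsv" as Orders with
  Quantity : number

// Hypothetical threshold: smaller files are treated as suspicious.
minBytes = 1000

where (not Files.Success) or (Files.BadLines > 0) or (Files.Bytes < minBytes)
  show table "Suspicious files" with
    Files.Path
    Files.Bytes
    Files.RawLines
    Files.BadLines
    Files.RawLines - Files.BadLines as "Processed lines"
```

The last column applies the RawLines - BadLines relation from the field list to report how many lines were actually processed for each file.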
Files is upstream of all the tables obtained through read statements. The following script illustrates this capability:
read "/sample/Orders*.tsv" as Orders with
  Quantity : number

Orders.Path = Files.Path // broadcast

where Orders.Quantity < 0
  show table "Files with negative order quantities" with
    Orders.Path
    group by Orders.Path
The special table Files can be used to list files irrespective of their content, typically to check for the presence or the absence of files. The read discard is a syntax intended to make more of the
read "/sample/product*" as _

show table "Files" a1b3 with
  Files.Path
  Files.Alias
In the above script,
_ is used as a discard. The resulting table cannot be further used in the script. However, the name appears in the
read "/sample/product*" as _products
read "/sample/order*" as _orders

show table "Files" a1b3 with
  Files.Path
  Files.Alias
In the above script, the discards are named _products and _orders respectively. Those tables cannot be further used in the script, but their names are preserved through Files.Alias to identify the original capture pattern.
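Building on the named discards, a presence check can be sketched by counting the captured files per alias. This is only an illustration: it assumes that Files.Alias carries the discard names as described above, and the table title is arbitrary.

```envision
read "/sample/product*" as _products
read "/sample/order*" as _orders

// One row per capture pattern, with the number of files it matched.
show table "Files per pattern" a1b3 with
  Files.Alias
  count(Files.*) as "Files"
  group by Files.Alias
```

Under these assumptions, an alias that does not show up in the resulting table presumably matched no file at all.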