Data Modeling in Envision
This page gathers the core data-modeling tutorials in one place so you can move from raw files to clean, reusable tables. If you are starting here, open a session on the 1-echelon 2017 sample dataset; otherwise, keep your current session with the dataset loaded.
Table of contents

- Read TSV and CSV Files into Tables
- Model Tables with Primary Dimensions
- Join Orders to Items Without Writing a SQL JOIN
- Filter Orders to a Time Window
- Aggregate Orders into Weekly and Monthly Buckets
- Build a Top-N Ranking Table (Top Sellers)
- Build a Two-Axis KPI Table with a Cross Table
- Create Derived Tables with Table Comprehensions
Read TSV and CSV Files into Tables
Paste the script, run it, and look for quick proof that both flat files, one plain and one gzipped, have been loaded into tables.
- Keep the default /sample folder in your session.
- Run the script.
- You should now see a summary tile with row counts plus a table showing the last five order lines.
read "/sample/Lokad_Items.tsv" as Items[id] with
"Id" as id : text
Name : text
Category : text
SellPrice : number
SellCurrency : text
read "/sample/Lokad_Orders.tsv.gz" as Orders expect [id, date] with
"Id" as id : text
"Date" as date : date
Quantity : number
NetAmount : number
Currency : text
itemsCount = count(Items.*)
Orders.Row = rank() scan [Orders.date desc]
show summary "Loaded files" with
itemsCount as "Items"
count(Orders.*) as "Order lines"
min(date) as "Earliest order date"
max(date) as "Latest order date"
where Orders.Row <= 5
show table "Last 5 orders" with
Orders.date
Orders.id
Orders.Quantity
Orders.NetAmount { unit: "#(Orders.Currency)" }
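Although the section title mentions CSV, both sample files happen to be tab-separated; the same read syntax covers comma-separated files as well. A minimal sketch, assuming a hypothetical CSV export at /sample/MyExport.csv (not part of the sample dataset):

read "/sample/MyExport.csv" as Extras with
  Name : text
  Value : number

show table "Extras" with
  Extras.Name
  Extras.Value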
Model Tables with Primary Dimensions
This script sets primary dimensions on both tables and gives you quick checks that the keys are behaving as expected.
- Keep the default /sample folder in your session.
- Run the script.
- You should now see a summary confirming counts and a sample of the keyed order lines.
read "/sample/Lokad_Items.tsv" as Items[id] with
"Id" as id : text
Name : text
Category : text
read "/sample/Lokad_Orders.tsv.gz" as Orders expect [id, date] with
"Id" as id : text
"Date" as date : date
Quantity : number
NetAmount : number
Currency : text
Orders.Row = rank() scan [Orders.date desc]
itemsCount = count(Items.*)
ordersCount = count(Orders.*)
show summary "Primary dimensions ready" with
itemsCount as "Items (id)"
ordersCount as "Order lines (id, date)"
where Orders.Row <= 5
show table "Sample keyed orders" with
Orders.date
Orders.id
Orders.Quantity
Orders.NetAmount { unit:"#(Orders.Currency)" }
If the counts look right, you now have primary dimensions set for both tables and can move on to combining them safely.
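To see what the keys buy you, compare the catalog size against the number of distinct ids that actually appear among the order lines. A short sketch, reusing the variables defined above:

soldIds = distinct(Orders.id)

show summary "Key sanity check" with
  itemsCount as "Items in catalog"
  soldIds as "Distinct ids in Orders"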
Join Orders to Items Without Writing a SQL JOIN
Orders carries id as a secondary dimension shared with Items, so each order line maps to exactly one item and you can pull item attributes directly, without writing an explicit join.
- Run the script.
- You should now see a table where each order line carries the item name and revenue.
- The summary tile confirms the total revenue aggregates across both tables.
read "/sample/Lokad_Items.tsv" as Items[id] with
"Id" as id : text
Name : text
SellPrice : number
SellCurrency : text
read "/sample/Lokad_Orders.tsv.gz" as Orders expect [id, date] with
"Id" as id : text
"Date" as date : date
Quantity : number
Currency : text
Orders.ItemName = Items.Name // Pulls from Items via shared id
Orders.UnitPrice = Items.SellPrice
Orders.Revenue = Orders.Quantity * Items.SellPrice
Orders.Row = rank() scan [Orders.date desc]
show summary "Joined revenue" with
sum(Orders.Revenue) as "Total revenue" { unit:"#(Orders.Currency)" }
where Orders.Row <= 6
show table "Orders with item attributes" with
Orders.date
Orders.id
Orders.ItemName
Orders.Quantity
Orders.UnitPrice { unit:"#(Orders.SellCurrency)" }
Orders.Revenue { unit:"#(Orders.Currency)" }
If you can see names and prices without any SQL, the dimension-driven join is working.
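The same relationship also works in the aggregating direction: summing an Orders vector into Items rolls every order line up to its item. A brief sketch on top of the script above:

Items.Revenue = sum(Orders.Revenue)
Items.UnitsSold = sum(Orders.Quantity)

show table "Revenue per item" with
  Items.Name
  Items.UnitsSold
  Items.Revenue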
Filter Orders to a Time Window
Work from the sample dataset and keep the whole orders table, but flag a fixed date window so downstream aggregations stay focused on the slice you care about.
- Run the script.
- You should now see a summary tile with the total order lines versus the filtered slice.
- The table shows only orders inside the chosen window.
read "/sample/Lokad_Items.tsv" as Items[id] with
"Id" as id : text
Name : text
read "/sample/Lokad_Orders.tsv.gz" as Orders expect [id, date] with
"Id" as id : text
"Date" as date : date
Quantity : number
NetAmount : number
Currency : text
startDate = date(2017, 7, 1)
endDate = date(2017, 12, 31)
Orders.InWindow = Orders.date >= startDate and Orders.date <= endDate
Orders.Row = rank() scan [Orders.date desc]
ordersTotal = count(Orders.*)
ordersInWindow = sum(if Orders.InWindow then 1 else 0)
show summary "Windowed orders" with
ordersTotal as "Total lines"
ordersInWindow as "Lines in window"
where Orders.InWindow and Orders.Row <= 8
show table "Orders between July and Dec 2017" with
Orders.date
Orders.id
Orders.Quantity
Orders.NetAmount { unit:"#(Orders.Currency)" }
Change startDate and endDate to practice slicing different periods and watch the counts adjust.
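As a variant, the window can be anchored to the data instead of hard-coded dates; adding or subtracting a number shifts a date by that many days. A sketch of a trailing 90-day window (the names are illustrative):

lastDate = max(Orders.date)
Orders.InTrailing = Orders.date > lastDate - 90

show summary "Trailing 90 days" with
  sum(if Orders.InTrailing then 1 else 0) as "Recent lines"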
Aggregate Orders into Weekly and Monthly Buckets
Aggregate demand from daily orders into weekly and monthly buckets. Declaring date as a dimension with expect [date] automatically creates the calendar tables Day, Week, and Month.
- Run the script.
- You should now see weekly and monthly tables with total units and revenue.
read "/sample/Lokad_Orders.tsv.gz" as Orders expect [date] with
"Date" as date : date
Quantity : number
NetAmount : number
Currency : text
Day.Units = sum(Orders.Quantity)
Day.Revenue = sum(Orders.NetAmount)
Week.Units = sum(Day.Units)
Week.Revenue = sum(Day.Revenue)
Month.Units = sum(Day.Units)
Month.Revenue = sum(Day.Revenue)
show table "Weekly totals" with
Week.week
Week.Units
Week.Revenue { unit:"$" }
show table "Monthly totals" with
Month.month
Month.Units
Month.Revenue { unit:"$" }
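Since the week dimension is a calendar axis, the weekly totals plot directly. A one-line visualization sketch on top of the script above:

show linechart "Weekly units sold" with Week.Units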
Build a Top-N Ranking Table (Top Sellers)
Rank items by revenue to get a compact top sellers list.
- Run the script.
- You should now see the top 10 SKUs ranked by revenue.
read "/sample/Lokad_Items.tsv" as Items[id] with
"Id" as id : text
Name : text
read "/sample/Lokad_Orders.tsv.gz" as Orders expect [id, date] with
"Id" as id : text
"Date" as date : date
Quantity : number
NetAmount : number
table SkuStats[sku] = by Orders.id
SkuStats.Name = Items.Name
SkuStats.Revenue = sum(Orders.NetAmount)
SkuStats.Units = sum(Orders.Quantity)
SkuStats.Rank = rank() sort -SkuStats.Revenue
where SkuStats.Rank <= 10
show table "Top sellers" with
SkuStats.Rank
SkuStats.sku
SkuStats.Name
SkuStats.Units
SkuStats.Revenue
order by SkuStats.Rank
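A natural extension is each top seller's share of total revenue; the scalar total broadcasts back into SkuStats. A sketch reusing the table above:

totalRevenue = sum(SkuStats.Revenue)
SkuStats.Share = SkuStats.Revenue / totalRevenue

where SkuStats.Rank <= 10
  show table "Top sellers, share of revenue" with
    SkuStats.Rank
    SkuStats.Name
    SkuStats.Share
    order by SkuStats.Rank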
Build a Two-Axis KPI Table with a Cross Table
Build a small KPI matrix that compares two dimensions side by side.
- Run the script.
- You should now see a cross table of KPIs by channel and region.
table Channels[channel] = with
  [| as channel |]
  [| "Online" |]
  [| "Store" |]

table Regions[reg] = with
  [| as reg |]
  [| "North" |]
  [| "South" |]

table Kpis = cross(Channels, Regions) // one line per (channel, region) pair

Kpis.Units = random.poisson(40 into Kpis) // mock figures for the demo
Kpis.Revenue = Kpis.Units * random.uniform(20 into Kpis, 40)

show table "KPI matrix" with
  Channels.channel
  Regions.reg
  Kpis.Units
  Kpis.Revenue { unit: "$" }
  order by [Channels.channel, Regions.reg]
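Because the cross table is downstream of both of its axes, margin totals fall out of a plain aggregation. A short sketch over the tables above:

Channels.Units = sum(Kpis.Units)

show table "Units by channel" with
  Channels.channel
  Channels.Units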
Create Derived Tables with Table Comprehensions
Create a derived table so you can reuse a filtered or enriched dataset.
- Run the script.
- You should now see a derived table with only premium items.
table Items = with
  [| as Sku, as Category, as Price |]
  [| "A-100", "Core", 12.5 |]
  [| "B-200", "Core", 18.0 |]
  [| "C-300", "Premium", 35.0 |]
  [| "D-400", "Premium", 42.0 |]

// extend.range(n) creates n copies of each Items line; 1 keeps the
// line and 0 drops it, which turns the call into a filter.
table Premium = extend.range(if Items.Category == "Premium" then 1 else 0)

Premium.Sku = Items.Sku
Premium.Price = Items.Price

show table "Premium items" with
  Premium.Sku
  Premium.Price { unit: "$" }
You now have a small derived table that can power downstream dashboards without re-filtering.
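The derived table behaves like any other table, so quick aggregates confirm the filter did what you expected. A closing sketch:

premiumCount = count(Premium.*)
avgPremiumPrice = avg(Premium.Price)

show summary "Premium snapshot" with
  premiumCount as "Premium items"
  avgPremiumPrice as "Average price" { unit: "$" }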