The more technical overview
Envision is a domain-specific language tailored for quantitative and predictive supply chain analytics. The syntax shares similarities with SQL and Python. Lokad offers both a web-based development environment and a cloud-based execution environment. Input data is expected to be provided as tabular files, either flat text files or Excel sheets, hosted within Lokad. The output of an Envision script is a dashboard, and potentially one or several tabular output files that result from calculations within the script.
The genesis of the domain-specific language
Envision is the result of Lokad’s years of experience working with hundreds of retailers. It was not part of our technology roadmap when Lokad was founded in 2008. Creating a new programming language is a major commitment, and between 2008 and 2013, we used mainstream programming languages only. However, as we gained more and more experience, we realized that developing a language uniquely tailored for supply chain would help us develop and carry out our bespoke client assignments much faster.
During 2014, as Lokad started its “dogfooding” process by using Envision for some of our internal projects, it became clear that, even with top-notch software developers, Envision-powered initiatives were systematically outperforming the alternative initiatives carried out with generic programming languages. We do not claim that Envision is better than C#/Java/Python/SQL/etc. in the general case. It’s just that, for the very specific cases found in supply chain, these languages, excellent as they are, do not represent the pinnacle of business productivity.
The properties of a well-designed language
Let’s face it: the vast majority of Enterprise programming languages barely qualify as “junk-grade”. As we designed Envision, we decided to make it an awesome language, limited only by its non-negotiable focus on supply chain.
- No loop, no branches (yes, it’s a feature)
- Strong typing
- Function calls are free of side-effects
- Native code compilation
- Lightning fast execution with specialized algorithms
The development environment:
- Code coloring and code-autocompletion
- Smart compiler giving meaningful error messages
- Complete versioning of past edits and runs
- Contextual browsing of the input data
In terms of coding style, Envision is largely inspired by the concise syntax of Python, but we borrowed some good ideas from other languages like C# as well.
SQL vs Envision
Both SQL and Envision share a strong data-affinity. However, while SQL focuses on querying a transactional data model, with a lot of emphasis on the ACID (atomicity, consistency, isolation, durability) properties, Envision just queries a set of flat tabular files. Indeed, as far as most supply chain optimization challenges are concerned, we do not need “real-time” data, we only require data up to yesterday. This data is past and immutable. As a result, such files can be processed orders of magnitude faster (1) than relational tables, because there are no such things as INSERT, UPDATE or DELETE to support, just READ.
In Envision, it is possible to display a table with a statement that is quite similar to the SELECT statement found in SQL.
show table "Product List" with Id Name Supplier
Experienced SQL developers would probably immediately notice that this statement is lacking the FROM clause commonly found after the SELECT. In supply chain, we observe that nearly all the data revolves around products (or SKUs), and that nearly all the data history can be described as events also attached to products. Thus, instead of writing JOINs all over the place to endlessly repeat the same pattern, Envision features “natural” joins. For example, the script below shows how to produce a top seller list based on sales history without any explicit join.
end := max(date)
// O is for Orders
LastYearQty = sum(O.Qty) when date > end - 365
show table "Top Sellers" with Id Name LastYearQty order by LastYearQty desc
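For readers more at home in a general-purpose language, here is a rough Python sketch of what the script above computes. It is only an illustration under stated assumptions: the sample `products` and `orders` data and the `top_sellers` helper are hypothetical, and the grouping by product `Id` is written out by hand, which is precisely the step the natural join makes implicit in Envision.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical sample data: a product table and an order-line table,
# both keyed by the product Id.
products = {
    "P1": {"Name": "Widget", "Supplier": "Acme"},
    "P2": {"Name": "Gadget", "Supplier": "Globex"},
    "P3": {"Name": "Gizmo", "Supplier": "Acme"},
}
orders = [
    {"Id": "P1", "Date": date(2015, 1, 10), "Qty": 5},
    {"Id": "P2", "Date": date(2015, 6, 1), "Qty": 12},
    {"Id": "P1", "Date": date(2015, 6, 15), "Qty": 3},
    {"Id": "P2", "Date": date(2014, 3, 1), "Qty": 40},  # older than one year
]


def top_sellers(products, orders):
    """Return (Id, Name, LastYearQty) rows, best sellers first."""
    end = max(line["Date"] for line in orders)
    cutoff = end - timedelta(days=365)

    # Explicit "join": accumulate order quantities per product Id.
    last_year_qty = defaultdict(int)
    for line in orders:
        if line["Date"] > cutoff:
            last_year_qty[line["Id"]] += line["Qty"]

    rows = [(pid, info["Name"], last_year_qty[pid]) for pid, info in products.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for row in top_sellers(products, orders):
    print(row)
```

Most of the Python is spent on the grouping and filtering boilerplate, which is the part the Envision version leaves out.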
Also, in supply chain we have found that periodic calendar aggregations are very frequent. Managers need to have their numbers per day, week or month; and while it is a basic need, SQL makes it very difficult to display something as simple as a weekly line chart, which can be written in just two “dumb” lines in Envision.
Week.quantity := sum(O.Qty)
show linechart "Weekly quantities sold" with Week.quantity
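To make the aggregation concrete, the following Python sketch computes the same kind of weekly totals by hand. The `weekly_quantities` function and the sample `lines` are hypothetical, and the choice of Monday as the first day of the week is an assumption made for the example, not a statement about how Envision defines its weeks.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_quantities(order_lines):
    """Sum quantities per week; each week is keyed by its Monday (assumption)."""
    totals = defaultdict(int)
    for day, qty in order_lines:
        monday = day - timedelta(days=day.weekday())
        totals[monday] += qty
    return dict(sorted(totals.items()))


# Hypothetical order lines as (date, quantity) pairs.
lines = [
    (date(2015, 6, 1), 4),  # a Monday
    (date(2015, 6, 3), 6),  # Wednesday of the same week
    (date(2015, 6, 8), 5),  # Monday of the following week
]

for week_start, qty in weekly_quantities(lines).items():
    print(week_start, qty)
```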
Finally, the tooling around SQL emphasizes a mental model of one query at a time. However, we came to the conclusion that a good dashboard typically requires a combination of very specific business indicators in order to be productive. In contrast, an Envision script comes with the direct intent of generating a complex dashboard at once.
(1) Yes, it is possible to fine-tune your SQL database to achieve a level of performance similar to the one that is obtained when dealing with just flat files, but we found that the effort involved nearly always completely defeats the benefits that led to the introduction of a SQL database in the first place.
Excel vs Envision
While Envision is a programming language, it is still intended to be accessible to advanced Excel users. We absolutely don’t look down on Excel: these decades-old tabular sheets are tough to outperform and, being a team of data scientists and developers, we frequently end up using Excel ourselves, for example to quickly consolidate results for a series of experiments.
One of the areas where Excel shines is the capacity to quickly carry out operations over entire rows or columns, that is, vector-calculations, driven by the power of cut-and-paste. Vector calculations are highly useful but the cut-and-paste logic, much less so. Let’s say we want to compute the median product price over the last year based on past transactions. This could be written with a few lines using Envision.
O.UnitPrice = O.NetAmount / O.Qty
end := max(date)
UnitPrice = mode(O.UnitPrice) when date > end - 365
show table "Median Price" with median(UnitPrice)
The first line is a vector calculation equivalent to the introduction of a new column named UnitPrice within the O (for Orders) table. Then, the third line is also a vector calculation where the mode (most frequently observed value) is computed for each product. As with Excel, it is highly straightforward with Envision to introduce intermediate calculations and to compose them afterwards.
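The three steps described above (a per-line unit price, a per-product mode, then a median across products) can be sketched in plain Python as follows. The sample `order_lines` and the `median_price` helper are hypothetical, and the one-year date filter is left out to keep the example short.

```python
from collections import defaultdict
from statistics import median, mode

# Hypothetical order lines; the date filter of the script is omitted here.
order_lines = [
    {"Id": "P1", "NetAmount": 20.0, "Qty": 2},  # unit price 10.0
    {"Id": "P1", "NetAmount": 30.0, "Qty": 3},  # unit price 10.0
    {"Id": "P1", "NetAmount": 12.0, "Qty": 1},  # unit price 12.0
    {"Id": "P2", "NetAmount": 40.0, "Qty": 2},  # unit price 20.0
    {"Id": "P3", "NetAmount": 90.0, "Qty": 3},  # unit price 30.0
]


def median_price(order_lines):
    # Step 1: "new column" with the unit price of every order line.
    prices_by_product = defaultdict(list)
    for line in order_lines:
        prices_by_product[line["Id"]].append(line["NetAmount"] / line["Qty"])

    # Step 2: most frequently observed unit price for each product.
    unit_price = {pid: mode(prices) for pid, prices in prices_by_product.items()}

    # Step 3: median of the per-product prices.
    return median(unit_price.values())


print(median_price(order_lines))
```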
Envision also places an emphasis on the compactness of dashboards, much like a synthetic sheet in Excel. Each show statement in Envision defines a tile to be displayed within the dashboard, and these tiles are aligned following an Excel-like grid.
show label "Hello World" a1d1 tomato
show table "Product Lines" a2b2 royalblue with sum(1)
show table "Order Lines" c2d2 darkorange with sum(O.1)
The script above defines three tiles respectively positioned in A1:D1, A2:B2, and C2:D2 in line with the Excel convention of using letters for columns and numbers for the lines.
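As a small illustration of the grid convention, the Python sketch below converts a compact tile position into the equivalent Excel-style range. The `parse_tile` and `to_excel_range` helpers are hypothetical and only handle single-letter columns; they mirror the convention described here, not Envision's actual parser.

```python
import re


def parse_tile(position):
    """Turn 'a1d1' into ((first_col, first_row), (last_col, last_row))."""
    match = re.fullmatch(r"([a-z])(\d+)([a-z])(\d+)", position.lower())
    if match is None:
        raise ValueError(f"not a tile position: {position!r}")
    first_col, first_row, last_col, last_row = match.groups()
    return (first_col.upper(), int(first_row)), (last_col.upper(), int(last_row))


def to_excel_range(position):
    """Render a tile position using the Excel range notation, e.g. 'A1:D1'."""
    (first_col, first_row), (last_col, last_row) = parse_tile(position)
    return f"{first_col}{first_row}:{last_col}{last_row}"


for tile in ("a1d1", "a2b2", "c2d2"):
    print(tile, "->", to_excel_range(tile))
```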
No loop, no branch and more
Envision offers a neat functional syntax. We do not have loops, branches or nulls; and just in case you might be wondering, Envision is not a Turing-complete language. In practice, these features are not missed: Envision simply provides built-in constructs to achieve the same result, but with much less hassle. By not having such features in Envision, not only do we remove entire classes of hard-to-debug problems, but we also vastly increase productivity.
Let’s consider plotting the total weekly sales volume of the top 10 products for this year, and comparing these weekly totals with the totals of the same top 10 products one year before. This can be achieved with the few lines below.
end := max(date)
Vol = sum(O.Qty) when date > end - 365
Week.amt := sum(O.Qty) where rank() sort [Vol] <= 10
show linechart "Top 10" a1f3 tomato unit: "$" with
  Week.amt as "Sold this year"
  Week.amt[-52] as "Sold last year"
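In a general-purpose language, the two ideas at work in this script, keeping only the top-ranked products and comparing each week with the week 52 positions earlier, would typically need explicit sorting and indexing. The Python sketch below is one possible reading of those two steps. The `top_n_ids` and `with_lag` helpers are hypothetical names, and the exact semantics of `rank()` and of the `[-52]` lag in Envision are assumed rather than taken from a specification.

```python
def top_n_ids(volume_by_id, n=10):
    """Return the ids of the n products with the largest volume."""
    ranked = sorted(volume_by_id, key=volume_by_id.get, reverse=True)
    return set(ranked[:n])


def with_lag(weekly_totals, lag=52):
    """Pair each weekly total with the total `lag` weeks earlier (None if absent)."""
    return [
        (value, weekly_totals[index - lag] if index >= lag else None)
        for index, value in enumerate(weekly_totals)
    ]


# Hypothetical volumes and a short weekly series, with a lag of 2 for brevity.
print(top_n_ids({"A": 5, "B": 9, "C": 1}, n=2))
print(with_lag([1, 2, 3, 4], lag=2))
```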
Also, Envision is fast, lightning fast. Not only have we managed to leverage many domain-specific algorithms (2), but we also cache (almost) every single calculation node of our Envision scripts. As a result, when a slightly modified script gets re-executed, only the nodes of the calculation graph that have been changed are recomputed. In practice, once you have tasted what it feels like to process 20GB of data within 5 seconds, you are not going to want to go back to those sluggish SQL queries.
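The caching idea can be illustrated with a toy Python sketch. It assumes that each calculation node is identified by a hash of its own definition combined with the identifiers of its upstream nodes, so that an unchanged node is served from the cache on the next run. The `NodeCache` class is hypothetical, and nothing here describes how Lokad actually implements its cache.

```python
import hashlib


class NodeCache:
    """Toy cache of calculation nodes, keyed by definition and upstream keys."""

    def __init__(self):
        self._results = {}
        self.computed = []  # names of the nodes that were actually recomputed

    def evaluate(self, name, source, func, *inputs):
        """Evaluate a node; `inputs` are (key, value) pairs from upstream nodes."""
        digest = hashlib.sha256(source.encode())
        for upstream_key, _ in inputs:
            digest.update(b"\x00" + upstream_key.encode())
        key = digest.hexdigest()

        if key not in self._results:
            self._results[key] = func(*(value for _, value in inputs))
            self.computed.append(name)
        return key, self._results[key]


cache = NodeCache()

# First run: load the data, then compute a total.
data = cache.evaluate("data", "load", lambda: [1, 2, 3])
total = cache.evaluate("total", "sum", sum, data)

# Second run of a slightly modified script: same data node, new downstream node.
data = cache.evaluate("data", "load", lambda: [1, 2, 3])
mean = cache.evaluate("mean", "mean", lambda xs: sum(xs) / len(xs), data)

print(cache.computed)  # the "data" node appears only once
```

In the second run only the new `mean` node is computed, while the unchanged `data` node comes from the cache.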
(2) We have a bucket sort algorithm that frequently outperforms the more usual quick-sort by a factor of 500x. Yes, despite the “theoretical” lower bound of O(n log n) for comparison-based sorting algorithms, speeding up a sort by 500x is possible if the situation is favorable (for example: sorting by dates), because a bucket sort does not rely on comparisons. In supply chain, situations are frequently very favorable in this regard.
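To show why dates are such a favorable case, here is a minimal Python sketch of a counting-style bucket sort with one bucket per calendar day. It runs in O(n + k) time, where k is the number of days spanned by the data. The `bucket_sort_dates` function is a hypothetical illustration, not Lokad's algorithm, and it makes no claim about the 500x figure.

```python
from datetime import date, timedelta


def bucket_sort_dates(dates):
    """Sort dates without comparing them pairwise, using one bucket per day."""
    if not dates:
        return []
    first, last = min(dates), max(dates)

    # One counter per calendar day between the first and last date.
    counts = [0] * ((last - first).days + 1)
    for day in dates:
        counts[(day - first).days] += 1

    # Read the buckets back in order, repeating each day as often as it occurred.
    result = []
    for offset, count in enumerate(counts):
        result.extend([first + timedelta(days=offset)] * count)
    return result


sample = [date(2015, 3, 2), date(2015, 1, 15), date(2015, 3, 2), date(2015, 2, 1)]
print(bucket_sort_dates(sample))
```

A few years of daily history span only a few thousand buckets, so k stays small compared with millions of order lines.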