Common pricing strategies
Envision is both a data visualization engine and a data processing engine, and one of the key goals of the latter is to help merchants obtain better prices. At this point, we have extensively covered the syntax of Envision, Lokad’s programming language. It is now time to use this knowledge to implement actual pricing strategies.
- Basic Pricing Process
- Visualization of the Pricing Performance
- Stock-Driven Strategies
- Behavioral Strategies
- Competitor-Driven Strategies
Lokad includes a toolkit for crafting highly tailored pricing strategies rather than simply being a bag of predefined recipes. Indeed, over the years, while working with many merchants across many countries and many verticals, the Lokad team has observed that pricing tends to be a very domain-specific exercise. While generic pricing strategies might yield reasonable performance, those strategies are frequently weak compared to truly tailored ones.
While pricing strategies themselves may vary a great deal from one business to another, what we have observed is that good strategies almost always require a substantial amount of data. Most pricing tools tend to exclusively focus on the prices of your competitors, but one hard truth of commerce is that, for any given item to be sold, there is someone, somewhere, selling this item at an unreasonable price. Hence, unless you have an extremely favorable sourcing position, aligning all your prices to those at the bottom of the market is not a viable option.
In decreasing order of priority, we recommend that you obtain the following data:
- A detailed catalog, that is, the list of all the items being sold. A rich catalog comes with item names, item categories, purchase prices, supplier names and all the information that could prove relevant to compose your prices, but also whatever enables you to make sense of the historical data. Depending on your vertical, specific dimensions may be of importance: colors and sizes in textile, release dates for cultural products such as books, dry weights for food, etc. We recommend that you make a special effort regarding the extraction of the catalog data, as it is frequently one of the easiest datasets to obtain and one of the most useful too.
- The sales history, that is, the list of historical transactions. Don’t hesitate to include the quantity, the price, the applicable tax, and the customer identifier for each transaction line. Sales data are the most important pieces of historical data you have, and they are worth investing the time it takes to get the extraction right. In particular, the more history depth you have, the better. In practice, it takes two years of data to observe anything as far as seasonality is concerned, and it takes three years to discern the difference between seasonality and a trend.
- The returns, cancellations, and frauds, that is, the list of transactions that were reverted later on, one way or another. Indeed, high return rates for certain items can cripple their profitability, and generate a great deal of bad will in the process. One common data mistake consists in mixing these entries with the regular transactions, counting them with negative quantities. In practice, this leads to all sorts of errors. These issues need to be accounted for separately.
- The backorder history, that is, the list of transactions made to acquire items that have been sold afterward. Sourcing data can be approximated by simply extracting the latest purchase prices for every item. However, purchase prices vary over time, and the current purchase price might be very different from past prices. This can lead to significant discrepancies when analyzing the historical gross margin. This is why it is better to explicitly leverage historical backorders instead.
- For online commerce, the daily aggregated pageviews per item, that is, the web traffic re-aggregated at the item level. Web traffic provides critical clues when we compare item sales to item traffic. Unlike competitive intelligence, the data that you can obtain is usually very accurate, and it can reflect subtle - but important - market moves, such as a surge of interest for certain items. Raw web-traffic logs are typically rather heavy even for a small commerce. A decent-sized online commerce, say above $50M per year, is likely to generate about 1GB of web logs per day. However, once aggregated daily, it takes a large commerce to generate more than 1MB of compressed data - which remains very manageable.
- Competitor prices, that is, the list of prices automatically retrieved from the websites of competitors using competitive intelligence tools. For example, Scrapy is an open-source tool (see http://scrapy.org/) that can be used to quickly devise a price extraction process. In practice, we observe that it is better to carefully monitor the two or three most relevant competitors as opposed to loosely monitoring the entire web. It is also important to monitor site-wide promotions, for example when the competitor offers $15 off for every item or when the shipping fees are made free. Raw prices can give a false sense of confidence regarding the actual price perception by customers.
This list might seem overwhelming, but you do not need all this data to get started with Lokad. In practice, to start composing a pricing strategy with Lokad that makes sense, you only need an extract of your catalog - even if there is not much beyond the item identifiers - and an extract of your sales history with quantities and prices.
Another important point when writing a pricing strategy is that competitive intelligence comes last among the most desirable data sources. This is no accident. Most pricing tools take the exact opposite approach: competitor prices come first. Indeed, for most software vendors, approaches that rely only on the prices of your competitors offer a significant practical advantage: there is no need for any actual data integration. The software can crawl both your site and the sites of your competitors, identify all the products you sell along with all your prices, and then automatically match those products and prices with data that the vendor has already collected on other merchant sites. It is a one-click setup with no data integration, which makes it easier for the software vendor to sell their software.
We believe that this approach is mistaken. As already pointed out, there is almost always someone selling at an unreasonable price, so straightforward price comparison tends to be impractical. However, the problem is deeper than that. Aligning your prices is a very reductive pricing strategy. Let your competitors sell the defective products that generate a myriad of phone calls and returns. It is pointless to keep being aggressive on prices if both your competitor and your common supplier are already out-of-stock. Sometimes, you are just lucky and can acquire a large quantity of items at a very low price; however, pricing these a bit lower than competitors may simply not be sufficient to get the sales boost you need to liquidate the quantity you have just acquired. Pricing has a lot more depth than just undercutting all competitors by 5 cents.
In the following sections, we cover many pricing strategies that can be used separately or combined together. These strategies all require data. We start with simple strategies that require limited data, typically available for most merchants, and we incrementally move on to more advanced strategies that require more data.
Basic Pricing Process
Pricing is both a strategy, that is, the logic used to compute the new prices, and a workflow that includes all the steps required to publish the freshly revised prices. When Lokad is used, the typical pricing workflow can be structured in several steps:
1. Push the latest data to Lokad. This step is typically automated through FTP (File Transfer Protocol) using BigFiles, Lokad’s file hosting service. It is also possible to manually push the files through the BigFiles web user interface.
2. Trigger the execution of an Envision project. This step can be manually triggered through the web user interface of Lokad. This step can also be scheduled through the web API (application programming interface) of Lokad.
3. Retrieve the new prices made available as files. Upon completion of a project execution, the output files can be downloaded from the web dashboard generated by Lokad, but the files are also automatically pushed back to BigFiles to be later retrieved by FTP.
4. Publish the new prices. Once the file containing the new prices has been retrieved, the new prices can be imported back into the system that manages the prices to be put on display.
Steps 1, 3, and 4 are mostly beyond the scope of the present discussion since we will be focusing on the actual pricing logic that happens in Step 2.
Assuming that purchase prices are made available, an elementary cost-plus strategy can be written as:
Price = PurchasePrice * 1.3 // uniform cost-plus pricing
show file "/myfolder/myprices.tsv" with Id, Price
In particular, the second line with the file tile produces a file named myprices.tsv that is written to the /myfolder folder within your Lokad account. We suggest you save the files generated by Lokad to a distinct folder in order to avoid mixing your inputs and outputs.
Let us further refine this strategy with an odd-pricing twist. Let us say we want to have prices end with 0.95. This can be done with:
Price = round(PurchasePrice * 1.3) - 0.05
The function round returns the closest integer to its input. If you want all your prices to end with 0 or 5, this can be done with:
Price = round(PurchasePrice * 1.3 / 5) * 5
The round function is simple, and its usage can be adjusted to reproduce most of the odd-pricing patterns found in commerce.
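For instance, a minimal sketch for prices ending in .99 - assuming the same 30% cost-plus markup as above - would be:
Price = round(PurchasePrice * 1.3) - 0.01 // prices ending in .99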
While Lokad offers a powerful mechanism to reprice all items at once, this can sometimes be perceived as a bit too powerful, since greater pricing stability may be what is most desirable. In particular, for physical stores where price labels have to be manually updated, throttling the number of revised prices per day is frequently important in order to keep the changes manageable for the local teams.
Let us assume that we have a merchant with 5,000 items, and this merchant does not want to touch the prices of more than 1,000 items at once. Assuming that the old prices are passed as input, this can be done with:
NewPrice = PurchasePrice * 1.3 // 5000 items here
where rank(abs(NewPrice - OldPrice) / OldPrice) < 1000
// only 1,000 prices exported here
show file "/myfolder/myprices.tsv" with Id, NewPrice
The rank function sorts all values starting from the largest ones, that is, the items where the new price has the largest relative difference to the old price. The where condition is used to keep only the top 1,000 items, which are piped into the file tile, instead of the 5,000 items that were present initially.
By repeating the price import / export cycle, it is possible to gradually revise all the prices without touching more than X% of the items at any point in time. Here we are giving priority to the largest price differences, but in practice, other factors could be part of the picture. For example, price changes driven by promotions could be given a higher priority than the relative difference.
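For instance, sticking to the data already introduced, the priority could be weighted by recent sales volumes so that the items with the most revenue at stake get their prices revised first. A minimal sketch, still assuming that the old prices are passed as input:
oend = max(Orders.date)
where date >= oend - 28 // last 4 weeks of sales
L28 = sum(Orders.Quantity) // recent sales volume per item
NewPrice = PurchasePrice * 1.3
Priority = L28 * abs(NewPrice - OldPrice) // revenue at stake if the price is not revised
where rank(Priority) < 1000 // still limited to 1,000 revisions at once
show file "/myfolder/myprices.tsv" with Id, NewPrice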
Visualization of the Pricing Performance
Visualizing the input data is important. As the old adage goes: garbage in, garbage out. If the input data is corrupted or incomplete, it can lead to all sorts of problems. Whenever you tackle a fresh dataset, start by plotting the historical sales. Assuming that you have an Orders event stream, this could typically be done with:
Day.quantity = sum(Orders.Quantity)
Week.quantity = sum(Orders.Quantity)
show linechart "Daily quantities sold" with Day.quantity
show linechart "Weekly quantities sold" with Week.quantity
Daily quantities are interesting because they make it easy to spot gaps of missing data. However, because of the strong weekly cyclicity observed in most businesses, daily aggregated data tends to be poorly readable. Thus, you should favor weekly aggregated data instead, once you know for sure that you are not missing data for any of the days. Weekly data is also easier to analyze than monthly data, because a month can include either four or five weekends. As a result, monthly totals tend to be deceptive, giving false impressions of increases or decreases which can be explained merely by the number of weekends in each month.
In order to monitor the health of your business, it is typically important to include a few KPIs covering the last week, or the last four weeks, such as the number of units sold, the sales volume, the gross margin in volume, etc. This could be done with:
oend = max(Orders.date)
where date >= oend - 7
show scalar "Last week quantity" with sum(Orders.Quantity)
show scalar "Last week volume{$}" with sum(Orders.Quantity * Orders.Price)
show scalar "Last week gross margin{$}" with sum(Orders.Quantity * (Orders.Price - PurchasePrice))
Above, the gross margin is approximated based on the PurchasePrice value, which represents the purchase prices as they are now. For an analysis over seven days, this is a reasonable approximation. However, if we were to consider a longer duration, it would be better to use the historical purchase prices instead. Assuming the historical purchase prices are available through BackOrders, which lists the orders passed to the suppliers in the past, we can write:
oend = max(Orders.date)
Day.Quantity = sum(Orders.Quantity)
Day.Margin = Day.Quantity * (latest(Orders.Price) - latest(BackOrders.Price))
where date >= oend - 28 // last 4 weeks
show scalar "Last 4 weeks gross margin{$}" with sum(Day.Margin)
In this example, we start by computing the historical prices using the function latest, both for the historical selling prices and for the historical purchase prices. Then, the historical price difference is multiplied by the historical sold quantity to obtain the historical daily gross margin.
Sometimes, the total value does not provide sufficient insights. In these cases, it is more desirable to compute those KPIs at some intermediate level. For example, if a Category vector is available, the gross margin can be computed per category using:
oend = max(Orders.date)
Day.Quantity = sum(Orders.Quantity)
Day.Margin = Day.Quantity * (latest(Orders.Price) - latest(BackOrders.Price))
where date >= oend - 28 // last 4 weeks
M = sum(Day.Margin)
show barchart "Last 4 weeks gross margin{$}" with sum(M) group by Category
In particular, the reference M is a vector here, not a scalar, and the values are aggregated by category within the barchart tile. If you want to observe the gross margin over time, this can be done with:
oend = max(Orders.date)
Day.Quantity = sum(Orders.Quantity)
Day.Margin = Day.Quantity * (latest(Orders.Price) - latest(BackOrders.Price))
where date >= oend - 28 // last 4 weeks
show linechart "Daily gross margin{$}" with sum(Day.Margin)
In this section, we have barely scratched the surface of the data visualizations that can be produced with Lokad. Gathering KPIs that really matter for your specific business takes time and effort. However, since you cannot optimize what you haven’t measured, this is typically a profitable investment.
Stock-Driven Strategies
In order to achieve high service levels, a retailer needs inventory. In order to have just enough inventory, future demand needs to be anticipated, but the anticipation always comes with a degree of error. One of the goals of a smart pricing strategy is to orient the demand toward items that need to be sold the most.
If an item is heading for a near-certain stockout, there is no need to keep the price highly competitive, as this only hastens the stockout. Instead, it is better to gradually increase the price as the stockout becomes imminent. Indeed, since the stockout is almost certain to happen, some customers will not be serviced anyway. If the price is slightly higher, the item will still be heading for a stockout, but a better gross margin is obtained for each remaining unit. Moreover, the few clients who might be turned away by the higher price might purchase a substitute that does not suffer from an understock situation, hence improving the overall sales volume as well.
Let us illustrate how such a strategy can be implemented. Let us assume that future deliveries are provided through a Deliveries stream, where some deliveries are positioned in the future. We also have the usual Orders stream that represents the sales history, and the StockOnHand vector that represents the stock currently available.
oend = max(Orders.date)
where date > oend
DaysToCover = min(Deliveries.Date) - oend default LeadTime // days until the next delivery
where date >= oend - 28
SalesLevel = sum(Orders.Quantity) / 28 // 4 weeks average
where SalesLevel > 0 and DaysToCover > 0
Ratio = StockOnHand / (DaysToCover * SalesLevel)
IsUnderstocked = Ratio < 0.3
where IsUnderstocked
Price = Price * 1.2 // 20% price increase on understocked items
The Boolean vector IsUnderstocked is true when the current stock covers less than 30% of the expected sales - calculated with a four-week daily average - until the next delivery. For these items, the price is increased by 20% in the example above.
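The flat 20% increase is only a starting point. In the spirit of gradually raising the price as the stockout becomes imminent, the increase can also be made proportional to the severity of the understock. One possible sketch, reusing the Ratio computed above (the slope of the increase is an arbitrary choice to be tuned per business):
where SalesLevel > 0 and DaysToCover > 0
Ratio = StockOnHand / (DaysToCover * SalesLevel)
where Ratio < 0.5
Price = Price * (1 + 0.4 * (0.5 - Ratio)) // from +0% at Ratio = 0.5 up to +20% as the stock approaches zero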
In addition, the previous approach can be reversed in order to lower the price of the dead inventory, that is, the stock that is moving so slowly that there is a risk of it never being sold.
oend = max(Orders.date)
where date >= oend - 180 // 6 months
// 'default 1' to avoid dividing by zero
SixMonthSales = sum(Orders.Quantity) default 1
Ratio = StockOnHand / SixMonthSales
IsDeadInventory = Ratio >= 2 // more than 1 year of inventory
where IsDeadInventory
Price = Price * 0.7 // 30% price decrease on dead inventory
In the script above, we declare that if the stock on hand is greater than or equal to twice the quantities sold over the last six months, then the inventory is considered dead. Obviously, depending on the business, six months could be too long or too short, but with a couple of numerical adjustments, this script can typically be used to identify and reprice dead inventory.
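Before actually cutting prices, it can also be useful to list the dead-inventory candidates for manual review. A minimal sketch reusing the vectors defined above:
where IsDeadInventory
show table "Dead inventory candidates" with Id, StockOnHand, SixMonthSales, Price order by Ratio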
Behavioral Strategies
In this section, we review a couple of strategies that revolve around the behavior of clients, that is, patterns found in the act of purchasing itself or related to the act of purchasing. A good pricing system can be adapted to better fit the behavior of the clients, and, to some extent, slightly alter their behavior. Such strategies tend to be highly specific to the domain considered, but we will review a couple of strategies that can typically be adapted for most businesses below.
First-time purchases are special because clients are usually still feeling uncertain about the service. Hence, in order to grow the base of recurring clients, it is important to aggressively price the items that trigger the bulk of the first-time purchases. Let us assume that the event stream Orders contains two columns: one column, named Client, contains the client identifier, and the Amount column contains the total amount of the line. The selection of the top 100 biggest drivers of first-time purchases among all items can be performed with:
// first-time order only
where rankd(-Orders.Date, Orders.Client) == 1
FirstTime = sum(Orders.Amount)
// top 100 biggest first-time drivers
where rank(FirstTime) < 100
Price = 123 // insert actual pricing here
The use of the function rankd might seem a bit obscure. The second argument should be read as a group by argument. Thus, for each client, we rank their dates of purchase starting from the earliest one - hence the minus sign in front of Orders.Date. Finally, we preserve only the order lines flagged with a rank equal to one, that is, the first-time purchases. Once we have filtered out all the lines that are not first-time purchases, it becomes straightforward to compute the FirstTime totals for every item, and to select the top 100 sellers among them.
In the previous script, we computed the all-time biggest first-purchase drivers. However, it might not make sense to look at orders placed years ago. Hence, the script can be adjusted to include only the first-time purchases made over the last four weeks with:
// first-time order only
where rankd(-Orders.Date, Orders.Client) == 1
oend = max(Orders.Date)
where date >= oend - 28 // 4-week window
FirstTime = sum(Orders.Amount)
where rank(FirstTime) < 100 // top 100 biggest first-time drivers
Price = 123 // insert actual pricing here
Similar selections can be implemented to isolate items that:
- Generate many clicks through Google AdWords, but do not yield enough conversions. Low conversion rates can be explained by inferior product descriptions and missing reassurance elements, but they are also frequently a signal of noncompetitive prices relative to competitors that may or may not have been identified.
- Generate a lot more support phone calls. When products are defective or just too complicated, people tend to turn to the retailer for support. The cost can be unreasonably high considering the usual margin levels. Hence, for such items, it is better to make every sale worthwhile.
- Generate a lot more returns. Indeed, returns can hurt your business not only due to their cost, but also due to the fact that the merchant might have lost an otherwise loyal customer through a bad service experience. If an item generates more frequent returns, a higher price might be a more interesting alternative to just delisting the item.
The main challenge in implementing these strategies is the consolidation of all the relevant data. As an illustration, a sketch for the returns case is given below.
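A minimal sketch for the returns case, assuming a hypothetical Returns event stream with a Quantity column that mirrors the Orders stream:
oend = max(Orders.date)
where date >= oend - 180 // last 6 months
Sold = sum(Orders.Quantity) default 1 // 'default 1' to avoid dividing by zero
Returned = sum(Returns.Quantity) default 0
ReturnRate = Returned / Sold
where ReturnRate > 0.15 // threshold to be adjusted per business
Price = Price * 1.1 // make every sale of a frequently returned item worthwhile
The threshold and the 10% increase are arbitrary placeholders; in practice, the repricing should reflect the actual cost of processing a return.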
Competitor-Driven Strategies
Over the years, the automated retrieval of all the prices of online competitors has matured as a technology, and nowadays many software companies deliver pricing intelligence as a service for minimal costs. Open source packages such as Scrapy are even available to speed up in-house implementations. However, while gaining access to the prices of the competition is certainly a good thing, making the most of those prices is even better.
The most basic strategy - and the most aggressive too - consists in simply undercutting the prices of a particular competitor. Let us assume that the list of prices observed on a competitor’s site is collected in a flat file containing three columns, Id, Date, and Price, for, respectively, the identifier of the product being observed, the date of the observation, and the price observed. This file is loaded as an event stream named Contoso, with the time-vector Contoso.Price containing the competitor’s prices. The following strategy undercuts all the competitor prices by 25 cents:
CPrice = last(Contoso.Price) default -1 // '-1' means 'no observed price'
where CPrice > 0 // undercutting only when prices are observed
Price = CPrice - 0.25
The aggregator last is used to associate each item with the latest matching observation in the Contoso event stream. Then, when undercutting the prices, we take care to restrict the scope to the items where a competitor price has actually been observed. Indeed, no matter how good the competitive intelligence technology is, the competitor might simply not sell all the items offered by the merchant.
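For the items that the competitor does not sell, a fallback price is still needed. A minimal sketch combining the cost-plus strategy introduced earlier with the undercutting logic:
CPrice = last(Contoso.Price) default -1 // '-1' means 'no observed price'
Price = PurchasePrice * 1.3 // fallback: plain cost-plus pricing
where CPrice > 0
Price = CPrice - 0.25 // undercutting whenever a competitor price is available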
In addition, when undercutting prices, it is important to ensure that the observed prices are not too old. The above script can give a false sense of confidence regarding the competitive intelligence accuracy because all prices may have been observed at some point in the past. In the following script, we adopt a more restrictive approach where observed prices must be less than seven days old in order to be included in the undercutting strategy:
oend = max(Orders.date)
where date >= oend - 7 // keeping 1 week of price observations
CPrice = last(Contoso.Price) default -1
where CPrice > 0
Price = CPrice - 0.25
Analyzing the prices of a competitor can also enable you to identify sourcing problems. For example, if a competitor has a retail price lower than your purchase price - assuming the competitor is not selling at a loss - this probably indicates a poorly negotiated sourcing deal. The following script identifies the worst offenders:
CPrice = last(Contoso.Price) default -1
// purchase price greater than the observed competitor price
where PurchasePrice > CPrice and CPrice > 0
show table "Source problems" with Id, Price, CPrice
When such situations exist, it is particularly difficult to prioritize the sourcing problems, because the existing sales volumes are very likely to be extremely low. This occurs not because the demand is low, but simply because a very small fraction of the demand is heading your way. Web traffic, when available, offers a better prioritization signal, as sketched below.
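A minimal sketch, assuming the daily aggregated pageviews mentioned earlier are loaded as a hypothetical PageViews event stream with a Quantity column:
oend = max(Orders.date)
where date >= oend - 28 // last 4 weeks of traffic
V = sum(PageViews.Quantity) // pageviews per item
CPrice = last(Contoso.Price) default -1
where PurchasePrice > CPrice and CPrice > 0
show table "Sourcing problems by traffic" with Id, PurchasePrice, CPrice, V order by V
The pageview column makes it possible to tackle first the offending items that customers actually look at, rather than relying on sales volumes that are depressed by the very problem being investigated.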
Aligning all the prices just below the price of the competitors is very aggressive and frequently unsustainable. Hence, more realistically, the merchant needs to identify the set of articles where prices are going to be very competitive, and those where this will not be the case.
One approach consists in leveraging the notion of units of need. We refer to a unit of need as a subset of the items where all the items serving the same need (substitutes) are put together. Within a unit of need, items are usually differentiated by their brand (or lack of brand), which is regarded as a sign of quality. When such units of need are made available, instead of trying to compete on every single item, the merchant only competes on the item with the lowest purchase price within each unit of need.
Let us assume that a vector UnitOfNeed is available as input data; the following script illustrates such a strategy:
CPrice = last(Contoso.Price) default -1
where CPrice > 0
CPriceByNeed = min(CPrice) by UnitOfNeed // minimum price
// lowest purchase price ranked first
PPriceRank = rank(-PurchasePrice, UnitOfNeed)
where PPriceRank == 1
Price = CPriceByNeed - 0.25 // undercutting by 25 cents
Above, we took the lowest price observed on the competitor’s site per unit of need. Then, within each unit of need, we selected the item with the lowest purchase price - that is, the best candidate for very aggressive pricing - and, for this one candidate, we undercut the competitor’s price by 25 cents.
Another strategy employed in retail consists in maximizing the undercutting coverage while preserving the margins as much as possible. This strategy is frequently employed in food retail chains because clients usually do not cherry pick their purchases from multiple chains: all the weekly grocery purchases are usually made at a single place, typically a supermarket or hypermarket. Thus, the intent is to maximize the perception of being the most aggressive merchant when clients benchmark prices somewhat randomly. Obviously, the products picked by clients to establish price comparisons are never really random, but assumptions have to be made because usually this data is simply not available.
oend = max(Orders.date)
where date >= oend - 30
// total quantity sold over the last 30 days
L30 = sum(Orders.Quantity)
CPrice = last(Contoso.Price) default -1 // '-1' means 'no observed price'
costPlus = 1.3 // default strategy, agnostic of competition
CuttingCost = L30 * (PurchasePrice * costPlus - CPrice) // margin given up by undercutting
// Prioritized list of items
show table "By cutting cost" with Id, Name order by CuttingCost
where rank(-CuttingCost) < 1234 // lowest undercutting costs first
Price = CPrice - 0.25 // undercutting by 25 cents
In the script above, we are computing CuttingCost as a very rough approximation of the cost of aligning the prices below those of the competition. This calculation relies on multiple assumptions. First, we are considering that whenever there is no competitor price, prices are aligned with a gross margin of 30% (cost-plus strategy). Any alternative pricing strategy would have worked; cost-plus was chosen here merely for the sake of simplicity. Second, it is assumed that the demand would stay unchanged after the price change. The applicability of this assumption really depends on the business being considered. For food retail chains, this is frequently a reasonable assumption because clients will not significantly alter their basket because of marginal price changes. However, the merchant can erode its client base if prices are perceived as too high overall. Third, all items are weighted linearly against the quantities being sold. This is a bit simplistic because some items have a much stronger psychological impact on clients. A more realistic approach would put some extra weight on items identified as particularly sensitive from a client viewpoint, as sketched below.
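A minimal sketch of such a weighting, assuming a hypothetical numeric vector Sensitivity provided as input (1 for ordinary items, 2 or more for items that clients are likely to benchmark):
Weight = L30 * Sensitivity // sensitive items count more in the prioritization
CuttingCost = Weight * (PurchasePrice * costPlus - CPrice)
where rank(-CuttingCost) < 1234 // lowest undercutting costs first
Price = CPrice - 0.25 // undercutting by 25 cents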