Sorry, data isn’t really the new oil
Listening to the claim that data is the new oil, one would assume that all e-commerce companies are making money hand over fist and all physical retail chains are contemplating shutting down.
Everyone, from an analyst at a venture capital firm to a student in a B-school, would tell you that an e-commerce company can figure out if you have a baby at home (because you are ordering diapers) and can use this knowledge to get you to buy other baby products. Statements like these are common and sound cool until you dig deeper.
The tacit assumption behind a statement like this is that you are currently buying these products from some other store because you didn’t know that this e-commerce platform stocks them; that you took the pains to visit another store but didn’t bother to even search for them on the platform. Or, perhaps, that you as a parent were unaware that your baby even needed these products, and would buy them if only they were shown to you.
Reality check
Let us take an e-commerce company like BigBasket or Amazon. The amount of data that each of these companies has about its customers is mind-boggling. Every click on its app or website is tracked through sophisticated tools employing big-data frameworks.
The exact details of every click and every purchase are captured and analysed for purchase patterns, brand preferences, lifestyle choices, price sensitivity and other elements of the consumer’s persona. Surrogate data for family income, even if only partially indicative, in the form of a preference for premium brands or a residential address, is also available to these companies.
The idea is to use this vast trove of customer data to produce insights that would enhance retention and get a higher share of the customer wallet. One of the mechanisms of getting a higher share of the customer wallet is by making ‘purchase recommendations’ based on the insights.
On an e-commerce platform like BigBasket, when you scroll down the list of ‘frequently purchased items’ to place your order, a couple of items are ‘recommended’ for you. Typically, the number of recommendations is around 5% of the items in the frequently purchased list. And the success rate, defined as the per cent of recommended items actually purchased by the customer, is around 2%. In other words, the increase in the order value because of recommendations is about a tenth of a per cent (2% of 5%).
Therefore, if your basket size is a thousand rupees, all that this data crunching and insights engine is achieving is to increase it by a rupee. Doing anything that increases the basket value of a customer is perfectly understandable as long as the cost of doing it is insignificant. Hence, it is not a bad idea to make a small one-time investment to build a recommendation engine, but making a big noise about how crunching big-data can transform your business, at least in this context, is a bit far-fetched.
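For readers who want to check the arithmetic, here is a minimal sketch in Python. The 5%, 2% and thousand-rupee figures are the ones quoted above; the function name `recommendation_uplift` is made up for illustration, and the calculation assumes (as the shortcut above implicitly does) that a recommended item costs about as much as the average item already in the basket.

```python
from fractions import Fraction


def recommendation_uplift(basket_value, recommended_share, success_rate):
    """Extra order value attributable to recommendations.

    Assumption (not stated in the article): a recommended item costs
    about as much as the average item already in the basket.
    """
    return basket_value * recommended_share * success_rate


# Figures quoted in the article; Fractions keep the arithmetic exact.
recommended_share = Fraction(5, 100)  # recommendations as a share of listed items
success_rate = Fraction(2, 100)       # share of recommendations actually bought
basket_value = 1000                   # basket size in rupees

uplift_share = recommended_share * success_rate
uplift_rupees = recommendation_uplift(basket_value, recommended_share, success_rate)

print(f"Uplift as a share of order value: {float(uplift_share):.1%}")
print(f"Uplift on a Rs {basket_value} basket: Rs {float(uplift_rupees):.2f}")
```

Plugging in a different basket size or success rate shows how small the number stays: even a tenfold improvement in the success rate would add only about ten rupees to a thousand-rupee basket under the same assumption.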
Just to clarify, the founders of BigBasket are smart and understand this well, unlike some others who either blindly believe that the value of monetizable insights is proportional to the amount of data or use this argument to impress themselves and their investors.
The recommendations work a little better in the context of books, where the likelihood of a customer buying an extra book based on the recommendations is higher. However, even Amazon seldom talks about this, and instead spends all its energy on the three fundamental drivers of its business, namely increasing the assortment, offering lower prices, and making quicker deliveries. Amazon clearly believes that these are three things that customers would always care about, and to remain relevant it needs to have a razor-sharp focus on improving these every day.
DMart, an offline retail chain, is hugely profitable and has a market capitalization of $30 billion. Every e-commerce company, big or small, has several hundred times more data about its customers than DMart has about its own. Crunching big data does not seem to be helping these companies, given that they are burning big money. What DMart did really well, like Amazon, was to make its strategic choices wisely, based on a deep understanding of its target customers, and then execute on them ruthlessly without being distracted by fancy notions.
Similarly, in the world of taxi aggregation, both Ola and Uber had tonnes of customer data—their home and office locations, most frequently visited places, the frequency of use, their willingness to pay surge prices, etc. But this data may not have helped them, and both the aggregators slid down rapidly on customer experience. This created the space for a new player.
BluSmart’s success lay in doing the basics right by its target customer group. Like DMart, BigBasket and Amazon, BluSmart too made its strategic choices wisely and executed on them well. If there was any degree of personalization using data, it was really minimal and not core to its success. Both Ola and Uber, on the other hand, somewhere along the road, forgot the pain points of their target customers and instead focused on mindless scaling and on devising algorithms that could price rides based on a customer’s ability to pay (using signals such as a dying phone battery, the pickup point, rainy weather, the drop-off address, etc.).
Smart companies have always stayed, and will always stay, focused on deeply understanding the needs of their target customers at an aggregate level and on doing all the right things (both in terms of strategy and execution) to keep them absolutely delighted. Any personalization is just the garnishing on the salad.
Confusing the garnishing for the salad was the fatal mistake that many online companies ended up making, and will continue to make as long as there are takers.
Monetization models
Of late, there is increasing scepticism of business models that treat users as the product, by offering a free service that customers/users would otherwise not pay for, with the hope that their data could someday be monetized.
All monetization eventually boils down to either advertising income or interest income (through lending).
The ad-income model has created some wildly successful companies such as Facebook and Google. Amazon has also monetized its customer base to generate a decent income. The truth though is that companies like Google and Facebook are somewhat of an exception and a rarity. Building a business with the hope of monetizing, à la Google or Facebook, is extremely risky and naive. All other platforms with a customer base (or reader base) have struggled to earn ad-income. Most readers tend to skip ads, and the effectiveness of algorithms that drive the real-time placement of ads is highly questionable. There is also a growing realization that the only beneficiaries of Facebook and Google ads are Facebook and Google.
The ability to personalize ads is questionable. This writer has come across many friends and colleagues who continue to be highly amused by the jobs that LinkedIn keeps recommending for them based on its interpretation of their profiles and online activity. The recommendations don’t come anywhere near what they would be interested in. And this is what a reputed social media platform churns out, one that has access to some of the best tech talent in the world and the ability to capture every ‘like and comment’ of a customer. This is not a comment on the quality of LinkedIn’s recommendation engine as much as on the inherent limitations in creating really meaningful insights from large troves of data created by crawling customer activity and profiles.
The business model of most fintech companies hinges on being able to evaluate the creditworthiness of borrowers accurately and quickly. The belief is that it would lower defaults. Successful lending has always been a trade-off between not lending to good borrowers (because of some wrong red flag) and lending to bad borrowers (because no red flag came up). Will algorithms do a better and quicker job of managing this trade-off? Only time will tell.
The real question is whether algorithmic credit assessment is replacing humans because it is better at the job, or because people with the right skills are in short supply at the right price point. And to make matters more difficult, fintech companies don’t have access to low-cost funds and borrow at high rates from banks and NBFCs.
Data has its uses
All this is not to say that data is not helpful. John Snow, an English doctor, used the power of data to pinpoint the source of cholera in London in the mid-19th century. There are hundreds of similar examples.
Data analysis with the help of algorithms has been used to create alerts on all kinds of fraud, but assessing whether there is actual fraud needs human intervention and investigation. E-commerce companies, too, have deployed algorithms to detect some common fraud patterns. For instance, there is an alert if someone orders a product in bulk, because this is often a kirana store owner pretending to be a retail customer, ordering a product that is being sold at a discount on the platform in order to resell it at MRP (maximum retail price).
Data analytics is an extremely evolved science, the outcome of applying knowledge at the intersection of statistics and computing power to solve several complex problems. For example, it has been extremely helpful in interpreting images, from medical scans to pictures of remote parts of our universe. The whole science of image recognition through machine learning relies on the power of data crunching.
The ability to spot useful patterns and signals by the use of data has never been under question. What is under question is the ability to derive significant monetizable insights by crunching big data.
In conclusion
Crunching big data is somewhat akin to creating better image resolution. But when this is applied to business, it has to stand the test of the universal yardstick for evaluating the effectiveness of any tool or technique, namely, the impact it can create on the top line or bottom line. Unless the enhanced resolution results in the recognition of new patterns that were not discernible at lower resolution with less data, there is no advantage in crunching this humongous data. And even if you assume that some additional patterns do show up, there is the non-trivial problem of monetizing them.
This is where the universal Pareto principle kicks in: 80% of patterns are evident with 20% of the data. Beyond this is the valley of severely diminishing returns. When you have a hammer in your hand, everything looks like a nail. In this case, the hammer is computing power.
Nothing can substitute for a deep understanding of your target group of customers and good execution.
Someone wise once said that when there is a gold rush, the ones who make money are not the gold diggers but the ones selling shovels. And ironically, it is the gold diggers who always make the most noise about how the power of the new shovels would make them all very rich.
When there is a rush to create and monetize customer data, the ones who make money are not the companies that wish to monetize their customer data but the ones selling computing capacity.
The science of thermodynamics is based on the premise that everything that matters about a gas can be understood without having to crunch data on the positions and velocities of the individual molecules.
The day quantum computing becomes a reality it won’t be surprising if sellers of quantum computers reinvent physics and tell us how our understanding of thermodynamics would be enhanced by measuring what we all know is unnecessary.
T.N. Hari is an author and founder of Artha School of Entrepreneurship.