More data + Better models + More accurate metrics + Better approaches & architectures = Lots of room for improvement!
There are clearly massive foundational shifts taking place around big data. I am not sure how large conventional Fortune 500 firms can innovate and keep up with what’s going on. In some cases, I have run into CIOs who have not heard of Hadoop.
It’s also fascinating to see how data-driven, bleeding-edge firms like Netflix are pushing the envelope. Netflix’s stats are amazing: 1/3+ of Internet traffic (North America, peak); 100+ million hours per day; 65+ million members in 50+ countries; 500 billion events per day.
Netflix is clearly reinventing television and targeting 90 million potential subscribers in the US market alone. Binge-watching and cord-cutting are now part of our everyday lingo. What most people don’t realize is how data-driven Netflix is, from “giving viewers what they want” to “leveraging data mining to boost subscriber base”.
Viewing -> Improved Personalization -> Better Experience is the virtuous circle.
Here is a glimpse of how their BI landscape has evolved over the past five years, a period in which they have been integrating 5 million to 6 million net adds for several years running. The figures are from a presentation by Blake Irvine, Manager, Data Science and Engineering.
BI tools @ NetFlix pre-Hadoop
Facebook understands personalization. Do you? Facebook builds a custom Web page every time you visit. It pores over all the actions your friends have taken—their photos, their friends, the songs they listen to, the products they like—and determines in two-hundredths of a second which items you might wish to see, and in what order.
[SOURCE: Bloomberg Businessweek, “Facebook: The Making of 1 Billion Users,” Ashlee Vance, October 4, 2012]
A common CMO issue: digital marketing is not working. Visits are up but sales are down. Site conversion is trending down. E-mail open rates are OK but click-through rates are down. What do we do?
- Can you predict what customers want before they do?
- Can you formulate the “next best action”?
- Can offers be better targeted or timed to improve customer acquisition and conversion?
Growing the customer relationship is the perpetual challenge of all companies. To change the status quo, eBay bought Hunch to help improve its recommendation services. eBay uses Hunch’s “taste graph” technology to provide its users with non-obvious recommendations for items based on their unique tastes. eBay also applied Hunch’s technology to other areas such as search, advertising and marketing, in order to better surface product information based on its customers’ tastes.
It’s becoming a data-driven world. We are awash in data, but the problem is figuring out what we are supposed to do with it.
Data Driven Commerce & Retailing
Recommendations and promotions are most effective when you target them based on customer behavior.
Recommendation and decision engines, an area of predictive analytics and decision management, are quite active right now in the digital arena. The early online pioneer was Amazon.com, which used collaborative filtering to generate “you might also want” or “next best offer” prompts for each product bought or page visited.
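To make the “you might also want” idea concrete, here is a minimal sketch of an item-to-item recommender built on simple co-purchase counts. It is a simplification for illustration only, not Amazon’s actual algorithm; the basket data and the function names `build_cooccurrence` and `also_bought` are made up for this example.

```python
from collections import Counter, defaultdict
from itertools import permutations


def build_cooccurrence(baskets):
    """Count how often each ordered pair of distinct items shares a basket."""
    counts = defaultdict(Counter)
    for basket in baskets:
        # set() ensures an item repeated in one basket is counted once
        for a, b in permutations(set(basket), 2):
            counts[a][b] += 1
    return counts


def also_bought(counts, item, k=2):
    """Return up to k items most often bought together with `item`."""
    # Sort by descending count, then by name so ties are deterministic
    ranked = sorted(counts[item].items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical purchase history: one list per order
baskets = [
    ["pizza", "ice cream", "garlic bread"],
    ["pizza", "garlic bread"],
    ["pizza", "ice cream"],
    ["waffles", "ice cream"],
    ["pizza", "garlic bread", "waffles"],
]

counts = build_cooccurrence(baskets)
recommendations = also_bought(counts, "pizza")
print(recommendations)
```

Raw counts favor popular items, so a production system would normally normalize them (for example with a similarity measure) and add business rules on top.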
Next best action, next best offer, interaction optimization, and experience optimization all share a similar structure. A typical targeted offer analytics model is shown in the figure (source: blog.strands.com).
The premise of data driven commerce & retailing is simple:
- Acquire the right customers
- Offer the right products
- Personalize relevant offers
- Focus on the right timing and channels
To understand the impact that recommendation engines can have on sales, let’s look at a traditional brick-and-mortar firm doing direct-to-home, face-to-face selling: Schwan Food.
Schwan Food – The Business Problem
The Schwan Food Company is a multibillion-dollar, privately owned company with 17,000 employees in the United States. Based in Marshall, Minnesota, Schwan sells frozen foods from home-delivery trucks, in grocery-store freezers, by mail and to the food service industry. Schwan produces, markets, and distributes products developed under brands such as Schwan’s, Red Baron, Freschetta, Tony’s, Mrs. Smith’s, Edwards, Pagoda Express and many others.
Schwan’s Home Service, the company’s flagship business unit, is the largest direct-to-home food delivery provider in the United States. Sales are done door-to-door by 6,000 roving sales people who deliver frozen products to homes of three million customers across the country.
Schwan home sales were listless for four straight years, beset by high customer churn and inventory pileups. So the challenge was: How to spark sales? How to get an uplift of 3-4%?
At the point of customer contact, Schwan wanted to personalize the experience. The goal was to dig deep into customer data, generate insights and engage customers in innovative ways.
What are the primary drivers of sales? Schwan realized that recommending products that fit a customer’s profile, purchase history and interests creates higher revenue potential for cross-sell and up-sell.
The challenge was to overhaul the crude recommendation program that existed. Most firms like Schwan provide the sales team with data from the SAP back-end. Most of this data is stale and not dynamic. For instance, salespeople could look at six weeks of orders and suggest purchases from that list.
To completely overhaul the recommendation engine, Schwan began an analytics project with Opera Solutions.
The analytics project took it into more sophisticated territory: matching seemingly disparate customers who had similar purchase patterns in their past. Opera calls this finding “genetic twins.” It also added ways to track whether customers’ spending was fading from certain categories (say, breakfast foods) and offered product suggestions and discounts to keep the spending intact.
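One plausible way to picture the “genetic twins” idea is nearest-neighbor matching on per-category spend, plus a simple check for fading categories. The sketch below is an assumption for illustration, not Opera’s method: the customer IDs, categories, spend figures, the cosine-similarity choice and the 50% drop threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length spend vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_twin(target, spend):
    """Return the other customer whose spend pattern is most similar."""
    others = (c for c in spend if c != target)
    return max(others, key=lambda c: cosine(spend[target], spend[c]))


def fading_categories(prior, recent, drop=0.5):
    """Categories where recent spend fell below `drop` times prior spend."""
    return [cat for cat, amount in prior.items()
            if amount > 0 and recent.get(cat, 0.0) < drop * amount]


CATEGORIES = ["breakfast", "pizza", "dessert", "seafood"]

# Hypothetical spend per category, in the order of CATEGORIES
spend = {
    "cust_a": [40.0, 10.0, 5.0, 0.0],
    "cust_b": [38.0, 12.0, 6.0, 20.0],
    "cust_c": [0.0, 50.0, 30.0, 0.0],
}

twin = nearest_twin("cust_a", spend)

# Suggest categories the twin buys that the target has never bought
suggestions = [cat for cat, mine, theirs
               in zip(CATEGORIES, spend["cust_a"], spend[twin])
               if mine == 0 and theirs > 0]

fading = fading_categories(
    prior={"breakfast": 40.0, "pizza": 10.0},
    recent={"breakfast": 12.0, "pizza": 11.0},
)
print(twin, suggestions, fading)
```

Cosine similarity compares the shape of spending rather than its size, which is why a light buyer and a heavy buyer with the same mix can still come out as twins.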
How does this work? At the core of a recommendation engine is predictive modeling. This identifies and mathematically represents underlying relationships in historical data in order to explain the data and make predictions or classifications about future events.
Predictive models analyze current and historical data on individuals to produce easily understood metrics such as scores. These scores rank-order individuals by likely future behavior, e.g., their likelihood of responding to a particular offer.
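As a rough illustration of scores that rank-order individuals, the sketch below applies a logistic scoring function to a few customers and sorts them by likelihood of responding to an offer. The feature names, weights and bias are invented for the example; in practice they would be fitted to historical response data.

```python
import math

# Hypothetical model coefficients (would normally be learned from data)
WEIGHTS = {
    "orders_last_90d": 0.4,
    "bought_category_before": 1.2,
    "weeks_since_last_order": -0.3,
}
BIAS = -1.0


def response_score(features):
    """Logistic score in (0, 1): estimated likelihood of responding."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical customers and their current feature values
customers = {
    "cust_a": {"orders_last_90d": 5, "bought_category_before": 1,
               "weeks_since_last_order": 1},
    "cust_b": {"orders_last_90d": 1, "bought_category_before": 0,
               "weeks_since_last_order": 8},
    "cust_c": {"orders_last_90d": 3, "bought_category_before": 1,
               "weeks_since_last_order": 4},
}

# Rank-order customers from most to least likely to respond
ranked = sorted(customers, key=lambda c: response_score(customers[c]),
                reverse=True)

for name in ranked:
    print(name, round(response_score(customers[name]), 3))
```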
Schwan’s database is now pushing out more than 1.2 million dynamically-generated customer recommendations every day, sent directly to drivers’ handheld devices. Opera says Schwan’s revenues are up 3% to 4% because of it.
It would be interesting to see the correlation between Schwan’s customer satisfaction scores and shopping basket mix with recommendations versus non-recommendations.
Netflix Real-Time Recommendation
The Netflix movie recommendation contest (won by blending different statistical and machine-learning techniques) has been widely followed because its crowdsourcing lessons could extend beyond improving movie picks. The contest was built around Netflix’s Cinematch recommendation solution, a huge data set (100+ million movie ratings) and the challenges of large-scale predictive modeling.
Netflix’s overview of the competition:
We’re quite curious, really. To the tune of one million dollars.
Netflix is all about connecting people to the movies they love. To help customers find those movies, we’ve developed our world-class movie recommendation system: Cinematch℠. Its job is to predict whether someone will enjoy a movie based on how much they liked or disliked other movies. We use those predictions to make personal movie recommendations based on each customer’s unique tastes. And while Cinematch is doing pretty well, it can always be made better.
Now there are a lot of interesting alternative approaches to how Cinematch works that we haven’t tried. Some are described in the literature, some aren’t. We’re curious whether any of these can beat Cinematch by making better predictions. Because, frankly, if there is a much better approach it could make a big difference to our customers and our business.
So, we thought we’d make a contest out of finding the answer. It’s “easy” really. We provide you with a lot of anonymous rating data, and a prediction accuracy bar that is 10% better than what Cinematch can do on the same training data set. (Accuracy is a measurement of how closely predicted ratings of movies match subsequent actual ratings.) If you develop a system that we judge most beats that bar on the qualifying test set we provide, you get serious money and the bragging rights. But (and you knew there would be a catch, right?) only if you share your method with us and describe to the world how you did it and why it works.
Serious money demands a serious bar. We suspect the 10% improvement is pretty tough, but we also think there is a good chance it can be achieved. It may take months; it might take years. So to keep things interesting, in addition to the Grand Prize, we’re also offering a $50,000 Progress Prize each year the contest runs. It goes to the team whose system we judge shows the most improvement over the previous year’s best accuracy bar on the same qualifying test set. No improvement, no prize. And like the Grand Prize, to win you’ll need to share your method with us and describe it for the world.
Netflix announcement of winner:
It is our great honor to announce the $1M Grand Prize winner of the Netflix Prize contest as team BellKor’s Pragmatic Chaos for their verified submission on July 26, 2009 at 18:18:28 UTC, achieving the winning RMSE of 0.8567 on the test subset. This represents a 10.06% improvement over Cinematch’s score on the test subset at the start of the contest.
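For readers unfamiliar with the metric, RMSE (root mean squared error) measures how far predicted ratings are from actual ratings, so lower is better. The sketch below computes it on a tiny made-up set of ratings and then back-calculates the Cinematch baseline implied by the two figures in the announcement (0.8567 and 10.06%). The toy ratings and the function names are assumptions for illustration.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(predicted))


def improvement_pct(baseline_rmse, new_rmse):
    """Percentage reduction in RMSE relative to a baseline."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse


# Toy example: four predicted ratings vs. the ratings actually given
predicted = [3.5, 4.0, 2.0, 5.0]
actual = [4, 4, 1, 5]
toy_rmse = rmse(predicted, actual)

# Baseline implied by the announcement: new = baseline * (1 - 0.1006)
winning_rmse = 0.8567
implied_baseline = winning_rmse / (1 - 0.1006)

print(round(toy_rmse, 4))
print(round(implied_baseline, 4))
print(round(improvement_pct(implied_baseline, winning_rmse), 2))
```

The implied baseline comes out at roughly 0.9525, which shows how small the absolute gap in RMSE was for a 10% relative improvement.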
Interestingly, several people find the “what your friends thought” feature to be extremely accurate in predicting and suggesting movies, more so than the recommendation feature.
Netflix announced a second recommendation contest that was later discontinued. Contestants were to be asked to model individuals’ “taste profiles,” leveraging demographic and behavioral data. The data set of 100 million entries was to include information about renters’ ages, gender, ZIP codes, genre ratings and previously chosen movies. Unlike the first challenge, the contest was to have no specific accuracy target: $500,000 would be awarded to the team in the lead after six months, and $500,000 to the leader after 18 months. The contest was cancelled in March 2010 after a legal challenge alleging that the first contest had breached customer privacy.
Building on the Netflix model, California physicians group Heritage Provider Network Inc. is offering $3 million to any person or firm who develops the best model to predict how many days a patient is likely to spend in the hospital in a year’s time. Contestants will receive “anonymized” insurance-claims data to create their models. The goal is to reduce the number of hospital visits by identifying patients who could benefit from services such as home nurse visits.
I expect to see a lot more activity around predictive recommendations as mobile technology makes it easier to influence buyers or convert prospects into customers. Also, technology like Hadoop makes it easier to build predictive insights that can be leveraged in real time.
E-mail Based Recommendations
In multichannel customer-facing business processes, marketers must continually and automatically optimize all offers and customer interactions through all channels, business processes, and touchpoints such as sales, marketing, and customer service. E-mail-based recommendation models are pretty advanced.
The same push-based recommendation model can be leveraged via e-mail (in addition to mobile handheld direct sales). Williams-Sonoma, the retailer of all things kitchen and cooking, has a database of 60M households tracking variables like income, number of children, housing values, etc. They leverage these variables in e-mail targeting programs.
Offers embedded in e-mail are tailored to the recipient at the moment they’re opened. In less than 250 milliseconds, analytics software can assemble an offer based on real-time information: data including location, age, gender, and online activity both historical and immediately preceding, along with inventory data. These offers have lifted conversion rates by as much as 30%—dramatically more than similar but uncustomized ad campaigns.
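A minimal sketch of what open-time offer assembly could look like: filter candidate offers by region and live inventory, then pick the one that best matches the recipient’s recent activity. This is an assumed, rule-based simplification rather than any vendor’s implementation; the field names, SKUs and scoring rule are hypothetical.

```python
def pick_offer(context, offers, inventory):
    """Choose the in-stock, region-eligible offer that best fits the recipient."""
    eligible = [
        offer for offer in offers
        if inventory.get(offer["sku"], 0) > 0
        and offer["region"] in (context["region"], "any")
    ]
    if not eligible:
        return None
    # Score = number of offer interests matching recent browsing categories
    return max(eligible,
               key=lambda o: len(o["interests"] & context["recent_categories"]))


# Hypothetical context captured at the moment the e-mail is opened
context = {"region": "MN", "recent_categories": {"cookware", "knives"}}

offers = [
    {"sku": "pan-10", "region": "any", "interests": {"cookware"}},
    {"sku": "knife-set", "region": "MN", "interests": {"knives", "cookware"}},
    {"sku": "grill-kit", "region": "CA", "interests": {"grilling"}},
    {"sku": "knife-block", "region": "any", "interests": {"knives", "cookware"}},
]

# Hypothetical inventory: the knife set is sold out
inventory = {"pan-10": 12, "knife-set": 0, "grill-kit": 5, "knife-block": 3}

chosen = pick_offer(context, offers, inventory)
print(chosen["sku"] if chosen else "no eligible offer")
```

Keeping the decision to a filter plus a cheap score is one way to stay inside a tight latency budget; heavier models can be scored in advance and looked up at open time.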
Targeting customers with perfectly customized recommendations at the right moment across the right channel is sales and marketing’s holy grail. As the ability to capture and analyze highly granular data improves, such recommendations are possible.
Perfecting these “next best product recommendation” models involves four steps: defining sales and marketing objectives; gathering detailed primary or secondary data about your customers, your products, and the contextual prompts that influence customers to buy; using data analytics and business rules to devise and execute offers; and applying the lessons learned to refine later offers.
As the amount of data that can be captured grows and the number of channels for interaction proliferates, companies that are not providing recommendations to influence buyers will only fall further behind.
Notes (and Interesting Factoids)
- A recommendation engine generates tailored, context-sensitive recommendations to guide decisions and actions taken by humans, automated systems, or a combination thereof. For Recommendation Engines background: http://en.wikipedia.org/wiki/Recommender_system
- In the late 1990s, predictive recommendations were created by Amazon and other online companies that developed “people who bought this also bought that” offers based on relatively simple cross-purchase correlations; they didn’t depend on substantial knowledge of the customer or product attributes.
- For more on Opera Solutions’ work at Schwan’s, see Dennis Berman’s article in the Wall Street Journal, “So, What’s Your Algorithm?”
- Additional Insights that can improve Sales Effectiveness
• What are the characteristics of my most loyal customers? Least loyal?
• How do customers feel about our company and products?
• Which items drive sales? Which items are frequently purchased together?
• If I discount an item by X, what impact will it have on sales and revenue?
• How do my internet sales compare to brick and mortar in terms of revenue and cost?
• Which prospects should I target to convert into loyal customers? What products or offers would be most effective?
• Will my inventory levels meet sales forecast? When will we run out of stock?
- Every vendor recognizes the power of data. For instance, Salesforce wants to be the center of data-driven customer strategy. To that end, the company introduced the Internet of Things Cloud @ Dreamforce 2015, which is supposed to pull in data from devices, sensors and non-IoT sources like app behavior and social streams. In Salesforce’s view, it’s all in the service of the customer, grabbing data and wrapping a rules engine around it to drive automated Next Best Offers or Actions for the customer.
- How to convert Lookers to Bookers…
- How to create unique and effective Digital Experiences that impact probability of purchase or likelihood of return.
- What offers might result in higher “take rates”
The change in consumer behavior and expectations that e-commerce, mobile and social media are causing is hugely significant – big data and predictive analytics will separate brand/retail winners from losers. This won’t happen overnight but the transformation is for real.
The retail industry makes up a sizable part of the world economy (6-7%) and covers a large ecosystem – E-commerce, Apparel, Department Stores, Discount Drugstores, Discount Retailers, Electronics, Home Improvement, Specialty Grocery, Specialty Retailers and Consumer Product Goods suppliers.
Retail is increasingly looking like a barbell – a brand-oriented cluster at the high end, a very thin middle, and a price-sensitive cluster at the low end. The consumerization of technology is putting more downward pricing pressure on an already competitive “middle” retail environment. The squeeze is coming from e-commerce and new “point, scan and analyze” technologies that give shoppers decision-making tools: powerful pricing, promotion and product information, often in real time. Applications on iPhones and Droids, like RedLaser, can scan barcodes and provide immediate price, product and cross-retailer comparisons. They can even point you to the nearest retailer who can give you free shipping (total cost of purchase optimization). This will lead to further margin erosion for retailers that compete based on price (a sizable chunk of the market in the U.S., Europe and Asia).
Data analytics is not new for retailers. Point-of-sale transactional data obtained from bar codes first appeared in the 1970s. A pack of Wrigley’s chewing gum was the first item scanned using the Universal Product Code (UPC), in a Marsh Supermarket in Troy, Ohio in 1974. Since then, retailers have been applying analytics to get even smarter and speed up the entire industry value chain.