Delivering Minimum Viable Analytics

Analytics tool development can shirk elegance as a design goal to hasten insight and tool enhancement

A consultant at a small firm, who had some industry experience and had learned a lot about delivering analytical solutions for large enterprises, presented results to an internal IT group at a large retailer. When the presentation ended, a member of the retailer’s IT team said, “Sure, it works, but it’s not very elegant.”

This attitude is one of the primary reasons large enterprises have difficulties implementing analytics, especially big data analytics. To be clear, the use of elegant in this context is shorthand for perfect and perhaps overengineered. Although great strides have been made in adopting agile development techniques, the common refrain, “tell us what you want, and we’ll go off and build it,” remains.

Compounding the problem, many enterprises are entrenched in their current business model and loathe experimentation. That is not the way to do analytics. Many years of experience suggest that very few internal project teams at client organizations can describe exactly what they want at the beginning of an engagement. There are some exceptions in which the business goal is simple and the technical goal is difficult (for example, predicting revenue for the next quarter), but these situations are in the minority. The majority of good analytics projects involve collaboration among the business, IT, and analytics functions.

 

Staying ahead of the big data tide

Big data makes this problem even worse. Enterprises are collecting rising volumes of data on their operations from social media and from sensor networks such as radio frequency identification (RFID) and store cameras. Executives want to use this data to improve their operations and increase revenue through monetization.

With ever-growing data and the ability to rationalize data across data silos, there are more opportunities than there are resources. Most analytics solutions cannot afford to have elegance as a design goal. This statement might be a bit controversial. Analytics practitioners are professionals, and deliberately arguing for inelegant solutions seems counterintuitive. Yet too many analytics efforts have failed because the analytics techniques were too sophisticated for the quality of the data.

Consider a forecasting system designed with a tight deadline. The original project plan was to deliver an analytics system with improved functionality and accuracy over the current forecasting system. However, as the project progressed, the team started to work toward an elegant solution. The project scope grew increasingly aggressive, with data coming from multiple systems and a statistical modeling algorithm that allowed forecasting at different product and geographic hierarchy levels.

The project team was confident that the new system would be a marked improvement over the current forecasting process. However, pulling all of the data together across the different hierarchies took more time than they anticipated. When the data problems were sorted out, they discovered that the modeling algorithm couldn’t handle both the product and geographic hierarchies. Ultimately, the project was significantly delayed, which hurt the business financially and caused the analytics team to lose credibility.

 

Keeping the design goal simple

To address the problem of elegance in analytics, organizations can adopt what could be called “minimum viable analytics.” This relatively simple idea combines the concept of iteration that is central to good analytics practice with the idea of a minimum viable product that comes out of lean start-up principles. Although the concept was not identified by this name at the time, it comes from a discussion with Bill Franks, chief analytics officer (CAO) at Teradata. The idea works particularly well when the deliverable is an analytics tool.

Consider the forecasting example mentioned previously. The first iteration should deliver the simplest forecasting tool possible. This can probably be accomplished by using trend, seasonality, or other time-series methods at the highest level of the product or geographic hierarchy. The second iteration may use more sophisticated algorithms with economic variables, or expand the analysis to multiple levels of the hierarchies. However, when the second iteration is delivered, it should keep all the functionality of the first one.
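As a rough illustration of what a first iteration might look like, the sketch below fits a straight-line trend to a single top-level series and adds an average seasonal offset for each position in the season. It is only one of many possible simple methods, not a description of any particular project. The function name `seasonal_trend_forecast` and its parameters are hypothetical, chosen for this example.

```python
from statistics import mean


def seasonal_trend_forecast(history, season_length, horizon):
    """Forecast `horizon` future periods for one top-level series.

    Hypothetical helper for illustration: a least-squares linear trend
    plus the average deviation from that trend at each seasonal position.
    """
    n = len(history)
    if n < 2 * season_length:
        raise ValueError("need at least two full seasons of history")

    # Fit a least-squares line through the points (0, y0), (1, y1), ...
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar

    # Average residual from the trend line for each position in the season.
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, history)]
    seasonal = [mean(residuals[p::season_length]) for p in range(season_length)]

    # Extend the trend forward and add back the seasonal offset.
    return [
        intercept + slope * t + seasonal[t % season_length]
        for t in range(n, n + horizon)
    ]
```

A call such as `seasonal_trend_forecast(quarterly_sales, season_length=4, horizon=4)` would produce the next four quarters for a hypothetical `quarterly_sales` list. Later iterations can add functions alongside this one rather than replace it, which keeps the first iteration's functionality available.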

In addition, the analytics team should keep tracking the accuracy of the simple forecasting algorithm while more sophisticated algorithms are being developed, so that each new iteration can be compared against a known baseline. Subsequent iterations should expand the analytics system by adding variables across hierarchies and by using more sophisticated modeling algorithms. As input variables and algorithms are added, the individual forecasts can be combined to provide a range of forecasts. Many studies of ensemble methods indicate that combining prediction algorithms can produce good results (see References).
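The tracking and combining described above can be sketched in a few lines. This is a minimal example under stated assumptions: accuracy is measured with mean absolute percentage error (one common choice, not one named in this article), and forecasts are combined with an unweighted average. The function names `mape` and `combine_forecasts` and the model labels are hypothetical.

```python
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is nonzero.
    """
    errors = [abs((a - f) / a) for a, f in zip(actuals, forecasts)]
    return 100 * sum(errors) / len(errors)


def combine_forecasts(forecasts_by_model):
    """Return a (low, average, high) tuple for each forecast period.

    `forecasts_by_model` maps a model label to its list of forecasts;
    all lists are assumed to cover the same periods in the same order.
    """
    combined = []
    for period_values in zip(*forecasts_by_model.values()):
        low = min(period_values)
        high = max(period_values)
        average = sum(period_values) / len(period_values)
        combined.append((low, average, high))
    return combined
```

Computing `mape` for the simple model after every period gives a running baseline, and passing both models' outputs to `combine_forecasts` yields the range of forecasts for each period. An unweighted average is the simplest combination; weighting models by their tracked accuracy is a possible refinement.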

 

Offering a short path to insight

The key to minimum viable analytics is to deliver the iterations on a short timeline, preferably on the order of weeks. By launching a tool that isn’t elegant or perfect, the organization procuring the analytics tool can start generating insights sooner, and it can also uncover problems with the assumptions used to build the tool.

Please share any thoughts or questions about this combination of analytics and lean start-up principles in the comments.

 

References

“Measures of Diversity in Classifier Ensembles and Their Relationship with the Ensemble Accuracy,” by Ludmila I. Kuncheva and Christopher J. Whitaker, Machine Learning, May 2003.
“Popular Ensemble Methods: An Empirical Study,” by Richard Maclin and David Opitz, Cornell University Library, arXiv.org, article-id: 1106.0257, June 2011.
“An Overview of Ensemble Methods for Binary Classifiers in Multi-class Problems: Experimental Study on One-vs.-One and One-vs.-All Schemes,” by Mikel Galar et al., Pattern Recognition, volume 44, issue 8, August 2011.

 
