Capacity Problems That Hinder Analytics Progress

See four common ways in which talent or resource limitations can encumber big data and advanced analytics

Solution CTO, IBM

After engaging with a couple hundred customers in 2014, I think now is a good time to share some observations about common capacity problems that can limit results or block forward progress. These capacity challenges occur at the intersection of big data, advanced analytics, and organizational dynamics, and they can make innovation difficult or impossible if not addressed.

Four types of capacity problems

There are four common types of capacity problems in today’s organizations and enterprises. Capacity problems are gaps, limitations, or restrictions that surface in predictable ways and can be categorized across client organizations. They include insufficient environments, data manipulation shortcomings, not enough analytics fingers on the keyboard, and absent or underdeveloped data science skills.

Insufficient environments

Client organizations commonly lack sufficient space or high-performance data management environments for analytics processing. Examples of this capacity constraint include rationed time for enterprise data warehouse (EDW) jobs and an inability to run computationally expensive jobs, such as analytics involving full table scans or processor-intensive machine-learning models. Other examples include data that has been overly summarized to limit disk space consumption, and difficulty pulling in third-party data because of rigid schema restrictions.

Data manipulation shortcomings

Client organizations looking for insight from analytics processing tend to be challenged by use cases that combine multiple data sets. These organizations often have data silos that contain the answers they need, but they cannot combine the data effectively, primarily because the data analysis teams do not have access to the siloed data. In other cases, the teams do not cooperate well, or the project has historically been so prolonged and cost-ineffective that it never reached completion because it required new environments, schema changes, or internal chargebacks for the work.

Not enough analytics fingers on the keyboard

Some client organizations have data environments, can manipulate data, can effectively control projects, and may even know what they want to do with the data. However, they may lack the time or the resources to get it done. Simply stated, they have limited hours in the day to devote to analytics, or not enough analyst fingers on keyboards. Typically, they have more worthy projects than resources to address them. Projects with identified return on investment (ROI) go unaddressed because of a lack of skilled people, and when that happens, innovation and corporate learning as a strategy do not progress as they should.

Absent analytics and data science skills

The absent analytics and data science skills challenge splits along company size, with large enterprises on one side and midtier firms on the other. Very large organizations generally have data science initiatives or teams doing work they believe in, and they have adequate competency to carry out that work, but they need help with the first three capacity problems discussed previously. Midtier and small firms commonly struggle with skill set limitations because they often lack the budgets or on-staff skills to run data science projects. Large firms may be challenged by getting locked into specific practices, but at least they are moving forward and engaging, while midtier firms typically need help with how, where, and when to even think about getting started.

The next step toward resolution

How are these capacity problems addressed? That topic is best left for another article that dives into how to use advanced methodologies to get past these capacity problems. In the meantime, please share any thoughts or questions in the comments.
