Going Beyond Data Science Toward an Analytics Ecosystem: Part 2
Discover requirements for successful big data analytics and the makeup of a core analytics ecosystem
A successful big data analytics program requires a number of ingredients and calls for a comprehensive skillset in analytics, business, and technical areas. Part 1 of this series introduced the emerging role of the data scientist and listed some of the challenges organizations may need to contend with when they consider implementing a data scientist role. This installment will look at how big data analytics programs can be successful and take a deeper dive into analytics ecosystems. A recent comprehensive survey of organizations that have successfully implemented big data analytics identified nine levers that together enable organizations to create value from an ever-growing volume of disparate data:
- Culture: Availability and use of data and analytics within an organization
- Data: Structure and formality of the organization’s data governance and security
- Expertise: Development of and access to data management and analytics skills and capabilities
- Funding: Financial rigor in the analytics funding process
- Measurement: Evaluating the impact on business outcomes
- Platform: Integrated capabilities delivered by hardware and software
- Source of value: Actions and decisions that generate results
- Sponsorship: Executive support and involvement
- Trust: Organizational confidence
These levers handle many aspects of the program including strategy, architecture, training, data stewardship, and governance. They should link an organization’s objectives with the successful execution of its big data analytics program. Of course, there is no one-size-fits-all structure, process, or technology; it all depends on the analytics maturity, strategy, culture, and the needs of the organization.
Big data analytics ecosystem
Big data analytics touches many functions, groups, and people in organizations. Its application may begin as an experiment, but as it evolves it can have a profound impact across the organization, its customers, its partners, and even its business model. A successful big data analytics program requires many interacting elements. It requires data, of course, which has to be integrated from many sources; different types of analysis and skills to generate insights; and active stakeholders who need to collaborate effectively to act on the insights generated. In addition, assorted technologies are involved in the tools, applications, and infrastructure.
Because of these interdependencies and their constant evolution, an effective approach to introducing, establishing, and nurturing a big data analytics capability is to view its participants, components, and environment as an ecosystem. An ecosystem is a network of interconnected and interdependent entities (see Figure 1). Knowledge from biology, ecology, and emerging business ecosystem practices [9] provides lessons on how to introduce, sustain, and evolve a thriving ecosystem.
A big data analytics ecosystem contains individuals and groups, including business and technical teams with multiple skillsets, business partners, and customers, along with internal and external data, tools, software, and infrastructure. Furthermore, an organization can be viewed within a larger data ecosystem that consists of other organizations and entities sharing and exchanging data to generate economic value. Although exhaustively enumerating all possible roles and components needed for a successful analytics program is outside the scope of this article, an outline of the key elements helps in establishing a big data analytics capability.
Figure 1. A thriving big data analytics ecosystem facilitates collaboration among core, extended, and external ecosystems
Three interacting spheres define the analytics ecosystem:
- Core ecosystem: Individuals and technologies assemble the data that is required, analyze the data to generate insights, and determine actions based on these insights to achieve business outcomes.
- Extended ecosystem: Individuals, groups, and systems direct the analytics projects, collaborate with the core team, provide raw data, consume the outputs, and act on the insights.
- External ecosystem: Customers, business partners, vendors, data providers, and consumers interact with the organization to help deliver the full potential of big data goals.
Core analytics ecosystem
The core analytics ecosystem consists of the main roles and technologies needed to introduce and sustain an analytics capability. The core ecosystem does not imply a separate organizational unit. It can be organized in a number of configurations depending on the business needs: centralized, decentralized, or some variation of the two. The key criteria are the visibility of the value of analytics and the ability to respond to needs across the entire organization. One popular arrangement is the Center of Excellence (CoE) model. The CoE takes care of common areas such as training, introduction of tools, innovation, and communication among analytics stakeholders. It can also showcase business results to entice different parts of the organization to adopt analytics and encourage reuse of common patterns.
The following brief descriptions define the key roles in the core analytics ecosystem. This list is not exhaustive: other supporting roles, such as administrator and project manager, would also be necessary, and a single individual can perform multiple roles.
Chief analytics officer
A key objective is to move organizations from discovery and experimentation with analytics to a systematic, pervasive application across different areas and groups with measurable business value. One way of achieving this objective is by making the value visible through an analytics champion. A chief analytics officer (CAO) can personify the shift toward a data-driven mindset and can be the person business users go to for insights [10]. The CAO leads the core analytics ecosystem and coordinates resources to ensure analytics is used to deliver the desired business outcomes. The role can be seen as the culmination and blend of the other three roles described here.
Data scientist and/or analysts
While avoiding excessive and exclusive focus on the role of the data scientist here, this discussion in no way diminishes the critical importance of mathematical and statistical expertise in gaining insight from data and unlocking its economic value. A data scientist applies advanced data mining and analytics techniques to data to uncover hidden patterns that can be exploited to achieve desired business outcomes. However, these skills can also be provided by a combination of existing staff enhanced by training and augmented by external consultants and modern analytics tools.
Tools such as the IBM® SPSS® Modeler predictive analytics platform can automatically perform some of the tedious but necessary tasks such as data preparation and the generation of associative models. In the initial days of big data analytics, data scientists had to develop code to perform these monotonous tasks. Now, using a modern big data analytics platform and tools, they can significantly reduce this effort.
Different functions of data science can be performed by other complementary roles supported by second-generation big data analytics tools (see Figure 2). For example, subject-matter experts (SMEs) and business users can leverage modern tools to experiment and discover how data can help them with business problems. These tools would be connected to data on a platform that integrates data from internal and external sources.
Figure 2. Second-generation tools supporting collaborative efforts that achieve traditional data science objectives cost-effectively and sustainably
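To make the "associative models" mentioned above more concrete, the following is a minimal Python sketch of the kind of code a data scientist might once have written by hand: it counts item pairs in a set of transactions and reports the support and confidence of each rule. The function name `pair_rules`, the sample transactions, and the 0.5 support threshold are all illustrative assumptions, and the sketch is not meant to represent how SPSS Modeler or any other product builds its models.

```python
from collections import Counter
from itertools import combinations


def pair_rules(transactions, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent item pairs."""
    n = len(transactions)
    # Count how many transactions contain each item and each unordered pair.
    item_counts = Counter(item for t in transactions for item in set(t))
    pair_counts = Counter(
        pair for t in transactions for pair in combinations(sorted(set(t)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n  # share of transactions containing both items
        if support < min_support:
            continue
        # Confidence of "a -> b" is P(b | a); each pair yields two directed rules.
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


# Hypothetical market-basket data, for illustration only.
transactions = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
rules = pair_rules(transactions)
for antecedent, consequent, support, confidence in rules:
    print(f"{antecedent} -> {consequent}: support={support:.2f}, confidence={confidence:.2f}")
```

Even this toy version shows why automation matters: the counting and filtering steps are mechanical, and they grow quickly as the number of items and transactions increases.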
Analytics architect
Analytics architects leverage the heritage of business solution architecture, which links strategy, business objectives, and constraints and develops a viable plan for execution. A key responsibility for this role is to operationalize analytics beyond discovery and experimentation. Operationalizing analytics means that insights that are discovered, perhaps in a snapshot data extract, can be implemented using a live data feed from operational data sources. Conversely, an analytics architect helps ensure insights are acted upon by feeding these insights back to enhance the business processes. This cycle of analysis, insight, and action has to be performed in a responsive, reliable, and scalable fashion. It requires understanding not only data and analytical models but also the operational systems, applications, business processes, and infrastructure.
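The move from a snapshot extract to a live feed described above can be illustrated with a small Python sketch. It assumes a deliberately simple "insight" (a threshold of the snapshot mean plus two standard deviations) and a hypothetical stream of numeric readings; the names `Insight`, `discover_insight`, and `operationalize` are invented for this example and do not refer to any particular product or API.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean, stdev
from typing import Iterable, Iterator


@dataclass
class Insight:
    """A rule discovered offline: flag values far above the snapshot average."""
    threshold: float


def discover_insight(snapshot: list[float], k: float = 2.0) -> Insight:
    # Discovery phase: derive a threshold from a static data extract.
    return Insight(threshold=mean(snapshot) + k * stdev(snapshot))


def operationalize(insight: Insight, live_feed: Iterable[float]) -> Iterator[tuple[float, str]]:
    # Operational phase: apply the same rule to each record as it arrives
    # and emit an action that a business process could consume.
    for value in live_feed:
        action = "review" if value > insight.threshold else "none"
        yield value, action


# Hypothetical snapshot extract and live readings, for illustration only.
snapshot = [100.0, 102.0, 98.0, 101.0, 99.0]
insight = discover_insight(snapshot)
actions = list(operationalize(insight, iter([100.0, 150.0, 97.0])))
print(actions)
```

Keeping discovery and application as separate steps mirrors the architect's concern: the rule is learned once from the extract, while the operational step must run reliably against whatever data source feeds it.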
Data steward
The data steward is responsible for ensuring the quality, integrity, and governance of data. Data stewards confirm that policies and procedures concerning the acquisition, access, dissemination, and disposition of data are in place. Doing so includes understanding security, confidentiality, and privacy aspects; any legal considerations of different data types; and the lineage of the data, that is, where it came from.
The core analytics ecosystem also contains the data repository, which may include a traditional enterprise data warehouse (EDW) in addition to big data and Apache Hadoop-style data stores, along with discovery and analysis tools and the required infrastructure.
Part 3 concludes this series with the details of extended and external analytics ecosystems.
[9] The Death of Competition: Leadership and Strategy in the Age of Business Ecosystems, by James F. Moore, Harper Business, April 1996.
[10] “The Emerging Role of the Chief Analytics Officer,” by Andrew Foo, IBM Data magazine, May 2013.