
Insight Ops: The road to a collaborative self-service model

IBM Fellow and VP, CTO for Information & Analytics Group, IBM
Big Data and Data Warehouse CTO, IBM

In a previous blog we discussed how to enable a highly collaborative, data-driven organization through the concepts of multi-speed or bi-modal IT. We then expanded on this with a discussion of the overall information and analytic lifecycle and its interaction with five personas across that lifecycle. You can read those blogs here:

Interestingly enough, Forrester Research recently published a report titled “The False Promise of Bimodal IT” which was referenced in an article on CIO.com.

Forrester argues that this paradigm is fundamentally a mistake because it creates a two-class system, with the implication that a slow-moving entity (IT) focuses on back office systems while a second group focuses on the fast rollout of digital products. From an organizational perspective the arguments are valid, but when we think of the bi-modal model around insights (analytics), we have a very different paradigm in mind.

When we think of a bi-modal or multi-speed model, our focus is not on the organizational construct. The focus is on how to foster a highly self-service and agile tool set for the knowledge worker, data scientist and application developer, plus a reliable and effective transition of the most promising discoveries into production, where they can provide repeatable actions and value to the enterprise. Insights could be deployed into an existing back office system or into a new business process that directly supports the digital engagement model.

Thought about in these terms, the combination of the different phases of the information and analytic lifecycle has traits very similar to the Dev Ops method used to develop and deploy assets in this cloud era. It begins with finding information, accessing that information and preparing it in a new representation, followed by model development and training, deployment, monitoring, and constant refinement and redeployment, until you arrive at a deployable analytic model capable of delivering repeatable insight. This is the “development” part of the Dev Ops method. The “operations” part is the deployment and execution of the analytics in the production systems. This is not a one-time transfer. The creator of an analytics model needs feedback from the production systems that records the effectiveness of the analytics in order to iteratively improve the model's behavior.


As we worked through how to move value higher in the stack, with a focus on the five personas and how they collaborate across the lifecycle of information and analytics, we realized that we need a Dev Ops paradigm that takes insight from development to deployment. We are naming this paradigm the “Insight Ops” model. This new model will embrace and enable an agile environment for discovery and exploration and manage the transition necessary to deploy the insight to make it actionable.

We believe that a radically simplified way of supporting this Insight Ops model requires social collaboration techniques and, more importantly, capturing information from that collaboration in a Knowledge Graph and then analyzing the graph to help guide the personas along the Insight Ops path.

http://www.ibmbigdatahub.com/sites/default/files/insightops_embed.jpg

Collaboration model

The best way to start the shift to this new model is to understand how business users and data scientists work on projects. People work in small tribes whose members collaborate with each other and leverage the crowdsourced knowledge of the communities around them. Each tribe discovers insights through iteration and conversation, as well as through known data science techniques. Individual users ask for and receive recommendations from diverse audiences and extended social networks.

We live in a world where open communication and collaboration, along with new technologies that support the broader community, are disrupting the way we think and work. That collaborative model and the knowledge of the crowd are going to have a radical impact on how we simplify and support the Insight Ops model.

The challenge today is scaling this process in an organized fashion that allows knowledge to be captured and capitalized on. Capturing subculture and tribal knowledge is key, so that users can answer questions such as:

  • Who is the person that knows this?
  • What information is available that will help me?
  • What have others who are in similar roles looked for and used?
  • What questions have those like me asked in the past?

So much valuable information ends up locked in email trails, private chats, phone conversations and the like. Ideally, answers to questions should be crowdsourced, with the business community providing the answer in the context of the data itself.

Needles in haystacks

Nearly every role in the enterprise, from the knowledge worker to application developers and even the Chief Data Officer organization, spends an inordinate amount of time trying to find information assets. These assets include corporate data, third party data, public data, machine learning algorithms, models, documents, reports, transforms, APIs and more. To interpret those information assets, users must also find the appropriate people to help them understand those assets and put them in context.

At the same time, there is a significant advantage for a business that can find the right information quickly to answer business questions and capitalize on a market opportunity. Businesses that are born in the digital marketplace and in the cloud can pop up quickly, grow quickly and steal market share. To enable this responsive business model, today’s business users must have open access to more information, be able to find appropriate information more easily in context, and be empowered to perform analytics quickly. They must be able to do so collaboratively and in a more agile way in order to compete.

So how do you meet this challenge? There are two problems we have to address:

  • How do you open up awareness of the information available while not overwhelming the end user?
  • How do you enable collaboration within and across personas?

The collaboration model is going to be the key aspect here. It will help users find not only assets but also other individuals who are working on similar problems or who have working knowledge, all within the context of the analysis being performed. It is also going to be the critical glue that helps drive the transition from the discovery and exploration phase through deployment, and across the feedback loop between the two.

When we talk about collaboration there are several critical aspects to that thought: 

  • Enable a consistent collaborative communication model across different hierarchies of users. Think of what Slack does today.
  • Capture information about the collaboration in a knowledge graph model and marry that with a metadata repository of the assets they are using.
  • Leverage analysis from that graph to help guide individuals as they are working across the lifecycle.
  • Support simple workflow models that allow transitions from persona to persona. For example, the transition from a data scientist who built a model to a data engineer or application developer who will deploy it into a business process to make it actionable.
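
To make the second and third points more concrete, here is a minimal sketch of how collaboration events might be captured as a graph and then analyzed to answer a question such as "Who is the person that knows this?". It is an illustration only, not a description of any IBM product. The `KnowledgeGraph` class, the relation names and the asset names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class KnowledgeGraph:
    """Toy collaboration graph stored as (person, relation, asset) triples."""
    edges: set = field(default_factory=set)

    def record(self, person, relation, asset):
        # Capture one collaboration event, e.g. a query, a tag or a comment.
        self.edges.add((person, relation, asset))

    def who_knows(self, asset):
        # Rank people by the number of distinct interactions with the asset,
        # breaking ties alphabetically so the result is deterministic.
        counts = defaultdict(int)
        for person, _relation, target in self.edges:
            if target == asset:
                counts[person] += 1
        return sorted(counts, key=lambda p: (-counts[p], p))


# Hypothetical events captured from a collaboration tool.
graph = KnowledgeGraph()
graph.record("mandy", "queried", "sales_2015")
graph.record("mandy", "tagged", "sales_2015")
graph.record("stephanie", "commented_on", "sales_2015")
graph.record("bill", "queried", "web_logs")

print(graph.who_knows("sales_2015"))  # ['mandy', 'stephanie']
```

A real implementation would sit on a graph store and join these edges with the metadata repository. The point of the sketch is only that every interaction becomes an edge that can later be analyzed.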

When it comes to finding that needle in the haystack, we will leverage the tribal knowledge captured through the collaboration model to provide context that narrows the search to as specific a location in the haystack as possible.

Let's give an example. A user, Bill, is in a tribe that includes six other individuals. One of those individuals, Mandy, has been collaborating with another tribe whose members have performed similar searches and recommended specific data sets for the problem they were working on. Stephanie is another member of Bill's tribe, and she has tagged some metadata that is relevant to the problem but was missed by the other tribe. By linking these chains together and using them as a filter over a broader search of the underlying metadata, the tribe is able to get more contextual knowledge of what data is relevant to the problem they are working on.
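
The example above can be sketched in a few lines of code. This is a simplified illustration under stated assumptions: only three of the tribe members are shown, the collaborator in the other tribe is given the made-up name `raj`, and the data set names and the `contextual_search` function are hypothetical.

```python
# Hypothetical collaboration data for the example.
tribe = {"bill", "mandy", "stephanie"}          # part of Bill's tribe
collaborates_with = {"mandy": {"raj"}}          # raj belongs to another tribe
endorsed = {                                    # recommended or tagged assets
    "raj": {"churn_history", "call_center_logs"},
    "stephanie": {"billing_metadata"},
}
# The broader metadata catalog that a plain search would return.
catalog = ["churn_history", "call_center_logs", "billing_metadata",
           "hr_payroll", "weather_feed"]


def contextual_search(user_tribe, catalog, collaborates_with, endorsed):
    """Keep only catalog entries endorsed by the tribe or its direct collaborators."""
    # Link the chains: tribe members plus the people they collaborate with.
    people = set(user_tribe)
    for member in user_tribe:
        people |= collaborates_with.get(member, set())
    # Collect every asset those people recommended or tagged.
    relevant = set()
    for person in people:
        relevant |= endorsed.get(person, set())
    # Use that set as a filter over the broader search, preserving catalog order.
    return [asset for asset in catalog if asset in relevant]


results = contextual_search(tribe, catalog, collaborates_with, endorsed)
print(results)  # ['churn_history', 'call_center_logs', 'billing_metadata']
```

The recommendations from the other tribe arrive through Mandy, Stephanie's tag adds what that tribe missed, and unrelated assets drop out of the result.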

Our goal is to build this contextual search into an overall "shopping for assets" experience that will significantly reduce the time taken to look for the needle.

In summary 

We strongly believe that a new model is emerging in IT, and that this new model, which we call Insight Ops, links directly to the information and analytic lifecycle we have observed in many client engagements. The bi-modal push requires a high degree of agile self-service access to information and previous analytical models in the exploration and discovery phase, as well as a smooth transition from exploration to deployment of high-value repeatable insights. Effective iteration and handoff between the teams supporting the Insight Ops lifecycle rely on a collaboration capability integrated throughout their tools. Collaboration and open metadata are going to be key to driving us to an ecosystem that allows for the repeatable transition from self-service to action.

IBM is strongly committed to driving the industry down a path to support and evolve this Insight Ops model. We will aggressively embrace open source technologies such as Spark in this quest and also drive toward open metadata and open governance models as key foundational elements to support the model.
