Getting the right mix of analytics specialists in data science teams
Data science is a team sport that involves specialists with complementary skills and aptitudes. It’s not a playground for mythical do-it-all unicorns who have mastered all the necessary skills and knowledge.
Successful data science initiatives depend on effective team collaboration, and data science teams require the right mix of individuals who have diverse aptitudes, skills and roles. These teams are open to a wide range of specialists, including business analysts, data-driven application developers, data engineers and statistical modelers. Increasingly, they also incorporate fresh blood, which may include self-taught citizen data scientists and expertise from data science crowdsourcing communities. For additional details, take a glimpse inside the mind of a data scientist.
Open data science collaborations drive the creation of high-quality predictive, machine-learning and other advanced analytics models and insights. The best collaboration environments enable data science professionals to flexibly share feedback, guidance, ideas, models, requests, samples and templates on diverse projects. These environments should offer simple, on-demand access to advanced tools for statistical modeling, data engineering, development and other key tasks. The collaboration environment helps simplify and accelerate many tasks ranging from up-front data discovery and acquisition to downstream data wrangling, exploration, governance, modeling and refinement in the lifecycle of any data science initiative. We can measure the effectiveness of a next-generation data science collaboration environment by the extent to which it enables teams to achieve several goals:
- Produce, refine and deploy a far wider range of machine-learning and other statistical models and applications more rapidly than ever
- Develop these artifacts in a wide range of tools and languages
- Design a greater number of models than before, each incorporating highly complex feature engineering and a wide range of predictors
- Construct these models from much larger and more diversified libraries of algorithms
- Train and score the models on large volumes and varieties of data more rapidly
- Accelerate data acquisition, transformation and preparation in a highly automated fashion
- Deploy models into a much broader range of business applications more rapidly and efficiently
- Easily and securely track, manage and control all of these assets at every step in their lifecycles
Now available on IBM Cloud in open beta, the IBM Data Science Experience (DSX) provides a flexible, collaborative environment for data science teams. DSX offers a unified environment for data scientists and other analytics developers. They can connect with one another while accessing project dashboards and learning resources; forking and sharing projects; exchanging development assets including data sets, models, projects, tutorials and Jupyter notebooks; and sharing results. Follow-on releases to DSX will support comments, user profiles, data science competitions, Zeppelin notebooks and real-time collaboration.
The opportunity to get involved
To boost your data science team’s collaborative potential, join DSX. And if you’re a working data scientist, data engineer or data application developer, register to attend the IBM DataFirst Launch Event. It takes place on 27 September 2016 in New York, New York, and gives you the opportunity to engage with leaders and practitioners in the open source community. Learn how to accelerate your processes for putting data to work in your burgeoning cognitive business.