Addressing the Big Data Skills Gap

Big Data Product Marketing Manager, IBM

“Dammit, Jim. I’m an analyst, not a data scientist.”

You may not work on the Starship Enterprise, but chances are you’ve heard something like this in your office. Everybody’s talking about the growing big data skills gap. Take the latest Harvard Business Review article, for instance. In a recent survey of senior business and technology leaders at Fortune 500 companies and federal agencies, the authors found that 85 percent had funded big data initiatives on the horizon. That’s not surprising, given the massive amount of buzz big data has garnered. Yet what keeps these leaders up at night is how they’re going to staff these initiatives. The U.S. Bureau of Labor Statistics predicts there will be a 24 percent increase in demand for professionals with data analytics skills over the next eight years.

Finding good talent is always important, so why is there a sudden onslaught of data scientist job requirements? And why is it difficult, if not impossible, to find enough data scientists? More importantly, what are we going to do about it?

Closing the big data talent gap requires tackling the problem from both sides: the people and the technology. Let’s start with the people. Sure, adequately training the data scientists of tomorrow is an obvious and necessary step. But what about the non-data scientists out there? It’s a lot like math. Not everyone’s going to be a mathematician, but everybody needs to learn it.

Disclaimer: I am a math nerd. It was my favorite subject growing up, I was president of my high school math club, and in fact, one particularly lonely summer after moving across the country, I actually took sample actuarial exams to pass the time.

There’s a reason math is a key component of the K-12 curriculum. It’s not because we’re trying to create a sea of mathematicians to come up with new theorems one day. It’s because a solid math foundation teaches you a way of thinking, of approaching problems, of using your brain in different ways. It gives you a perspective that you can then apply, whether knowingly or not, to other aspects of your life.

And so it is with big data. Big data is not a fad, and it’s not just something that a handful of data scientists need to understand. The big data world is the world in which we live now, in which businesses operate. And if we’re going to solve big data problems – the kind of problems that even a few years ago were unthinkable – then doesn’t it stand to reason that we need to think in new ways? Let’s not limit the discovery and learning within this world to a select group that will go on to boast a sexy new job title.

New curriculum prepares students of all majors for big data careers

Look at what’s happening at the University of Montana (UM). Recently, IBM, TerraEchos and UM collaborated on the school’s first analytics curriculum designed to teach students how to analyze big data to tackle complex challenges, including how to use smart meter data to help consumers use energy more efficiently; weather and sensor data to respond faster to forest fires; and consumer sentiment from social networks for smarter digital marketing.

This course isn’t just for data scientists. It’s designed to attract a broad range of students – from business to marketing and mathematics to computer science – and gives them access to IBM’s unique InfoSphere Streams software, curriculum materials, project-focused case studies, and IBM thought leaders as guest speakers. Students learn the mix of technical and problem-solving skills they need to prepare for big data careers in today’s competitive job market.

This new analytics course will help non-technical students learn how to use big data analytics technology, while showing technical students how to look at business and government challenges.

“The collaboration between IBM, TerraEchos, and The University of Montana is a creative and definitive step in educating students for today’s workforce needs. Introducing our students at the undergraduate level to the concepts of big data is innovative education at its best,” said Royce C. Engstrom, president of the University of Montana.

As part of its Academic Initiative, a program that offers colleges and universities access to the latest advances in technology and business industry expertise, IBM is working with more than 200 schools around the world to create and strengthen analytics curricula.

Big data technology for everyone

Now what about the technology side? What can we do to make the technology more accessible to the people? If companies are saying that they don’t have the in-house skills to do something with big data, then doesn’t that imply that the existing big data technologies are just too complicated? What if we could make them easier to use?

This thought was behind many of the new features in the latest releases, InfoSphere Streams 3.0 and InfoSphere BigInsights 2.0, released on November 16, 2012. InfoSphere BigInsights, IBM’s Hadoop-based offering, enhances this open source technology to withstand the demands of your enterprise, adding administrative, workflow, provisioning, and security features, along with best-in-class analytical capabilities from IBM Research. InfoSphere Streams software can analyze and share data in motion, allowing for sub-millisecond decision making in environments where thousands of decisions can be made every second.

Both products are part of IBM’s big data platform and each boasts a list of new features that promote ease of use, allowing users to leverage big data without having to develop new skills. Whether you’re a business analyst, developer, administrator or data scientist, you can unlock the value within your data, bringing all relevant data together for analysis and eliminating silos.

BigInsights now comes with new visualization, application and monitoring tools that allow not only data scientists but all users to develop, maintain and launch big data applications without having to code. A centralized dashboard and integration with IBM InfoSphere Data Explorer, formerly Vivisimo Velocity, allow users to visualize their data to gain insights, view analytic application results and monitor metrics. Similarly, with the new Streams graphical editor, users can now build applications simply by dragging and dropping operators, with changes synced automatically on the back end.

Three new application accelerators (Social Data, which comes in both Streams and BigInsights; Machine Data, which comes in BigInsights; and Telco, which comes in Streams) improve time-to-value for big data deployments, leveraging IBM experience and best practices around implementation of a given use case.

What can you do now?

If you haven’t had a chance to explore these new product releases, I encourage you to do the following three things:

  1. Tune in on Tuesday, December 11 for a webinar with John Choi, IBM Director of Product Management for Big Data, to learn more. The replay will be available at the same link for six months.

  2. Register on to start learning about big data technologies. Like the more than 50,000 students already registered, you can sign up for courses and learn at your own pace.

  3. Visit the IBM Big Data YouTube channel to watch a series of big data tutorials, hear customer stories and learn about the products.
