Hack the weather with deep data science

Big Data Evangelist, IBM

Weather affects everything. It is both an essential resource and a significant risk factor for life on Earth. What’s more, weather is one of the most unpredictable aspects of human existence, yet we have no choice but to try our hand at predicting it. After all, our very lives depend on our having ample supplies of foodstuffs that are themselves dependent on adequate precipitation, growing seasons and so forth. Weather is no small matter.

Forecasting the weather using predictive analytics

Whereas ancient civilizations relied on shamans to make sense of the shifting winds, rains and clouds, the modern era looks to the atmospheric sciences, most notably to meteorology (for near-term weather forecasting) and climatology (for longer-term trends related to global warming, creeping desertification and the like). Indeed, atmospheric scientists have relied on predictive analytics tools for many decades, beginning in the 1950s with the use of computerized predictive models for weather and climate modeling.

Yes, what we now call “big data” has been at the heart of predictive weather analytics from the start. In every era of computing, meteorological models have greedily devoured every high-performance computing resource (storage, memory, CPU, bandwidth) thrown at them. And the fine-grained parallelizability of most atmospheric models has driven development of increasingly distributed architectures. Just look at how the availability of new observational data sources, especially satellites and radar, has continued to enhance predictive speed, accuracy and granularity.

Seeing the big data at the heart of weather forecasting

As far as prediction granularity is concerned, the precision of weather forecasts depends on several factors. One of these is the network of observation sources that supply fresh data to predictive models. As noted in the previously cited article, the Internet of Things (IoT) promises to substantially deepen this granularity. Coming advances in weather forecasting will leverage new data sources, including sensors embedded in smart cars, smart homes and drones, among many other platforms. But predictive granularity in meteorology (street by street, moment by moment) depends on the geospatial resolution of data and models.
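To make the point about geospatial resolution concrete, here is a minimal Python sketch of how readings from scattered sensors might be averaged into grid cells. It is not drawn from any IBM or TWC tooling: the function name `grid_average`, the cell sizes and the sample readings are all hypothetical, chosen only for illustration.

```python
import math
from collections import defaultdict


def grid_average(readings, cell_deg):
    """Average temperature readings within square lat/lon cells.

    readings: iterable of (lat, lon, temp_c) tuples (hypothetical sensor data).
    cell_deg: cell edge length in degrees; smaller means finer resolution.
    Returns a dict mapping (row, col) cell indices to the mean temperature.
    """
    if cell_deg <= 0:
        raise ValueError("cell_deg must be positive")
    sums = defaultdict(lambda: [0.0, 0])  # cell -> [running total, count]
    for lat, lon, temp_c in readings:
        cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        sums[cell][0] += temp_c
        sums[cell][1] += 1
    return {cell: total / count for cell, (total, count) in sums.items()}


# Made-up readings for illustration only: (latitude, longitude, temperature in C)
sample = [
    (37.7749, -122.4194, 16.0),
    (37.7755, -122.4180, 18.0),
    (37.8044, -122.2712, 21.0),
]

coarse = grid_average(sample, 1.0)   # 1-degree cells (roughly 111 km north-south)
fine = grid_average(sample, 0.01)    # 0.01-degree cells (roughly 1 km north-south)
```

With the coarse grid, all three readings collapse into a single averaged cell; with the fine grid, the two nearby sensors share one cell and the third gets its own. Finer cells preserve local differences but leave fewer observations per cell, which is one reason denser sensor networks matter for street-level forecasts.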

Before long, we may expect to see ubiquitous urban weather observation networks whose countless sensors feed real-time meteorological information into forecasting models. Indeed, Apache Spark is already playing a growing role in scientists’ efforts to increase the accuracy of weather forecasts by using IoT and other big data sources. Spark-wielding data scientists are leveraging massively parallel machine learning, streaming analytics and graph models to tune weather forecasts in real time. At the same time, Bluemix and other cloud-based computing resources are being brought to bear on forecasting challenges of staggering complexity, such as predicting the likely effects of the next El Niño cycle.

But weather does more than just affect our day-to-day lives. In fact, weather is perhaps the single largest external swing factor in business performance across many industries, being responsible for nearly half a trillion dollars in economic effects in the US alone each year. Important decisions and operations in most industries depend, to varying degrees, on weather forecasts. For example, energy and utility companies use weather data to assess the effects of temperature and humidity increases—even of just a few degrees—on power consumption. They also use weather data to assess the fiscal effects of outages caused by adverse ice, wind, tornado and hurricane events or even by common lightning strikes.

Reducing costs of weather events

Organizations everywhere are on the lookout for ways to reduce the costs associated with weather-related incidents, manage the risk posed by potential weather-related events and anticipate business opportunities. Developers, for example, need to combine weather functionality, including forecast and observational data, with numerous data and application services. Doing so lets them build advanced web and mobile applications that leverage cognitive capabilities, analytics, mobile and IoT services without spending time installing or managing software or deploying hardware.

Accordingly, IBM and The Weather Company (TWC) are co-sponsoring “Hack the Weather,” to be held September 18–20 at GalvanizeU. In this hackathon, teams of data scientists will use TWC weather data packages and IBM data science tools, including a powerful workbench and Apache Spark as a Service, to create smart data applications for meteorological forecasting in hopes of winning a grand prize of $5,000 in cash and the opportunity to present their final project to a panel of judges at IBM Insight 2015 in Las Vegas on October 27.

Do you have a solution that is innovative, relevant and functional? We look forward to seeing you at “Hack the Weather.”

Learn how you too can use Spark as a Service to hack your weather-related analytic challenges.