Sparking change: How analytics is helping global communities improve water security
Most of us are privileged enough to live in societies where clean water is a given. When our physicians and the media advise us to stay hydrated, it’s a simple health consideration. But in many countries, access to water is a huge issue – approximately one third of the world’s population does not have access to safe drinking water.
According to the World Health Organization, 1.8 billion people lack access to safe water, 2.4 billion live without adequate sanitation, and 842,000 die each year from conditions related to inadequate water supplies – that’s over 2,000 people every day.
Thankfully, organizations like Water Mission are working to give thousands of communities the help they need to build and maintain safe, sustainable water supplies for themselves. Water Mission provides water treatment and chlorination equipment, water pumps and distribution points, and even solar panels and card payment systems. They are also expanding their operations to use technologies like Apache Spark, sensors and data to help them manage the challenges they face in helping remote communities.
Water Mission’s work involves social change and incentivizing new behaviors, which is a tricky endeavor. The organization needs a good understanding not only of the systems it installs, but also – and more importantly – of the behavior of the people who use them.
To this end, the organization uses a system called PUMP to manage a huge amount of detailed information about each of its projects. The data is captured from technicians’ reports, financial transactions, and even IoT sensors embedded in the pumps and water dispensers themselves, which transmit data by satellite connection. Much of this data exists in unstructured or semi-structured formats, such as PDFs.
Until recently, Water Mission was struggling to find the best way to turn this data into actionable insight. The organization turned to IBM® jStart®, who are technical champions of emerging technologies, and who work with IBM clients and others to drive rapid business transformation.
I presented the jStart Water Mission story at Strata Hadoop San Jose 2017, and this blog post is about what I learned while preparing to tell their story.
Their first step wasn’t necessarily to find answers – it was to decide which questions to ask. This is a critical starting point for any analytics initiative, and it’s particularly challenging when your data is as complex and multi-faceted as Water Mission’s. To help explore these varied, semi-structured datasets and identify the most fruitful areas for further investigation, the jStart team recommended using a combination of IBM Analytics for Apache Spark and Jupyter Notebooks.
By making it possible to process large volumes of data quickly and to perform rapid exploratory analysis, Spark allows Water Mission and jStart to dive into even the largest and most complex datasets in the PUMP system. Spark’s distributed computing model let the jStart team load the PUMP data into the cloud and divide it across as many computational nodes as necessary to transform, parse and analyze it quickly.
Meanwhile, Jupyter Notebooks gave the data scientists on the team a powerful interface that they used to code sophisticated Spark jobs, test hypotheses, visualize the results, and share them with decision-makers in a way that is easy to understand and act upon.
On that point, at Strata Hadoop I saw a quote in the expo hall that O’Reilly Media had picked out from Mike Barlow’s book, The Changing Role of the CIO. He wrote: “Things get done only if the data we gather can inform and inspire those in a position to make a difference.”
That’s a lesson that Water Mission has really taken to heart. The results of its analytics projects are not confined to data scientists – they’re used by everyone from the program evaluation team up to the President and COO.
One good example is an analysis that revealed an unsuspected relationship between water usage and the way people were paying for their water – using prepaid cards that need to be scanned to activate the water dispenser.
The use of Spark and Jupyter Notebooks had uncovered a puzzling pattern in several communities. Although individual customers’ water consumption tended to gradually increase over time, the community’s overall usage of the water system was declining. This was a problematic trend, because to remain sustainable, each system needs to generate enough revenue to pay for its own maintenance.
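To make that pattern concrete, here is a minimal sketch in plain Python of how per-customer consumption can rise while community totals fall. The records, field names and numbers are invented for illustration; they are not taken from the PUMP system, and the team's actual analysis ran on Spark and Jupyter Notebooks rather than on code like this.

```python
from collections import defaultdict

# Hypothetical records: (month, card_id, liters dispensed). Invented data.
records = [
    ("2016-01", "A", 100), ("2016-01", "B", 100),
    ("2016-01", "C", 100), ("2016-01", "D", 100),
    ("2016-02", "A", 110), ("2016-02", "B", 110), ("2016-02", "C", 110),
    ("2016-03", "A", 120), ("2016-03", "B", 120),
]


def monthly_summary(rows):
    """Summarize total liters, active cards and liters per card by month."""
    totals = defaultdict(int)
    cards = defaultdict(set)
    for month, card_id, liters in rows:
        totals[month] += liters
        cards[month].add(card_id)
    return [
        {
            "month": month,
            "total_liters": totals[month],
            "active_cards": len(cards[month]),
            "liters_per_card": totals[month] / len(cards[month]),
        }
        for month in sorted(totals)
    ]


summary = monthly_summary(records)
for row in summary:
    print(row)
```

In this toy data, liters per active card climb every month while the community total drops, because the number of active cards is shrinking. Splitting the total into "how many cards" and "how much per card" is what points the investigation toward the cards themselves.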
According to Program Evaluation Coordinator Kristen Check: “When we analyzed the data, we realized that some of the prepaid water cards – often with several dollars of credit remaining on them – were no longer being used.”
Having just a few dollars on a water card is very significant because water refills cost just one to three pennies each. A few dollars of credit could supply a family’s water needs for a considerable period of time.
Check explained: “[The analysis] helped us work out the most likely explanation: these cards had been lost, and the customers who owned them hadn’t purchased a replacement. So, we’re now looking at ways to refine our business model to make it easier for people to replace their cards if they need to.”
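A check for cards like these could be sketched roughly as follows. This is plain Python with made-up card records; the field names, the 90-day idle threshold and the helper `find_dormant_cards` are assumptions for illustration, not part of Water Mission's system. The only figure taken from the story above is the refill price of one to three cents.

```python
from datetime import date

# Hypothetical card records; balances are in cents. Invented data.
cards = [
    {"card_id": "A", "balance_cents": 300, "last_used": date(2016, 1, 10)},
    {"card_id": "B", "balance_cents": 250, "last_used": date(2016, 6, 28)},
    {"card_id": "C", "balance_cents": 0, "last_used": date(2016, 2, 1)},
]

# Refill price range mentioned in the article: one to three cents.
MIN_PRICE_CENTS = 1
MAX_PRICE_CENTS = 3


def find_dormant_cards(cards, as_of, idle_days=90):
    """Return cards that still hold credit but have been idle too long.

    idle_days is an assumed threshold, chosen only for this example.
    """
    dormant = []
    for card in cards:
        idle = (as_of - card["last_used"]).days
        if idle >= idle_days and card["balance_cents"] > 0:
            dormant.append({
                "card_id": card["card_id"],
                "idle_days": idle,
                # Fewest refills the credit buys (at the highest price).
                "refills_left_min": card["balance_cents"] // MAX_PRICE_CENTS,
                # Most refills the credit buys (at the lowest price).
                "refills_left_max": card["balance_cents"] // MIN_PRICE_CENTS,
            })
    return dormant


dormant = find_dormant_cards(cards, as_of=date(2016, 7, 1))
for card in dormant:
    print(card)
```

Here only card A is flagged: it has sat unused for months with three dollars on it, which at one to three cents per refill is somewhere between 100 and 300 refills. A card abandoned with that much value left is more likely lost than deliberately retired.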
A second example relates to another usage pattern that Water Mission noticed when analyzing the data: customers were visiting the water dispensers more frequently than the team had expected, and were taking smaller amounts of water. On further investigation, the team realized the explanation: families were sending their children to the water dispenser, and the children can only carry relatively small and light containers – so they need to top up their supply more often.
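A pattern like this can be surfaced by profiling each card's visits. The sketch below is a rough illustration in plain Python; the visit records and field names are invented, and it is not the analysis the team actually ran.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical visits: (card_id, day, liters taken). Invented data.
visits = [
    ("A", "2016-03-01", 5), ("A", "2016-03-01", 5),
    ("A", "2016-03-02", 5), ("A", "2016-03-02", 5),
    ("B", "2016-03-01", 20), ("B", "2016-03-03", 20),
]


def visit_profile(rows):
    """Compute visits per active day and mean liters per visit for each card."""
    volumes = defaultdict(list)
    days = defaultdict(set)
    for card_id, day, liters in rows:
        volumes[card_id].append(liters)
        days[card_id].add(day)
    return {
        card_id: {
            "visits_per_active_day": len(vols) / len(days[card_id]),
            "mean_liters_per_visit": mean(vols),
        }
        for card_id, vols in volumes.items()
    }


profile = visit_profile(visits)
for card_id, stats in profile.items():
    print(card_id, stats)
```

In this toy data both cards draw the same amount on the days they are used, but card A does it in twice as many trips with a quarter of the volume each time. The numbers alone do not say why; it took follow-up in the community to learn that children were the ones fetching the water.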
Equipped with these kinds of insights, Water Mission can take action to adapt not only the technical solutions that it uses for each project, but its overall approach. Analytics is leading to innovation in business models, in community education, and in cultural change management.
As a next step, Water Mission is building analytics into its business processes in a more programmatic fashion. Data from the PUMP system is now imported daily into an alerting application, built on IBM Compose for MongoDB, which can instantly notify users if there’s a problem with a particular project. Financial transactions, water flow, outages and other types of data are also used to calculate metrics, which are then displayed in online dashboards, giving project managers the ability to monitor trends and make smarter decisions.
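The rules behind such alerts can be quite simple. The following is a hedged sketch in plain Python of what a daily check might look like; the snapshot fields, the two rules and the function `alerts_for` are all hypothetical, and the sketch leaves out the MongoDB storage and the notification step entirely.

```python
# Hypothetical daily snapshots, one per project. Invented data and fields.
daily_snapshots = [
    {"project": "site-1", "liters_dispensed": 4200,
     "revenue_cents": 9000, "maintenance_cost_cents": 6000},
    {"project": "site-2", "liters_dispensed": 0,
     "revenue_cents": 0, "maintenance_cost_cents": 5000},
    {"project": "site-3", "liters_dispensed": 1500,
     "revenue_cents": 3000, "maintenance_cost_cents": 5000},
]


def alerts_for(snapshot):
    """Apply two illustrative rules to a single project's daily snapshot."""
    alerts = []
    # Rule 1 (assumed): no water dispensed at all suggests an outage.
    if snapshot["liters_dispensed"] == 0:
        alerts.append("possible outage: no water dispensed")
    # Rule 2 (assumed): revenue should at least cover maintenance.
    if snapshot["revenue_cents"] < snapshot["maintenance_cost_cents"]:
        alerts.append("revenue below maintenance cost")
    return alerts


all_alerts = {s["project"]: alerts_for(s) for s in daily_snapshots}
for project, alerts in all_alerts.items():
    print(project, alerts)
```

The second rule reflects the sustainability requirement described earlier: each system needs to generate enough revenue to pay for its own maintenance. The same per-project numbers that trigger alerts can also feed the dashboard metrics.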
To learn more about how open source technology and the surrounding enterprise tools can help turn large, multi-faceted datasets into actionable business insight, make sure to check out this resource.