Rich Data Sources for Abundant Innovation

Data scientists are poised to exploit emerging data sources that can revolutionize healthcare

Current healthcare data warehouses contain large volumes of data, perhaps even millions of rows of tabular data, and hospital data may also be linked to pharmacy and physician data. Most of that data comes from internal sources and is relatively easy to collect. From a data scientist’s perspective, having lots of data is great; however, strong predictor variables are even more valuable than additional rows. External data sources contain many strong predictor variables, but they are currently challenging to collect. Fortunately, progress is being made toward making these rich data sources available to data warehouses.

There are two categories of newly available, potential data sources: real-time data and external data. Access to and use of both data types can contribute significantly to improved patient outcomes, which in turn helps reduce costs and improve the quality of healthcare for society.


Data feeds from monitors and tracking devices

Real-time data sources include digital activity trackers, which physical fitness enthusiasts have used for close to 15 years. Brand names include Fitbit, Jawbone Up, and BodyMedia. These wearable devices track activity and other biometrics: location through the Global Positioning System (GPS), heart rate, skin temperature, perspiration, calorie expenditure, and quality of sleep. Many of these trackers are worn on the wrist and are Bluetooth-enabled, which allows the data to be downloaded to a smartphone or computer in real time.

Activity tracking can also be used to assist elderly patients who want to live independently, allowing their adult children or caregivers to monitor the patient’s health from afar. For example, a statistically significant deviation in heart rate, temperature, or activity level will alert an adult daughter to a potential problem, prompting her to check in with her elderly mother. Thus, activity trackers can encourage taking proactive steps before a medical crisis takes place, and in this example help an elderly mother avoid an emergency hospital procedure and the accompanying cost.
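As a rough illustration of how such an alert might be triggered, the sketch below flags a reading that falls far outside a person's own recent baseline by using a simple z-score. The function name, the threshold of three standard deviations, and the sample heart rates are all assumptions made for this example; real monitoring products may use quite different methods.

```python
from statistics import mean, stdev


def is_significant_deviation(baseline, reading, z_threshold=3.0):
    """Return True if `reading` lies more than `z_threshold` sample
    standard deviations away from the mean of `baseline`.

    Hypothetical helper for illustration only; the threshold is an
    assumed value, not a clinical standard.
    """
    if len(baseline) < 2:
        raise ValueError("need at least two baseline readings")
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # Perfectly flat baseline: any change at all counts as a deviation.
        return reading != mu
    return abs(reading - mu) / sigma > z_threshold


# Made-up resting heart rates (beats per minute) from previous days.
resting_heart_rates = [62, 64, 61, 63, 65, 62, 64, 63]

if is_significant_deviation(resting_heart_rates, 95):
    print("Alert: heart rate is far outside the usual range")
```

Comparing each reading against the wearer's own history, rather than a population-wide range, is one way to keep alerts personal: a heart rate that is routine for one person may be unusual for another.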

Large technology companies such as Apple are also getting into the game. The yet-to-be-released Apple iWatch is rumored to monitor harmful ultraviolet (UV) rays, heart rate, and blood-oxygen levels, to use optoelectronics to measure the oxygen saturation of hemoglobin, and to monitor the sound of blood flow through arteries.

Programmable pacemakers have been around for years, and even more advanced and smaller surgically embedded devices are currently being developed. These new devices are expected to be embedded into blood vessels, lungs, the brain, and possibly other organs to track and monitor patients’ vital measures such as temperature, oxygen levels, blood flow, and blood pressure in real time. The data will be collected with Bluetooth-like transmissions to be analyzed on smartphones, and a text message can be sent or a phone call can be made to a primary care physician when something is amiss.

Physicians are now entering their notes on tablets or laptops, making this electronic data available in real time. Doctor notes are a rich source of in-depth, detailed information about the nuances of a procedure or visit. Text-mining tools, such as those available from IBM, that extract key words and phrases will be important methods of collecting the information embedded in these types of data sources.
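To give a feel for what extracting key words from a note involves, here is a minimal sketch that counts the most frequent non-trivial words in a piece of free text. It is not how any commercial text-mining product works; the stop-word list, the function name, and the sample note are invented for this example.

```python
import re
from collections import Counter

# A tiny, assumed stop-word list; real tools use far larger ones.
STOP_WORDS = {"the", "and", "with", "for", "was", "has", "not", "patient"}


def top_terms(note, n=3):
    """Return up to `n` of the most frequent words in `note`, ignoring
    stop words and words shorter than three letters.

    Hypothetical helper for illustration only.
    """
    words = re.findall(r"[a-z]+", note.lower())
    counts = Counter(
        w for w in words if w not in STOP_WORDS and len(w) > 2
    )
    return [term for term, _ in counts.most_common(n)]


# Made-up note text.
note = "Patient reports chest pain. Chest pain worsens with exertion. No fever."
print(top_terms(note, 2))
```

Plain word counts ignore context such as negation ("No fever"), which is one reason dedicated text-mining tools are needed for clinical notes.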


Social data for sentiment communication

External data sources such as social media channels—Twitter, Facebook, Yelp, and LinkedIn—are potentially important data sources that can be used by hospitals to monitor and ascertain their standing in the community and public perception. Every hospital has something that needs to be fixed or improved upon, and although no one likes to read bad reviews, the insights that may be revealed through social media data sources can provide extremely valuable information.

Negative reviews may expose areas where improvements can be made and give organizations an edge, especially in a healthcare climate in which hospitals and clinics are striving and competing to raise levels of care. Voluntary patient surveys at admission or at discharge are also good data sources. They help healthcare practitioners ascertain the challenges and limitations that may affect patients’ overall health and recovery outcomes, and acting on them can help decrease the likelihood of costly readmissions.

Population demographics from sources such as the US Census Bureau also provide valuable background information about the population that resides within driving distance of a local hospital. Trends in income, demographics, unemployment levels, and so on can give organizations a perspective on potential future changes and help them stay agile in meeting the needs of the community.


Data warehouse transformation for healthcare

Data warehouses of the future are expected to continue to process batch-generated internal data stored in relational tables. However, they are also likely to expand to encompass real-time and external data that data scientists can capitalize on to build accurate, predictive, and innovative models.
