How do you find and define your customers in a big data environment?
Every day, hundreds of articles and blog posts are written about the theoretical value of big data, but many offer no actionable specifics. At IBM Insight on October 28, we are addressing that gap by showcasing how big data, specifically InfoSphere Big Match for Hadoop, enables a single view of the customer in a Hadoop environment.
Big Match is IBM’s industry-leading Probabilistic Matching Engine (PME), reengineered to run in a Hadoop environment. This matters because 50 to 75 percent of today’s big data use cases, or those envisioned for the near future, seek to address the customer perspective of unstructured data. What counts as a “customer” in these use cases depends on the industry: consumer, business, patient, member, suspect, prospect, provider and beneficiary, to name a few.
The customer-oriented big data use cases may include:
- Identifying customers in unstructured data (such as social media, text and call center data) and linking that view of the customer to mastered customer data to deliver better service
- Identifying prospects from purchased lists and third-party data, and applying levels of confidence to these records based on the intended use
- Reporting for compliance or regulatory purposes when addressing legal or customer inquiries
- Reducing bottlenecks in data parsing and integration (from transactional databases) by loading all data into the Hadoop environment, where analysis and integration can be faster and without constraints
- Evaluating data warehouse data that may be constrained due to volume, response time or historical warehouse limitations
- Performing high-speed demographic matching on data sets containing hundreds of millions (or billions) of records
Using Big Match can yield significant efficiencies, including:
- Reusing the PME matching algorithms already proven in traditional MDM (master data management) deployments
- Gaining technical efficiencies since the IT staff will already be familiar with IBM’s MDM and PME
- Readily configuring new data sources and thresholds, if appropriate, for data linking using the skills the IT staff already has
- Making knowledge transfer a continuous priority, with self-sufficiency as the ultimate goal
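For readers who want a concrete picture of what probabilistic matching with configurable thresholds can look like, here is a minimal Python sketch. It is not IBM’s PME or the Big Match API. It uses a generic textbook approach (Fellegi-Sunter style agreement weights), and the field names, probabilities, thresholds and helper functions are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not taken from IBM's PME):
#   m = P(field agrees | both records describe the same customer)
#   u = P(field agrees | records describe different customers)
FIELD_PROBS = {
    "name": (0.95, 0.01),
    "birth_date": (0.97, 0.003),
    "zip": (0.90, 0.05),
}

# Hypothetical thresholds on the total score; in practice these would be
# tuned per data source and per intended use.
LINK_THRESHOLD = 8.0
REVIEW_THRESHOLD = 0.0


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(str(value).lower().split())


def match_score(rec_a, rec_b):
    """Sum log-likelihood-ratio weights over the fields present in both records."""
    score = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        a, b = rec_a.get(field), rec_b.get(field)
        if a is None or b is None:
            continue  # a missing value contributes no evidence either way
        if normalize(a) == normalize(b):
            score += math.log2(m / u)              # agreement adds weight
        else:
            score += math.log2((1 - m) / (1 - u))  # disagreement subtracts weight
    return score


def classify(score):
    """Map a score to a confidence band using the two thresholds."""
    if score >= LINK_THRESHOLD:
        return "link"
    if score >= REVIEW_THRESHOLD:
        return "review"
    return "no-link"


if __name__ == "__main__":
    mastered = {"name": "Jane Doe", "birth_date": "1980-04-02", "zip": "10001"}
    purchased = {"name": "jane  doe", "zip": "94105"}  # no birth date on the list
    print(classify(match_score(mastered, purchased)))
```

Raising or lowering the two thresholds is the kind of tuning described above: a purchased prospect list might tolerate a lower linking threshold for marketing use than a compliance report would.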