Spark takes on the big security threats

Vice President of Product, Platfora

Amid an onslaught of ever more sophisticated cyber attacks, organizations everywhere are seeking enhanced cybersecurity capabilities that can help them respond to the growing threat to data. Understandably, then, businesses in a variety of sectors are turning to big data solutions such as Hadoop for help. Indeed, Hadoop, which is designed to handle data without respect to variety or volume, is the natural choice for an organization wishing to build an enterprise security solution that can manage a wide range of risks.

Using Apache Spark, businesses can enable a workflow that encompasses the full range of assets in a Hadoop environment, thus making behavioral analysis accessible as well as iterative. Indeed, Spark can obviate the need for many different and specialized resources, instead enabling self-service solutions that put core analytical capabilities directly into the hands of business analysts. Because a Spark-powered big data discovery environment can help provide the event series analysis and advanced segmentation capabilities on which analysts rely for security analysis, a Spark environment is uniquely equipped to address two of businesses’ most formidable security challenges: mitigation of new threats as they arise and detection of advanced persistent threats.

Mitigating emerging threats

In the modern business environment, organizations put in place a wide range of security mechanisms designed to prevent security breaches. Even so, breaches can occur. When a new threat emerges, everything depends on the timing and coordination of a company’s response. However, chief information security officers (CISOs) and their teams often lack the data and tools that can help them efficiently investigate and mitigate breaches, not least because in many organizations, information is scattered among multiple systems. In such an environment, security analysts must pivot among multiple data sources (network, endpoint, user behavior data and the like) to conduct a security investigation in which they can answer questions such as the following:

  • Which internal servers make connections to internationally based servers?
  • How has a user’s pattern of access to internal resources changed over the past year or month—or week, day or hour?
  • Which users have demonstrated abnormal patterns of behavior, such as by connecting using nonstandard ports or applications?

Detective work of this kind is an iterative process, requiring analysis across multiple data sources. And although Hadoop can bring together the disparate data sources whose data bears on these questions, analysts rely on a self-service big data discovery solution powered by Spark to help them work through sequences of criteria, testing for and identifying high-risk behavior and advanced threats.
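To make the questions listed above more concrete, here is a minimal sketch of two of them expressed as filters over connection events. It is written in plain Python rather than Spark so that it stays self-contained; the same logic could be written as filter and select operations on a Spark data set. The record layout, the internal network ranges, the home country and the list of standard ports are all assumptions made for illustration, not part of any real product or schema.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


# Hypothetical event record; the field names are assumptions for illustration.
@dataclass(frozen=True)
class Connection:
    user: str
    src_ip: str
    dst_ip: str
    dst_port: int
    dst_country: str


# Assumed values: adjust to the organization's own networks and policy.
INTERNAL_NETS = [ip_network("10.0.0.0/8"), ip_network("192.168.0.0/16")]
HOME_COUNTRY = "US"
STANDARD_PORTS = {22, 53, 80, 443}


def is_internal(ip):
    """Return True if the address falls inside an assumed internal range."""
    addr = ip_address(ip)
    return any(addr in net for net in INTERNAL_NETS)


def servers_connecting_abroad(events):
    """Internal source addresses that reached an external host outside HOME_COUNTRY."""
    return sorted({
        e.src_ip
        for e in events
        if is_internal(e.src_ip)
        and not is_internal(e.dst_ip)
        and e.dst_country != HOME_COUNTRY
    })


def users_on_nonstandard_ports(events):
    """Users with at least one connection to a port outside STANDARD_PORTS."""
    return sorted({e.user for e in events if e.dst_port not in STANDARD_PORTS})


# Made-up sample events (documentation and private address ranges).
EVENTS = [
    Connection("alice", "10.0.0.5", "93.184.216.34", 443, "US"),
    Connection("bob", "10.0.0.7", "203.0.113.9", 6667, "NL"),
    Connection("carol", "192.168.1.20", "10.0.0.5", 22, "US"),
    Connection("bob", "10.0.0.7", "198.51.100.4", 443, "DE"),
]

print(servers_connecting_abroad(EVENTS))
print(users_on_nonstandard_ports(EVENTS))
```

In practice an analyst would iterate on criteria like these, tightening or combining them as each result suggests the next question.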

Detecting advanced persistent threats

An advanced persistent threat, however, is unlike other cyber attacks. Whereas some attacks aim to inflict sudden and immediate damage, advanced persistent threats remain ongoing and undetected, allowing attackers to steal data rather than inflict damage on a network or organization. Indeed, some of the biggest headline-grabbing security breaches of the new millennium have been caused by this kind of attack.

Security information and event management (SIEM) systems are floundering among ever more furious seas as they attempt, and increasingly fail, to identify and stop the full range of threats that face businesses. Unable to pivot across data sets, such solutions often lack the data or analytic capabilities with which they can understand advanced persistent threats. Instead, their capabilities are limited by the data they ingest, typically only standard security data. Thus, without the ability to reference other data sets for segmentation and pattern detection, SIEM systems often lack the context in which they could detect advanced persistent threats. Not surprisingly, as cyber attacks become ever more sophisticated and attackers deliberately sidestep the rules that SIEM systems employ to detect attacks, the security gaps left by such systems are becoming increasingly apparent.

However, analysts can use Spark-powered big data discovery solutions to detect anomalies and outliers within large data sets, laying the foundation for identification and elimination of modern security threats. When investigating an entity—a specific user, an IP address or a computer—across terabytes of data, analysts rely on visualization capabilities to help them detect such outliers. In particular, they require a macro-level picture of user or machine behavior conveyed through interactive visual charts. Using such visualizations, an analyst can quickly target groups of entities for further investigation through filtering, data point isolation and segmentation, pivoting between large event data sets as needed to discover how specific segments of an entity (such as users) have behaved amid separate events. By combining event analytics and segmentation, security analysts can thus uncover insights that help them quickly identify false alarms—or call team members to escalate their response.
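As a rough illustration of what detecting outliers among entities can mean, the sketch below flags entities whose event count sits far from the group median. It uses a modified z-score based on the median absolute deviation with the commonly used cutoff of 3.5. That technique is one possible choice made for this example; the article does not name a specific method. The entity names and counts are made up, and the code is plain Python rather than Spark so that it stays self-contained.

```python
from statistics import median
from typing import Dict, List


def flag_outliers(counts: Dict[str, float], threshold: float = 3.5) -> List[str]:
    """Return entities whose modified z-score exceeds the threshold.

    The modified z-score is 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. It is less distorted by extreme values than
    a mean and standard deviation would be.
    """
    values = list(counts.values())
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # No spread around the median, so this score cannot rank anything.
        return []
    return sorted(
        name
        for name, v in counts.items()
        if 0.6745 * abs(v - med) / mad > threshold
    )


# Hypothetical daily login counts per user (illustrative data only).
EVENT_COUNTS = {"alice": 12, "bob": 15, "carol": 11, "dave": 14, "eve": 240}

print(flag_outliers(EVENT_COUNTS))
```

A flagged entity is only a starting point: the analyst would then filter and segment the underlying events to decide whether it is a false alarm or warrants escalation.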

Gaining a long-term advantage

Prevention is undoubtedly the most effective strategy for achieving cybersecurity—as well as every other kind of security. But prevention doesn’t always work. Cyber criminals are as persistent in developing and deploying their technologies as any legitimate technology firm is. Ultimately, businesses must ground their security infrastructure in a flexible and rapidly evolving set of technologies.

Fueled by Apache Spark, big data discovery provides diverse and powerful capabilities that are designed to give organizations an edge in the ongoing battle for cybersecurity. To learn more, find out how Platfora Big Data Discovery is defining a new generation of Spark-enabled security solutions for businesses everywhere.