So what happens when we go beyond the frontiers of the data warehouse and into the world of the data lake, the world of Hadoop, of NoSQL, of schema on read, of discovering the data as is? For many organizations, the holy grail is to reap the benefits of the data lake while retaining
Making your data lake a “governed data lake” is the game changer. Without governance, organizations risk failing to secure and protect their data. When data is cataloged and governed, an organization can effectively discover, classify, track history and lineage, quality of data and thereby
The data lake may be all about Apache Hadoop, but integrating operational data can be a challenge. Learn how to deliver real-time feeds of transactional data from mainframes and distributed environments directly into Hadoop clusters and make constantly changing data more available.
Upon reading his own obituary in the newspaper, famed author Mark Twain is said to have remarked that reports of his death were greatly exaggerated. I can only imagine that if the data warehouse appliance were a 19th-century American novelist, it might say the same thing. For a while now,
Perhaps one of the most significant changes to the analytics landscape in recent years has been the emergence of the data scientist. This role is continuing to evolve, with many organizations still in the process of establishing how best to incorporate this relatively new discipline into their
It’s easy to be blinded (and impressed) by the rapid innovation and evolution in the arena of big data. Today’s most technically sophisticated companies have the opportunity to exploit big data tools to address mind-numbingly cool use cases and produce very enticing results. However, so many
For today’s data scientists and data engineers, the data lake is a concept that is both intriguing and often misunderstood. While there are many good resources about data lakes on ibm.com and other websites, there is also a lot of hype and spin. As a result, it can be difficult to get a clear
Building a data lake is one of the stepping stones towards data monetization and many other advanced revenue-generating and competitive-edge use cases. What are the building blocks of a “cognitive trusted data lake” enabled by machine learning and data science?
Quite often, we see that the need for data security and governance makes some organizations hesitant about migrating to the cloud. This is perfectly understandable given the types of data gathered and used by businesses today, the regulations they must adhere to on both a local and global level,
In many cases the data lake can be defined as a superset of data repositories that includes the traditional data warehouse, complete with traditional relational technology. One significant example of the different components in this broader data lake is in terms of different approaches to the
When the data lake is deployed as an infrastructure to be exploited by different users in various departments with their own needs, their own different requirements and often their own dialects in terms of a business language, then a universal translator can become very useful. Especially with the
There is so much talk about data as a new natural resource. The data that organizations and citizens across the globe produce is authored in many systems and consumed by various organizations and users in different formats. This raises the following questions: Who owns this data? And why it is