Data Decoded: 2018 data trends with William McKnight and Yves Mulkers
From machine learning to blockchain to artificial intelligence, data is dominating the conversation in the tech industry. In the first episode of Data Decoded, William McKnight, CEO of McKnight Consulting, and Yves Mulkers, founder of 7wData and a data/business intelligence architect, discuss the hot data trends of 2018: the hype behind each, which trends will realistically impact businesses, and how organizations can adapt these trends to build a trusted analytics foundation.
00.28 Read “Influencers assess 2017 and make predictions for 2018.”
01.47 Machine learning is everywhere.
02.15 Read “Scaling the AI Ladder” by Rob Thomas.
04.43 What is master data management?
05.12 Read “Understanding the new Gartner MDM Magic Quadrant and the IBM position” by Nancy Hensley.
06.26 Read “How the growth of IoT is changing data management” by Kayla Matthews.
09.44 Compliance doesn’t have to be a chore; it can be a business opportunity.
10.43 Clean, manage, and make available reliable data across your organization.
12.51 Read “Companies that share data can’t ignore governance” by Michael Lock.
13.12 Are you navigating big data or just stuck in a data jam?
13.15 Read “Providing transactional data to your Hadoop and Kafka data lake” by Davendra Paltoo.
15.16 Explore the endless possibilities of Think 2018.
15.26 Learn how you can build a trusted analytics foundation with IBM.
15.38 Follow William McKnight on Twitter.
15.41 Follow Yves Mulkers on Twitter.
William McKnight: Welcome to Data Decoded, a new IBM podcast series dedicated to demystifying the world of data from data lakes to master data management to data governance and everything in between. I’m your host, William McKnight. I’m President of McKnight Consulting Group. Today, we’ll be discussing what 2018 holds in store for the world of data and what organizations need to do to stay on top of 2018 data trends.
Now I know we’re a couple months at least into 2018 already, but we still have 10 months to go. I’m joined today by Yves Mulkers, the founder of 7wData. Yves, could you tell the listeners a little bit more about yourself?
Yves Mulkers: Hi, William, thanks for the invite, and I’m happy to be on the podcast. You mentioned already that I’m the founder of 7WData.be, which is a blogging platform that tries to inspire people about what is happening in the data scene. I work as a data architect/business intelligence architect, helping people get their data ready for the future and get it all sorted out, to put it briefly.
William McKnight: Got it. OK, well, data is very hot today, and there is a lot going on, so let’s get right into it. Really glad to have you on, Yves. Let’s hop into today's main topic. As I mentioned, it’s 2018 data trends.
Now, Yves, which trend in data, artificial intelligence, and everything that’s wrapped up in data are you most excited about this year? There’s a lot going on: there’s cloud, there’s privacy, AI, and machine learning. So Yves, what’s going to get adopted and really make a difference out there in companies?
Yves Mulkers: Well, you already mentioned the hot keywords, and I see those trends happening as well. If you talk about cloud, more and more organizations are looking to have SaaS, software as a service, or a managed services platform for their analytics or data activities.
Furthermore, I see artificial intelligence entering into all the data manipulations. Before, it was very much related to the analytics scene, but now I see it happening in data management as well, whereas before we were manually, let's call it manually, modeling and preparing that data. I see now that algorithms are being embedded into the tool sets and are helping human activity move faster. So the time to market is, again, shrinking thanks to artificial intelligence and machine learning.
I always have a problem with the words artificial intelligence and machine learning. If I talk to experts, they say it’s just a piece of code that is able to learn by itself and learn from the insights by looking at the data. Another hot topic I see happening, which you didn't really mention, is blockchain, which we see helping with contracting and with getting a more secure environment without intermediary organizations.
And, of course, data privacy. We see that more and more companies are really paying more attention to that, to see where the data is. So for me, it's a kind of opportunity: they are focusing on data privacy and they say, “OK, we need to know where our data is stored, in which systems, and who has access to it, and we need to be able to provide that kind of insight.” So finally, they're looking into the whole organization of their data assets as well, and that's why data is becoming even hotter, certainly in the first two months of 2018.
William McKnight: Well, thank you, Yves. Those are certainly hot data trends, and I would say data is really a strong part of competitive advantage out there today for companies. It really behooves all of us to be bringing this great information back to our companies and into the plans that we’re making for our companies so that we can stay competitive. Yves, I'm going to add a few things to your list if you don't mind: self-service business intelligence, stream processing, and good old data warehousing and master data management. I see a resurgence in all of these things. Do you, Yves?
Yves Mulkers: Yeah, William, it's nice that you say self-service business intelligence. These days it's all about analytics, artificial intelligence, and machine learning. But the self-service aspect, in my impression, is more about bringing the data insights to the knowledge workers, and that's what we are trying to achieve. You threw in a few key words like stream processing, MDM, and data warehousing. Let's tackle that a bit.
In order of relationship to business intelligence, if you say data warehousing, there are a lot of trends happening there as well, where we see that the pre-processing or the population of the data warehouse is moving a bit more towards data warehouse automation. So the modeling is done by algorithms. The ETL building is done by algorithms, and in an automated way as well, so you can leave that bit to people who don't have the skillset of dimensional modeling or whatever type of modeling you are using. Doing the ETL and all that stuff, you can leave that to people who are a bit tech savvy but who, above all, know the business very well and can use these tools to get into that self-service BI or analytics environment as well.
Stream processing, that depends a bit on the industry or the market. It’s a very hot topic in IoT, where you see the events popping up all the time and you want that real insight. For example, if you're looking at machinery, you want to spot right away if there's something happening with the machine, and you can use those analytics to optimize the performance of the machine.

And at the end, MDM. That’s very often forgotten in the complete set-up of your data environment, and it is in fact the most important part. If I look at the number of articles and the information that is shared about it, just 3 or 4 percent of the articles that pop up are about MDM. We have few people sharing that information and, yes, few people reading that information, but it’s so important that the definitions of your data sets are clear. What do I mean by having those definitions clear? It means you all talk the same language.
Master data management, for me, is about definitions. Take one example: a customer. What is a customer? Is it your active customer? What is an active customer? Is that somebody who bought your products in the last year, or in the last two years? Those kinds of definitions matter. If you talk about your sales organization, which geography or organization are you talking about? That kind of language definition is in fact even more important than the technical assets we just mentioned, like stream processing, data warehouse automation, and the like.
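To make the "active customer" question concrete, here is a minimal Python sketch of what agreeing on one definition might look like. Everything in it is an illustrative assumption rather than something from the conversation: the `Customer` record, the `is_active` function, the sample data, and the 365-day default window are all hypothetical. It shows only that the same customer list yields different "active" counts depending on which window the organization agrees on.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class Customer:
    """Hypothetical customer record used only for this illustration."""
    customer_id: str
    last_purchase: Optional[date]  # None means no purchase on record


def is_active(customer: Customer, as_of: date, window_days: int = 365) -> bool:
    """One shared definition: active means a purchase within the window."""
    if customer.last_purchase is None:
        return False
    return as_of - customer.last_purchase <= timedelta(days=window_days)


# Made-up sample data.
customers = [
    Customer("C1", date(2018, 1, 15)),
    Customer("C2", date(2016, 11, 3)),
    Customer("C3", None),
]
as_of = date(2018, 3, 1)

# The answer changes with the agreed window: one year versus two years.
active_one_year = [c.customer_id for c in customers if is_active(c, as_of)]
active_two_years = [
    c.customer_id for c in customers if is_active(c, as_of, window_days=730)
]
```

With a one-year window only C1 counts as active; with a two-year window C2 does as well. Two departments using different windows would report different customer numbers from the same data, which is the kind of mismatch a shared definition is meant to prevent.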
William McKnight: Absolutely. I just don’t want our listeners to lose sight of the fundamental things that can help you advance into artificial intelligence and so on, because data is really a foundation for artificial intelligence, machine learning, and the things that we need to be doing eventually to stay competitive within organizations. So getting a good handle on your data now is really important, and the sequencing of these initiatives and the speed with which we are accomplishing them within our organizations is also pretty important. I think organizations really need to give a high priority to these data initiatives in order to maintain that competitiveness that we're talking about. What do you think, Yves? Do you think that organizations are moving fast enough into this world of data?
Yves Mulkers: Uh, no. There are just a few companies that are really moving very fast in this area, but if you look at the traditional companies, the insurance companies and banking companies, they're moving very slowly. They’re experimenting and trying things out, but the biggest difficulty they have is that they need to change their business model. They need to change procurement. They need to change a lot of legal stuff where they're sometimes tied in, and I think that's the biggest reason why they can’t move as fast as a startup, which is not yet tied into older compliance stuff and so on and so forth. That's the biggest difficulty that the big companies are feeling for the time being.
William McKnight: That's right. That's a great insight there. It's almost like you have to isolate part of the organization in order to accomplish some of these things, and one of those things might be artificial intelligence. Everyone is talking about artificial intelligence, but notice I said talking and not necessarily doing. In order to get to artificial intelligence, you need to have a great data foundation, because data is what is used to train the algorithms, and you need to train those algorithms really well in order for artificial intelligence to make a significant impact on your organization. But what do you see, Yves, given that so many companies that are actually becoming data-driven at this point aren't even ready for artificial intelligence?
Yves Mulkers: Well, we're kind of in a loop in this, I think, because I see the data cleansing and the data preparation tools are onboarding more and more artificial intelligence that helps them in cleaning up that data so it's a self-fulfilling prophecy in a certain way.
But I think, yeah, artificial intelligence. Ready for that? It's changing the mindset. I'm trying to bring in other ways of managing your data, different from the traditional way where you say, “OK, we have some data modelers in place, or we have people who are able to build the logical models and do the data modeling.” So they go talk to the business experts and model it in a certain way. And then you say, "Hey, you know these systems that exist? They can run through your unstructured or structured data. Just let them scan all your data assets and they will come up with a proposal of what your data assets look like and what the model looks like." A lot of people are reluctant about that. I see a lot of resistance where they say, "OK, we tried it but it didn't work," and then I say, "You know, there are so many different algorithms out there, so it's not one-size-fits-all." It's a lot of trial and error as well. And then they say, "Yes, but then we can just as well do that manually." So it's about finding the use case that proves you can use that new way of working with your data, helping to optimize your data or to get your data ready for artificial intelligence in this way.
William McKnight: That's right, get the data ready and get the people ready too. There’s going to be a lot of organizational change management to be done as we transition our companies into artificial intelligence. But could we get back to some of the things that are sort of over the top of all this, the foundational data activities and initiatives that companies need to have in place, things like data governance, or optimizing one’s data warehouse and getting that self-service access across the enterprise? I know we started talking about it before, but these things are important in order to provide that foundation for the future, so do you see more reasons why they're important, Yves?
Yves Mulkers: Well, it's important. It's always the same: when big data was small, there was so much hype. Everybody started to build their Hadoop clusters and they were saying, "OK, yeah, it's not only that, let's build a data lake," but they weren’t really understanding why they were building all these new approaches to their data. It was, "The data warehouse did fail and business intelligence didn't bring it, so let's try something else, because apparently that will solve all our problems." But they never fixed the basics, where you say what you want to do with the data and how you should approach that. And there are always two ways of working with that.
You have your sandbox environment, where you can try things out, learn from that, and say, "OK, now we can do that right." But it means that you have to put your governance in place, and you have to put your full data lifecycle management in place, where you say, "OK, this goes out of scope, so what will we do with that? How can we move from that sandbox environment into a production environment?" And a lot of extra stuff. That's still the basis of not only data management but of working with applications and software development in general, and yet there's still a lot of immaturity within companies in understanding why it is so important to look at your architecture in a general way for all the kinds of things you're doing, and to tie it all together, as a matter of fact.
William McKnight: Yes, architectures have become more important than ever now that companies have so many options to choose from. I would say let's try to get the data store correct for the workload and then string it all together with some data virtualization and data integration, but truly every organization is going to be different. Well, thank you, Yves, for joining us on the first episode of Data Decoded. The next time we talk to each other, we’ll be at the IBM Think conference in Las Vegas. I'm very excited for Think this year. If you'd like to learn more about the upcoming Think conference, visit the Think website and learn more about how IBM Unified Governance and Integration helps you achieve a trusted analytics foundation.
You can keep up with Yves and me at our respective Twitter handles, @WilliamMcKnight and @YvesMulkers. Thank you for listening to Data Decoded, and go out there and make sure you take some steps today to help your organization become data-driven. So long.