Making Data Simple: Faster data science


How is data science changing with the availability of high-performance data platforms? Vikram Murali, director of IBM Integrated Analytics Systems and Db2 Event Store development, Jay Wentworth, STSM of appliance architecture, and Thomas Chu, director of offering management and hybrid data, join Al Martin this week to talk about developments in data science and data platforms to accelerate your business decisions.

00.30 Connect with Al Martin on Twitter and LinkedIn.

00.40 Connect with Vikram Murali on Twitter and LinkedIn.

00.45 Connect with Jay Wentworth on LinkedIn.

00.50 Connect with Thomas Chu on Twitter and LinkedIn.

01.00 Learn more about the IBM Integrated Analytics System.

02.20 Learn more about IBM Cloud Strategies. 

05.00 Learn more about Hybrid Cloud Management at IBM.

10.30 Learn more about IBM Db2 software and capabilities. 

11.00 Learn more about other IBM Appliances.

17.30 Read Common Analytic and Fluid Query Engine by Ramarao Kothamas.

17.45 Learn more about Data Warehousing on Cloud here.  

19.00 Want to learn more about Data Movement Tools?

Ready to dig deeper? Check out our previous podcast episodes of Making Data Simple.


Al Martin: Welcome to the Making Data Simple series. This is your host, Al Martin. Today, we're going to talk about appliances in hybrid data management. I have several guests, and we're going to have an ongoing discussion here. One is Vikram Murali. Vikram is the executive responsible for our development. And Jay Wentworth, who is our appliance architect. And I have Thomas Chu, who leads offering management for the appliance and warehousing business.

IBM released what we term the IBM integrated appliance in September of 2017, but this isn't about IBM. This is an agnostic discussion. We're not trying to sell anything, of course, but I want to have a chat about the appliance in the industry, because I thought it was good timing to look at how appliances play into a data management strategy, or in my case hybrid data management. I think it's an interesting topic. I get a lot of questions from the field, and I thought that together we can answer some of these questions. So I'm going to just jump right in, and we'll see how it goes. Again, just picture us sitting around the table, shooting the breeze. So let's start off with the simplest of all questions: what is an appliance? It's a refrigerator, right?

Jay Wentworth: You know, that's right. I mean, that's how we want to approach it. We want it to be that simple. I've been doing this for a long time now, and we say over and over again, keep it simple. So I'll keep the definition of the appliance simple. It's a collection of hardware and software assets, with a single management interface, that provides a simple, easy-to-use, highly optimized platform to solve a very focused need.

Al Martin: Pretty good. You seem like you've rehearsed that or something. I mean, you know, well, how many years in the business?

Jay Wentworth: Yeah, 15 years doing this, it's become a mantra. That's what we say: make it simple. Keep it simple. Make it optimized.

Al Martin: Let me challenge offering management here, in Thomas. Thomas, in the name of cloud, I thought all hardware is supposed to be free, and software is open source, so it's also free. Why in the hell would I ever need an appliance?

Thomas Chu: Yes, Al, thanks for having me today. I'm on the offering management side of the warehouse and appliance business here, and, you know what? I get asked this question a lot. We've got cloud now, where I can buy by the byte of data and things like that. Why do I still need an appliance, right?

And you know what? The answer is very simple. It's all about return on investment. The analogy I like to give people is how you buy a cell phone. Companies are starting to sell new cell phones for more than a thousand dollars, and everybody says that is ridiculously expensive. But you know what I say to them? You go to the coffee shop to buy a coffee for five bucks every morning, and, you know what, probably within seven to eight months you'd have enough money to actually buy the most expensive cell phone in the world. And the cell phone you're going to use for years, right? So that is really what this is about. The whole comparison is that something can feel extremely cheap to start with, but when you want to do everything with high performance, you tell me what it looks like at the end.

So you really have to carefully calculate the return on investment: what you want to do and things like that. And the other part, which is really important, is that not everybody can be on a cloud. I have finance customers and others that need to be more secure. Vikram, you and I talk about this all the time, and we talk to customers and agree on almost everything, so you can talk about your experience, too, about why it is so important to be behind a firewall and some of these things.

Vikram Murali: Thanks, Thomas. So yeah, this is Vikram, and I lead development for the appliance business unit at IBM. As Thomas just mentioned, Al, one of the critical things that we keep hearing from customers all the time is that everyone now has a cloud strategy. They want to go to the cloud, but they can't do it today. It is a long journey. It takes time, and they have personal data, which could be healthcare information.

It could be banking information that they don't want to move right away, and they want to do it in stages. So think of it not as cloud versus on-prem, but as a behind-my-firewall strategy plus a cloud strategy, and then you suddenly have a hybrid strategy. That's where we come into play, because when we are building this appliance, we are not thinking about how to keep it just behind the firewall. With our latest technologies, with our common SQL engine, we can easily scale between what we have behind the firewall and the cloud and have a hybrid solution.

Al Martin: So, Vikram, I've got to ask a few follow-up questions relative to your last comment. You made a comment that somehow the appliance will help make that transition to the cloud. Can you describe that in more detail? And the other thing, which was referenced earlier, too, is that the appliance is going to come down to a value-based assessment, in other words, the return on investment.

How do we make that assessment? I mean, when we're doing client engagements, what's the protocol, what's the process by which to make an assessment of that?

Vikram Murali: Sure, Al. One of the many things that we do with this IBM integrated analytics appliance is that we use Db2 as the core engine, and Db2 is now available not only in the appliance form factor but also on the cloud. You can be on the cloud. So one of the main inhibitions for customers when they hear "appliance" is, "I'm going to be stuck with it." No, you are not.

Whatever you do with the appliance behind a firewall, you now have an option where you can just take the exact same workload, the exact same applications, without any changes, and immediately be up on the cloud. So we've made it simple: wherever you are in that business journey, in your hybrid journey going from an on-prem solution to a cloud solution, we have an offering that actually helps you do that and do it at your pace.

So that's one thing. The second thing is, when we talk about appliances, there are several customers who actually want to keep the data behind a firewall for one reason or the other. They want to utilize tools around business analytics on that data, which is behind a firewall, and they don't have easy access to the tools. Most of these open source tools have become cloud only, and that's why we have embedded our Data Science Experience.

It is our data science and machine learning set of tools within the appliance, so from a data scientist perspective, from a data engineer perspective, no matter which persona you are, you're able to immediately access the tools while still living behind a firewall and do all of the analytics that you usually do. When you're ready and you want to move to the cloud, no problem. All these tools are available on the cloud platform as well, and with little to no work you can take the work that you have already done and just move it to the cloud. So we do all the work for the customer; that way they can just reap the benefits. It's an easy transition.

Al Martin: And again, maybe I wasn't clear on this, so either Thomas or Vikram: how about the value-based assessment? I mean, how can a customer easily decide, look, the return on investment is better for the appliance or better for the cloud or whatever?

Vikram Murali: So on value-based assessment, we don't force the customer into going one way or the other. We go with them. We sit with them and ask them what their business problem is. And again, from an appliance perspective, most customers don't like it when they think they are stuck with software. The value comes when we have optimized everything, so they don't have to buy anything else or spend time on tuning and so on. We do all of that for them right out of the box.

Al Martin: Where is the landscape of the industry right now as it relates to appliances? I think this may have been stated already by Vikram or yourself, but if you were to whittle down IBM's point of view on what an appliance is, and how it fits into hybrid data management, and rattle through some bullets, what would that look like?

Thomas Chu: The appliance battleground is pretty intense right now. Vendors like Teradata and Oracle still have appliances, and IBM definitely believes in that. That's why we built the IBM Integrated Analytics System last year. But the main difference between us and other people is really our hybrid data management strategy, right?

The appliance is one of the form factors based on a common SQL engine across our data platform. What does that mean? That means we are a company that allows people to decide at their own pace, just like what Vikram said, where they store the data: on-prem in an appliance, which guarantees performance, or on the cloud.

Whatever they want to do, 20/80, 50/50 or whatever, this is where the common engine guarantee comes in. Applications are write once, run everywhere, and you will be able to seamlessly move back and forth whenever you want, along with the data science that Vikram is talking about. So that is our unique strategy here.

It allows customers the flexibility to move to the cloud, and to move to the data science and analytics world, at their own pace, while the bread-and-butter business just continues every day as it does today. I think that is really good. So as important as the appliance is, it's really one important piece of a bigger picture, one important element alongside all the other elements we have that form this picture for any enterprise.

Al Martin: Good answer. So Jay, you know what, here's the problem with being an architect. You get all the very tough questions, right? These guys give all the fluffy answers, and I say, let's see what's real. So here's my first question on the appliance thing: everybody, including myself, has the expectation that it just works.

The performance is fantastic and, frankly, all I need to do is walk by from time to time and check if the light is on. That is how, in my mind, it should work. So the question is, what is fact and what is fiction? What is still required of me if I am a client?

Jay Wentworth: You know, that's the whole key. That's pretty much the exact perception that we want: you walk by, you check the lights, and if it's on, it's working.

Some of the key elements that we bring together architecturally to make sure that's the case are really careful attention to our selection of hardware components and software components. We make sure we bring them together in an environment where it's guaranteed to be highly available. We handle the failures of individual components, both on the hardware side and on the software side. And equally importantly, especially to the value question that you've been asking, we make sure that we build a balanced configuration. We have the ability to be very careful about how many cores at what speed, how much memory, how much network bandwidth, because we design this thing from the ground up as a logically balanced system, so you're not buying more than you need and you're not getting less performance. It's fully optimized. So those are some of the key architectural things that go into the value of delivering an appliance. And what does the user need to do? We really do think it's that simple: it comes in, we turn it on, it works.

And, you know, it used to be that you monitored whether something broke to make sure it got fixed. Well, we're getting even better, taking full advantage of our own data science and machine learning, and we're getting to the point where we can learn if the machine is breaking or is going to break and predict what kind of performance and what kind of service you're going to need.

Al Martin: OK, I got it. So tell me about flexibility and elasticity. One of the advantages of going to cloud is being able to grow and only pay for what you use. On the appliance side, we're going to ship a box that's pre-integrated and has a lot of performance advantages, et cetera. However, how does it compete from a flexibility and elasticity standpoint?

Jay Wentworth: Yeah, that's always been something that's very important to us in the appliance delivery model. I don't think we'll ever get to the pristine flexibility of a cloud offering's ability to scale up and scale down, but we're taking huge strides in that direction even within the appliance form factor.

We make sure that when our service and sales personnel and CCPs get ready to size a particular deployment, we take into account the size of the existing workload and its short-term growth and make sure we can accommodate it. We also have the ability to do in-place expansion of these appliances in rather fine-grained increments so that we can grow. Again, it's not going to be like the cloud, where you swipe your credit card, you add a couple of cores, you add a little bit more memory. It's going to be a little bit more planned growth, but I think it's a great model for most of our business cases.

Al Martin: So am I right to say, when I think of an appliance, there's one fundamental, I mean, you can take it to the bank, and that is performance? If I'm thinking of an appliance, I'm thinking, look, nobody beats the appliance in terms of performance. Is that all true? I mean, are there any nuances there, any comments, questions or concerns that I should have with that, or is it just a flat-out mic drop?

Jay Wentworth: You know what, I'll let everybody else have at it, but for me, that's it. That's the appliance: optimized. You're getting every bit of performance out of it that you possibly can, and we don't fool around with thousands of different tweaks and options and changes. It's baked in.

Al Martin: Thomas, Vikram, any challenges or any additions to add to those comments?

Vikram Murali: I agree with what Jay just said. The whole goal here is that we want the customer to have a very simple and easy-to-use experience, so they don't have the necessity to go in and start tweaking different knobs and fine-tuning the appliance more than what we've already done. We take care of all that. All you have to do is wheel it in, plug it in and, as Jay said, make sure the lights are on and you're good to go.

We are trying to make it that simple. It's already that simple, but with machine learning and other capabilities, we are making it better.

Al Martin: You know, one of my mantras is that everything should attract the developer; it should be very easy and simple for the developer. Now, you mentioned, I think, the Data Science Experience earlier, and I want you to elaborate on how the appliances fit into that model.

Vikram Murali: Yes, historically enterprises have tended to gravitate towards one programming language or the other, but what we are seeing with the data scientists and engineers in companies nowadays is a mixed set of people with different skills. For example, in the data science world, data scientists use Python, they use R, they use Scala. These are very popular languages, and they are used to notebooks, which are open source notebooks like Jupyter, Zeppelin, RStudio and so on.

What we have done is we have integrated all of those in a really nice form factor under our Data Science Experience umbrella, and that comes straight out of the box within the IIAS appliance. What that means, whether you are a data engineer or a data scientist, is this: a company might have a team of 15 to 20 people; a few of them might want to code in Python, some of them might want to code in Scala or R. No problem, we support all of them right out of the box. We don't want to lock someone into one format or one programming language, so no matter what you're comfortable with, you can just come with that knowledge and still work on the appliance. The best thing is that all these tools are embedded into our appliance, which means that the data sits right there and your tools sit right there. You don't have to go download anything; everything happens right there.

Al Martin: Very good. Thomas, I've got a question for you. You talked about the common analytics engine across all these different form factors. If I'm using different form factors with that common analytics engine, whether it's the database, or data warehousing on cloud, or I'm hitting the appliance, how do I create that common SQL, common analytics engine experience across all these different form factors?

Thomas Chu: Well, you don't really need to do anything other than knowing that when you write your application, when you have the data access, and when you go across multiple form factors in the hybrid data management portfolio, your application will work the same way. Now obviously with the appliance form factor, just like what Jay said, you run faster. You run the fastest, OK? But when you do tests to understand functionality, you can use the cloud offering to reach the same goal, to reach the same result, get everything tested, and then implement it in the same way on the appliance.

But more importantly, there is actually a YouTube video on the web about the IBM Integrated Analytics System where we show how we can federate and aggregate across form factors, including structured and unstructured data, with a stock portfolio recommendation as the use case. These days a lot of new data is born on the cloud, and a lot of it is unstructured. That demonstration video shows you how to keep customer data in your appliance form factor, get historical data from the cloud using the structured warehouse on cloud, also get unstructured news data from, you know, any newspaper using Big SQL, aggregate the structured and unstructured data together, and use DSX, data science and Spark to be able to do anything on top of that.

So that is really the strength and power of the platform, Al. As a developer you will have peace of mind developing new applications using whatever form factor, or combination of form factors, we sell. That is really the power of the whole platform, with the appliance being one of the form factors. Just search for the IBM Integrated Analytics System on YouTube and you will be brought to that exciting demo.

Al Martin: My point of view, based on what you guys are talking about and having managed the hybrid data management business, is that we've got a value-based appliance that just works. Nobody beats us on performance. That's my point of view in terms of the value proposition. You never lose your investment, meaning you can move to other form factors at your own pace. So if I invest in an appliance, for example, our point of view is that as we develop this common analytics engine, you've got a database engine that you could have in an appliance, you could pull that out and have it on-prem on your own hardware, or you could say, "Hey, I'm gonna put it in the cloud."

Any one of these form factors uses the same application, the same common SQL. So that means I can relax as a customer and make the proper business decision, and meanwhile, I have an appliance that just works. I don't have to worry about it, and I get the best performance in the industry. One question I have for you though, Thomas, is how do I manage all those databases? If I do have a common analytics engine across form factors, is there a way to manage it from one source, or do I have to have DBAs in one area and DBAs in another area, that kind of stuff, outside of the appliance?

Thomas Chu: No. The Data Server Manager, or what we call a single pane of glass for the platform, is really the place that not only allows you to manage, aggregate and federate, but also lets you see the status and monitoring of those databases. That is really what our commitment has been. We know that you're going to have multiple form factors of the common engine, so we are going to give you a common interface to be able to manage and see all these things together.

Which is really exciting, and that is how we want you to manage it, in a very easy way. You have a common look and feel for how you interface with every one of them, and obviously, because the engine is the same, you'll be sending your requests, sending your app requests, the same way. So that is really what we do: from the back end it's common, and from the front end it's also common.

Al Martin: I got it. So Jay, two more questions for you, mister architect. Number one, I think Vikram talked a little bit about this: what can we expect with cognitive built in? I mean, what kind of autonomous, self-driving features do we have? And the second piece: all of this is really meaningless unless we have the proper data movement. You've got to get data in, then you've got to get data out, in a proper fashion and with speed that's, I guess, unprecedented. Everybody wants more speed, and data movement can be a pain in the rear. Any comments on those two areas?

Jay Wentworth: Yeah, absolutely. Let's start with the first one, about how we are taking advantage of our own abilities in cognitive and machine learning. One of the biggest areas we are pursuing here is the ability to take real-time, in-service, in-flight data from all of our clients' appliances and recognize how efficiently those appliances are being used, what the workloads look like, if there's growth potential, if it looks like we need to add additional resources, and if it looks like there might be a problem with a particular piece of hardware that needs to be replaced real soon. I think it's exciting that we can take that in, automatically find out what's going on with the system, and not only with one system: we can aggregate all that data across all of our deployments and recognize when maybe something's going out of spec and you need to do something about it. So that aspect of machine learning and data science is very exciting for us, to be able to deliver that kind of service.

The next one, data management, data movement, this is huge. I think it's been very exciting to hear Vikram and Thomas and even yourself talk about our ability to play a role in this broad offering, where we have cloud, on-premises and the appliance, and it's the same SQL engine everywhere. We talked about the video that shows one use case, where a single query on a single piece of that platform can integrate data across all of it.

But as an administrator, we now give you the ability to decide, I want to move not just data but workloads, everything that goes along with that data, from one form factor to the other, whether it's from your appliance, highly optimized behind your own firewall, or into the cloud to have broader access to the cloud tools and to the cloud data. That's a huge, hugely exciting part of the offering.

Al Martin: Well, nice work. You speak much more eloquently than I do. That's just the Midwest in me, I guess. So here's what we talked about: flexibility and elasticity, the developer, performance, data science and machine learning built in, the cognitive or autonomous, self-driving nature of the appliance, data movement, and how to manage it all with a unified console. What have we missed? Did we miss anything or have we nailed it?

I've got a few questions for you guys, both personal and professional, just one simple question for each one of you. I will start with you, Thomas: what's up next for you personally, and what's up next for you professionally?

Thomas Chu: Why don't I just talk about what's next for me professionally, then?

Al Martin: We want to hear the personal too. You live in Toronto. It's gonna be cold. You gotta be doing something.

Thomas Chu: So, first of all, I am looking forward to a little bit better weather in Toronto. We have a bitterly cold day here in Toronto. And Al, I know that personally you hope your little football team in your little city is doing the best compared to where Jay is. I know that. Seriously though, I have been an offering manager for a while now, and I have been in the development team for the past 17 years here at IBM.

I've seen how this team came together, how our form factors have been evolving, and a lot of the good decisions that we made along the way. In this industry I've seen things change: I joined the industry during the dot-com boom, then we saw how the bubble burst, and then we saw cloud. I am very, very excited about and committed to the whole machine learning and analytics future. You already see it in day-to-day life right now: AI and machine learning are starting to help make our lives better and help us make the right decisions, from a business perspective and a personal perspective also. So I do see that when my two daughters grow up, and when they have kids, they are going to have a different perspective on what life is like, how we make decisions and how we use data to drive a better world. I do feel a little responsibility, from an infrastructure, product and strategy perspective, to help move the next generation in that direction, toward smarter decisions and a smarter world. So this is both professional and personal: hoping that the next generation will be better off than us.

Al Martin: Wow, that was pretty deep. Jay, can you beat that? And I don't want to hear about the Patriots.

Jay Wentworth: I promise, but I know I can't beat that, although I am a little nervous since I see the recording light is still flashing, but that's OK. So for me, coming up, my daughter's getting married this summer, and we're kind of focused on that. Real excited about it. So that's making us happy in my personal life and something to look forward to. Professionally, I keep thinking to myself, have I been doing this appliance thing for too damn long?

Al Martin: No, the answer is no. I will answer that for you.

Jay Wentworth: All right, so we'll see what we can do with the appliance. Here is what I think. I'd love to continue to refine this notion of the appliance platform and what runs on it, and just see where that goes. You know, maybe it's not just an analytics platform; it's this appliance, and what can we do with it? But I want to balance that so we don't go too far. I'm guessing that Thomas and Al might want to take this to, you know, the infinitely scalable cloud in a box.

To me that's reminiscent of the old-time appliance battle in the living room, like who's going to be the king of the set-top box: is it my DVR, or is it my TV, or is it my cable box? And you know, it's funny: you keep trying to make any given appliance do everything, and it does everything poorly. Look who's winning that battle now. It's those silly little "ask Alexa" and "ask Google" devices. It is the box that controls the appliances behind it. So I want to keep working on the appliances in the background, you know, make them as good as they can be and do just the jobs that they were meant to do.

Al Martin: Nicely answered. By the way, we're still recording for a reason. Vikram, you are up, man. Now you have to beat both of them, since you've had plenty of time to think about the question.

Vikram Murali: I know, I know. Well, I can't complain about the weather, you know, being in the area. So from a professional standpoint, I think we are at a great place right now. If you look at appliances in the past 10 to 20 years, they were very siloed, doing one thing. To Jay's point, we don't want an appliance doing everything. If we can purpose-build an appliance where performance is the key factor, and we give customers what they are asking for, what they are craving, around data science and machine learning, which is like the next big thing right now, I think we have a solid product. What interests me in my professional life right now is working with this team, working with the extended IBM teams and working with our customers. I think this year is going to be a great, great year for IBM, especially for the appliance business.

So that really excites me. From a personal standpoint, I don't know, just go enjoy the sun, I guess. But I have a 15-year-old, and she and my wife have been wanting lights in the backyard for like a year. So for me, who leads a development team that can build appliances, I think I can get the lights on. I think that's what I'm gonna do this weekend. Not looking forward to it, but it's something to work on.

Al Martin: Last one for you. I hear you've been zip lining. Is that true?

Vikram Murali: Yeah, we were vacationing last week, and that was my first time zip lining. My daughter, she was the one who forced me to do it, but it was absolutely fun. I hadn't done it before, being afraid of heights or whatever, but I felt very safe, and it's a great experience.

Al Martin: Alright, well I used to go skydiving. That's your next challenge.

Vikram Murali: Sure.

Al Martin: Vikram, Thomas, Jay, thank you very much. You guys are fantastic. It's a great podcast. I think folks will get a ton of information from it, and I certainly did. And for those of you listening, thanks as always. Rate us, and I'll talk to you next time. Thank you.