Making Data Simple: Cloud computing, part 1


Do you know how often you are using the cloud every single day? In part one of our discussion with IBM Fellow Sam Lightstone, learn about cloud computing and why it is increasingly important in our data-driven world. Also, learn alternatives to loading private data to the cloud, data movement, and how to use the cloud to benefit your organization.

Show notes

00.30 Connect with Al Martin on Twitter and LinkedIn.

01.40 Connect with Sam Lightstone on Twitter and LinkedIn or his personal website.

04.45 Visit Onalytica to learn and see more IBM Influencers. 

07.45 Learn more about IBM QueryPlex.

10.30 Learn more about IBM Hybrid Cloud Platform.

12.00 Learn more about Aspera. 


Al Martin:             Welcome to Making Data Simple. This is your host, Al Martin. Today I have with me the myth, the legend, the IBM Fellow Sam Lightstone. He’s even got the rock star name. Sam, how are you today man? 

Sam Lightstone:    I’m good, I’m good. Thanks for inviting me. 

Al Martin:             Hey, first question I have for you is, do IBM Fellows put their pants on like everyone else in the morning?

Sam Lightstone:    Sir, no, we don’t put our pants on like everyone else, and I can’t tell you how we do it or I’d have to kill you. 

Al Martin:             So look, I’m away from the office today because this is turkey week in the US, but I’m so excited to talk to you that we – I came in so that we could have this discussion. What do Canadians do during turkey week? That’s what I want to know.

Sam Lightstone:    Yes. It’s — we already had our Thanksgiving. So, mostly we just relax because we have a lot less e-mail this week from all of our American friends, so. 

Al Martin:             So you get two Thanksgivings then. 

Sam Lightstone:    Yes, pretty much, pretty much. 

Al Martin:             Nice, nice. All right, very good. Hey, well, I kind of talked over you. Why don’t you give yourself a brief intro, then I’ve got some – few questions for you. 

Sam Lightstone:    All right, sure. So, let me just talk a bit about who I am and what I do at IBM. I’m an IBM Fellow, as you mentioned, and my work is mostly in product development, and the technologies that I work on relate to relational databases, data warehousing, cloud computing, the analytic appliances that we’re working on, and distributed analytics for the Internet of Things.

                              And I’m starting to engage on how we can leverage machine learning, and tap into machine learning and infuse it in the full domain of data management. 

Al Martin:             Awesome. So, I invited you for several reasons, but I think two come to mind. Number one is that I’m count on — counting on you and your technical acumen to provide Horizon 3 Technology within our portfolio. 

                              So, to be honest with you, I’m trying to get on your good side. And number two is that you’re an interesting dude, and I know recently you just received a — some super-duper cloud award from the industry, and I think I’d like to start with that, talk about cloud, build into how you got the award, any thoughts you have there. 

Sam Lightstone:    Wow. Okay. So, where do I start? So I — probably all of our — all of the people listening to this podcast are familiar with cloud but, you know, for those who are not, we all use cloud every day. Every time you use Google, or Netflix, or Facebook you’re using cloud services. 

                              And the whole idea of cloud services is you’ve got some company, whether it’s Google, or IBM, or Microsoft, you know, we’re going to provide you capabilities over the Internet, and you won’t have to worry about the storage devices, and the Internet, and CPUs; someone else will take care of that for you.

                              So, that ability to leverage IT infrastructure, and software services, software runtime, software middleware, software applications over the Internet, is called cloud computing. It’s a way of abstracting away IT infrastructure, and it’s really been taking off over the last many, many years.

                              We’re coming into a serious adoption, not only for personal use, like Facebook and Netflix, and so on, but real serious use for Fortune 500 companies. So it’s — five years ago people were talking about it, today it’s totally mainstream.

                              I — some — a lot of my work at IBM is in the domain of cloud computing, and handling serious, you know, real industrial strength problems for large scale companies in the cloud with the data technologies that we provide at IBM, which are really phenomenal technologies. 

                              Like Db2 on Cloud, Db2 Warehouse on Cloud, our Compose suite of products, our Cloudant offering for JSON, and many others. So recently I got this award for being a top 100 influencer. It was awarded by a company called Onalytica, and they publish this list every year of the top 100 influencers.

                              I actually don’t know how I got on the list, to tell you the truth, but it probably has something to do with my presence on social media; I do like to blog and tweet. So it’s something that I’ve done online, tweeting about a few learning algorithms. I don’t know if it was a particular tweet that went viral, but I’m told that their selection is completely mathematical; it’s not based on a human element, or anything like that.

Al Martin:             It’s just your reputation precedes you. You know, they know you’re the expert. I like it. Hey, so let me pause for cloud — on cloud for just a second. I mean, it’s related to cloud, the question I’m going to ask you, but I’ll get back to more in-depth cloud here. 

                              But, you know, the interesting thing, to me, is that as of late, and you and I recently talked about this, I don’t know if you remember it, Andreessen Horowitz had a blog, I think it was a video blog, that talked about the end of cloud computing.

                              And he’s actually suggesting we’re heading — I guess — well back to the edge, if you will, if we ever were there to begin with. You know, in other words, trying to figure out where the puck intended — hey that’s a Canadian reference by the way. See, see how I do that? 

Sam Lightstone:    That’s good. 

Al Martin:             But it’s intended to be predicting the future, he’d say it’s at the edge, and the interesting thing about that is, is I know that you’re doing some Horizon 3 work that kind of suggests that you kind of believe it too, because you’re doing some work at the edge as well. So, I guess what I’m trying to ask, in a not so eloquent way, is where does Cloud sit as it relates to the edge?

Sam Lightstone:    Yes, phenomenal. It’s a phenomenal question. So the honest answer is, first of all, none of us knows for sure. So, all I can share with you are some philosophies and observations. 

                              And, you know, the way people do — there’s so much data that’s now coming from the edge, let’s talk about what we mean by the edge because I think some people probably are not familiar with the terminology. 

                              But, you know, all of us are familiar with the idea of wearable devices, you know, you have a watch that’s connected to the Internet, your phone is connected to the Internet, there are glasses that are connected to the Internet. 

                              Your refrigerator and your stove are connected to the Internet as well. Your home is connected to the Internet as well. All these kinds of remote devices are what we call edge devices, and sometimes they’re more serious computers, like your phone is actually a pretty serious computer, and sometimes they’re really more like very simple sensors; they’re just collecting data and spitting it out.

                              All these things are called edge computing, and the way we do analytics on these things today is we suck the data out of these edge devices, and we put it into data repositories, whether that’s on-premises, like a relational database that’s sitting inside of a bank, or perhaps in the cloud, you know, some big data cluster in the cloud.

                              And we put the data in there, and we have to keep feeding it in because the data’s arriving every second, and then we run the analytics there in a traditional way. But the thing that’s interesting to me is that increasingly the majority of our data is going to come from these edge devices.

                              There’s already, you know, each of us has many, many devices in our lives that collect, that are generating data. So we’re talking about huge numbers of devices. I mean, you’re talking about dozens of devices, if not hundreds of devices, per person on the planet.

                              So that’s a lot of data generation that’s going to totally swamp what we’ve been doing inside the Fortune 500 with the on-premises computers, and every one of those devices has some amount of computation available, and usually some amount of memory and storage available.

                              So, it’s going to be a wild inflection, just wild. The amount of compute that’s at the edge is going to be orders of magnitude larger than the amount of compute that you can put on premises, or even in the cloud.

                              So my theory is that we should be able to tap into all that computational power, and that’s why we’ve started projects like the QueryPlex project that we’ve already started to speak about publicly a little bit at conferences, so I think I’m allowed to say it on the podcast as well.

Al Martin:             To that end then, in my simple mind, that would essentially say that rather than taking all this data and putting it in the data lake, what you’re suggesting is technologies as we move forward, if they can use that computation at the edge, we can do — they can do a computation at the edge. 

Sam Lightstone:    Yes, putting it in a sort of theoretical sense, that’s what we hope to achieve with that, but I think there will certainly be problem domains where that is not the case, and let me give you an example of one. A lot of the sensors, a lot of the Internet of Things sensors that they put on buildings and smart heaters, many of them actually don’t keep any data.

                              They literally sense data and spit it out, and have no persistent store. So a sensor that has no persistent store is probably not a good use, and not a good opportunity to do the kind of hard-core data analytics there that we could do on premises, because there’s no history of data, and usually with analytics you want to do it with some amount of history.

                              So there will be cases where it’s a great opportunity, and there will be cases where maybe it’s not a good fit, and time will tell. 
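The idea discussed above, running the computation on the edge devices instead of first copying every reading into a central repository, can be sketched roughly as follows. This is only an illustration under stated assumptions and is not how QueryPlex works: the `EdgeDevice` class, the `local_summary` method, and the `fleet_average` function are hypothetical names, and a device with an empty `readings` list stands in for a sensor that has no persistent store.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class EdgeDevice:
    """Hypothetical edge device; an empty list models 'no persistent store'."""
    name: str
    readings: List[float] = field(default_factory=list)

    def local_summary(self) -> Optional[Tuple[int, float]]:
        """Compute (count, sum) on the device, or None if it keeps no history."""
        if not self.readings:
            return None
        return len(self.readings), sum(self.readings)


def fleet_average(devices: List[EdgeDevice]) -> Optional[float]:
    """Combine small per-device summaries instead of shipping raw readings."""
    total_count = 0
    total_sum = 0.0
    for device in devices:
        summary = device.local_summary()
        if summary is None:
            # Nothing stored on this device, so it cannot take part.
            continue
        count, subtotal = summary
        total_count += count
        total_sum += subtotal
    if total_count == 0:
        return None
    return total_sum / total_count


devices = [
    EdgeDevice("watch", [1.0, 2.0, 3.0]),
    EdgeDevice("phone", [5.0]),
    EdgeDevice("building-sensor"),  # senses and emits data, stores nothing
]
print(fleet_average(devices))
```

Only two numbers per device cross the network here, and the sensor without history is simply skipped, which mirrors the point that some devices are a good fit for this approach and some are not.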

Al Martin:             All right, cool. I went off into the future, I think that’s fascinating. By the way, those that are listening, when you get a load of QueryPlex, you’re going to be as excited as I am.

                              But we’ll pause that for a second, we’ll get back to — into the present, which is fine. So, going back — coming back to the cloud, where we’re headed, where we’re at, I guess. What are the biggest challenges you see organizations face, as they consider moving to the cloud? 

Sam Lightstone:    Yes, I think there are a couple of problems as people move to the cloud. You know, cloud is so compelling because it’s like all these services on tap: computation on tap, databases on tap, runtimes on tap, it’s all this stuff on tap.

                              Use what you need, pay for what you need, and if you don’t need it, you stop using it. And it’s all there at the click of a button, so it’s, like, super compelling. But there are these splotches along the way, and one of the biggest splotches, I would say, is that no company can move all their stuff to the cloud all at once.

                              So it begins, and it’s a journey that’s going to take a lot of time, and a lot of these companies have IT infrastructure and applications that they’ve been building up for decades, literally decades. They’re not moving all that stuff overnight.

                              So their need to move, again, is a journey, which is going to take several years, and that means, during that transition, they’re going to have a certain amount of applications and data that’s sitting on premises, and another amount that’s in the cloud.

                              And when it’s on premises, that may be sitting in, you know, standalone servers, or maybe increasingly in what we call private clouds, where they try to emulate the qualities of the cloud within their own data centers.

                              So to make that possible, you know, if you’ve got to rewrite your application every time you move it to the cloud, that’s extraordinarily expensive and disruptive. That’s one of the biggest problems.

                              That’s why at IBM we put such a focus on what we call hybrid cloud, and the idea of hybrid cloud at IBM is that you can write your application, and you can use middleware services from IBM, and they will work wherever you put them.

                              They’ll work on premises, they’ll work on the cloud, they’ll work in private cloud. And we have a very aggressive technique for doing this, which is, essentially, a shared code modules idea, where products are built literally with common code in the cloud, on-premises, and on private cloud. So whatever your target is, you can kind of make sure that the application will run equally, with less impact on the (unintelligible) driver, and all that good stuff is working in the cloud and on the on-premises system as well.

                              So that’s one problem, though: this journey of having to get from an environment that’s totally on premises to an environment that’s totally or mostly on cloud, knowing that it’s going to be a multi-year journey and you don’t want to have to rewrite the applications next month.

                              Another big one is this friction, this friction of the public Internet. You know, the Internet actually is not that fast. It seems fast when you’re watching Netflix and, you know, you can watch a movie over the Internet. But, man, if you’re trying to do real enterprise-class computing, the Internet is a serious point of friction there. If you’re trying to move data, terabytes of data, it can take forever.

                              So, how do you move serious enterprise-class data from on-premises into the cloud? That’s a major sticking point, and another one that we’ve been tackling pretty aggressively at IBM. If you’ve been listening closely, you may have noticed we acquired a company called Aspera.

                              It has given us some very profound capabilities to move data from ground to cloud quite quickly, at, you know, hundreds of gigabytes per hour. Now, a hundred gigabytes per hour may not sound like much to people who are used to, you know, how fast you can move data inside your organization.

                              But actually moving data, hundreds of gigabytes an hour, over the public Internet is magic. It’s really magic, and that means that you can start moving tens of terabytes of data over a weekend, so we can keep the job (all have it running).
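The arithmetic behind "hundreds of gigabytes an hour" turning into "tens of terabytes over a weekend" can be checked with a short sketch. The specific numbers are assumptions chosen for illustration, not measured Aspera figures: a 48-hour weekend, sustained rates of 200 and 500 GB per hour, and decimal units (1 TB = 1000 GB). The function names are hypothetical.

```python
def transfer_hours(data_gb: float, rate_gb_per_hour: float) -> float:
    """Hours needed to move data_gb at a sustained transfer rate."""
    if rate_gb_per_hour <= 0:
        raise ValueError("rate must be positive")
    return data_gb / rate_gb_per_hour


def window_capacity_tb(rate_gb_per_hour: float, hours: float = 48.0) -> float:
    """Terabytes that fit in a transfer window (default: a 48-hour weekend)."""
    return rate_gb_per_hour * hours / 1000.0  # assumes 1 TB = 1000 GB


# Illustrative rates only; real throughput depends on the link and the tooling.
for rate in (200.0, 500.0):
    print(f"{rate:.0f} GB/hour -> {window_capacity_tb(rate):.1f} TB in 48 hours")

# Time to move an assumed 10 TB data set at 200 GB/hour.
print(f"10 TB at 200 GB/hour takes {transfer_hours(10_000, 200.0):.0f} hours")
```

At these assumed rates a weekend covers roughly 10 to 24 TB, which is consistent with the "tens of terabytes" figure quoted above.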

Al Martin:             So, Sam, you may have already talked about this and maybe it’s a little bit obvious, however, could you describe, in your view, what you see as the big advantages for organizations as they make the move to cloud? And maybe some not-so-obvious advantages that you’ve seen as you, you know, have been working with many clients, as I know you do.

Sam Lightstone:    Yes. I would say there’s a few huge advantages of the cloud. The first one is actually perhaps not the main thing that people have in mind when they go to the cloud, you know, they’re often thinking about skills reduction or cost considerations. 

                              But it actually — probably the single biggest thing that is maybe the unintended consequence of the Cloud is the speed, the speed of innovation, the speed of prototyping, because all of a sudden when you use the cloud, you have all this technology that is literally on tap. 

                              The hardware, you know, the hardware capacity, storage capacity, the compute, the networking is all on tap. You can fire it up, you know, you don’t have to go to your finance department and start begging for some large amount of money so you can go buy some hardware, and then after a few months they come back and you order the thing, and then it comes back after a few more months.

                              And by the time that rigmarole is done in today’s on-premises environment, you know, it could easily have taken you the better part of a year just to get the hardware that you need to start some project. Whereas in the cloud you can literally fire it all up in minutes, and you can start prototyping and try your new project, literally for just pennies and sometimes entirely free. So that opens up this whole inflection point where you can really start prototyping and trying ideas on the cloud at very low cost, and again, sometimes entirely free.

                              That’s one big, huge unexpected consequence of the cloud. On the more expected front, you know, a lot of these cloud services that we offer, you know, they come in many forms, but truly the strategy that we have at IBM is to offer managed services.

                              And what we mean by managed services is that not only are we providing you the compute, the network, the storage, and some software capabilities, but we’re doing it as a managed service on the cloud, so that we’re taking care of the installation, the configuration, the tuning, the adaptation, the resiliency, the high availability, the disaster recovery, so that you don’t have to, and you can focus your energy on the business logic and the innovation that you’re trying to create.

                              So you don’t have to have that other set of skills; we take care of those sort of more mundane skills. I say mundane, but they are highly specialized. They’re mundane because nobody wants to wake up in the morning and get excited about high availability and disaster recovery or, you know, database tuning, stuff like that.

                              But these are very specialized skills, and so if you can tap into these services and not have to worry about those somewhat rare skills, and you can focus your energy on the things that matter to your business, the innovation that’s unique to your business, that’s just a huge win. That’s a huge win in your efficiency as an organization on how you can focus your technology energy.

Al Martin:             So who, in your mind, wins? I mean, when I say that, who wins the cloud wars, or what wins the cloud wars? Is it those that can move data the best? Those that separate compute and storage and maybe have unique business models therein around that, so that, you know, they compete on price? Who do you think wins?

Sam Lightstone:    Well, I think it’s a little too early to know for sure, and we’ll find out in hindsight a few years from now, but there are certain things which are table stakes. People definitely expect continuous availability. You know, when you log onto Google, or Facebook, you expect it to be there, right?

                              You know, it’s sort of almost unimaginable. What do you mean Google’s down? What do you mean Facebook is down? I mean, you don’t imagine those things, and that’s true not only of those services that we use in our homes, it’s true of the kinds of cloud services that people use in enterprise applications.

                              And so continuous availability is paramount, and if companies fail to achieve that they’re not going to be successful on the cloud. Separation of compute and storage is another major theme, and that’s the notion that people want to be able to make a decision about growing their compute, or growing their storage, or shrinking, and they want to make those decisions independently.

                              And that was very hard to do on premises where you had to buy the storage, buy the computers, it was a very fixed decision, but on the cloud people are expecting the use to be more elastic and be able to change their minds, not only over time, but maybe even several times a day.

                               And so those are table stakes. I wouldn’t even call those differentiation points. Those are table stakes. Companies that don’t engage in continuous availability, and separation of compute and storage, elastic allocation of these resources, they’re not going to be in the cloud business for very long. 

                              I think that, you know, different companies are taking different tactics, or different strategies for winning. And certainly one of the things that we’re very proud about at IBM is this tactic around hybrid: knowing that the movement to cloud will be a journey, we want to make sure that it’s easy for you to move incrementally day-by-day, increasingly moving applications to the cloud, and you may decide in some cases to move them back from cloud to on prem.

                              You may have, by the way, many cases where you want to run applications in both places, and I’ll give you an example of that because, you know, people hear about this and they go, what are you talking about, Lightstone?

                              But seriously, there’s piles of good reasons why you’ll want to run some things on premises and some things on the cloud, and here’s a great one. Maybe, for a whole bunch of security and policy reasons, you have certain applications that you must run on-premises; you’re not ready to put that application or its data into a cloud service, or across some public Internet. Very understandable.

                              Nevertheless, you may have a team of developers who need to write code for that application, and wouldn’t it be great if you could just fire up a development environment for them in the cloud, at close to zero cost. And they can fire these things up, and shut them down any time, many times a day.

                              Super low cost, if not free. Or your quality assurance systems, maybe you can do that in the cloud, and again, at super low cost, fire them up, tear them down, while your production environment, which is the thing you want to keep on-premises, stays on-premises.

                              That’s the kind of thing that you can do with the hybrid strategy that we have at IBM. That (goes to the companies in terms of) support. So, that’s one strategy that we’re using at IBM that we’re excited about.

                              Other companies are going all in on cloud and nothing on premises. Some companies are holding out on premises and nothing on cloud, and everything in between. 

Al Martin:             So here’s a question. I visit a lot of clients like you do, and honestly, many clients are struggling in terms of how to get started. In fact, one of the go-to, you know, strategy sessions that I often take with me is a maturity curve, you know, starting everything, talking about client servers, data science, machine learning, you know, and everything in between.

                              But since many of these clients went digital, they’ve been on premises, right? So, I think their tendency is to be reluctant, to be cautious toward the cloud. Some of them are, like you said, going all in, but some of them are just kind of playing with new use cases and whatnot, but with breaches, with the security of their beloved data, and I get it, right, if data is the new oil, you’re going to give it away if somebody takes it, or, you know, somehow you open it up and it’s not safe.

                              So, I guess, back to the simple question again, how does a company get started with cloud? What do you see as the first step to making that leap?

Sam Lightstone:    Yes, I think the first step really is to start small. Pick something just, you know, put your foot into it, so to speak, but dip your foot in the pool. Pick some application, and maybe something that’s got a relatively small amount of data, and try it out, you know, and have that experience of optimization on tap, resources on tap, someone else is worrying about how to make it run smoothly, excellently, continuously available, distributed. 

                              Let that be someone else’s problem. And, you know, the great thing about the cloud is you’re always getting the latest. You’re always getting the latest software, you’re always getting the latest optimizations, you’re always getting the latest hardware. Someone else takes care of that for you. 

                              So start with something small, and, you know, many cloud providers, cloud services, offer free versions, or sometimes what we call freemium versions, where they’re free up until a certain amount of usage, so you can really start playing with this stuff literally for free.

                              And I think that, you know, what I hear from clients is that once they start using cloud services, especially the fully managed services where we take care of all the configuration tuning for you, they get hooked. They’re like, wow, this is great, I can be up and running so fast, I want more of that. Just start small.

                              Start small, start with something that’s close to free and try it out, and, you know, sort of find some examples and opportunities where you’ve just got a few gigabytes of data. And usually with a few gigabytes of data you can start using cloud services for free.

Al Martin:             Hey folks, as usual I had more questions than time allowed for, so we split this podcast into two. Part two will be next week, so stay tuned, and we’ll be right back at you.