Making Data Simple: Machine Learning for Dummies

Kick off 2018 by exploring machine learning and what it can do for your business. The authors of Machine Learning for Dummies – Judith Hurwitz and Daniel Kirsch – are here to help you.

In this episode, Judith, Daniel and Al discuss the state of machine learning today, how to use it to advance your business as well as discoveries they made while writing their book. Learn how small and large businesses alike can find insights from data to enhance relationships with customers. We’ll also share where you can get a copy of Machine Learning for Dummies at no cost.

Show notes

01.00 Connect with Al Martin on Twitter and LinkedIn.

01.10 Connect with Kate Nichols on Twitter and LinkedIn.

01.15 Connect with Fatima Sirhindi on Twitter and LinkedIn.

02.00 Learn more about Hurwitz & Associates.

02.10 Connect with Judith Hurwitz on Twitter and LinkedIn, and find her blog here.

03.20 Connect with Daniel Kirsch on Twitter and at Hurwitz & Associates.

04.00 Read Machine Learning for Dummies by Judith Hurwitz and Daniel Kirsch.

04.40 Learn what neural nets are here.

04.50 Learn more about Arthur Samuel here.

05.00 Learn more about how Deep Blue beat the world chess champion.

15.39 Learn more about Apache Hadoop.

17.30 Learn more about IBM Watson.

26.50 Find Cognitive Computing and Big Data Analytics by Judith Hurwitz, Marcia Kaufman and Adrian Bowles.

27.45 Find Everybody Lies: Big Data, New Data and What the Internet Can Tell Us About Who We Really Are by Seth Stephens-Davidowitz.

Ready to dig deeper? Check out our previous podcast episodes of Making Data Simple.


Al Martin:             So welcome back to the Making Data Simple series, and welcome to 2018. I’m not sure about you, but years go by like minutes, it seems, particularly the older you get. I don’t feel like I’m that old. But we’re excited about 2018. I desperately want to thank all the listeners out there because we have been growing exponentially, beyond my expectations, so thank you very much. We will continue to find interesting topics and folks to bring in that talk technology, leadership and everything under the sun. I also want to thank the producers that work tirelessly on this, Kate Nichols and Fatima Sirhindi. They find the great guests, IBMers or external too, industry experts, and I know that they have a few surprises for 2018, so stick around. Please give us feedback, and if you’re on iTunes or elsewhere, please rate us. We like to know how we’re doing.

So thank you, and here’s to the new year.

Al Martin:             Welcome to Making Data Simple.  This is Al Martin.  I have with me today Judith Hurwitz and Dan Kirsch.  So welcome guys.

Judith Hurwitz:     Thank you.

Al Martin:             Judith Hurwitz is the President and CEO of Hurwitz & Associates, a strategy consulting and research firm.  And they're focused on distributed computing technologies.

                              Dan Kirsch is also at Hurwitz & Associates.  He is a research analyst focused on security, governance and privacy. So we've got some big hitters today.  The topic that we're going to delve into is “Machine Learning for Dummies.”  And before we jump in there, I see that both of you have very accomplished careers and I would be remiss if we didn't have a formal introduction.

                              And we'll start with you Judith.  I am not the sharpest tool in the shed.  But I do notice that Hurwitz & Associates also matches your last name.  You can start there, please.

Judith Hurwitz:     Certainly.  So, as you said, I'm Judith Hurwitz.  I've been in the industry for more than 30 years.  Also, I am the author of eight books.  In fact, our most recent book is called Cognitive Computing and Big Data Analytics.  So that came out around 2015 and we continue to do a lot of research, thought leadership content.  Dan and I are both spending a lot of time right now on machine learning, AI and looking at that also in the context of the hybrid cloud and the requirements for security.

                              So it's sort of ironic that the three topics that we spend most of our time talking to customers, researching, analyzing are all coming together.  If you look at cloud, if you look at machine learning, and AI and cognitive computing, as well as security they're all entwined right now.

Al Martin:             Fantastic.  Thank you so much.  Dan.

Dan Kirsch:           I'm Dan Kirsch, Principal Analyst and Vice President of Hurwitz.  Been with the group for a number of years, but have been in the industry for fewer years than Judith.

                              But what I've been focused on is machine learning and, as Judith mentioned, we're really seeing machine learning throughout technology.  So we're seeing people applying machine learning to security, to hybrid cloud management, and these just happen to be the areas we focus on, so we're seeing it intermixed.

Al Martin:             Fantastic.  Thank you both for being here.  I greatly appreciate it.  So with this podcast, I like to provide a ton of differing points of view.  I give mine from time to time.

                              In fact, I just did a keynote that I put into a podcast that talks to, you know, my point of view on the overall making data simple strategy, if you will.  Today, you guys are kind of on the hot seat.  I want to start with the book, Machine Learning for Dummies.  I want to dive further into machine learning specifically and then come back to the book.  So hopefully that's okay with you guys.  Sound good?

Judith Hurwitz:     Sounds good.

Al Martin:             All right.  So first of all, let me say that I thought you did a nice job with the book, because I often take notes around machine learning and I noticed almost every topic was essentially outlined in the book.  So it was very comprehensive, so kudos there.

                              I want to start by - you already mentioned, Judith, you have several books “for Dummies.”  And you've got a perfect audience in me right there.  But why machine learning now?  And I'll kind of preface this by saying, one of the things that you talked about is what is old is new again.

                              And I agree with you because despite the hype, machine learning is really not new, nor are neural nets.  In 1959, and if I could give a plug, a shameless plug to IBM, Arthur Samuel created a program to play checkers.  You even have this in your book: in 1996 Deep Blue beat the world champion in chess and in 2011 we took on Jeopardy.

                              But when you wrote Machine Learning for Dummies, is it because it's just a relevant topic in the industry right now or do you feel like we're on some kind of threshold?

Judith Hurwitz:     As you were saying, machine learning has been around for a long time.  I think, you know, why now?  Because I get a lot of those questions from people, why now are we talking about machine learning and AI, neural networks when they've all been around for a while?  Why now?

                              And I think the reason is because again, reaching into some of the things happening with cloud.  With cloud, we've gone to an era where storage and compute are much cheaper than they were in let's say 1959, where actually you could not afford to store as much data as you need for machine learning.

                              You could also not do the level of compute and sophistication, performance of compute that you need.  So setting that aside, now we have the ability.  We're sort of in an era where we've reached as far as we can go with traditional programming, which is why you see the advent of micro services and containerization and you're now seeing that applied to data.

                              You know, we typically wrote code based on our prejudices and our assumptions about our businesses.  What machine learning begins to give us is the ability to use a variety of algorithms and the models derived from those to begin to learn from data.

                              So if you learn from the data, the data is going to start directing you to, you know, hey, I'm going to show you a pattern or an anomaly that you don't necessarily see within the code.

                              So you begin to use data to spot things that you couldn't before.  I mean, I remember one of my first IT jobs.  The CEO of the company wanted us to be able to – this was in HR's company – he wanted to be able to leverage all of the data across all of the business units and lines of business, to start looking at what we learned.  Not just from each department or each offering, but across all of these to truly understand our customers and prepare for, you know, the next generation, what's coming next.  And it couldn't be done, then.  This really begins to open up the world in a much more sophisticated way to really put data to use.

Al Martin:             Great.

Dan Kirsch:           I'd like to add…

Al Martin:             Go ahead.

Dan Kirsch:           So besides the cloud and all these technologies that let nearly every business do machine learning, vendors have also come out with tons of software tools that abstract a lot of the complexity of machine learning.  So machine learning isn't just the purview of PhDs who have studied this science for, you know, decades, but now business analysts can use some of the machine learning technology without needing to necessarily understand all the fundamentals of the underlying algorithms.

Al Martin:             If you could help me, though, can we take one step back and start with a little taxonomy or even definitions?  Can you describe, Dan, the difference between AI, deep learning and ML, maybe even cognitive?  Because there's a lot of folks in the industry that tend to use them interchangeably and I've got to say I'm even guilty at times of doing so.

Judith Hurwitz:     If you want, I can get started and then Dan can pipe in.  Let's start with the idea that we have an umbrella market or umbrella category called artificial intelligence.  Now, I don't think that the term artificial intelligence is going to be with us in the long term because we're really not talking about artificial.  There's nothing artificial about it.

                              What we're talking about is the ability to augment intelligence, and that's human intelligence and machine intelligence.  You put those together and so if we can change the word artificial to augmented, then we start to get to the core of this.

                              So if we think about AI as this sort of umbrella term, it encompasses a lot of the key areas that we're focused on: machine learning, neural networks, advanced analytics, predictive analytics.  All of these are the capabilities that are under the category of augmented intelligence.  And Dan, maybe you want to talk about that a little bit.

Dan Kirsch:           Yes.  Sure.  So, yes, as Judith explained, we see AI as sort of the top-level way of describing all of these different technologies.  And then within AI you've got reasoning, you've got NLP, which is natural-language processing.  You've got planning and then you've got machine learning.

                              And so, machine learning is its own area within AI.  And then even if you want to go deeper, within machine learning, you have a number of technologies.

                              So one of them is deep learning, and neural nets, there's some other things, like supervised learning, unsupervised learning, and those are sort of the fundamentals within machine learning.

                              And then you can even go deeper within each one of those categories into different algorithms for supervised learning, different deep learning techniques.  And you can keep going deeper and deeper and deeper.

Al Martin:             So thank you for that.  Here's a question I have.  I'm just curious, is machine learning today premised around software writing software or is it really in terms of bettering the algorithms to predict and to prescribe the outcomes that you're going after?

Dan Kirsch:           Well, oftentimes I think machine learning is - it depends who you're talking to - oftentimes, success with the algorithm.  However, sort of what we see is, you know, experienced data scientists will often choose different algorithms that they like to use.  I mean, you have some people who always use a neural network, even if you don't necessarily need one to solve that problem.

                              And then you have others who, you know, just say, "Oh, you can do that with a regression algorithm."  The way I see it is, right now, machine learning is focused on algorithms constantly improving the underlying model.

                              So it's not improving the algorithm, but instead, improving the model and then of course to improve the model, you need good data and to frequently update the model.  So it's no longer - you aren't creating a model and then setting it out and leaving it alone for six months or a year, but we're seeing companies continually adding data to the algorithm to update the model.
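The distinction Dan draws here, a fixed algorithm that keeps improving a model as data arrives, can be sketched in a few lines of Python.  This is only an illustrative sketch, not anything from the book or the episode: the class name OnlineLinearModel, the train helper, the learning rate and the toy data (generated from y = 2x + 1) are all assumptions made up for the example.  The update rule (the algorithm) never changes; only the weights (the model) do.

```python
from dataclasses import dataclass


@dataclass
class OnlineLinearModel:
    """Hypothetical one-feature model: y is approximated by w * x + b."""
    w: float = 0.0
    b: float = 0.0
    learning_rate: float = 0.05

    def predict(self, x: float) -> float:
        return self.w * x + self.b

    def update(self, x: float, y: float) -> None:
        # The algorithm: one gradient step on squared error for one example.
        error = self.predict(x) - y
        self.w -= self.learning_rate * error * x
        self.b -= self.learning_rate * error


def train(model: OnlineLinearModel, examples, epochs: int = 1) -> None:
    """Feed examples to the model; call again whenever new data arrives."""
    for _ in range(epochs):
        for x, y in examples:
            model.update(x, y)


if __name__ == "__main__":
    model = OnlineLinearModel()
    # Made-up data that follows y = 2x + 1.
    first_batch = [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0, 3.0)]
    train(model, first_batch, epochs=5)
    print("after first batch:", model.predict(4.0))
    # Later, more data arrives; the same model object is updated in place.
    later_batch = [(x, 2 * x + 1) for x in (0.5, 1.5, 2.5)]
    train(model, later_batch, epochs=500)
    print("after later batch:", model.predict(4.0))
```

Calling train again with fresh examples is the code-level version of "continually adding data to the algorithm to update the model" rather than fitting once and leaving it alone for six months.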

Al Martin:             You mentioned this in the book.  You know, where does the line of distinction reside between supervised learning, unsupervised learning and reinforcement learning?  Where are we kind of at today?  Where's the focus?  Can you give me some information around that?

Judith Hurwitz:     So when we think about supervised learning, that's most of what we do today.  And supervised learning is you sort of know the area that you want to focus on.  You know that you understand the nature of the data. It's very structured and the problem is relatively well-known.  So you know exactly what type of data, what type of model that you need.  And you apply that subset or that cluster of data to that problem.

                              Unsupervised learning is when you don't really know what the data's going to tell you.  It's more of a free for all.  You're going to throw a lot of data at it and then see where the data leads you.  So it's a very different approach.

                              Obviously, for unsupervised learning, you need a lot more data.  You need to be able to cluster that data and iterate on that quite a bit.  You know, I think we're much earlier in the use of unsupervised learning because it's best suited for what we call unlabeled data: social media, things like Twitter.

                              You're looking, you know, for a pattern in the data, but it's not always really well understood.  So in healthcare, you might be collecting huge amounts of data about a specific illness or disease, to help doctors understand the patterns of symptoms, but you can't go in and sort of label it and say, "I know what this data is."  You don't have the ability to do that.

                              So now, there are techniques that are evolving to help you, even if the data isn't labeled, to help you incorporate labels into that and it determines what the labels are.

                              Now, reinforcement learning is what we call a behavioral learning model.  So its algorithms get feedback from the data and the user is guiding it to outcomes.  It's unlike other forms of supervised learning because the system isn't trained with a sample data set; rather, the system learns through trial and error.

                              So for example, you would use reinforcement learning to manage the movement of a robot.  You might use reinforcement learning for a self-driving car, which of course then leads us into neural networks, which is a form of reinforcement learning.
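The difference between the first two approaches Judith describes can be shown with a toy Python sketch.  Everything in it is an assumption made up for illustration (the function names, the "small"/"large" labels and the six numbers), and it uses a nearest-centroid classifier and a one-dimensional k-means as simple stand-ins for supervised and unsupervised techniques.  Reinforcement learning is not shown.

```python
def fit_centroids(values, labels):
    """Supervised: every value comes with a known label."""
    totals, counts = {}, {}
    for value, label in zip(values, labels):
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def classify(centroids, value):
    """Predict the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def kmeans_1d(values, k, iterations=20):
    """Unsupervised: no labels, just look for k groups in the values."""
    ordered = sorted(values)
    step = max(1, len(ordered) // k)
    centers = ordered[::step][:k]  # simple deterministic starting centers
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for value in values:
            nearest = min(range(len(centers)),
                          key=lambda i: abs(centers[i] - value))
            groups[nearest].append(value)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c
                   for g, c in zip(groups, centers)]
    return sorted(centers)


if __name__ == "__main__":
    amounts = [1.0, 1.2, 0.8, 9.0, 9.5, 8.5]  # made-up measurements
    # Supervised: someone has already labeled each amount.
    labels = ["small", "small", "small", "large", "large", "large"]
    centroids = fit_centroids(amounts, labels)
    print(classify(centroids, 2.0))
    # Unsupervised: same numbers, no labels; the grouping comes from the data.
    print(kmeans_1d(amounts, k=2))
```

In the supervised half the labels tell the code what it is looking for; in the unsupervised half the same numbers are grouped with no labels at all, and it is left to a person to decide what the two groups mean.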

Al Martin:             So from a consulting standpoint, where are you helping most of your clients?

Judith Hurwitz:     So what we are finding is that the sophisticated scientist doesn't need us because they understand what problems they're solving and they love to work with these models, and algorithms and tools.  It's when the question is, how am I going to put together a strategy that allows me to use this sophistication to better understand my customers, to do a better job in, you know, using this as a strategic tool and as part of my business strategy?

Dan Kirsch:           Yeah.  As Judith was saying, what we're often doing is helping companies spot areas where they can apply machine learning and these different learning techniques.  So oftentimes we encourage people to look for a, you know, quick win.  Rather than trying to solve every problem that a company has, it's where can you apply machine learning to a complex problem, but one where you can sort of get a proof of concept done in maybe a couple of weeks, rather than a six-month project.

                              Because where we've seen companies fail with machine learning is when you have a massive project, it's going to take a year to develop.  At the end of the day, you didn't have a clear business problem and you might have some interesting insight from using machine learning, but you don't really have any results to show.

Al Martin:             It's interesting to me that you say that, because I visit a ton of clients, it's part of my role.  And I find that a lot of clients, as unbelievable as it may seem, are not defining the problem before they're implementing the solution.

                              Like for example, I was talking to a client the other day and they were talking about how they made the transition to Hadoop, but then they went on to tell me how they were structuring everything, you know, getting everything in Hadoop structured.

                              And the more they went on, I was sitting there thinking, well, you know what would be great for that, is just a relational database.  I'm not sure why they went to Hadoop.  And I think sometimes it's they've got that on a checkmark.

                              It's almost like looking for magic.  Is that crazy or, I mean, are you guys seeing that too, I guess, in your area?

Judith Hurwitz:     That was really what Dan was talking about.  So what we see a lot is when you're letting the technical developer, the expert, you know, the data scientist, drive strategy, initially they may be interested in cool tools that they can take advantage of.  The reality is - I sort of tell a joke that if you're trying to measure the distance between two points, a ruler is the best tool to apply to the problem.

                              If you start with cool technology as opposed to what you can do with it - and it's funny because I see this both from customers who say, "We want to use Hadoop.  We want to get into machine learning.  We want to use AI in our business."

                              You first have to start with what problem you're trying to solve and I see the same things with emerging vendors in spaces like this.  They will say, we have a cool tool but then they'll forget to talk about what problems did that solve for the test.

Al Martin:             Yes.  The way I handle that, I mean it's very difficult, is usually I bring them back to what data they have at hand.  And then I identify where they are in the maturity curve and what I find out is that they're often still in data preparation.  And then we talk about, okay, what problem are you trying to solve again, and then match that back up.

Judith Hurwitz:     Right.

Al Martin:             But, I mean, that's like a pervasive issue that I see across the industry.  I mean, they're going to Hadoop before they know why.  I mean, Hadoop's great.  It's got a perfect use case, anyway.  It's interesting that you're seeing the same thing.

                              So this brings me to another story in that I had a keynote at a client success conference some time ago and I walked out of the keynote and, you know, it was in the area where they had all the booths.

                              And there was one booth that had - talked about client support with artificial intelligence.  So I went over to the gentleman that was leading the booth and we started talking.

                              I used the knowledge that I had and we started talking about algorithms.  I started talking about data science.  I started talking about tools.  Kept going on and all of a sudden, at some point in time, I said, "You know, we are using this in IBM, we're using Watson to identify problems, kind of like you'd seen, for those that are listening, in the Jeopardy games, so it really works well for us."

                              But he paused for a moment, and he said, "This isn't like IBM Watson or anything."  And almost under his breath he said, "These are just fundamental statistics and some algorithms."  So I was kind of like, "Hmm."  My point in that is I see machine learning deep in research and development.  I also see companies grossly stealing the name, even AI or otherwise, to just get attention because it's the theme of the day, if you will.

                              But I also think, to your earlier point, unlike ever before, we've got datasets, we've got tooling, the plummeting cost of compute and storage, ML can be a reality, which brings me to my question to you: what's your view on the practical implementation of ML today and where are you seeing it applied in providing real impact?

Dan Kirsch:           All right, yes.  So some of the most common areas of machine learning use are in financial services companies, who are often ahead of the curve in terms of technology investments and are using it in all sorts of areas.  So for instance, fraud analysis: they need to make instant decisions on every single credit card swipe on whether this is fraudulent or not, and if they allow fraudulent charges to go through, they have to reimburse the customer.

                              So that's a huge area where we're seeing companies using machine learning, where you're looking at the patterns of the customers, you're looking at where the purchase was made, how large was the purchase, what time of day is it, maybe whether or not you know the customer is traveling.  Another area where we're seeing machine learning is of course retail.

                              So, you know, everyone has seen on Amazon, you know, customers who bought this, also liked this and that.  We're seeing machine learning being a big part of ecommerce sites’ strategy.

                              Another area is also security systems.  So looking at network security along with employee activity, sort of the next generation of security operations are implementing a lot of machine learning to give context to their traditional security devices.

                              So they have thousands of alerts a day that this behavior might be malicious.  Well, if you're able to look at that data in context, you might be able to eliminate 75% of those, rather than having an analyst spend two minutes looking at this alert, just to say nope, it's not important.

Al Martin:             So let's talk a little bit about that technology though.  In your book, you provide organizations recommendations on how to, you know, get started, applying machine learning.  Can you summarize some of those key points?

Judith Hurwitz:     So in terms of getting started, you have to be educated.  And as we were talking about before, you also have to understand what business problem you're trying to solve.  It may be the type of problem that a traditional BI tool will solve.

                              So you have to make sure that you have enough data that your problem could be supported by a sophisticated machine-learning model.  But in many cases, what organizations are finding, they have a lot of hidden data.

                              They have a lot of data in places where they don't even think that they have data.  They have data from maybe from partner organizations, and they also have to understand, and say, "What are the data sources?  What type of data are we talking about?  Where is that data?  Do I have, you know, data siloes across 30 different divisions of my company?"

                              So first, you have to understand what your problem is, where the data is, the nature of that data.  And only after you've done that, and understand, you know, what's holding you back and the problems you want to solve, only at that point can you begin to sort of take it to the next level from a business perspective.

                              And then you also have to then look at what resources, which are on the market, can help you.  So, you know, figure out the tools and technology that are most - a best match for the problem you're solving.

                              And you don't want to go from that point to, all of a sudden, investing billions of dollars in a massive effort.  You want to first do a pilot project.  And you want to gate it so that it's manageable and something that you can do in a few months because that's the only time that you'd really be able to understand the impact of what's possible, because you're doing something that's manageable, you can get a quick result or maybe you fail.

                              But you learn a lot from that.  So being able to sort of pilot and sort of try things out is quite important here.  And then, you know, evaluate what you've got.  What happened, what worked, did we suddenly learn new things that we didn't even understand before?  Has it taken the biases out?  Were we making assumptions where the data's telling us, no, that's actually not true?

                              So once you have the pilot, and you have some failures and some successes, that's when you can begin to build a true plan to involve data science, involve some of the higher-level tools that are beginning to come onto the market.  So it can help you.

                              One thing I really like in some of the tools is the idea that you feed the tool your data and it helps you determine which is the best algorithm and the best model to solve your problem.  So things like that, and being able to pre-train some of your data.  So if you're dealing with a specific problem, you can use pre-trained data which will save you a tremendous amount of time.

Al Martin:             Well thank you Judith.  Where do you think we're going in the future now?  And I was pleased to see that you had a future section in your book as well.  I mean, do you think we're heading towards machine learning as a service?  And I guess the second part to that question, if you can answer it, is how does the little guy compete with the IBMs, the Facebooks, the Googles of the world, as it relates to machine learning, or do you see it purely as complementary?

Judith Hurwitz:     When you say the big guys, do you mean from a vendor’s standpoint?

Al Martin:             I mean it's an open general question.  But you look at IBM, Google, I mean, they've got armies, you know, focused on augmented intelligence, focused on machine learning and if you're wanting to compete against them, I mean, what is your - the little guy trying to get into this space, what should you be thinking, using their platform or…?

Dan Kirsch:           Well, I think it's complementary, because, you know, you can use some of the machine learning tools from the big guys, so the IBMs, the Googles, but smaller companies, research groups, other vendors, they're going to have unique data that IBM or Google doesn't have.  Hospitals might have imaging data that no one else has.

                              And so I think it all comes down to the data.  You've got unique data or a different way of looking at that data, and I think just taking Twitter data or some other data that's probably available, but then augmenting it with your unique data, you're going to have output that other people aren't going to have.

Al Martin:             So that's very interesting.  Thank you for that.  So let me let the listeners know that, very quickly, where they can find your book.  It's on ibm.co/mlfordummies.  Again, ibm.co/mlfordummies.

                              I'd like to do a quick lightning round, just ask a few personal questions.  Don't worry.  They won't be that personal, so our audience gets to know you a little bit.  So quick questions and thank you so much again.

                              Judith, I've got to ask you this question.  So clearly you're an entrepreneur.  You're on several boards.  You also seem to be very technical as well.  How do you manage being technical and still running a business?

Judith Hurwitz:     Well, I've got great people like Dan helping me.  And that, you know, I think one of the little secrets is, as I said, there's nothing new under the sun.  So when there's an emerging new technology that I pick up on, I know where it came from and I typically have done research or consulting in that area.  And so it's like, "Okay, I know what that is."  Then it's not a big leap.

Al Martin:             Very good.  So Dan, similar question to you.  So you've got this book, you've written it with Judith.  How do you find the time?  I find that leaders have some form of practice or cadence that they, you know, they get up every day, they do some kind of-- they write, they work to learn.  Do you have a practice or cadence that people could learn from?

Dan Kirsch:           Well, I subscribe to a few newsletters that I always like to read every morning, some email newsletters and then I think the way that Judith and I work is we try to collaborate on nearly everything.

Al Martin:             Teamwork.

Dan Kirsch:           Exactly.  So I think it's passing things back and forth.  If you're struggling with something, hand it off and take something they were doing.  And sort of collaboration and teamwork, I think.

Al Martin:             Hey, one last question.  I am a collector of books.  And that's why I'm so interested in Machine Learning for Dummies.  So the first recommendation I'll give the community is go read Machine Learning for Dummies.  Having said that, are there any other books and all your other books, by the way, but any other books that you'd recommend that you think are just stellar in that, you know, people should read?  I keep a list, because I try to get through them all.

Judith Hurwitz:     Well, I would definitely recommend our other book, Cognitive Computing and Big Data Analytics.  You go through a lot of these issues and it puts it in context with a lot of issues around unstructured data, so I recommend that you get a hold of that.

                              There are so many good books, you know.  Off the top of my head, I can't tell you the authors and the names but when I was doing the, you know, original research for the Cognitive Computing book, I read a bunch of books on, you know, Machine Learning, and NLP.

                              NLP, you know, Natural Language Processing, is getting so hot right now.  And there's some really stellar leaders who have written very technical but approachable books in those areas.  Sorry, I can't rattle the names off the top of my head.

Al Martin:             Fair enough.  Fair enough.  Anything else you'd add, Dan?

Dan Kirsch:           No.  I mean, books are great.  Then also there's an interesting book, Everybody Lies.

Al Martin:             Everybody Lies.  Oh, the name is interesting.

Dan Kirsch:           Yes.  And there's another book.  Yes.  I cannot…

Al Martin:             That's all right.  All good.  Hey, just for the listeners, one last time, ibm.co/mlfordummies, go get it, live it, love it, learn it.

                              Judith, Dan, a pleasure having you today.  Thank you for joining.  I've learned a lot.  And I appreciate it.  Until we run across one another again, thank you so much.  And I'll talk to you next time.

Judith Hurwitz:     Excellent.  Thank you.  It's been fun.

Dan Kirsch:           Thanks.

Al Martin:             Thanks, Dan.

Judith Hurwitz:     Take care.