

Watch Karen Hao's Session from MAICON: What Is AI? (Video)



Can you answer the question: What is artificial intelligence?

If the answer is no, don’t be bashful. It’s a complex topic, one that Karen Hao (@_KarenHao), AI Reporter for MIT Technology Review, tackled on the main stage at the 2019 Marketing AI Conference (MAICON).

Not surprisingly, the topic is ever-evolving. In Karen’s words, “What we may have considered AI 20 years ago, we may not consider it AI today.”

During her talk, she took us back in time to the origins of AI—dating back to 1956. From there, she chronicled key milestones along the way to becoming what we consider to be AI today. 

During her talk, she also discussed:

  • The difference between artificial intelligence (i.e. the reality) and artificial general intelligence (i.e. the dream). 
  • The two theories of intelligence (knowledge and learning) that each spawned their own types of AI (symbolic AI and machine learning, respectively). Note that symbolic systems are much less common today than machine learning and deep learning. 
  • How to figure out if something is using AI or not. 
  • And much more. 

Watch the full video below, or read the full transcription. Please note that the transcription was compiled using AI with Otter.ai, so blame any typos on the machine :)


Session Transcription 


Hi, everyone. Good morning. Thank you so much for joining me so early today. I am very excited to be here. And thank you to the organizers at MAICON, and especially Paul, for having me. We are not easing into this morning; we are crashing straight into a 30-minute crash course on the state of the current field of artificial intelligence. So thank you for being on this journey with me. So I wanted to start with actually a different question: Why is AI so confusing? And the reason is it actually refers to two different things. The first thing is the dream, the dream of the field, and this is the sci-fi rendition of a human-like embodied AI system that's super smart, super helpful, and works with us to solve the world's biggest challenges. This is the goal of the field. The second thing that it refers to is the reality. Our basic AI systems today, like voice assistants, chatbots, Netflix recommendation systems: none of them are anywhere near the dream. And if you've read, this was a hilarious thing that Siri did only three years ago. So today we've sort of separated these two concepts into two different terms. The dream we refer to as artificial general intelligence, and the reality we refer to simply as artificial intelligence. But these definitions are often conflated. They get really confused, and that's why, in the public imagination, artificial intelligence is seen as something that is more than what it really is today. It's also confusing because the reality keeps changing: it's getting better, getting closer to the dream. And that means the definition of AI is also changing. So what we may have considered AI 20 years ago, we probably no longer consider AI today. And so to really understand what AI is, we need to start with a little bit of history. So in the summer of 1956, a bunch of young white lads got together for an eight-week retreat at Dartmouth University to figure out how to mimic intelligence with computers. It was basically a summer camp.
They hung out every day, and they hashed out various ideas. And this is seen by many people as the founding event for the field. And it was also the event that coined the term artificial intelligence. And so from that eight-week summer camp, there emerged two different theories about intelligence. The first is that we're smart because we know a lot of knowledge. And so if we want to recreate intelligence, we should encode all of the world's knowledge into a giant database and give birth to a super smart computer, otherwise known as an expert system. The second idea is that we're smart because we know how to learn, and therefore if we want to create intelligence, we need to build software that can learn, and that is what we know today as machine learning. So eventually, these ideas evolved into two different camps. There were the symbolists and the connectionists. The symbolists were the knowledge camp, the connectionists were the learning camp, and the symbolists got busy converting the universe's knowledge into a set of logical rules. So it went something like this: they would tell the computer, birds can fly, robins are birds, and then the computer would logic out that robins can fly. Great. So the connectionists got busy designing software that could learn, and based on the basic principle that form fits function, they thought maybe we're really good at learning because we have brains. So if we want machines to be really good at learning, we should mimic the architecture between neurons and find a way to build those connections. And that's what gave birth to neural networks. So we'll get back to that later. Okay, so remember this happy camp photo? That did not last very long, because as these two camps evolved over time, they developed a very fierce rivalry. And it was basically like the Bloods and the Crips of the world. So the connectionists thought the symbolists were stupid, and the symbolists thought the connectionists were crazy.
And as a side note, the rivalry was so intense that this partition of ideas actually trickled into science fiction. So Samantha from Her is very much an embodiment of idea number one, that intelligence comes from knowledge. She comes into the world fully imbued with all of the knowledge and capabilities of expression. In contrast, Chappie is very clearly idea number two: he comes into existence with a completely blank slate, and then he learns very quickly from his environment. And if you haven't seen those two movies, I would highly recommend them. So back to the story. So the symbolists and connectionists: in the beginning, the symbolists were winning, and so the connectionists actually looked pretty crazy. Expert systems were all the rage, especially after they achieved some very public major milestones. So you might have heard of this: in 1997, IBM created Deep Blue, and it was the first computer system to beat the world's greatest chess master. And in this photo, you see that a human is executing what Deep Blue decides to do, because Deep Blue is just software; there's nothing to execute the actual moves. You may also remember this from 2011, when IBM Watson defeated the two best players on Jeopardy. Watson is also an expert system. And so for a long time in the public imagination, this is what AI was: it was these very, very knowledgeable machines.
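
The symbolist approach described above (tell the computer that birds can fly and that robins are birds, and let it "logic out" that robins can fly) can be sketched in a few lines of Python. This is only an illustrative toy, not how Deep Blue, Watson, or any real expert system was built; the `IS_A` and `CAN` tables and the `can` function are hypothetical names chosen for this example.

```python
# Hypothetical hand-written knowledge base; the facts mirror the talk's example.
IS_A = {"robin": "bird"}   # "robins are birds"
CAN = {"bird": {"fly"}}    # "birds can fly"


def can(thing, ability):
    """Walk up the is-a chain and report whether any category has the ability."""
    while thing is not None:
        if ability in CAN.get(thing, set()):
            return True
        thing = IS_A.get(thing)  # move to the parent category, if one exists
    return False


print(can("robin", "fly"))  # True: robin -> bird -> can fly
```

Every fact and rule in such a system has to be typed in by a person, which is the manual effort the symbolist camp took on.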


But then something happened. The symbolists' strategy hit a bit of a wall. Basically, if you want to try to distill all of human knowledge into a simple set of logical rules, it's actually really hard, because you can say birds can fly, robins are birds, robins can fly, but what about penguins? So there are lots of exceptions to the rules that we have in our world. And it gets really manually laborious to try and discover all of those exceptions and then encode them into the computer. And so symbolists ran into the issue of just not being able to deliver on their promises. It took very, very long to build these expert systems to do anything functional, the costs were driving up, and funding for their projects went away. And that is when connectionists started to win. Machine learning is essentially automating away the task of writing rules: instead of manually figuring out what the rules are, you give the machine a bunch of data and the machine writes the rules for you. It is much faster, much cheaper, and much easier to adapt to new environments. And so here's an analysis I did earlier this year, which shows the frequency of words changing in AI research papers over time, and you can see that the terms associated with machine learning, like "learning," "network," and "data," basically started to eclipse the terms that are associated with knowledge-based reasoning somewhere around the mid-2000s. So there were two other things that happened to make this shift happen. The first was in 2006, when a young computer science professor named Fei-Fei Li came up with an idea. She thought, what if we just drastically increased the amount of data that we feed machines? Then maybe machines will become more capable. And we kind of take this idea for granted today; everyone says data is the new oil. But at the time, she was ridiculed for it, and it was a really radical idea.
So what she did was she began an international effort based on this idea to capture the entire world of objects in a giant open source database of carefully annotated images. So it took two and a half years, and they amassed 3.2 million photos that contain everything from animals to people to household objects, whatever you could think of. And they called it ImageNet. And then she launched a competition to basically see which team could use this data to train the best image recognition system. So the second thing that happened is in 2010, another computer science professor named Geoffrey Hinton created a new design for a neural network. So if you remember, neural networks are the software that connectionists came up with to mimic the brain. But when it first came out, it wasn't very good. It was kind of stupid. It only had a few connections and few nodes, only one layer. And so it wasn't very good at processing images. And Hinton came up with the idea of, what if we essentially stack them and create what's known as a deep neural network with many layers? And suddenly neural networks became a lot more powerful; they were able to process much more sophisticated images. And that's what spawned the subdivision of machine learning that we know today as deep learning, because it's a portmanteau of deep neural networks and machine learning. So both innovations combined led to a dramatic improvement in machines' abilities to recognize images. You can see in here, this is a chart of the progress of the ImageNet competition. So in the first year, everyone was still quite far away from perfect accuracy. But in 2012, the first team, which was Hinton's team, used deep learning for the first time, and there was a pretty big leap in performance. So every year after that, through the course of the competition, everyone started using deep learning, and very rapidly, practically all the teams by 2017 had reached near perfect accuracy.
So this was a very, very clear demonstration of the power of deep learning. And what did that do for its popularity? It absolutely exploded. And now we don't just use deep learning for image recognition. As Paul mentioned, we use it for everything. And this is basically the AI hype cycle that we're in today. So to recap, there are two theories of intelligence, knowledge and learning. And those each spawned their own branches of artificial intelligence, symbolic AI and machine learning. And the learning theory also spawned another theory, that the brain's structure is the reason why we as humans are smart. So we should replicate it by creating neural networks, and that is what gave us deep learning. Great.


So now, almost everything that you hear is machine learning or deep learning. And symbolic systems are still around, but they're far less common. So, what is machine learning? The cut and dried of this is that machine learning is a process of using statistics to find patterns in data. And then the second part is it uses those patterns to make decisions. So you can kind of think about it like humans: we live our experiences, we learn lessons from those experiences, and we then apply those lessons to our actions and decisions. For machines, the data is the experience. So they learn the lessons from the data, and then apply those lessons to their actions and decisions. So machine learning can be applied to image data, and it extracts pixel patterns. You can feed it text data, and it'll extract word patterns. You can feed it audio data, and it'll extract sound patterns. Essentially, any kind of data that you can put into digital form, you can feed into a machine learning algorithm, and it will find a pattern for you and then apply it as you wish. So there are two stages to machine learning. There's the training stage and the deployment stage. In the training stage, you take your data, you feed it into the algorithm, and it develops what we call a machine learning model. And that model is essentially the codified version of the patterns, or the set of rules that the machine helps you write. Then you can take new data and put it into that model, and the model will compute what your desired output is. So here's a very simple example. Let's say you have data that is cat images, and you feed it into a neural network, your algorithm. And the model is essentially the pixel patterns that make a cat a cat. So then you can go and feed it new data. Maybe you have an image of a muffin; you feed it into your model, and hopefully, with some success, it will tell you that this is not a cat. So here's a pretty nice animation of this process happening.
Here we have the neural network. That's the algorithm. And then we have both cat and dog photos that we're feeding in, that are labeled. So we're telling it these are the cat photos, these are the dog photos, and it creates the model, known as a trained neural network. And then that trained neural network can now tell you that this image is of a dog. So this process is actually really simple, but it's really powerful for a whole host of applications. So you can see that for image recognition. That's what drives face recognition. It's also what drives medical diagnosis, which is a really big budding field in artificial intelligence, because a lot of medicine is based off of understanding and interpreting images like MRI scans or CT scans. It drives speech recognition, so you can transcribe your voice.
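
The two stages described above, training on labeled cat and dog examples and then deploying the resulting model on new data, can be sketched as follows. To keep the sketch short and runnable it swaps the neural network for a much simpler nearest-centroid classifier, and the two-number "features" are made up for illustration (a real system would work on pixels); `train`, `predict`, and the data are all hypothetical.

```python
from statistics import mean


def train(examples):
    """Training stage: turn labeled feature vectors into a model.

    The "model" here is just one average point (centroid) per label.
    """
    by_label = {}
    for features, label in examples:
        by_label.setdefault(label, []).append(features)
    return {
        label: [mean(column) for column in zip(*rows)]
        for label, rows in by_label.items()
    }


def predict(model, features):
    """Deployment stage: return the label whose centroid is closest to new data."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))

    return min(model, key=lambda label: squared_distance(model[label]))


# Made-up features for illustration: (ear pointiness, snout length).
labeled = [
    ((0.9, 0.2), "cat"),
    ((0.8, 0.3), "cat"),
    ((0.3, 0.9), "dog"),
    ((0.2, 0.8), "dog"),
]

model = train(labeled)                # training: data in, model out
print(predict(model, (0.25, 0.85)))   # deployment: new data in, label out -> dog
```

The split mirrors the talk's description: the patterns are extracted once from labeled data, and the stored model is then reused on inputs it has never seen.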


That's how, in part, you can communicate with Siri and Alexa. And if you recently had your bank turn on voice identification, where it essentially adds a security layer to recognize your voice, that is also basically a form of speech recognition. Text prediction is like autocomplete or autocorrect on your phone. Ranking systems are the less obvious one. They are essentially the systems that Google uses to organize the search results in your search feed and what Facebook uses to organize the posts in your newsfeed. So it's essentially taking data, which is your engagement and clicks from before, and finding the patterns that help it predict what you might like to see first versus last. And then recommender systems are particularly important in marketing. They're the reason why you get very good product recommendations on Amazon. And they're also what's behind all of the targeted ads that follow you on the internet. So you might have noticed that each of these groups essentially tries to mimic a human skill. As Paul mentioned before, these are all in quotes because they don't actually mimic it, or they don't actually get to what it is, but they do loosely mimic it. So image recognition is like sight, text prediction is like writing, so on and so forth. And you can use this insight to help you figure out what products use machine learning in the future. And so the flow chart that Paul was mentioning is something that I did last year to make a cheat sheet for myself for how to figure out if something is using AI. I won't go through all of it because it will be included at the end of my slides. But the top says: Can it see? Can it hear? Can it read? Can it move? Can it reason? And I encourage you to kind of make your own cheat sheet. This was my interpretation of how the different skills in AI break down, and it was a very useful exercise for thinking through what the field has currently accomplished. Okay, so this is the current state of the technology.
But it's not exactly the dream. And I don't know if there are any Good Place fans in the audience, but Janet is essentially this amazing embodied AI system that helps the people around her. So how far exactly is the dream? How far is Janet from becoming a real thing? If you remember the timeline from the beginning, this is essentially my interpretation of how far we are. So at the very far left is the 1956 Dartmouth camp. And at the very far right is the dream that we're trying to achieve. I really don't think we are that far. The current artificial intelligence that we have is not even as intelligent as a two-year-old. The thing that Paul was talking about with his kids: his kids have this amazing ability to create these things from scratch. And AI can never create things from scratch. It has to be fed initial data, it's always going to be a copycat, and it's always basically just trying to predict what happens next. And so this is a pretty big debate in the field. But based on the interviews that I conduct with researchers, I think most people would agree this is roughly where we are: we've barely moved a fraction of the way over. So how do we actually get to the dream? Well, one of the challenges in the field now is to integrate all of these skills. You'll notice that all AI systems currently are singularly good at one thing, but we don't have systems that are very good at multiple things. And actually, Janet is a good illustration of one way that AI researchers are now attempting to bridge this divide. So the symbolists and the connectionists are trying to put aside their differences and combine both of their approaches. And that kind of makes sense, because human intelligence probably is the product of our knowledge and our ability to learn. And so it's a little bit like Janet, in that she's infinitely knowledgeable in the show, but she also constantly evolves and learns with every iteration.
There are also other challenges that need to be overcome. So here's another challenge: AI bias. And here's an example of what I mean by AI bias. So the most common image recognition systems are much better at recognizing images from the US and Europe than from the rest of the world. So if you show it an image of a Western-style wedding where the bride is wearing a classic white dress and the groom is wearing a tie, it will recognize that that's the bride and groom, and that there's probably some ceremony happening. But if you show it an African-style wedding or an Indian-style wedding, it'll basically just say these are people. And there's a good reason for that. After the huge success of ImageNet, there were a lot of people that rushed to create other open source image databases to help facilitate better image recognition systems.


And the way that they went about building these databases is they essentially scraped whatever photos were available on the internet. And at the time when this happened, what was available on the internet was largely from the US and Europe. So this map shows the geographic distribution of one of the most popular open source databases. And you can see the biggest dots are from the US, the UK, and other countries in Europe. And there's basically nothing from Africa. And there are very, very few images from India. And so it makes sense that if the machines never saw any examples of African- and Indian-style weddings, then of course they're not going to be able to identify them. So Google last year started addressing this problem by launching an Inclusive Images Competition. And essentially what they are trying is both a data and an algorithm approach to making our image recognition systems better at being culturally sensitive. So the data approach is they are trying to build more databases that actually have images from other regions of the world. And the algorithm approach is, even if we have a limited or skewed data set, are there ways that we can use it anyway to build more inclusive image recognition systems? So here's another example, from MIT researcher Joy Buolamwini. At the start of her PhD, she discovered that commercial face recognition systems, the most common face recognition systems, did not even register her face. She had to wear a white mask for them to register it. So she began to audit some of the most common systems and released a groundbreaking study last year called Gender Shades. What she found is that the three biggest companies that offer face recognition systems, IBM, Face++, and Microsoft, all had huge gaps in the ability to accurately classify the gender of a light-skinned man versus a dark-skinned woman. So IBM in this case had over a 34% gap between the two, but Face++ and Microsoft honestly weren't that much better.
She conducted the study again this year, including Amazon and Kairos, which had just released their face recognition platforms, and it was the same exact thing. But fortunately, the three companies she audited originally actually improved, because when she did the study, they reached out to her to figure out how they could do better. And so this really demonstrates that there's actually a way to do better, and there's a way forward. Here's a super recent example, and I'm sure many of you are familiar with this. This was in March, when HUD sued Facebook. And they sued Facebook on two counts. And I think the first count was more widely reported, which is that Facebook was allowing advertisers to target their advertising based on race, national origin, and gender, and that was causing all kinds of housing discrimination problems. But the second count, which is more interesting, is that even if an advertiser doesn't explicitly target their ads, the machine learning algorithms will do that anyway, because that's what they are designed to do. And so right after the HUD lawsuit came out, there were researchers that then looked into this to see if it's true. And here's a study that was done, where you can see the researchers created all of these housing ads and put them on Facebook without restricting the audience at all. And because the machine learning algorithm is optimized for profit and engagement, it ends up showing the homes for sale to a higher fraction of white users, whereas it shows homes for rent to a higher fraction of minority users. And this holds true for employment as well. So without being prompted, Facebook's machine learning algorithm is learning that if you want more engagement on your job ads for nurses and secretaries, you show them to a higher fraction of women. If you want more engagement on your job ads for janitors and taxi drivers, you show them to a higher fraction of minority users. So obviously, that is deeply problematic.
And after this study came out, there was a great quote from a professor from the University of Michigan, where he essentially said big data used in this way will never help us achieve a better world, because it is just replicating the patterns from history into the future. And so there's a lot of research happening now to essentially figure out how to make AI systems not do this, so that they achieve our dream of being beneficial to us. So the last challenge that I want to bring up, which has gotten a lot of attention recently, is deepfakes. And deepfakes also illustrate really well the broader category of problems that relate to the abuse of AI: what happens when you design systems and then they're used incorrectly.


And deepfakes is a relatively new term. It's a portmanteau of deep learning and fake media. And it refers to any kind of image that was created using deep learning. And the gist of it is that deep learning got so good at mimicking patterns in pixels that it can generate its own patterns and create hyper-realistic things that have never existed before. So here is an example where on the left, you see Alec Baldwin impersonating Trump. And on the right, you see President Trump's actual face being overlaid onto Alec Baldwin's body with a deep learning algorithm. And if you only saw the right-hand photo, you would just see Trump saying whatever Alec Baldwin wanted him to say. Here's another example with President Obama. So in this one, Jordan Peele is impersonating his voice. And the left-hand side of the image, the video, is actually a video of Obama giving a national address. But the researchers took Jordan Peele's voice, synthesized the correct mouth movements for it to look like they were saying the same thing, and then overlaid it onto President Obama's face using a deep learning algorithm. So clearly, this ability is a really big national security challenge. Because if politicians can be made to say anything, we're kind of screwed. But it's not just for politicians that this is problematic. It's also already starting to be used in many different ways to affect vulnerable populations. So just last month, there was an app that was released called DeepNude. It also used deepfake algorithms to essentially undress women, or otherwise synthesize fake nude bodies and overlay them onto images of clothed women. And it was hugely controversial, obviously. Fortunately, this app was taken down. But if this app continued, it could cause a lot of grief to women who might be mistaken for having nudes on the internet. So, as a final recap, what is AI? Currently, today, AI is predominantly machine learning.
And that means it's predominantly the process of finding patterns in data. But AI is also an aspiration. It's an entire field of people working to recreate intelligence, so that it can help us solve challenges that we currently haven't figured out: climate change, hunger, poverty, disease. And hopefully when it does that, it'll help us all live happier, healthier, and better lives. But to get there, we have to overcome a number of challenges. We have to integrate all the skills together, we have to make sure that our systems are not biased, and we have to get rid of the abuse of AI.


But I'm optimistic that we'll get there. Thank you. So this is a Bitly for the articles that I referenced in my talk. I will probably add a few more later on, and it also has a link to my newsletter. Please subscribe if you would like to continue the conversation. It is completely free. And it also has a link to the subscription for Tech Review, if you would like to pay and continue the conversation. Thank you so much.

Transcribed by https://otter.ai

