Editor’s Note: This post is republished with permission from Trust Insights, a company that helps marketers solve issues with collecting data and measuring their digital marketing efforts.
In this week’s In-Ear Insights, Katie and Chris discuss what happens when junior or naive AI engineers or data scientists make bad choices for algorithms.
Using an example from a writing analysis website, we discuss what went wrong, what an appropriate choice should have been, and why it’s likely things went sideways. Most important, we discuss ways to prevent this from happening to you, and the importance of QA for AI as a continuous process.
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for listening to the episode.
Christopher Penn 0:02
This is In-Ear Insights, the Trust Insights podcast.
AI Academy for Marketers is an online education platform designed to help marketers like you understand, pilot, and scale artificial intelligence.
The AI Academy features deep-dive certification courses of three to five hours, along with dozens of short courses, 30 to 60 minutes each, taught by leading AI and marketing experts.
Join Katie Robbert, CEO of Trust Insights, and me, Christopher Penn, chief data scientist of Trust Insights, for three of our courses in the academy: Five Use Cases of AI for Content Marketing, Intelligent Attribution Modeling for Marketing, and Detecting and Mitigating Bias in Marketing AI.
The Academy is designed for manager-level and above marketers and largely caters to non-technical audiences, meaning you don’t need a programming background or a background in data science to understand and apply what you learn.
One registration gives you unlimited access to all the courses, an invitation to a members-only Slack instance, and access to new courses every quarter.
Join now and save $100 off registration when you go to TrustInsights.ai slash AI Academy and use registration code PENN100 today. That’s TrustInsights.ai slash AI Academy, and use registration code PENN100 today.
In this week’s In-Ear Insights, we’re talking about the selection of algorithms when it comes to data science, machine learning, and AI.
I was having a conversation in my writers group not too long ago, and somebody found this really interesting-looking tool called I Write Like. What you do is you paste in some of the writing that you’ve done, and it tells you what author you write most like.
That sounds cool.
But then some of the folks in my group said this thing doesn’t seem like it’s working right.
One of my friends pasted something in, and it said, you write like Anne Rice. Another one posted that it said, you write like, you know, Leo Tolstoy.
But then, cleverly, my same friend pasted in actual copy from that author, like, the last part of Interview with the Vampire, and it said this person writes like Stephenie Meyer. Like, whoa. It should have been super easy for the software to say that’s Anne Rice’s writing.
It should have said you write like Anne Rice, and it didn’t.
When you go to the about page on the site, it says, you know, this is an AI project that we built, and we’re using Naive Bayes classifiers to do this cool thing.
I read that and thought, that is the dead wrong algorithm to use.
Naive Bayes classifiers are kind of what’s used for things like your spam filtering, but for doing this, which in this case is a specific discipline called stylometry, that’s totally wrong.
Like, you could be more wrong than that, but it would be an effort to do that.
Which brings up the question: who did this, and why? Why did they make those choices?
Because from an outside perspective, as a data scientist, I see this and go, this was put together by somebody who didn’t know what they were doing. It was put together by somebody who’s not a data scientist, maybe a coder.
And it raises the really important question I want to ask you, Katie, which is: what do you do when you have people who are building data science stuff, building these tools for the web, and it can be a fun application, but it could be very serious applications, like marketing automation, where they’re choosing algorithms that are not suited to the task?
They’re choosing algorithms because maybe it’s what they know, or maybe it’s what they read about in a magazine or something, but it’s wrong.
How do you prevent that? How do you compensate for that?
Katie Robbert 3:39
Well, so let me take a step back, because the average marketer, who is not a data scientist, wouldn’t necessarily go to the about page, read, okay, this is the algorithm that we chose, and be like, oh, that’s the wrong algorithm.
So the example that you gave is actually a really good sort of QA part of the process to say, is this thing working as expected?
Because the trouble that we get into as marketers and as consumers is that we tend to just go ahead and trust the technology that’s put in front of us and say, well, it’s technology, so it must be right.
It must already know what it’s supposed to do, or the person who built it must know how it’s supposed to work.
And they must have chosen the correct algorithm.
So in your example, when you actually put the author into the algorithm and it came back as someone else, that is obviously the first clue that, okay, this thing isn’t working correctly.
So for people who don’t know, the specific algorithms that should be chosen, that’s a really great way to test it out.
So you have sort of the tester, but then you have the builder and there’s a responsibility on the person creating the algorithm to have a sense of knowledge of am I choosing the right one.
So how do you get to choosing the right one? Well, it takes some due diligence, it takes some research, it takes some proof of concept.
It takes some testing. It goes back to the software development lifecycle or the scientific method: you don’t just throw technology at a problem that you want to solve.
You have to figure out, what’s the problem I’m trying to solve? And then start to do your exploration of what is available and what’s going to get me to the answer.
So in the example of this, you know, who-do-you-write-like algorithm, the person creating it, or the team creating it, it sounds like, didn’t do their due diligence. Ideally, they should have done some testing to say: if I put in Anne Rice and her content, it should spit back Anne Rice; if I put in Stephen King, it should spit back Stephen King; if I put in Stephenie Meyer, it should spit back Stephenie Meyer.
It sounds like what happened was they took some keywords of topics that people write about, like vampires.
So Anne Rice famously wrote Interview with the Vampire, whereas Stephenie Meyer famously wrote Twilight.
Both are stories about vampires; both are very different.
But the commonality, the overlap, the Venn diagram, is the term vampire.
And so it sounds like, to me as an outsider, the algorithm goes, oh, this is about vampires, it must be Stephenie Meyer.
That to me is problematic.
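Editor’s note: the known-answer check Katie describes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the site’s code: `known_author_failures`, `keyword_classifier`, and the sample passages are hypothetical names and placeholder text invented for this example, and the toy classifier deliberately keys on a topic word to mimic the failure she suspects.

```python
from typing import Callable, List, Tuple


def known_author_failures(
    classify: Callable[[str], str],
    labeled_samples: List[Tuple[str, str]],
) -> List[Tuple[str, str]]:
    """Return (expected, predicted) pairs for every sample the classifier gets wrong."""
    failures = []
    for expected_author, text in labeled_samples:
        predicted_author = classify(text)
        if predicted_author != expected_author:
            failures.append((expected_author, predicted_author))
    return failures


def keyword_classifier(text: str) -> str:
    """Hypothetical stand-in that looks only at a topic keyword, not at writing style."""
    return "Stephenie Meyer" if "vampire" in text.lower() else "Unknown"


# Placeholder passages invented for illustration; they are not quotes from either author.
samples = [
    ("Anne Rice", "Placeholder passage in which a vampire recounts his long life."),
    ("Stephenie Meyer", "Placeholder passage in which a teenager meets a vampire."),
]

failures = known_author_failures(keyword_classifier, samples)
print(failures)
```

An empty list would mean every known sample came back with its own author; here the Anne Rice sample is reported as a mismatch, which is the kind of signal a builder would want before launch.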
Christopher Penn 6:28
And a big part of this, and I’m debating whether you should have subject matter expertise in things like, you know, Twilight and Fifty Shades of Grey.
But the reality is that what seems to have gone missing here is not a technology thing so much as the people didn’t have the domain expertise in writing to understand that there’s an entire subsection of natural language processing devoted to writing and the analysis of writing, again, called stylometry.
If you don’t know writing and you don’t know natural language processing, you need domain expertise in both areas.
And it sounds like in the case of this software, the person who put this together had neither. Instead, they used Naive Bayes classifiers, which, again, is like spam filtering.
And “does it look like spam, yes or no” is a very different application than, linguistically, does your writing resemble this, what level of overlap is there between these two things? Because Anne Rice and Stephenie Meyer write very differently; their styles of writing are very different.
And if you’re using, in this case, the appropriate algorithm, called cosine similarity, you’d be able to very easily pick apart both and say, whoa, these are two very different people.
How do we stop this from happening in, you know, any kind of mission-critical application? I mean, certainly, QA is important, but you don’t want to wait to QA the finished product.
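Editor’s note: for readers who have not met cosine similarity, here is a minimal sketch of the idea in Python. It is an illustration only, not the method of any particular stylometry tool: the helper names `word_frequencies` and `cosine_similarity` and the two sample sentences are assumptions made up for this example, and real stylometric work would use far longer texts and carefully chosen features.

```python
import math
import re
from collections import Counter
from typing import Dict


def word_frequencies(text: str) -> Dict[str, float]:
    """Map each word to its share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    return {word: count / total for word, count in Counter(words).items()}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse word-frequency vectors."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty text has no direction to compare
    return dot / (norm_a * norm_b)


# Invented sample sentences, used only to exercise the functions.
sample_one = "I went to the river and I waited for you."
sample_two = "She went to the river and she waited for him."

score = cosine_similarity(word_frequencies(sample_one), word_frequencies(sample_two))
print(round(score, 3))
```

A score near 1 means the two texts use words in nearly the same proportions, and a score near 0 means they share almost nothing. Unlike a yes-or-no classifier, the result is a graded measure of overlap between two pieces of writing.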
Katie Robbert 7:59
Agreed. And, you know, this goes back to the AI lifecycle that we put together.
It basically covers things like business requirements, data requirements, and model and algorithm requirements.
And so, you know, the common thread through a lot of these questions is the planning piece of it.
Did you stop and make a plan? Or are you just trying to rush a piece of tech out the door so that you have a voice in all of the noise of AI right now? My guess is that it’s the latter, not the former, and so people are just trying to rush to slap something together, put a simple website interface on it, and say, let me have people start using my thing so I get my name out there.
That is the absolute wrong way, and an irresponsible way, to do AI.
And I put “do AI” in this big sort of quotation because there’s use AI, create AI, talk about AI. It’s the same thing with what’s going on in the world right now with vaccines: skipping all of the clinical trial process, and it’s incredibly irresponsible.
Now the AI that we’re talking about is not life or death.
But it could be, it could be part of making life or death decisions, whereas these vaccines, literally are life and death.
And they’re skipping important parts of the process, which are planning and testing and retesting and replanning and tweaking. They’re just trying to shove something out there, just to be the first to market to do it.
AI is very much the same way you can’t responsibly just shove something out the door and say I’m first to market.
There has to be that process.
And I’m sorry, the process can be boring and daunting, but you cannot skip it and responsibly say, I’ve created something for you.
Here you go, use it.
Christopher Penn 9:57
In this case, it may not be life or death.
But it may be lifetime quality. Stylometry software is used in criminal justice applications, like, did somebody write this note? Is this, you know, the thing that is evidence, admissible evidence?
And so it absolutely should be concerning to anyone when you see something like this. Is this same person building a commercial application that is being used in a court of law, maybe telling a jury, hey, you know, this person wrote this thing, when in fact they didn’t?
Katie Robbert 10:31
Right. And I think that that’s a really great example, because we think of writing analysis in terms of, you know, am I plagiarizing, or am I grammatically correct? Grammarly, for example, is a really great tool that just sort of goes through and makes sure that you have the right punctuation and the right verb tense and those things. But when you really take it that step further, of analyzing someone’s writing stylistically, you know, is this the note that the person who has been stalking me for a year wrote? Or is this just someone trying to mess with me, knowing that this situation is going on? And that can be a life or death situation, because it could mean someone going to jail or not.
Christopher Penn 11:13
So we have to build in QA, and we have to define business requirements better. In this particular project, even though it was a fun project, there probably should still have been a business requirement like, you know, gets the correct answer 95% of the time, as opposed to gets the correct answer, you know, 40-ish percent of the time.
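Editor’s note: a requirement like “gets the correct answer 95% of the time” can be turned into an automated check. The sketch below is one hypothetical way to do that in Python. The function names, the default threshold argument, and the ten made-up predictions are all assumptions for illustration; the only figures taken from the conversation are the 95% target and the roughly 40% observed rate.

```python
from typing import Sequence


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of samples where the predicted author matches the expected author."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must be the same length")
    if not expected:
        raise ValueError("at least one labeled sample is required")
    correct = sum(1 for p, e in zip(predicted, expected) if p == e)
    return correct / len(expected)


def meets_requirement(
    predicted: Sequence[str], expected: Sequence[str], threshold: float = 0.95
) -> bool:
    """True when measured accuracy reaches the business requirement."""
    return accuracy(predicted, expected) >= threshold


# Made-up results for ten labeled passages: only four come back correct.
expected = ["Anne Rice"] * 5 + ["Stephen King"] * 5
predicted = (
    ["Anne Rice"] * 2
    + ["Stephenie Meyer"] * 3
    + ["Stephen King"] * 2
    + ["Stephenie Meyer"] * 3
)

observed = accuracy(predicted, expected)
passes = meets_requirement(predicted, expected)
print(observed, passes)
```

Running a check like this on every build, rather than once before launch, is one way to treat QA as the continuous process discussed in this episode.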
Katie Robbert 11:33
Well, and I think that, you know, that goes back to the responsibility of the person publishing this publicly because, you know, it’s your reputation.
It’s your skill set that people are noticing.
And so, you know, I would say, a majority of the population might not know to think to look for these things.
They may be like, Oh, this is just kind of fun.
Oh, look, I write like, you know, John Grisham, for example.
You know, and then be like, oh, that’s kind of fun, whatever.
But then you have people like yourself and people in your writing group who take writing very seriously, who do it for a profession who do it for a living, and are going to be very skeptical and critical of it.
And so you’re absolutely right.
It sounds like one of the missing pieces for this project that they did was that subject matter expertise.
Now, do you need to be a subject matter expert in every single author that has ever walked the planet? No, that’s an impossible thing to do.
So it might mean that you get a couple of different, you know, consultants and experts in different genres to sort of weigh in and be like, does this look like it’s correct? And I think that that’s sort of the key is that AI very rarely is a solo project.
You have to have a team of diverse people from different backgrounds just to really weigh in and gut check.
Is this thing working as expected? Are we putting out something responsibly to give people, you know, the correct information? Even if it is just a fun thing, because you don’t know where it’s gonna go from there.
Christopher Penn 13:05
In terms of that, then, where is the process of someone auditing this stuff, especially an external auditor coming in? Because again, typically, external auditors come in on a project or a company long after the fact. Your finance auditors come in, you know, at the end of the tax year or the beginning of the tax year, long after the actions that might have tax impact have taken place.
They just go in to verify that you did what you said you were going to do and that you’re in compliance with the law.
When it comes to AI, it sounds like we need auditing at a very different place than, hey, you did the thing, you rolled it out, and we’ll certify that you did the thing properly.
That’s kind of what happened with Facebook’s ethics audit.
The auditor just came in and flagged a whole bunch of issues and said, hey, you know those things? Just like last year’s audit, you still haven’t fixed them.
Katie Robbert 13:49
Yeah, you know, it’s interesting because you can’t necessarily predict where a piece of software is going to go or the impact that it’s going to have on the general public.
So in the Facebook instance, you know, obviously that was very poorly handled.
And once you get to grow that big, it should be part of the regular process, something that’s done on a more consistent basis.
It’s not a one and done.
Whereas, you know, with what we’re talking about, the stylistic writing algorithm, as they’re doing the requirements, the questions that should have been asked are: if we get a response back, do we know if it’s correct?
Okay, we don’t, because we’re not familiar with Anne Rice or Stephen King or whoever. Can we go and find even a college professor who teaches, you know, American horror writing, or a professor who teaches classic literature, those kinds of things, and just sort of say, hey, can we use your expertise to say, if we put this in, does the result look like it’s coming back correct? What nuances should we be aware of?
So there are ways to do it up front, and then consistently through the process as you’re iterating.
It’s not that you’re paying a consultant for 100% of their time, all the time.
But you build in those checkpoints and those milestones to say, hey, can you check this thing out? Make sure it looks like it’s still working correctly.
Christopher Penn 15:13
Or, in this particular application, to have that consultant say, here’s why I think your software is choosing the wrong things to measure.
So, you know, in stylometry, you tend to look at the most frequently used words and how often they appear. There are some words that don’t offer predictive value, like the, a, and and. But then there are some words that in regular natural language processing are filtered out; they’re called stop words: again, the, a, and and, but also things like pronouns, I, you, etc.
The use of those pronouns in stylometry is very important. They can’t be removed, because that tells you whether that author tends to write, you know, in third person or first person, how often, and things like that, and those are important clues.
So that external expert should be able to say, like, here’s where your software seems to be missing.
When you compare the results side by side, it looks like it doesn’t understand pronouns; you might want to go and fix that.
In this case, again, because of what they chose, they’re not even attempting to do that; they’re just trying to do, basically, blind classification.
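Editor’s note: the point about pronouns and stop words can be shown with a small Python sketch. Everything in it is an assumption made for illustration: the tiny `STOP_WORDS` set, the pronoun lists, the helper names, and the two invented sentences. It is not drawn from any real stop-word list or stylometry package.

```python
import re
from typing import Dict, List

# Tiny illustrative lists; real stop-word lists are much longer.
STOP_WORDS = {"the", "a", "and", "but", "i", "you", "my", "she", "her"}
FIRST_PERSON = {"i", "me", "my", "we", "our"}
THIRD_PERSON = {"he", "she", "him", "her", "his", "they", "them", "their"}


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stop_words(tokens: List[str]) -> List[str]:
    """Typical NLP preprocessing step: drop every token found in STOP_WORDS."""
    return [token for token in tokens if token not in STOP_WORDS]


def pronoun_rates(tokens: List[str]) -> Dict[str, float]:
    """Share of tokens that are first-person or third-person pronouns."""
    total = len(tokens)
    if total == 0:
        return {"first_person": 0.0, "third_person": 0.0}
    first = sum(1 for token in tokens if token in FIRST_PERSON)
    third = sum(1 for token in tokens if token in THIRD_PERSON)
    return {"first_person": first / total, "third_person": third / total}


# Invented sentences that differ only in narrative person.
first_person_text = "I walked home and I told my sister what I saw."
third_person_text = "She walked home and she told her sister what she saw."

print(pronoun_rates(tokenize(first_person_text)))
print(pronoun_rates(tokenize(third_person_text)))
print(remove_stop_words(tokenize(first_person_text)))
print(remove_stop_words(tokenize(third_person_text)))
```

With the pronouns kept, the two sentences have clearly different first-person and third-person rates. After stop-word removal in this toy setup, the two token lists are identical, so the narrative-person clue is gone.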
So to wrap up, when it comes to deploying artificial intelligence, or machine learning, or frankly any algorithm, you need three things, right? You need the technology, and you need really good processes that include requirements gathering and requirements verification.
And you need to have the right people with the right skills in order to do it well.
You can’t do just one of these things. As this example shows, they focused on the technology and excluded the people and the process.
And the end result was an application that doesn’t do what you would think it would do or what it should be doing.
If you’re working on a data science or AI project, or even just regular analytics projects, and you’ve got some questions about algorithms, stop by our free Slack group, Analytics for Marketers. Go to TrustInsights.ai slash analytics for marketers.
And you can jump in with over 1,200 other folks and have interesting conversations.
Just this morning we were talking about using AI to remove the lead singer from a song and give you just their vocals, or just the background.
If you want to make some karaoke mixes, there’s an AI to help you do that.
So that’s where you can find it: Analytics for Marketers.
And if you have comments about this episode, please go on over to TrustInsights.ai. Thanks for listening.
We’ll talk to you soon.
Want help solving your company’s data, analytics, and digital marketing problems?
Visit TrustInsights.ai today and let us know how we can help you.
Christopher S. Penn is cofounder and Chief Data Scientist at Trust Insights.