
Five questions you can use to cut through AI hype

Here’s a checklist for assessing the quality and validity of a company’s machine-learning product.
May 15, 2019

Two weeks ago, the United Arab Emirates hosted Ai Everything, its first major AI conference and one of the largest AI applications conferences in the world. The event was an impressive testament to the breadth of industries in which companies are now using machine learning. It also served as an important reminder of how the business world can obfuscate and oversell the technology’s abilities.

In response, I’d like to briefly outline the five questions I typically use to assess the quality and validity of a company’s technology:

1. What is the problem it’s trying to solve?

I always start with the problem statement. What does the company say it’s trying to do, and is it a good fit for machine learning? Perhaps we’re talking to Affectiva, which is building emotion recognition technology to accurately track and analyze people’s moods. Conceptually, this is a pattern recognition problem, and thus one that machine learning could tackle (see: What is machine learning?). It would also be very hard to solve any other way, because the task is too complex to program into a set of hand-written rules.

2. How is the company approaching that problem with machine learning?

Now that we have a conceptual understanding of the problem, we want to know how the company is going to tackle it. An emotion recognition company could take many approaches to building its product. It could train a computer vision system to pattern-match on people’s facial expressions or train an audio system to pattern-match on people’s tone of voice. Here, we want to figure out how the company has reframed its problem statement into a machine-learning problem, and determine what data it would need to input into its algorithms.
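To make that reframing concrete, here’s a toy sketch of how emotion recognition might be cast as a supervised image classification problem. This is purely illustrative, not how Affectiva or anyone else actually builds such a system: the label set, image size, and random stand-in data are all assumptions, and a real product would use deep vision models rather than a linear classifier on raw pixels.

```python
# Illustrative only: emotion recognition reframed as supervised classification.
# The emotion labels, 48x48 face crops, and random data are all hypothetical.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

EMOTIONS = ["happy", "sad", "angry", "neutral"]  # assumed label set

# Stand-in for preprocessed face crops: 500 grayscale images of 48x48 pixels,
# each paired with a human-assigned emotion label.
rng = np.random.default_rng(0)
images = rng.random((500, 48 * 48))               # flattened pixel features
labels = rng.integers(0, len(EMOTIONS), 500)      # integer-coded emotion labels

X_train, X_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, random_state=0
)

# The "pattern matching" step: fit a model that maps pixels to emotion labels.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))
```

The point of the exercise is that once the problem is framed this way, the questions that follow, about data, labels, and evaluation, fall out naturally.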

3. How does the company source its training data?

Once we know the kind of data the company needs, we want to know how it goes about acquiring that data. Most AI applications use supervised machine learning, which requires clean, high-quality labeled data. Who is labeling the data? And if the labels capture something subjective like emotion, do they follow a scientific standard? In Affectiva’s case, you would learn that the company collects audio and video data voluntarily from users, and employs trained specialists to label the data in a rigorously consistent way. Knowing the details of this part of the pipeline also helps you identify potential sources of data collection or labeling bias (see: This is how AI bias really happens).
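One hedged way to picture the labeling concern: the sketch below (illustrative only, not Affectiva’s process, with made-up annotators and clips) checks how consistently two hypothetical annotators apply the same emotion labels, using Cohen’s kappa as an agreement measure. Low agreement is a sign that the labels are subjective or noisy.

```python
# Illustrative only: measuring whether two annotators label the same clips
# consistently. The annotators, clips, and labels below are all hypothetical.
from sklearn.metrics import cohen_kappa_score

annotator_a = ["happy", "happy", "sad", "angry", "neutral",
               "sad", "happy", "neutral", "angry", "sad"]
annotator_b = ["happy", "neutral", "sad", "angry", "neutral",
               "sad", "happy", "happy", "angry", "sad"]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(f"inter-annotator agreement (Cohen's kappa): {kappa:.2f}")
```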

4. Does the company have processes for auditing its products?

Now we should examine whether the company tests its products. How accurate are its algorithms? Are they audited for bias? How often does it reevaluate its algorithms to make sure they’re still performing up to par? If the company doesn’t yet have algorithms that reach its desired accuracy or fairness, what plans does it have to make sure they will before deployment?
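As a rough illustration of what part of such an audit might involve (an assumed example, not any particular vendor’s actual process), the sketch below breaks a model’s accuracy out by demographic group to surface disparities. All of the data is made up; a real audit would use a held-out, representative evaluation set and a broader set of fairness metrics.

```python
# Illustrative bias audit: compare accuracy across demographic subgroups.
# The predictions, ground truth, and group tags here are entirely made up.
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 1])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1])
group  = np.array(["A", "A", "A", "A", "A", "A",
                   "B", "B", "B", "B", "B", "B"])  # hypothetical subgroup tags

for g in np.unique(group):
    mask = group == g
    acc = (y_true[mask] == y_pred[mask]).mean()
    print(f"group {g}: accuracy = {acc:.2f}")

# A large gap between groups is a signal to re-examine the training data,
# the labels, and the model before deployment.
```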

5. Should the company be using machine learning to solve this problem?

This is more of a judgment call. Even if a problem can be solved with machine learning, it’s important to question whether it should be. Just because you can create an emotion recognition platform that reaches at least 80% accuracy across different races and genders doesn’t mean it won’t be abused. Do the benefits of having this technology available outweigh the potential human rights violations of emotional surveillance? And does the company have mechanisms in place to mitigate any possible negative impacts?

In my opinion, a company with a quality machine-learning product should check off all the boxes: it should be tackling a problem fit for machine learning, have robust data acquisition and auditing processes, have highly accurate algorithms or a plan to improve them, and be grappling head-on with ethical questions. Oftentimes, companies pass the first four tests but not the last. For me, that is a major red flag: it shows the company isn’t thinking holistically about how its technology can affect people’s lives, and it has a high chance of pulling a Facebook later down the line. If you’re an executive looking for machine-learning solutions for your firm, that should be a warning against partnering with that vendor.

This story originally appeared in our Webby-nominated AI newsletter The Algorithm. To have more stories like this delivered directly to your inbox, sign up here. It's free.
