
Why is Elon Musk’s Grok chatbot so unfunny?


An investigation.



Do you trust this man to know a good joke?
Illustration by Laura Normand / The Verge

Fine. Let’s talk about xAI, which is getting funded to the tune of $1 billion or whatever.

xAI is, according to some commentators, Elon Musk’s bid to “save X,” the platform better known as Twitter. Musk may have spectacularly struck out with advertisers and failed to make up the shortfall with subscriptions, the thinking goes, but he can fundraise off the hype of a new AI product currently available only to a subset of blue checks. That product is Grok: a ChatGPT-style answer bot allegedly possessing a sense of humor. This raises several questions, particularly since AI chatbots remain a money pit with an unsure path to profit. But one sticks out to me: why is Grok so unfunny?  

xAI’s website makes it clear Grok is launching from a weird defensive crouch: “Grok is designed to answer questions with a bit of wit and has a rebellious streak, so please don’t use it if you hate humor!” Right off the bat: hall monitor behavior.

And normally, I don’t expect engineers to be funny on purpose. (Bless their hearts.) I look to them to be useful. The thing is, though, that Grok’s entire pitch is humor. Minus some chatter about how great (I guess?) it is that xAI can train on tweets, Musk’s promise is that Grok is cooler and more entertaining than several existing, more full-featured, and cheaper products. Okay, babe. Let’s see what Musk thinks is so hilarious.

I scrolled back through Musk’s Twitter feed to find Grok answers, either generated by him or that he retweeted from other accounts. I figured that Musk would highlight what he thought were particularly good answers as a way of promoting the service. After all, even before Musk owned Twitter, his feed was a tremendously important promotional tool for Tesla. What does that look like for Grok? 

These are some Cards-Against-Humanity-ass answers. No self-respecting joke requires a “just kidding,” unless the “just kidding” itself is about to get upended. Following up with a real recipe for cocaine, for instance, would actually be funny. It would also be the kind of dangerous thing you couldn’t get from the PC police at ChatGPT, Bard, or any other competitor. If you are going to go edgelord to teach the woke scolds a lesson, I expect you to commit to the fucking bit. 

Grok also has to balance humor with its ostensible pragmatic purpose: real-time answers. Like news comedians Jon Stewart and Trevor Noah, it’s supposed to give you the facts, but funny. Let’s see how it manages.

Whoopsie-doodle! The jury took four hours to convict, not eight. Eight isn’t enough of an exaggeration to actually be funny, so I think what we have here is a garden-variety AI hallucination.

It’s possible, although difficult, to be absolutely factually accurate while also being funny: Will Cuppy’s The Decline and Fall of Practically Everybody is probably the pinnacle of the genre. (Cuppy’s book, the result of 15 years of painstaking research, was unfinished when he died.) Here is an example: “Queen Elizabeth was the daughter of Henry VIII and Anne Boleyn. She resembled her father in some respects, although she beheaded no husbands. As she had no husbands, she was compelled to behead outsiders.”

Note the tone, which is friendly, a bit dry, and somewhat in contrast with the actual facts. It is closer, in fact, to ChatGPT than to Grok; understatement is funny, too.

As far as I can tell, Grok’s house style is the opposite. It’s hyperbolic and vulgar (although, granted, often after being asked to “be more vulgar”), relying on irreverence and shocking language to get a laugh.

This is a well-established genre of humor — Sarah Silverman’s act, for instance, revolves around the disconnect between her wide-eyed naif persona and the raunchy words coming out of her mouth. But the consistency of Grok’s attitude robs the AI of the ability to actually surprise you. The bot has no sense of how to shape and harness vulgarity; while I like working blue, I don’t think the use of profanity is the key to a joke unless, as in the case of George Carlin’s “Seven Dirty Words,” the joke is about profanity itself. And as with much AI text, if you think for just a second, the joke often comes apart.

I am not an orgy expert. But doesn’t every “horny bastard” in the house coming at you specifically sort of defeat the purpose? Like, isn’t that a gang bang? Unless I’ve misunderstood hedonism completely, an orgy becoming a “total clusterfuck” is a huge success. 

There are, I’m sure, several funny ways to answer this question, but one gets the same basic point across in far fewer words: “No, and fuck you for even asking.”

Actually, now that I think about it, though Grok is sometimes aggressive, I’ve never seen it turn that aggression toward the question-asker. Genuinely funny people are also lightly alarming because you can never tell when they are going to cut you to bits. Imagine trying to be friends with Nora Ephron or Ali Wong — wouldn’t you worry they might describe you behind your back? Or worse, in print? Or, worse still, in a movie? 

Meanwhile, Grok won’t even judge you for getting crabs:

One tool in the arsenal of a humor writer is pulling a changeup on the pace. For instance, here’s Hunter Thompson on Richard Nixon:

“If the right people had been in charge of Nixon’s funeral, his casket would have been launched into one of those open-sewage canals that empty into the ocean just south of Los Angeles. He was a swine of a man and a jabbering dupe of a president. Nixon was so crooked that he needed servants to help him screw his pants on every morning. Even his funeral was illegal.”

Three long-ish sentences followed by the punchline: “Even his funeral was illegal.” Grok doesn’t, and maybe can’t, do that. Nor does it seem to understand the much-vaunted rule of three.

The correct answer to the trolley problem is that whoever is posing the problem is an asshole. Feel free to update Grok accordingly.

As for the Business Insider answer, I can’t help but feel that it reads like a not-especially-inventive Mad Libs answer. So I turned it into one and sent it to two of my colleagues. Here’s what I got back:

  • Business Insider? That potato is the hairless mole baby of digital media and a magazine with a dash of chicken thrown in for good measure. It’s the kind of “news” source that makes you wonder if journalism is just a word they use to make themselves feel better about peddling verdant sweaters. They’re like a flock of puerile geese rummaging through the box of the internet, searching for any scrap of a story they can fight and fuck a clickbait headline on.
  • Business Insider? That needle is the lamb of a radio station and a YouTube tea channel, with a dash of lute thrown in for good measure. It’s the kind of “news” source that makes you wonder if journalism is just a word they use to make themselves feel better about peddling preposterous California. They’re like a clowder of mildewed sugar gliders rummaging through the casket of the internet, searching for any scrap of a story they can pry and assail a clickbait headline on.

“Verdant sweaters” is an accidental and yet vicious burn on the use of online shopping commissions as a revenue stream for publishers. I also particularly like “clowder of mildewed sugar gliders” — feels like a bardic insult — and “casket of the internet.” I’ll grant you the Mad Libs versions make less sense than the original, but the unexpected insults render them, in places, funnier.

The thing is, I do think it’s possible for AI to be funny. Take Janelle Shane’s AI Weirdness, for instance, where Shane and her audience revel in computer-generated absurdity. (For instance: a Thanksgiving dish generated by AI called “Punpkan Cockes Apple,” which could presumably be served as an accompaniment to “Mashed Turktees” and “Grasted Potinos.”)

AI failure is probably the native form of AI humor. And as any funny person knows, the key to humor is taking the thing you do inadvertently that gets a laugh and making that thing happen on purpose. Were I attempting to develop a funny AI, gibberish would be an important area of research. Which combinations of consonants are funniest? How close do you need to be to a real word to get a laugh? What combinations of words and images are the most absurd? Some of what makes the AI funny is how confidently it is absolutely wrong — so, how might I heighten the contrast between the AI’s persona and its actual answer?

I can’t rule out that Grok is funny and Musk is very bad at highlighting examples. (I haven’t gotten access myself; if someone wants to give me the opportunity, you know where to find me.) But absurdism certainly does not seem to be what Grok is up to — and perhaps it can’t be. Musk is committed to the notion that AI is going to be smarter than people. That belief rules out developing the humor of AI failure because the failures demonstrate the ways in which AI is not smarter than people.

Instead, Grok at times insists on imitating humans, particularly Musk-favorite Douglas Adams. 

Even human comedians are better served by doing something original than retreading The Hitchhiker’s Guide to the Galaxy. The real Adams is lurking in the background of this answer, making Grok look bad by comparison. That’s not just a problem for Grok. Take RayBot, an AI version of an advice column written by Achewood’s Chris Onstad. RayBot is often funny, but Onstad consistently outperforms his own AI when the two are asked the same questions. For any funny response you get from RayBot, you wonder what Onstad would actually say.

Grok’s other limitation seems to be Musk’s desire to create a “fuck you” to other, supposedly overcautious AI companies without actually committing to being alienating. The cocaine answer is funny, in that it’s exactly as limited as any other large language model. The trolley problem — about a racial slur — does not actually use the racial slur in question, as that’s simply a bridge too far. (Not that going all the way would be funny, either.) “Edgy,” pointlessly offensive humor can feel forced and try-hard, particularly if it’s the only mode the bot has — and even more particularly if you’re trying to actually use it like a foul-mouthed version of Google Search.

Still, I can’t say Grok isn’t funny. A man without a sense of humor raising $1 billion for a comic chatbot? Come on. That’s a pretty good joke.