We're using AI chatbots incorrectly

Large language models (LLMs) by themselves aren't truth machines. When an AI chatbot responds, it doesn't know anything. We shouldn't pretend it does.


When we use a machine or device we expect it to perform in a predictable way. Pressing volume up on a TV remote will always make the volume louder. Punching 3+5 on a calculator will always equal 8.

When we want information about a topic, we have typically been able to turn to major, trusted publications, whether for what happened in the news yesterday or the best cheesecake recipe.

This all starts to break down when it comes to internet algorithms. If you search for "the best cheesecake recipe" you get millions of results that all look alike. Everybody has their own 'best cheesecake'. The expectation that when you search for something you'll get a reliable 'answer' comes into conflict with the reality of the internet: there's no single answer. There are millions of answers.

The problem gets inverted, and is arguably worse, with AI chatbots. Instead of seeing that there are in fact millions of cheesecake recipes, you get a single, assertive answer. Only this time it's generated from all of those millions of different cheesecake recipes, regardless of whether they're good or bad.

The stakes are low for a cheesecake recipe. It doesn't really matter if there is more than one 'best' cheesecake, or even some bad ones (although that is sad). But the stakes become high when you start to ask things like what happened in the news yesterday, commonly accepted facts (e.g. who has walked on the moon), political issues, and important health information (e.g. do vaccines cause autism?).

In Stratechery this week, Ben Thompson puts it well:

Suddenly, there isn’t an abundance of supply, at least from the perspective of the end users; there is simply one answer. To put it another way, AI is the anti-printing press: it collapses all published knowledge to that single answer, and it is impossible for that single answer to make everyone happy.

With the internet came an abundance of other opinions, which has eroded our concept of a shared reality. With generative AI we've taken that abundance of other opinions and bundled it back up into a single answer. Only this time it doesn't reliably come from a trusted authority, it may or may not be wrong, and it may be different for everyone.

The problem is that we apply mental models of older technology to newer ones. Using Gemini or ChatGPT may feel like a better way to search the internet, and in a lot of ways it is, but LLMs aren't truth machines. When Gemini generated 'racially diverse Nazis', that became really clear. Looking for a picture of the founding fathers using a tool which generates a facsimile is like using an Etch A Sketch to do basic math.

The larger problem is that it's really hard for users to understand when these tools are generating something and when they're transforming existing knowledge from one format to another. There are solutions to make AI chats more accurate. A common example is RAG (retrieval augmented generation), which has the LLM answer with reference to pre-selected sources of knowledge. It's not always a perfect solution, but it is a positive step. Still, the issue remains when AI chatbots blur the boundaries between transforming and generating.
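To make the idea concrete, here is a minimal sketch of the RAG pattern. It uses a toy keyword-overlap retriever in place of a real vector database, and it stops at assembling the grounded prompt rather than calling an actual LLM; the function names and sample sources are illustrative assumptions, not any particular product's API.

```python
def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Toy retriever: rank documents by shared words with the query, keep top_k."""
    query_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda doc: len(query_words & set(doc.lower().split())),
        reverse=True,
    )
    return scored[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Assemble a prompt that grounds the model's answer in retrieved sources."""
    context = "\n".join(retrieve(query, documents))
    return (
        "Answer using ONLY the sources below.\n"
        f"Sources:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical pre-selected knowledge, standing in for a curated corpus.
sources = [
    "Twelve astronauts have walked on the moon, all on Apollo missions.",
    "A classic cheesecake uses cream cheese, eggs, sugar and a biscuit base.",
]

prompt = build_prompt("who has walked on the moon", sources)
```

The point of the sketch is the shape, not the scoring: instead of letting the model free-generate from everything it absorbed in training, the system first narrows the world down to trusted sources and asks the model to transform those, which is exactly the transforming-versus-generating boundary discussed above.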

In a totally different context, this HBR article uses the example of workplace productivity:

The problem companies are facing, however, is that traditional methods of process redesign may not be entirely up to the task because GenAI doesn’t function like a traditional technology. Users “talk” to it, much as they would to a human colleague, and it works with the user in an iterative fashion. It also can continuously improve as it learns user needs and behaviors (and vice versa).

It points to treating generative AI tools differently from the tools we've used before. It may seem small, but it's an important step to make sure we're not mixing generating and transforming. AI chatbots aren't the same as search engines or even traditional major publications. We shouldn't misinterpret them as a replacement for either of those industries.


💩 Cool shit

Globe Explorer – This is really neat. Enter a topic and get a customized hierarchy of sub-topics to help you start learning.

Airfoil – I've shared this blog before because it always delivers. An incredibly long, incredibly detailed and wonderfully interactive post about how planes fly.

Weird Fucking Games – I mean, the title is self explanatory. The absurdity and creativeness makes this fun to explore.

Tribute to Steamboat Willie – Steamboat Willie is in the public domain. So we get things like this fun musically-inspired game.

Browser Games – More games. All of these you can play in your browser.

How do we compare – Interactive data viz helping you compare income across different regions.

Dead Simple Sites – Websites these days can be pretty heavy. That's why I like this list of minimal websites.


Share this with a friend.