WhyLlmsCantCountLetters
aiLargeLanguageModels often get mocked for failing at tasks like counting how many R's are in the word “Strawberry.” Why does this happen?
Large Language Models first break input text into smaller pieces called "tokens" -- and tokens are usually whole subwords, not single letters. Each token is then mapped to an array of numbers called a "vector," and the rest of the model's layers operate only on those vectors.
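Here is a minimal sketch of that first step, assuming OpenAI's tiktoken library and its cl100k_base encoding (the exact split into pieces depends on the tokenizer):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # a GPT-4-era tokenizer
token_ids = enc.encode("Strawberry")

print(token_ids)                             # a short list of integer IDs
print([enc.decode([t]) for t in token_ids])  # the text piece behind each ID
```

Notice that the model receives a handful of chunks, not ten individual letters.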
Because LLMs are not trained to count letters, the vector representation retains no precise character-level memory of the original text. That is why they can't reliably say how many R's are in "Strawberry," and why they make other similar errors.
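To make the vector step concrete, here is a toy embedding lookup; the table sizes and token IDs below are made up for illustration. Each token ID indexes one row of a table, and that row is all the later layers ever see:

```python
import numpy as np

vocab_size, dim = 50_000, 8            # toy sizes, not a real model's
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, dim))

token_ids = [302, 1953, 19772]         # made-up IDs standing in for three tokens
vectors = embedding_table[token_ids]   # one dense vector per token, shape (3, 8)
print(vectors.shape)
```

Nothing in a token's row of numbers explicitly records which letters that token contained.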
I've realized that LLMs can -really- help with sharpening one's ideas, even when they are just providing feedback on one's writing. I've shared the most interesting findings on WritingWithLlmsIsLearning.
2025-07-07: Based on more recent testing, newer LLM versions are a lot better at these tasks; GPT-4o, for example, appears to have been trained to count those letters. But this is still a good explanation of how LLMs work, and it probably still applies to older and smaller models, so I'm keeping it here.
2025-07-08: It turns out that LLMs are now accurate on dictionary words, but they still fail on random sequences of letters: asking "how many r's in rrrorrrorrrorrro" returned 11 instead of 12 -- even after counting.
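For comparison, plain character counting settles the ground truth instantly, and (again assuming the cl100k_base encoding as a stand-in for the model's actual tokenizer) shows what the model is working with:

```python
import tiktoken

s = "rrrorrrorrrorrro"
print(s.count("r"))  # ground truth: 12

enc = tiktoken.get_encoding("cl100k_base")
print([enc.decode([t]) for t in enc.encode(s)])  # a few opaque chunks, not 16 characters
```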
So, they aren't really counting letters at all. Even now.