WhyLlmsCantCountLetters
aiLargeLanguageModels often get mocked for failing at tasks like counting how many R's are in the word “Strawberry.” Why does this happen?
Large Language Models first break input text into smaller pieces called "tokens" -- and tokens are usually whole subwords, not single letters. Each token is then mapped to an array of numbers called a "vector," and the rest of the model's layers operate only on those vectors.
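Here is a minimal sketch of that first step, assuming OpenAI's tiktoken library and its cl100k_base encoding (the exact split into pieces depends on the tokenizer):

```python
import tiktoken  # pip install tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # a GPT-4-era tokenizer
token_ids = enc.encode("Strawberry")

print(token_ids)                             # a short list of integer IDs
print([enc.decode([t]) for t in token_ids])  # the text piece behind each ID
```

Notice that the model receives a handful of chunks, not ten individual letters.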
Because LLMs are not trained to count letters, the vector representation retains no precise character-level memory of the original text. That is why they can't reliably say how many R's are in "Strawberry," and why they make other similar errors.
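To make the vector step concrete, here is a toy embedding lookup; the table sizes and token IDs below are made up for illustration. Each token ID indexes one row of a table, and that row is all the later layers ever see:

```python
import numpy as np

vocab_size, dim = 50_000, 8            # toy sizes, not a real model's
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(vocab_size, dim))

token_ids = [302, 1953, 19772]         # made-up IDs standing in for three tokens
vectors = embedding_table[token_ids]   # one dense vector per token, shape (3, 8)
print(vectors.shape)
```

Nothing in a token's row of numbers explicitly records which letters that token contained.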
I've realized that LLMs can -really- help with sharpening one's ideas, even when they are just providing feedback on one's writing. I've shared the most interesting findings on WritingWithLlmsIsLearning.
2025-07-07: Based on more recent testing, newer LLM versions are a lot better at these tasks; GPT-4o, for example, appears to have been trained to count those letters. But this is still a good explanation of how LLMs work, and it probably still applies to older and smaller models, so I'm keeping it here.
2025-07-08: It turns out that LLMs are now accurate on dictionary words, but they still fail on random sequences of letters: asking "how many r's in rrrorrrorrrorrro" returned 11 instead of 12 -- even after counting.
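For comparison, plain character counting settles the ground truth instantly, and (again assuming the cl100k_base encoding as a stand-in for the model's actual tokenizer) shows what the model is working with:

```python
import tiktoken

s = "rrrorrrorrrorrro"
print(s.count("r"))  # ground truth: 12

enc = tiktoken.get_encoding("cl100k_base")
print([enc.decode([t]) for t in enc.encode(s)])  # a few opaque chunks, not 16 characters
```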
So, they aren't really counting letters at all. Even now.