LargeLanguageModels

programming

💼Looking for work💼 I'm currently open to new opportunities in hands-on architecture, management, and/or innovation work with LLMs. Please contact me on my LinkedIn if interested.

LLMs are powerful because they're different from humans. They're tireless, don't need to eat, don't need to sleep, are MUCH faster than humans in writing and comprehension -- but they can't count the number of R's in "strawberry."

People tend to hate AI because they see AI as a threat to their livelihood -- and to be fair, this is a real concern. But this problem says less about AI and more about OurEconomicSystem.

LLMs first encode the input text

A token is a coded representation of a sequence of characters -- often, these are partial or whole words, or even symbols.

LLMs take their input and convert it into tokens, and then further convert those tokens into high-dimensional vectors. The LLM then works on those vectors directly, never on the original text.

(Diagram: input text → tokens → high-dimensional vectors)
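Here's a toy sketch of that pipeline. It's not a real model -- the token IDs are made up and the embedding table is just random numbers (a real LLM learns its embedding table during training) -- but it shows the shape of the conversion: each token ID is looked up in a table and becomes a high-dimensional vector.

```python
# Toy illustration of text -> tokens -> vectors. The vocabulary size,
# embedding dimension, token IDs, and embedding table are all made up;
# a real LLM learns its embedding table during training.
import numpy as np

vocab_size, embed_dim = 50_000, 768            # realistic orders of magnitude
embedding_table = np.random.randn(vocab_size, embed_dim)

token_ids = [302, 1618, 9259]                  # hypothetical IDs from a tokenizer
vectors = embedding_table[token_ids]           # one 768-dimensional vector per token

print(vectors.shape)   # (3, 768) -- the model works on these, never the raw text
```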

In other words, the model forgets what the exact input characters were as soon as it has read them: it converts ("encodes") that input into tokens and then into these vectors.

This is exactly why LLMs don't know how many R's are in the word "strawberry", and why they make other similar character-level mistakes: the model never sees individual letters, only token IDs.
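You can check this yourself with OpenAI's tiktoken tokenizer (assuming it's installed; other tokenizers split text differently, but none of them hand the model individual letters):

```python
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")   # tokenizer used by GPT-4-era models

token_ids = enc.encode("strawberry")
print(token_ids)                             # a few integer IDs, not letters

# Show which characters each token ID stands for.
for tid in token_ids:
    print(tid, repr(enc.decode([tid])))
```

Whatever the exact split, the model only ever receives those integer IDs, so "how many R's?" is a question about information it was never given.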

LLMs represent their ideas in higher-dimensional space

An LLM produces its output tokens as a kind of guided random walk through n-dimensional space: at each step it samples the next token from a probability distribution conditioned on everything it has seen and generated so far.
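A toy sketch of that walk -- the "model" below is fake and just returns random probabilities, but the generation loop has the same shape as the real thing: get a distribution over the vocabulary, sample one token, append it, repeat.

```python
# Toy next-token generation loop. `fake_model` is a stand-in: a real LLM
# would compute the probabilities from the context it has seen so far.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "."]

def fake_model(context):
    """Return a probability distribution over the vocabulary."""
    logits = rng.normal(size=len(vocab))            # a real model computes these
    return np.exp(logits) / np.exp(logits).sum()    # softmax -> probabilities

context = ["the"]
for _ in range(5):
    probs = fake_model(context)
    next_token = rng.choice(vocab, p=probs)         # one step of the walk
    context.append(next_token)

print(" ".join(context))
```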

The n-dimensional space here is the space of all possible