Case Study: Moving LLM Logic to the Frontend—My Experience

programming, case-studies, ai

A while back, I was building a RAG chatbot on an innovation team. The chatbot used an LlmInBackend architecture: every prompt tweak required a backend deploy, and backend deploys were a lot harder for us than frontend deploys. Experimenting became painful, especially when frontend UX changes demanded backend changes to prompts.

But I realized there was a better way. As the technical lead, I made the architectural decision to move AI logic, prompts, and orchestration to the frontend (while keeping secrets and business APIs in the backend). This enabled much faster experimentation and feature development. The backend’s main job became securely proxying OpenAI requests and exposing business data, not handling prompt tweaks or AI flows.

Before: LlmInBackend Architecture (Simplified)

When I started, the UI was a thin client that mainly sent requests to the backend. (There was some interesting stuff with localStorage-based conversation history, but that's not relevant here.)

Meanwhile, the backend handled all of the AI logic.

Mermaid diagram: UI as a thin client; the backend owns prompts, AI logic, and the OpenAI call.

Advantages

The advantage of LlmInBackend is centralization: keeping all AI logic, prompts, and sensitive operations in the backend gives maximum control over security, access, and business logic. It's easier to protect API keys, enforce rate limits, and manage data privacy, since nothing sensitive ever touches the client.

Disadvantages

Unfortunately, for our use case, LlmInBackend made experimenting and iterating much slower. Every change to prompts or AI logic required a backend deploy, which slowed down product development and kept my frontend team from quickly improving the user experience.

After: LlmInFrontend Architecture (Simplified)

After making the switch, the frontend took on much more responsibility, and we moved to an LlmInFrontend architecture. At the time, it felt like a concept I had invented; I'm sure other people were already doing this, but for me it was totally new.

I led the change so that the UI not only handled user interaction, but also managed prompts and orchestrated the AI logic directly in the browser. The backend's job was reduced to securely proxying OpenAI requests and exposing business APIs—no more prompt tweaks or AI flows in the backend.

Mermaid diagram: the UI owns prompts and AI orchestration; the backend proxies OpenAI and exposes business APIs.

Advantages

The biggest advantage was speed: our frontend team could experiment and ship AI features fast, without waiting for backend deploys. We could try new ideas, tune prompts, and improve user experience in real time. The backend still kept secrets and business logic safe, but no longer blocked rapid iteration.

Disadvantages

The main risk was security. I had to be careful about what stayed server-side. If too much moved to the frontend, I risked exposing sensitive operations. Rate limiting, access control, and security all had to be handled robustly in the backend proxy, or we would open ourselves up to abuse.
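
As an illustration of the kind of guardrail I mean, here is a minimal sketch using Express and the express-rate-limit package; the route name and limits are hypothetical, not our real configuration:

```typescript
// Hypothetical guardrail: throttle how often any one client can hit the
// OpenAI proxy route, so a compromised or overeager frontend can't run up the bill.
import express from "express";
import rateLimit from "express-rate-limit";

const app = express();

// At most 30 LLM calls per minute per IP; tune to your cost budget.
app.use("/api/openai", rateLimit({ windowMs: 60_000, max: 30 }));

app.listen(3000);
```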

Before: LLM in Backend (Expanded)

Here's what our traditional backend-heavy LLM app actually looked like. The frontend just rendered UI and collected input. All the important stuff—prompts, AI orchestration, OpenAI calls—lived in the backend.

Mermaid diagram: expanded LlmInBackend flow; prompts, orchestration, and OpenAI calls all sit behind the backend.
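
A minimal sketch of that shape, assuming Express and the OpenAI chat completions endpoint; the route, helper, and model name are hypothetical stand-ins for our real code:

```typescript
// server.ts (hypothetical sketch of the LlmInBackend shape).
// The prompt lives here, so every prompt tweak means a backend deploy.
import express from "express";

const app = express();
app.use(express.json());

const SYSTEM_PROMPT =
  "You are a helpful assistant for our product. Answer from the provided context.";

// Stand-in for our real retrieval over business data.
async function fetchBusinessContext(question: string): Promise<string> {
  return `records matching "${question}"`;
}

app.post("/api/chat", async (req, res) => {
  const { question } = req.body;

  // Retrieval, prompt assembly, and the OpenAI call all happen server-side.
  const context = await fetchBusinessContext(question);
  const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini",
      messages: [
        { role: "system", content: SYSTEM_PROMPT },
        { role: "user", content: `Context:\n${context}\n\nQuestion: ${question}` },
      ],
    }),
  });

  res.json(await upstream.json());
});

app.listen(3000);
```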

After: LLM in Frontend (Expanded)

Now watch what happened when I moved AI logic and prompts to the frontend. The frontend orchestrates everything AI-related, while the backend focuses on secure API proxying and business data.

Mermaid diagram: expanded LlmInFrontend flow; the browser orchestrates AI and calls the backend only for proxying and business data.
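
Here is the same flow sketched from the browser's side, again with hypothetical route names, model, and helpers rather than our actual code:

```typescript
// chat.ts (hypothetical sketch of the LlmInFrontend shape, running in the browser).
// The prompt template and orchestration live here; a prompt tweak is now a
// frontend deploy. The backend only proxies OpenAI and serves business data.
const SYSTEM_PROMPT =
  "You are a helpful assistant for our product. Answer from the provided context.";

export async function askChatbot(question: string): Promise<string> {
  // Step 1: pull business data from a plain backend API.
  const records = await fetch(
    `/api/records?query=${encodeURIComponent(question)}`
  ).then((r) => r.json());

  // Step 2: assemble the prompt client-side.
  const messages = [
    { role: "system", content: SYSTEM_PROMPT },
    {
      role: "user",
      content: `Context:\n${JSON.stringify(records)}\n\nQuestion: ${question}`,
    },
  ];

  // Step 3: route the completion through the backend proxy, which holds the key.
  const completion = await fetch("/api/openai/chat", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "gpt-4o-mini", messages }),
  }).then((r) => r.json());

  return completion.choices[0].message.content;
}
```

The point is that all three steps are plain frontend code: tuning the prompt or reordering the flow ships with the next frontend deploy.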

What Actually Changed

Moved to frontend:

  - Prompt templates and prompt tuning
  - AI orchestration and conversation flow
  - Decisions about which business data to pull into the context

Stayed in backend:

  - OpenAI API keys and the secure request proxy
  - Business data APIs
  - Authentication, rate limiting, logging, and cost control

Why This Works

This shift allowed our frontend team to rapidly iterate on AI features without backend deploys, leading to:

  1. Faster iteration cycles
  2. More flexible and sophisticated AI features
  3. Cleaner separation of concerns between frontend (UX, AI orchestration) and backend (security, data)

Security is still critical: a robust backend proxy must handle authentication, logging, cost control, and rate limiting to avoid exposing sensitive operations to the client.
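
To make that concrete, here is a hedged sketch of such a proxy in Express; the auth check, model allow-list, and routes are illustrative assumptions, not our production code:

```typescript
// proxy.ts (hypothetical): the backend reduced to a guarded OpenAI proxy.
import express from "express";

const app = express();
app.use(express.json());

const ALLOWED_MODELS = new Set(["gpt-4o-mini"]); // crude cost control

app.post("/api/openai/chat", async (req, res) => {
  // Authentication: reuse whatever session/auth scheme the app already has.
  if (!req.headers.authorization) {
    return res.status(401).json({ error: "unauthenticated" });
  }

  // Cost control: the client chooses prompts, not models or billing.
  if (!ALLOWED_MODELS.has(req.body.model)) {
    return res.status(400).json({ error: "model not allowed" });
  }

  // Logging: keep an audit trail of usage.
  console.log(new Date().toISOString(), "openai call", req.body.model);

  // The API key never leaves the server.
  const upstream = await fetch("https://api.openai.com/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(req.body),
  });

  res.status(upstream.status).json(await upstream.json());
});

app.listen(3000);
```

The client can shape prompts freely, but models, keys, and spend stay server-side.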

The Result

We shipped better AI features, faster, with a leaner backend and a more empowered frontend.