Frontend Architecture for the Age of AI Codegen: Designing Code LLMs Get Right
AI codegen does not remove the need for architecture - it raises the price of bad architecture. A vague component API used to cost you a confused teammate. Now it costs you a confidently wrong LLM generating the mistake fifty times.

Article focus
Architecture > prompts
The real leverage in AI-assisted teams
Key takeaways
- AI codegen does not replace architecture - it amplifies whatever architecture you already have, good or bad.
- An LLM generates from the patterns it can see. Legible component APIs, shared types, and consistent conventions are now a performance feature, not just a style preference.
- The job shifts from "writing the code" to "designing the constraints" - types, contracts, tests, and lint rules that make wrong AI output fail loudly instead of shipping silently.
- Treat the codebase as a prompt. The clearer your boundaries, the higher the hit rate of every AI generation against them.
- Guardrails lower the failure rate; they do not zero it. The bugs that survive hide in the seams between layers - and sometimes the AI resisting your design is a signal that the design is stale.
Does AI codegen make frontend architecture less important?
The opposite. AI generates from the patterns it can see, so architecture is now a multiplier - good structure makes every generation more correct, and bad structure gets your mistakes replicated at machine speed.
The comforting story is that AI writes the code now, so designing boundaries and contracts matters less. After a year with assistants in the loop daily, I believe the opposite. An LLM predicts the most likely code given what it can see - your files, types, naming. When the architecture is clear, the likely completion is the correct one. When it is ambiguous, the model fills the gap with a plausible guess, and a plausible guess is exactly the kind of bug that passes review.
A vague API used to cost you one confused teammate. Now it costs you a confidently wrong assistant generating the same mistake across fifteen call sites before lunch. Bad architecture got more expensive; good architecture got more leveraged.
Legibility Simulator: same prompt, two repos
Toggle between an ambiguous and a legible codebase. The prompt never changes - only what the model reads. Watch the likely correctness move.
Same prompt: “Add a button that submits the form.”
// Three "button" patterns live in the repo.
// Loose types. No obvious winner.
<Card2 v="p" onTap={...} />
type BtnProps = { variant?: string; href?: string }
// Model picks one at random and invents a prop.
<Card2 v="primary" submit isLoading />
// ^ wrong casing ^ not a real prop
// Compiles? Sometimes. Correct? No.
Plausible, confidently wrong, ships through review.
Treat your codebase as the prompt
The single highest-leverage shift is to stop thinking of the prompt as the thing you type and start treating the entire codebase as the real prompt. The model reads your structure far more than it reads your instructions.
I learned this by accident. I spent a week writing a careful instruction file - conventions, do-nots, examples. Output barely improved. Then I deleted one folder of stale, half-migrated components nobody used, and quality jumped noticeably. The model had been copying the dead patterns because they were right there in the files it opened. My instructions were a footnote; the repo was the real prompt.
It is an observation, not a creed: the assistant matches surrounding code far more reliably than the request I type. So the highest-leverage move is to make the surrounding code unambiguous - one obvious place per concern, types that describe intent, and ruthless deletion of dead patterns. The same legibility that lets a new hire infer the rules lets the model generate against them.
- Conventions over configuration: if there is one obvious pattern, the model matches it. If there are three, it picks one at random.
- Co-location: keep the type, the component, and its test close. The model reads what it can reach.
- Naming as documentation: `PricingTierCard` constrains generation far better than `Card2`.
- Delete dead patterns: every stale example in the repo is a wrong answer the model might copy.
Drive the prompt yourself
Same request, three levels of context. See how much guessing you remove just by handing the model better structure.
Prompt: "Render the user's avatar."
Context the model can see
// (the model sees nothing else useful)
What it generates
<img src={user.profileImage} />
// ^ invented field. undefined at runtime.
The prompt never changes - only what the codebase hands the model. More context up front, less guessing downstream.
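As a rough sketch of what the richest level of context might look like, assume the repo keeps a shared `User` type next to a small helper. The names `User`, `avatarSrc`, and `DEFAULT_AVATAR` and the fallback path are illustrative assumptions, not taken from a real codebase; the field names mirror the `UserProfile` contract shown later in this article.

```typescript
// Hypothetical shared type: the one place the avatar field is named.
type User = {
  id: string;
  displayName: string;
  avatarUrl: string | null; // null means "no avatar uploaded"
};

// Hypothetical fallback asset path, used when the user has no avatar.
const DEFAULT_AVATAR = "/img/default-avatar.png";

// With `User` in view, the likely completion reads `avatarUrl`, and the
// `| null` in the type forces the missing-avatar case to be handled.
function avatarSrc(user: User): string {
  return user.avatarUrl ?? DEFAULT_AVATAR;
}

const withAvatar: User = {
  id: "u1",
  displayName: "Ada",
  avatarUrl: "https://example.com/ada.png",
};
const withoutAvatar: User = { id: "u2", displayName: "Lin", avatarUrl: null };

console.log(avatarSrc(withAvatar));
console.log(avatarSrc(withoutAvatar));

// `user.profileImage` would not compile here: the field is not on `User`.
```

The point is not the helper itself but that the type makes the correct field name the only one that compiles.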
I now design component APIs assuming the AI never reads the docs
Assume the model only sees TypeScript autocomplete, never your comments or your README. That one assumption makes you strict in exactly the places that matter: narrow unions, required pairings, no loose strings.
The bug that changed how I design props: I asked an assistant to add a loading state to a submit button. My Button had a loading prop; the model passed isLoading - a reasonable guess, a prop that did not exist. The type was permissive there, so nothing complained. It rendered, passed review (the diff looked human-written), and shipped. The spinner never showed, a user double-submitted a payment, and I heard about it from a support ticket two days later.
The model was not dumb. It guessed a plausible name and my API let it through. Three rules I now apply: force a discriminated union when a component has more than one mode, require props that only make sense together, and never accept a bare string where you meant four specific values. If the AI could guess a prop name and be wrong, the API is too loose.
Harden it yourself: 3 clicks to an AI-proof API
Start from the loose Button that shipped the payment bug. Click "Apply next fix" and watch each change shrink what the AI can get wrong.
Start: the API the AI misused
type ButtonProps = {
variant?: string; // "primary"? "Primary"? who knows
loading?: boolean;
onClick?: () => void;
href?: string; // button or link? both?
};
Still broken: This is what shipped the payment bug. isLoading, a misspelled variant, and button-or-link confusion are all typeable.
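One possible end state after applying the three rules, sketched without a UI framework so it runs on its own. The four `Variant` values, the `kind` discriminant, and the `renderButton` helper are assumptions made for illustration; the original article does not show the hardened type.

```typescript
// Rule 3: a closed set of values instead of a bare string (values are hypothetical).
type Variant = "primary" | "secondary" | "danger" | "ghost";

// Rule 1: a discriminated union, so "button or link?" is decided by `kind`.
// Rule 2: props that belong together are required together.
type ButtonProps =
  | { kind: "button"; variant: Variant; onClick: () => void; loading: boolean }
  | { kind: "link"; variant: Variant; href: string };

// Framework-free stand-in for a component: returns a plain description.
type Rendered = {
  tag: "button" | "a";
  className: string;
  disabled: boolean;
  href?: string;
};

function renderButton(props: ButtonProps): Rendered {
  const className = `btn btn-${props.variant}`;
  switch (props.kind) {
    case "button":
      // A loading button is disabled, which is what blocks a double submit.
      return { tag: "button", className, disabled: props.loading };
    case "link":
      return { tag: "a", className, disabled: false, href: props.href };
    default: {
      // Adding a new `kind` without handling it fails to compile here.
      const unreachable: never = props;
      return unreachable;
    }
  }
}

const submit = renderButton({
  kind: "button",
  variant: "primary",
  onClick: () => {},
  loading: true,
});
const pricingLink = renderButton({ kind: "link", variant: "ghost", href: "/pricing" });
console.log(submit, pricingLink);

// Each of these is rejected by the compiler instead of rendering silently:
// renderButton({ kind: "button", variant: "Primary", onClick: () => {}, loading: false });
// renderButton({ kind: "button", variant: "primary", onClick: () => {}, isLoading: true });
// renderButton({ kind: "link", variant: "primary", href: "/pricing", onClick: () => {} });
```

Making `loading` required is a deliberate trade: call sites get noisier, but a guessed `isLoading` can no longer stand in for it.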
Contracts are the seatbelt for AI-generated data flow
Shared schemas between frontend and backend turn the most dangerous class of AI mistake - silently mismatched data shapes - into a build-time failure the assistant cannot ship past.
AI is confidently fluent about responses it has never seen. Ask it to render a profile and it will reach for user.avatarUrl, user.profileImage, or user.photo depending on the phase of the moon. None throw at generation time - they throw in production, as a broken image and a silent log.
A shared contract closes the whole category. Define the shape once (Zod, tRPC, a generated client) and reference it from both sides; when the assistant invents a field, TypeScript rejects it before the PR opens. It is the highest-return guardrail you can add - the full version is in "API Contracts Between Frontend and Backend."
typescript
import { z } from 'zod';
// One source of truth, referenced by FE and BE.
export const UserProfile = z.object({
id: z.string().uuid(),
displayName: z.string(),
avatarUrl: z.string().url().nullable(),
});
export type UserProfile = z.infer<typeof UserProfile>;
// AI writes: user.profileImage
// TypeScript says: Property 'profileImage' does not exist.
// The hallucination dies at build time, not in prod.
Spot the hallucination
Interactive exercise: an AI-generated profile card in which one line is a confident fabrication.
Build a guardrail stack, not a review bottleneck
You cannot eyeball-review AI output at the speed it is produced. Replace human vigilance with automated guardrails - types, lint, contracts, tests - layered so each catches a different class of mistake.
"We will just review it carefully" does not survive contact with reality. The volume is too high and the output is too plausible - it looks right, which is exactly what makes manual review fail. Move correctness out of human attention and into the machine: a stack of nets, each tuned to a different failure mode, cheap, running in CI, never tired. Click through which net catches which mistake below.
Human review then focuses on the one thing machines cannot judge - is this the right thing to build at all. And the AI-era twist on testing: tests now double as the spec you hand the assistant. Write the test first, let the model generate until it goes green ("Testing Strategy for Frontend Architecture").
- TypeScript (strict): catches hallucinated fields, wrong prop shapes, missing cases.
- ESLint with boundary rules: catches architectural violations the model cannot see in a single file.
- Schema validation at the edges: catches data that lies about its shape at runtime.
- Integration tests as spec: catches behavior drift and doubles as the prompt for what to build.
- Human review, last: reserved for taste, product fit, and "should this exist" - not for catching typos.
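For the "schema validation at the edges" layer, here is a minimal sketch using a hand-rolled guard as a stand-in for a schema library such as Zod, so the example stays dependency-free. `parseUserProfile` and `ParseResult` are names assumed for illustration, and unlike the Zod schema shown earlier this version checks only primitive types, not UUID or URL formats.

```typescript
type UserProfile = { id: string; displayName: string; avatarUrl: string | null };

type ParseResult<T> = { ok: true; value: T } | { ok: false; errors: string[] };

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Runs where untyped data enters the app (for example, after a fetch).
function parseUserProfile(input: unknown): ParseResult<UserProfile> {
  if (!isRecord(input)) return { ok: false, errors: ["expected an object"] };

  const id = input["id"];
  const displayName = input["displayName"];
  const avatarUrl = input["avatarUrl"];

  if (
    typeof id === "string" &&
    typeof displayName === "string" &&
    (typeof avatarUrl === "string" || avatarUrl === null)
  ) {
    return { ok: true, value: { id, displayName, avatarUrl } };
  }

  // Report every mismatched field, not just the first one.
  const errors: string[] = [];
  if (typeof id !== "string") errors.push("id: expected string");
  if (typeof displayName !== "string") errors.push("displayName: expected string");
  if (typeof avatarUrl !== "string" && avatarUrl !== null) {
    errors.push("avatarUrl: expected string or null");
  }
  return { ok: false, errors };
}

// A response that uses a field name the contract never defined.
const fromApi: unknown = JSON.parse(
  '{"id":"u1","displayName":"Ada","profileImage":"ada.png"}'
);
const parsed = parseUserProfile(fromApi);
console.log(parsed);
```

The compiler covers code that reads the shape; a guard like this covers data that arrives claiming to have it.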
The guardrail stack - and how much is enough
Which net catches which mistake, and when it runs. Cheaper nets sit higher; human review is the last, most expensive layer, and stacking endless manual review past the automated nets just becomes a bottleneck.
- TypeScript (strict): compile time
- ESLint boundary rules: compile / CI
- Integration tests: CI
- Schema validation (edges): runtime
- Human review: PR
Automated nets give you most of the safety for almost none of the speed cost. This is the sweet spot.
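A boundary rule can be as small as one entry in an ESLint flat config. The sketch below uses the core `no-restricted-imports` rule with its `patterns` option; the `@/features` path alias, the `src/features` layout, and the `boundaryConfig` name are assumptions about a hypothetical project, and in a real setup this array would be what the ESLint config file exports.

```typescript
// Sketch of an ESLint flat-config entry for a hypothetical project layout.
const boundaryConfig = [
  {
    files: ["src/features/**/*.{ts,tsx}"],
    rules: {
      "no-restricted-imports": [
        "error",
        {
          patterns: [
            {
              // Blocks paths one level below a feature's public entry point,
              // e.g. "@/features/cart/internal-state".
              group: ["@/features/*/*"],
              message:
                "Import a feature through its public entry point, not its internals.",
            },
          ],
        },
      ],
    },
  },
];

console.log(JSON.stringify(boundaryConfig, null, 2));
```

A rule like this is path-based, so it says nothing about a dependency that reaches the same code through an allowed re-export; the barrel failure described later in this article is exactly that gap.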
Where does the frontend engineer actually add value now?
In the decisions AI cannot make: boundary design, data ownership, and what to build. The work moves up the stack from typing code to shaping the system the code lives in.
If the assistant writes the component, what is left? The judgment calls: where does this state live and who owns it, what is the contract between these services, is this a new component or a composition of existing ones. These - the questions in my "Frontend Architecture mental model" - depend on context the model does not have: the roadmap, the team, the tradeoffs nobody wrote down.
That is the more interesting half of the job. The tedious part - wiring the form, mapping the array, the ARIA attribute - is increasingly handled; what remains is design. So if you are choosing what to get good at: learn architecture. Prompting has a ceiling; architecture is what makes every prompt, and every engineer, more effective.
Should the AI write this?
Answer a couple of questions about the task and land on where the engineer actually belongs in the loop.
Is there a clear, typed contract or pattern for this already?
Where this breaks in real life
Guardrails reduce the failure rate; they do not zero it. The interesting failures are the ones that slip between layers, and the times the architecture itself was the thing that needed to change.
Guardrails are not a force field. Mine have failed, and the honest cases teach more than the theory - three that cost me something real:
The seam between two nets. An AI component read a config value that was always set in test fixtures but optional in production. Types were happy, tests were green, review waved it through - it only broke for users whose config omitted the field. The fix was not "more tests"; my fixtures were lying, so I now generate test data from the same schema that guards production.
The guardrail I trusted too much. I had an ESLint rule banning cross-boundary imports and leaned on it so hard I stopped reading import diffs. The model routed a forbidden dependency through the one path the rule missed - a re-export barrel. Zero violations reported. A quiet linter is not an intact architecture.
When the AI was right and I was wrong. I kept "correcting" an assistant that wanted to colocate data fetching in a feature - until I checked why it insisted, and found the codebase had already moved to server components, where that is correct. The model read the newer code more honestly than I read my own stale rule. I changed the architecture, not the output.
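The first failure above comes down to fixtures and the production guard drifting apart. A minimal sketch of one way to tie them together, assuming a hypothetical `AppConfig` with an optional `region` field: one field list drives both the runtime check and the generated test data. All names here are illustrative, not from the original codebase.

```typescript
// Hypothetical config shape: `region` is optional in production.
type AppConfig = { apiBase: string; region?: string };

type FieldSpec = { key: keyof AppConfig; required: boolean; sample: string };

// Single source of truth for both the guard and the fixtures.
const appConfigSchema: FieldSpec[] = [
  { key: "apiBase", required: true, sample: "https://api.example.com" },
  { key: "region", required: false, sample: "eu-west-1" },
];

// Production guard: required fields must be strings, optional ones may be absent.
function conformsToSchema(value: unknown): value is AppConfig {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return appConfigSchema.every((field) => {
    const fieldValue = record[field.key];
    if (fieldValue === undefined) return !field.required;
    return typeof fieldValue === "string";
  });
}

// Fixtures: the full object, plus one variant per optional field with that
// field left out, so tests cannot quietly assume it is always set.
function fixtureVariants(): AppConfig[] {
  const omitOptions: (keyof AppConfig | undefined)[] = [undefined];
  for (const field of appConfigSchema) {
    if (!field.required) omitOptions.push(field.key);
  }
  const variants: AppConfig[] = [];
  for (const omit of omitOptions) {
    const candidate: Record<string, string> = {};
    for (const field of appConfigSchema) {
      if (field.key !== omit) candidate[field.key] = field.sample;
    }
    // Every fixture must pass the same guard production data passes.
    if (!conformsToSchema(candidate)) throw new Error("fixture violates schema");
    variants.push(candidate);
  }
  return variants;
}

// Code under test: has to cope with a missing region.
function regionLabel(config: AppConfig): string {
  return (config.region ?? "default").toUpperCase();
}

const labels = fixtureVariants().map(regionLabel);
console.log(labels);
```

With fixtures generated this way, a component that assumes `region` is always present fails in CI rather than for the users whose config omits it.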
Tangled vs layered: what the AI sees
Toggle the two architectures and tap any node to see what the model generates when it lands there.
What the AI does here
Wires product components, passes data. The model has one clear job here.
Tap any node. In the tangled map every file is a different job; in the layered map each node gives the model exactly one.
A checklist for making your frontend AI-ready
Before your next AI-assisted feature, run a quick scan: is there one obvious pattern, a typed contract, a tight component API, and an automated net for each failure mode?
You do not need a migration project to get most of this value. Pick the friction you feel most and tighten it. Each item below makes both your AI assistant and your human team measurably more reliable.
- Patterns: is there exactly one obvious way to do this, visible from the files an assistant will open?
- Contracts: is the data shape defined once and shared, so hallucinated fields fail to compile?
- Component APIs: are illegal prop combinations un-typeable, not just discouraged?
- Guardrails: does a machine - not a human - catch each class of mistake before merge?
- Tests as spec: can you hand the model a failing test and let it generate to green?
- Hygiene: have you deleted the dead patterns the model might otherwise copy?
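The "tests as spec" item can be sketched with a small pure function. Here the spec is a table of cases for a hypothetical submit-state transition (the kind of logic that would have prevented the double submit described earlier), and the implementation is whatever makes the table pass. `nextStep`, `runSpec`, and the state names are assumptions for illustration; in a real project the cases would live in your test runner rather than a hand-written loop.

```typescript
type SubmitStatus = "idle" | "submitting" | "done";
type SubmitEvent = "click" | "resolved";
type Step = { status: SubmitStatus; startRequest: boolean };

type SpecCase = { name: string; from: SubmitStatus; event: SubmitEvent; expect: Step };

// The spec comes first: each case is an input and the expected result.
const spec: SpecCase[] = [
  {
    name: "first click starts a request",
    from: "idle",
    event: "click",
    expect: { status: "submitting", startRequest: true },
  },
  {
    name: "second click while submitting is ignored",
    from: "submitting",
    event: "click",
    expect: { status: "submitting", startRequest: false },
  },
  {
    name: "response finishes the submit",
    from: "submitting",
    event: "resolved",
    expect: { status: "done", startRequest: false },
  },
];

// The implementation is what gets generated until the spec goes green.
function nextStep(status: SubmitStatus, event: SubmitEvent): Step {
  if (event === "click" && status === "idle") {
    return { status: "submitting", startRequest: true };
  }
  if (event === "resolved" && status === "submitting") {
    return { status: "done", startRequest: false };
  }
  return { status, startRequest: false }; // every other combination is a no-op
}

// Returns the names of the failing cases; an empty list means green.
function runSpec(): string[] {
  return spec
    .filter((c) => {
      const actual = nextStep(c.from, c.event);
      return (
        actual.status !== c.expect.status ||
        actual.startRequest !== c.expect.startRequest
      );
    })
    .map((c) => c.name);
}

const failing = runSpec();
console.log(failing.length === 0 ? "spec is green" : `failing: ${failing.join(", ")}`);
```

Handing the model the `spec` table and an empty `nextStep` gives it a target that is checked by a machine rather than by a reviewer's attention.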
Lint your own props
Paste a real prop type from your codebase. A rough client-side heuristic, not a real type checker, flags the loopholes an AI is most likely to misuse.
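A heuristic of this kind might look like the sketch below. It is a regex pass over the source text, not the tool from the original page: `lintPropType`, the `Finding` shape, and the two rules it applies are assumptions for illustration, and it only understands flat `name?: type;` members.

```typescript
type Finding = { prop: string; issue: string };

function lintPropType(source: string): Finding[] {
  // Drop line comments so their text is not mistaken for props.
  const withoutComments = source.replace(/\/\/.*$/gm, "");
  // Deliberately naive: matches `name: type;` or `name?: type;`, one level deep.
  const propPattern = /(\w+)(\?)?\s*:\s*([^;]+);/g;

  const findings: Finding[] = [];
  const optionalProps: string[] = [];

  let match: RegExpExecArray | null;
  while ((match = propPattern.exec(withoutComments)) !== null) {
    const name = match[1] ?? "";
    const isOptional = match[2] === "?";
    const type = (match[3] ?? "").trim();

    if (isOptional) optionalProps.push(name);
    if (type === "string") {
      findings.push({
        prop: name,
        issue: "bare string; consider a union of literal values",
      });
    }
  }

  // Two optional "mode" props usually mean the component has hidden modes.
  if (optionalProps.indexOf("href") !== -1 && optionalProps.indexOf("onClick") !== -1) {
    findings.push({
      prop: "href/onClick",
      issue: "both optional; consider a discriminated union for button vs link",
    });
  }
  return findings;
}

// The loose Button type from earlier in the article.
const looseButton = `
type ButtonProps = {
  variant?: string; // "primary"? "Primary"? who knows
  loading?: boolean;
  onClick?: () => void;
  href?: string; // button or link? both?
};`;

const findings = lintPropType(looseButton);
for (const finding of findings) {
  console.log(`${finding.prop}: ${finding.issue}`);
}
```

Note that it also flags `href`, where a plain string is legitimate. That false positive is the cost of a text-level heuristic, and a reason to treat the output as prompts for a closer look rather than as verdicts.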