
When “Just ship it” turns into lasagna code
We start with a familiar scene: a small experiment that grows faster than its architecture can handle. This is exactly the kind of situation where clean architecture becomes essential. Like layering pasta without checking the recipe, we kept adding features until the structure began to wobble.
At first everything seems perfectly reasonable. A quick integration with an API, a prompt that produces surprisingly good results. Suddenly the prototype starts solving real problems. The codebase grows organically: a helper class here, a service there. Maybe a controller that does just a little bit more than it probably should.
Table of Contents
- When “Just ship it” turns into lasagna code
- Meet baby Leo: A tiny chatbot with big ambitions
- Step 1: A static recipe website
- Step 2: Adding a Quarkus backend
- Step 3: Webapp with database
- Step 4: Extracting text from images
- Step 5: Extracting text from images (bis)
- The first signs of architectural decay
- Why AI projects rot faster than traditional systems
- Clean architecture – quick recap
- Clean architecture in an AI context
- Ports and adapters: The moment the structure changes
- Local LLMs vs cloud models
- Testing without talking to the model
- Guardrails and validation in the domain layer
- Git branches as storytelling devices
- Lessons learned from real projects
- Why this matters for Java developers
- The final plate: Fast innovation without regret
Then another feature is requested, and another, and another.
Soon the architecture begins to resemble a large Italian family dinner where every dish ends up on the same plate. A little business logic mixed with infrastructure code, model calls hidden inside services and prompts scattered across the project like grated parmesan. Individually these decisions make sense. Together they create something that works. Until the moment you try to change it.
AI projects make this even worse. Models evolve quickly, prompts are constantly adjusted, and new capabilities appear almost weekly. What was a simple experiment yesterday becomes a moving target today. Without clear boundaries, each new change adds another layer to the lasagna.
And just like reheated pasta, every modification becomes a little more fragile than the last.
Meet baby Leo: A tiny chatbot with big ambitions
Our story begins with a small project called (baby) Leo. A simple assistant designed to help digitize and translate family recipes. The goal was modest: take handwritten recipes, extract the text, and make them available online.
Many traditional recipes exist only on paper. They are written in notebooks, on aging index cards, or sometimes on pieces of paper that look like they survived several decades of enthusiastic cooking and the occasional splash of wine, or, as with many of the best recipes, in the head of grandma, kept as one of the best secrets in the world. Beautiful pieces of culinary history, but not exactly convenient for a modern web application.
Baby Leo’s first task was therefore straightforward: read text from images. Using Tess4J, a Java wrapper around the well-known Tesseract OCR engine, the application could convert handwritten recipes into digital text.
At this stage the system was extremely simple. A static website displayed recipes, an OCR component extracted text from uploaded images, and everything worked well enough to demonstrate the idea.
Of course, software projects rarely stay small once developers start adding “just one more feature”. Soon baby Leo would grow up and translate recipes and eventually interact with large language models.
What began as a tiny kitchen helper was slowly becoming something much bigger. After all, in Italy even a small meal rarely stays small for long.
Step 1: A static recipe website
Before baby Leo became anything resembling an AI system, it started as something much simpler: a static website for sharing recipes.
The idea was straightforward. A small site where people could browse recipes. Nothing dynamic. No AI involved. Just HTML pages and a bit of styling. Think of it as the digital equivalent of a handwritten recipe book left open on the kitchen counter.

The design looks charmingly nostalgic. Olive-green gradients, decorative panels, and a menu that feels like it belongs somewhere between early 2000s web design and a family recipe archive. It may not win modern UX awards, but it has character. Most importantly, it contains something valuable: authentic public recipes.
At this stage the system had exactly one responsibility: display recipes. Users could browse categories like pasta, soups, desserts or drinks, and open individual recipe pages with ingredients and preparation steps.
From a technical point of view, the site was simple and stable. There was no backend, no database and no processing logic. Every recipe existed as static content.
But simplicity comes with limitations.
The recipes themselves still lived outside the system. Many existed only as photos or handwritten notes stored in notebooks. The website could display recipes, but it had no way of bringing new ones into the system automatically.
That is where the next idea appeared. If baby Leo could read a recipe from an image, suddenly all those handwritten family recipes could become part of the platform.
That meant it was time to teach baby Leo how to read!
Step 2: Adding a Quarkus backend
To begin learning how to read, baby Leo needs to grow into Leo: a young boy whose brain is now ready to connect letters, words, and meaning from his first reading material. Static pages were no longer sufficient (about as lively as yesterday’s risotto), so we brewed an espresso-strength Quarkus backend (Kotlin + quarkus-rest/Jackson, wired via the Gradle wrapper) that now serves the legacy static assets from META-INF/resources and exposes JAX-RS endpoints (e.g., /api/version pulling app.version from application.properties). Adding your own endpoints is as easy as tossing basil on pasta: drop in a @Path class, Quarkus hot-reloads it in dev mode, and RestAssured tests keep things al dente. A hint for Java amici: keep configuration in MicroProfile Config, let Quarkus handle dependency injection, and your controllers stay thinner than a good pizza crust. This was a significant turning point.
package be.lutske.leolegacy.interfaceadapter.rest;

import jakarta.ws.rs.GET;
import jakarta.ws.rs.Path;
import jakarta.ws.rs.Produces;
import jakarta.ws.rs.core.MediaType;

import org.eclipse.microprofile.config.inject.ConfigProperty;

/**
 * REST resource that exposes the application version.
 *
 * The version value is read from application.properties via MicroProfile Config.
 */
@Path("/api/version")
public class VersionResource {

    @ConfigProperty(name = "app.version")
    String appVersion;

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public VersionResponse getVersion() {
        return new VersionResponse(appVersion);
    }

    /** Simple JSON payload, serialized by Jackson as {"version": "..."}. */
    public record VersionResponse(String version) {}
}
This enables the first Jakarta EE (11) endpoint for the website.
Step 3: Webapp with database
With a backend now in place, the next logical step was persistence. Recipes should no longer live only in files but become proper data that can be stored, retrieved, updated, and expanded over time. In other words, Leo, a few anniversaries later and now grown into Leon, needed a pantry.
To achieve this, we introduced a database that could handle growth, backups, and the steady arrival of new recipes. The platform was no longer just serving static pages; it was becoming a real application where recipes exist as structured records instead of scattered documents.
Leon therefore evolved into a small but capable web application, extended with a relational database. Interestingly, because much of the boilerplate code was generated by AI agents, the usual pain of writing data mappings disappeared. That made it perfectly reasonable to choose a lightweight, straightforward approach. Instead of introducing a full ORM layer, we opted for a clean JDBC implementation.
The result kept the architecture simple and transparent. Queries remain explicit, the data model stays easy to understand, and the system avoids unnecessary abstraction while still providing reliable persistence. Leon could now remember recipes instead of merely displaying them.
Step 4: Extracting text from images
To solve the problem of handwritten recipes, we introduced optical character recognition (OCR). In the Java ecosystem, a practical choice for this is Tess4J, a wrapper around the well-known Tesseract OCR engine.
With this addition, Leon gained the ability to read images (at least on a basic level). In a way, this marked his transition into teenage Leonito.
A user could upload a photo of a handwritten recipe, and the system would extract the text. Ingredients, instructions, and measurements suddenly became searchable and editable. Decades-old recipes could finally be transformed into structured data.
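Behind the scenes, the OCR engine is easiest to treat as a seam. The sketch below is illustrative (the port name and the cleanup rules are ours, not the project’s): Tess4J would sit behind the port, while a small pure-Java step normalizes the raw output before any parsing happens.

```java
// Illustrative port around the OCR engine. A Tess4J adapter would implement
// this by delegating to Tesseract; tests can implement it with canned text.
interface OcrPort {
    String extractText(byte[] imageBytes);
}

// Raw OCR output tends to be noisy: stray carriage returns, runs of spaces,
// and bursts of empty lines. This helper collapses them before parsing.
class OcrTextCleaner {
    static String clean(String rawOcrOutput) {
        return rawOcrOutput
                .replace("\r\n", "\n")        // normalize line endings
                .replaceAll("[ \t]+", " ")    // collapse repeated spaces/tabs
                .replaceAll("\n{3,}", "\n\n") // at most one blank line in a row
                .trim();
    }
}
```

Keeping this cleanup as a plain function means it can be unit tested exhaustively, while the flaky part (the OCR engine itself) stays behind the port.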

Of course, OCR is rarely perfect. Handwriting varies, images may be blurry, and occasionally the system produces creative interpretations of what was written. A carefully written “basil” might become “basin,” and “olive oil” could occasionally turn into something far more mysterious.
And this is where the project started to grow in a familiar way. Once the system could extract text, it became tempting to add more features. Perhaps recipes could be translated, ingredients could be standardized, shopping lists could be generated automatically.
At that moment, teenage Leonito stopped being just a website.
It started to become an application.
Step 5: Extracting text from images (bis)
The next challenge appeared when teenage Leonito tried to read the recipes. None of them followed a consistent structure. Some were neatly typed, others scribbled in notebooks, and many existed only as photos taken in the kitchen between a pot of simmering ragù and a glass of house wine. Traditional OCR could extract text from these images, but the parsing stage quickly became messy. Ingredients blended into preparation steps, quantities disappeared, and sometimes the whole recipe looked more like a poetic monologue than structured data.
To solve this, we turned to something a bit more sophisticated. Over the past few years, large language models (i.e., LLMs) have proven to be wonderful at understanding messy information. Foundation models such as GPT, Claude, or Gemini are not just capable of reading text; they can interpret images, extract meaning, and reorganize content into structured information.
So teenage Leonito learned a new trick, taking his next step forward and turning into a young adult. Through LangChain4j, we extended his capabilities with support for LLMs that can interpret recipe images, extract the ingredients and preparation steps, and transform everything into a clean, structured format. Think of it as having a patient Italian grandma sitting at the kitchen table, carefully rewriting every handwritten recipe so it fits neatly into a cookbook.
There was one more small twist. Many of these recipes were originally written in Italian dialect or casual kitchen language. Since the platform itself uses English as its default language, Leonito now also translates the extracted recipes automatically. This way the knowledge of a 90-year-old Italian grandmother, who may never have written a formal recipe in her life, can still find its place in the digital cookbook. In a way, the system became a bridge between old kitchen notebooks from somewhere in Tuscany and a modern recipe platform running on a Quarkus backend. The only hurdle we still need to overcome is that grandma doesn’t want to share her recipes with the world, so probably we would need to think about the addition of security layers.
Example recipe import:
Action 1: Find the recipe

Action 2: Upload the recipe

Action 3: Review imported data

Action 4: Save and enjoy the recipe

The first signs of architectural decay
At this point, Leonito has started to rely quite heavily on LLM integrations. While this unlocks powerful capabilities, it also introduces a certain fragility. If we decide to switch models, change providers, or adapt prompts, the impact quickly spreads across the application. Integrations need to change, prompts need to evolve, and suddenly a simple model change turns into a broader refactoring effort (see “Why AI projects rot faster than traditional systems” below).
On top of that, technical concerns begin to leak everywhere. AI integration logic, REST exposure, database access, and other infrastructure details start appearing throughout the codebase. Even for a small application like Leonito, the number of dependencies is already growing. One can easily imagine what happens once the platform expands: more features, more models, more integrations, and potentially migrations to different environments.
If we want Leonito to age as gracefully as a well-kept Italian cookbook, we need to step back and think about structure. What happens if we want to swap a model provider, move from a local model to a cloud one, or change the database? Without clear boundaries, every change risks touching half of the application. That is why it becomes necessary to introduce a cleaner architecture: one that separates concerns, isolates dependencies, and allows the core logic to remain stable even as the surrounding technology evolves.
Why AI projects rot faster than traditional systems
AI projects tend to rot faster than traditional systems because the ingredients keep changing, which makes clean architecture even more important. In a typical CRUD application the core logic stays relatively stable, like a classic pasta recipe that rarely changes once perfected. AI systems are different: prompts evolve, teams replace models, output formats drift, and new capabilities appear every few months. What worked perfectly with one model version may behave differently with the next, forcing developers to adjust prompts, validation rules, or data pipelines. Without clear architectural boundaries, these changes pile up like layers of pasta in an overenthusiastic lasagna, until the system becomes difficult to maintain. In other words, if traditional software ages like well-stored parmesan, AI codebases can spoil more like fresh mozzarella left too long on the kitchen counter.
Clean architecture – quick recap
Before we dive into “How clean architecture can help”, let’s first do a quick recap of how we look at the subject.
Clean architecture organizes software into clear layers so that the core business logic remains independent from frameworks, databases, and external systems. At the center sits the domain, which contains the fundamental business concepts and rules of the application. These are the entities and domain objects that describe what the system is and what constraints must always hold. Surrounding the domain are the use cases, which describe what the system does: creating a recipe, retrieving it, updating ingredients, and so on. Use cases orchestrate the domain logic but remain completely independent from technical concerns. In other words, the heart of the system should not care whether it runs on a laptop in Florence or in a cloud cluster somewhere else or whether the overall application is Spring or Quarkus based.
To interact with the outside world, use cases depend only on interfaces (often called ports). These interfaces define what data or services the use cases require. The actual implementations are provided by data providers (adapters) in the outer layers of the architecture. For example, a RecipeRepository interface may be defined close to the use cases, while a JDBC implementation, a REST client, or another persistence technology lives in the infrastructure layer. The important rule is that dependencies always point inward: the domain and use cases never depend on databases, frameworks, or external APIs. Those external components depend on the abstractions defined closer to the core.
At the edge of the system, we find the application wiring, where frameworks and delivery mechanisms live. REST endpoints, messaging consumers, or UI controllers receive incoming requests and translate them into calls to the appropriate use cases. Dependency injection is used to connect the pieces together: infrastructure components implement the interfaces required by the application layer and are injected at runtime by the framework. The result is a system where the core logic remains stable while technologies can evolve around it.
Or, to keep a small Italian touch: the recipe stays the same, even if the kitchen tools change.
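To make the recap concrete, here is a minimal sketch in plain Java. The names (Recipe, RecipeRepository, AddRecipeUseCase) are illustrative, not the project’s actual classes; the point is the direction of the dependencies.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Domain object: what a recipe *is* (illustrative, not the project's real model).
record Recipe(String name, List<String> ingredients) {}

// Outbound port: what the use case *needs*, defined close to the use case.
interface RecipeRepository {
    void save(Recipe recipe);
    Optional<Recipe> findByName(String name);
}

// Use case: orchestrates domain logic, knows nothing about JDBC, REST, or AI.
class AddRecipeUseCase {
    private final RecipeRepository repository;

    AddRecipeUseCase(RecipeRepository repository) {
        this.repository = repository;
    }

    Recipe add(String name, List<String> ingredients) {
        if (name == null || name.isBlank()) {
            throw new IllegalArgumentException("A recipe needs a name");
        }
        Recipe recipe = new Recipe(name.trim(), List.copyOf(ingredients));
        repository.save(recipe);
        return recipe;
    }
}

// Adapter at the edge: one possible implementation. A JDBC or REST-backed
// variant would implement the same interface without touching the use case.
class InMemoryRecipeRepository implements RecipeRepository {
    private final Map<String, Recipe> store = new ConcurrentHashMap<>();

    @Override
    public void save(Recipe recipe) {
        store.put(recipe.name(), recipe);
    }

    @Override
    public Optional<Recipe> findByName(String name) {
        return Optional.ofNullable(store.get(name));
    }
}
```

Dependencies point inward: the adapter depends on the interface, never the other way around. Swap the pantry, keep the recipe.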
With this evolution in mind, we can no longer speak of Leonito, the young adult, but of Leonard, now a mature man.

Clean architecture in an AI context
Before introducing clear boundaries, our project already had some structure. The code was separated into application, infrastructure, persistence, and inference layers. On paper this looked quite reasonable. In practice, however, the responsibilities were starting to blur.
Below is a simplified view of the project structure at that stage:
application
└─ service
   └─ RecipeExtractionService
infrastructure
└─ ai
   ├─ ChatModelProducer
   └─ LangChain4jRecipeExtractionService
persistence
├─ entity
└─ repository
interfaceadapter
└─ rest
At first glance, this looks organized. But the moment AI functionality enters the picture, the boundaries begin to soften.
The RecipeExtractionService in the application layer is responsible for orchestrating the use case. But it also starts to understand too much about how the AI model behaves. The infrastructure layer contains the LangChain4j implementation, yet the prompt design and response mapping begin influencing higher layers of the system. Over time, more logic tends to creep upward.
This is a common pattern in AI experiments. The model integration starts as a small helper class, but gradually becomes part of the application’s core. Prompt construction, parsing logic, and model configuration spread across the codebase until the architecture starts to resemble a plate of spaghetti: everything intertwined, impossible to move one strand without disturbing the rest.
Another issue becomes visible when we think about change. What happens if we replace the model? What if we move from LangChain4j to a direct API integration? Or introduce a second model for translation or classification? Without clear ports between layers, those changes ripple through the application.
Ports and adapters: The moment the structure changes
To prevent the project from turning into a full plate of architectural pasta, we introduced a clearer separation between capabilities, AI integration, and use cases. This is the moment where Leonito stops behaving like a small script glued together with helpful classes and starts evolving into a real platform.
Instead of letting services talk directly to models and repositories, we reorganized the code around the principles of clean architecture. The use cases now describe what the system wants to achieve: for example, the application can extract a recipe from an image while the outer layers handle the technical details. AI integration, persistence logic and REST endpoints became adapters that implement clearly defined ports. In other words, the core of the system focuses on the recipe, while the kitchen tools around it can change freely.
In practice, this meant that our use cases no longer depended directly on LangChain4j, repositories, or HTTP controllers. Instead, they depend on ports that describe the capabilities the application needs, such as extracting a recipe or storing it. Concrete implementations live in adapters at the edge of the system: a REST adapter to expose the functionality, a persistence adapter for the database, and an AI adapter that communicates with the language model.
We untangled the spaghetti. What used to be a tangled mix of responsibilities is now a structured recipe where each component knows its role. The diagram below shows the simplified structure of the project after this refactoring.
leolegacy
│
├── domain
│   └── model
│       └── Recipe, Category, Ingredient
│
├── application
│   ├── usecase
│   │   └── ExtractRecipeFromImage
│   └── port
│       ├── in
│       │   └── ExtractRecipeFromImageUseCase
│       └── out
│           ├── RecipeExtractionPort
│           └── RecipeRepositoryPort
│
├── adapter
│   ├── in
│   │   └── rest
│   │       └── RecipeImportResource
│   └── out
│       ├── ai
│       │   └── LangChain4jRecipeExtractionAdapter
│       └── persistence
│           └── JpaRecipeRepository
│
└── config
    └── ApplicationWiring
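The ports in this structure can be sketched as plain Java interfaces. The bodies below are illustrative simplifications, not the project’s literal code, but they show the essential point: the use case only ever sees ports.

```java
import java.util.List;

// Simplified result type; names mirror the diagram, bodies are illustrative.
record ExtractedRecipe(String title, List<String> ingredients, List<String> steps) {}

// Inbound port (application/port/in): what callers may ask the application to do.
interface ExtractRecipeFromImageUseCase {
    ExtractedRecipe extract(byte[] imageBytes);
}

// Outbound ports (application/port/out): what the use case needs from the edges.
interface RecipeExtractionPort {
    ExtractedRecipe extractFromImage(byte[] imageBytes);
}

interface RecipeRepositoryPort {
    void save(ExtractedRecipe recipe);
}

// The use case implementation: pure orchestration, no LangChain4j, no JDBC.
class ExtractRecipeFromImage implements ExtractRecipeFromImageUseCase {
    private final RecipeExtractionPort extraction;
    private final RecipeRepositoryPort repository;

    ExtractRecipeFromImage(RecipeExtractionPort extraction, RecipeRepositoryPort repository) {
        this.extraction = extraction;
        this.repository = repository;
    }

    @Override
    public ExtractedRecipe extract(byte[] imageBytes) {
        ExtractedRecipe recipe = extraction.extractFromImage(imageBytes);
        repository.save(recipe);
        return recipe;
    }
}
```

Because both outbound ports are single-method interfaces, tests (or an entirely different integration) can supply them as lambdas; the AI adapter and the persistence adapter simply become two more implementations.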
Local LLMs vs cloud models
Running models locally with Ollama, Podman or vLLM can be attractive for experimentation, development speed and privacy. Instead of tightly coupling the application to a single external AI provider, the architecture allows models to run directly on local infrastructure when needed. Moving to a cloud-hosted model therefore becomes a simple configuration choice rather than a refactoring marathon.
This approach also fits nicely into the broader conversation around sovereignty. By supporting local models, models running on controlled infrastructure, and AI-SaaS providers such as OpenAI, Anthropic, or Mistral, the platform embraces a “keep your options open” mindset. This aligns well with clean architecture thinking: the application logic remains independent from the specific model provider.
In practice, this means that model selection becomes a decision based on real constraints and capabilities. Depending on cost, data confidentiality, response times, or model features, you can choose the environment and model that best fits the use case. One day that might be a local model running next to your Quarkus service, the next day a hosted foundation model in the cloud. The architecture stays the same; only the configuration changes. In other words, the recipe stays the same, even if the kitchen appliance changes.
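With the Quarkus LangChain4j extension, that configuration choice can look roughly like the fragment below. Treat the exact property keys as an assumption to verify against the extension’s documentation for your version; the idea is that only application.properties changes, never the use cases.

```properties
# Which provider backs the chat model: a local Ollama instance today...
quarkus.langchain4j.chat-model.provider=ollama
quarkus.langchain4j.ollama.chat-model.model-id=llama3.1

# ...or a hosted model tomorrow: flip the provider and supply credentials.
# quarkus.langchain4j.chat-model.provider=openai
# quarkus.langchain4j.openai.api-key=${OPENAI_API_KEY}
# quarkus.langchain4j.openai.chat-model.model-name=gpt-4o-mini
```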
Testing without talking to the model
By isolating the LLM behind a port, we can replace it with a fake implementation in tests. Suddenly, our test suite runs fast and reliably, without waiting for an AI to feel inspired.
Testing in this context also requires a slightly different mindset. The approach is twofold. On one hand, we validate the integration and the surrounding business logic up to the point where the prompt is sent to the model. This ensures that the application behaves correctly: inputs are prepared properly, prompts are constructed as expected, and responses are processed in the right way. On the other hand, the behavior of the model itself is not something we truly control or validate in unit tests, since it lives outside the application as an external dependency.
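A hedged sketch of what such a fake can look like (the port and service names here are invented for illustration): the fake is deterministic, and it records every prompt it receives, so tests can assert on prompt construction without ever calling a real model.

```java
import java.util.ArrayList;
import java.util.List;

// Port hiding the LLM; in the real project this seam sits behind LangChain4j.
interface ChatModelPort {
    String complete(String prompt);
}

// A service that builds the prompt: this is the logic we actually want to test.
class RecipeTranslationService {
    private final ChatModelPort model;

    RecipeTranslationService(ChatModelPort model) {
        this.model = model;
    }

    String translateToEnglish(String recipeText) {
        String prompt = "Translate the following recipe to English:\n" + recipeText;
        return model.complete(prompt);
    }
}

// Test fake: fast, deterministic, and it remembers the prompts it was sent.
class RecordingFakeModel implements ChatModelPort {
    final List<String> prompts = new ArrayList<>();
    private final String cannedAnswer;

    RecordingFakeModel(String cannedAnswer) {
        this.cannedAnswer = cannedAnswer;
    }

    @Override
    public String complete(String prompt) {
        prompts.add(prompt);
        return cannedAnswer;
    }
}
```

The suite stays fast and reliable, and assertions cover exactly what we control: input preparation, prompt construction, and response handling.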
Because the behavior of the system is now heavily influenced by the model version, validating everything locally is often not sufficient. Small changes in model versions, prompt interpretation, or even inference server upgrades can significantly alter the output. For that reason, validation increasingly shifts toward end-to-end testing, controlled environments, and key-user validation. In practice, this means running tests against stable environments where model versions and lifecycles are managed carefully, rather than assuming deterministic behavior in isolated local tests.
In more advanced setups, infrastructure patterns can help manage this evolution. For example, placing service meshes in front of model deployments allows teams to experiment safely with newer versions. Techniques such as traffic mirroring or canary releases, where routing decisions can depend on user headers or other signals, make it possible to compare model behavior before fully switching over. We will not go too deeply into these operational aspects here, as they go beyond the coding focus of this article, but they align with the same philosophy: isolating change, managing dependencies carefully, and designing systems that can evolve without breaking the entire kitchen (i.e., clean architecture on the infrastructure level).
Guardrails and validation in the domain layer
The model may be adventurous enough to suggest pineapple in a carbonara, but the domain layer has the final word. That is exactly why Leonard needs guardrails (is someone mentioning marriage?). Once AI becomes part of the flow, the application must still protect its own integrity. Business rules should decide what is acceptable, what is not, and where the boundaries lie. The model can propose ideas; the system remains responsible for judgment.
In practice, that means protecting Leonard on several fronts. If imported recipes suddenly start proposing pineapple on pizza, the system should be able to reject them with the firmness of an Italian nonna guarding the family cookbook. If someone uploads a “recipe” containing thousands of lines of irrelevant text, the system should block it before it triggers an unnecessarily expensive AI call. The same applies on the output side: if a prompt or malicious instruction causes the model to generate an absurdly long response, the application should cut that short before the bill grows faster than a dough left too long in the sun.
So the guardrails are not there because we distrust AI completely. They are there because AI is only one ingredient in the dish, not the chef running the whole kitchen. Input limits, output limits, validation rules, and domain constraints make sure Leonard stays useful, affordable, and aligned with the purpose of the platform. He may be becoming a big boy, but even in Italy, every growing cook still needs a few house rules.
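A minimal sketch of what such guardrails might look like in plain Java. The thresholds and the pineapple rule are of course illustrative, not Leonard’s actual domain rules:

```java
import java.util.List;

// Illustrative domain-level guardrails: limits are invented for the sketch;
// real thresholds would live in domain configuration.
class RecipeGuardrails {
    static final int MAX_INPUT_CHARS = 20_000;  // block huge uploads before the AI call
    static final int MAX_OUTPUT_CHARS = 5_000;  // cut runaway model responses short

    // Input guardrail: reject before spending money on an expensive model call.
    static String checkInput(String rawText) {
        if (rawText == null || rawText.isBlank()) {
            throw new IllegalArgumentException("Empty recipe text");
        }
        if (rawText.length() > MAX_INPUT_CHARS) {
            throw new IllegalArgumentException("Input too large for a recipe import");
        }
        return rawText;
    }

    // Output guardrail: cap the response before the bill grows like runaway dough.
    static String clampOutput(String modelResponse) {
        return modelResponse.length() > MAX_OUTPUT_CHARS
                ? modelResponse.substring(0, MAX_OUTPUT_CHARS)
                : modelResponse;
    }

    // A domain rule enforced with the firmness of an Italian nonna.
    static boolean isAcceptable(String dish, List<String> ingredients) {
        return !(dish.toLowerCase().contains("pizza")
                && ingredients.stream().anyMatch(i -> i.equalsIgnoreCase("pineapple")));
    }
}
```

The model proposes; the domain layer disposes. All three checks run without any AI involvement, so they are cheap, deterministic, and trivially testable.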
Git branches as storytelling devices
To make the evolution of the system visible, we developed each architectural step in a separate Git branch. Every branch represents a moment in Leonard’s journey: from a simple static website, to OCR-based extraction, to a backend with persistence, and eventually to a structured AI-enabled architecture. The branches also include the prompts we used to generate parts of the code, so readers can see not only what changed, but also how those changes were produced.
You can explore the entire progression step by step in the repository at https://github.com/luma-repositories/leo-by-lutske-and-maarten. If you would like to see the story behind the project unfold live (with a few Italian food references along the way), you can also catch our talk “AI without spaghetti: Clean architecture in the age of AI”, presented at conferences such as Voxxed Days Zurich 2026 and JCON 2026. In the meantime, the repository itself already reads a bit like a recipe: each branch adds a new ingredient to Leonard’s architecture.
Lessons learned from real projects
What we have learned so far is that domain knowledge should be kept separate from integration logic and AI-specific behavior. That separation makes the system much easier to test. For example, a real database integration can be replaced in tests with something simple and fully controlled, such as a hashmap. This allows you to write API-to-database tests that focus on the data and the expected behavior rather than on technical plumbing. You can refactor internals quite freely, and as long as the API contract stays the same, the tests remain stable. Only when the underlying data contract changes do you need to adjust the test implementation as well. The result is a testing approach that feels more functional, more predictable, and less dependent on endless mocking.
This separation also makes change far less dramatic. Swapping models, changing integration patterns, or even introducing a new deployment style becomes much easier when the core logic is not entangled with infrastructure details. We once saw this very clearly in a Spring Boot application from which Quarkus serverless functions were extracted for batch-oriented processing. The main adaptation happened in the application and deployment layer, while the majority of the code could remain exactly as it was. That is the real beauty of a cleaner architecture: not that it looks elegant on a diagram, but that it allows the system to evolve without turning every change into a full kitchen renovation.
Last but not least, dependency upgrades also become much more manageable. When external libraries are contained within smaller, well-scoped modules, their impact is limited to perhaps a handful of adapter or utility classes. That means upgrading a dependency, and with it reducing security exposure, no longer becomes a disruptive sprint priority every single time. Instead, it turns into the kind of maintenance work you can handle in a controlled way when the moment is right. Or, with just a small Italian touch: if the sauce is neatly separated from the pasta, you can improve one without ruining the whole plate.
Why this matters for Java developers
Backend Java developers and architects increasingly integrate AI into existing systems. While these features are easy to prototype, they introduce rapidly changing dependencies such as evolving models and shifting prompt behavior. Applying familiar Java practices like Clean architecture and ports and adapters helps keep those integrations isolated, ensuring that the core business logic remains stable while the AI tooling around it can evolve. In other words, the recipe of the system stays the same, even if we occasionally swap out the pasta machine in the kitchen.
The final plate: Fast innovation without regret
By clearly separating domain, application, and infrastructure, we keep experiments safe and reversible. Models can change, new integrations can appear, and the system can evolve without turning the codebase into a plate of architectural spaghetti. In short, we can move fast without ending up with architectural indigestion.
Along the way, Leo has grown into Leonardo. The platform now supports OCR, LLM integrations and structured recipes. But the menu is far from complete. Ideas like MCP support, RAG, automatic shopping lists and family-protected access are still on the table. If you would like to see where Leonardo is heading next, come join our talk ‘AI without spaghetti: Clean architecture in the age of AI’, where we share the latest updates and lessons from this architectural kitchen.
