Map your Code – Master your Architecture

Richard Gross

In projects with hundreds of thousands of lines, it is easy to lose track of code, architecture and quality. Are we still on the right track, are we blocking ourselves with internal dependencies, or are we already stuck? Software is immaterial; we cannot see how it is doing.

To know how our software is doing, we have to systematically check whether the mental models of the developers match the written code. Spoiler: they do not, as software changes too fast to keep track of, every developer has their own model, and no one has the complete picture.

We call the process of matching expectation to reality code mapping. Without it, you might have a direction in which you want to go but no way to know if it will take you to your goal.

Imagine navigating with only a compass. You can decide to go east but whether you end up in New York, Paris or fall into the ocean depends on where you start. If you do not know where you are, you can end up anywhere.

Similarly, you can give the direction to refactor for two months, but whether that time takes you anywhere depends on where you stand. Not all parts of a software system need to be #flexible, after all, and you should know whether there are more pressing concerns, such as security.

To navigate to a desired place, you need both a direction and your current location. This makes code mapping (a) key to mastering architecture. Architecture being defined as:

The software architecture of a system is the set of structures needed to reason about the system, which comprise software elements, relations among them, and properties of both.

Software Architecture in Practice (SAiP) by the Software Engineering Institute

By mapping you can tease out the actual structures in code, how they relate to each other, the environment and even the developers. While powerful, it is not the only method in an architect's toolbox and certainly not the only thing you need to master your architecture. Recent toolbox additions like LASR or Residuality Theory are equally worth checking out.

But back to mapping. In no particular order it allows us to:

  • Manage changes
  • Improve cost and schedule estimates
  • Provide a basis for training
  • Enhance communication among stakeholders

The following chapters will show what mapping is and how to apply it by example.

On Mapping

We can master our architecture by periodically matching our expectations to reality, to what is actually codified. This code mapping is different from taking a pen and drawing boxes and lines. It is of course still worthwhile to do the latter and draw what we think. These are, after all, the expectations that we want to prove or disprove.

But what is really interesting is the reality. To extract this information we need to look at facts, for example the code or the git history, and extract and visualize the information found there. How we do that depends on what facts we have available, the power of our tools and what category we want to map.

Once we have the tools in place we can start our iterative design process (heavily inspired by the PDCA cycle):

The Code Mapping Cycle
  1. We make architectural decisions
  2. We implement them
  3. We map the results
  4. We adjust our thinking, then, if needed, our architecture

There are multiple things we can map in a code base. A good place to start is with the large structures of our code.

Map Code Structures

The first thing we can do in a code base is to map the structure. My preference is to use the open-source tool CodeCharta for this because it is free, maintained and continuously improved since 2017.

CodeCharta can visualize the metrics of our code base as a 3D tree map. Seasoned developers might know this approach from CodeCity, last released in 2009. I don't make any claims of originality here. The city metaphor is simply a very effective visualization for code bases. CodeCharta is novel only in that it is very easy to get started with and well-maintained.
Please note that code mapping is a universal method and not tied to any tool. You do have to use tools, though, to have some assurance that what you map is based on facts, not feelings. CodeCharta is simply used here because it is free to use and I am very familiar with it, having helped develop it. If the other tools I mention are more familiar to you, then use them instead to map. You can find a summary at the end of this article.

Let us now look at Discourse. We will visualize each file as a building. The size of the building (✥) represents the real lines of code (RLoC). Real because these are only the code lines that we actually have to read to understand the code. Comments and whitespace are excluded. The result is the following:

Discourse Code Map

In the visualization you can immediately see that Discourse consists of basically three technical clusters: app (259k), plugins (254k) and spec (240k). This tells us that Discourse is intended to be customizable via plugins. It also tells us that tests are valued: app and spec have almost a 1:1 ratio and each plugin gets its own tests, something we can see by highlighting “*spec*”.

It does not tell us anything about what Discourse is for, though. app, probably the core of Discourse, is structured in a technical manner. It’s about models, services and controllers. Likely because that is the idiom for Ruby on Rails applications.

The name of this structure pattern is package-by-layer. It groups things together not by relation (e.g. controllers typically do not call each other) but by technical whims of the framework. You can immediately spot the framework, Rails, but nothing about the domain that the software supports.

To me this pattern is problematic because it does not tell me anything about what belongs together and I cannot grasp the domain at a glance. This makes it harder to onboard people (the directory structure is the first thing they encounter), but it also means that feature development is spread out across multiple places in the code.

Contrast that with the structure of the plugins. They seem to be using more of a package-by-feature pattern where all layers of the feature (controllers, models, …) are in the same place (see Martin Fowler’s Bliki or this post by Simon Brown). You see plugins for chat, calendar or poll, and that tells you more about the domain than technical layers ever can. The tests also get this context by default. Any test inside chat is about testing the chat. To me that seems much easier on the brain. We take a cue from neuroscience (specifically Hebbian theory) and wire together what fires together.
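To make the contrast concrete, here is an invented sketch of the same small forum app organized both ways (all names are made up for illustration):

```
# package-by-layer: grouped by technical role
app/
├─ controllers/   ChatController, PollController
├─ models/        ChatMessage, PollOption
└─ services/      ChatService, PollService

# package-by-feature: grouped by domain feature
app/
├─ chat/   ChatController, ChatMessage, ChatService
└─ poll/   PollController, PollOption, PollService
```

In the first layout the chat feature is spread across three directories; in the second it lives in one place.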

If we do it well, we can get the following text-book example:

DDD Library Code Map

Seeing the structures catalogue, lending, patron tells you a lot about the domain.

Of course, it’s not entirely fair to compare this sample project (with around 3,000 lines of code) to a huge system like Discourse. I did this for two reasons. First, it was important to me to show how the business domain can be communicated directly via the structure. Second, there are no large OSS projects that use package-by-feature with functional domain terms. Large open source projects “only” support business processes but are not themselves part of the core business. Therefore, the features have technical names such as validation rather than functional domain names such as patron.

For example the OSS PHP web framework Laravel uses package-by-feature:

Laravel Code Map

Here you can clearly see the parts that Laravel provides the user. The logic for validation or accessing the database is not spread out but in one place. Once you accept the technical feature names, it is easy to grasp what Laravel is about.

I encourage everyone to give package-by-feature (or the Simon Brown alternative package-by-component) a try. It might be your first “adjust” step. After adjusting your packaging you will likely have your next insight. It is often surprising how much code has been invested per feature. This in turn can lead to the question of whether all details of the feature are actually used. If it turns out only a fraction is needed by actual users (e.g. we could map the feature usage), we could decide to delete obsolete code.

Map Code Complexity

Let us now map the Spring Framework:

Spring Framework Code Map

This time we place cyclomatic complexity on the height (↨) and also on the color (🖌️) of the building. Cyclomatic complexity tells us how many branches in code there are (if, case, for, catch, …) and that translates to how many decisions are made in the file. Buildings with 100 or more decisions are colored yellow, those with more than 200 are colored red here. I prefer lower thresholds for my code but these values give us a good overview where the major complexity of the Spring framework is situated without being overwhelming.
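As a toy illustration of how the metric could be tallied (a real parser such as the one behind CodeCharta works on the syntax tree; this keyword count is an invented simplification that would miscount keywords inside strings and comments):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ToyComplexity {
    // Each branch keyword or boolean operator adds one decision point
    // (simplified, illustrative list only)
    private static final Pattern BRANCHES =
        Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\|");

    public static int cyclomaticComplexity(String source) {
        Matcher m = BRANCHES.matcher(source);
        int decisions = 0;
        while (m.find()) decisions++;
        return decisions + 1; // one linear path exists even without branches
    }

    public static void main(String[] args) {
        String method = "int clamp(int v,int lo,int hi){ if (v<lo) return lo; if (v>hi) return hi; return v; }";
        System.out.println(cyclomaticComplexity(method)); // two ifs -> 3
    }
}
```

A straight-line method scores 1; every additional decision pushes the building in the map a little higher.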

Clearly spring-core and spring-beans are hard to understand, given that their files are quite large and complex. But the WhatWgUrlParser inside spring-web is also quite interesting. It has 2k RLoC and 840 cyclomatic complexity and exists because it mirrors what browsers use to parse URLs that are not spec-compliant (RFC 3986). A lot of code can be needed to be lenient with URLs.

The term cyclomatic complexity was coined by Thomas McCabe in 1976, long before Java (1995) or C++ (1985) were released. The metric is part of a field of study called "computational complexity theory" (so Wikipedia tells me), like the complexity classes of your favorite algorithms.

Unfortunately the name complexity has a very different definition in "complexity science". Here complexity means "the presence of emergent behaviours and properties in a system" (as explained by Barry O'Reilly, inventor of Residuality Theory). As code has no emergent behaviors (it will do exactly what you wrote), software can never be complex. Software is only ever complicated. From this definitional viewpoint, the metric should have been called cyclomatic complicatedness.

You might have stumbled upon complexity science the same way I have, through the Cynefin framework and wondered about these things. As they say, naming is a hard problem, especially when different fields of study clash. In this article we stick with complexity because that is the name of the metric.

By itself the complexity view does not lead to an “adjust”. While it is sometimes surprising to see where complexity is spent, the view becomes truly useful only when we look at complexity together with other metrics. But before we can combine multiple metrics, a word (a chapter) of warning.

On Metrics

When a metric becomes a target, it ceases to be a good metric.

Free interpretation of Goodhart's Law

Goodhart's paraphrased adage sums up the problem with metrics quite well. They are very bad as targets.

Let us say the target is to “Decrease the complexity of these 100 classes to below 30 within 4 weeks.” It is really easy to do so by using the “Extract class” refactoring. Whether the result is understandable does not matter. If we are evaluated or paid based on the cyclomatic complexity per class, we’ll keep it low no matter what. Job accomplished, the patient is dead.

At the same time metrics are very useful as the start of a conversation, especially when we combine them with other metrics:

Why does this really complex class have so little line coverage? Do we need to do something about that? Does that low coverage create problems for future feature X?

Possible team conversation

We need metrics to handle our huge code base but they can only show where we can start exploring. They cannot show answers. Please keep that in mind when mapping and before adjusting.

Map Knowledge Silos

When is complex code bad? Due to all the decisions that complex code encodes, it can become very bad when only one person knows it. Here is an example from a customer project where we put the number of authors on the color (🖌️). This information is available in the git log.

A file becomes a knowledge silo if only a few people know about it. In the following picture every file that has one author becomes red, every file that has two authors is yellow, and three or more authors is white.

Knowledge Silo Code Map

The Librarian class is quite complex but only has one author. It is also the linchpin in a feature that business considers critical. Clearly a place where we should increase the knowledge through pair/ensemble programming or similar. Especially if the author is leaving the team or about to leave.

But is Librarian really a problem? Depending on the team there are multiple situations where it might not be an issue. For example:

  • The team does regular team programming but does not use git trailers like Co-authored-by: X
  • The git metrics are flawed for some other reason (e.g. pull requests are always squashed by the same person, which is generally not a good idea)
  • Librarian is changed only very seldom
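For reference, the Co-authored-by trailer goes on its own line at the end of the commit message; the names here are placeholders:

```
Refactor Librarian lending rules

Co-authored-by: Jane Doe <jane.doe@example.com>
Co-authored-by: John Roe <john.roe@example.com>
```

Tools that mine the git log can then credit every listed person as an author of the change.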

It is often worth exploring further and looking at how often the code is changed.

Map Often-Changed Code

Even very complex code is not necessarily dangerous. If the code is just there and does its job then we do not need to tackle it. But what if it is constantly changed, what if it has a lot of insertions {+} and/or deletions {-}? It is much harder to understand complex code than simple code. Hence modifications to complex code take more time and are more error-prone, which makes them more expensive. We can of course also map such code. In the following example we put the ratio of insertions/deletions (we call this churn) on the color:

High Churn Code Map

ModuleService in the top-left corner immediately sticks out. It has a lot of code, a lot of decisions, and a lot of change. In addition the name is intention-hiding. This is a prime refactoring target before we implement the next feature there. But we would do well not to stop the investigation here. Maybe there is some form of coupling at play that leads to all this churn?
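The churn numbers come straight from the git history. A hedged sketch of how insertions and deletions per file could be tallied from `git log --numstat` output (class name and sample numbers are invented):

```java
import java.util.HashMap;
import java.util.Map;

public class ChurnTally {
    /**
     * Sums insertions and deletions per file from `git log --numstat` lines
     * of the form "insertions<TAB>deletions<TAB>path".
     */
    public static Map<String, int[]> tally(String numstatOutput) {
        Map<String, int[]> perFile = new HashMap<>();
        for (String line : numstatOutput.split("\n")) {
            String[] parts = line.split("\t");
            if (parts.length != 3) continue; // skip commit headers and binary files
            int[] counts = perFile.computeIfAbsent(parts[2], k -> new int[2]);
            counts[0] += Integer.parseInt(parts[0]); // insertions {+}
            counts[1] += Integer.parseInt(parts[1]); // deletions {-}
        }
        return perFile;
    }

    public static void main(String[] args) {
        // Two hypothetical commits touching the same file
        String log = "120\t15\tsrc/ModuleService.java\n40\t60\tsrc/ModuleService.java";
        int[] ms = tally(log).get("src/ModuleService.java");
        System.out.println(ms[0] + " insertions, " + ms[1] + " deletions"); // 160 insertions, 75 deletions
    }
}
```

A file that accumulates large numbers on both sides, commit after commit, is exactly the kind of hotspot the map colors in.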

Map Change Coupling

This time we take the same customer project but map the number of authors on the color. A high number of authors points to a coordination bottleneck: many authors need to change the same file because it is used in multiple contexts. We can do another thing and visualize change coupling: when two or more files are frequently committed together, we say they are change-coupled. To change one you also have to change the other, a fact that is plain in the git history. The resulting map can be seen below.

Change coupling was popularized by Adam Tornhill in his book Your Code as a Crime Scene, initially under the name "Temporal Coupling". It is also available in the tool CodeScene, which is well worth checking out.

Change Coupling Code Map

We can clearly see that ModuleService is safe in terms of authors and change coupling. Complexity is high, so is churn, but no coordination bottleneck. RentService however is committed together with a lot of other files. It does not import these other files and neither does the inverse happen.

It is thus likely that a hidden coupling exists between RentService and these other files. That can take many forms. In stringly typed code (where types are represented with strings instead of being modelled explicitly), the coupling happens because the code switches on the same string code in multiple places. There are multiple other ways in which such a hidden coupling can exist (see also connascence), and it is something that has to be explored in the team.
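To make the stringly typed case concrete, here is a hypothetical example (invented classes, not code from the actual project): two classes with no import between them that are still invisibly coupled through a shared string code.

```java
public class StringlyCoupling {

    static class RentService {
        String describe(String state) {
            switch (state) { // switches on raw string codes
                case "OPEN":    return "Rental is running";
                case "OVERDUE": return "Rental is overdue";
                default:        return "Unknown state";
            }
        }
    }

    static class ReminderMailer {
        boolean shouldRemind(String state) {
            // Relies on the very same magic string as RentService. Introducing a
            // new state (or renaming one) forces commits to both files, which is
            // exactly the change coupling that shows up in the git history.
            return "OVERDUE".equals(state);
        }
    }

    public static void main(String[] args) {
        System.out.println(new RentService().describe("OVERDUE"));
        System.out.println(new ReminderMailer().shouldRemind("OVERDUE"));
    }
}
```

Neither class imports the other, so dependency tools see nothing, yet the git log reveals they change together.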

Next we can look at the explicit coupling in code, the dependencies that the files have.

Map Dependencies

import statements create explicit dependencies between files. With tools like SonarGraph we can check the compiled (byte) code and visualize the dependencies as a graph.

In this article we are going to be using an alternative tool (as of the time of writing, not yet open-sourced). Mainly because it has a very powerful visualization algorithm (more on that below) and is very easy to get started with, as it can parse source code directly without needing build artifacts. I am also very familiar with it, having helped develop it.

When we map dependencies we are particularly interested in cycles. These can take multiple forms.

The most obvious is A directly depending on B and B depending on A. This can already be quite vicious. Changing A can lead to a change in B which might lead to a change in A to accommodate and so on.

Cycle between A and B

Then there are transitive cycles: A depends on B, B depends on C, and C depends on A. Again, changing any of the elements can lead to changes in any other element in the chain. The longer the chain, the more of these changes can ripple through your code base and lead to unforeseen changes.

Transitive Cycle between X and Y and Z
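Detecting such cycles, direct or transitive, is a standard depth-first search over the dependency graph. A minimal invented sketch (not the algorithm of any particular tool):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CycleFinder {
    /** Returns true if the directed dependency graph contains any cycle. */
    public static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();       // fully explored nodes
        Set<String> inProgress = new HashSet<>(); // nodes on the current DFS path
        for (String node : deps.keySet()) {
            if (dfs(node, deps, done, inProgress)) return true;
        }
        return false;
    }

    private static boolean dfs(String node, Map<String, List<String>> deps,
                               Set<String> done, Set<String> inProgress) {
        if (inProgress.contains(node)) return true; // back edge -> cycle found
        if (done.contains(node)) return false;      // already proven cycle-free
        inProgress.add(node);
        for (String dep : deps.getOrDefault(node, List.of())) {
            if (dfs(dep, deps, done, inProgress)) return true;
        }
        inProgress.remove(node);
        done.add(node);
        return false;
    }

    public static void main(String[] args) {
        // The transitive cycle from the text: A -> B -> C -> A
        Map<String, List<String>> deps = Map.of(
            "A", List.of("B"), "B", List.of("C"), "C", List.of("A"));
        System.out.println(hasCycle(deps)); // true
    }
}
```

Breaking any single edge of the cycle, for example by inverting the C-to-A dependency behind an interface, makes the graph acyclic again.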

Finally we have architecture cycles, where features or layers have cyclic dependencies between them. We can visualize such problems by arranging our packages into rows from top to bottom (an algorithm that was pioneered by Structure101, now part of Sonar). The elements in the first row only have dependencies on the elements below them. The elements in the last row only have ingoing dependencies. Elements in the same row have no dependencies between them.

In most code bases this perfect structure does not exist however. There are always some cycles. To solve this we use a heuristic and assume that the one with more outgoing dependencies should be placed higher.

The result of this ordering is that we can skip drawing the arrows going down and only draw the red arrows that go up. This reduces the noise in the map significantly and allows us to focus on the cycles that we need to manage. The following example code (only subtly inspired by a famous tabletop role-playing game) uses a variant of Jeffrey Palermo’s onion architecture. Normally all dependencies should point down towards domain. But domain also depends on the higher-level application.

Layer Cycle

The problem with cycles is that fixing them is hard (i.e. time-consuming) but creating them by mistake is easy. It is something that can happen even to experienced teams.

CodeCharta Feedback when packaging-by-layer

The graph shows the packages sorted by the previously described algorithm. Because the dependencies going down are implicit, we can hide them and improve the signal-to-noise ratio. We can focus on the red arrows going up. These upwards dependencies, aka feedback dependencies, can easily create knots in our brain.

To understand services you first have to understand ui but to understand that you first have to understand services but to understand that…

This deadlock does not only happen in your brain. It can also happen when you change something because that can cause ripple-effects where a change in service leads to a change in ui which leads to a change in service and so on. Normally the layer at the top should be safely changeable without affecting lower layers.

The first step to address the red dependencies is to package-by-feature, which will clarify which features depend on each other. Dependency-breaking techniques such as consumer-defined interfaces can help once this work is completed. Often it is a case of placing an element where it is needed the most and then breaking the dependency that other features have or delegating their needs.

A framework that has been organized by feature for a long time is Spring 🍃. The separation is quite strong because each coarse-grained feature is separated into its own Gradle project 🧩. This way the projects cannot have cycles between them because the compiler does not allow it. Each of these projects has its own package 📦 hierarchy.

🍃 spring-framework
├─🧩 spring-context
│ └─📁 src
│   ├─📁 main/java
│   │ ├─📦 org.springframework.cache
│   │ └─📦 org.springframework.context
│   ├─📁 test/java
│   │ └─📦 org.springframework.context
│   └─📁 testFixtures/java
│     └─📦 org.springframework.context
├─🧩 spring-context-support
│ └─📁 src
│   ├─📁 main/java
│   │ └─📦 org.springframework.cache
│   ├─📁 test/java
│   │ └─📦 org.springframework.cache
│   └─📁 testFixtures/java
│     └─📦 org.springframework.contextsupport
└─🧩 etc.

The package org.springframework.context shows up both in the main production code as well as in the test and testFixtures. This is normal. However org.springframework.cache shows up across multiple projects. This becomes important when we map the dependencies of the framework:

Spring Framework Feedback including Tests

We still get feedback dependencies even though the compiler does not allow such cycles.

One reason is that the tool we are using works on source code, not build projects, and groups the files by package. Files with the same package name but a different build project will end up in the same location. E.g. org.springframework.cache is located in multiple Gradle projects. This is known as the split package problem and is not allowed when using the Java Platform Module System (JPMS). When not opting in to JPMS, files in the same package but different jars can override each other at runtime or access each other's package-private members, potentially breaking encapsulation. As far as I know, Spring does not use JPMS because implementing it would create significant overhead.

The more important reason for the feedback dependencies, however, is that test files can create feedback even if the production code has none. E.g. the files in src/test/..cache depend on files in src/testFixtures/..contextsupport, which then depend on src/main/..cache.

If we remove the src/test and src/testFixtures folder and only leave the production code in src/main we get a much more manageable picture:

Spring Framework Feedback without Tests

If we look at only the production code in main then there is almost no feedback between the top-level features. If we include test and testFixtures then we have lots. This creates an interesting debate. On one hand src/main cannot depend on src/test (in maven at least) so you cannot have the ripple-effects when changing code. On the other hand these test dependencies can still create knots in your brain.

Of course we should not only look at top-level cycles but also how the dependencies inside the top-level packages look. We can take a deeper look at http which is part of spring-web 🧩:

Spring-Web.http Feedback Dependencies

In this picture the file-based cycle arrows are enabled. These are blue downwards arrows that form a cycle together with the red feedback arrows. E.g. DefaultBodyBuilder implements BodyBuilder which uses RequestEntity which uses DefaultBodyBuilder.

To keep an architecture #flexible it is paramount to manage all this feedback. Getting an overview by mapping the existing ones is a great first step. It would be even better if we could prevent them in the future, which the next chapter can help us with.

Map Architecture Violations

Most programming languages are quite limited in what architectural constraints we can model. Usually we only have the visibility modifiers public, private etc. that protect us at design and compile time. With ArchUnit we can add tests to our Java applications that guard even more (see ts-arch, dependency-cruiser or Sheriff for your JavaScript/TypeScript project).

ArchUnit provides a fluent DSL that makes it easy to describe and understand what we want to guard against:

classes().that().resideInAPackage("..foo..")
    .should().onlyHaveDependentClassesThat().resideInAnyPackage("..source.one..", "..foo..");

The DSL can also be used to guard against layer cycles:

layeredArchitecture()
    .consideringAllDependencies()
    .layer("Controller").definedBy("..controller..")
    .layer("Service").definedBy("..service..")
    .layer("Persistence").definedBy("..persistence..")

    .whereLayer("Controller").mayNotBeAccessedByAnyLayer();

or even package cycles:

slices().matching("com.myapp.(*)..").should().beFreeOfCycles();

Adding these rules to an established project is not easy, however. Fixing all the discovered violations would take a long time. It is often better to fix only the critical violations, accept the existing ones, but still have a safety net against new violations. ArchUnit provides a feature to record existing violations and only error when a new violation is created:

freeze( // accept existing violations
    noClasses().should().dependOnClassesThat().resideInAPackage("..service..")
);

We can then map these violations to figure out where we should focus our time.

Architecture Violation Code Map

Map Feature Violations

Showing violations for each individual file is probably too detailed for most of your stakeholders. My colleague Andreas Blunk had another great idea. He summed up the violations per feature and displayed those instead. Of course this only works if we have packaged by feature. The result gives stakeholders a good visualization of the remaining work.

Feature Code Map

Andreas also used the aggregated view to show the progress of the modernization effort. A delta map allowed him to show where his team had decreased architecture violations (green roof) and where new violations were accepted (red roof). Stakeholders can “see at a glance which areas are well optimized and where there is still room for improvement.” A good place to show them is during a review where a lot of changes happened “under the hood.”

Feature Delta Code Map

Conclusion

Code mapping is the process of matching expectation to reality. It is beneficial not only for individuals but also for teams. A good map makes code tangible and enhances communication among stakeholders. It can also visualize what features the application has and how much (code) was invested in them, which can provide a basis for training. We can also improve cost and schedule estimates: the metrics in the map make it plain which parts are #flexible and which require a (significant) overhead to change. Finally, the map allows us to manage changes. It provides insights into where to share more knowledge and what to split. Together with a feature roadmap we can better plan our time and focus our modernization efforts on the parts that are about to change.

We all use maps daily. Navigation on the streets, in transit or in buildings (floor plans) is much easier when you know where you are and can see the path to your target. We can get the same clarity by mapping our code, especially if we are new to the project. Every month or two we can repeat the activity to see if we are still on track or need to course-correct. The map makes it easier to communicate not only our goal but also changes to that goal to our stakeholders. I encourage you to give mapping a try.

Appendix – Mapping with CodeCharta

Getting started with CodeCharta is incredibly easy thanks to the new unifiedparser. It can parse source code directly thanks to the amazing tree-sitter parser-generator library.

# clone a project of your choice
git clone git@github.com:laravel/framework.git laravel-framework

# check you have java installed, should print "java 21.0.2" or similar
java --version
# check you have npm installed, should print "10.9.2" or similar
npm --version
# download the codecharta analysis tools 
npm install -g codecharta-analysis
# analyse laravel
ccsh unifiedparser laravel-framework -o laravel-framework.uni
# open the CodeCharta docs, click on "Web Studio", then load the generated file
open https://codecharta.com

CodeCharta has multiple other parsers for different sources: one parser for git, one importer for SonarQube and of course one tool to merge the results. Check out the project page if you are interested. If you are very interested, consider helping out. There is still a lot that can be built.

Appendix – CodeCharta Alternatives

CodeCharta is definitely not the only mapping tool out there.

CodeScene uses circular packing for its map and automatically aggregates code health of your code. Seerene provides a map but also a digital boardroom that gives CTOs an overview of their whole enterprise.

SonarGraph is great for mapping and identifying bad dependencies in your code. You might have seen their visualization in the book Sustainable Software Architecture. An alternative is Structure101 which has now been integrated into Sonar.

JetBrains Qodana and SonarQube are static code analysis tools that visualize code issues with dashboards, not as maps. With the right importer their insights can be mapped with CodeCharta, however.

This article is part of the JAVAPRO magazine issue:

Agentic AI Meets Java

Explore how agentic AI introduces new opportunities and challenges for Java development — from conceptual shifts to practical learnings.
