What Is Code?
12 May 2026
What is code? At a high level, the answer seems obvious. Code is what developers write: instructions expressed in a programming language that tell machines what to do. For years, writing code meant typing it out, word by word, and progress was measured by how efficiently code could be produced, compiled, tested, and deployed. With modern LLMs we no longer need to type every word to produce code: large amounts of executable code can now be generated from high-level descriptions. This forces a deeper question: if producing code becomes cheaper, what remains valuable about code?
Two Aspects of Code
Code has always served two distinct but intertwined purposes.
First, code is a set of instructions to a machine. It directs computation, moves data, interacts with storage, and coordinates execution. In the era of LLMs, this is the part being commoditized.
Second, code is a conceptual model of the problem domain. This is the “design” aspect. A well-designed codebase does not only contain instructions for the machine; it also contains concepts for humans and tools to reason with.
The activity we call coding is where these two aspects meet. We are shaping the concepts, names, boundaries, and relationships through which the system is understood.
Conceptual Models and Vocabulary
Making the conceptual model explicit is the deeper aspect of coding, driven by the domain and the use cases the system is meant to address. Every domain comes with established processes, practices, and, more importantly, a shared vocabulary. That vocabulary is where the conceptual model becomes visible.
Vocabulary is usually understood as the set of words used in a particular language or subject. I am able to write this article because I know the vocabulary of English. The reader can read it because they share that vocabulary with me.
But to understand this article, knowing English alone is not enough. This article is about software development. Software development is a broad, mature field with its own technical vocabulary. When I use a word like abstraction, I am not merely using an English word. I am referring to a specific software development concept, with its own meaning, history, and implications. A reader unfamiliar with software development may understand the word at the surface level but miss its deeper meaning in this context. Such mature areas, with their own established vocabularies, are called domains.
This is true of all serious domains. Communication depends on shared vocabulary. Whether we are communicating with a person, a framework, or an LLM, the words we use must map to concepts that the receiver can understand and act upon.
Vocabulary in Code
A well-designed codebase is a representation of a certain vocabulary. Where does this vocabulary in code come from? This is where the unique nature of software development truly shines. Software development sits at the intersection of various domains. At one end we have business domains such as banking, finance, retail, inventory, healthcare, and insurance. At the other we have technical domains such as the web, infrastructure, AI, and data engineering.
Someone doing web development needs a strong grasp of web architecture, the semantics of HTTP methods, the universal caching potential of GET, and the implications of those semantics. Someone who does not know that will not architect complex systems well. The same is true in other domains. Vocabulary is not just a collection of labels. It carries meaning, constraints, and design consequences.
Consider the retail domain. When we write code for retail we talk about customers, products, orders, shipments, payments, and so on. When we are doing web development for the retail domain, the code contains concepts that map the retail domain to the web domain. For example, a Catalog is a resource, and we can use the GET, POST, PUT, and DELETE HTTP methods to perform operations on it. Someone writing this code needs to be familiar with both vocabularies.
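A framework such as Spring would express this mapping with annotations; the sketch below (the names `Catalog` and `sku` are illustrative, not taken from any real system) strips the framework away to show just the mapping of HTTP methods onto operations on a retail resource.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch: the retail concept "Catalog" modelled as a web
// resource, with each HTTP method mapping to one operation on it.
class Catalog {
    private final Map<String, String> products = new LinkedHashMap<>();

    Optional<String> get(String sku) {           // GET /catalog/{sku}
        return Optional.ofNullable(products.get(sku));
    }

    void put(String sku, String description) {   // PUT /catalog/{sku}, idempotent
        products.put(sku, description);
    }

    void delete(String sku) {                    // DELETE /catalog/{sku}
        products.remove(sku);
    }
}
```

The point is not the trivial implementation but the names: `Catalog`, `sku`, and the method-to-operation mapping are exactly the two vocabularies, retail and web, meeting in code.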
Coding for a domain is fundamentally an act of translation. The developer maps the domain vocabulary onto the vocabulary of the technical domains. In doing so, a new vocabulary is also built using the constructs provided by a programming language. There are general technical concepts like logs, repositories, quorums, and transactions, and domain-specific concepts like money. Concepts become types, relationships become interfaces, rules become invariants, and workflows become compositions.
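As a small illustration of "concepts become types, rules become invariants", here is a minimal, hypothetical Money type. The rule "amounts in different currencies cannot be added" is enforced by the type itself rather than remembered by every caller.

```java
import java.math.BigDecimal;
import java.util.Objects;

// Illustrative: the domain concept "money" as a type whose invariant
// (no cross-currency addition) is checked in one place.
record Money(BigDecimal amount, String currency) {
    Money {
        Objects.requireNonNull(amount);
        Objects.requireNonNull(currency);
    }

    Money add(Money other) {
        if (!currency.equals(other.currency))
            throw new IllegalArgumentException(
                "Cannot add " + other.currency + " to " + currency);
        return new Money(amount.add(other.amount), currency);
    }
}
```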
The precise names of variables, the boundaries of methods, and the hierarchy of classes are discovered step by step. The right abstraction often is not obvious upfront; it reveals itself only as you continually mold and refactor the code against real-world constraints. When used well, the code slowly becomes a readable, highly specific representation of the domain itself.
For the technical domains we typically find frameworks and libraries which provide the base implementation patterns. Frameworks and libraries are codified vocabularies. They capture the most common patterns of usage. That is why ecosystems such as the Spring Framework exist for building enterprise applications involving the web, integration, and related concerns. Different programming languages bring their own flavor, along with specific design constraints that get reflected and codified in their frameworks and libraries.
Bounded Contexts and Local Vocabularies
Frameworks work well when a domain has stable, recurring structures with broadly shared semantics. But something like “online retail” or “stock exchange” is different. Those are not just technical stacks. The main reason there is no universal high-level framework for them is that the vocabulary is not stable enough across all instances of the domain. Attempts to find universal abstractions end up either too generic to be useful or too opinionated to be widely applicable. The closer you get to the core business model, the more the abstractions must be discovered locally. This is why the idea of a bounded context in Domain-Driven Design is so important. A bounded context marks the boundary within which a particular vocabulary and model are valid. The same word may mean different things in different contexts, and each context needs its own abstractions, rules, and language.
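A sketch of that last point, with illustrative names (the nested classes stand in for separate modules or packages): the word "Product" carries a different model in each bounded context, and neither model is wrong.

```java
import java.math.BigDecimal;

// Illustrative: the same word, "Product", modelled differently in two
// bounded contexts. Each context keeps only the attributes it needs.
class CatalogContext {
    // In the catalog, a product is something to describe and price.
    record Product(String sku, String description, BigDecimal price) {}
}

class ShippingContext {
    // In shipping, a product is something with weight and volume.
    record Product(String sku, double weightKg, double volumeCubicMeters) {}
}
```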
How do we build these local abstractions and vocabulary? A lot of this vocabulary is built through iterative sessions where we write code and reflect on it. Techniques like TDD are excellent for this iterative development of the vocabulary. They help us discover the right names, the right abstractions, and the right boundaries by forcing continuous feedback between the model and its behavior.
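As a tiny illustration of how TDD drives vocabulary (all names here are hypothetical): writing the assertion first forces us to choose the names `Order`, `OrderLine`, and `total` before any implementation exists, and those names then become part of the codebase's vocabulary.

```java
import java.math.BigDecimal;
import java.util.List;

// Illustrative: types whose names were forced into existence by a
// test written first, e.g. assert order.total() equals 25.
record OrderLine(BigDecimal unitPrice, int quantity) {
    BigDecimal lineTotal() {
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }
}

record Order(List<OrderLine> lines) {
    BigDecimal total() {
        return lines.stream()
                    .map(OrderLine::lineTotal)
                    .reduce(BigDecimal.ZERO, BigDecimal::add);
    }
}
```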
Coding cannot happen in isolation. There must be close collaboration between domain experts, users, and developers. This collaboration is necessary to build these local abstractions and vocabulary.
This connects directly to the lessons of agile software development. The emphasis on individuals and interactions, customer collaboration, working software, and responding to change is not just process advice. It is a way of discovering and refining vocabulary through feedback.
Domain-Driven Design makes this more explicit through the idea of a ubiquitous language: a shared language developed by developers and domain experts and tested continuously against working software.
Programming Languages As Thinking Tools
Building vocabulary through code requires active engagement in writing and reshaping code, not just passive review of generated code. The act of thinking deeply about code often happens only when we are actively engaged in writing it. Programming languages, with their constructs and constraints, become thinking tools in their own right. The design constraints of different languages shape our thinking: the channels and lightweight threads of Go, the object-oriented model of Java, and the ownership model of Rust all push us to see structure, boundaries, and trade-offs in particular ways. In that sense, programming languages do not just help us express a design. They also help us discover it. I recently had to design a custom Future implementation for the asynchronous programming examples. One important aspect of a Future API is designing its compositions so that a sequence of actions can be expressed:
var future1 = action1();
future1.thenCompose(val1 -> action2(val1))
       .thenCompose(val2 -> action3(val2));
Knowing the concepts and vocabulary of functional programming is crucial to implementing this API well. Not knowing those concepts results in awkward implementations and awkward usage.
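A deliberately minimal, synchronous sketch of such a composable Future (this is not the author's implementation; a real one would handle pending results, callbacks, and failures) shows the functional-programming shape involved: `thenCompose` is a flattening map, so chained actions do not nest into `SimpleFuture<SimpleFuture<...>>`.

```java
import java.util.function.Function;

// Illustrative: an already-completed Future, just to show the shape of
// the composition. Asynchrony and error handling are deliberately elided.
class SimpleFuture<T> {
    private final T value;

    private SimpleFuture(T value) { this.value = value; }

    static <T> SimpleFuture<T> completed(T value) {
        return new SimpleFuture<>(value);
    }

    // The mapper itself returns a SimpleFuture, and thenCompose flattens
    // the result, which is what lets actions chain as a flat sequence.
    <R> SimpleFuture<R> thenCompose(Function<T, SimpleFuture<R>> mapper) {
        return mapper.apply(value);
    }

    T join() { return value; }
}
```

With this shape, the chain from the earlier snippet reads as a flat pipeline of dependent actions, each receiving the previous action's value.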
Sometimes the syntax of a programming language can become too verbose and hide the underlying structure of the solution. For example, while recently working on a snapshot isolation implementation for my workshop, I found that describing the essential requirements in plain English was too vague, while putting them in Java code was too verbose. A more constrained formal specification language like TLA+ would have helped. But even writing a single-page pseudo-formal spec helped significantly.
Begin(T, coord):
    R(T) := HLC(coord).now()
    writeSet(T) := {}

Read(T, N, key):
    N.HLC.tick(R(T))  // HLC advanced, so any write or commit after this is guaranteed to be at a higher ts
    return latest committed version of key with ts <= R(T)

Write(T, N, key, value):
    N.HLC.tick(R(T))
    if LatestCommittedVersion(key).ts > R(T):
        abort T
    place provisional intent for (key, value)
    writeSet(T) := writeSet(T) union {key}
This pseudo-formal spec clarified my thinking and served as a good basis for further discussion, implementation, and validation through tests.
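The read and write-conflict rules from the spec can be sketched in Java. This is a simplified, single-node illustration, not the workshop implementation: HLC ticking and provisional write intents are elided, and timestamps are plain longs.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative, single-node sketch of the spec's Read and Write rules.
// Each key maps to its committed versions, ordered by commit timestamp.
class VersionedStore {
    private final Map<String, TreeMap<Long, String>> versions = new ConcurrentHashMap<>();

    // Read rule: return the latest committed version with ts <= R(T).
    String read(String key, long readTs) {
        TreeMap<Long, String> v = versions.get(key);
        if (v == null) return null;
        Map.Entry<Long, String> e = v.floorEntry(readTs);
        return e == null ? null : e.getValue();
    }

    // Write rule: abort if any commit newer than R(T) already exists
    // (a write-write conflict under snapshot isolation).
    void write(String key, String value, long readTs, long commitTs) {
        TreeMap<Long, String> v = versions.computeIfAbsent(key, k -> new TreeMap<>());
        if (!v.isEmpty() && v.lastKey() > readTs)
            throw new IllegalStateException("abort: conflicting commit after read timestamp");
        v.put(commitTs, value);
    }
}
```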
Working with LLMs
Seeing coding as vocabulary building has important implications for working with LLMs. LLMs are trained on vocabulary from a large body of text and code. They learn recurring relationships between names, APIs, libraries, frameworks, idioms, design patterns, and implementation structures. When they see words such as Controller, Repository, Reducer, ConsensusModule, or TransactionLog, those names are not just labels. They carry associations with known code structures and expected behavior.
This is why vocabulary matters when working with LLMs. If our prompts use vague or inconsistent language, the model has to guess the design we intend. If our codebase uses unclear names and inconsistent concepts, the model has little stable structure to follow. But when the vocabulary is precise, consistent, and embodied in the code, the LLM can map our intent more reliably to useful implementation.
Cognitive Debt
This also explains a particular danger of LLM-assisted coding: cognitive debt. Cognitive debt accumulates when words, abstractions, and structures are used without their meaning being well understood by the people working with them. LLMs amplify this risk because they can generate large amounts of plausible code very quickly. The generated code may contain controllers, repositories, reducers, factories, transactions, schedulers, or other familiar-looking structures. The code may compile. It may even pass basic tests. But if the team does not understand the conceptual model behind those structures, the codebase has gained vocabulary without shared understanding. The problem is not that the LLM generated code. The problem is that the code introduced vocabulary faster than the developers built understanding. That gap is one of the major contributors to cognitive debt.
Code itself as Harness and Context
A lot of the discussion on context engineering and harness engineering treats code as a black box: the responsibility for generating it correctly is managed externally, by providing the right context in prompts or by constructing the right harness of specs, tests, and static validations to make sure the output is structured and works as intended. But well-structured code, with abstractions forming a well-defined vocabulary, itself acts as the most important part of the harness and context. I have repeatedly seen well-designed code work very well with LLMs. More importantly, when the code is built on stable abstractions with clear semantics, you gain some freedom in choosing which LLM to use, and you need not worry as much about how precise your prompts are. The code structure and its accompanying tests themselves provide the context and harness that make LLM output reliable and useful.
Conclusion
The role of coding is not disappearing. But it is changing.
As LLMs make code generation cheaper, the mechanical act of writing instructions becomes less central. What becomes more important is making the conceptual model explicit, discovering the right vocabulary, and refining that vocabulary through iteration, domain expertise, and feedback. This is also why programming languages continue to matter deeply. We are not meant to be passive reviewers of generated code. The act of writing code is itself part of our thinking.
Code is still instructions for a machine. But it is also a model of understanding. In the LLM era, that second role becomes even more important. The future of coding is not just writing more code faster. It is building better conceptual models, better vocabularies, and better foundations on top of which both humans and LLMs can work.
Acknowledgments
I would like to thank Rebecca Parsons and Martin Fowler for their valuable feedback on the early drafts of the article.
Margaret Story, Max Kanat-Alexander, Premanand Chandrasekaran, Nate Schutta, Kavi Sekhon, Misma Silfver, Ricardo Piccoli, and Vinod Kumaar R commented on the draft and gave suggestions.
Gemini and ChatGPT were used to help refine the language of the existing write-up and assist with copy-editing the article.
Significant Revisions
12 May 2026: published

