When we finally connect two or more systems, suddenly hidden assumptions and mismatches pop up. Integration isn’t creating new bugs so much as shining a spotlight on flaws that were always there.
It’s like assembling a puzzle: the picture only makes sense when the pieces really fit. In this post, we’ll explore how integration acts as the ultimate design review, using real-world analogies to highlight key lessons for building robust architectures.
Integration as the First Real Design Review
Integration is often the moment of truth for our architecture. You might write unit tests and mock dependencies, but until components really exchange data, you won’t know if the designs align.
Think of it like cooking a new dish: baking bread and preparing soup separately might work, but only by combining them can you taste the final meal. In one project story, unit tests passed without a hitch until integration tests began. That’s when a “shallow, but wide” flaw in the design surfaced, caused by an unstable entity ID scheme. In other words, integration revealed that what seemed fine in isolation didn’t hold up when everything ran together.
A simple analogy: imagine two dance partners who practiced alone. Both know the steps, but they fall out of sync when the music starts. Integration is like that: it forces teams to agree on how the pieces connect. By treating integration as the first real design review, we catch these misalignments early. Instead of blaming integration for introducing bugs, recognize it as the exam that tests your design. In the wild, this mindset means writing integration tests and standing up staging systems early to catch mismatches in messages, data formats, and timing.
The Role of Contracts
This is where contracts come in. A contract is any precise agreement on the shape and meaning of data between systems. It could be a JSON Schema, an OpenAPI definition, a WSDL for SOAP, or even an old COBOL copybook in a mainframe. Think of a contract like a rulebook or legal agreement: it spells out exactly what fields, formats, and values are allowed. When you work contract-first, both sides agree on the message structure up front. Front-end and back-end teams can work in parallel against a mock API defined by the contract, surfacing assumptions early. In contrast, a code-first approach often reflects only one team’s internal logic. That can lead to “brittle contracts, inconsistent behavior, and expensive rewrites” later, as other services try to integrate. Without a clear contract, hidden assumptions grow like weeds.
In fact, one practitioner says it plainly:
“Good contracts save time, prevent misunderstandings, and allow large teams to scale their work without tripping over each other. When APIs are treated as the first step in design rather than the last, integration becomes smoother and systems are easier to evolve.”
In practice, contracts need to be very explicit. For example, a JSON Schema tells exactly which fields are required and what type they must be. OpenAPI (Swagger) documents can be used to automatically generate test requests or server stubs. COBOL copybooks in legacy systems defined record layouts down to the byte, ensuring every COBOL program agreed on data formats. The more precise the contract (and the stricter the schema), the clearer the communication between systems.
A good analogy: it’s like agreeing on a common language or protocol before you start sending messages. With a well-defined contract, both sides know the grammar and vocabulary, so they can catch misunderstandings early rather than in production.
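To make the idea concrete, here is a minimal, hand-rolled sketch of contract enforcement in Python. A real system would use a proper JSON Schema validator; the field names and types below are purely illustrative:

```python
# Minimal contract check: required fields and their expected types,
# hand-rolled for illustration (use a real JSON Schema validator in practice).
CONTRACT = {                      # illustrative "order" message contract
    "order_id": str,
    "amount_cents": int,
    "currency": str,
}

def validate(message: dict, contract: dict = CONTRACT) -> list[str]:
    """Return a list of violations; an empty list means the message conforms."""
    errors = []
    for field, expected_type in contract.items():
        if field not in message:
            errors.append(f"missing required field: {field}")
        elif not isinstance(message[field], expected_type):
            errors.append(f"wrong type for {field}: "
                          f"expected {expected_type.__name__}")
    return errors

good = {"order_id": "A-17", "amount_cents": 2500, "currency": "ZAR"}
bad  = {"order_id": "A-17", "amount_cents": "2500"}

assert validate(good) == []
print(validate(bad))   # two violations: wrong type, missing field
```

The point is not the ten lines of code but the habit: every message is checked against an explicit, shared definition the moment it crosses a boundary.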
The Trap of Flexibility Without Clarity
It’s tempting to stay open and flexible. After all, JSON itself is a “schemaless” format – you can send almost anything. But this flexibility can be a trap. If you don’t explicitly enforce a schema, fields will drift as teams add or rename data. Over months or years, one service might quietly start writing new fields, another might interpret them differently, and soon behavior diverges. Without clear rules, integrations break in subtle ways.
Think of it like giving your kids too much freedom at home: without clear guidelines, they’ll gradually rearrange everything until the house looks nothing like when you left it. Similarly, in data integration, one good practice is to never remove or rename existing fields without a plan. As one expert advises, “the key to versioning JSON is to always add new properties, and never remove or rename existing properties”. In other words, treat the data format like a public contract: changing it can break callers, so evolve it carefully. The downside of blind flexibility is that your messages get bloated and confusing over time, because old fields hang around for backward compatibility.
A real-world analogy: imagine two people communicating in a dialect that changes every week. At first they understand each other, but without anyone writing down the new words, misunderstandings pile up. By the time they realize, it’s a mess. That’s what happens when systems exchange “free-form” JSON with no versioning or schema. It works at first, but drift causes one service to break another years later. The cure is to impose some structure, an agreed-upon schema or version number, even if it feels like a bit of overhead. Being slightly strict pays off by keeping everyone on the same page.
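The “always add, never remove” rule can be sketched in code. Below, a tolerant reader handles both an old and a new message version; the field names, default values, and version numbers are illustrative assumptions, not a real schema:

```python
# Evolving a message additively: v2 adds a field, never removes or renames.
# Readers ignore unknown fields and fall back to defaults for new ones.
# (All field and version names here are invented for illustration.)

def read_customer(msg: dict) -> dict:
    """Tolerant reader: works for v1 and v2 messages alike."""
    return {
        "name": msg["name"],                                  # since v1
        "email": msg["email"],                                # since v1
        "loyalty_tier": msg.get("loyalty_tier", "standard"),  # added in v2
    }

v1 = {"schema_version": 1, "name": "Thandi", "email": "t@example.com"}
v2 = {"schema_version": 2, "name": "Thandi", "email": "t@example.com",
      "loyalty_tier": "gold", "future_field": "safely ignored"}

assert read_customer(v1)["loyalty_tier"] == "standard"
assert read_customer(v2)["loyalty_tier"] == "gold"
```

Because v2 only adds a field, services still on v1 keep working untouched, and the version number in the message makes the drift visible instead of silent.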
Boundaries Must Be Explicit
Every service or system should have clear ownership and separation of concerns. In other words, make your boundaries explicit. If one team owns a database, let them own the full model; if another team needs that data, have them talk to that service rather than sharing the database directly.
Explicit boundaries are like borders on a map: they tell you who is responsible for what. A software boundary might be an API endpoint, a message schema, or a database schema that no one outside a team touches.
A great illustration comes from software tenets:
“By keeping the surface area of a service as small as possible, the interaction and communication between development organizations is reduced.” (blog.ploeh.dk)
In practical terms, this means each service should expose only the data and behavior it needs to. If you treat services like black boxes, with only well-defined inputs and outputs, then crossing those boundaries becomes a deliberate act, not a hidden implementation detail.
For a quick analogy, think of language translators between countries. Suppose a legacy system “speaks Xhosa” and a new system “speaks English.” You put an adapter/translator in between to convert messages. The adapter ensures each side only hears its own language, and the rules of translation are clear. In our case, a service boundary might have a JSON->XML translator or vice versa. (If we let them speak loosely with each other, messages could arrive garbled.) Below is a simple diagram of this idea:
+-----------------+         +--------------------+         +------------------+
|  Legacy System  |-------->|      Adapter/      |-------->|  Modern System   |
| (Speaks Xhosa)  |         |  Translator (OXO)  |         | (Speaks English) |
+-----------------+         +--------------------+         +------------------+
In short, treat each boundary as a clear interface. This makes ownership clear: one team owns the Xhosa side, another owns the English side, and you avoid the chaos that comes when two teams unknowingly step on each other’s toes. It also ties back to observability: if every request across a boundary must pass through a defined point, you can log it, trace it, and handle failures explicitly at that point.
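The adapter in the diagram boils down to a pure translation function. Here is a sketch that converts a hypothetical fixed-width legacy record into the modern JSON shape; the field positions and names are invented for illustration:

```python
# Adapter sketch: translate a fixed-width legacy record into the modern
# JSON message. The record layout (10-char account, 8-digit amount in
# cents, 3-char currency code) is a hypothetical example.
import json

def translate(legacy_record: str) -> str:
    """Legacy fixed-width record -> modern JSON message."""
    account = legacy_record[0:10].strip()
    amount  = int(legacy_record[10:18])     # zero-padded cents
    code    = legacy_record[18:21].strip()
    modern = {"account_id": account,
              "amount_cents": amount,
              "currency": code}
    return json.dumps(modern)

record = "ACC0000123" + "00002500" + "ZAR"
print(translate(record))
```

Each side keeps speaking its own language, and the translation rules live in exactly one place, which is what makes the boundary explicit and ownable.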
Failure Is Normal — Design for It
No matter how carefully you design your integration, failures will happen. Networks drop packets, downstream services go offline, data can be corrupt, and even cosmic rays might flip a bit. The question is not if you’ll see errors or slowdowns in production, but when. The good news is that recognizing failure as normal makes your systems stronger.
In practice, this means building automatic retry logic, idempotency, and dead-letter queues (DLQs) into your integrations. If processing a message fails, rather than losing it or hanging forever, you retry a few times. If it keeps failing, send it to a DLQ for later inspection. This pattern is like having a safety net: any unprocessable message is quarantined instead of crashing the pipeline. Also make your processing idempotent: design messages so that re-processing them (e.g. after a retry) doesn’t duplicate work.
A helpful metaphor: imagine sending letters with a postal service. If a delivery fails, the mailman doesn’t just disappear; the letter might bounce back or go to the post office holding area. A DLQ is like that holding area. Retries are like trying a second or third delivery attempt. Idempotency is like writing clear instructions on the letter so that if the letter arrives twice, the mailman only delivers one parcel.
In a concrete example, Kafka experts remind us:
No matter how well you design your system, failures will happen.
So assume errors will occur at integration points. Design each boundary with failure in mind. Use queues that preserve messages, add a dead-letter queue to catch poison messages, and tag each message with a correlation ID so you can trace it if things go wrong.
Producer --> [ Queue ] --> [ Consumer ]
                 |               |
                 v               v
              [ DLQ ]   [ Logs & Metrics ]
This simple flow shows the idea: producers put messages on a queue, consumers process them, and failed messages are rerouted to a DLQ instead of disappearing. Meanwhile, everything writes to logs and metrics so you can spot issues. Planning for failure like this makes your integration far more resilient.
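The retry, DLQ, and idempotency ideas described above can be sketched in a few lines. The in-memory lists, the retry count, and the message shapes below are illustrative stand-ins for a real broker:

```python
# Sketch of retry-with-DLQ plus idempotent processing. The in-memory
# DLQ, the processed-ID set, and the retry count are all illustrative
# stand-ins for a real message broker.
processed_ids = set()          # idempotency: remember handled message IDs
dlq = []                       # dead-letter queue for poison messages

def handle(message: dict) -> None:
    if message["id"] in processed_ids:
        return                 # duplicate delivery: safely ignored
    if message.get("poison"):
        raise ValueError("unprocessable message")
    processed_ids.add(message["id"])

def consume(message: dict, max_retries: int = 3) -> None:
    for attempt in range(max_retries):
        try:
            handle(message)
            return
        except ValueError:
            continue           # maybe transient: try again
    dlq.append(message)        # still failing: quarantine, don't drop

consume({"id": "m1"})
consume({"id": "m1"})                  # redelivery is a harmless no-op
consume({"id": "m2", "poison": True})  # lands in the DLQ after 3 tries
assert dlq == [{"id": "m2", "poison": True}]
```

Note how the duplicate delivery of `m1` does no harm: idempotency is what makes the retries safe in the first place.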
Observability Is Part of the Contract
You can think of observability (logging, metrics, tracing) as an extension of your contract. A contract doesn’t just say what data to exchange, it should also include how you’ll monitor that data in flight. For example, every message or request should carry a correlation ID so logs across services can be linked back together. Services should agree on structured log formats or error schemas so that when something goes wrong, operators don’t have to parse free-form text.
Structured logging is key. If every service follows a standard (say, JSON logs with fields for request IDs and error codes), then you can write automated tools to alert or cross-reference. Without that, you’re manually chasing clues. One API designer noted: the service that restored the system quickly during an outage did so only because “every log pointed to the same error schema”. Consistent logs were their lifeline.
In practice, include logging and tracing in your integration contract. Enforce that every message handler logs its progress (including timestamps and IDs). Use distributed tracing tools so you see the full path of a request through all services. In other words, make observability a non-negotiable part of your interface design. As one architect put it, “Logs, metrics, and alerts form the foundation for understanding how contracts behave in production and how they respond under stress. Include distributed tracing to track requests across services, and correlation IDs to link related events.” When teams treat observability as part of the system contract, outages become easier to diagnose and resolve.
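As a sketch, structured logging with correlation IDs might look like the following; the log field names are illustrative, not a standard schema:

```python
# Structured, correlated logging sketch: every log line is JSON carrying
# a correlation_id, so events can be joined across services afterwards.
# The field names here are illustrative, not a standard.
import json
import uuid

def log(correlation_id: str, service: str, event: str, **fields) -> str:
    line = {"correlation_id": correlation_id,
            "service": service,
            "event": event,
            **fields}
    return json.dumps(line)

cid = str(uuid.uuid4())   # minted once, then passed along with the request
print(log(cid, "orders", "received", order_id="A-17"))
print(log(cid, "payments", "charged", amount_cents=2500))
# grep the logs for this correlation_id and you see the whole request path
```

Because every line shares the same machine-readable shape, alerting and cross-referencing can be automated instead of done by eyeballing free-form text.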
Strictness Creates Stability
It may seem counterintuitive, but the more rigid and explicit a system is, the more reliable it can be. Legacy systems often survive for decades precisely because they forced strict rules on data and processes. Take COBOL in banking: institutions still rely on COBOL largely “because of the stability, speed, and scale it provides for high-volume transaction processing.” COBOL programs handle ATM withdrawals and credit card transactions with “proven reliability” over decades. Those systems weren’t designed to be flexible or easy to change; they were built with explicit record layouts and fixed logic. Once running, they could be trusted to do the same thing over and over without surprises.
In contrast, overly permissive or “magical” systems tend to be fragile. If one part of your integration quietly adds an undocumented feature, another part may break. If you say “we’ll figure it out on the fly” at a boundary, you’re setting yourself up for ambiguity. Strictness, such as disallowing unvalidated fields, rejecting unknown values, or requiring version numbers, may feel stifling at first, but it creates a stable base. It’s like driving on a road with solid guardrails: you might go a little slower, but you won’t veer off the cliff. By enforcing clear rules and catching violations immediately, your system fails fast rather than accumulating silent inconsistencies.
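A fail-fast boundary can be as simple as rejecting unknown fields outright instead of silently accepting them. This sketch uses invented field names:

```python
# Fail-fast parsing sketch: unknown fields are rejected at the boundary
# rather than silently accepted. The allowed field names are illustrative.
ALLOWED = {"account_id", "amount_cents", "currency"}

def strict_parse(message: dict) -> dict:
    unknown = set(message) - ALLOWED
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    return message

strict_parse({"account_id": "A1", "amount_cents": 100, "currency": "ZAR"})
try:
    strict_parse({"account_id": "A1", "amout_cents": 100})  # typo'd field
except ValueError as e:
    print(e)   # caught here, not three services downstream
```

The typo'd field name is caught at the door; in a permissive system it would have been accepted and surfaced weeks later as a mysterious missing amount.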
Integration Encourages Incremental Evolution
Integration enables incremental replacement. Use the Strangler Fig pattern: introduce a façade or adapter that routes traffic to either the legacy system or the new implementation, then migrate functionality piecewise. Version your APIs, allow multiple schema versions where necessary, and employ canary or blue/green deployments to reduce risk. Small, observed changes are safer than a single big-bang rewrite.
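A strangler-fig façade can be sketched as a routing table that is flipped one operation at a time. Everything below (function names, the migration set, the return shapes) is hypothetical:

```python
# Strangler-fig façade sketch: each operation is routed to either the
# legacy or the new implementation via a migration table, so cutover
# happens one operation at a time. All names here are hypothetical.

def legacy_get_balance(acct): return {"balance": 100, "source": "legacy"}
def new_get_balance(acct):    return {"balance": 100, "source": "new"}

MIGRATED = {"get_balance"}      # flip operations here as they migrate

HANDLERS = {
    "get_balance": (legacy_get_balance, new_get_balance),
}

def facade(operation: str, *args):
    legacy, modern = HANDLERS[operation]
    handler = modern if operation in MIGRATED else legacy
    return handler(*args)

assert facade("get_balance", "A1")["source"] == "new"
MIGRATED.clear()                # roll back instantly if the canary fails
assert facade("get_balance", "A1")["source"] == "legacy"
```

Callers only ever talk to the façade, so both migration and rollback are a one-line change on the routing table rather than a coordinated redeploy.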
Practical Integration Checklist
- Define explicit contracts. Use JSON Schemas, OpenAPI/WSDL, or similar to fix message formats. Treat these as shared agreements, not afterthoughts.
- Enforce clear boundaries. Make ownership obvious: one service owns its data and interface. Use adapters/translators for mismatched formats (see diagram above). Reduce shared abstractions across teams.
- Validate early and often. Reject bad or unknown data at the edges. The more you validate inputs on arrival, the less downstream code will break.
- Plan for failure. Assume messages can’t always be processed successfully. Implement retries, idempotency, and dead-letter queues instead of dropping errors.
- Invest in observability. Include logging, metrics, and tracing in your interface design. Pass correlation IDs with requests, and use consistent log schemas so failures can be traced end-to-end.
- Be strict to stay stable. Avoid overly flexible schemas. Don’t change or remove fields without versioning. Remember, legacy systems often persist because they were so rigid. Embrace that discipline where it counts.
- Evolve incrementally. Use canaries, blue/green, or strangler-style façades to replace functionality bit by bit. Keep old and new systems running in parallel during transition, making rollbacks safe.
By treating integration as a design tool rather than a last-minute chore, teams can catch and fix deep design issues early. The key is precise communication (contracts), clear ownership (boundaries), and a mindset of resilience (failure + observability). Use the checklist above to guide your next integration project, and you’ll avoid many of the nasty surprises that would otherwise surface only in production.
Curious how real-world integration works in practice? Check out this blog post to explore practical patterns and examples.
References:
- Crumlish, C. (2025) “Integration Reveals All: How Building File Analysis Exposed Hidden Architecture”, Building Piper Morgan (Medium). Available at: https://medium.com/building-piper-morgan/integration-reveals-all-how-building-file-analysis-exposed-hidden-architecture-3d696dbf2803 (Accessed: 10 Dec. 2025)
- Superblocks Team (2025) “SOAP vs REST: 9 Key Differences & When to Use Each in 2025”, Superblocks Blog. Available at: https://www.superblocks.com/blog/soap-vs-rest (Accessed: 10 Dec. 2025)
- Nguyen, S. (2023) “What is WSDL in SOAP | A Comprehensive Guide”, DreamFactory Blog. Available at: https://blog.dreamfactory.com/what-is-wsdl-in-soap-a-comprehensive-guide (Accessed: 12 Dec. 2025)
- SmartBear (2023) “The Benefits of OpenAPI-Driven API Development”, Swagger API Strategy Blog. Available at: https://swagger.io/blog/api-strategy/benefits-of-openapi-api-development (Accessed: 12 Dec. 2025)
- Alexander, D. (2023) “Luxoft’s Mainframe Engineering Services help banks master COBOL core banking”, Luxoft Blog. Available at: https://www.luxoft.com/blog/why-banks-still-rely-on-cobol-driven-mainframe-systems (Accessed: 12 Dec. 2025)
- Fowler, M. (2024) “Strangler Fig”, martinfowler.com (ThoughtWorks). Available at: https://martinfowler.com/bliki/StranglerFigApplication.html (Accessed: 12 Dec. 2025)
- Mastercard (2024) “ISO 20022 and JSON: balancing standardisation and flexibility in APIs”, Mastercard B2B Exchange. Available at: https://b2b.mastercard.com/api/management/article/iso-20022-and-json-balancing-standardisation-and-flexibility-in-apis (Accessed: 13 Dec. 2025)
— Anonymous