The Infrastructure Configuration Nightmare: When Separation Creates Slowdown

Why centralized infrastructure repositories might be killing your deployment velocity


A simple feature story

It’s Monday morning, coffee is steaming. You’ve just finished implementing a great new feature that your users have been asking for. The code is clean, tested, and ready to ship. It shouldn’t take longer than 30 minutes to deploy to production, right?

A week later, you’re still trying to get it deployed.

Sound familiar?

Here’s what actually happened: Your feature needs a new Kafka topic. Simple enough, right? But in your organization, Kafka topics aren’t created by your service. They’re managed in a centralized kafka-infrastructure repository. So you:

  1. Context switch to the kafka-infra repo
  2. Find the YAML file for your environment
  3. Add your topic configuration
  4. Create a PR
  5. Wait for the platform team to review it
  6. Wait for the scheduled deployment window
  7. Create a JIRA ticket for production deployment
  8. Trigger the deployment
  9. Finally return to your actual feature

Oh, and your feature also needs a new database table. Time to repeat this process in the database-migrations repository. And don’t forget the secrets in the secrets-management repository.
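
For illustration, step 3 usually means appending your topic to a shared, environment-wide file in that repository. A minimal sketch, assuming a hypothetical layout and field names:

# kafka-infrastructure/environments/prod/topics.yaml
topics:
  # ...hundreds of entries owned by other teams...
  - name: my-service.feature-events
    partitions: 6
    replication-factor: 3
    owner: my-team   # now wait for the review and the next deployment window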

What if I told you there’s a better way that’s just as secure but dramatically faster?

The Current State: A Maze of Repositories


In large organizations, infrastructure is typically managed by specialized teams. Naturally, each team wants to control the changes made in its part of the system. The result is a set of separate repositories for configuration:

Each repository may have its own:

The idea is that this separation provides:

But in practice, it creates something else entirely: a bottleneck that slows everyone down.

Multi-Repository Fragmentation


When you need to deploy a feature that touches multiple infrastructure components, you’re faced with a coordination nightmare.
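
For a single feature, the work ends up scattered across repositories like this (names are illustrative):

kafka-infrastructure/    # one PR for the new topic
database-migrations/     # another PR for the new table
secrets-management/      # yet another PR for the new credentials
my-service/              # and finally the feature code itself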

Context Switching Overhead

The Deployment Coordination Nightmare

This is where things get truly painful. Each centralized repository doesn’t just have a different codebase - it has a completely different deployment mechanism:

As a developer, you need to maintain a mental model of:

This knowledge is typically scattered across:


What was estimated as a 30-minute deployment becomes a two-week battle.

Cross-Repository Dependencies


Some infrastructure repositories may have dependencies on each other, creating a complex chain of prerequisites that must be satisfied in a specific order:

So you can’t just create all these PRs in parallel and hope for the best. There’s a strict ordering for each scenario.
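
A hypothetical example of such an ordering (the exact dependencies differ per organization):

required-order:
  1. secrets-management      # the new credentials must exist first
  2. database-migrations     # the migration job reads those credentials
  3. kafka-infrastructure    # topic ACLs reference the same service account
  4. my-service              # only now can the service itself be deployed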

The Knowledge Problem Multiplied

Not only do you need to know:

You also need to know:

Version Drift & Configuration Mismatches


When infrastructure and code are deployed separately, version consistency becomes a constant struggle.

The mismatch scenario:

Your service v1.2.3 is running in production. But which infrastructure version is it using?

Nobody knows for sure. 🤷
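
In practice, the “version” of the running system is a cross product of independently deployed pieces, something like this (purely illustrative):

my-service:             v1.2.3    # deployed yesterday
kafka-infrastructure:   ???       # last deployed... two weeks ago?
database-migrations:    ???       # partially applied after a failed run
secrets-management:     ???       # rotated by someone, at some point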

The consequences:

When things go wrong, you’re faced with partial states. Kafka topic deployed successfully but database migration failed halfway. Is the service working? Half-working? Should you roll back just the database? Just the service? Both?

The Knowledge Burden


All these problems compound into a massive knowledge burden that teams must carry.

What developers need to master:

and many, many more…

The onboarding nightmare:

Each new developer needs a significant amount of time to get used to all these processes, and even then a full bookmark folder of wiki pages is still required.

Blast Radius Problem


There’s another insidious problem with centralized infrastructure repositories: one person’s mistake blocks everyone.

The Tragedy of the Commons

Picture this: Your organization has 50 teams, all deploying services. They all share the same centralized repositories. On a typical Monday:

9 AM:

10 AM - The scheduled deployment fails ❌ because of an error in Team X’s change

Your change was perfectly fine, but it’s blocked anyway.

The Debugging Nightmare

When the deployment fails, someone needs to figure out why:

  1. Check deployment logs (which system? Octopus? Jenkins?)
  2. Review all 15 PRs merged since the last successful deployment to identify which one caused the failure
  3. Contact the team responsible
  4. Wait for them to fix it
  5. Re-run the deployment
  6. Hope nothing else broke in the meantime 🤞

Meanwhile, your service launch is delayed. Your stakeholders are asking questions. Your team is blocked.

The Validation Gap

“Why don’t you just have better pre-merge validation?” you might ask.

Great question. The problem is that comprehensive validation is extremely difficult in centralized repositories:

Why Pre-Merge Validation Fails:

  1. Environment-specific issues - Config works in DEV, fails in PROD
  2. Timing-based conflicts - Two teams modify the same resource
  3. Complex interactions - Change A + Change B = disaster
  4. Schema validation ≠ Runtime validation - Syntax is correct, semantics are wrong
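
Point 4 is the sneakiest one: a manifest can be perfectly valid against the schema and still blow up at deploy time. A hypothetical example:

# Passes schema validation everywhere...
- name: my-service.feature-events
  partitions: 6
  replication-factor: 3   # ...but the DEV cluster has only 1 broker,
                          # so topic creation fails at runtime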

The broken window effect is real. Once the deployment pipeline is seen as unreliable, the entire development process slows down.

Repository Bloat Problem


As your organization grows, centralized infrastructure repositories become increasingly unwieldy. What started as a simple, organized approach transforms into a performance nightmare.

The Performance Death Spiral

Every operation gets slower.

From cloning the repository, through running the necessary local checks, to the deployment itself, the time of every step accumulates, making each context switch pricier and pricier.

Remember the failed deployment caused by Team X? Now imagine the feedback loop after each fix stretching to dozens of minutes.

You catch yourself at the end of the day having achieved nothing, trying to understand how that happened. It can be genuinely depressing (even more so when you have to explain at the morning standup what you did the day before).

The Way Forward


So what’s the alternative? In-codebase infrastructure manifests.

Instead of scattering configuration across multiple repositories, keep it next to the code that uses it:

my-service/
├── src/
├── tests/
└── infrastructure/
    ├── database.yaml
    ├── kafka-topics.yaml
    └── secrets.yaml
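
Each manifest describes only this service’s resources. A minimal sketch of what infrastructure/kafka-topics.yaml could contain (exact fields depend on your tooling):

topics:
  - name: my-service.feature-events
    partitions: 6
    replication-factor: 3
    retention.ms: 604800000   # 7 days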

Example: Single deployment pipeline

deploy-to-prod:
  steps:
    1. Validate manifests
    2. Deploy schema
    3. Create Kafka topic
    4. Create secrets
    5. Run database migration
    6. Deploy service
    7. Run smoke tests

  rollback: atomic (all or nothing)
  approvers: dynamic (based on what changed)

Everything in order. Everything versioned. Everything atomic.


What changes:

“But what about security?”

Instead of separate repositories where owner teams review each PR, approvals can be routed dynamically:

Same gates. Same people reviewing. Better context (they see code + infrastructure together).
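
One hedged way to wire this up is path-based ownership, for example a GitHub CODEOWNERS file in the service repository (team handles are hypothetical):

# .github/CODEOWNERS
/infrastructure/kafka-topics.yaml   @org/platform-kafka
/infrastructure/database.yaml       @org/dba-team
/infrastructure/secrets.yaml        @org/security-team

A PR that only touches application code requests no infrastructure reviewers; a PR that adds a topic automatically pulls in the Kafka owners.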

“But what about shared infrastructure?”

Keep truly shared infrastructure (VPCs, clusters, policies) centralized. Move service-specific infrastructure (your Kafka topics, your database migrations, secrets) into service repos.
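
A possible split, purely as an illustration:

platform-infrastructure/       # shared, stays centralized
├── vpc/
├── kubernetes-clusters/
└── org-wide-policies/

my-service/infrastructure/     # service-specific, lives next to the code
├── kafka-topics.yaml
├── database.yaml
└── secrets.yaml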

It’s not all-or-nothing. It’s a spectrum.

Final Thoughts

If deploying a simple feature takes two weeks in your organization, it’s not because you need that much security review. It’s because you have process debt.

Centralized infrastructure sounds natural: it keeps configuration close to the teams that manage it.

This works perfectly in an ideal world where:

But we don’t live in that world.


The world has changed. Modern tools (Kubernetes, Terraform, GitOps) enable atomic deployments. Modern organizations embrace service ownership. Modern security happens through automated validation and dynamic approvals, not artificial separation.

The goal isn’t to eliminate oversight. The goal is to eliminate waiting.

Your developers shouldn’t need tribal knowledge just to create a topic. Your platform team shouldn’t be overwhelmed with trivial reviews. Your organization shouldn’t be waiting weeks to ship features.

The infrastructure configuration nightmare is solvable.

Maybe it’s time to wake up.
