Building an AI-Powered SDLC Agent Platform — The Honest Build Log

I'm building an AI agent platform for the software development lifecycle. Not a chatbot that writes code — I have enough of those. This is something different. Specialized agents that collaborate, review each other's work, and don't let bad code through without a fight.

This is the honest build log. Real bugs, real frustrations, real "why is it like this?!" moments. No polished announcements. Just the messy truth of building something from scratch while running on coffee and stubbornness.


The idea

What if your dev team's code review, architecture review, and testing could be handled by AI agents — but with real quality gates? Not "AI generates code and you hope it works." More like "AI generates code, other AI agents tear it apart, and a human gets the final say."

That's what I'm building. Agents that have opinions. Agents that reject bad architecture. Agents that say "nope, try again" when the code doesn't meet standards. And when the agents can't figure it out — the pipeline stops and asks a human.
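Here is roughly what that gate could look like as a sketch. Every name in it (`Verdict`, `run_gate`, `max_attempts`, the status strings) is made up for illustration and is not the platform's actual interface; the toy generator and reviewer stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    approved: bool
    reason: str = ""


# Hypothetical signatures: a generator sees the prompt plus the last round of
# reviewer feedback; a reviewer sees the draft and returns a Verdict.
Generator = Callable[[str, List[str]], str]
Reviewer = Callable[[str], Verdict]


def run_gate(prompt: str, generate: Generator, reviewers: List[Reviewer],
             max_attempts: int = 3) -> dict:
    """Regenerate until every reviewer approves, or stop and ask a human."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    feedback: List[str] = []
    draft = ""
    for attempt in range(1, max_attempts + 1):
        draft = generate(prompt, feedback)
        verdicts = [review(draft) for review in reviewers]
        rejections = [v.reason for v in verdicts if not v.approved]
        if not rejections:
            return {"status": "passed_gates", "draft": draft, "attempts": attempt}
        feedback = rejections  # the next attempt sees why this one was rejected
    # The agents could not converge: stop the pipeline and hand over to a person.
    return {"status": "escalated_to_human", "draft": draft,
            "attempts": max_attempts, "open_issues": feedback}


def toy_generate(prompt: str, feedback: List[str]) -> str:
    # Stand-in for a model call: it only produces real code once it has feedback.
    return "def add(a, b):\n    return a + b\n" if feedback else "TODO"


def toy_reviewer(draft: str) -> Verdict:
    if "TODO" in draft:
        return Verdict(False, "draft still contains TODO")
    return Verdict(True)


result = run_gate("write add()", toy_generate, [toy_reviewer])
print(result["status"], result["attempts"])
```

The design choice worth noting is that the loop has a hard ceiling. Passing the gates does not skip the human; it only means the draft is worth a human's time.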

The journey so far

The first few days were humbling. I blamed the AI for everything: "it's so dumb," "it doesn't follow instructions," "the model is broken." Classic developer move: blame the tools.

Turns out, I was the one with the problem. Most of the "AI flakiness" was my own infrastructure breaking things silently. Once I fixed the plumbing, the AI was actually pretty good.

Then came the real challenges — agents disagreeing with each other, retry loops burning tokens on the same errors, and one agent approving a 22-line file as "complete" when the prompt asked for a full system. Every bug taught me something about building multi-agent systems that I couldn't have learned from a tutorial.
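One way to stop a retry loop from paying for the same error twice is to fingerprint each failure and bail out when a fingerprint repeats. This is a sketch under assumptions, not the platform's code: `error_fingerprint`, `retry_until_stuck`, and the status strings are hypothetical, and normalizing digits is just one simple heuristic for deciding two errors are "the same".

```python
import hashlib
import re
from typing import Callable, Set, Tuple

# Hypothetical attempt signature: takes the attempt number,
# returns (succeeded, error_message).
Attempt = Callable[[int], Tuple[bool, str]]


def error_fingerprint(message: str) -> str:
    # Collapse digits so "line 12" and "line 40" count as the same failure.
    normalized = re.sub(r"\d+", "N", message.strip().lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


def retry_until_stuck(attempt: Attempt, max_attempts: int = 5) -> dict:
    """Retry, but stop as soon as an error repeats instead of burning tokens."""
    seen: Set[str] = set()
    for n in range(1, max_attempts + 1):
        ok, error = attempt(n)
        if ok:
            return {"status": "ok", "attempts": n}
        fingerprint = error_fingerprint(error)
        if fingerprint in seen:
            # Same failure as an earlier attempt: another retry is unlikely to help.
            return {"status": "stuck", "attempts": n, "error": error}
        seen.add(fingerprint)
    return {"status": "exhausted", "attempts": max_attempts}


def always_same_error(n: int) -> Tuple[bool, str]:
    # Toy attempt that fails the same way every time, only the line number moves.
    return False, f"NameError: name 'db' is not defined (line {n * 10})"


outcome = retry_until_stuck(always_same_error)
print(outcome["status"], outcome["attempts"])
```

A "stuck" result is a natural trigger for escalating to a human rather than trying again.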

Biggest lesson so far: The hardest bugs aren't in the AI. They're in the gaps between agents — the implicit assumptions, the missing contracts, the data that one agent expects but another never sends. Fix those, and the AI part works surprisingly well.
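To make "missing contracts" concrete, here is a minimal sketch of an explicit handoff between two agents. The payload shape (`DesignHandoff` and its fields) is hypothetical and chosen only for illustration; the point is that a missing field fails loudly at the boundary instead of silently downstream.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class ContractError(ValueError):
    """Raised when one agent's output does not satisfy the next agent's input."""


@dataclass(frozen=True)
class DesignHandoff:
    # Hypothetical payload an architecture agent passes to a code generator.
    task_id: str
    requirements: List[str]
    files_to_create: List[str]
    constraints: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        problems = []
        if not self.task_id.strip():
            problems.append("task_id is empty")
        if not self.requirements:
            problems.append("requirements is empty")
        if not self.files_to_create:
            problems.append("files_to_create is empty")
        if problems:
            raise ContractError("; ".join(problems))


REQUIRED_FIELDS = ("task_id", "requirements", "files_to_create")


def parse_handoff(raw: dict) -> DesignHandoff:
    """Validate a raw dict from the upstream agent before anyone consumes it."""
    missing = [key for key in REQUIRED_FIELDS if key not in raw]
    if missing:
        raise ContractError("missing fields: " + ", ".join(missing))
    return DesignHandoff(
        task_id=raw["task_id"],
        requirements=list(raw["requirements"]),
        files_to_create=list(raw["files_to_create"]),
        constraints=dict(raw.get("constraints", {})),
    )


try:
    parse_handoff({"task_id": "T-1"})
except ContractError as exc:
    handoff_error = str(exc)
    print(handoff_error)
```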

Where it stands

It works. End-to-end. From a text prompt to generated, tested, reviewed code — with a full audit trail of every decision. I've watched agents catch actual architecture violations, reject buggy designs, and escalate to human review when they hit something they can't solve.

There's still a lot to build. But the foundation is solid — and it's mine. My own IP, built from scratch.

Name reveal coming soon.

Configurable everything

One thing I got right early: nothing is hardcoded.

Each agent can use a different AI model. The routing agent gets a small, fast one. The code generator gets a bigger, smarter one. The reviewer can use something completely different for diversity of opinion. Swap models in config — no code changes.
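As a sketch of what "swap models in config" can mean in practice: the agent names, model names, and option keys below are placeholders invented for this example, not the platform's real configuration schema.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ModelSpec:
    model: str
    max_tokens: int = 2048
    temperature: float = 0.2


# Placeholder config. In practice this would live in a YAML/JSON file;
# the model names are not real model identifiers.
AGENT_MODELS = {
    "router": {"model": "small-fast-model", "max_tokens": 256},
    "code_generator": {"model": "large-capable-model", "max_tokens": 8192},
    "reviewer": {"model": "different-family-model", "temperature": 0.0},
}


def load_model_specs(raw: Dict[str, dict]) -> Dict[str, ModelSpec]:
    # Unknown option keys raise TypeError here, so config typos fail at startup.
    return {agent: ModelSpec(**options) for agent, options in raw.items()}


def model_for(agent: str, specs: Dict[str, ModelSpec]) -> ModelSpec:
    try:
        return specs[agent]
    except KeyError:
        raise KeyError(f"no model configured for agent '{agent}'") from None


specs = load_model_specs(AGENT_MODELS)
print(model_for("router", specs).model)
print(model_for("reviewer", specs).temperature)
```

Changing which model the reviewer uses is then a one-line edit to the config, and agent code only ever calls `model_for`.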

Three deployment environments:
Air-gapped — fully offline. Nothing leaves the machine. For classified or highly sensitive environments.
Private network — on-premise with access to internal resources. For enterprise environments handling sensitive data.
Public cloud — cloud-hosted models, full scalability. For commercial workloads.

Same codebase for all three. The environment is a config setting, not an architecture decision.
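A minimal sketch of treating the environment as a setting: one string selects a profile, and the rest of the code reads the profile instead of branching on where it runs. The profile fields (`model_backend`, `network_scope`) and their values are assumptions made for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Environment(str, Enum):
    AIR_GAPPED = "air_gapped"
    PRIVATE_NETWORK = "private_network"
    PUBLIC_CLOUD = "public_cloud"


@dataclass(frozen=True)
class EnvProfile:
    model_backend: str   # where models run (hypothetical labels)
    network_scope: str   # "none", "internal", or "internet"


PROFILES = {
    Environment.AIR_GAPPED: EnvProfile(model_backend="local", network_scope="none"),
    Environment.PRIVATE_NETWORK: EnvProfile(model_backend="on_prem", network_scope="internal"),
    Environment.PUBLIC_CLOUD: EnvProfile(model_backend="cloud_hosted", network_scope="internet"),
}


def resolve_profile(setting: str) -> EnvProfile:
    # Environment(...) raises ValueError on an unknown setting, so a typo in
    # config stops startup instead of silently picking a default.
    return PROFILES[Environment(setting.strip().lower())]


def may_call_external_api(profile: EnvProfile) -> bool:
    return profile.network_scope == "internet"


profile = resolve_profile("air_gapped")
print(profile.model_backend, may_call_external_api(profile))
```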

Want the detailed build log?

War stories, dated entries, screenshots, architecture details, and the bug AI couldn't fix.

Don't have access? Send me a quick email:

Request access →

Or just say hi — I like talking about engineering.