The dashboard that answered nothing
I spent three and a half years at Athenian building an engineering intelligence platform (Gartner-recognized, backed by Point Nine Capital). The kind of product where you plug in your GitHub, your Jira, your CI — and get dashboards showing cycle time, PR throughput, review bottlenecks. DORA metrics. The whole package.
I was Head of Engineering. I led the team that built the platform, and we relied on it every week.
Here’s the thing nobody talks about: even with our own product — the one we built, the one we understood inside out — I still spent hours every week in 1:1s, skip-levels, and cross-functional syncs trying to piece together why things were happening. The dashboards were trying to get there — alongside the cycle time number, we’d show a table of which pull requests were involved. But that just opened second-order questions: okay, why did that PR have a high cycle time? Was it because engineers got pulled into incident response? Because a cross-team dependency blocked the review? Because someone was silently burning out? Each answer led to another question, and the tools stopped being useful after the first level.
And decisions don’t wait for you to finish investigating. You need to act — sometimes with whatever incomplete picture you have. You talk to a few people, check a few tools, triangulate — and even then, the picture is never quite complete. You make the call knowing there are things you aren’t seeing.
The tool showed what. The “why” lived in my head, assembled manually from hours of conversations, Slack threads, and gut feeling.
The glimpse
Toward the end of my time at Athenian, GPT-3.5 and GPT-4 landed. The word “agent” wasn’t really a thing yet, but I prototyped one of the earliest LLM-based agent MVPs for reasoning about engineering organizations, built on the ReAct pattern with self-critique and connected to our metrics API. It could present the same data in a more conversational way — ask it a question, get an answer that wove together metrics you’d otherwise check in three different dashboards.
That part worked. What was hard — barely possible with those early models — was pushing it toward actual reasoning. Drilling deeper into why a velocity drop happened, following the chain across PRs, people, deploys. The models couldn’t reliably do that yet. And part of the problem was structural: the data was connected only through foreign keys and table joins — there was no explicit graph of relationships between teams, people, systems, and decisions. Even if the models had been smarter, the substrate wasn’t there to traverse.
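The substrate gap is easiest to see in code. Here is a toy sketch — every entity name below is invented for illustration — of the difference an explicit graph makes: once relationships between PRs, teams, incidents, and deploys are stored as edges rather than implied by foreign keys, a “why” question becomes a path traversal instead of a chain of ad-hoc joins.

```python
# Toy illustration: an explicit relationship graph that an agent could
# traverse to answer a "why" question. All entities here are invented.

from collections import deque

# Each edge is (source, relationship, target).
edges = [
    ("PR-142", "reviewed_by", "team-payments"),
    ("team-payments", "pulled_into", "incident-77"),
    ("incident-77", "caused_by", "deploy-2031"),
]

# Build an adjacency list from the edge triples.
graph = {}
for src, rel, dst in edges:
    graph.setdefault(src, []).append((rel, dst))

def explain(start, goal):
    """Breadth-first search for a chain of relationships linking two entities."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, rel, nxt)]))
    return None

# Why did PR-142 stall? Follow the chain down to a candidate root cause.
print(explain("PR-142", "deploy-2031"))
```

With only foreign keys, each hop in that chain is a separate join you have to know to write; with an explicit graph, the traversal itself is generic, which is what a reasoning agent needs.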
The glimpse was real, though. For the first time, something could reason across the data, not just display it. The team was exceptional — it was just a matter of time. But things went in a different direction. Athenian was acquired by the Linux Foundation, and the product evolved into something else. I’m grateful for that experience and that team. Between Athenian, the Linux Foundation, and the AI work that followed — each chapter sharpened a different piece of what I’m building today.
Same pattern, bigger scale
At the Linux Foundation, I directed the team behind LFX Insights — project health analytics, contributor intelligence, and ecosystem rankings across the Linux Foundation’s open-source portfolio — a ~20-person cross-functional team spread across three continents.
Different context. LFX Insights wasn’t for B2B companies managing their own engineering orgs — it was for tracking contributions and health metrics across thousands of open-source projects and foundations. But even in that world, the output was mostly numbers: contribution counts, rankings, activity trends. Not patterns. Not “why is this project losing momentum” or “what changed in this ecosystem that explains the shift.” The tools showed the what, not the why. Same gap, different domain.
And on the operational side, leading a larger team in a bigger organization made the problem dramatically worse. The time you spend assembling context doesn’t grow linearly with team size — it follows what Fred Brooks described in The Mythical Man-Month: the number of communication channels scales as N×(N-1)/2. A team of 5 has 10 channels. A team of 10 has 45. A team of 20 has 190. At Athenian with a smaller team, I could mostly hold context in my head — even then, I was tracking things across a lot of documents just to keep up. At the Linux Foundation, that stopped working entirely. I was spending more than half my day, every day, talking with the people reporting to me and with peers, just to understand what we needed to focus on and what decisions needed to be made.
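The arithmetic behind that growth is worth making concrete: headcount grows linearly, but the pairwise channels grow quadratically.

```python
# Brooks' observation from The Mythical Man-Month: every pair of people
# on an N-person team is a potential communication channel, so channels
# number N * (N - 1) / 2.

def communication_channels(team_size: int) -> int:
    """Number of pairwise communication channels for a team of N people."""
    return team_size * (team_size - 1) // 2

for n in (5, 10, 20):
    print(f"team of {n:2d} -> {communication_channels(n):3d} channels")
# team of  5 ->  10 channels
# team of 10 ->  45 channels
# team of 20 -> 190 channels
```

Quadrupling the team multiplies the channels by roughly nineteen — which is why context that fit in one head at Athenian stopped fitting at the Linux Foundation.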
And this context wasn’t just for me. Many of these decisions weren’t mine alone — they needed to be shared with peers, with the people I was reporting to, with stakeholders across functions. Everyone needed the same picture, but everyone was building their own version of it from fragments. The meetings weren’t inherently the problem — the problem was that we were spending most of the meeting time connecting dots rather than making decisions. If we’d had more clarity going in, those same conversations would have been twice as effective in half the time.
The detour that taught me what I didn’t want
I left and built three products in a year: a job board, a Reddit lead discovery tool, a content strategy platform. All shipped. All taught me something important.
They didn’t match my intellectual appetite. Not even close.
I could build them, but I didn’t care about them the way I care about organizations as systems. About entities and relationships. About how a decision in one team ripples across ten others over three months. That phase was necessary — it showed me, by contrast, what I actually want to work on.
The insight
Here’s what years of building and using engineering intelligence platforms taught me:
The right decisions require the full picture. And the full picture is almost always missing.
Not because the data doesn’t exist — it does, scattered across tools, teams, and layers of reporting. But because the complexity of a growing organization and the time it takes to connect the dots make it practically impossible to assemble the picture fast enough to act on it. By the time you’ve manually investigated enough to understand what’s really going on, the decision window has closed.
Every tool in this space today — at least on the engineering side, tools like Jellyfish, LinearB, Swarmia — answers questions you already know to ask. “What’s our cycle time?” “Who’s a review bottleneck?” “Are we hitting DORA benchmarks?” Fine questions. Also the easy ones.
The hard questions are the ones nobody asks. Not because leaders are lazy, but because they don’t have the picture yet; without it, they don’t even know what to ask. “Is there a systemic reason Team X keeps getting pulled into incidents?” “What’s the relationship between last month’s reorg and this month’s delivery slippage?” “Why is our fastest team actually our biggest bottleneck?”
Those questions don’t get typed into a chatbot. They don’t appear on a dashboard. They live in the gap between what your tools show and what your organization actually needs you to see.
Agents can answer questions — but only if you have the right substrate to reason over, and only if you know what to ask. The problem is that the right questions never get asked, and the substrate to answer them doesn’t exist.
And this isn’t just an engineering problem. Any function — product, operations, customer success — has the same structural blindness. The domain changes but the pattern doesn’t.
What I’m building
OneContinuum is the intelligence layer for decision-making.
Not a dashboard. Not an AI assistant that waits for your prompt. Something that does the reasoning your organization can’t do fast enough — connecting data across teams, tools, and time — and surfaces conclusions before you even know you need them. Not as individual fragments, but as a shared reality that every function can reason from together.
One thing I care deeply about: this doesn’t remove agency or responsibility from the leader. The system pushes you to the edge of clarity — and stops there. It surfaces conclusions with evidence and recommendations. But it never prescribes. The system illuminates. Humans decide. That boundary is what makes it trustworthy.
I’m building this while also working full-time as a staff AI engineer, building production AI agents daily. OneContinuum is the product I wished I’d had when I was in engineering leadership — the one that would have surfaced the why before I had to spend half my week figuring it out myself.
If this resonates
I’m looking for design partners — leaders who are tired of piecing together the “why” from scattered tools and weekly syncs. Leaders who want to build something better, together.
If that’s you, I’d love to talk. Not a sales pitch. A conversation between people who’ve felt the same gap.

Lou Marvin Caraig
Founder