How it Works

Why now

Context, not reasoning, is the bottleneck.

The frontier models solved the hard part. What's left is the part that was never in the training set: your repos, your calls, your decisions, your people.

2023 Reasoning

The ceiling was the model.

Models couldn't summarize a year of someone's work, let alone weigh it against expectations.

Model intelligence Limited

Context needed —

2023–2025 Threshold

The line crossed quietly.

Reasoning saturated the benchmarks. Nobody celebrated the exact moment. But the wall moved.

Model intelligence Climbing

Context needed Growing

2026 Context

The new wall is your org.

The models are finally capable of the reasoning. The constraint is what they know about your organization.

Model intelligence Solved

Context needed Massive

The stakes

People decisions have always mattered.
Now they move faster than ever.

Companies can no longer get by on manager intuition, scattered systems, and memory. Every week, leaders face high-context judgment calls like these:

Who should lead this project?

Who is in the wrong role?

Who is ready for more scope?

Who needs coaching?

Where is the organization quietly breaking down?

Who deserves a promotion this cycle?

Who is at risk of leaving?

Who has the most context on this?

Whose growth has stalled?

Which team is quietly carrying the org?

Who should mentor the new hires?

Who is overloaded right now?

These are all judgment calls. To be useful, AI needs more than intelligence; it needs a system of context.

The graph

Four layers of context

People provide the structure. Evidence supplies the facts. Standards define the rubric. Perspectives shape the interpretation. Without all four, you're left with an incomplete picture.

Layer 1

Who works here, and who works together

Roles, reporting lines, tenure, permissions: the structure. Plus the real organizational network: who actually collaborates, derived from how work flows over time. The graph also mirrors the permissions of every tool a company uses, filtering every query so nothing is visible to anyone who shouldn't see it.

Layer 2

What actually happened

The ground truth of what happened over time. Each employee's activity, from the work they produced to the impact it had. Pulled from the systems where work already lives: GitHub, Linear, Slack, Salesforce, docs, and more, connected to the right people and preserved in history.

Layer 3

What great looks like

Role expectations, team priorities, company values, operating principles. The rubric that turns raw activity into meaning, grounded in how this company actually measures impact. Without standards, evidence is just activity.

Layer 4

What informed humans think it means

The observations managers and peers carry in their heads, captured in low-friction ways, connected to the right people and events, and governed by the right permissions. Feedback, private notes, one-on-ones, and manager observations that never make it cleanly into a system of record.
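To make the four layers concrete, here is a minimal sketch of how they might fit together in one permission-aware structure. It is an illustration under stated assumptions, not Windmill's actual data model: every class, field, and name below (`ContextGraph`, `visible_to`, `context_for`, the people "ana" and "raj") is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Person:  # Layer 1: who works here
    id: str
    role: str
    manager_id: Optional[str] = None


@dataclass(frozen=True)
class Evidence:  # Layer 2: what actually happened
    person_id: str
    source: str  # hypothetical label for the originating tool
    summary: str
    visible_to: frozenset  # ids allowed to see it, mirrored from the source tool


@dataclass(frozen=True)
class Standard:  # Layer 3: what great looks like
    role: str
    expectation: str


@dataclass(frozen=True)
class Perspective:  # Layer 4: what informed humans think it means
    about_id: str
    author_id: str
    note: str
    visible_to: frozenset


@dataclass
class ContextGraph:
    people: dict = field(default_factory=dict)
    evidence: list = field(default_factory=list)
    standards: list = field(default_factory=list)
    perspectives: list = field(default_factory=list)

    def context_for(self, person_id: str, viewer_id: str) -> dict:
        """Assemble all four layers for one person, filtered by viewer."""
        person = self.people[person_id]
        return {
            "person": person,
            "evidence": [e for e in self.evidence
                         if e.person_id == person_id and viewer_id in e.visible_to],
            "standards": [s for s in self.standards if s.role == person.role],
            "perspectives": [p for p in self.perspectives
                             if p.about_id == person_id and viewer_id in p.visible_to],
        }


# Hypothetical data: one engineer, her manager, and one item per layer.
graph = ContextGraph()
graph.people["ana"] = Person("ana", "engineer", manager_id="raj")
graph.people["raj"] = Person("raj", "manager")
graph.evidence.append(
    Evidence("ana", "github", "Shipped billing migration", frozenset({"ana", "raj"})))
graph.standards.append(Standard("engineer", "Owns projects end to end"))
graph.perspectives.append(
    Perspective("ana", "raj", "Ready for more scope", frozenset({"raj"})))

manager_view = graph.context_for("ana", viewer_id="raj")
self_view = graph.context_for("ana", viewer_id="ana")
```

In this sketch the manager's private note appears in `manager_view` but not in `self_view`, while the shared evidence appears in both. The filter runs inside the query itself, which is one simple way to keep anything from being visible to someone who shouldn't see it.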

Why nothing else solves this

Every company has fragments. Almost none have the graph.

HRIS

Who someone is. Not what they worked on.

Productivity tools

What moved. Not what was expected.

Manager memory

The important observations, never written down.

Lattice · 15Five · Culture Amp

Forms at review time. No living model underneath.

Each of these holds a fragment of the picture. None of them assembles it. Existing performance tools collect inputs at review time. They don't build and maintain a living model of your people. And a generic LLM can synthesize whatever you hand it, but it can't build and maintain this context for you.

Ron Alexssen

Engineering Manager, Counterpart

What Windmill does

We build the graph. You make the decisions.

Windmill collects, structures, and maintains work context from the systems teams already use. It turns scattered data and human input into a continuously updated, permission-aware context graph.

Performance reviews are the first proof point, but not the endpoint.

Karim Atef Mansour

Director of Engineering, Retail Next

Time per review: 3 hours before, 6 minutes with Windmill.

Cycle time: 6 weeks before, 6 days with Windmill (an 86% reduction).

Response rate: 35% before, 80%+ with Windmill.

Share of reviews already written: 0% before, 90% with Windmill.
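The 86% cycle time reduction can be checked from the before and after figures. The sketch below assumes "6 weeks" means 42 calendar days; the source does not say whether weeks are counted in calendar days or working days, and the helper name `percent_reduction` is ours.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100


# Assumption: a week is 7 calendar days, so 6 weeks is 42 days.
cycle_reduction = percent_reduction(before=6 * 7, after=6)
print(round(cycle_reduction))  # prints 86
```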

Once the graph exists, many other workflows become possible: staffing decisions, promotion cases, coaching recommendations, succession planning, and organizational diagnostics. The review is one expression of the graph. It will not be the last.
