How We Work: From Discovery to Handoff

I don't run discovery calls that end with a 40-slide deck. I run a structured engagement that ends with a working system.

Every project moves through the same four phases. Skipping the process doesn't save time. It costs time.

Phase 1 — Discovery

Duration: 1-2 sessions, 60 minutes each.
Format: Video call or in person. Recorded with your permission (for my own notes only, never shared).
Cost: Free. No commitment.

I start by mapping your real situation, not the version you think I want to hear. Polished workflow presentations don't interest me. How the work actually looks does.

What I'm looking for

During Discovery I hunt for four things:

Time leaks. Where does the same kind of work repeat over and over, eating hours that could go somewhere else? Not "we're busy" — specific, recurring tasks with a pattern.

Decision patterns. Which decisions in your workflow are complex (requiring experience, judgment, context that shifts every time) and which are routine (following rules you could write down)? AI handles routine decisions well. It handles complex ones badly. Telling the two apart is the first step.

Data flow. How does information move through your organization? Where does it get stuck? Where does it get lost? Where do people manually copy data from one system into another? Those hand-off points are often the best automation candidates.

Tool friction. Which of your existing tools play well together, and which don't? A CRM with a decent API makes integration easy. A legacy system that only exports CSVs by hand makes it hard. I need to know this before I estimate anything.

What Discovery is NOT

Discovery is not a sales pitch. I won't show you an AI demo and wait for you to be impressed.

Discovery is not a "needs assessment" where I nod along to everything you say and then propose the most expensive option. If your needs don't match what I build, I'll tell you during Discovery, not after you've committed.

Discovery is not the start of a project. It's a decision point. You can walk away with a clear "no" and that's a perfectly valid outcome. A good Discovery that ends in "no" is worth more than a bad Discovery that ends in a project that never should have started.

What you bring

Access to real examples. Actual emails, actual reports, actual workflows. Not cleaned-up versions. The AI will hit your real data, with all its mess and inconsistency. I need to see that during Discovery, not during the build.

Discovery is free and carries no obligation to move on to Specification. If the project isn't a good fit, I'll tell you before that phase begins.

For the full guide to the Discovery process — how to prepare and what to expect — see The Consultation: why it matters.

Phase 2 — Specification

Duration: 3-5 working days after Discovery.
Deliverable: A written functional specification.
Your role: Review, ask questions, approve.

I write a functional spec before writing a single line of code. That step is mandatory, and the spec is the most important document in the whole project.

What the spec document looks like

A typical spec runs 5-15 pages depending on scope. It contains these sections:

Problem statement. What we're solving, in plain language. If your team reads this section and doesn't recognize the problem, I got it wrong.

Exact inputs and outputs. What goes into the system (data format, sources, frequency) and what comes out (reports, notifications, processed documents, API responses). No ambiguity. If the input is an email, I spell out what kind of email, from which source, in what format.

Processing logic. What the AI does with the inputs. Step by step, in order. Including what happens when the AI hits something it can't handle (the fallback path matters as much as the main path).

Edge cases. The things that turn a simple process complex. I spotted them during Discovery, and here they're documented with specific handling instructions. "Client sends an email in a language we don't support" — what happens? "Invoice is missing a required field" — what happens? Every edge case gets an answer.

Scope boundary. What the system does NOT do, written out plainly. This section prevents 90% of the "but I thought it would also..." conversations during the build. If it's not in the spec, it's not in scope.

Integration points. Which tools the system connects to, how, and what happens when those connections fail. API specs, authentication requirements, rate limits — all documented.

Acceptance criteria. How you'll know the build is correct. Specific, measurable tests the system must pass before I call it done. That's the standard both of us will hold to.
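
As a rough illustration, the sections above can be pictured as one record with a field per section. This is a minimal Python sketch, not a description of any real tooling: every name in it (FunctionalSpec, EdgeCase, open_questions) is hypothetical, and the sample values are made up for the example. It shows how an edge case without an answer could be flagged before the spec is approved.

```python
from dataclasses import dataclass, field


@dataclass
class EdgeCase:
    situation: str  # e.g. "Invoice is missing a required field"
    handling: str   # what the system does; an empty string means unanswered


@dataclass
class FunctionalSpec:
    # One field per spec section (hypothetical names).
    problem_statement: str
    inputs: list[str]
    outputs: list[str]
    processing_steps: list[str]
    fallback: str
    edge_cases: list[EdgeCase] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    integrations: list[str] = field(default_factory=list)
    acceptance_criteria: list[str] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Return the gaps that should be closed before approval."""
        gaps = []
        if not self.problem_statement.strip():
            gaps.append("problem statement is empty")
        if not self.fallback.strip():
            gaps.append("no fallback path defined")
        if not self.acceptance_criteria:
            gaps.append("no acceptance criteria")
        for case in self.edge_cases:
            if not case.handling.strip():
                gaps.append(f"edge case without handling: {case.situation}")
        return gaps


spec = FunctionalSpec(
    problem_statement="Sort incoming email into 5 buckets.",
    inputs=["plain-text or HTML email from the shared inbox"],
    outputs=["one bucket label per email"],
    processing_steps=["parse email", "classify", "write label"],
    fallback="route to a human when the classifier is unsure",
    edge_cases=[
        EdgeCase("Email in an unsupported language", "route to a human"),
        EdgeCase("Invoice is missing a required field", ""),
    ],
    out_of_scope=["drafting replies"],
    acceptance_criteria=["agreed sample set is labeled correctly"],
)
print(spec.open_questions())
```

An empty result from open_questions() would correspond to the rule that nobody starts coding with open questions.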

Why written specs prevent project failures

Most AI projects fail because the scope was never written down. The two sides had different pictures in their heads. The vendor thought "email classification" meant sorting into 5 buckets. The client thought it meant reading, summarizing, and routing to the right person with a draft reply. Nobody was wrong. Nobody wrote down what they meant.

The spec is the contract between us. Not a legal contract (though it can be referenced in one), but an operational one. It's the single source of truth for what we're building. When questions come up during the build — and they always do — we go back to the spec.

If it's not in the spec, it's not in scope. If you want it in scope, we add it to the spec (with an updated timeline and cost). That's not bureaucracy — it's how both sides know where they stand.

You review and approve the spec. If we disagree on something, we resolve it in writing before the build begins. Nobody starts coding with open questions.

Phase 3 — Build

Duration: depends on the install.
openclaw: up to 5 working days
hermes: 7-12 working days
custom: 3-7 days
hybrid: 2-4 weeks
Your role: Test on real data, give feedback, report problems.

What the build looks like in practice

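A typical cycle for an openclaw (up to 5 days) or hermes (7-12 days) install. The day numbers follow a 10-day run; shorter installs go through the same steps on a compressed schedule: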

Days 1-3: Core system built and configured. AI agent integrated, basic processing logic implemented.
Days 4-5: Internal testing against your sample data from Discovery. I catch the obvious problems before you see anything.
Day 6: You get access to the first working version. Not a demo: a real system wired to your data.
Days 6-8: You test on real work. Not hypothetical scenarios: your actual documents, your actual emails, your actual processes.
Days 9-10: I fix what broke, adjust what felt wrong, improve handling of the edge cases you turned up. We test again until the acceptance criteria from the spec are met.

A custom install (Claude Code / Codex CLI wired into your stack) moves faster: 3-7 days, because the environment is already there. A hybrid is a multi-layer system and runs 2-4 weeks.

This cycle takes as many iterations as it needs. I don't have a fixed number of revision rounds. I have acceptance criteria, and I iterate until they're met.

Testing on real data

"Testing on real data" means exactly what it says. I don't build a demo with hand-picked examples and call it done. I connect the system to your actual data sources and let it process real inputs.

That's where the surprises show up. The email format that worked perfectly in testing breaks because one client uses HTML signatures with embedded images. The document parser handles PDFs from your system fine but chokes on scans from a partner. The classification model works great in Polish, but your clients sometimes write in English.

Those surprises are normal. That's why I iterate. And it's why I test on real data instead of a demo — because demo data never contains the surprises.

Bugs vs. feature requests

During the build I draw a clear line between bugs and feature requests.

A bug is when the system doesn't do what the spec says it should. Email classification is supposed to sort into 5 buckets, but it keeps dropping "invoice inquiries" into "general" instead of "finance." That's a bug. I fix it, no discussion.

A feature request is when you want the system to do something that isn't in the spec. "Could it also draft a reply?" — when the spec only covers classification. That's a feature request. It goes to the backlog. If you want it added to the current build, we update the spec, timeline, and cost in writing.

That distinction isn't me being difficult. It protects the project timeline and keeps both sides clear on what's in scope. Scope creep is the number-one killer of AI projects, and I guard against it.
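
The rule above is simple enough to write down as a check. The sketch below is only an illustration under stated assumptions: the Report class, the triage function, and the SPEC_BEHAVIORS set are hypothetical names, and a real spec is prose, not a set of strings. The point it shows is that the spec, not anyone's memory, decides which side of the line a report falls on.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the written spec: the behaviors it commits to.
SPEC_BEHAVIORS = {
    "classify email into 5 buckets",
    "file invoice inquiries under finance",
}


@dataclass
class Report:
    summary: str
    expected_behavior: str  # what the reporter says the system should do


def triage(report: Report, spec_behaviors: set[str]) -> str:
    """Return 'bug' if the expected behavior is in the spec, else 'feature request'."""
    if report.expected_behavior in spec_behaviors:
        return "bug"
    return "feature request"


misfiled = Report(
    summary="Invoice inquiries land in 'general'",
    expected_behavior="file invoice inquiries under finance",
)
draft_reply = Report(
    summary="Could it also draft a reply?",
    expected_behavior="draft a reply to each email",
)

print(triage(misfiled, SPEC_BEHAVIORS))
print(triage(draft_reply, SPEC_BEHAVIORS))
```

Adding "draft a reply to each email" to the set is the code equivalent of a spec amendment: once it is written into the spec, a missing draft becomes a bug.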

Quality verification workflow

Every output my systems produce goes through a multi-step verification process:

Automated checks. Format validation, data integrity, error detection.
AI cross-verification. A second AI model checks the first one's output. Different model, different prompt, independent assessment.
Manual review. I check samples by hand from every output category, focusing on edge cases and high-stakes decisions.
Your testing. You test on real work and report discrepancies.

This workflow exists because AI models hallucinate. That's a known property of the technology, not a flaw in the implementation. What matters isn't whether the AI makes mistakes — it's whether they're caught before they reach the end user.
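
The first three steps of the workflow can be sketched as a small pipeline. This is a simplified Python illustration, not the actual implementation: the function names, the bucket labels, and the keyword_checker stand-in for the second model are all assumptions made for the example. A real cross-check would call a different model with a different prompt.

```python
import random
from typing import Callable

ALLOWED_LABELS = {"finance", "support", "sales", "hr", "general"}  # hypothetical buckets


def automated_check(output: dict) -> bool:
    """Step 1: format validation. Needs a string id, string text, and a known label."""
    return (
        isinstance(output.get("id"), str)
        and isinstance(output.get("text"), str)
        and output.get("label") in ALLOWED_LABELS
    )


def cross_verify(output: dict, second_opinion: Callable[[str], str]) -> bool:
    """Step 2: an independent checker labels the same text; disagreement fails."""
    return second_opinion(output["text"]) == output["label"]


def sample_for_manual_review(outputs: list[dict], per_label: int, seed: int = 0) -> list[dict]:
    """Step 3: draw up to `per_label` items from every label for review by hand."""
    rng = random.Random(seed)
    by_label: dict[str, list[dict]] = {}
    for item in outputs:
        by_label.setdefault(item["label"], []).append(item)
    sample = []
    for label in sorted(by_label):
        items = by_label[label]
        sample.extend(rng.sample(items, min(per_label, len(items))))
    return sample


def verify(outputs: list[dict], second_opinion: Callable[[str], str], per_label: int = 1):
    """Run steps 1-2 on every output, then sample the survivors for step 3."""
    passed, flagged = [], []
    for out in outputs:
        if automated_check(out) and cross_verify(out, second_opinion):
            passed.append(out)
        else:
            flagged.append(out)
    return passed, flagged, sample_for_manual_review(passed, per_label)


def keyword_checker(text: str) -> str:
    # Stand-in for the second model: a trivial keyword rule.
    return "finance" if "invoice" in text.lower() else "general"


outputs = [
    {"id": "a1", "text": "Question about invoice 1042", "label": "finance"},
    {"id": "a2", "text": "Question about invoice 2210", "label": "general"},  # checker disagrees
    {"id": "a3", "text": "Hello there", "label": "unknown"},  # label not allowed
]
passed, flagged, sample = verify(outputs, keyword_checker)
print([o["id"] for o in passed], [o["id"] for o in flagged])
```

Flagged items never reach the end user without a second look, which is the purpose of the workflow: mistakes are expected, so the design goal is catching them early.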

Phase 4 — Handoff

Every build ends with a handoff package delivered to you or your team.

What the handoff package contains

The handoff is everything you need to run the system without me:

System documentation. What was built, how it works, an architecture overview, component descriptions. Written for your team, not for engineers (unless your team is engineers — then I write it for engineers).
Operations manual. Step-by-step instructions for common maintenance tasks: adding new entries to the knowledge base, adjusting prompts, updating data sources, restarting components after an outage.
Configuration files. All system prompts, model settings, integration credentials (stored securely), and workflow configs. Everything is version-controlled in a GitHub repository that you own.
Known limitations. An honest list of what the system doesn't handle well, with workarounds. Every system has limits. I document them instead of pretending they don't exist.
Troubleshooting guide. Common problems and how to fix them. "The AI starts giving shorter answers": check the context window, which might be full. "Classification accuracy drops": check whether the input format changed.
Support contact protocol. How to reach me during the support period, expected response times, what counts as a bug.

What "you can run it without me" means

That's my standard. If your team can't run the system day to day without calling me, the handoff failed.

Concretely, your team should be able to:

Add new content to the knowledge base
Adjust the AI prompts when your business needs change
Monitor system health and recognize when something's going wrong
Handle common problems using the troubleshooting guide
Know when a problem is beyond them and needs professional support

I don't build systems that create permanent dependency. You hired me to build something. When it's built, it's yours. Fully.

Walkthrough session

Every install ends with a live call where I walk your team through the system. For the hybrid package this session is longer — it covers the architecture, routing between layers, and advanced configuration.

The session is recorded and added to your handoff package. New team members can watch it later.

What happens between phases

Projects don't jump from one phase to the next instantly. Here's what the typical transition periods look like:

Discovery to Specification: 1-3 working days. I process my notes, sketch the initial spec structure, and start writing.

Specification to Build: 1-5 working days. You review the spec, ask questions, I revise, you approve. Simple specs take a day. Complex ones might need a week of back-and-forth.

Build to Handoff: Right after the acceptance criteria are met. I don't pad projects artificially.

Typical total timeline from the first Discovery session to a live system:

openclaw / hermes: 1-3 weeks
custom: 1-2 weeks
hybrid: 3-5 weeks

Those are real ranges — they include the review cycles and feedback windows, not just build time.

Red flags I watch for

Sometimes I need to pause or stop a project. Here are the warning signs:

No feedback for 5+ working days. I send you a build to test. Five days pass. Silence. That means either the project isn't a priority for you (so the timeline slips) or something went wrong that you haven't told me about (so I can't fix it). Either way, I pause and schedule a check-in.

Scope changing faster than we can spec it. "Actually, could it handle invoices too?" on Monday. "What about customer complaints?" on Wednesday. "And can we add a reporting dashboard?" on Friday. Each of those might be reasonable on its own. Together, they signal that the problem isn't well-defined yet, and building against a moving target wastes everyone's time.

Stakeholders who weren't in Discovery now making decisions. I mapped the process with you. You understood the trade-offs. Now your manager (who wasn't in the room) has different priorities and wants different features. That's not a technical problem — it's an organizational one. I pause until all the decision-makers are aligned.

The "just one more small thing" syndrome. Every addition has a cost, even a small one. When small additions stack up, they create complexity that wasn't accounted for in the spec, the timeline, or the price. I treat every addition as a spec amendment — written, estimated, approved — instead of letting it slide by unnoticed.

When I see these red flags, I don't push blindly ahead. I stop, raise the issue in writing, and propose a fix. Sometimes that's a re-scoping session. Sometimes it's a timeline extension. In rare cases it means pausing the project until the prerequisites are back in place.
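
The first red flag, silence for five or more working days, is the only one with a number attached, so it can be checked mechanically. The sketch below is a minimal illustration with assumed names (working_days_between, should_pause) and one stated simplification: it counts Monday to Friday only and ignores public holidays.

```python
from datetime import date, timedelta


def working_days_between(start: date, end: date) -> int:
    """Count Monday-Friday days after `start`, up to and including `end`."""
    days = 0
    current = start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            days += 1
    return days


def should_pause(build_sent: date, today: date, limit: int = 5) -> bool:
    """True once `limit` or more working days have passed without feedback."""
    return working_days_between(build_sent, today) >= limit


build_sent = date(2026, 5, 4)  # a Monday
print(should_pause(build_sent, date(2026, 5, 8)))   # Friday of the same week
print(should_pause(build_sent, date(2026, 5, 11)))  # the following Monday
```

A build sent on a Monday reaches the threshold on the following Monday, because the weekend does not count toward the five days.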

Communication

Everything in writing. Email or a shared project thread. I don't run projects through WhatsApp, Slack DMs, or informal phone calls. Written communication creates a record, prevents misunderstandings, and lets both sides look up decisions made weeks ago.

Weekly status updates on Fridays. Short, structured: what got done this week, what's planned for next week, any blockers. You'll know exactly where the project stands without having to ask.

Response expectations. I reply to project messages within 24 hours on working days. I expect the same from you. When both sides respond fast, projects stay on schedule. When one side goes quiet, timelines drift.

No project decisions in informal channels. If something gets discussed on a call, I send a written summary. If you send a WhatsApp message with a change, I'll acknowledge it and ask you to email it. That's not rigidity — it's making sure nothing gets lost.

If something is blocked on your side for more than 5 working days, the timeline shifts accordingly. It works the other way too: if the blocker is on my side, I tell you within 24 hours and adjust the timeline in writing.

last updated: 2026-05-11