
Our Stack

Picking tools for an AI project is like picking tools for a kitchen renovation. You don't use a demolition hammer to hang a picture. You don't use a screwdriver to knock down a wall. I pick tools based on what solves your problem best — not what's trending on Hacker News this week, and not what some vendor is pushing from a conference stage.

This page covers every important tool and technology I use, why I picked it, and what role it plays in the systems I build. No black boxes.


Philosophy: the best tool for the job

The AI industry has a hype problem. Every week there's a new model, a new framework, a new "killer of everything." Most of it is noise.

My approach is boring, and it's meant to be:

  • Does it solve a real client problem? Not a theoretical one. Not a future one. The one on the table right now.
  • Is it reliable enough for production? Demos are cheap. A system that runs 24/7 on real client data is expensive when it breaks.
  • Can the client understand and maintain it? If I build something so complex that only I can run it, that's a failure. Your system should be fully yours.
  • Does it have a sane cost structure? The fanciest model in the world is useless if running it costs more than the problem it solves.

I review new tools regularly. When something better shows up, I adopt it. But I don't chase trends and I don't rip out working systems for the sake of novelty.


The language models I use

Language models are the main engine of any AI system. Think of them like staff with different strengths — you wouldn't put the same person on writing a legal opinion and sorting incoming mail. Different jobs need different skills, and using the wrong model either wastes money or gives you weak results.

Claude (Anthropic) — the main engine

Claude is my default for most work. Anthropic builds models that are good at following complex instructions, holding context, and producing well-structured output.

The most capable model — the senior analyst.

It has a very large context window, big enough to take in an entire book, a complete contract package, or months of email at once. I use it for complex document analysis, designing system architecture before I start building, multi-step reasoning, and any job where accuracy matters more than speed. It works more slowly, but more carefully.

The daily workhorse — the dependable team member.

It hits the sweet spot between capability and speed. Fast enough for real-time interaction, smart enough for nuance. I use it for AI assistants and chatbots, document processing, content generation that holds your brand voice, and most day-to-day work in production systems.

The fast, cheap model — the mailroom.

Built for high volume and low latency. I use it for simple, high-volume jobs: classifying messages (urgent or not), quick data extraction from documents, spam detection. You pay the mailroom rate, not the senior-analyst rate.
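As a rough illustration of the staffing analogy above, the routing decision can be sketched as a small function. The tier labels and task names here are hypothetical placeholders, not real model identifiers, and an actual project would define its own.

```python
from enum import Enum


class Tier(Enum):
    """Hypothetical labels for the three roles described above."""
    SENIOR_ANALYST = "most-capable"
    WORKHORSE = "daily-workhorse"
    MAILROOM = "fast-cheap"


# Illustrative task labels (assumptions for this sketch).
MAILROOM_TASKS = {"classify", "extract_fields", "spam_check"}
ANALYST_TASKS = {"contract_review", "architecture_design", "multi_step_reasoning"}


def pick_tier(task: str, accuracy_critical: bool = False) -> Tier:
    """Pick a tier for a task; accuracy-critical work always goes to the top tier."""
    if accuracy_critical or task in ANALYST_TASKS:
        return Tier.SENIOR_ANALYST
    if task in MAILROOM_TASKS:
        return Tier.MAILROOM
    # Anything not explicitly listed falls through to the default production tier.
    return Tier.WORKHORSE
```

Keeping the routing table in one place makes the cost decision visible and easy to change when a task turns out to need a stronger or cheaper model.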

OpenAI — the complement

OpenAI's flagship model comes in when its reasoning patterns complement Claude's — especially with image, chart, and visual-layout analysis. On high-stakes output I can run the same result through a second model and compare. Two independent models reaching the same conclusion is a stronger argument than one. I don't do this routinely — I use it where the gain in confidence justifies the extra cost.
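A minimal sketch of that second-opinion check, assuming each model is wrapped in a plain function that takes a prompt and returns text. The wrappers, the `CrossCheck` type, and the exact-match comparison are assumptions for illustration; exact matching only makes sense for short, structured answers such as an extracted date or payment term.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CrossCheck:
    answer: Optional[str]  # the agreed answer, or None when the models disagree
    needs_human: bool


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return " ".join(text.lower().split())


def cross_check(prompt: str,
                primary: Callable[[str], str],
                secondary: Callable[[str], str]) -> CrossCheck:
    """Ask two independent models and accept the answer only if they agree."""
    a = primary(prompt)
    b = secondary(prompt)
    if normalize(a) == normalize(b):
        return CrossCheck(answer=a, needs_human=False)
    # Disagreement is a signal to escalate, not something to resolve automatically.
    return CrossCheck(answer=None, needs_human=True)
```

In use, `primary` and `secondary` would be thin wrappers around the two vendors' clients; in a test they can be simple stubs.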

OpenAI's code model helps write clean, well-documented code, review existing codebases, catch bugs before production, and generate test suites. If you have a legacy system whose workings nobody remembers, this model can read it and explain it.

Open-source models (Llama, Mistral, and others)

Sometimes the data can't leave your building. Literally. Regulatory requirements, contractual obligations, or plain business prudence can mean sending data to OpenAI or Anthropic is off the table.

For those cases I deploy open-source models on your own infrastructure: Llama (Meta), Mistral, and where it fits, specialized models for a particular domain.

The trade-off is real: on-premise models are less capable today than the best commercial models. In exchange you get full data sovereignty. Nothing leaves your network. I'll always be straight with you about that trade-off.

Model comparison

| Model | Role | Context class | Speed | When I use it |
| --- | --- | --- | --- | --- |
| Claude (most capable) | Deep analysis, complex reasoning, contract review | Very large: entire contract packages, months of email | Slow (thorough) | High-stakes work that needs maximum accuracy |
| Claude (daily workhorse) | AI assistants, content, production operations | Large | Fast | Most production systems, real-time interaction |
| Claude (fast/cheap) | Classification, routing, quick queries | Standard | Very fast | Simple, high-volume jobs, sorting messages |
| OpenAI (flagship) | Multimodal analysis, validation, specific analyses | Very large | Moderate | Complementary analytical work, image analysis |
| OpenAI (code model) | Code generation, code review, tests | Large | Fast | Building and reviewing all the code I ship |
| OpenClaw (install flavor) | Presence layer: an agent in Telegram/Slack/WhatsApp | N/A | Real-time | Personal productivity, in-channel team workflows |
| Hermes (install flavor) | Background worker: cron, triage, reports | N/A | Asynchronous | Recurring background routines, scheduled reports |
| Llama/Mistral | On-premise deployments, data sovereignty | Varies | Varies | When data can never leave your infrastructure |

OpenClaw and Hermes — two of the four install flavors

OpenClaw and Hermes are two of the four flavors of the Custom Agent Install. The other two are custom (Claude Code / Codex CLI wired into your stack) and hybrid (a package that combines the presence layer with the CLI). Full scope and pricing: the Labs page.

OpenClaw is the presence flavor: an AI agent sits in a channel your team already uses — Telegram, WhatsApp, Slack, Discord. You message it from your phone, it picks up context, runs a workflow, and comes back with the result in the same channel. It works best when AI stays in the natural flow of work instead of demanding a new tool.

Hermes is the background flavor: a worker that runs on a schedule or on a trigger. Research, monitoring, triage, recurring reports — the result lands wherever it should: mail, Notion, a channel. You pick Hermes when the problem isn't the channel, it's the volume of work.

Install details, spec, and pricing: the Labs page.


Building and shipping

The model is only part of the picture. The system around it — the code, the deployment, the infrastructure — is what makes it useful in your business.

GitHub — your code, always

Every project lives in a Git repository on GitHub. You get full owner-level access — not just read.

What that means for you: a complete change history showing who changed what and when; documentation inside the repository, versioned alongside the code; full ownership of the code regardless of any later decision. Want to hire another developer to modify it? They can. Want to switch providers? You take the repository with you.

More on the ownership structure and access: the integration guide.

Python (FastAPI / Flask)

Python is the main backend language. Most major AI libraries, frameworks, and model SDKs are Python-first or ship a first-class Python client, so building in Python gives direct access to those tools without translation layers. FastAPI for high-performance APIs, Flask for simpler services.

TypeScript / Next.js

When a project needs a web interface (a dashboard, an admin panel, a client portal), I build it in TypeScript and Next.js. TypeScript adds type safety to JavaScript: fewer bugs, better tooling, and a compiler that catches many problems before production.

Cloudflare Workers — deploy at the edge

Cloudflare Workers runs code on servers distributed around the world, so each request is handled close to the user. A user in Warsaw and a user in Tokyo get comparable response times. You pay for what you use, not for idle servers. For most projects this is my default deployment target.


Integration tools

REST APIs and webhooks

The standard interfaces for connecting modern software. REST APIs for structured queries between systems. Webhooks for event-driven notifications: "when a new order comes in, tell the AI system."
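A minimal sketch of the receiving side of such a webhook, using only the Python standard library. It assumes the sender signs the raw request body with HMAC-SHA256 and a shared secret, which is a common pattern but not something every provider does; the event name `order.created` and the returned dictionary shape are hypothetical.

```python
import hashlib
import hmac
import json


def verify_signature(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Check an HMAC-SHA256 signature computed over the raw request body."""
    expected = hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature_hex)


def handle_webhook(secret: bytes, body: bytes, signature_hex: str) -> dict:
    """Validate and triage one webhook call; a real handler would enqueue work here."""
    if not verify_signature(secret, body, signature_hex):
        return {"status": 401, "accepted": False}
    event = json.loads(body)
    if event.get("type") != "order.created":  # hypothetical event name
        # Acknowledge events we do not care about so the sender stops retrying.
        return {"status": 200, "accepted": False}
    return {"status": 202, "accepted": True, "order_id": event.get("order_id")}
```

Verifying the signature before parsing the payload keeps unauthenticated input away from the rest of the system.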

Automation platforms (n8n, Make)

When a flow is simple enough that code would be overkill, a visual builder is the right tool. When the logic gets too complex, performance matters, or the automation is a core business process that needs full version control, I write code.

Databases

SQLite for simple projects and prototypes. PostgreSQL for production systems that need reliability, concurrent access, and scale. I match the database to the project — a proof of concept doesn't need PostgreSQL, and a system serving thousands of users a day shouldn't sit in SQLite.
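For the prototype end of that range, a minimal sketch with Python's built-in `sqlite3` module. The `messages` table and its columns are made up for illustration.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) a SQLite database; ':memory:' keeps it in RAM for experiments."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        " id INTEGER PRIMARY KEY,"
        " body TEXT NOT NULL,"
        " label TEXT NOT NULL)"
    )
    return conn


def save_message(conn: sqlite3.Connection, body: str, label: str) -> int:
    """Insert one classified message and return its row id."""
    # Placeholders keep user-supplied text out of the SQL string.
    cur = conn.execute(
        "INSERT INTO messages (body, label) VALUES (?, ?)", (body, label)
    )
    conn.commit()
    return cur.lastrowid


def count_by_label(conn: sqlite3.Connection, label: str) -> int:
    """Count stored messages that carry the given label."""
    row = conn.execute(
        "SELECT COUNT(*) FROM messages WHERE label = ?", (label,)
    ).fetchone()
    return row[0]
```

Because the whole database is a single file (or memory), there is nothing to install or operate, which is the appeal for a proof of concept.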


Quality control

Every language model hallucinates — Claude, GPT, open-source models. That's a property of the technology, not a flaw in a specific model. What matters is what you do about it.

Every important output runs through a multi-step process: generation, fact-checking against source material, cross-checking on high-stakes work, human review before delivery.

For automated systems I build those steps directly into the system: confidence scoring, source attribution, escalation rules to a human for questions with no certain answer, automatic verification of prices and dates against your actual data.
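One way such checks could be wired together is sketched below. The `Draft` fields, the 0.8 threshold, and the price-list lookup are all assumptions chosen for illustration; how confidence is scored and which records count as the source of truth depend on the project.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Draft:
    text: str
    confidence: float                            # 0.0 to 1.0, however the system scores it
    sources: list = field(default_factory=list)  # ids of the documents the answer cites
    quoted_price: Optional[float] = None         # price mentioned in the answer, if any
    sku: Optional[str] = None                    # product the price refers to


def review(draft: Draft, price_list: dict, min_confidence: float = 0.8) -> str:
    """Return 'send', or an 'escalate: ...' reason for a human reviewer."""
    if draft.confidence < min_confidence:
        return "escalate: low confidence"
    if not draft.sources:
        return "escalate: no source attribution"
    if draft.quoted_price is not None:
        actual = price_list.get(draft.sku)
        # Unknown products and mismatched prices both go to a human.
        if actual is None or abs(actual - draft.quoted_price) > 0.005:
            return "escalate: price does not match records"
    return "send"
```

The gate never tries to repair an answer; it only decides whether the answer goes out or goes to a person.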

How that flow looks on a real project: how I work.


Security

The detailed per-tool access spec, credential management, and access control: the integration guide.

The rules I follow on every project:

  • API keys stored as environment variables, never hardcoded.
  • Data encrypted in transit and at rest.
  • GDPR compliance from day one, as an engineering requirement.
  • Least privilege: the AI gets only the access it needs to do a specific job.
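The environment-variable rule can be sketched in a few lines of standard-library Python. The helper names and the `EXAMPLE_API_KEY` variable are hypothetical; the point is to fail at startup when a credential is missing and to keep full keys out of logs.

```python
import os


class MissingSecret(RuntimeError):
    """Raised at startup when a required credential is not configured."""


def require_env(name: str) -> str:
    """Read a secret from the environment and fail fast if it is absent or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingSecret(f"environment variable {name} is not set")
    return value


def redact(secret: str, visible: int = 4) -> str:
    """Return a log-safe form that shows only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

A service would call something like `require_env("EXAMPLE_API_KEY")` once during startup, so a misconfigured deployment stops immediately instead of failing on the first request.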


Summary

I use Claude as the main AI engine. I complement it with OpenAI and open-source models where a specific job calls for it. I build in Python and TypeScript. I deploy on Cloudflare Workers. I ship through GitHub with full ownership on your side.

Every tool choice serves one goal: building AI systems that work for your business and that you have full control over.

If you want to know how this applies to your specific situation, book a free discovery call.

last updated: 2026-05-11

Our Stack — Kuliberda Labs Docs