Blog

Engineering stories, security deep-dives, and lessons from shipping AI-powered test automation.

Opinion

Value Per Token: The Number the Agent-Loop Hype Forgets

Loops where agents prompt agents to write code are useful in the right hands — and oversold as a universal law by the people selling the tokens. A case for value per token, treating models as commodity suppliers, and keeping one deterministic check between generation and trust.

Read more →
Open Source

23 Free QA Skills for Your Coding Agent

We open-sourced a catalogue of 23 diagnostic QA skills for Claude Code and Codex — Core Web Vitals, secret scanning, dependency audits, dead-code detection, flaky-selector hunting, and more. No signup. Copy a folder and go.

Read more →
Engineering

You Can't Review Your Own Work

AI code generation and AI code testing are adversarial systems that should be separate products. When the same agent writes the code and its tests, they pass by construction. The thesis QualityMax is built on — and the deterministic guardrail harness that makes it real.

Read more →
Monthly Recap

May in Review: The Month We Made It More Trustworthy

A rebuilt AI crawl planner grounded in the live page, platform stability work, security hardening, qmax-code going open source, and the dogfooding loop behind it all.

Read more →
Dogfooding
May 26, 2026

Six days of shipping QualityMax from my phone

A Faroe Islands trip from May 18 to 23: 43 PRs landed, zero days at a desk, and one revert that proved why the gates exist.

Read more →
Free Tier
May 19, 2026

You're already paying for an AI subscription. Get the full QA loop for free.

The free tier is the product, not a trial. Bring your existing Claude Code or Codex subscription, get crawl → generate → run → fix without leaving your terminal — plus 60 free isolated cloud-sandbox minutes a month for the runs that shouldn't live on a laptop.

Read more →
Dogfooding
May 13, 2026

We built our iPhone app in 4 days — without being a mobile testing platform

QualityMax started as a web E2E testing platform. This week we shipped our own iPhone app to TestFlight. The next day, the new app caught its own production bug end-to-end in under 20 minutes. Here's how the dogfooding loop closed.

Read more →
Announcement
May 2026

qmax-code is now open source

The Go + Charm TUI agent that orchestrates Claude over the QualityMax API is now public on GitHub. Read every line, fork it, send PRs — FSL-1.1-ALv2 (converts to Apache 2.0 in 2 years).

Read more →
Engineering
May 2026

qmax-code 1.13: Claude Code and Codex on QA Steroids

v1.13 doesn't just find the bug: it patches it in your terminal while you watch. Multi-model routing, in-terminal auto-fix, instant PR security review. Works free with your existing Claude Code or Codex subscription.

Read more →
Dogfooding
April 2026

We Redesigned 14 Landing Pages — Through Our Own AI Review Gates

22 commits, 3 PRs, a 5-persona AI review, every commit gated by SAST + prompt-injection + brute-force checks. If we don’t trust our pipeline with our own brand pages, why would you?

Read more →
Engineering
April 2026

Teaching the Reviewer: How 👍/👎 on a PR Comment Rewires the Next Review

A single click on a QualityMax PR comment becomes durable, per-repo knowledge the reviewer retrieves on the next PR. Here’s the plumbing — three feedback channels, one storage layer, and the GitHub-webhook limitation that forced us to build a poller.

Read more →
Analysis

Two Posts, Same Day: The Gap Between AI Policy and Vibe Coding

One mature engineering org writes a 27-page AI policy with the rule “if you can’t explain the code, don’t commit it.” One workshop ships 10 live websites in an afternoon with Lovable and Cursor. The gap between them is the whole QualityMax market.

Read more →
Engineering

The Möbius Strip QA Loop: When the Tool Tests Itself

Most QA tools sit outside the code they test. QualityMax sits inside — and now monitors its own errors, generates its own regression tests, closes its own loops. A single-sided surface where tool and target merge.

Read more →
Product Update

Your AI Reviewer Now Asks What You Care About

Interactive calibration for AI code reviews: pick which categories to check, which to skip, and get structured findings with a one-command fix for your LLM agent.

Read more →
Engineering

When Claude Goes Down, Your Tests Shouldn't

Today's Anthropic outage left claude.ai partially down and Claude Code degraded. Every AI test platform built on a single LLM provider went down with it. Here's why QualityMax routes per-task across Claude, GPT, and Gemini, and what that costs.

Read more →
Engineering

Building qmax-code: Why We Built Our Own AI Testing Agent

7,951 lines of Go. Charm framework TUI. 48 MCP tools. Not based on Claude Code. Two tools, one mission — here's the engineering story.

Read more →
Analysis

AI Coding Agents Are Secured in the Wrong Direction

The Claude Code source leak reveals an industry-wide gap: AI tools invest in containing the agent but barely verify whether the code it produces is secure. 4% of GitHub commits are now AI-generated. Who's checking them?

Read more →
Real Incident

We Got Brute-Forced on Launch Day

We posted our vibe-check page on Reddit and Hacker News. 1,145 users came. So did a brute-force attack that blew through our Resend email quota in 4 minutes.

Read more →
Engineering

Building the Matrix Demo

Behind the scenes of our interactive demo page — boot sequences, the red pill / blue pill choice, a chat-driven AI terminal, and live Playwright execution in the browser.

Read more →
Security

Building a Hostile Site to Test Our AI

How we created an adversary website full of prompt injections, XSS traps, and redirect loops to stress-test our AI crawl pipeline — and what we learned.

Read more →