AI Systems / Audit Workflow

Website Audit Agent

An evidence-backed website audit workflow that turns public websites into structured UX, SEO, performance, content, and prospect-intelligence reports.

The goal was to turn website review from subjective opinion into a repeatable evidence workflow, while keeping AI useful but bounded.

  • Rule-first audit engine
  • Browser-first capture
  • Bounded LLM synthesis
  • Private internal tool
Website Audit Agent (status: Complete)
Capture: Rendered browser
Evidence confidence: High
UX: 82
SEO: 74
Performance: 68
Content: 79
Accessibility: 71
Top finding

Primary CTA is visible but not consistently reinforced above the fold.

Evidence label: Observed
Representative preview. Not a live audit result.

Problem

The problem with AI audits is trust.

AI website-audit tools often collapse observation, scoring, and interpretation into one opaque model response. That makes the output fast, but difficult to trust.

Website Audit Agent was designed around the opposite principle: before the model says anything, the system must capture evidence, classify confidence, and define what is actually known.

Overview

The audit truth is deterministic. The AI is interpretive.

This is not a free-form audit chatbot. The model does not create findings, score categories, invent metrics, or turn weak signals into measured facts. The LLM layer only synthesizes accepted evidence, which makes the workflow more credible for internal prospecting.

Capture evidence first

The workflow starts with public website evidence, not model speculation. Rendered capture is attempted first, with static public evidence as fallback.

Score with rules

Findings and category scores are produced deterministically from captured evidence before any synthesis layer is involved.
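A minimal sketch of what rule-based scoring could look like, assuming a simple penalty model. The evidence fields, rule IDs, messages, and penalty values are all hypothetical; the case study does not publish the real rule set.

```typescript
// Hypothetical shapes: field names, rule IDs, and penalties are illustrative.
type Category = "UX" | "SEO" | "Performance" | "Content" | "Accessibility";

interface PageEvidence {
  metaDescription: string;
  h1Count: number;
  imagesMissingAlt: number;
}

interface AuditFinding {
  ruleId: string;
  category: Category;
  message: string;
  penalty: number; // points deducted from a 100-point category score
}

interface AuditRule {
  id: string;
  category: Category;
  message: string;
  penalty: number;
  fails: (evidence: PageEvidence) => boolean;
}

const auditRules: AuditRule[] = [
  {
    id: "seo.meta-description",
    category: "SEO",
    message: "Meta description is missing.",
    penalty: 15,
    fails: (e) => e.metaDescription.trim() === "",
  },
  {
    id: "seo.single-h1",
    category: "SEO",
    message: "Page does not have exactly one H1.",
    penalty: 10,
    fails: (e) => e.h1Count !== 1,
  },
  {
    id: "a11y.image-alt",
    category: "Accessibility",
    message: "Some images have no alt text.",
    penalty: 20,
    fails: (e) => e.imagesMissingAlt > 0,
  },
];

// Same evidence in, same findings out: no model call is involved.
function runRules(evidence: PageEvidence, ruleSet: AuditRule[]): AuditFinding[] {
  return ruleSet
    .filter((rule) => rule.fails(evidence))
    .map((rule) => ({
      ruleId: rule.id,
      category: rule.category,
      message: rule.message,
      penalty: rule.penalty,
    }));
}

// A category starts at 100 and loses the penalty of each failed rule, floored at 0.
function scoreCategory(category: Category, findings: AuditFinding[]): number {
  const deducted = findings
    .filter((f) => f.category === category)
    .reduce((sum, f) => sum + f.penalty, 0);
  return Math.max(0, 100 - deducted);
}

const samplePage: PageEvidence = { metaDescription: "", h1Count: 2, imagesMissingAlt: 3 };
const sampleFindings = runRules(samplePage, auditRules);
const seoScore = scoreCategory("SEO", sampleFindings); // 100 - 15 - 10 = 75
```

Because findings and scores are pure functions of the captured evidence, re-running the same evidence yields the same report, which is what lets the synthesis layer treat them as fixed input.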

Interpret after acceptance

Gemini and the Prospect Audit Agent receive accepted findings only, then translate them into useful internal acquisition intelligence.

Workflow

Controlled pipeline from public URL to private report

The architecture separates capture, persistence, scoring, synthesis, and report assembly so each stage has a clear owner and failure mode.

URL Intake → Browser / Static Capture → Evidence Store → Deterministic Scoring → Bounded AI Synthesis → Internal Report

URL Intake

A user submits a public website URL.

Browser / Static Capture

Rendered capture is attempted first; static public evidence is used as a fallback if rendering fails or is blocked.
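The rendered-first, static-fallback order can be sketched as below. This is an illustration under assumptions: the type names, the `Capturer` signature, and the `fallbackReason` field are hypothetical, and the capturers are synchronous stand-ins for what would be asynchronous browser and HTTP calls in practice.

```typescript
// Hypothetical types; none of these names come from the project itself.
type CaptureMode = "rendered" | "static";

type CaptureResult =
  | { status: "ok"; html: string }
  | { status: "failed"; reason: string };

interface CapturedEvidence {
  url: string;
  mode: CaptureMode;
  html: string;
  fallbackReason?: string; // why rendered capture was not used
}

// Capturers are injected so the fallback logic can be tested without a browser.
type Capturer = (url: string) => CaptureResult;

function captureWithFallback(
  url: string,
  captureRendered: Capturer,
  captureStatic: Capturer
): CapturedEvidence {
  const rendered = captureRendered(url);
  if (rendered.status === "ok") {
    return { url, mode: "rendered", html: rendered.html };
  }
  // Rendering failed or was blocked: fall back to static public evidence.
  const fallback = captureStatic(url);
  if (fallback.status === "ok") {
    return { url, mode: "static", html: fallback.html, fallbackReason: rendered.reason };
  }
  throw new Error(`Capture failed for ${url}: ${rendered.reason}; ${fallback.reason}`);
}

// Stand-in capturers for the example.
const blockedRenderer: Capturer = () => ({ status: "failed", reason: "blocked" });
const staticFetcher: Capturer = () => ({ status: "ok", html: "<html><body>Hello</body></html>" });

const demoEvidence = captureWithFallback("https://example.com", blockedRenderer, staticFetcher);
// demoEvidence.mode is "static" and demoEvidence.fallbackReason is "blocked"
```

Recording the capture mode on the evidence itself is the design point: later stages can see whether a run was rendered or static without guessing.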

Evidence Store

Snapshots and page evidence are persisted for the run.

Deterministic Scoring

Rules generate findings and category scores.

Bounded AI Synthesis

Gemini and the Prospect Audit Agent summarize only accepted findings.

Internal Report

The output becomes private acquisition intelligence.
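One way to give each stage an explicit failure mode is a runner that records which stage failed and which stages had already completed. The sketch below is an assumption about shape only: the stage names mirror the workflow above, while `RunOutcome`, `PipelineStage`, and `runPipeline` are hypothetical, and the stages are synchronous no-ops rather than real capture or scoring code.

```typescript
type StageName = "intake" | "capture" | "store" | "score" | "synthesize" | "report";

type RunOutcome =
  | { state: "complete"; completed: StageName[] }
  | { state: "failed"; failedStage: StageName; error: string; completed: StageName[] };

interface PipelineStage {
  name: StageName;
  run: () => void; // real stages would be async and pass data forward
}

// Runs stages in order and stops at the first failure, naming the stage that failed.
function runPipeline(stages: PipelineStage[]): RunOutcome {
  const completed: StageName[] = [];
  for (const stage of stages) {
    try {
      stage.run();
      completed.push(stage.name);
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      return { state: "failed", failedStage: stage.name, error, completed };
    }
  }
  return { state: "complete", completed };
}

const noop = () => {};
const outcome = runPipeline([
  { name: "intake", run: noop },
  { name: "capture", run: noop },
  { name: "store", run: () => { throw new Error("database unavailable"); } },
  { name: "score", run: noop },
]);
// outcome.state is "failed", with failedStage "store" and completed ["intake", "capture"]
```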

Evidence model

Every claim needs an evidence status

The system can interpret, but it must not pretend inference is measurement. Evidence labels make that boundary visible inside the report.

Measured

Directly captured or computed.

Observed

Visible in captured page evidence.

Inferred

Interpretation from available signals; never presented as measured truth.
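The three labels map naturally onto a small type. The sketch below also adds a "weakest link" rule for claims that rest on several pieces of evidence; that rule, and the ordering of measured above observed, are assumptions made for illustration and are not stated in the case study. All names are hypothetical.

```typescript
type EvidenceLabel = "measured" | "observed" | "inferred";

// Assumed ordering: measured is strongest, inferred is weakest.
const labelStrength: Record<EvidenceLabel, number> = {
  measured: 2,
  observed: 1,
  inferred: 0,
};

// A claim built on several pieces of evidence takes the weakest label among them,
// so an inference can never be upgraded by sitting next to a measurement.
function combineLabels(labels: EvidenceLabel[]): EvidenceLabel {
  if (labels.length === 0) {
    return "inferred"; // nothing supports the claim, so it can only be an inference
  }
  let weakest: EvidenceLabel = "measured";
  for (const label of labels) {
    if (labelStrength[label] < labelStrength[weakest]) {
      weakest = label;
    }
  }
  return weakest;
}

// Renders the label next to the claim so the boundary is visible in the report.
function formatClaim(text: string, label: EvidenceLabel): string {
  const display = label.charAt(0).toUpperCase() + label.slice(1);
  return `[${display}] ${text}`;
}

const combinedLabel = combineLabels(["measured", "inferred"]); // "inferred"
const renderedClaim = formatClaim("Primary CTA is visible.", "observed");
// "[Observed] Primary CTA is visible."
```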

Agentic layer

The AI is downstream by design.

This is a hybrid workflow-agent system. The deterministic shell owns capture, scoring, persistence, status, and report assembly. The LLM layer owns summary, prioritization, explanation, and prospect intelligence. The Prospect Audit Agent does not browse freely, rewrite findings, or create scores.

Deterministic Audit Engine → Accepted Findings → Prospect Audit Agent → Internal Intelligence

What the LLM can and cannot do

Allowed

  • Summarize accepted findings.
  • Prioritize recommendations.
  • Translate audit evidence into internal prospect intelligence.
  • Explain why a finding matters commercially.

Blocked

  • Create audit findings.
  • Modify category scores.
  • Invent metrics.
  • Present inferred claims as measured truth.
  • Make unsupported revenue claims.
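The allowed and blocked lists above could be enforced with a check on structured model output before it reaches the report. This is a sketch under assumptions: it presumes the model returns items that cite accepted finding IDs, and `SynthesisItem`, `validateSynthesis`, the finding IDs, and the sample data are all hypothetical. Scores are deliberately absent from the output shape, so the model has no field through which to change them.

```typescript
type EvidenceLabel = "measured" | "observed" | "inferred";

interface AcceptedFinding {
  id: string;
  label: EvidenceLabel;
  message: string;
}

// Hypothetical structured output requested from the model.
interface SynthesisItem {
  text: string;
  findingIds: string[]; // accepted findings this statement relies on
  presentedAs: EvidenceLabel; // how the statement is framed in the report
}

// Returns a list of violations; an empty list means the draft stays in bounds.
function validateSynthesis(items: SynthesisItem[], accepted: AcceptedFinding[]): string[] {
  const violations: string[] = [];
  items.forEach((item, index) => {
    if (item.findingIds.length === 0) {
      // A statement with no citation is an invented finding or metric.
      violations.push(`item ${index}: cites no accepted finding`);
    }
    for (const id of item.findingIds) {
      const match = accepted.filter((f) => f.id === id)[0];
      if (match === undefined) {
        violations.push(`item ${index}: unknown finding "${id}"`);
      } else if (match.label === "inferred" && item.presentedAs === "measured") {
        violations.push(`item ${index}: inferred finding "${id}" presented as measured`);
      }
    }
  });
  return violations;
}

// Illustrative data only.
const acceptedFindings: AcceptedFinding[] = [
  { id: "ux.cta-reinforcement", label: "observed", message: "Primary CTA is weakly reinforced above the fold." },
  { id: "content.audience-fit", label: "inferred", message: "Copy appears aimed at existing customers." },
];

const draft: SynthesisItem[] = [
  { text: "Reinforce the primary CTA above the fold.", findingIds: ["ux.cta-reinforcement"], presentedAs: "observed" },
  { text: "Conversion rate is 1.2%.", findingIds: [], presentedAs: "measured" },
  { text: "The site targets existing customers.", findingIds: ["content.audience-fit"], presentedAs: "measured" },
];

const draftViolations = validateSynthesis(draft, acceptedFindings);
// Two violations: item 1 cites nothing, item 2 presents an inference as measured.
```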

Report surface

A private audit report built from accepted evidence

The report surface is intentionally internal. It presents scores, confidence, evidence labels, and top findings without exposing the private Vercel deployment as a public demo.

Issue: Primary CTA is visible but weakly reinforced above the fold.
Category: UX / Conversion
Evidence: Observed
Source: Rendered browser capture
Synthesis status: Allowed, based on accepted evidence only.
Representative finding card. Not a live audit result.

Safety / Access boundary

Public repository. Private operating surface.

The project is public for portfolio and reference purposes, while the operating surface stays private behind an internal login and protected job endpoint.

  • Private Vercel deployment
  • Internal login and access gate
  • Worker-secret protected job endpoint
  • Public website evidence only
  • No anti-bot bypass
  • No public live demo
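A worker-secret check for the job endpoint might look like the sketch below. The function names and status codes are assumptions, and how the secret is transported (for example, a request header) is not described in the case study. The comparison is hand-rolled only to keep the example self-contained; in a Node runtime, `crypto.timingSafeEqual` on equal-length buffers is the usual choice.

```typescript
// Compares every position so timing does not reveal where the strings diverge.
function constantTimeEqual(a: string, b: string): boolean {
  let diff = a.length ^ b.length;
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    // charCodeAt returns NaN past the end of a string; treat that as 0.
    diff |= (a.charCodeAt(i) || 0) ^ (b.charCodeAt(i) || 0);
  }
  return diff === 0;
}

// Hypothetical gate for the job endpoint, separate from the internal session login.
// Returns the HTTP status the endpoint would respond with.
function authorizeWorker(
  provided: string | undefined,
  expected: string | undefined
): 200 | 401 | 500 {
  if (!expected) {
    return 500; // secret not configured: fail closed rather than accept any caller
  }
  if (!provided) {
    return 401;
  }
  return constantTimeEqual(provided, expected) ? 200 : 401;
}

const acceptedCall = authorizeWorker("s3cret", "s3cret"); // 200
const rejectedCall = authorizeWorker("wrong", "s3cret"); // 401
```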

Result

A useful AI system because it stays inside its lane.

The result is a private acquisition workflow that can turn a public website URL into a structured internal report with evidence labels, category scores, and prioritized findings.

Repeatable audit pipeline

A public URL moves through capture, evidence storage, scoring, synthesis, and reporting with explicit stage boundaries.

Controlled AI synthesis

The LLM summarizes accepted findings and prospect context without creating audit truth.

Evidence discipline

Measured, observed, and inferred claims stay distinct throughout the report.

Portfolio-safe architecture

The repo can be shown publicly while the operating surface and generated reports remain private.

Stack

Built as a production-oriented prototype

The stack is practical: Next.js for the app surface, Postgres and pg-boss for durable runs, Playwright for capture, Gemini for bounded synthesis, and a worker endpoint protected separately from the internal session gate.

  • Next.js
  • TypeScript
  • Postgres
  • pg-boss
  • Playwright
  • Gemini
  • Vercel
  • Deterministic scoring
  • Evidence labels
  • Private worker endpoint

Boundaries / Future Improvements

These are design boundaries, not excuses.

The current scope is deliberately narrow so the case study does not overclaim what the system does.

  • Not a public SaaS.
  • No public live demo.
  • Prospect Intelligence is internal guidance, not audit truth.
  • Static-only reports intentionally exclude visual/mobile/above-the-fold scoring.
  • AI synthesis depends on accepted findings.
  • Future work includes evals, model comparison, observability, and real audit examples.
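The static-only boundary in the list above can be sketched as a rule-planning step: checks that need a rendered page are skipped, and listed as skipped, when only static evidence exists. The `needsRenderedCapture` flag, the rule IDs, and `planRules` are hypothetical names used for illustration.

```typescript
type CaptureMode = "rendered" | "static";

interface ScopedRule {
  id: string;
  needsRenderedCapture: boolean; // true for visual, mobile, and above-the-fold checks
}

interface RulePlan {
  run: string[];
  skipped: string[];
}

// Splits rules into those that can run on the available evidence and those that cannot.
function planRules(mode: CaptureMode, rules: ScopedRule[]): RulePlan {
  const plan: RulePlan = { run: [], skipped: [] };
  for (const rule of rules) {
    if (rule.needsRenderedCapture && mode === "static") {
      plan.skipped.push(rule.id);
    } else {
      plan.run.push(rule.id);
    }
  }
  return plan;
}

const scopedRules: ScopedRule[] = [
  { id: "seo.meta-description", needsRenderedCapture: false },
  { id: "ux.cta-above-fold", needsRenderedCapture: true },
  { id: "ux.mobile-tap-targets", needsRenderedCapture: true },
];

const staticPlan = planRules("static", scopedRules);
// staticPlan.run is ["seo.meta-description"]; the two UX rules are skipped
const renderedPlan = planRules("rendered", scopedRules);
// renderedPlan.skipped is empty
```

Reporting skipped rules explicitly, rather than scoring them as passes or failures, keeps a static-only report from implying visual checks were performed.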