Bounded AI Analytics Workflow

DataBrief AI

Turning messy spreadsheets into grounded business reports through controlled AI orchestration.

DataBrief AI transforms CSV/XLSX uploads into structured business reports using deterministic profiling, semantic role detection, controlled Python execution, bounded repair, and grounded report generation.

AI Workflow · Data Analysis · FastAPI · Next.js · Python Sandbox · Evaluation Loop · Portfolio Prototype

upload.csv → profile.json

roles: date · revenue · category

status: executed · repaired: 1

exports: report.md · findings.json · analysis.py

Context / Problem

Useful reports without unsupported claims.

Spreadsheets often contain valuable business signals, but lightweight AI tools tend to either summarize them generically or overclaim insights the dataset cannot actually support. DataBrief AI explores a more constrained approach: using AI workflow patterns to generate useful reports while keeping the system bounded, inspectable, and grounded in the uploaded file.

Core idea

Workflow over autonomous agent.

Instead of building a free-form AI analyst, DataBrief AI uses a bounded workflow. The system profiles the file, detects semantic column roles, generates a safe analysis plan, executes controlled Python, evaluates the output, repairs common failures, and produces a grounded report with exports.

This choice was deliberate: for spreadsheet analysis, reliability and reproducibility matter more than open-ended autonomy.

Workflow / Architecture

A bounded path from upload to export.

Upload CSV/XLSX → Validate → Profile → Route → Plan → Controlled Python Execution → Evaluate + Repair → Grounded Report → Export

Validate

Checks file format and structure.

Profile

Detects rows, columns, missing values, duplicates, and field types.
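As a rough illustration of this step, the sketch below profiles CSV text using only the Python standard library. The names `profile_csv` and `infer_type` and the shape of the returned dictionary are assumptions made for this example, not DataBrief AI's actual profiler, and the type inference is deliberately naive.

```python
import csv
import io
from collections import Counter


def infer_type(values):
    """Guess a coarse field type from the non-empty values of one column."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    try:
        for v in non_empty:
            float(v)
        return "numeric"
    except ValueError:
        return "text"


def profile_csv(text):
    """Return row/column counts, missing values, duplicate rows, and field types."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    columns = reader.fieldnames or []
    # Count identical rows so every repeat after the first is a duplicate.
    row_counts = Counter(tuple(row[c] for c in columns) for row in rows)
    return {
        "rows": len(rows),
        "columns": len(columns),
        "missing": {
            c: sum(1 for r in rows if not (r[c] or "").strip()) for c in columns
        },
        "duplicates": sum(n - 1 for n in row_counts.values()),
        "types": {c: infer_type([r[c] or "" for r in rows]) for c in columns},
    }


# Hypothetical sample: one duplicated row and one missing revenue value.
sample = (
    "date,revenue,category\n"
    "2024-01-01,10.5,books\n"
    "2024-01-01,10.5,books\n"
    "2024-01-02,,toys\n"
)
profile = profile_csv(sample)
```

Because the profile is computed deterministically, the same file always yields the same profile, which gives later steps a stable input to reason about.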

Semantic role detection

Identifies dates, prices, quantities, identifiers, categories, and unsupported fields.
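One simple way to approximate role detection is whole-token matching on column names. The sketch below is an assumption about how such a heuristic could look; `ROLE_HINTS`, `detect_role`, and the keyword lists are hypothetical and not taken from the project.

```python
# Hypothetical keyword hints per role; a real detector would likely
# also inspect the column values, not only the header text.
ROLE_HINTS = {
    "date": ("date", "time", "day"),
    "price": ("price", "amount", "cost"),
    "quantity": ("qty", "quantity", "units"),
    "identifier": ("id", "sku", "code"),
    "category": ("category", "type", "segment"),
}


def detect_role(column_name):
    """Map one column name to a role, or 'unsupported' if nothing matches."""
    normalized = column_name.lower().replace("-", "_").replace(" ", "_")
    tokens = normalized.split("_")
    for role, hints in ROLE_HINTS.items():
        # Match whole tokens so that e.g. "paid" is not mistaken for "id".
        if any(token in hints for token in tokens):
            return role
    return "unsupported"


def detect_roles(columns):
    """Return a {column name: role} mapping for all columns."""
    return {name: detect_role(name) for name in columns}


roles = detect_roles(
    ["Order Date", "Unit Price", "Qty", "product_id", "Category", "Notes"]
)
```

Treating "unsupported" as an explicit role, rather than guessing, is what lets later steps decline to compute metrics the file cannot back up.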

Plan

Creates a dataset-specific analysis plan.

Execute

Runs generated Python in a constrained environment.
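A static check of generated code can be sketched with Python's `ast` module. The allowlist, the blocklist, and the function name `check_generated_code` are assumptions for illustration only. A check like this is easy to circumvent and is not a substitute for OS-level isolation.

```python
import ast

# Hypothetical policy: which top-level modules generated code may import,
# and which built-in names it may not reference.
ALLOWED_IMPORTS = {"math", "statistics", "json", "pandas", "numpy"}
BLOCKED_NAMES = {"eval", "exec", "open", "compile", "__import__"}


def check_generated_code(source):
    """Return a list of policy violations; an empty list means the code passed."""
    try:
        tree = ast.parse(source)
    except SyntaxError as exc:
        return [f"syntax error: {exc.msg}"]

    violations = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have no module name and are rejected as "".
            modules = [node.module or ""]
        else:
            modules = []
        for module in modules:
            if module.split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"import not allowed: {module}")
        if isinstance(node, ast.Name) and node.id in BLOCKED_NAMES:
            violations.append(f"blocked name: {node.id}")
    return violations


safe = check_generated_code("import math\nresult = math.sqrt(4)")
unsafe = check_generated_code("import os\nos.remove('x')")
```

Here `safe` is an empty list and `unsafe` reports the disallowed `os` import, so the script would be rejected before it ever runs.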

Evaluate + Repair

Checks the execution result and retries recoverable failures up to a fixed limit.
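The key property of this step is that the retry count has a hard cap. A minimal sketch, assuming hypothetical `run` and `repair` callables and a hypothetical `RecoverableError` type (none of these names come from the project):

```python
class RecoverableError(Exception):
    """A failure the workflow knows how to repair, e.g. a misnamed column."""


def run_with_bounded_repair(run, repair, source, max_repairs=2):
    """Run `source`; after a recoverable failure, repair and retry.

    At most `max_repairs` repairs are attempted, so the loop always ends.
    """
    repairs = 0
    while True:
        try:
            result = run(source)
            return {"status": "executed", "repaired": repairs, "result": result}
        except RecoverableError as exc:
            if repairs >= max_repairs:
                return {"status": "failed", "repaired": repairs, "error": str(exc)}
            source = repair(source, exc)
            repairs += 1


# Hypothetical stand-ins for the executor and the repair step.
def demo_run(source):
    if "Revenue" in source:
        raise RecoverableError("column 'Revenue' not found")
    return "ok"


def demo_repair(source, error):
    return source.replace("Revenue", "revenue")


outcome = run_with_bounded_repair(
    demo_run, demo_repair, "total = df['Revenue'].sum()"
)
```

In this demo the first run fails, one repair is applied, and the second run succeeds, so `outcome` carries `status` "executed" and `repaired` 1. Any exception that is not a `RecoverableError` propagates immediately instead of being retried.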

Report

Generates grounded KPIs, findings, recommendations, limitations, and charts.

Export

Provides Markdown report, JSON findings, and analysis script.
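The export step can be pictured as writing three plain files. The sketch below uses only the standard library. The function name `write_exports` and the example content are assumptions, while the file names follow the `report.md`, `findings.json`, and `analysis.py` exports listed at the top of this page.

```python
import json
import tempfile
from pathlib import Path


def write_exports(out_dir, report_md, findings, analysis_source):
    """Write the three export files and return their paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = {
        "report": out / "report.md",
        "findings": out / "findings.json",
        "script": out / "analysis.py",
    }
    paths["report"].write_text(report_md, encoding="utf-8")
    paths["findings"].write_text(json.dumps(findings, indent=2), encoding="utf-8")
    paths["script"].write_text(analysis_source, encoding="utf-8")
    return paths


# Usage with made-up content, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    exported = write_exports(
        tmp,
        "# Report\n",
        {"findings": [], "limitations": ["no order ID column"]},
        "print('analysis placeholder')\n",
    )
    exported_names = sorted(p.name for p in exported.values())
    round_trip = json.loads(exported["findings"].read_text(encoding="utf-8"))
```

Exporting the analysis script alongside the report means a reader can rerun the same code against the same file and check the findings independently.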

Key features

Small pieces, clear boundaries.

CSV/XLSX upload
Dataset profiling
Semantic column-role detection
Domain-aware routing
Controlled Python analysis
Bounded repair loop
Grounded KPI/report generation
Exportable report, findings JSON, and analysis script
Synthetic and public demo datasets
Honest limitations for unsupported metrics

Design decision

Workflow, not agent.

A fully autonomous agent was intentionally avoided. DataBrief AI uses agentic patterns (routing, evaluation, bounded repair, and grounded generation) without giving the system open-ended tool use or arbitrary autonomy. This keeps the experience more predictable and easier to evaluate.

This distinction became central to the project: not every AI system needs to become an agent. In this case, the stronger architecture was a workflow with controlled decision points.

Example datasets

Demo files for grounded behavior.

amazon-purchases-sample.csv

Amazon purchases sample

A messy ecommerce-style dataset used to test semantic safeguards. The workflow correctly avoids unsupported order-level metrics when no order ID exists and marks return/cancel rate as unavailable when no status field exists.
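This kind of safeguard can be sketched as a lookup from each metric to the column roles it requires. The metric names, role names, and `METRIC_REQUIREMENTS` table below are hypothetical and chosen only to mirror the behavior described above.

```python
# Hypothetical mapping from metric to the column roles it needs.
METRIC_REQUIREMENTS = {
    "total_revenue": {"price"},
    "average_order_value": {"price", "order_id"},
    "return_rate": {"status"},
}


def metric_availability(detected_roles):
    """Mark each metric as available or name the roles it is missing."""
    available = set(detected_roles)
    report = {}
    for metric, required in METRIC_REQUIREMENTS.items():
        missing = sorted(required - available)
        if missing:
            report[metric] = "unavailable: missing " + ", ".join(missing)
        else:
            report[metric] = "available"
    return report


# A dataset with no order ID and no status column.
availability = metric_availability({"date", "price", "quantity", "category"})
```

With these roles, only `total_revenue` is available. The order-level and return-rate metrics are reported as unavailable with the missing role named, rather than being estimated.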


Marketing.csv

Marketing campaign sample

A campaign-performance dataset used to explore how the workflow handles marketing-style metrics such as revenue, spend, clicks, leads, orders, and campaign categories. This also revealed a future improvement area: adding a dedicated marketing-campaign route.


Output preview

Screenshot slots for the report experience.

Report header: Grounded report header with dataset type, execution status, and confidence label.
Primary metrics: Primary metrics filtered to avoid unsupported order-level claims.
Top findings: Top findings with explicit source references.
Charts: Charts generated from the uploaded file.
Exports: Export options for report, findings JSON, and generated analysis script.

Limitations

Bounded by design.

This is a portfolio prototype, not production SaaS. The sandbox uses static checks and resource limits but does not implement OS-level isolation. Analysis quality depends on detectable column roles, and unsupported metrics are intentionally marked as unavailable rather than invented.

  • No OS-level network/filesystem sandbox isolation.
  • Not a fully autonomous AI agent.
  • No external web enrichment.
  • Marketing campaign routing is a future improvement.
  • Output quality depends on column naming and dataset structure.
  • Designed for portfolio demonstration, not production deployment.

What I learned

Autonomy is not always the highest-value feature.

The project clarified a key AI product design principle: for structured tasks like spreadsheet analysis, a bounded workflow can produce a more trustworthy user experience than a free-form agent. The strongest part of the system is not that it does everything, but that it knows what the data does and does not support.

Future improvements

Where the prototype goes next.

  • Add dedicated marketing-campaign route.
  • Improve chart title semantics.
  • Add stronger OS-level sandboxing.
  • Add more domain recipes.
  • Add richer evaluation fixtures.
  • Add optional analysis-strategy planner.

DataBrief AI demonstrates how AI workflow architecture can make spreadsheet analysis more useful, constrained, and transparent.

It is a technical case study in designing with boundaries: enough intelligence to adapt to a file, enough structure to avoid unsupported claims.
