Bounded AI Analytics Workflow
Turning messy spreadsheets into grounded business reports through controlled AI orchestration.
DataBrief AI transforms CSV/XLSX uploads into structured business reports using deterministic profiling, semantic role detection, controlled Python execution, bounded repair, and grounded report generation.
Demo preview: upload.csv → profile.json; roles: date, revenue, category; status: executed (repaired: 1); exports: report.md, findings.json, analysis.py
Context / Problem
Spreadsheets often contain valuable business signals, but lightweight AI tools tend to either summarize them generically or overclaim insights the dataset cannot actually support. DataBrief AI explores a more constrained approach: using AI workflow patterns to generate useful reports while keeping the system bounded, inspectable, and grounded in the uploaded file.
Core idea
Instead of building a free-form AI analyst, DataBrief AI uses a bounded workflow. The system profiles the file, detects semantic column roles, generates a safe analysis plan, executes controlled Python, evaluates the output, repairs common failures, and produces a grounded report with exports.
The design decision was intentional: workflow over autonomous agent. For spreadsheet analysis, reliability and reproducibility matter more than open-ended autonomy.
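The stages described above can be sketched as a linear pipeline with one explicit, bounded decision point. All function names below are illustrative stand-ins, not the project's actual API:

```python
# Illustrative sketch of the bounded workflow: each stage is an explicit,
# inspectable step rather than an open-ended agent loop. The stage
# functions are passed in as hypothetical callables.

def run_workflow(path, profile, detect_roles, plan, execute, evaluate,
                 repair, report, max_repairs=1):
    """Run the pipeline with a bounded number of repair attempts."""
    prof = profile(path)               # deterministic profiling
    roles = detect_roles(prof)         # semantic column roles
    analysis_plan = plan(prof, roles)  # dataset-specific, constrained plan
    result = execute(analysis_plan)    # controlled Python execution
    repairs = 0
    while not evaluate(result) and repairs < max_repairs:
        analysis_plan = repair(analysis_plan, result)  # bounded repair
        result = execute(analysis_plan)
        repairs += 1
    return report(prof, roles, result, repairs)
```

The key property is that the repair loop cannot run forever: after `max_repairs` attempts, the workflow surfaces the failure instead of retrying indefinitely.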
Workflow / Architecture
Validate: Checks file format and structure.
Profile: Detects rows, columns, missing values, duplicates, and field types.
Detect roles: Identifies dates, prices, quantities, identifiers, categories, and unsupported fields.
Plan: Creates a dataset-specific analysis plan.
Execute: Runs generated Python in a constrained environment.
Evaluate and repair: Checks execution and retries recoverable failures within a bounded number of attempts.
Report: Generates grounded KPIs, findings, recommendations, limitations, and charts.
Export: Provides a Markdown report, JSON findings, and the analysis script.
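The role-detection step might look like the sketch below, which maps column names to semantic roles via simple name heuristics. The actual detector's rules are not documented in this write-up (and presumably also inspect values), so treat these patterns as assumptions:

```python
import re

# Hypothetical name-based heuristics for semantic column roles.
ROLE_PATTERNS = {
    "date": r"(date|day|month|timestamp)",
    "revenue": r"(revenue|price|amount|total|spend)",
    "quantity": r"(qty|quantity|count|units)",
    "identifier": r"(id|sku|order)",
    "category": r"(category|type|segment|channel)",
}

def detect_roles(columns):
    """Assign each column a role, or 'unsupported' when nothing matches."""
    roles = {}
    for col in columns:
        name = col.lower()
        for role, pattern in ROLE_PATTERNS.items():
            if re.search(pattern, name):
                roles[col] = role
                break
        else:
            roles[col] = "unsupported"  # honest fallback, not a guess
    return roles
```

Columns that match nothing are labeled unsupported rather than force-fit into a role, which is what lets later stages refuse metrics the data cannot back.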
Design decision
A full autonomous agent was intentionally avoided. DataBrief AI uses agentic patterns — routing, evaluation, bounded repair, and grounded generation — without giving the system open-ended tool use or arbitrary autonomy. This keeps the experience more predictable and easier to evaluate.
This distinction became central to the project: not every AI system needs to become an agent. In this case, the stronger architecture was a workflow with controlled decision points.
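The evaluation step is what makes the repair loop bounded in practice: it has to decide whether a failure is even worth retrying. A minimal sketch, assuming a small set of error categories (the real classification rules are not specified here):

```python
# Hypothetical evaluator: classify an execution result and decide whether
# a repair attempt is worthwhile. The error categories are assumptions.
RECOVERABLE = ("KeyError", "ValueError", "TypeError")

def classify(result):
    """Return 'ok', 'repairable', or 'fatal' for a raw execution result."""
    if result.get("error") is None:
        return "ok"
    error_type = result["error"].split(":", 1)[0]
    return "repairable" if error_type in RECOVERABLE else "fatal"
```

Only repairable failures re-enter the loop; fatal ones are reported as-is, keeping the system's behavior predictable.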
Example datasets
amazon-purchases-sample.csv
A messy ecommerce-style dataset used to test semantic safeguards. The workflow correctly avoids unsupported order-level metrics when no order ID exists and marks return/cancel rate as unavailable when no status field exists.
Marketing.csv
A campaign-performance dataset used to explore how the workflow handles marketing-style metrics such as revenue, spend, clicks, leads, orders, and campaign categories. This also revealed a future improvement area: adding a dedicated marketing-campaign route.
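The semantic safeguard behind both datasets can be sketched as a guard that checks the required column roles before computing a metric; the metric names and role requirements below are illustrative, not the project's actual catalog:

```python
# Hypothetical metric catalog: each metric declares the column roles it
# depends on. A metric whose roles were not detected is reported as
# unavailable rather than invented.
METRIC_REQUIREMENTS = {
    "total_revenue": {"revenue"},
    "orders_per_customer": {"identifier"},
    "return_rate": {"status"},
}

def compute_metrics(detected_roles, compute):
    """Map each metric to its computed value or 'unavailable'."""
    available = set(detected_roles.values())
    results = {}
    for metric, required in METRIC_REQUIREMENTS.items():
        if required <= available:
            results[metric] = compute(metric)
        else:
            results[metric] = "unavailable"
    return results
```

This is the mechanism that keeps, say, return/cancel rate out of the report when no status field exists.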
Limitations
This is a portfolio prototype, not a production SaaS. The sandbox uses static checks and resource limits but does not implement OS-level isolation. Analysis quality depends on detectable column roles, and unsupported metrics are intentionally marked as unavailable rather than invented.
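A minimal sketch of the kind of static check such a sandbox can perform, scanning generated code for imports outside an allowlist and for dangerous builtins. The project's actual rules are not documented here, so both lists are assumptions:

```python
import ast

# Hypothetical allowlist for generated analysis scripts.
ALLOWED_IMPORTS = {"pandas", "numpy", "math", "statistics", "json"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}

def check_script(source):
    """Return a list of violations found by a static AST scan."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] not in ALLOWED_IMPORTS:
                    violations.append(f"import of '{alias.name}' not allowed")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] not in ALLOWED_IMPORTS:
                violations.append(f"import from '{node.module}' not allowed")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"call to '{node.func.id}' not allowed")
    return violations
```

Static scans like this are cheap but bypassable, which is exactly why the limitations above note the absence of OS-level isolation.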
What I learned
The project clarified a key AI product design principle: autonomy is not always the highest-value feature. For structured tasks like spreadsheet analysis, a bounded workflow can produce a more trustworthy user experience than a free-form agent. The strongest part of the system is not that it does everything, but that it knows what the data does and does not support.
DataBrief AI is a technical case study in designing with boundaries: enough intelligence to adapt to a file, enough structure to avoid unsupported claims.