How We Compressed 3 Months of Data Platform Work Into 2 Weeks
(And Why We Are Giving Away the Tool)
Building a data platform on Microsoft Fabric should be straightforward. Microsoft gives you the lakehouse, the notebooks, the pipelines. All the pieces are there.
So why does it still take months to get from "new workspace" to "production-ready, governed, documented platform"?
Because the hard part was never the technology. It is everything around it: the naming conventions nobody wrote down, the governance metadata nobody wants to fill in, the Fabric-specific pitfalls you discover at 11 PM on a Thursday, and the architecture decisions that live in one senior engineer's head until they change jobs.
We got tired of watching the same problems repeat on every project. So we built something different.
The idea: what if your team's expertise was a system?
We have been building data platforms on Databricks and Microsoft Fabric for years. Every project taught us something. A Fabric Git sync quirk, a medallion architecture shortcut, a governance template that actually gets used. But those lessons lived in Slack threads, personal notes, and the memories of whoever happened to be on the project.
We asked ourselves: what if all of that knowledge was not trapped in people's heads but encoded into a system that enforces it, remembers it, and teaches it to the next project?
That is what Data AI Agents is. An open-source AI-powered development team built on top of Claude Code. Not a chatbot you ask questions to. Not a library of templates you copy-paste from. A system that understands your data platform, knows the patterns that work, remembers the pitfalls that do not, and generates the governance documentation that nobody wants to write by hand.
What it actually does
The best way to explain it is through the commands you use every day.
/sdd — Spec-Driven Development. You describe what you need ("add a customer dimension with SCD Type 2 history"). The AI writes a specification, you approve it, and then it implements the spec, creating the notebook, the Delta tables, the quality checks, and the documentation. Every change is traceable from requirement to deployment. No more "why does this table exist?" six months later.
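The SCD Type 2 history that example request asks for boils down to two moves: close out a changed row and append a new current version. Here is a minimal sketch in plain Python; in a real Fabric notebook this would typically be a Delta Lake MERGE rather than in-memory dicts, and the customer dimension, column names, and dates below are hypothetical:

```python
from datetime import date

def apply_scd2(dimension, incoming, key, tracked, as_of):
    """Apply SCD Type 2 changes: close superseded rows, append new versions.

    dimension: list of dicts carrying 'valid_from', 'valid_to', 'is_current'
    incoming:  list of dicts with the natural key plus tracked attributes
    """
    out = list(dimension)
    current = {r[key]: r for r in out if r["is_current"]}
    for row in incoming:
        old = current.get(row[key])
        # A row only produces a new version if it is new or a tracked attribute differs
        if old is not None and all(old[c] == row[c] for c in tracked):
            continue
        if old is not None:
            old["valid_to"] = as_of          # close the superseded version
            old["is_current"] = False
        out.append({**row, "valid_from": as_of,
                    "valid_to": None, "is_current": True})
    return out

# Hypothetical customer dimension with one existing current row
dim = [{"customer_id": 1, "city": "Oslo",
        "valid_from": date(2024, 1, 1), "valid_to": None, "is_current": True}]
dim = apply_scd2(dim, [{"customer_id": 1, "city": "Bergen"}],
                 key="customer_id", tracked=["city"], as_of=date(2024, 6, 1))
# The Oslo row is closed as of 2024-06-01; a new current Bergen row is appended
```

The point of running this through /sdd is that the quality checks and catalog entries for the new dimension are generated alongside the merge logic, not after it.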
/architect — Architecture guidance on demand. Starting a new domain? The AI walks you through medallion layer design, workspace organisation, environment separation, and documents each decision as an Architecture Decision Record. Not generic advice. Guidance that considers your existing project structure and platform.
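Architecture Decision Records follow a widely used short format: title, status, context, decision, consequences. As an illustration of the kind of skeleton such a record starts from (the numbering scheme and example title below are assumptions, not the tool's actual output):

```python
def adr_skeleton(number, title, status="Proposed"):
    """Render a minimal Architecture Decision Record skeleton in Markdown."""
    return "\n".join([
        f"# ADR-{number:03d}: {title}",
        "",
        f"Status: {status}",
        "",
        "## Context",
        "",
        "## Decision",
        "",
        "## Consequences",
        "",
    ])

# Hypothetical decision recorded while setting up environment separation
print(adr_skeleton(7, "Separate dev, test, and prod workspaces per domain"))
```

What matters is less the template than the habit: every workspace and layering choice leaves a record the next engineer can read.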
/document — Governance documentation, automated. It scans your codebase, finds undocumented tables and pipelines, asks you the questions a data steward would ask (who owns this? what is the classification? what is the retention policy?), and generates catalog entries. The boring part of data platform development, almost completely removed.
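The steward questions map naturally onto required catalog fields, which makes "undocumented" a mechanical check. A simple sketch of that completeness check in plain Python; the field names and the example entry are hypothetical, not the tool's actual schema:

```python
# Hypothetical set of fields a data steward would insist on
REQUIRED = ("owner", "steward", "classification", "retention")

def missing_governance_fields(entry):
    """Return the steward questions still unanswered for one catalog entry."""
    return [field for field in REQUIRED if not entry.get(field)]

# A partially documented silver-layer table
entry = {"table": "silver.customers",
         "owner": "sales-analytics",
         "classification": "confidential"}
missing_governance_fields(entry)  # → ['steward', 'retention']
```

Running a check like this across every table turns "is our platform governed?" from an opinion into a list of open questions.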
/improve-ai — The learning loop. This is the one that compounds over time. After every project, the AI extracts what it learned (a new Fabric pitfall, a pattern that worked well, a design decision worth generalising) and feeds it back into the system. Every project makes the next one faster.
The results that surprised us
We expected the tool to save time. We did not expect how much.
Three months to two weeks. That was the difference on a basic Fabric data platform setup: bronze/silver/gold layers, governance documentation, environment separation, CI/CD, the works. The AI does not just write code faster. It eliminates the back-and-forth, the research, the "how does Fabric handle this again?" moments that eat up weeks.
Data engineers productive on day one. One of our engineers had solid experience with Databricks but had never touched Fabric. With Data AI Agents, they were building production-quality notebooks within days. The AI translated their existing knowledge of data architecture and modelling into Fabric-specific patterns: the right workspace structure, the correct Delta table configuration, proper Git integration. No Fabric certification required.
Pitfalls you never hit. We have documented Fabric-specific pitfalls, from Git sync failures caused by duplicate logicalId values to notebooks silently breaking because of hardcoded workspace IDs. The AI knows about all of them and steers you around them before you waste a day debugging.
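Both of those pitfalls are mechanically detectable, which is why a tool can catch them before you do. An illustrative sketch of the two checks in plain Python; the regexes and file shapes are simplified assumptions (Fabric's Git integration stores item metadata, including a logicalId, in .platform files):

```python
import re
from collections import Counter

# Simplified patterns: a GUID literal, and a logicalId entry in a .platform file
GUID = re.compile(
    r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}\b", re.I)
LOGICAL_ID = re.compile(r'"logicalId"\s*:\s*"([^"]+)"')

def duplicate_logical_ids(platform_file_texts):
    """Flag logicalId values that appear in more than one item's .platform file."""
    ids = [m for text in platform_file_texts for m in LOGICAL_ID.findall(text)]
    return sorted(value for value, count in Counter(ids).items() if count > 1)

def hardcoded_guids(notebook_source):
    """Flag GUID literals in notebook code, e.g. a hardcoded workspace ID."""
    return GUID.findall(notebook_source)

hardcoded_guids('workspace = "12345678-abcd-ef01-2345-6789abcdef01"')
# → ['12345678-abcd-ef01-2345-6789abcdef01']
```

The first check would have caught the duplicate-logicalId Git sync failures up front; the second flags notebooks that will silently break when promoted to another workspace.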
Governance from the first commit. On most projects, governance is the thing you retrofit before go-live. With Data AI Agents, every table gets classified, every dataset gets an owner and a steward, and every pipeline gets documented as part of the normal development workflow, not as a separate painful exercise.
What this means if you are responsible for your company's data
If you are a CDO, a head of data, or anyone responsible for a data platform initiative, here is what actually matters:
You can empower your existing team. Your engineers do not need years of Fabric experience or specialised certifications to build a production-grade platform. The AI encodes the expertise of senior data platform architects: medallion architecture, data governance frameworks, security patterns, performance optimisation. Smart people with solid fundamentals can deliver enterprise-quality results from day one.
Compliance is built in, not bolted on. Data classification, ownership, retention policies, lineage tracking, access controls. These are not afterthoughts. They are part of every /sdd implementation. When the auditor asks "who owns this dataset and what is the retention policy?", the answer already exists in your data catalog.
Knowledge stays with the organisation. When a consultant finishes a project or an engineer changes roles, their hard-won knowledge usually goes with them. With Data AI Agents, every design decision is captured in an Architecture Decision Record. Every pitfall discovered becomes a lesson the system remembers. Every pattern that worked gets generalised for the next project. The institutional knowledge compounds instead of evaporating.
This is version one
We want to be upfront: Data AI Agents is a work in progress. It is not perfect. There are platforms we have not added yet, patterns we are still refining, and edge cases we have not encountered.
But it already cuts months of work down to weeks. It already prevents dozens of known pitfalls. It already generates governance documentation that would take days to write manually. And it gets smarter with every single project because the learning loop is built into how it works.
We believe the best way to improve it is to let more people use it, break it, and contribute to it. That is why we are open-sourcing it.
Try it yourself
If you are an engineer or architect: The repository is public. Clone it, run /init-data-ai to set up your environment, point it at your project with /init-project, and start building. The README has everything you need to get started. If you find a bug or have a pattern to contribute, open an issue or a PR. That is how this gets better for everyone.
If you are evaluating data platform approaches for your organisation: We would love to show you what this looks like on a real project. The tool is free and open-source, but the experience of building production data platforms on Azure and Fabric is what makes it work. Book a 30-minute walkthrough with our data team and we will show you the difference.
Data AI Agents is available on GitHub: link to repo

