In today’s data-driven world, visualization matters. Whether you’re a business analyst, data scientist or researcher, the ability to convert raw data into meaningful visuals can shape the quality of your decisions. That’s where Microsoft’s Data Formulator steps in: an open-source tool designed to help analysts create rich, AI-assisted visualizations.

Developed by Microsoft Research, Data Formulator combines artificial intelligence (AI) and user-friendly interaction to transform the way we explore, clean and visualize data. Let’s dive deep into how this innovative tool is reshaping the future of data analytics.
What is Microsoft Data Formulator?
Data Formulator is an AI-powered data visualization and transformation tool developed by the Microsoft Research team. Its primary goal is to bridge the gap between human intuition and machine intelligence in data analytics.
Unlike traditional chat-based AI systems that rely entirely on natural language prompts, Data Formulator takes a hybrid approach, blending natural language (NL) inputs with user interface (UI) interactions. This design allows users to describe what they want while maintaining hands-on control over their charts, fields and data relationships.
In simple terms, Data Formulator helps users iteratively build rich, dynamic visualizations by talking to the AI, dragging and dropping fields and letting the system generate transformations automatically.
Why Data Formulator Stands Out
Data Formulator isn’t just another visualization tool — it’s an intelligent data companion that can interpret context, analyze data relationships and even generate SQL queries behind the scenes to deliver the visualization you envision.
Here are some standout features that make it revolutionary:
1. AI-Driven Data Transformation
Data Formulator can interpret your goal and automatically transform raw data into the shape needed for analysis. Whether it’s aggregating sales by region or calculating trends over time, the AI generates the transformation so you don’t have to write it by hand.
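To make "aggregating sales by region" concrete, here is a minimal sketch of the kind of reshaping involved. It is plain Python written for illustration, not Data Formulator's own code; the `raw_sales` rows and the `total_by` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw data: one record per sale.
raw_sales = [
    {"region": "East", "month": "2024-01", "amount": 120.0},
    {"region": "East", "month": "2024-02", "amount": 80.0},
    {"region": "West", "month": "2024-01", "amount": 200.0},
    {"region": "West", "month": "2024-02", "amount": 50.0},
]


def total_by(rows, key):
    """Sum the 'amount' field of the rows, grouped by the given key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


# One table per chart: totals per region, and totals per month for a trend line.
sales_by_region = total_by(raw_sales, "region")
sales_by_month = total_by(raw_sales, "month")

print(sales_by_region)
print(sales_by_month)
```

Each result has exactly the fields a chart needs (one category and one value), which is the sort of intermediate table a visualization tool has to produce before it can draw anything.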
2. Interactive and Intuitive Interface
Users can choose chart types, drag and drop fields (X, Y, color, etc.) and modify visual encodings. The interface feels similar to tools like Power BI or Tableau but adds AI assistance for deeper analysis.
3. Multiple Data Source Integration
The tool supports major databases and storage systems including:
- MySQL
- PostgreSQL
- Microsoft SQL Server (MSSQL)
- Azure Data Explorer (Kusto)
- Azure Blob Storage
- Amazon S3 and more.
4. Anchoring and Branching
Introduced in version 0.1.7, the anchoring feature allows users to save an intermediate dataset, such as a cleaned or filtered version, and build new analyses on top of it. Because follow-up steps start from the anchored data, the AI agents stay focused on the relevant context and follow-up work tends to go faster.
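The idea behind anchoring can be sketched outside the tool. The example below is an illustration only, assuming an in-memory SQLite database as a stand-in; the `orders` and `orders_clean` tables are hypothetical and this is not how Data Formulator stores anchored data internally.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (product TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Widget", "Q1", 100.0),
        ("Widget", "Q2", 150.0),
        ("Gadget", "Q1", None),  # missing value to be cleaned out
        ("Gadget", "Q2", 90.0),
    ],
)

# "Anchor": materialize the cleaned dataset once.
conn.execute(
    "CREATE TABLE orders_clean AS "
    "SELECT * FROM orders WHERE revenue IS NOT NULL"
)

# Branch A: revenue per product, built on the anchored table.
by_product = conn.execute(
    "SELECT product, SUM(revenue) FROM orders_clean "
    "GROUP BY product ORDER BY product"
).fetchall()

# Branch B: revenue per quarter, built on the same anchored table.
by_quarter = conn.execute(
    "SELECT quarter, SUM(revenue) FROM orders_clean "
    "GROUP BY quarter ORDER BY quarter"
).fetchall()

print(by_product)
print(by_quarter)
```

Both branches read from `orders_clean`, so the cleaning step is done once and neither analysis needs to know how the raw table was filtered.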
5. Goal-Oriented Data Exploration
With the 0.2.2 update (July 2025), users can now start their analysis by defining a goal. Data Formulator’s AI then recommends exploration ideas, visual encodings and analytical directions, making the process proactive rather than reactive.
How AI Works in Data Formulator
At the heart of Data Formulator are large language models (LLMs) that interpret your intent and generate the necessary transformations. The platform supports models and providers such as:
- GPT-4o (OpenAI)
- Claude 3.5 Sonnet (Anthropic)
- Azure OpenAI
- Ollama and others, through its LiteLLM integration.
When you type a prompt such as “Show the top 5 products by revenue,” Data Formulator generates the SQL query, executes it and renders the visualization instantly. You can then follow up with prompts like “Filter only Q1 data” or “Color by profit margin,” continuing your exploration seamlessly.
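As a rough illustration of that flow, the sketch below shows the kind of SQL such prompts might translate to. It is an assumption-laden example: the `sales` table and its columns are hypothetical, SQLite stands in for the real backend, and the queries are hand-written rather than output captured from Data Formulator.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product TEXT, quarter TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("A", "Q1", 50), ("A", "Q2", 70),
        ("B", "Q1", 200),
        ("C", "Q1", 10), ("C", "Q3", 15),
        ("D", "Q2", 300),
        ("E", "Q1", 90),
        ("F", "Q4", 5),
    ],
)

# Prompt: "Show the top 5 products by revenue"
top_5 = conn.execute(
    """
    SELECT product, SUM(revenue) AS total_revenue
    FROM sales
    GROUP BY product
    ORDER BY total_revenue DESC
    LIMIT 5
    """
).fetchall()

# Follow-up prompt: "Filter only Q1 data"
top_5_q1 = conn.execute(
    """
    SELECT product, SUM(revenue) AS total_revenue
    FROM sales
    WHERE quarter = 'Q1'
    GROUP BY product
    ORDER BY total_revenue DESC
    LIMIT 5
    """
).fetchall()

print(top_5)
print(top_5_q1)
```

The follow-up only adds a `WHERE` clause to the first query, which is why iterating on a chart through short prompts can feel continuous rather than like starting over.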
Getting Started with Data Formulator
Microsoft has made it incredibly simple to get started with Data Formulator. There are three main setup options:
Option 1: Install via Python (Recommended)
You can install it from the Python Package Index (PyPI) using pip, then run it locally:
pip install data_formulator
data_formulator
It will automatically launch in your browser at http://localhost:5000.
You can also specify a custom port:
python -m data_formulator --port 8080
Option 2: GitHub Codespaces
If you prefer a cloud-based setup, you can run Data Formulator instantly in GitHub Codespaces with everything pre-configured for rapid development.
Option 3: Developer Mode
Developers who want full control can build the tool locally by following instructions in the DEVELOPMENT.md file. This mode allows customization and integration with your existing analytics systems.
Real-World Applications
Data Formulator isn’t limited to research or academia — it’s a practical tool for:
- Business Intelligence: Quickly visualize sales, trends and KPIs using AI-guided transformations.
- Data Science: Explore large datasets efficiently with AI-assisted querying and summarization.
- Education & Research: Help students and researchers understand data relationships visually.
- Software Development: Build new visualization tools on top of Data Formulator’s open-source framework.
Its MIT License lets both individuals and enterprises use, modify and build on it.
Backed by Research Excellence
Data Formulator is grounded in cutting-edge research by Microsoft’s leading scientists, including Chenglong Wang, Bongshin Lee, Steven Drucker, Dan Marshall and Jianfeng Gao.
Two major papers document its evolution:
- Data Formulator: AI-powered Concept-driven Visualization Authoring (IEEE TVCG, 2023)
- Data Formulator 2: Iteratively Creating Rich Visualizations with AI (arXiv, 2024)
These papers highlight how iterative design, AI interaction, and visualization theory combine to make data exploration more intuitive and intelligent.
The Future of AI in Data Visualization
The release of Data Formulator 0.2 marked a major milestone, introducing support for large datasets using DuckDB as a backend. This enables fast queries and responsive visual feedback, even for millions of data points.
As AI continues to evolve, tools like Data Formulator represent the next generation of analytics platforms, where natural language, automation and visualization merge into one seamless experience.
Conclusion
Microsoft Data Formulator is more than just a visualization tool – it’s a revolution in data analytics. By combining AI’s computational intelligence with human creativity, it allows users to explore data faster, smarter and more meaningfully.
Whether you’re a developer, analyst or researcher, this open-source platform offers a glimpse into the future of human-AI collaboration in analytics.
If you’re ready to experience it yourself, visit the Data Formulator GitHub Repository and start creating intelligent visualizations today.