Blog
Oct 8, 2025 - 25 MIN READ
Microsoft Fabric as an All-in-One Analytics Solution

Microsoft Fabric as an All-in-One Analytics Solution

A deep dive into Microsoft Fabric — OneLake, Workspaces, Lakehouse, Warehouse, Apache Spark, and how it compares to Data Mesh and decentralised approaches.

Radek Řezáč

Radek Řezáč

The data architecture landscape has oscillated repeatedly between centralised and decentralised governance models. Microsoft Fabric represents a strong push back toward centralisation — bringing together data movement, data engineering, data science, real-time analytics, and business intelligence into a single unified platform (SaaS). No separate services, no data movement between tools.

Centralised vs. Decentralised Data Management

Centralised Data Management — data from different sources gathered and stored in one central database, warehouse, or data lake. A single repository for managing, storing, and using data. Easier governance, simpler access control.

Decentralised Data Management — no central repository. Data distributed across different nodes or domains. Teams have direct access without third parties. More agile for autonomous teams, harder to govern at scale.

Data Mesh vs. Data Fabric

Data Mesh — a decentralised approach to data architecture that promotes domain-oriented ownership. Each domain team is responsible for its own data pipelines, governance, and quality. Data is treated as a product with defined contracts.

Data Fabric — a centralised approach to data architecture and management. An end-to-end, unified analytics platform that brings together all the data and analytics tools an organisation needs.

Microsoft Fabric

Microsoft Fabric is Azure's implementation of the Data Fabric approach. It integrates Azure Data Factory, Azure Synapse Analytics, Power BI, and OpenAI Service into a single unified product. Everything — data lake, data engineering, data integration, Power BI, real-time analytics — operates in one environment. It is fully managed (SaaS): no infrastructure provisioning, no service stitching, no data movement between vendors.

Microsoft Fabric Fundamentals

OneLake

OneLake serves as the central data repository — a managed data lake underpinning all of Fabric.

  1. Unified Storage — a single, unified storage system for all data assets. No need to piece together separate storage services.
  2. Data Accessibility — rather than physically moving data from AWS or Azure, OneLake allows you to create shortcuts to external files. Access data without complex pipelines.
  3. Integration — seamlessly integrates with all Fabric components.
  4. Governance and Security — built-in features for data governance and access control.

Workspace

A workspace is a dedicated environment for managing and organising data projects.

  1. Segmentation — a folder or segment designated for a specific project or department.
  2. Collaboration — team members are added with specific roles: admin, member, contributor, or viewer.
  3. Item Creation — create datasets, reports, data pipelines, notebooks, dashboards, and more.
  4. Capacity Assignment — workspaces must be assigned to a Fabric capacity to access platform features.

Lakehouse

A Lakehouse combines the functionality of a data lake and a data warehouse.

  1. Hybrid Storage — stores structured data (tables), semi-structured, and unstructured data (CSV, JSON). Suitable for analytics workloads and machine learning.
  2. Delta Tables — optimised for high performance, supporting both batch and real-time data processing.
  3. Centralised Location — a single location for storing, managing, and analysing files and data.
  4. Integration — works with Apache Spark, Delta Lake, and open-source tooling.

SQL Analytics Endpoint

The SQL Analytics Endpoint provides a connection interface for interacting with Lakehouse data using SQL.

  • Provides a connection string usable in SSMS, Power BI, and other tools
  • Includes a visual query builder for users less comfortable with SQL
  • Enables SQL queries against Delta tables without writing Spark code
  • Read-only via SQL — write operations require Spark or Fabric Pipelines

Shortcuts

A shortcut references a data table without creating a redundant copy. It acts as a link, allowing you to interact with data from external sources (SQL databases, ADLS, S3) as if it were native Lakehouse data — without ingesting it. Any modifications to the original table are reflected through the shortcut automatically.

Warehouse

The Warehouse is a specialised layer for high-performance analytics on structured data, utilising T-SQL.

Lakehouse vs. Warehouse:

LakehouseWarehouse
Data TypesStructured, semi-structured, unstructuredStructured (relational) only
Write via SQLRead-onlyRead and write
Best forML, diverse formats, batch + streamingHigh-performance SQL analytics, BI
Underlying formatDelta ParquetDelta Parquet

Both are built on Delta tables in Parquet format, stored in OneLake. The Warehouse additionally supports T-SQL DDL (CREATE TABLE, INSERT, UPDATE, DELETE).

Power BI Semantic Model

The semantic model (formerly called datasets) is a data structure that organises data and defines relationships between tables, serving as the foundation for Power BI reports.

  • Relationships — maintains table relationships for accurate analysis
  • Measures — computed on-the-fly when the report runs; not stored physically
  • Performance — reports reference the centralised Lakehouse data directly, avoiding redundancy

Semantic Model vs. Tabular Model:

Power BI Semantic ModelTabular Model (SSAS)
Data storageReferenced from Lakehouse — no duplicationCan be imported (in-memory) or DirectQuery
Usage contextPower BI Service reportingAdvanced modelling, SSAS, on-premises
Write environmentPower BI Service / FabricPower BI Desktop / SSAS

Row-Level Security (RLS)

RLS in the semantic model restricts data access for specific users based on roles:

  • Roles — defined with DAX filters that determine what data is visible
  • Dynamic filteringUSERNAME() or USERPRINCIPALNAME() automatically filter data based on login credentials
  • Assignment — users are assigned to roles in Power BI Service after publishing

Apache Spark in Fabric

Apache Spark provides the distributed compute layer for Fabric's data engineering and data science workloads.

Strengths:

  1. Distributed Computing — operations execute in parallel across a cluster
  2. In-Memory Processing — data cached in memory, not disk — significantly faster than traditional approaches
  3. Resilient Distributed Datasets (RDDs) — fault-tolerant data partitions distributed across the cluster
  4. Multi-Language Support — Scala, Python, Java, SQL
  5. Batch and Streaming — handles both workload types natively
  6. ML Integration — MLlib and integration with MLflow for machine learning

Loading Data from a Lakehouse

df = spark.read.format("csv") \
    .option("header", "true") \
    .option("inferSchema", "true") \
    .load("lakehouse_path")

df.show()

Writing a DataFrame Back to a Lakehouse

As a Delta table (recommended):

df.write.format("delta").saveAsTable("tablename")

As a CSV file:

df.write.csv("lakehouse_path")

Verify the write:

new_df = spark.read.format("delta").load("lakehouse_path")

Temporary Views

Temporary views allow you to query DataFrames using SQL within the same Spark session:

# Create a temporary view
sales.createOrReplaceTempView("sales_temp_view")

# Query it with SQL
result = spark.sql("SELECT * FROM sales_temp_view")
result.show()

Temporary views are session-scoped — they disappear when the notebook session ends. Use them for ad-hoc analysis and joining multiple DataFrames with SQL syntax.

Summary

Microsoft Fabric is a significant bet on centralised analytics. For organisations running fragmented stacks — separate ADF, Synapse, Power BI, and Databricks environments — Fabric offers a compelling simplification. The trade-off is lock-in to Microsoft's ecosystem. For teams already deeply invested in Azure, that trade-off is often favourable.

Radek Řezáč • Senior Lead Data Engineer • © 2026