This post is about module 8 of the Cloud Challenge series. It's the first in the series for which there was no Learn Live video session. The topic is not listed as covered by the DP-600 exam, although it is part of Microsoft Fabric and of the skills challenge collection. I am completing the three modules not covered in the videos simply because those who complete the challenge will receive a 50%-off voucher for the exam.
Previous posts in this series:
- Part 1 (Introduction, Lakehouses)
- Part 2 (Apache Spark)
- Part 3 (Delta Lake tables)
- Part 4 (Using Data Factory pipelines)
- Part 5 (Ingesting data with Dataflows Gen2)
- Part 6 (Getting started with data warehouses)
- Part 7 (Administration of Microsoft Fabric)
- Part 8 (Medallion architecture design)
- Part 9 (Spark & notebooks)
Microsoft Learn Module
Getting started with Real-Time Analytics in Microsoft Fabric
This module covers some background on where KQL is used and why, as well as some of the core commands. The exercise walks you through creating a KQL database and a table, and then running some queries against the data you load.
One note for others who haven't run through the lab yet: in the part of the exercise where the table is created and the data is loaded, the steps do not line up with the current wizard. I've submitted feedback on this, but you need to look under Advanced settings to find the option indicating that the first row contains the column headers. It's worded differently in the lab than it is in the wizard.
Key Takeaways
Synapse Real-Time Analytics is a fully managed service inside Microsoft Fabric, optimized for streaming time-series data at scale, wherever it sits on the continuum from unstructured to structured data. The data is stored in a KQL (Kusto Query Language) database. Some elements of KQL were familiar because of their similarities to SQL Server (tables, functions, views), while others, where the streaming data side comes in, were completely new to me.
KQL queries share some syntax with SQL, but extensions enable advanced text and pattern matching, statistical analysis, time-series projections and more. The management commands seem easy enough to understand and are more literal, like ".create table" and ".ingest into table X", and to query a table you don't write "select * from sales"; you simply write "sales" to get the data from the table "sales".
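As a rough sketch of what that looks like in practice (not the lab's exact steps; the table name, columns and storage URL below are placeholders I've made up to illustrate the commands mentioned above):

```kusto
// Create a table with an explicit schema (hypothetical columns).
.create table sales (SalesOrderNumber: string, OrderDate: datetime, Item: string, Quantity: int, UnitPrice: real)

// Ingest a CSV file into the table; the first row holds column headers, so it is skipped.
.ingest into table sales (@"https://example.blob.core.windows.net/data/sales.csv")
    with (format="csv", ignoreFirstRecord=true)

// Querying: the table name on its own returns its rows. No SELECT * FROM needed.
sales

// The pipe operator chains extensions such as filtering, aggregation and time binning.
sales
| where OrderDate > ago(30d)
| summarize TotalQuantity = sum(Quantity) by Item, bin(OrderDate, 1d)
| order by OrderDate asc
```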
Overall, I like the syntax of the KQL language; parts of it feel more intuitive to me than T-SQL.
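To illustrate what I mean (this comparison is my own, not from the module, and uses the same made-up table as above), here is a familiar T-SQL aggregation as a comment, followed by what the equivalent reads like in KQL as a top-to-bottom pipeline:

```kusto
// T-SQL: SELECT Item, SUM(Quantity) AS TotalQuantity
//        FROM sales
//        WHERE OrderDate >= DATEADD(day, -7, GETDATE())
//        GROUP BY Item
//        ORDER BY TotalQuantity DESC;

// The KQL version: filter, then aggregate, then sort, one step per line.
sales
| where OrderDate >= ago(7d)
| summarize TotalQuantity = sum(Quantity) by Item
| order by TotalQuantity desc
```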