Data Modelling in the Cloud

Azure offers a broad spectrum of services for enabling an end-to-end data analytics flow.

Some of the most relevant ones include : Data Movement & Orchestration service (Azure Data Factory), Predictive Modelling (Azure Machine Learning), Big Data computing (Data Lake Family, HDInsight), Real-time processing (Event Hub + Azure Stream Analytics) and most importantly the Visualisation service for communicating the insights harvested from all above.

One of the key challenge I commonly experience in this flow is the lack of a semantic layer for harnessing the outputs from versatile Azure services. This semantic layer, often known as Data Modelling is what diverts people to rely on Power BI Desktop instead of architecting analytics solution entirely on the Cloud.

Microsoft has announced a preview Azure Analysis Services (i.e. Tabular on the Cloud) in an attempt to fill up the technology gap. While this service is destined for endorsement from an enterprise point of view, it is quite expensive to use at same time (read here for more detail)

Today, I will share with you a viable alternative for building such data model in the Cloud at virtually no cost.

Best of all, it only takes 3 steps!


I will use a demo to accompany the subsequent walk-through part of this blog.

The demo showcases a solution for monitoring office resource utilisation. The idea is to track resource availability on a custom floor plan in real-time. When a user claims a hot desk by plug-in their laptop, a JSON stream is fired at Power BI for it to flip the desk’s status to “occupied”. Similarly, when a user books for a meeting room at a certain hour for a certain duration, the map will reflect the usage accordingly.

Data Model

Albeit simple, to support such solution we will need some kind of data model for hosting the DAX calculations as well as for controlling Power BI’s filter propagation behaviour. The diagram below depicts how I envisage the model for this solution :

How-To Guide

Here I will use Power BI REST API for building the aforementioned data model.

First, we need to create a Power BI dataset.  A REST API dataset in Power BI consists of following few components:

  1. Dataset name (mandatory)
  2. Table schema (mandatory)
  3. Relationship (optional)
  4. Measure (optional)
  5. Calculated column (optional)

We use PUT method to call the endpoint to submit the dataset definitions.

Content-Type                    application/json

URI Parameter                 none


  Defines dataset name. In Power BI dataset name does not have to be unique.

  Defines table schema including column name, data type, formatting and aggregation method.

  Defines measure. A valid DAX expression for calculation.

  Defines relationship. A relationship type can be either OneDirection, Both or Auto.

Once submitted successfully, you will find the dataset appearing under the Power BI Dataset section instantaneously with the tables and schemas already created.

Next, we will ingest some data into the newly created data model.

We will call the{yourdatasetGUID}/tables/{yourtable}/rows endpoint to do this.

Content-Type                    application/json

URI Parameter                 your dataset GUID, your table name


We now have a functioning data model with some data loaded. We can use them to perform analysis and start creating powerful visualisations exactly the same way we do in Power BI Desktop.

Power BI REST API allows programatically accessing the Power BI resources. Anything from building a real-time solution (as seen in this blog) to enabling a meta-data driven report creation becomes feasible under this method. The application of which is boundless.

You can download a complete JSON scripts I used to build this demo from here

I certainly look forward to seeing more adoption of Power BI REST API from the community as well as in the commercial space.

The following two tabs change content below.

Jixin Jia

Jixin 'Gin' Jia (MBA, MCSE, MCP, ASA) is a certified Azure Solutions Architect. He is a MCSE in Business Intelligence and Data Management & Analytics. With a Mathematics, MBA and IT background, Jixin combines advanced analytics and modern BI technologies in building data-driven intelligence. He is a regular speaker about machine learning, quantitative analysis and Microsoft BI in many public forums. He is now an active Data Science program candidate at Harvard University.