Manage your Data Warehousing Challenges with Advanced Data Analytics

The Cortana Analytics Suite (CAS) is made up of different components in Azure, allowing users to custom build an analytical application to suit a wide range of analytics scenarios such as real-time recommendations, customer churn forecasting, fraud detection, and predictive maintenance just to name a few.

In this post, we’ll look at four problems with traditional data warehouses and show how the new Azure SQL Data Warehouse (part of the CAS) overcomes them and makes analytics available to organizations of all sizes.

Problem #1: Trying to Predict Future Growth

When developing a new data warehouse, one of the first steps is sizing and commissioning hardware requirements. However, sizing a data warehouse for both storage and processing can be difficult as you only know your present source data needs and therefore have to predict the rest. Also, purchasing and configuring hardware can be cost prohibitive.

Solution: Pay as You Grow

Implementing a fully managed Azure SQL Data Warehouse requires no large, upfront cost as it’s deployed in the cloud in minutes and you only pay for what you use. The cost of the SQL Data Warehouse compute depends on the amount of Data Warehouse Units (DWU) available, and as data volumes increase the compute can increase. For instance, during ETL (Extract, Transform, Load) processing, additional DWUs can be added and then scaled back after the ETL process is complete. This provides a fast, simple and elastic way of managing your data warehouse processing needs. Another nice feature of Azure is that you pay for your storage and compute separately, which reduces cost and increases flexibility.

Problem #2: Exponential Data Growth

It’s estimated we create 2.5 quintillion bytes of data each day and 90% of the data in the world today has been created in the last two years. These statistics become less surprising when you consider that in today’s connected world technology allows us to create, collect and exchange data between systems, providing a rich set of data for analysis. However, much of this data is unstructured, in silos, making it difficult to gain insights into its raw form.

Solution: Query and Integrate Unstructured Data with PolyBase

Azure SQL Data Warehouse provides PolyBase (a technology that accesses data outside the database) to query and integrate data from multiple sources using the T-SQL (Transact-SQL) language to integrate any data with traditional structured sources. Unstructured or semi-structured data can be accessed from other Cortana Analytics components such as an HDInsight (Hadoop) instance, where data is loaded from an Azure Data Factory (ADF) pipeline or data streaming in from event hubs and stream analytics.

​​​Problem #3: Growing Historical Data Restricts Predictive Analytics

Data warehouses often contain a lot of historical data which users need to analyze what has happened in the past. However, forward-looking predictive and prescriptive analytics are often missing.

Solution: Azure Machine Learning Provides a Fully Managed Service

Once data is integrated in the SQL Data Warehouse, Cortana Analytics’ machine learning capabilities can be used to develop data models. Machine Learning provides a fully managed service to create predictive analytical solutions, giving users the ability to create their own models or use an existing model available in the Cortana Analytics Suite.

Data in SQL Data Warehouse can be used as a source for the machine learning model and further data analysis and cleansing can be applied to prepare the data for the model. A range of statistical algorithms can be applied to data sets to provide predictive results. These results can be saved to the SQL Data Warehouse or the model can be exposed via an API.

Problem #4: Visualizing Results

The ability to easily visualize data, quickly modify views and interrogate data can be difficult and time-consuming due to the difficulty and effort required to update reports and dashboards.

Solution: Leverage Power BI

Power BI provides a seamless way to analyze data by providing SQL data warehouse direct connect as well as other connectors for familiar data sources. With Power BI users, can create rich interactive reports and real-time dashboards that can be shared in seconds. It comes with a rich set of visualizations which can be used for a personalized reporting experience.

At SNP, we have deep experience and expertise in ETL and data warehousing. Learn how SNP deployed a data warehousing solution on Microsoft Azure for St. Vincent’s Health Partners Helping Them Save a Projected $100,000 a Year

Data Warehousing Solutions with Advanced Data Analytics
Business
Sachin Parikh Linkedin Profile

Sachin Parikh

Sachin Parikh is the VP Business Development at SNP Technologies. With over 21 years in the technology space, he is responsible for delivering business value via business development, sales, marketing, product evangelization, strategic alliances, business operations and driving revenue KPIs.