Azure Data Architecture Guide
![](asset-1.png)
![](asset-2.png)
Agenda
![](asset-3.png)
Subway Map
![](asset-4.png)
Types of Data
![](asset-5.png)
Real-Time Streaming Architecture
![](asset-6.png)
Lambda Architecture
![](asset-7.png)
Hot Path
•View Actionable Data Quickly
•Likely Looking at an Aggregate
•Requires Professional Presentation
•Better if Ad-Hoc Querying is Supported
![](asset-8.png)
Cold Path
•Long-term Storage
•Must be cheap to query and store
•May be processed multiple times
•Analysis
•Prediction
•Learning
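The split between the hot and cold paths can be sketched in a few lines. This is a minimal, in-memory illustration and not an Azure implementation: the `LambdaPipeline` class, the `device`/`value` event fields, and the sample readings are all hypothetical. Every event is appended to cheap long-term storage (cold path) and also folded into a running aggregate that can be read immediately (hot path).

```python
from collections import defaultdict


class LambdaPipeline:
    """Toy Lambda architecture: every event feeds both paths (hypothetical names)."""

    def __init__(self):
        self.cold_store = []                  # cold path: append-only raw history
        self.hot_totals = defaultdict(float)  # hot path: running aggregate per device

    def ingest(self, event):
        # The same event goes to both paths.
        self.cold_store.append(event)
        self.hot_totals[event["device"]] += event["value"]

    def hot_view(self):
        # Fast, already-aggregated answer for dashboards.
        return dict(self.hot_totals)

    def batch_average(self):
        # Cold path: reprocess the full history; this can be rerun
        # any number of times with new logic.
        sums, counts = defaultdict(float), defaultdict(int)
        for event in self.cold_store:
            sums[event["device"]] += event["value"]
            counts[event["device"]] += 1
        return {device: sums[device] / counts[device] for device in sums}


pipeline = LambdaPipeline()
for device, value in [("a", 1.0), ("b", 4.0), ("a", 3.0)]:
    pipeline.ingest({"device": device, "value": value})

print(pipeline.hot_view())       # running totals per device
print(pipeline.batch_average())  # averages recomputed from raw history
```

The hot view is updated once per event and never revisits history, while the batch job rereads everything, which is why the cold store has to be cheap to keep and to query.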
![](asset-9.png)
Hot & Cold Path Analysis
•What happened?
•Why did it happen?
•What will happen?
•How can we make it happen?
Axes: Difficulty, Value
![](asset-a.png)
Kappa Architecture
![](asset-b.png)
IoT Message Processing
![](asset-c.png)
Data Warehousing
![](asset-d.png)
Data Warehousing
![](asset-e.png)
Data Warehouse Pattern
![](asset-f.png)
Data Warehouse Technology Choices
•SMP (small/medium data)
•Azure SQL Database
•SQL Server in a virtual machine
•Azure SQL Database Managed Instance
![](asset-10.png)
Data Warehouse Technology Choices
•MPP (big data)
•Azure SQL Data Warehouse
•Apache Hive on HDInsight
•Interactive Query (Hive LLAP) on HDInsight
![](asset-11.png)
SQL Data Warehouse
![](asset-12.png)
SQL Data Warehouse
Azure Storage
![](asset-13.png)
Data Warehousing
![](asset-14.png)
No-SQL
![](asset-15.png)
Document Data Stores
![](asset-16.png)
Key-Value Data Stores
![](asset-17.png)
Graph Data Stores
![](asset-18.png)
And Then There’s More…
•Columnar
•Object
•Time Series
•External Index
![](asset-19.png)
Processing JSON
![](asset-1a.png)
No-SQL Technology Choices
•Azure Data Factory
•Azure Logic Apps
•Azure Functions
•App Service
•Azure Data Lake Analytics
•Azure HDInsight
![](asset-1b.png)
No-SQL Technology Choices
•Spark SQL
•HBase
•Hive
•SQL Data Warehouse
•Azure Machine Learning Workbench
•SQL SSIS
![](asset-1c.png)
Azure Cosmos DB
Key-Value
Column-family
Graph
Documents
SQL
![](asset-1d.png)
Clickstream Analysis
![](asset-1e.png)
Storing Relational and No-SQL Data
![](asset-1f.png)
Processing JSON in Real-Time
![](asset-20.png)
Data Lake
Analytics
Storage
WebHDFS
YARN
Unstructured, Semi-Structured, Structured
U-SQL
![](asset-21.png)
HDInsight
Authoring Jobs App Integration
Core Hadoop
Consistent REST APIs
Breadth of Clients (Java, JS, .NET, etc.)
Authoring frameworks and languages
End User Tooling (IDEs, Analyst tools, Command lines)
Connectivity
Programmability
Security
Loosely coupled
Lightweight
Low cost to extend
Scenario oriented
Innovation flows upward
New compute models
Performance enhancements
Extend breadth & depth
Enable new scenarios
Integrate with current toolchains
![](asset-22.png)
HDInsight & Data Lake
Azure Data Lake Store
WebHDFS-compatible REST API
Azure HDInsight
Hadoop WebHDFS Client
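Because the store exposes a WebHDFS-compatible REST API, a client addresses files by building URLs of the form `<base>/webhdfs/v1/<path>?op=<OPERATION>`. The sketch below only constructs such URLs and makes no network call. The `webhdfs_url` helper and the `example` account name are hypothetical, and the sketch leaves out authentication, which a real request to Azure Data Lake Store would need.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(base, path, op, **params):
    """Build a WebHDFS-style URL (hypothetical helper; no request is sent)."""
    query = urlencode({"op": op, **params})
    return f"{base.rstrip('/')}/webhdfs/v1/{quote(path.lstrip('/'))}?{query}"


# "example" is a placeholder account name, not a real store.
base = "https://example.azuredatalakestore.net"
list_url = webhdfs_url(base, "/logs/2018", "LISTSTATUS")
open_url = webhdfs_url(base, "/logs/2018/app.log", "OPEN", offset=0)
print(list_url)
print(open_url)
```

`LISTSTATUS` and `OPEN` are standard WebHDFS operations, so the same URL shape works against a Hadoop NameNode by changing only the base address.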
![](asset-23.png)
On-Demand Big Data Analytics
![](asset-24.png)
Natural Language Processing
![](asset-25.png)
Processing Free-Form Text
![](asset-26.png)
Natural Language Processing (NLP)
“Great to meet you! I need to extend my booking next week by one day. Can you also book me a car?”
![](asset-27.png)
Processing Free-Form Text using NLP
![](asset-28.png)
Language Understanding (LUIS)
•Create your own LU model
•Train by providing examples
•Deploy to an HTTP endpoint and activate on any device
•Maintain model with ease
![](asset-29.png)
Natural Language Processing Technology Choices
•Azure HDInsight
•with Spark and Spark MLlib
•Microsoft Cognitive Services
•LUIS
•Bing Search APIs
![](asset-2a.png)
Speech-to-Text Translation
Automatic Speech Recognition: "can can you here me"
TrueText: "Can you hear me?"
Machine Translation
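The stages on this slide compose as a simple pipeline: raw recognizer output is cleaned up before it is translated. The sketch below is a toy stand-in and does not call any Microsoft service. `clean_transcript`, `translate_speech`, and `QUESTION_STARTERS` are hypothetical names, the cleanup only drops immediately repeated words and adds capitalization and end punctuation, and it does not fix homophones such as "here" versus "hear" the way the slide's example does.

```python
# Hypothetical word list used to guess whether a sentence is a question.
QUESTION_STARTERS = {"can", "could", "do", "does", "is", "are", "will", "would"}


def clean_transcript(raw):
    """Toy cleanup: drop stuttered repeats, capitalize, add end punctuation."""
    words = raw.lower().split()
    deduped = [w for i, w in enumerate(words) if i == 0 or w != words[i - 1]]
    if not deduped:
        return ""
    mark = "?" if deduped[0] in QUESTION_STARTERS else "."
    sentence = " ".join(deduped)
    return sentence[0].upper() + sentence[1:] + mark


def translate_speech(raw_transcript, translate):
    """Run cleanup, then hand the cleaned text to any translation callable."""
    return translate(clean_transcript(raw_transcript))


cleaned = clean_transcript("can can you here me")
print(cleaned)
```

Passing the translator in as a callable keeps the cleanup step independent of whichever translation service sits behind it.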
![](asset-2b.png)
Speech-to-Speech Translation
Automatic Speech Recognition: "can can you here me"
TrueText: "Can you hear me?"
Machine Translation
Text to Speech
![](asset-2c.png)
Intelligent Applications
![](asset-2d.png)
Advanced Analytics & Deep Learning
![](asset-2e.png)
Data Pipeline
![](asset-2f.png)
Extract, transform, and load
![](asset-30.png)
Extract, load, and transform
![](asset-31.png)
Extract, Transform, Load (ETL)
![](asset-32.png)
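The difference between ETL and ELT is where the transform runs. The sketch below uses Python's built-in `sqlite3` as a stand-in for the target data store. The table names, the `RAW_ROWS` sample data, and the cleaning rule are all hypothetical. In the ETL function the rows are cleaned in application code before loading. In the ELT function the raw rows are loaded into a staging table first and the target engine does the transform in SQL.

```python
import sqlite3

# Hypothetical raw extract: amounts arrive as messy text.
RAW_ROWS = [("alice", " 10 "), ("bob", "25"), ("carol", "n/a")]


def parse_amount(text):
    text = text.strip()
    return int(text) if text.isdigit() else None


def run_etl(conn):
    # Transform outside the store, then load only clean rows.
    clean = [(name, parse_amount(amount)) for name, amount in RAW_ROWS]
    clean = [row for row in clean if row[1] is not None]
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", clean)


def run_elt(conn):
    # Load raw rows as-is, then transform inside the store with SQL.
    conn.execute("CREATE TABLE staging (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO staging VALUES (?, ?)", RAW_ROWS)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT name, CAST(TRIM(amount) AS INTEGER) AS amount "
        "FROM staging WHERE TRIM(amount) GLOB '[0-9]*'"
    )


conn = sqlite3.connect(":memory:")
run_etl(conn)
run_elt(conn)
etl_rows = conn.execute("SELECT name, amount FROM sales_etl ORDER BY name").fetchall()
elt_rows = conn.execute("SELECT name, amount FROM sales_elt ORDER BY name").fetchall()
print(etl_rows)
print(elt_rows)
```

Both routes end with the same clean table. ELT also keeps the raw rows in `staging`, so the transform can be rerun later with different rules.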
Semantic Modeling
![](asset-33.png)
Semantic Modeling
![](asset-34.png)
Online Analytical Processing Pattern (OLAP)
![](asset-35.png)
OLAP Technology Choices
•SQL Server with Columnstore indexes
•Azure Analysis Services
•SQL Server Analysis Services (SSAS)
![](asset-36.png)
Azure Analysis Services
•Data sources: SQL Database, SQL Data Warehouse, Data Lake, HDInsight/Spark, SQL Server/Oracle, Other
•Clients: Power BI, Power BI Desktop, Excel, Third-Party
•Capabilities: Data Modeling, Lifecycle Management, Security, Business Logic & Metrics, In-Memory Cache
![](asset-37.png)
Data Mart
•Discover: Search, Browse, Filter
•Understand: Metadata, Experts, Context
•Consume: Data Assets, Familiar Tools, Existing Processes
•Contribute: Tag, Document, Publish
![](asset-38.png)
Business Intelligence
![](asset-39.png)
Relational Data
![](asset-3a.png)
Relational Data
![](asset-3b.png)
Transactional Data
![](asset-3c.png)
Online Transaction Processing (OLTP) Pattern
![](asset-3d.png)
OLTP Technology Choices
•Azure SQL Database
•Azure SQL Database Managed Instance
•SQL Server in an Azure Virtual Machine
•Azure Database for MySQL
•Azure Database for PostgreSQL
![](asset-3e.png)
Thank You!
E-mail: sidney@seesharprun.net
Twitter: @sidney_andrews