Data Warehousing

Data Warehousing

Define

What is Data Warehousing?

Data warehouse architecture is the design and building blocks of the modern data warehouse. With the evolution of technology and demands of the data-driven economy, multi-cloud architecture allows for the portability to relocate data and workloads as the business expands.

Learn More

TYPES OF DATA WAREHOUSE ARCHITECTURE

1
Single-Tier

Single-tier architecture, which aims to deduplicate data to minimize the amount of stored data.

2
Two-Tier

Two-tier architecture, which separates physical data sources from the data warehouse, making it incapable of expansion or supporting many end users.

3
Three-Tier

The bottom tier, the database of the data warehouse servers
The middle tier, an online analytical processing (OLAP) server providing an abstracted view of the database for the end-user
The top tier, a front-end client layer consisting of the tools and APis used to extract data .

Data Warehouse Database



The essential component of data warehouse, a database stores and provides access to all business data. Cloud-based database services include Amazon Redshift and Azure SQL.

Extraction, Transformation, and Loading Tools (ETL)



Traditional ETL tools extract data from various sources, transform it into a digestible format, and load it into a data warehouse.



Metadata provides a framework for and descriptions of data, enabling the construction, storage, handling, and use of the data.

Data Warehouse Access Tools



Access tools allow users to discover actionable information from warehouse data. These warehouse tools can include query and reporting tools, application development tools, data mining tools, and OLAP tools. Essential Characteristics of Data Warehouse Architecture While traditional architectures were designed and deployed for on-premises environments, modern data warehousing solutions should capitalize on the cloud’s benefits. A cloud-optimized data warehouse architectures should have these attributes: Centralized storage for all data Independent scaling of computing and storage resources Near-unlimited concurrency without competing for resources Load and query data simultaneously without degrading performance Replicate data across multiple regions and clouds to enhance business continuity and simplify expansion Share data without setting up APIs or establishing cumbersome ETL procedures A robust metadata service that applies across the entire system The ability to take advantage of storage-as-a-service, where data storage expands and contracts automatically and transparently to the user

SNOWFLAKE’S ARCHITECTURE



Snowflake is built on a patented, multi-cluster, shared data architecture. It was created for the cloud to revolutionize data warehousing, data lakes, data analytics, and a host of other use cases.
Snowflake uses a central data repository for persisted data accessible from all compute nodes in the data warehouse. Snowflake also processes queries using massively parallel processing (MPP) compute clusters where each node in the cluster stores a portion of the entire data set locally. This approach offers both the data management simplicity of a shared-disk architecture and the performance and scale-out benefits of a shared-nothing architecture.
Snowflake can easily accommodate both ETL and ELT, but with secure data sharing capabilities and high, on-demand elasticity, can also eliminate the need for traditional extract, transform load processes, which are often resource- and bandwidth constrained.

Our Technology Stack


CSS 3
Amazon RDS
MongoDB
MongoDB
MySQL
MySQL