Data Warehouse Concepts Overview

     If you are thinking of becoming a Business Intelligence developer you need to be good at data warehousing concepts. Commonly people take concepts of data warehousing very casually, which eventually becomes a hurdle in gaining expertise in business intelligence concepts. 

    Features of data warehousing that had made it so popular are: Subject-orientedness, Non-volatile data, Integrated and Organised data store. The data is segregated and stored with respect to the different subjects also called Data mart which makes it subject-oriented. We also store the historical data in a data warehouse and ideally never delete anything which makes it Non-volatile. It also establishes or Integrates a common unit of measurement after collecting data from different sources and different types. We have various schemas which makes it more organize and help us to access the data warehouse easily. 



    Data warehousing is a process of connecting and managing different data sources to analyze the data and provide better meaningful business insights. A data warehouse is a massive store and to manage any store with such magnitude we require a proper structure and process. Following are important concepts that one needs to be aware of to manage and design any data warehouse.

  1. Fact Table 
  2. Dimension table
  3. Schema
  4. Primary key
  5. Foreign Key

1. Fact Table: 

    Fact table consists of facts of a particular dimension model in data warehousing. The fact table is always the center table in the model that is surrounding by all the dimension tables. Fact table consists of measures and key column values only. We have a limited number of fact tables in a data warehouse. Fact table contains the foreign key from different dimension tables which also act as a key column.

2. Dimension table:

    Dimension table shows us all the characteristics of the data. Dimension table can describe the data with respect to each subject, for example: For every important entity in the model, we have a dimension table with all its characteristics. Dimension table has the primary key. All dimension tables are connected to the central fact table.


3. Schema:

    Schema is the blueprint of every data warehouse. There are 3 most popular schemas in data warehousing they are as follows:
  • Star Schema
  • Snowflake Schema
  • Fact Constellation schema or Galaxy Schema
  1. Star Schema: With one fact table in the center and surrounded by all the dimension tables to form a star-like structure as shown in the above diagram is called a star schema.
  2. Snowflake Schema: With one fact table in the center and surrounded by all the dimension tables and one or multiple derived dimension tables connected to form a snowflake-like structure is called a snowflake schema.
  3. Fact Constellation schema or Galaxy Schema: With Star or snowflake schema if we have multiple fact tables connected to one another then that structure is called Fact Constellation schema or Galaxy Schema.

4. Primary key:

The primary key is a constraint that uniquely identifies each record in a table. Because primary key contains unique values and can not contain any NULL values. The table can have only one primary key and the primary key can be made from one or multiple columns. 

5. Foreign Key:

Foreign key is a constraint that uniquely identifies each record from the reference table and acts as its reference in another table. A foreign key can not contain any NULL values. The table can have multiple Foreign keys. A primary key acts as a foreign key in another table.

I have tried to explain in brief the data warehousing and its concepts in the most simple language. The data warehouse is a vast topic and can not be covered in one blog, So If you find this helpful please follow for more blogs on similar topics. Hope it was helpful!

Comments

Post a Comment

Popular posts from this blog

Business Intelligence tools and its Categories

What is Data from Business Intelligence prespective?