Dimensional modeling provides a set of methods and concepts that are used in data warehouse (DW) design. According to DW consultant Ralph Kimball, dimensional modeling is a design technique for databases intended to support end-user queries in a data warehouse. It is oriented around understandability and performance. According to him, although transaction-oriented ER modeling is very useful for transaction capture, it should be avoided for end-user delivery.
Dimensional modeling always uses fact and dimension tables. Facts are numerical values that can be aggregated and analyzed. Dimensions define hierarchies and descriptions for the fact values.
Dimension Table
A dimension table stores the attributes that describe objects in a fact table. A dimension table has a primary key that uniquely identifies each dimension row. This key is used to associate the dimension table with a fact table.
Dimension tables are normally de-normalized, as they are not created to execute transactions and are only used to analyze data in detail.
Example
In the following dimension table, the customer dimension normally includes the customer's name, address, customer ID, gender, income group, education level, etc.
Customer ID | Name | Gender | Income | Education | Religion |
---|---|---|---|---|---|
1 | Brian Edge | M | 2 | 3 | 4 |
2 | Fred Smith | M | 3 | 5 | 1 |
3 | Sally Jones | F | 1 | 7 | 3 |
Fact Tables
A fact table contains numeric values which are called measurements. A fact table has two types of columns − facts and foreign keys to dimension tables.
Measures in a fact table are of three types −
- Additive − Measures that can be added across any dimension.
- Non-Additive − Measures that cannot be added across any dimension.
- Semi-Additive − Measures that can be added across some dimensions but not others.
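The three kinds of measures can be illustrated with a minimal Python sketch. The account snapshot data below is invented for illustration and is not part of the sales example in this chapter: deposits are treated as additive, account balance as semi-additive (it can be summed across accounts but not across days), and the interest rate as non-additive.

```python
# Hypothetical daily account snapshots: (day, account, deposits, balance, rate)
snapshots = [
    ("2010-01-01", "A", 100, 500, 0.02),
    ("2010-01-01", "B", 50, 300, 0.03),
    ("2010-01-02", "A", 20, 520, 0.02),
    ("2010-01-02", "B", 0, 300, 0.03),
]

# Additive: deposits can be summed across every dimension (days and accounts).
total_deposits = sum(dep for _, _, dep, _, _ in snapshots)

# Semi-additive: balances can be summed across accounts within one day,
# but summing them across days would count the same money twice.
balance_by_day = {}
for day, _, _, balance, _ in snapshots:
    balance_by_day[day] = balance_by_day.get(day, 0) + balance
latest_day = max(balance_by_day)  # ISO dates sort chronologically as strings
closing_balance = balance_by_day[latest_day]

# Non-additive: rates cannot be summed at all; here a balance-weighted
# rate is recomputed from the latest day's rows instead.
latest = [s for s in snapshots if s[0] == latest_day]
weighted_rate = sum(s[3] * s[4] for s in latest) / sum(s[3] for s in latest)

print(total_deposits, closing_balance, round(weighted_rate, 4))
```

Summing the balance column over all four rows would give 1620, which is not a meaningful figure; that is what makes the measure semi-additive rather than additive.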
Example
Time ID | Product ID | Customer ID | Unit Sold |
---|---|---|---|
4 | 17 | 2 | 1 |
8 | 21 | 3 | 2 |
8 | 4 | 1 | 1 |
This fact table contains foreign keys for the time dimension, product dimension, and customer dimension, plus the measurement value, units sold.
Suppose a company sells products to customers. Every sale is a fact that happens within the company, and the fact table is used to record these facts.
Common facts are − number of units sold, margin, sales revenue, etc. The dimension tables list factors like customer, time, product, etc. by which we want to analyze the data.
Now if we consider the above fact table and customer dimension, there will also be a product dimension and a time dimension. Given this fact table and these three dimension tables, we can ask questions like: How many watches were sold to male customers in 2010?
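A question like this translates into a join between the fact table and its dimension tables. The following is a minimal sketch using Python's built-in sqlite3 module. The fact rows and customer rows are taken from the example tables above; the table and column names, the product names, and the years assigned to each time ID are assumptions made up for illustration, since the chapter does not show the product and time dimensions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_dim (customer_id INTEGER PRIMARY KEY, name TEXT, gender TEXT);
CREATE TABLE product_dim  (product_id  INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE time_dim     (time_id     INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE sales_fact (
    time_id     INTEGER REFERENCES time_dim(time_id),
    product_id  INTEGER REFERENCES product_dim(product_id),
    customer_id INTEGER REFERENCES customer_dim(customer_id),
    units_sold  INTEGER
);
INSERT INTO customer_dim VALUES (1, 'Brian Edge', 'M'), (2, 'Fred Smith', 'M'), (3, 'Sally Jones', 'F');
-- Product names and years are invented for this sketch.
INSERT INTO product_dim VALUES (4, 'Watch'), (17, 'Clock'), (21, 'Radio');
INSERT INTO time_dim VALUES (4, 2009), (8, 2010);
INSERT INTO sales_fact VALUES (4, 17, 2, 1), (8, 21, 3, 2), (8, 4, 1, 1);
""")

# Each dimension filters the fact rows; the measure is then summed.
query = """
SELECT COALESCE(SUM(f.units_sold), 0)
FROM sales_fact AS f
JOIN customer_dim AS c ON c.customer_id = f.customer_id
JOIN product_dim  AS p ON p.product_id  = f.product_id
JOIN time_dim     AS t ON t.time_id     = f.time_id
WHERE p.product_name = 'Watch' AND c.gender = 'M' AND t.year = 2010
"""
watches_sold = conn.execute(query).fetchone()[0]
print(watches_sold)
```

The filter conditions all refer to dimension attributes, while the only value being summed comes from the fact table, which mirrors the division of roles between the two kinds of tables.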
Difference between Dimension and Fact Table
The functional difference between dimension tables and fact tables is that fact tables hold the data we want to analyze, while dimension tables hold the information required to allow us to query it.
Aggregate Table
An aggregate table contains aggregated data which can be calculated by using different aggregate functions.
An aggregate function is a function where the values of multiple rows are grouped together as input on certain criteria to form a single value of more significant meaning or measurement.
Common aggregate functions include −
- Average()
- Count()
- Maximum()
- Median()
- Minimum()
- Mode()
- Sum()
These aggregate tables are used for performance optimization when running complex queries in a data warehouse.
Example
Suppose you store tables with data aggregated at yearly (1 row), quarterly (4 rows), and monthly (12 rows) levels, and you now need to compare data. For a yearly comparison, only 1 row is processed, whereas in an un-aggregated table all the rows would have to be processed.
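The row-count saving can be sketched as follows, assuming a hypothetical daily_sales detail table (its name, columns, and data are invented for illustration). Monthly and yearly aggregate tables are built from it once, and the yearly figure can then be read from a single row. The quarterly level is omitted to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (sale_date TEXT, amount INTEGER)")

# Hypothetical detail rows: two sales of 10 in every month of 2010.
rows = [(f"2010-{m:02d}-{d:02d}", 10) for m in range(1, 13) for d in (1, 15)]
conn.executemany("INSERT INTO daily_sales VALUES (?, ?)", rows)

# Pre-compute the aggregate tables once, at load time.
conn.execute("""
CREATE TABLE monthly_sales AS
SELECT substr(sale_date, 1, 7) AS month, SUM(amount) AS amount
FROM daily_sales GROUP BY substr(sale_date, 1, 7)
""")
conn.execute("""
CREATE TABLE yearly_sales AS
SELECT substr(month, 1, 4) AS year, SUM(amount) AS amount
FROM monthly_sales GROUP BY substr(month, 1, 4)
""")

def row_count(table):
    # Table names come from the fixed tuple below, never from user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]

counts = {t: row_count(t) for t in ("daily_sales", "monthly_sales", "yearly_sales")}
yearly_total = conn.execute("SELECT amount FROM yearly_sales").fetchone()[0]
detail_total = conn.execute("SELECT SUM(amount) FROM daily_sales").fetchone()[0]
print(counts, yearly_total, detail_total)
```

Both totals agree, but the yearly table answers the question from 1 row while the detail table has to scan all 24.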
Function | Description |
---|---|
MIN | Returns the smallest value in a given column |
MAX | Returns the largest value in a given column |
SUM | Returns the sum of the numeric values in a given column |
AVG | Returns the average value of a given column |
COUNT | Returns the total number of values in a given column |
COUNT(*) | Returns the number of rows in a table |
SELECT AVG(salary) FROM employee WHERE title = 'Developer'. This statement returns the average salary for all employees whose title is equal to 'Developer'.
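The aggregate functions listed above can be tried out with a small sqlite3 sketch. It assumes a hypothetical employee table with name, title, and salary columns; the rows are invented, and one salary is left NULL to show how COUNT on a column differs from COUNT(*).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, title TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Ann", "Developer", 60000),
        ("Bob", "Developer", 80000),
        ("Cid", "Developer", None),  # salary unknown, stored as NULL
        ("Dee", "Manager", 90000),
    ],
)

# All six aggregates computed over the developers only.
stats = conn.execute(
    "SELECT MIN(salary), MAX(salary), SUM(salary), AVG(salary), "
    "COUNT(salary), COUNT(*) "
    "FROM employee WHERE title = 'Developer'"
).fetchone()
print(stats)
```

In SQLite, as in standard SQL, AVG and COUNT on a column skip NULL values, so the average here is taken over two salaries even though three developer rows exist.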
Aggregations can be applied at the database level. You can create aggregates and store them in aggregate tables in the database, or you can apply aggregation on the fly at the report level.
Note − If you store aggregates at the database level, it saves time and provides performance optimization.