Slowly Changing
Dimension Techniques
Type 0: Retain
Original
With slowly changing
dimension type 0, the dimension attribute value never changes, so facts
are always grouped by this original value. Type 0 is appropriate for any
attribute labeled “original,” such as a customer’s original credit score or a
durable identifier. It also applies to most attributes in a date dimension.
Type 1: Overwrite
With slowly changing
dimension type 1, the old attribute value in the dimension row is
overwritten with the new value; type 1 attributes always reflects the most
recent assignment, and therefore this technique destroys history. Although this
approach is easy to implement and does not create additional dimension rows,
you must be careful that aggregate fact tables and OLAP cubes affected by this
change are recomputed.
Type 2: Add New Row
Slowly changing dimension
type 2 changes add a new row in the dimension with the updated attribute
values. This requires generalizing the primary key of the dimension beyond the
natural or durable key because there will potentially be multiple rows
describing each member. When a new row is created for a dimension member, a new
primary surrogate key is assigned and used as a foreign key in all fact tables
from the moment of the update until a subsequent change creates a new dimension
key and updated dimension row. A minimum of three additional columns should be
added to the dimension row with type 2 changes: 1) row effective date or
date/time stamp; 2) row expiration date or date/time stamp; and 3) current row
indicator.
Type 3: Add New
Attribute
Slowly changing
dimension type 3 changes add a new attribute in the dimension to
preserve the old attribute value; the new value overwrites the main attribute
as in a type 1 change. This kind of type 3 change is sometimes called an
alternate reality. A business user can group and filter fact data by either the
current value or alternate reality. This slowly changing dimension technique is
used relatively infrequently.
Type 4: Add
Mini-Dimension
Slowly changing
dimension type 4 is used when a group of attributes in a dimension rapidly
changes and is split off to a mini-dimension. This situation is
sometimes called a rapidly changing monster dimension. Frequently used
attributes in multimillion-row dimension tables are mini-dimension design
candidates, even if they don’t frequently change. The type 4 mini-dimension
requires its own unique primary key; the primary keys of both the base
dimension and mini-dimension are captured in the associated fact tables.
Type 5: Add
Mini-Dimension and Type 1 Outrigger
Slowly changing
dimension type 5 is used to accurately preserve historical attribute
values, plus report historical facts according to current attribute values.
Type 5 builds on the type 4 mini-dimension by also embedding a current type 1
reference to the mini-dimension in the base dimension. This enables the
currently-assigned mini- dimension attributes to be accessed along with the others in the base
dimension without linking through a fact table. Logically, you’d represent the
base dimension and mini-dimension outrigger as a single table in the
presentation area. The ETL team must overwrite this type 1 mini-dimension
reference whenever the current mini-dimension assignment changes.
Type
6: Add Type 1 Attributes to Type 2 Dimension
Like
type 5, slowly changing dimension type 6 also delivers both historical
and current dimension attribute values. Type 6 builds on the type 2 technique
by also embedding current type 1 versions of the same attributes in the
dimension row so that fact rows can be filtered or grouped by either the type 2
attribute value in effect when the measurement occurred or the attribute’s
current value. In this case, the type 1 attribute is systematically overwritten
on all rows associated with a particular durable key whenever the attribute is
updated.
Type
7: Dual Type 1 and Type 2 Dimensions
Slowly changing dimension type 7 is the final hybrid
technique used to support both as-was and as-is reporting. A fact table can be
accessed through a dimension modeled both as a type 1 dimension showing only
the most current attribute values, or as a type 2 dimension showing correct
contemporary historical profiles. The same dimension table enables both
perspectives. Both the durable key and primary surrogate key of the dimension
are placed in the fact table. For the type 1 perspective, the current flag in
the dimension is constrained to be current, and the fact table is joined via
the durable key. For the type 2 perspective, the current flag is not
constrained, and the fact table is joined via the surrogate primary key. These
two perspectives would be deployed as separate views to the BI applications.
No comments:
Post a Comment