Friday, December 30, 2011

Types of Dimensions

The Type property setting provides information about the contents of a dimension to server and client applications. In some cases, the Type setting only provides guidance for client applications and is optional. In other cases, such as Accounts or Time dimensions, the Type property settings for the dimension and its attributes determine specific server-based behaviours and may be required to implement certain behaviours in the cube. For example, the Type property of a dimension can be set to Accounts to indicate to client applications that the standard dimension contains account attributes.

The default setting for the dimension type is Regular, which makes no assumptions about the contents of the dimension. This is the default setting for all dimensions when you initially define a dimension unless you specify Time when defining the dimension using the Dimension Wizard. You should also leave Regular as the dimension type if the Dimension Wizard does not list an appropriate type for Dimension type.


Available Dimension Types
The following table describes the dimension types available in Microsoft SQL Server Analysis Services.

Regular - A dimension whose type has not been set to a special dimension type.
Time - A dimension whose attributes represent time periods, such as years,
semesters, quarters, months, and days.
Organization - A dimension whose attributes represent organizational information,
such as employees or subsidiaries.
Geography - A dimension whose attributes represent geographic information, such
as cities or postal codes.
BillOfMaterials-A dimension whose attributes represent inventory or manufacturing
information, such as parts lists for products.
Accounts - A dimension whose attributes represent a chart of accounts for
financial reporting purposes.
Customers - A dimension whose attributes represent customer or contact
information.
Products - A dimension whose attributes represent product information.
Scenario - A dimension whose attributes represent planning or strategic
analysis information.
Quantitative - A dimension whose attributes represent quantitative information.
Utility - A dimension whose attributes represent miscellaneous information.
Currency - This type of dimension contains currency data and metadata.
Rates - A dimension whose attributes represent currency rate information.
Channel - A dimension whose attributes represent channel information.
Promotion - A dimension whose attributes represent marketing promotion
information.

According to dimension modeling there are different types of dimnesions:

Conformed dimension


A conformed dimension is a set of data attributes that have been physically implemented in multiple database tables using the same structure, attributes, domain values, definitions and concepts in each implementation. A conformed dimension cuts across many facts.
Dimensions are conformed when they are either exactly the same (including keys) or one is a perfect subset of the other. Most important, the row headers produced in the answer sets from two different conformed dimensions must be able to match perfectly.
Conformed dimensions are either identical or strict mathematical subsets of the most granular, detailed dimension. Dimension tables are not conformed if the attributes are labeled differently or contain different values. Conformed dimensions come in several different flavors. At the most basic level, conformed dimensions mean exactly the same thing with every possible fact table to which they are joined. The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

Junk dimension

A junk dimension is a convenient grouping of typically low-cardinality flags and indicators. By creating an abstract dimension, these flags and indicators are removed from the fact table while placing them into a useful dimensional framework. A Junk Dimension is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. The nature of these attributes is usually text or various flags, e.g. non-generic comments or just simple yes/no or true/false indicators. These kinds of attributes are typically remaining when all the obvious dimensions in the business process have been identified and thus the designer is faced with the challenge of where to put these attributes that do not belong in the other dimensions.
One solution is to create a new dimension for each of the remaining attributes, but due to their nature, it could be necessary to create a vast number of new dimensions resulting in a fact table with a very large number of foreign keys. The designer could also decide to leave the remaining attributes in the fact table but this could make the row length of the table unnecessarily large if, for example, the attributes is a long text string.
The solution to this challenge is to identify all the attributes and then put them into one or several Junk Dimensions. One Junk Dimension can hold several true/false or yes/no indicators that have no correlation with each other, so it would be convenient to convert the indicators into a more describing attribute. An example would be an indicator about whether a package had arrived, instead of indicating this as “yes” or “no”, it would be converted into “arrived” or “pending” in the junk dimension. The designer can choose to build the dimension table so it ends up holding all the indicators occurring with every other indicator so that all combinations are covered. This sets up a fixed size for the table itself which would be 2^x rows, where x is the number of indicators. This solution is appropriate in situations where the designer would expect to encounter a lot of different combinations and where the possible combinations are limited to an acceptable level. In a situation where the number of indicators are large, thus creating a very big table or where the designer only expect to encounter a few of the possible combinations, it would be more appropriate to build each row in the junk dimension as new combinations are encountered. To limit the size of the tables, multiple junk dimensions might be appropriate in other situations depending on the correlation between various indicators.
Junk dimensions are also appropriate for placing attributes like non-generic comments from the fact table. Such attributes might consist of data from an optional comment field when a customer places an order and as a result will probably be blank in many cases. Therefore the junk dimension should contain a single row representing the blanks as a surrogate key that will be used in the fact table for every row returned with a blank comment field.

A dimension key, such as a transaction number, invoice number, ticket number, or bill-of-lading number, that has no attributes and hence does not join to an actual dimension table. Degenerate dimensions are very common when the grain of a fact table represents a single transaction item or line item because the degenerate dimension represents the unique identifier of the parent. Degenerate dimensions often play an integral role in the fact table's primary key.

Role-playing dimensions

Dimensions are often recycled for multiple applications within the same database. For instance, a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".

Degenerated Dimension (DD):

An item that is in the fact table but is stripped off of its description, because the description belongs in dimension table, is referred to as Degenerated Dimension. Since it looks like dimension, but is really in fact table and has been degenerated of its description, hence is called degenerated dimension. Now coming to the slowly changing dimensions (SCD) and Slowly Growing Dimensions (SGD): I would like to classify them to be more of an attributes of dimensions its self.
Although other might disagree to this view but Slowly Changing Dimensions are basically those dimensions whose key value will remain static but description might change over the period of time. For example, the product id in a companies, product line might remain the same, but the description might change from time to time, hence, product dimension is called slowly changing dimension.
Lets consider a customer dimension, which will have a unique customer id but the customer name (company name) might change periodically due to buy out / acquisitions, Hence, slowly changing dimension, as customer number is static but customer name is changing, However, on the other hand the company will add more customers to its existing list of customers and it is highly unlikely that the company will acquire astronomical number of customer over night (wouldn’t the company CEO love that) hence, the customer dimension is both a Slowly changing as well as slowly growing dimension.

Learn SSAS

Introduction

This article describes the topics to be used when learning SQL Server Analysis Services. It focuses mainly on SSAS training / SSAS interview.

Topics

Below is a list of roughly drafted high level areas of SSAS in no particular order, which can be considered as a descent coverage, whether it's considered for SSAS training / SSAS interview. Keep in view that though the below coverage covers a major ground, it's not exhaustive and it can be used as a reference check to make sure you cover enough in your trainings / to make sure you have covered major fundamental areas.

• Types of Dimensions
• Types of Measures
• Types of relationships between dimensions and measuregroups: None (IgnoreUnrelatedDimensions), Fact, Regular, Reference, Many to Many, Data Mining
• Star Vs Snowflake schema and Dimensional modelling
• Data storage modes - MOLAP, ROLAP, HOLAP
• MDX Query syntax
• Functions used commonly in MDX like Filter, Descendants, Basc and others
• Difference between EXISTS AND EXISTING, NON EMPTY keyword and function, NON_EMPTY_BEHAVIOR, Parallel Period, AUTOEXISTS
• Difference between static and dynamic set
• Difference between natural and unnatural hierarchy, attribute relationships
• Difference between rigid and flexible relationships
• Difference between attribute hierarchy and user hierarchy
• Dimension, Hierarchy, Level, and Members
• Difference between database dimension and cube dimension
• Importance of CALCULATE keyword in MDX script, data pass and limiting cube space
• Effect of materialize
• Partition processing and Aggregation Usage Wizard
• Perspectives, Translations, Linked Object Wizard
• Handling late arriving dimensions / early arriving facts
• Proactive caching, Lazy aggregations
• Partition processing options
• Role playing Dimensions, Junk Dimensions, Conformed Dimensions, SCD and other types of dimensions
• Parent Child Hierarchy, Naming Template property, MemberWithLeafLevelData property
• Cube performance, MDX performance
• How to pass parameter in MDX
• SSAS 2005 vs. SSAS 2008
• Dimension security vs. Cell security
• SCOPE statement, THIS keyword, SUBCUBE
• CASE (CASE, WHEN, THEN, ELSE, END) statement, IF THEN END IF, IS keyword, HAVING clause
• CELL CALCULATION and CONDITION clause
• RECURSION and FREEZE statement
• Common types of errors encountered while processing a dimension / measure groups / cube
• Logging and monitoring MDX scripts and cube performance