KRISHAN'S BLOG: 2011

Friday, December 30, 2011

Types of Dimensions

The Type property setting provides information about the contents of a dimension to server and client applications. In some cases, the Type setting only provides guidance for client applications and is optional. In other cases, such as Accounts or Time dimensions, the Type property settings for the dimension and its attributes determine specific server-based behaviours and may be required to implement certain behaviours in the cube. For example, the Type property of a dimension can be set to Accounts to indicate to client applications that the standard dimension contains account attributes.

The default setting for the dimension type is Regular, which makes no assumptions about the contents of the dimension. This is the default setting for all dimensions when you initially define a dimension unless you specify Time when defining the dimension using the Dimension Wizard. You should also leave Regular as the dimension type if the Dimension Wizard does not list an appropriate type for Dimension type.

Available Dimension Types
The following table describes the dimension types available in Microsoft SQL Server Analysis Services.

Regular - A dimension whose type has not been set to a special dimension type.
Time - A dimension whose attributes represent time periods, such as years,
semesters, quarters, months, and days.
Organization - A dimension whose attributes represent organizational information,
such as employees or subsidiaries.
Geography - A dimension whose attributes represent geographic information, such
as cities or postal codes.
BillOfMaterials-A dimension whose attributes represent inventory or manufacturing
information, such as parts lists for products.
Accounts - A dimension whose attributes represent a chart of accounts for
financial reporting purposes.
Customers - A dimension whose attributes represent customer or contact
information.
Products - A dimension whose attributes represent product information.
Scenario - A dimension whose attributes represent planning or strategic
analysis information.
Quantitative - A dimension whose attributes represent quantitative information.
Utility - A dimension whose attributes represent miscellaneous information.
Currency - This type of dimension contains currency data and metadata.
Rates - A dimension whose attributes represent currency rate information.
Channel - A dimension whose attributes represent channel information.
Promotion - A dimension whose attributes represent marketing promotion
information.

According to dimension modeling there are different types of dimnesions:

Conformed dimension

A conformed dimension is a set of data attributes that have been physically implemented in multiple database tables using the same structure, attributes, domain values, definitions and concepts in each implementation. A conformed dimension cuts across many facts.
Dimensions are conformed when they are either exactly the same (including keys) or one is a perfect subset of the other. Most important, the row headers produced in the answer sets from two different conformed dimensions must be able to match perfectly.
Conformed dimensions are either identical or strict mathematical subsets of the most granular, detailed dimension. Dimension tables are not conformed if the attributes are labeled differently or contain different values. Conformed dimensions come in several different flavors. At the most basic level, conformed dimensions mean exactly the same thing with every possible fact table to which they are joined. The date dimension table connected to the sales facts is identical to the date dimension connected to the inventory facts.

Junk dimension

A junk dimension is a convenient grouping of typically low-cardinality flags and indicators. By creating an abstract dimension, these flags and indicators are removed from the fact table while placing them into a useful dimensional framework. A Junk Dimension is a dimension table consisting of attributes that do not belong in the fact table or in any of the existing dimension tables. The nature of these attributes is usually text or various flags, e.g. non-generic comments or just simple yes/no or true/false indicators. These kinds of attributes are typically remaining when all the obvious dimensions in the business process have been identified and thus the designer is faced with the challenge of where to put these attributes that do not belong in the other dimensions.
One solution is to create a new dimension for each of the remaining attributes, but due to their nature, it could be necessary to create a vast number of new dimensions resulting in a fact table with a very large number of foreign keys. The designer could also decide to leave the remaining attributes in the fact table but this could make the row length of the table unnecessarily large if, for example, the attributes is a long text string.
The solution to this challenge is to identify all the attributes and then put them into one or several Junk Dimensions. One Junk Dimension can hold several true/false or yes/no indicators that have no correlation with each other, so it would be convenient to convert the indicators into a more describing attribute. An example would be an indicator about whether a package had arrived, instead of indicating this as “yes” or “no”, it would be converted into “arrived” or “pending” in the junk dimension. The designer can choose to build the dimension table so it ends up holding all the indicators occurring with every other indicator so that all combinations are covered. This sets up a fixed size for the table itself which would be 2^x rows, where x is the number of indicators. This solution is appropriate in situations where the designer would expect to encounter a lot of different combinations and where the possible combinations are limited to an acceptable level. In a situation where the number of indicators are large, thus creating a very big table or where the designer only expect to encounter a few of the possible combinations, it would be more appropriate to build each row in the junk dimension as new combinations are encountered. To limit the size of the tables, multiple junk dimensions might be appropriate in other situations depending on the correlation between various indicators.
Junk dimensions are also appropriate for placing attributes like non-generic comments from the fact table. Such attributes might consist of data from an optional comment field when a customer places an order and as a result will probably be blank in many cases. Therefore the junk dimension should contain a single row representing the blanks as a surrogate key that will be used in the fact table for every row returned with a blank comment field.

A dimension key, such as a transaction number, invoice number, ticket number, or bill-of-lading number, that has no attributes and hence does not join to an actual dimension table. Degenerate dimensions are very common when the grain of a fact table represents a single transaction item or line item because the degenerate dimension represents the unique identifier of the parent. Degenerate dimensions often play an integral role in the fact table's primary key.

Role-playing dimensions

Dimensions are often recycled for multiple applications within the same database. For instance, a "Date" dimension can be used for "Date of Sale", as well as "Date of Delivery", or "Date of Hire". This is often referred to as a "role-playing dimension".

Degenerated Dimension (DD):

An item that is in the fact table but is stripped off of its description, because the description belongs in dimension table, is referred to as Degenerated Dimension. Since it looks like dimension, but is really in fact table and has been degenerated of its description, hence is called degenerated dimension. Now coming to the slowly changing dimensions (SCD) and Slowly Growing Dimensions (SGD): I would like to classify them to be more of an attributes of dimensions its self.
Although other might disagree to this view but Slowly Changing Dimensions are basically those dimensions whose key value will remain static but description might change over the period of time. For example, the product id in a companies, product line might remain the same, but the description might change from time to time, hence, product dimension is called slowly changing dimension.
Lets consider a customer dimension, which will have a unique customer id but the customer name (company name) might change periodically due to buy out / acquisitions, Hence, slowly changing dimension, as customer number is static but customer name is changing, However, on the other hand the company will add more customers to its existing list of customers and it is highly unlikely that the company will acquire astronomical number of customer over night (wouldn’t the company CEO love that) hence, the customer dimension is both a Slowly changing as well as slowly growing dimension.

Learn SSAS

Introduction

This article describes the topics to be used when learning SQL Server Analysis Services. It focuses mainly on SSAS training / SSAS interview.

Topics

Below is a list of roughly drafted high level areas of SSAS in no particular order, which can be considered as a descent coverage, whether it's considered for SSAS training / SSAS interview. Keep in view that though the below coverage covers a major ground, it's not exhaustive and it can be used as a reference check to make sure you cover enough in your trainings / to make sure you have covered major fundamental areas.

• Types of Dimensions
• Types of Measures
• Types of relationships between dimensions and measuregroups: None (IgnoreUnrelatedDimensions), Fact, Regular, Reference, Many to Many, Data Mining
• Star Vs Snowflake schema and Dimensional modelling
• Data storage modes - MOLAP, ROLAP, HOLAP
• MDX Query syntax
• Functions used commonly in MDX like Filter, Descendants, Basc and others
• Difference between EXISTS AND EXISTING, NON EMPTY keyword and function, NON_EMPTY_BEHAVIOR, Parallel Period, AUTOEXISTS
• Difference between static and dynamic set
• Difference between natural and unnatural hierarchy, attribute relationships
• Difference between rigid and flexible relationships
• Difference between attribute hierarchy and user hierarchy
• Dimension, Hierarchy, Level, and Members
• Difference between database dimension and cube dimension
• Importance of CALCULATE keyword in MDX script, data pass and limiting cube space
• Effect of materialize
• Partition processing and Aggregation Usage Wizard
• Perspectives, Translations, Linked Object Wizard
• Handling late arriving dimensions / early arriving facts
• Proactive caching, Lazy aggregations
• Partition processing options
• Role playing Dimensions, Junk Dimensions, Conformed Dimensions, SCD and other types of dimensions
• Parent Child Hierarchy, Naming Template property, MemberWithLeafLevelData property
• Cube performance, MDX performance
• How to pass parameter in MDX
• SSAS 2005 vs. SSAS 2008
• Dimension security vs. Cell security
• SCOPE statement, THIS keyword, SUBCUBE
• CASE (CASE, WHEN, THEN, ELSE, END) statement, IF THEN END IF, IS keyword, HAVING clause
• CELL CALCULATION and CONDITION clause
• RECURSION and FREEZE statement
• Common types of errors encountered while processing a dimension / measure groups / cube
• Logging and monitoring MDX scripts and cube performance

Friday, January 14, 2011

To Act is to Lead

“The smallest of actions is always better than the noblest in intentions “ Leadership is a lot more than just dreaming up big. It’s about acting to them.
What separates the ordinary ones from the great ones is a simple fact: the best of best execute brilliantly around their most vital priorities. In a leadership presentation, you can use concept BRBO: best resources on biggest opportunities.
Leadership is, in so many ways, about getting good stuff done, not started, not in process, done.
So Today makes decision that will revolutionize your career/Life/Self: never leave the site of an opportunity without doing something to seize it. And be less about talk – and more about DO.

Thursday, January 13, 2011

multidimensional-expression-part-4

How do I group dimension members dynamically in MDX?

You can create calculated members for dimension and then use them in the query. Example below will create 3 calculated members based on filter condition:

With MEMBER [Product].[Category].[Result1] AS 'Aggregate(Filter([Product].[Category].[All].Children, [Product].[Category].currentmember.Properties("Key") > "3"))'
MEMBER [Product].[Category].[Result2] AS 'Aggregate(Filter([Product].[Category].[All].Children, [Product].[Category].currentmember.Properties("Key") = "3"))'
MEMBER [Product].[Category].[Result3] AS 'Aggregate(Filter([Product].[Category].[All].Children, [Product].[Category].currentmember.Properties("Key") > "3"))'
SELECT NON EMPTY {[MEasures].[Order Count]} ON COLUMNS
,{[Product].[Category].[Result1],[Product].[Category].[Result2],[Product].[Category].[Result3]} ON ROWS
FROM [Adventure Works]

Result will be:
Order Count
Result1 19,523
Result2 9,871
Result3 19,523

How you can add another dimension as a row on your sample code?
Just add new members for new dimension and add same members in you row details.

How do you write MDX query that returns measure ratio to parent value?
Below is example on how is ratio calculated for measure [Order Count] using Date dimension. Using parent function, your MDX is in dependent on level that you are querying data on. In example below, if you query data at year level, ratio will be calculated to level [All]:

WITH MEMBER [Measures].[Order Count Ratio To Parent] AS
IIF( ([Measures].[Order Count], [Date].[Calendar].CurrentMember.Parent) = 0
, NULL
, [Measures].[Order Count]
/
([Measures].[Order Count], [Date].[Calendar].CurrentMember.Parent)
)
, FORMAT_STRING = "Percent"

SELECT {[Measures].[Order Count], [Measures].[Order Count Ratio To Parent]} ON 0
, {DESCENDANTS([Date].[Calendar].[All Periods], 1), [Date].[Calendar].[All Periods]
} ON 1
FROM [Adventure Works]

Result will be like:

Order Count Order Count Ratio To Parent
CY 2001 1,379 4.38%
CY 2002 3,692 11.74%
CY 2003 12,440 39.55%
CY 2004 13,944 44.33%
CY 2006 (null) (null)
All Periods 31,455 (null)

If you query data at month level, ratio will be calculated comparing to level quarter:

WITH MEMBER [Measures].[Order Count Ratio To Parent] AS
IIF( ([Measures].[Order Count], [Date].[Calendar].CurrentMember.Parent) = 0
, NULL
, [Measures].[Order Count]
/
([Measures].[Order Count], [Date].[Calendar].CurrentMember.Parent)
)
, FORMAT_STRING = "Percent"

SELECT {[Measures].[Order Count]
, [Measures].[Order Count Ratio To Parent]} ON 0
, {DESCENDANTS([Date].[Calendar].[Calendar Quarter].&[2003]&[4], 1)
, [Date].[Calendar].[Calendar Quarter].&[2003]&[4]
} ON 1
FROM [Adventure Works]

Result will be like:

Order Count Order Count Ratio To Parent
October 2003 1,779 29.95%
November 2003 1,888 31.79%
December 2003 2,272 38.26%
Q4 CY 2003 5,939 58.87%

Monday, January 3, 2011

Multidimensional Expression Part 3

MDX query to get sales by product line for specific period plus number of months with nonempty sales?

Solution:
You can use COUNT () function with Exclude Empty option. For count function you specify set that is cross join of Date members at the month level and measure that you are interested in.

WITH Member [Measures].[Months With Above Zero Sales] AS
COUNT(
DESCENDANTS({[Date].[Calendar].[Calendar Year].&[2003]: [Date].[Calendar].[Calendar Year].&[2004]}
, [Date].[Calendar].[Month]) * [Measures].[Sales Amount]
, ExcludeEmpty
)
SELECT {[Measures].[Sales Amount], [Measures].[Months With Above Zero Sales]} ON 0
, [Product].[Product Model Lines].[Product Line].Members on 1
FROM [Adventure Works]
WHERE ([Date].[Calendar].[Calendar Year].&[2003]: [Date].[Calendar].[Calendar Year].&[2004])

Result will be:

Sales Amount Months With Above Zero Sales
Accessory $1,987,396.37 19
Components $454,644.34 18
Mountain $24,430,307.51 19
Road $24,919,506.74 19
Touring $16,010,837.10 13

Same you can apply for Years:

WITH Member [Measures].[Years With Above Zero Sales] AS
COUNT(
DESCENDANTS({[Date].[Calendar].[Calendar Year].&[2003]: [Date].[Calendar].[Calendar Year].&[2004]}
, [Date].[Calendar].[Years]) * [Measures].[Sales Amount]
, ExcludeEmpty
)
SELECT {[Measures].[Sales Amount], [Measures].[Years With Above Zero Sales]} ON 0
, [Product].[Product Model Lines].[Product Line].Members on 1
FROM [Adventure Works]
WHERE ([Date].[Calendar].[Calendar Year].&[2003]: [Date].[Calendar].[Calendar Year].&[2004])

Result will be:
Sales Amount Years With Above Zero Sales
Accessory $1,987,396.37 2
Components $454,644.34 2
Mountain $24,430,307.51 2
Road $24,919,506.74 2
Touring $16,010,837.10 2

You can change Measures:

WITH Member [Measures].[Months With Above Zero Sales] AS
COUNT(
DESCENDANTS({[Date].[Calendar].[Calendar Year].&[2003]: [Date].[Calendar].[Calendar Year].&[2004]}
, [Date].[Calendar].[Month]) * [Measures].[Internet Sales Amount]
, ExcludeEmpty
)
SELECT {[Measures].[Internet Sales Amount], [Measures].[Months With Above Zero Sales]} ON 0
, [Product].[Product Model Lines].[Product Line].Members on 1
FROM [Adventure Works]
WHERE ([Date].[Calendar].[Calendar Year].&[2003]: [Date].[Calendar].[Calendar Year].&[2004])

Sunday, January 2, 2011

Multidimensional Expression (MDX) – part 2

Many developer start MDX query learning with MDX tutorial, learning tuples, namedsets, axis, select query syntax etc….. But still after that, many queries remain to resolve and need hands-on experience of some real time project when one really starts applying functions, tweaking or nesting queries and parameters.
In this session will start with some complex query with adventure works database.
Hope this helps all to resolve last movement MDX requirements when developers don’t have time to learn all the syntax, structure and theory:

How to find bottom 10 customers with lowest sales in 2003 that were not null?
Simple using BOTTOMCOUNT will return customers with null sales and we can use NONEMPTY or FIlTER to remove null data.

SYNTAX:

SELECT {[Measures].[Internet Sales Amount] } ON COLUMNS,
BOTTOMCOUNT(NONEMPTY(DESCENDANTS([Customer].[Customer Geography].[All Customers]
, [Customer].[Customer Geography].[Customer])
, ([Measures].[Internet Sales Amount]))
, 10
,( [Measures].[Internet Sales Amount])
) ON ROWS
FROM [Adventure Works]
WHERE ([Date].[Calendar].Calendar Year].&[2003])

Result will be:
Internet Sales Amount
Ariana Peterson $2.29
Olivia Brown $2.29
Abigail L. Bennett $2.29
Natalie L. Bryant $2.29
Madison D. Lee $2.29
Lauren Miller $2.29
Stephanie B. Murphy $2.29
Cameron L. Rodriguez $2.29
Melanie Peterson $2.29
Alfredo Romero $2.29

Saturday, January 1, 2011

What pain points will the right business intelligence application solve for any organization?

It might be easy for McDinald’s to create a hamburger that tastes the same no matter what country ir is served in, but it is not easy for enterprise to apply the same business rules, accounting principles and financial reporting practices to operations in different countries. To paint an accurate financial picture, global enterprise have to put in place systems that compile financial date from geographically dispersed divisions using a myriad of technological applications and a variety of accounting practices, business rules and cultural expectations.

The right BI application should provide organization with:

Integration and Financial Accuracy: It is a key to find a single, integrated application that will provide your organization with one central repository for enterprise-wide financial and non-financial data. This single application and information repository should result in enterprise-wide financial integrity, elimination of redundancy, consistent application of business rules and f firm knowledge that business decisions are being made from a basis of financial truth.

Fast and Accurate Reporting: Fast and accurate report will give you and your executives more time to analyze data. A more thoughtful analysis of the bottom line allows you monitor results against set objectives and develop more meaningful, long term strategic plans that are actionable. It also enables you to peer into the depth of your complex organization and gain an insightful view of its financial reality.

BI is Growing UP

Organizations that use Bi show increased level of maturity- up 5.7 percent from 2009 but there’s still far to go, according to the this year’s BI maturity survey from Forrester Research.
For “BI Maturity in The Enterprise: 2010 Update” Forrester surveyed only 173(of course out of thousand) technology professionals familiar with the respective organization’s BI efforts.
Compared with the previous year, 2010 findings show fewer organizations centralizing BI, with 43 percent employing agile BI and BI self –service, and 38 percent using centralization and consolidation. Forrester also found only 35 percent of organizations are employing pervasive Bi, which is defined as enterprise BI applications being available to most strategic and operation decision makers.
BI talent and skill acquisition also leave much room for improvement, with 46 of respondents in this Forrester study expressing dissatisfaction with procedures to hire, train and retain skilled BI staff.
“Traditional BI approaches, processes, technologies and architectures are no longer enough to continue to drive higher level of BI adoption and increase BI maturity in enterprise,” says Forrester Research Analysts. “New approaches such as agile BI and BI self-service by the business users are needed to supplement traditional BI.”

KRISHAN'S BLOG