Wednesday, June 30, 2010

Multidimensional Expressions (MDX): PART 1

Multidimensional Expressions (MDX) lets you query multidimensional objects, such as cubes, and return multidimensional cell sets that contain the cube's data.

Microsoft SQL Server OLAP Services provides architecture for access to multidimensional data. This data is summarized, organized, and stored in multidimensional structures for rapid response to user queries. Through OLE DB for OLAP, a PivotTable Service provides client access to this multidimensional online analytical processing (OLAP) data. For expressing queries to this data, OLE DB for OLAP employs full-fledged, highly functional expression syntax: multidimensional expressions (MDX).

We are assuming the reader is familiar with multidimensional data warehousing and OLAP terms. Before talking about MDX and how it queries data, it is worthwhile to give a brief description of the structure of a cube.

Cubes:

Cubes are key elements in online analytic processing. They are subsets of data from the OLAP store, organized and summarized into multidimensional structures. These data summaries provide the mechanism that allows rapid and uniform response times to complex queries.

Cubes are the main objects in online analytic processing (OLAP), a technology that provides fast access to data in a data warehouse. A cube is a set of data that is usually constructed from a subset of a data warehouse and is organized and summarized into a multidimensional structure defined by a set of dimensions and measures.

The fundamental cube concepts to understand are dimensions and measures.

 Dimensions provide the categorized descriptions by which the measures are separated for analysis such as Customers, geographical information and Products etc.

 Measures identify the numerical values that are summarized for analysis, such as price, cost, or quantity sold.

A dimension can be created for use in an individual cube or in multiple cubes. A dimension created for an individual cube is called a private dimension, whereas a dimension that can be used by multiple cubes is called a shared dimension.

Pre-calculated summary data called aggregations provides the mechanism for rapid and uniform response times to queries. Aggregations are created for a cube before end users access it. The results of a query are retrieved from the aggregations, the cube's source data in the data warehouse, a copy of this data on the Analysis server, the client cache, or a combination of these sources. An Analysis server can support many different cubes, such as a cube for sales, a cube for inventory, a cube for customers, and so on.

Every cube has a schema, which is the set of joined tables in the data warehouse from which the cube draws its source data. The central table in the schema is the fact table, the source of the cube's measures. The other tables are dimension tables, the sources of the cube's dimensions.

Each cube dimension can contain a hierarchy of levels to specify the categorical breakdown available to users. For example, a Store dimension might include the following level hierarchy: Country, State, City, and Store Name. Each level in a dimension is of a finer grain than its parent. Similarly, the hierarchy of a time dimension might include levels for year, quarter, and month. Multiple hierarchies can exist for a single dimension.

For Example: The fiscal period hierarchy levels Fiscal Year, Fiscal Quarter, and Month. The calendar hierarchy levels Calendar Year, Calendar Quarter, and Month.

One final important item of note is the concept of a member. A member is nothing more than an item in a dimension or measure. A calculated member is a dimension member whose value is calculated at run time using a specified expression. Calculated members can also be defined as measures. Only the definitions for calculated members are stored; values are calculated in memory when needed to answer a query.

Getting Starters with MDX:

Let's start by outlining one of the simplest forms of an MDX expression, bearing in mind this is for an outline of an expression returning two cube dimensions:

SELECT (query) axis specification ON COLUMNS,
(query) axis specification ON ROWS
FROM cube_name
WHERE (slicer) axis specification

A SELECT clause that determines the query axes of an MDX SELECT statement. A FROM clause that determines which cube will be queried, An Optional WHERE clause that determines which members or tuples to use on the slicer axis to restrict the data returned.

Comparing the Syntax of the MDX SELECT Statement to SQL

The syntax format for the MDX SELECT statement is similar to that of SQL syntax. However, there are several fundamental differences:

1. MDX syntax distinguishes sets by surrounding tuples or members with braces (the { and } characters.)
2. MDX queries can have 0, 1, 2 or up to 128 query axes in the SELECT statement. Each axis behaves in exactly the same way, unlike SQL where there are significant differences between how the rows and the columns of a query behave.
3. As with an SQL query, the FROM clause names the source of the data for the MDX query. However, the MDX FROM clause is restricted to a single cube. Information from other cubes can be retrieved on a value-by-value basis by using the LookupCube function.
4. The WHERE clause describes the slicer axis in an MDX query. It acts as something like an invisible, extra axis in the query, slicing the values that appear in the cells in the result set; unlike the SQL WHERE clause it does not directly affect what appears on the rows axis of the query. The functionality of the SQL WHERE clause is available through other MDX functions such as the FILTER function.

SELECT Statement Example

The following example shows a basic MDX query that uses the SELECT statement. This query returns a result set that contains the 2002 and 2003 sales and tax amounts for the Southwest sales territories.

SELECT
{ [Measures].[Sales Amount],
[Measures].[Tax Amount] } ON COLUMNS,
{ [Date].[Fiscal].[Fiscal Year].&[2002],
[Date].[Fiscal].[Fiscal Year].&[2003] } ON ROWS
FROM [Adventure Works]
WHERE ( [Sales Territory].[Southwest] )

In this example, the query defines the following result set information:

1. The SELECT clause sets the query axes as the Sales Amount and Tax Amount members of the Measures dimension, and the 2002 and 2003 members of the Date dimension.
2. The FROM clause indicates that the data source is the Adventure Works cube.
3. The WHERE clause defines the slicer axis as the Southwest member of the Sales Territory dimension.

For more learning on MDX kindly wait for next article………….. Multidimensional Expressions (MDX): PART 2

No comments:

Post a Comment