MOLAP vs ROLAP vs HOLAP

Today topic once again something i was confused about, always had an high level idea but never really read and understood the difference correctly until i was enlightened by a friend about a MS white paper on the  performance optimization on the CUBES so here goes the link and for the people who are lazy not wanting to read through whole lots of things in there just the extract of Storage modes difference goes as below:

SQL Server 2005

http://download.microsoft.com/download/8/5/e/85eea4fa-b3bb-4426-97d0-7f7151b2011c/SSAS2005PerfGuide.doc

http://www.sql-server-performance.com/2009/ssas-storage-modes/

 MOLAP:

MOLAP stores the aggregations and the data from the fact tables and dimensions on the Analysis services instance, the data and aggregations are stored in Compressed state Multidimensional Data structure, which is suited for fast retrieval and occupy very less space. precisely  this is the reason despite of all fact and dimension data getting copied still the MOLAP occupies much less space then the corresponding relational data. On a average as a guidance Analysis Services compresses the source fact data and its aggregations to approximately 30 percent of the size of the same data stored in a relational database, actual ration can vary based on number of duplicate keys and bit encoding algorithms.

In MOLAP viz a viz ROLAP and HOLAP where Analysis Services relies on on the relational engine to perform potentially large table joins against fact data stored in the relational database to resolve some or all queries, MOLAP relies on the Multidimensional data alone.

Pros of MOLAP:

Faster data access, good user experience.

All the calculations are pre-generated when the cube is processed and stored locally on the OLAP server hence even the complex calculations, as a part the query result, will be pefromed quickly.

MOLAP does not need to have a permanent connection to the underlying relational database (only at the time of processing) as it stores the detail and aggregate data in the OLAP server so the data can be viewed even when there is connection to the relational database.

Cons of MOLAP

With MOLAP  mode, we need frequent processing to pull refreshed data after last processing resulting in drain on system resources.

If the data volume is high, the cube processing can take longer, though you can use incremental processing to overcome this.

ROLAP:

This mode requires the least storage, none of the data and aggregations are stored on the Analysis Server, everything resides on the Relational DB side, this can be mostly useful in scenarios where the constraints are not applied on the Relational DB side, then we need to ensure that the UNKNOWN ember is enabled both on dimension and facts.

 Analysis Services rely on the relational engine to resolve all queries against the relational tables, storing both fact data and aggregations. The sheer number of queries with large table joins in large or complex cubes frequently overwhelms the relational engine.

Pros

The data available is such cube is near Real time

Since ROLAP keeps the data on the Relational DB,  it has less storage requirements. This is very advantageous for large datasets which are queried infrequently such as historical data.

Cons

Compared to MOLAP or HOLAP the response is generally slower, as all things lies on the Relational DB

HOLAP:

HOLAP is the combination of good things of HOLAP and ROLAP, it stores the aggregations on the Analysis Server and data(fact and dimensional) remains on the Relational DB

As a result, whenever Analysis Services needs to resolve a query against fact data stored in a HOLAP partition, it has to query the relational database directly rather than querying a multidimensional structure as in case of MOLAP

 

As far as processing goes there is no significant difference in processing performance between MOLAP partitions and HOLAP partition, in all fact data is read from the relational database, and aggregations are calculated. With HOLAP, Analysis Services does not store fact data. This difference has minimal impact on processing performance, but can have a significant impact on query performance, as HOLAP and MOLAP processing speeds are nearly same and MOLAP query performance is superior, MOLAP is the optimum storage choice.

Pros:

Stores the detail and aggregate data in the OLAP server in a compressed multidimensional format; as a result the cube browsing is fastest in this mode.

All the calculations are pre-generated when the cube is processed and stored locally on the OLAP server hence even the complex calculations, as a part the query result, will be pefromed quickly.

MOLAP uses compression to store the data on the OLAP server and so has less storage requirements than relational databases for same amount of data.

CONS:

 With MOLAP  mode, you need frequent processing to pull refreshed data after last processing resulting in drain on system resources.

Just after the processing if there is any changes in the relational database it will not be reflected on the OLAP server unless   re-processing is performed.

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: