Custom schema definition - Aggregate tables

On this page:

Overview

If you use custom schema with large cube fact tables (which have tens of millions of rows) then you might need to add Mondrian aggregate tables to speed up initial summary queries.

Read more about Mondrian aggregate tables in Pentaho documentation site.

Install Aggregation Designer

The easiest way to add aggregate tables is using Pentaho Aggregation Designer. Please download the latest version and extract it on your local development computer.

If you are using MySQL then download MySQL JDBC driver and copy included mysql-connector-java-*.jar to lib directory (MySQL JDBC driver is not included due to GPL license restrictions).

Start it either using Mac OS X application or Linux or Windows startup script.

Generate Mondrian XML schema file

For Aggregation Designer you will need to provide Mondrian schema definition file in XML format. Open your custom schema definition file (we will use sample config/schemas/foodmart_schema.rb file here) and add at the end of the file

File.open(File.expand_path("config/schemas/foodmart_schema.xml", EazybiPrivate.home_dir), "w") do |file|
  file.puts Dwh.schema.to_xml
end

Now go to corresponding account Analyze tab and after that in config/schemas/foodmart_schema.xml Mondrian schema in XML format will be saved. When you will finish using Aggregation Designer then please delete these lines and generated XML file.

Configure Aggregation Designer

When you will start Aggregation Designer you will need to configure database connection and select generated Mondrian schema XML. See example screenshots:

After clicking Apply button you should be able to select Cube for which aggregation tables should be generated.

After that click Connect button. You might get some errors or warnings - typically they can be ignored but please review them:

Generate aggregate tables definition

If connection was successful then click Advisor button and specify Max Aggregates parameter (for example, 5) and click Recommend button.

You can review suggested aggregate table definitions:

Update names of aggregate tables if needed before you continue with export of table definitions.

Click Export button and then Preview button to get

  • Create aggregates DDL statements - should be executed once to create aggregate tables
  • Populate aggregates DML statements - should be included in regular data import scripts (before executing these INSERT statements at first always TRUNCATE aggregate tables)
  • Schema XML - will need to copy fragment with aggregates definitions from it and include in eazyBI custom schema definition file.

During development you can also create and populate aggregate tables directly from Aggragation Designer using Export button.

Enable aggregate tables in eazyBI

You need to modify config/eazybi.yml and enable using of aggregate tables:

  mondrian:
    # ... existing settings ...
    "mondrian.rolap.aggregates.Read": true
	"mondrian.rolap.aggregates.Use": true

You will need to restart private eazyBI to use this setting.

Add aggregate tables in eazyBI custom schema

In Aggregation Designer generated Schema find and copy <AggName name="..."> definitions and paste them in eazyBI custom schema definition (in our example config/schemas/foodmart_schema.rb) as xml block inside table element block:

  cube 'Sales' do
    table 'sales_fact_1997' do
      xml <<-XML
        <AggName name="default_Sales_1">
          <AggFactCount column="sales_fact_1997_fact_count">
          </AggFactCount>
          <AggMeasure column="sales_fact_1997_Unit_Sales" name="[Measures].[Unit Sales]">
          </AggMeasure>
          <AggMeasure column="sales_fact_1997_Store_Sales" name="[Measures].[Store Sales]">
          </AggMeasure>
          <AggMeasure column="sales_fact_1997_Store_Cost" name="[Measures].[Store Cost]">
          </AggMeasure>
          <AggLevel column="product_class_Product_Family" name="[Products].[Product Family]">
          </AggLevel>
          <AggLevel column="time_by_day_Year" name="[Time].[Year]">
          </AggLevel>
          <AggLevel column="time_by_day_Quarter" name="[Time].[Quarter]">
          </AggLevel>
          <AggLevel column="time_by_day_Week" name="[Time.Weekly].[Week]">
          </AggLevel>
          <AggLevel column="customer_Country" name="[Customers].[Country]">
          </AggLevel>
        </AggName>
        ... all other aggregate table definitions ...
      XML
    end
  end

Testing aggregate tables

These instructions are for the latest eazyBI version 3.0.

If you would like to see which aggregate tables are being used then you can turn on Mondrian debugging in config/eazybi.toml file by uncommenting Mondrian logging settings:

[mondrian.log]
"mondrian" = "ERROR"
"mondrian.mdx" = "DEBUG"
"mondrian.sql" = "DEBUG"

And then in log/eazybi-mondrian.log file you will see generated SQL statements and will be able to see which aggregate tables are used for which queries.