Skip to main content

Precompute the join relations

Pre-computation of the join relations refers to the process of expanding the joined tables of a model into a flat table based on mappings, and then building indexes based on the flat table. Kylin will precompute each join relation and generate a flat table that contains dimensions, measures and columns referenced by computed columns by default. This article will cover the principles and features of Precompute Join Relationships.

Principles

This article takes Fact as the fact table and Dim as the dimension table to introduce how Precompute Join Relationships will affect the generation of a flat table. Suppose the table structures and data are as follows:

  • Table Fact
col1col2col3
1aAAA
2bBBB
3cCCC
  • Table DIM
col1col2col3
1A1AAAA
1A2BBBB
2B1CCCC
3C1DDDD

If Fact inner joins Dim and Precompute Join Relationships is enabled, it will generate a flat table as below:

Fact.col1Fact.col2Fact.col3Dim.col1Dim.col2Dim.col3
1aAAA1A1AAAA
1aAAA1A2BBBB
2bBBB2B1CCCC
3cCCC3C1DDDD

If Fact inner joins Dim and Precompute Join Relationships is disabled, the flat table generated will be:

Fact.col1Fact.col2Fact.col3
1aAAA
2bBBB
3cCCC

[!NOTE]

In this scenario, the generation of a flat table does not rely on the dimension table and it will be stored as a snapshot in Kylin during the building process.

Feature comparison

To strike the right balance between performance and cost, you can choose whether to enable Precompute Join Relationships based on your business needs and data characteristics when designing a model. The following table compares the features of enabling and disabling Precompute Join Relationships.

Precompute Join RelationshipsQuery performanceBuilding durationStorage costsAdaptability to new query scenariosImpact
EnableGoodLongerHigherFair● All columns in dimension tables can be set as dimensions, or defined as measures or computed columns.
DisableFairShorterLowerGood● Columns in dimension tables cannot be set as dimensions, or defined as measures or computed columns, which means they cannot be referenced by indexes.
● Indexes and corresponding dimension snapshots will be hit by queries simultaneously, so users can get the query results through real-time join queries.
In snowflake models, if a foreign key corresponds to a dimension table, and the table is set as an excluded table or Precompute Join Relationships is disabled, the dimension table will not be referenced when generating indexes.

FAQ

  • Question: If Precompute Join Relationships is enabled in a model, what will happen if I disable it?

    Answer: If Precompute Join Relationships is disabled, Kylin will automatically delete all related indexes, dimensions, measures, and computed columns. Please use caution when you perform this operation.

  • Question: If the table relationship is one-to-many or many-to-many, is there anything I should be aware of before enabling Precompute Join Relationships?

    Answer: In such a scenario, derived dimension queries will be disabled. If columns of the joined tables are not set as dimensions, these columns will not be referenced when generating indexes, or aggregate indexes or table indexes to accelerate queries.

  • Question: If a table is excluded, will it affect precomputing the join relations?

    Answer: Even if Precompute Join Relationships is enabled, this table will not be used to generate a flat table or referenced when generating indexes.

  • Question: What's the difference between excluding tables and disabling Precompute Join Relationships?

    Answer: The table below summarizes the main differences.

CategoryEffective levelApplicable scenario
Exclude tablesProject-levelOften used when returning the latest data for queries is required. The corresponding foreign keys of the join relations, instead of the columns of the excluded tables, will be solidified into the indexes.
Disable Precompute Join RelationshipsModel-levelOften used to reduce storage costs and improve building effeciency, for example, in one-to-many or many-to-many relationships.

[!NOTE]

When designing a model, please do not use the columns of the excluded tables as dimensions, or else the index building job may fail.