In part one, we discussed best practices for the OACS RPD Physical Layer. In this section we assume that your physical layer is complete and you have moved onto the OACS RPD Logical Layer.
RPD Logical Layer Best Practices
Apply one-to-many logical joins between logical dimension tables and the fact tables wherever possible. The business model should ideally resemble a star schema in which each fact table is joined directly to its dimensions.
Try to avoid the snowflake schema where possible. In some cases, snow flake schemas are a must as it reduces redundancy. Illustration: Large customer dimensions where, for example, 80 percent of the fact table measurements involve anonymous visitors about whom you collect little detail, and 20 percent involve reliably registered customers about whom you collect much detailed data by tracking many dimensions.
Either way a snowflake schema will be broken into a flat logical star in the BMM layer, and only modelled as a snowflake in the physical layer.
Dimensional hierarchies are very useful, it’s very important that every dimension has a hierarchy associated with it. This allows us to model non-conformed dimensions, perform aggregate navigation and drilldown.
Use a separate logical table for each logical dimension. Where possible, we try not to merge logical sources as it can overcomplicate the query produced by OACS. In some cases this is unavoidable e.g. some snowflakes, but where possible we need to avoid multiple logical sources to one logical table.
All fact attributes should be stored in a logical dimension table, this ensures that the model is as simple as possible and there are no logical level errors when we produce reports in the front end.
Very important this one. Many a time I’ve seen logical level errors popping up in reports, and incorrect data being identified by the customer, and its regularly due to non-or-incorrect setting of logical content levels. Always apply these.
Each dimension hierarchy level must have a unique logical level key.
This is another big one for me personally, theres nothing worse than coming into a clients office, knowing nothing about their data and seeing a number of unused logical columns in the logical tables. If you aren’t using the logical column, don’t keep it in the logical layer. Failing this, rename and reorder to group all the ones together you are using. It makes everything much simpler and produces a quicker learning curve for new starters and others debugging the RPD. Not to mention that it looks much better.
ETL vs RPD Logic
This is generally on the call of the development teams, but as a general rule id say; “complex measure formula should be calculated at ETL level, as placing them in the OACS repository would cause the expression to be evaluated every time the query is executed resulting poor performance”
Elements at Level
Always specify the number of elements at each hierarchy level, even if it is a rough estimate. This will be used to identify aggregate tables and mini-dimensions.
Warnings on Consistency Checker
Too often I notice that a client will have 89 warnings in the RPD. Its best practice the fix these warnings as and when they appear, as very often they point toward something that could be causing performance issues or runtime errors. Not to mention that if there are any additional warnings, its tough to find them amongst a load of previous warnings.
Dimensional Attributes on Fact Tables
Ensure that fact tables are kept free from dimensional attributes unless these are true degenerate dimensions.