It’s an early start and a fairly solid commute for me to a client site in Winchester, so I thought I’d open up the new blog by talking about something fairly common across the industry. Let’s look at OACS best practices. I’ll be adding a series of posts on Business Intelligence best practices over the coming weeks – so here’s a start.
Having worked on many-a-client site, it seems they all implement different best practices, which is fine, generally a team lead in the BI space or a consultant would look at a master list and adapt the OACS best practices to best fit the company. I have included what I feel is a solid ‘master list’. If there is anything you feel is missing, please let me know.
OACS RPD Best Practices – Physical Layer
It is recommended to use table aliases frequently in the Physical layer to eliminate circular joins among tables, it also allows us to re-use a dimension. For example; a date dimension can be aliased twice, once for Start Date, once for End date. These can join to the same fact using different keys. It’s worth noting that we ONLY join on aliases, we don’t touch the original source table unless we need to change something in all its aliases. E.g. a column name in the database changes.
In the physical layer we have the option to define if the column is allowed to contain nulls. We need to check that we have allowed any relevant columns to be nullable. For example its generally OACS RPD best practice to seet dates to nullable, to avoid the default 0/0/0 date appearing in any queries.
This is again for aliases – its important that we prefix our dimension aliases with DIM_ and our fact tables with FACT_. This greatly helps other users debug their issues. Another really useful piece of physical RPD modelling ive seen on a client site with a huge amount of tables is to add in dummy columns prefixed with a letter and number, e.g. A01 Dimensions—— This way they could prefix name all the relevant alias tables with A01, A02, etc, and have the tables visible in a nice grouping, this help ensure we don’t get mixed up between original tables and source tables!
OACS Opaque Views
An opaque view should be used only if there is no other solution to your modeling problem. Ideally, a simple database view over a physical table should be created, or alternatively a materialized view to optimize the query generated by BI Server to get the best performance. I’ve been on a client site recently where there was no option but to use a huge opaque view, the impact on performance was pretty huge, not to mention it a whole new area to debug for errors.
In OACS we have the option to select ODBC as our database driver for the connection in the RPD. Its recommended that you select a database specific driver instead of ODBC. For example use OCI to connect to an Oracle database.
Maximum Number of Connections
Again in the connection pool we have the option to set the maximum number of connections. It is strongly recommended that you calculate an approximate value for this. It’s often the case where this is overlooked and performance problems follow swiftly. The best way to get an estimate of the maximum number of connections is to use the following formula.
10-20% of concurrent users of system x Average Number of Reports on dashboard.
RPD Physical Layer Caching
Use Physical Layer caching as this can lead to increased query response times. This involves checking a box to ensure that the table is marked as cacheable in the OACS Physical Layer. As long as the cache is automatically cleared either at a set time, or after an ETL load then the display of stale information is avoided. Note, we can set the refresh time for the cache if required, this is useful for near real time OACS Implementations.
Foreign Key Joins
We can apply both foreign key and complex joins in the physical layer. A foreign key join will use a simple equals operator. A complex join may contain some form of logic. In OACS a physical complex join will appear as RED in the physical model. The complex join will often cause an impact on performance, mainly because the BI server has to do additional matching processing, and sometimes we also lose the use of an index in the database.
It is OACS Best Practice to create separate connection pools to run our initialization blocks. This is because running the init blocks have an effect on the number of maximum connections, and if we reach the maximum number of connections this will affect all init blocks. Best to spread them out a bit!
Research query logging and the requirement for your particular system, this will vary wildly between environments. For example in a dev OACS environment, we may want a high level of logging, as developers are constantly taking SQL outputs for analysis. However in a production environment, we wont need this level of logging. Query logging has a significant effect on performance, so as a general rule, keep it to a minimum in your production environments.