Dynamic Data security in OBIEE

Here is an interesting scenario that I came across recently. The client has 2 hierarchies based on cost centers, Management and P&L. The business push behind this was that lot of users had dual roles in terms of how they look at expenses. Both the hierarchies had cost center at the lowest level and the same number of levels in the hierarchy. The differences were 1) users would have access different set of cost centers in both hierarchies (not necessarily mutually exclusive) 2) roll ups for the cost centers would be different in both hierarchies. The business had a couple of requirements around this.

1) The users should be able to pick the hierarchy that they would like to view the report by.

2) When the user flips between the hierarchies the security applied should change accordingly, ie if they are viewing the report by the P&L hierarchy then the P&L security should be applied and if they pick management then Management hierarchy security should be applied.

Here is a very simplified version of how we implemented the solution.

The security profile table

Hierarchy User_id Cost_center
P&L USER1 1
P&L USER1 2
P&L USER1 3
P&L USER1 4
MGMT USER1 3
MGMT USER1 4
MGMT USER1 5
MGMT USER1 6
MGMT USER1 7

User1 has access to cost centers 1-4 in the P&L hierarchy and 3-7 in the Management Hierarchy.

Here are the 2 hierarchy tables for this example.

Hierarchy Cost_center Level2 Level1
P&L 1 P&L LVL2 1 P&L LVL1 1
P&L 2 P&L LVL2 1 P&L LVL1 1
P&L 3 P&L LVL2 1 P&L LVL1 1
P&L 4 P&L LVL2 1 P&L LVL1 1
P&L 5 P&L LVL2 2 P&L LVL1 1
P&L 6 P&L LVL2 2 P&L LVL1 1
P&L 7 P&L LVL2 2 P&L LVL1 1
Hierarchy Cost_center Level2 Level1
MGMT 1 MGMT LVL2 1 MGMT LVL1 1
MGMT 2 MGMT LVL2 1 MGMT LVL1 1
MGMT 3 MGMT LVL2 1 MGMT LVL1 1
MGMT 4 MGMT LVL2 2 MGMT LVL1 1
MGMT 5 MGMT LVL2 2 MGMT LVL1 1
MGMT 6 MGMT LVL2 2 MGMT LVL1 1
MGMT 7 MGMT LVL2 2 MGMT LVL1 1

I have an expenses fact table where the cost center is a key.

clip_image002

Now lets see how this is modeled in the RPD.

Physical Layer Joins

I have aliased the security profile table as Sec Profile PL and Sec Profile Mgmt

clip_image004

There is a similar join between the Sec Profile Mgmt table with the Management Hierarchy table.

Here is how the BMM layer is done.

The cost center table will have 2 logical sources and each of these logical sources would have an inner join to the appropriate sec profile alias table.

clip_image006clip_image008

All the columns are mapped to both the logical sources.

Now we need to define appropriate fragmentation content for these 2 logical sources. Here is the Mgmt source

clip_image010

Add the user_id column to the logical table and map it to the appropriate logical sources.

Now all that’s left is to go to the relevant group in the RPD and add the security filter.

clip_image012

Create reports with the hierarchy as one of the columns and set filter to a presentation variable

clip_image014

Also create a prompt and set a presentation variable in the prompt.

clip_image016

Throw the 2 in a dashboard and lets see how the reports work for USER1

clip_image018clip_image020

Its very important to have the filter on the hierarchy column in every query otherwise you would end up overstating the numbers.

Advertisements

Letting the users pick joins in OBIEE

The users want to define the joins between tables in answers, when doing adhoc queries. These views are used for real-time reporting and were sitting on top of EBS tables. This requirement was driven by the fact that most of these views could be joined in more than 2 ways depending of the business scenario. In a typical design we would create alias tables to correspond to each of the joins. But in our case the users were not able to define the joins in advance and there is a possibility of the view definitions changing over a period of time. To sum it up the users wanted to simulate a SQL tool with answers, sort of.

clip_image002

In our case View 1 and View 2 could be joined up to 4 different ways and there were quite a few views like that, making the whole aliasing approach cumbersome. One of the developers in the team came up with an idea, though it’s not the perfect solution, but definitely simple and worth mentioning.

First up we extended all the views in the database by adding a dummy column and the values were always 1.

View 1

Col1 Col2 Col3 Col4 Dummy
1
1

View2

Col1 Col2 Col3 Col4 Dummy
1
1

The physical layer in the RPD would have the join between the views as View1.dummy = View2.Dummy. There is nothing special about the BMM layer, the usual and the same with the presentations layer.

Now comes the answers part. If I were to run a query between the 2 views without any filters in the query the results would not make any since the join is View1.dummy = View2.Dummy.

Let’s say for query 1 I want to join on View1.col1 = View2.col1. Here are the steps.

Put a filter on View1.Col1. (View1.Col1 = value1)

Convert that filter to SQL

Equate View1.Col1 to View2.Col2 in the filter SQL.

clip_image004

This will have your physical queries appended by 2 join conditions 1) which is defined in the RPD(View1.dummy = View2.Dummy) and 2) View1.col1 = View2.col1 .

You could define any joins in a similar fashion. One serious limitation is that the joins are always inner joins.

Dynamic selection of connection pool / DB user id

Sometimes in data warehouses, there would be more than one user created for the OBIEE application. This makes more sense in Teradata data warehouses. This is mostly done to so that any single user (Data base user id used in the connection pool) doesn’t run out of spool space.

Lets say we have three user id’s created the database DW_id1 , DW_id2 and DW_id3. If we have one connection pool in the RPD (for example using DW-id1) all the users logging into OBIEE would be using DW_id1 and all the queries would be run under that id. So there is as high chance of that Teradata user_id running out of spool space as the number of concurrent users increase, assuming they are running some heavy queries. Ideally we would like to distribute the queries send back from OBIEE to the Teradata database between these three user_id’s (databse user_id’s). Here I will discuss about a couple of approaches.

1) First Approach – Using a session variable in the connection pool for the user name. This is the most widely used approach.

I have three users created in the OBIEE environment (USER1, USER2, and USER3). As per our requirement USER1 should use DW_id1 , USER2 DW_id2 and USER3 should use DW_id3.

First up I create a table in my database.

OBIEE_LOGIN DW_LOGIN
USER1 DW_id1
USER2 DW_id2
USER3 DW_id3

We now need a session variable created based on the OBIEE user_id which would be initialized to their corresponding DW_LOGIN. You will have to have a separate connection pool for the variable. Use the session variable created in the user name in the User Name text box in the connection pool properties. Once this is done every query by USER1 would be send to the database would be under DW_id1 and USER 2 DW_id2.

There are 2 major drawbacks to this approach.

a) We cannot guarantee that the database user_id’s would be evenly distributed at any given point in time. For e.g. if we have 100 users with each user id assigned one of the three database id’s assigned. The worst case scenario in this case would be that all the OBIEE users assigned to database id DW-id1 login at the same time. So you could have 33 OBIEE users using one database id and the other database id’s are not being used because those OBIEE users are not logged in.

b) Let’s assume that USER1 and USER2 have the same data security profile setup. In this case you would expect USER 2 to hit the cache created by USER1. This would not happen. To get a cache hit the logical query and all the session variables involved in that query would have to be an exact match. In this case the used_id for the database is a session variable and it is different for USER1 and USER2 and this would result in a cache miss.

So in theory if I could do something as to get rid of the session variable and introduce some sort of round robbing logic in terms of picking up the database user id I would have circumvented both the issues highlighted above.

2) Second approach – Using multiple connection pools.

First up

I create three connection pools since I have the database user id’s, one for each database id.

clip_image002clip_image004

Use the appropriate User Names in TWO and THREE respectively.

Create a sequence in the DB.

CREATE SEQUENCE “BI_DW”.”SEQUENCE_LOGIN” MINVALUE 1 MAXVALUE 3 INCREMENT BY 1 START WITH 3 CACHE 2 ORDER CYCLE ;

Create three groups in the RPD, GROUP_ONE, GROUP_TWO and GROUP_THREE. Go to the permission tab in the connection pool and explicitly deny every group access to the connection pool. Then for Connection Pool one give read access to GROUP_ONE. Refer the picture below (GROUP_TWO)

clip_image006

Now create an initialization block (row wise) to initialize the RPD group “GROUP”. If you have an init block that already initializes “GROUP” then append the following query.

select ‘GROUP’,

case when

sequence_login.nextval = 1 then ‘GROUP_ONE’

ELSE CASE WHEN

SEQUENCE_LOGIN.NEXTVAL = 2 THEN ‘GROUP_TWO’

ELSE ‘GROUP_THREE’

END

END AS VALUE

from dual;

So what is happening right now is that the first user logging in would be assigned GROUP_ONE and the second one GROUP_TWO and third one GROUP_THREE and the fourth one to GROUP_ONE, since the max value for the sequence is 3. Now we have gotten rid of the session variable in the connection pool and no more cache misses. And the database user id’s would be evenly distributed in a round robin fashion.

Procedure to create a simple Time Dimension Table seed data

Here is how to create a simple time dimesion table and to populate it with seed data.

CREATE TABLE “CUBE_DATA”.”W_DAY_D”
(    “ROW_WID” NUMBER(10,0),
“DAY_DATE” DATE,
“YEAR_NUM” NUMBER(10,0),
“MONTH_NAME” VARCHAR2(30 BYTE),
“MONTH_NUMBER” NUMBER(10,0),
“QUARTER_NAME” VARCHAR2(30 BYTE),
“QUARTER_NUMBER” NUMBER(10,0),
“WEAK_YR” NUMBER(10,0),
“WEAK_MTH” NUMBER(10,0),
“MONTH_AGO_WID” NUMBER(10,0),
“QUARTER_AGO_WID” NUMBER(10,0),
“YEAR_AGO_WID” NUMBER(10,0),
“FISCAL_YEAR_NUM” NUMBER(10,0),
“FISCAL_MONTH_NUMBER” NUMBER(10,0),
“FISCAL_QUARTER_NAME” VARCHAR2(30 BYTE),
“FISCAL_QUARTER_NUMBER” NUMBER(10,0),
“FISCAL_WEEK_YR” NUMBER(10,0)
)

create or replace
PROCEDURE W_DAY_D_PROC
( BEGIN_YEAR IN NUMBER
, END_YEAR IN NUMBER,FY_DIFF IN NUMBER
)
AS
/* FY_DIFF is the diiference between Calendar year and Fiscal Year. For example
if the Fiscal year starts in Aug then the difference is 5 */
V_BEGIN_DATE DATE;
V_END_DATE DATE;
V_YEAR NUMBER;
V_ROW_WID NUMBER;
V_MONTH_NAME VARCHAR(30);
V_MONTH_NUMBER NUMBER;
V_QUARTER_NAME VARCHAR(30);
V_QUARTER_NUMBER NUMBER;
V_WEAK_YR NUMBER;
V_WEAK_MTH NUMBER;
————————Ago Variables——————————————-
V_MONTH_AGO_DAY DATE;
V_QUARTER_AGO_DAY DATE;
V_YEAR_AGO_DAY DATE;
V_MONTH_AGO_WID NUMBER(10);
V_QUARTER_AGO_WID NUMBER(10);
V_YEAR_AGO_WID NUMBER(10);

——————— Fiscal Year Variables ————————————
V_FY_YEAR NUMBER;
V_FY_MONTH_NUMBER NUMBER;
V_FY_QUARTER_NAME VARCHAR(30);
V_FY_QUARTER_NUMBER NUMBER;
V_FY_WEEK_YR NUMBER;
V_FY_DATE DATE;
BEGIN

V_BEGIN_DATE := TO_DATE(TO_CHAR(BEGIN_YEAR*10000+100+1),’YYYY/MM/DD’);
V_END_DATE := TO_DATE(TO_CHAR(END_YEAR*10000+1200+31),’YYYY/MM/DD’);

LOOP
————– Calendar Date Information ————————————-
V_YEAR := TO_NUMBER(TO_CHAR(V_BEGIN_DATE,’YYYY’));
V_ROW_WID := TO_NUMBER(TO_CHAR(V_BEGIN_DATE,’YYYYMMDD’));
V_MONTH_NAME := TO_CHAR(V_BEGIN_DATE,’Month’);
V_MONTH_NUMBER := TO_NUMBER(TO_CHAR(V_BEGIN_DATE,’MM’));
V_WEAK_YR := TO_NUMBER(TO_CHAR(V_BEGIN_DATE,’WW’));
V_WEAK_MTH := TO_NUMBER(TO_CHAR(V_BEGIN_DATE,’W’));
V_MONTH_AGO_DAY := ADD_MONTHS(V_BEGIN_DATE,-1);
V_MONTH_AGO_WID := TO_NUMBER(TO_CHAR(V_MONTH_AGO_DAY,’YYYYMMDD’));
V_QUARTER_AGO_DAY := ADD_MONTHS(V_BEGIN_DATE,-3);
V_QUARTER_AGO_WID := TO_NUMBER(TO_CHAR(V_QUARTER_AGO_DAY,’YYYYMMDD’));
V_YEAR_AGO_DAY := ADD_MONTHS(V_BEGIN_DATE,-12);
V_YEAR_AGO_WID := TO_NUMBER(TO_CHAR(V_YEAR_AGO_DAY,’YYYYMMDD’));

CASE
WHEN V_MONTH_NUMBER IN (1,2,3) THEN V_QUARTER_NUMBER := 1;
WHEN V_MONTH_NUMBER IN (4,5,6) THEN V_QUARTER_NUMBER := 2;
WHEN V_MONTH_NUMBER IN (7,8,9) THEN V_QUARTER_NUMBER := 3;
ELSE V_QUARTER_NUMBER := 4;
END CASE;

V_QUARTER_NAME := TO_CHAR(V_YEAR)||’ Q’||TO_CHAR(V_QUARTER_NUMBER) ;

——– Fiscal Date Information ———————————————-

V_FY_DATE := ADD_MONTHS (V_BEGIN_DATE,FY_DIFF);
V_FY_YEAR := TO_NUMBER(TO_CHAR(V_FY_DATE,’YYYY’));
V_FY_MONTH_NUMBER := TO_NUMBER(TO_CHAR(V_FY_DATE,’MM’));
V_FY_WEEK_YR := TO_NUMBER(TO_CHAR(V_FY_DATE,’WW’));

CASE
WHEN V_FY_MONTH_NUMBER IN (1,2,3) THEN V_FY_QUARTER_NUMBER := 1;
WHEN V_FY_MONTH_NUMBER IN (4,5,6) THEN V_FY_QUARTER_NUMBER := 2;
WHEN V_FY_MONTH_NUMBER IN (7,8,9) THEN V_FY_QUARTER_NUMBER := 3;
ELSE V_FY_QUARTER_NUMBER := 4;
END CASE;

V_FY_QUARTER_NAME := TO_CHAR(V_FY_YEAR)||’ Q’||TO_CHAR(V_FY_QUARTER_NUMBER) ;

INSERT INTO W_DAY_D(ROW_WID,DAY_DATE,YEAR_NUM, MONTH_NAME,MONTH_NUMBER,QUARTER_NAME,QUARTER_NUMBER,
WEAK_YR,WEAK_MTH,MONTH_AGO_WID,QUARTER_AGO_WID,YEAR_AGO_WID,FISCAL_YEAR_NUM,FISCAL_MONTH_NUMBER,
FISCAL_QUARTER_NAME,FISCAL_QUARTER_NUMBER,FISCAL_WEEK_YR)
VALUES (V_ROW_WID, V_BEGIN_DATE,V_YEAR,V_MONTH_NAME,V_MONTH_NUMBER,V_QUARTER_NAME,V_QUARTER_NUMBER,
V_WEAK_YR,V_WEAK_MTH,V_MONTH_AGO_WID,V_QUARTER_AGO_WID,V_YEAR_AGO_WID,V_FY_YEAR,V_FY_MONTH_NUMBER,
V_FY_QUARTER_NAME,V_FY_QUARTER_NUMBER,V_FY_WEEK_YR);

IF V_BEGIN_DATE >= V_END_DATE THEN
EXIT;
END IF;
V_BEGIN_DATE := V_BEGIN_DATE + 1;
END LOOP;
COMMIT;

END W_DAY_D_PROC;