Sunday, November 8, 2009

Denormalized HR employees table, dimension hierarchy and level drills in OBIEE

I looked at the table of employees which is placed on standard Oracle HR schema and I thought that it would be great to see it in the denormalized shape in which every row contains data about parent levels.

Table employees has a self referential join (employee_id - manager_id) which describes us relation. Each employee have its parent employee (manager) except employee_id 100 which is president and placed on the top of hierarchy.

Employee hierarchy is unbalanced, and balanced hierarchy means that each branch has the same number of levels (for example products dimension table, sh.products). We'll denormalize employee table (dimension) to get user friendly hierarchy. After we make denormalized structure we'll build dimension with levels in obiee and use the second copy of employees table as a fact table to find out what is the sum of the salary of all employees first level down for the current level of employee/manager.

First we want to find out what is a maximum level depth in employees table:

select
max(LEVEL)
from employees
start with email='SKING'
connect by prior employee_id=manager_id

The maximum level depth is 4. Four levels we'll have in the denormalized shape. We use email column instead of first name and last name to identify each employee and this would be a primary key in each level.

Number of employees on each level:

select
count(email),
LEVEL
from employees
start with email='SKING'
connect by prior employee_id=manager_id
group by LEVEL
order by 2




This is employee data in the basic form:

select LEVEL,
employee_id,
email,
job_id,
manager_id
from employees
start with email='SKING'
connect by prior employee_id=manager_id
order by 1,2




In this example we don't have loops in the data, value below the parent cannot be his parent, so we don't need NOCYCLE parameter in the CONNECT BY condition.

So, employees are placed on each level. If you look at level 3 for example there are lots of employees whose job_id is ST_CLERK and they managers are at level 2 having ST_MAN as job_id. The similar is with level 4, IT_PROG, FI_ACCOUNT and AC_ACCOUNT having managers with job_id IT_PROG, FI_MGR and AC_MGR. Some employees ends at level 2, for example I add new employee which job_id is PRES_ASST (add this job_id into table jobs with president personal assistant as job title), and he doesn't have levels below him.

The first step is going into obiee Administrator physical level and make new physical table as select table type:



I use this code for employee table levels denormalization:

select
employee_id,
employee_level,
value_below,
nvl(level4_id, nvl(level3_id, nvl(level2_id, level1_id))) level4_id,
nvl(level4_desc, nvl(level3_desc, nvl(level2_desc,level1_desc))) level4_desc,
nvl(level3_id, nvl(level2_id, level1_id)) level3_id,
nvl(level3_desc, nvl(level2_desc,level1_desc)) level3_desc,
nvl(level2_id, level1_id) level2_id,
nvl(level2_desc,level1_desc) level2_desc,
level1_id,
level1_desc,
salary,
department_id,
manager_id
from
(
select
LEVEL,
concat('LEVEL ', cast(LEVEL as varchar2(1))) as employee_level,
case when CONNECT_BY_ISLEAF =1 then 'No' else 'Yes' end as value_below,
employee_id,
manager_id,
email,
salary,
department_id,
decode(LEVEL,1,to_char(employee_id),substr
(
SYS_CONNECT_BY_PATH(employee_id, '/'),
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,1)+1,
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,2)-instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,1)-1
)
) level1_id,
decode(LEVEL,1,to_char(email),substr
(
SYS_CONNECT_BY_PATH(email, '/'),
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,1)+1,
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,2)-instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,1)-1
)
) level1_desc,
--Level2
decode(LEVEL,2,to_char(employee_id),substr
(
SYS_CONNECT_BY_PATH(employee_id, '/'),
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,2)+1,
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,3)-instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,2)-1
)
) level2_id,
decode(LEVEL,2,to_char(email),substr
(
SYS_CONNECT_BY_PATH(email, '/'),
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,2)+1,
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,3)-instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,2)-1
)
) level2_desc,
--Level 3
decode(LEVEL,3,to_char(employee_id),substr
(
SYS_CONNECT_BY_PATH(employee_id, '/'),
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,3)+1,
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,4)-instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,3)-1
)
) level3_id,
decode(LEVEL,3,to_char(email),substr
(
SYS_CONNECT_BY_PATH(email, '/'),
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,3)+1,
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,4)-instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,3)-1
)
) level3_desc,
--Level 4
decode(LEVEL,4,to_char(employee_id),substr
(
SYS_CONNECT_BY_PATH(employee_id, '/'),
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,4)+1,
instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,5)-instr(SYS_CONNECT_BY_PATH(employee_id, '/'),'/',1,4)-1
)
) level4_id,
decode(LEVEL,4,to_char(email),substr
(
SYS_CONNECT_BY_PATH(email, '/'),
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,4)+1,
instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,5)-instr(SYS_CONNECT_BY_PATH(email, '/'),'/',1,4)-1
)
) level4_desc,
SYS_CONNECT_BY_PATH(email, '/') "desc_all",
SYS_CONNECT_BY_PATH(employee_id, '/') "id_all",
SYS_CONNECT_BY_PATH(job_id, '/') "job_all"
from employees
start with employee_id=100
connect by prior employee_id=manager_id
order by employee_id
)


For employees that are identified at the high levels I'll propagate current level id and description to lowest levels and for that I used:

nvl(level4_id, nvl(level3_id, nvl(level2_id, level1_id))) level4_id,
nvl(level4_desc, nvl(level3_desc, nvl(level2_desc,level1_desc))) level4_desc,
nvl(level3_id, nvl(level2_id, level1_id)) level3_id,
nvl(level3_desc, nvl(level2_desc,level1_desc)) level3_desc,
nvl(level2_id, level1_id) level2_id,
nvl(level2_desc,level1_desc) level2_desc


This is not mandatory.

For example SKING is identified only at level 1 and the same value is propagated till the level 4.

Before examples, one thing is left to do. In the physical layer make alias of Employee Hierarchy View that we build and name it as Employee Hierarchy Normal. We'll use this as complete employees denormalization data view. Make another alias for employees table, we'll use it as a fact table to get summ of the first level down for the current employee.

So our physical diagram looks like this:



Foreign keys:

"Employee Hierarchy View".employee_id = "Employee Fact Sum First Level Down".MANAGER_ID
"Employee Hierarchy View Normal".employee_id = "Employee Fact Sum First Level Down".MANAGER_ID

In the BMM we make left join from Employee Hierarchy Drills to Employee Fact to return all record even if there is no sum salary of employees first level down for the current employee (manager):



Example - employees dimension hierarchy and level drills using multiple logical table sources



Logical table sources in order:

















Little explanation why we put in the filter of each logical table source current level, for example if we are on the high level (SKING) and want to drill down to second level this query will be generated:

select
distinct
level1_desc,
level2_desc
from Employee Hierarchy View
where level1_desc='SKING'


You will get 16 rows, including the row that has level1_desc='SKING' and level2_desc='SKING' which is actually data placed on the first level (values propagated to lowest level). So with the logical table source filter we are excluding these rows.

Employee hierarchy and drills test in Answers

Report structure:



First level:



Value below column tell us if this level has level below it. Notice measure sum salary first level down and salary current employee dimension attribute.

Drill to the second level:



Now if you try to drill from the TTEST which doesn't have level below him you would get:



This is because when crossing from the second to the third logical table source it trying to execute where condition:

select
distinct
level1_desc,
level2_desc,
level3_desc
from Employee Hierarchy View
where level1_desc='SKING'
and level2_desc='TTEST'
and employee_level='LEVEL 3'


There is no level 3 for TTEST.

Drill now to the third level for any value that has level down:



Not bad, NGREENBE has a salary 12000 and all employees which he's a manager have sum of the salary 39600. Chech this when drilling from him to lowest level, the last fourth level:



The sum works fine.

What if we want to show in the report all employees with denormalized structure and the fact measure without starting from the higher level and without using drills?

Lets try that:



The result is not good because the query goes to the last logical table source which contains all columns on the level 4 and the employee_level='LEVEL 4' filter is applied:





So we must think something else in order to see all employees.

Options to show detailed (complete) view of all employees

1. Change the filter of the last logical table source dynamic with non-system session variable and re-set it from the dashboard prompt

We'll make two reports, one starts with highest level and used for drill down, other is detailed with all levels shown in the report but without using drills.

Create initialization block and non-system session variable:



Create dashboard prompt which shows only Complete and Dimension hardcoded values and set request (session) variable SV_EMPLOYEE_HIERARCHY. Create two sections, each has guided navigation report to show or hide it according to session variable value that user sets from the prompt. I'll not show how to implement that, just the result.

In the logical table source for level 4 write:

orcl."".HR."Employee Hierarchy View".employee_level=CASE WHEN VALUEOF(NQ_SESSION."SV_EMPLOYEE_HIERARCHY")='Dimension' THEN 'LEVEL 4' ELSE orcl."".HR."Employee Hierarchy View".employee_level END



Test

Choose dimension view with drills enabled:



Don't forget to enable Drill in place option in the section properties.

Now you are able to drill down to the lowest level (level 4) like we have shown and described before, everything is the same.



And if you drill to the level 4 it used valid code from the last logical table source:



Now choose complete view with drills disabled.



The result:



In complete view all employees are included:



NQQuery.log:



2. Make the separate employees logical table without dimension drills

Make new logical table using Employee Hierarchy View Normal alias from the physical layer that we made before:



In the BMM join this table to Employee fact logical table:



Don't forget left join:



You can use this logical table for detailed (complete) view to show all employees and still use previous logical table with dimension and drills. In that case you don't need this filter in the Employee Hierarchy Drills logical table in the logical table source for level 4:

orcl."".HR."Employee Hierarchy View".employee_level=CASE WHEN VALUEOF(NQ_SESSION."SV_EMPLOYEE_HIERARCHY")='Dimension' THEN 'LEVEL 4' ELSE orcl."".HR."Employee Hierarchy View".employee_level END

Make detailed report using Employee Hierarchy Normal logical table.

3 comments:

Anonymous said...

hi- i've been trying to implement unbalanced hierarchies in a similar way, but getting a consistency error on the logical source definitions. where is employee_id in your logical tables, and isnt that used to join fact with dimension? if you dont mind being contacted for more info, please respond with contact information. thank you.

Goran Ocko said...

On the physical layer I use employee_id as the physical foreign key for join between physical tables. On the BMM I use level4_id as the logical key for join between logical tables. Level4_id has always value and it is equal to employee_id. Any logical column can be a logical key, just be sure that it is mapped only on the last LTS (and level) if you have many LTS in single LT.

Unknown said...

Great tutorial, the sql help a lot to solve my problem. Thank you