Header image  
database IV  
line decor
  HOME :: BUSINESS :: COMPUTING :: LIFESTYLE :: PASTIME ::SCIENCES ::
line decor
   
 database IV
 
       

DBA Responsibilities:

* Installation, configuration and upgrading of Oracle server software and related products
* Evaluate Oracle features and Oracle related products
* Establish and maintain sound backup and recovery policies and procedures
* Take care of the Database design and implementation
* Implement and maintain database security (create and maintain users and roles, assign priveledges)
* Do database tuning and performance monitoring
* Do application tuning and performance monitoring
* Setup and maintain documentation and standards
* Plan growth and changes (capasity planning)
* Work as part of a team and provide 7x24 support when required
* Do general technical trouble shooting and give consultation to development teams
* Interface with Oracle Corporation for technical support.

Oracle DBA Responsibilities

The DBA should posses the following skills or Desired Skills:


Typical Interview Questions
General Questions

General Oracle Questions

Did you use online or off-line backups?
What version of Oracle were you running?
Haw many databases and what sizes?
If you have to advise a backup strategy for a new application, how would you approach it and what questions will you ask?
If a customer calls you about a hanging database session, what will you do to resolve it?
Compare Oracle to any other database that you know. Why would you prefer to work on one and not on the other?

 

Outer joins - Usage and efficiency

Contents of this document


Purpose

Outer joins enable rows to be returned from a join where one of the tables does not contain matching rows for the other table.

Suppose we have two tables:

     Person ------
         Person_id   Name                   Address_id
         ---------   ----------------       ----------
         00001       Fred Bloggs            00057
         00002       Joe Smith              00092
         00003       Jane Doe
         00004       Sue Jones              00111
 
     Address -------
         Address_id   Address_Desc
         ----------   -------------------------
         00057        1, Acacia Avenue, Anytown
         00092        13, High Street, Anywhere
         00113        52, Main Road, Sometown
 

Then the simple join:

         SELECT PERSON.NAME, ADDRESS.ADDRESS_DESC
           FROM PERSON, ADDRESS
          WHERE PERSON.ADDRESS_ID = ADDRESS.ADDRESS_ID

returns:

    NAME          ADDRESS_DESC
    ----------    ------------
    Fred Bloggs   1, Acacia Avenue, Anytown
    Joe Smith     13, High Street, Anywhere

But the outer join:

         SELECT PERSON.NAME, ADDRESS.ADDRESS_DESC
           FROM PERSON, ADDRESS
          WHERE PERSON.ADDRESS_ID = ADDRESS.ADDRESS_ID(+)

returns:

    NAME          ADDRESS_DESC
    ----------    ------------
    Fred Bloggs   1, Acacia Avenue, Anytown
    Joe Smith     13, High Street, Anywhere
    Jane Doe
    Sue Jones
 

Note the two new rows for Jane Doe and Sue Jones. These are the people who do not have matching records on the ADDRESS table. Sue Jones had an address_id on her PERSON record, but this didn't match an address_id on the ADDRESS table. ( Probably a data inconsistency ). Jane Doe had NULL in her PERSON.ADDRESS_ID field, which obviously doesn't match any address_id on the ADDRESS table.

Note that the outer join is created by including (+) on the WHERE clause which joins the two tables. The (+) is put against the column-name on the deficient table, ie. the one with the missing rows. It is very important to put the (+) on the correct table: putting it on the other table will give different results. eg. the query:

         SELECT PERSON.NAME, ADDRESS.ADDRESS_DESC
           FROM PERSON, ADDRESS
          WHERE PERSON.ADDRESS_ID(+) = ADDRESS.ADDRESS_ID
 

returns:

    NAME          ADDRESS_DESC
    ----------    ------------
    Fred Bloggs   1, Acacia Avenue, Anytown
    Joe Smith     13, High Street, Anywhere
                  52, Main Road, Someplace
 

Outer joins and other conditions

Note that it is pointless implementing an outer join on a table if there are other conditions on the deficient table.

For example, consider the query ...

           SELECT ORDER.CUSTOMER_NO,
                  ORD_LINE.AMOUNT
             FROM ORDER, ORDER_LINE
            WHERE ORDER.ORDER_NO = ORD_LINE.ORDER_NO(+)
              AND ORD_LINE.ITEM_NO = 7

When an outer join returns rows from the driving table which have no matching row on the driven table, the dummy rows from the driven table are populated with nulls. If there is an additional condition on the driven table ( ORD_LINE.ITEM_NO = 7, in the above example ), then that will eliminate the additional rows which the outer join supplied. In this situation the outer join can be safely replaced by a standard join. This gives the optimiser greater flexibility in executing the query (see below) and may lead to dramatic performance improvements in the query.

There is a caveat here: the assertion that it is pointless implementing an outer join on a table if there are other conditions on the deficient table is only true when we are considering other conditions which don't take account of null values. If the other conditions do take account of null values ( eg. NVL(ORD_LINE.ITEM_NO,7) = 7 ) then replacing the outer join by a standard join may not be safe. Conditions which take account of null values include nvls, decodes and further outer-joins to other tables.


Efficiency implications of Outer joins

Outer joins affect the performance of queries in several ways. One is obvious and expected, the others are not so obvious.

The obvious implication of outer joins is that, since a row is returned for every row in the driving table, an outer join will not reduce the number of rows passed through to the next join condition in the same way that a standard join would.

A much more significant effect of outer joins is that under the rule based optimiser, they force the query to select the non-deficient table as the driving table. This may be undesirable.

Consider the following example:

           SELECT ORDER.CUSTOMER_NO,
                  ORD_LINE.AMOUNT
             FROM ORDER, ORDER_LINE
            WHERE ORDER.ORDER_NO = ORD_LINE.ORDER_NO(+)
              AND ORD_LINE.ITEM_NO = 7

Because there is an outer join condition on ORDER_LINE, the rule-based optimiser will choose ORDER as the driving table. This is particularly distressing if there are thousands of orders but the index on ORDER_LINE.ITEM_NO was very selective and would have reduced the number of rows processed had it been used. In this case, the outer-join was unnecessary and could have been replaced by a normal join (see notes above), resulting in a dramatic performance improvement.

( As an interesting side-issue: under Oracle 7.3.3, using optimiser_mode = 'CHOOSE', but without analysing the affected tables, the optimiser is not forced to drive from the non-deficient table, even though the optimiser is mean't to be equivalent to the rule-based optimiser in this situation. ie. the Cost-based optimiser running against unanalysed tables is not identical to the rule-based optimiser ).


Outer joins and views

There is another situation in which using an outer-join can cause serious performance problems ( and to which, unfortunately, there seems to be no general solution ). This is the problem of whether the optimiser can integrate the view into the query SQL, or whether it has to use the VIEW, SORT and MERGE operators in the query plan.

Consider the following example from the HR (Personnel) database ...

The view WTE_VALUES is defined as follows:

      CREATE OR REPLACE VIEW WTE_VALUES AS
      SELECT    asg.assignment_id                    assignment_id,
                asg.person_id                        person_id,
                SUBSTR(eva.screen_entry_value,1,4)   wte_value,
                asg.effective_start_date             asg_start,
                SUBSTR(grp.segment10,1,3)            work_type
      FROM      per_assignments_f                    asg,
                pay_input_values_f                   iva,
                pay_element_entry_values_f           eva,
                pay_element_entries_f                ent,
                pay_people_groups                    grp
      WHERE   asg.assignment_id        = ent.assignment_id
      AND     eva.effective_start_date = ent.effective_start_date
      AND     eva.element_entry_id     = ent.element_entry_id
      AND     iva.input_value_id       = eva.input_value_id
      AND     iva.name                 = 'WTE'
      AND     asg.people_group_id      = grp.people_group_id

The standard-join query ...

     SELECT ...
     FROM PER_ASSIGNMENTS_F A,
          WTE_VALUES V
     WHERE V.ASSIGNMENT_ID = A.ASSIGNMENT_ID
     AND A.ASSIGNMENT_ID = 5004
 

gives statistics ...

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      1.40       1.54          5          0         10           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1      0.01       0.15          9         29          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3      1.41       1.69         14         29         10           0
 

and an execution plan of ...

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   HINT: RULE
      0   NESTED LOOPS
      0    NESTED LOOPS
      0     NESTED LOOPS
      2      NESTED LOOPS
      4       NESTED LOOPS
      3        INDEX (RANGE SCAN) OF 'PER_ASSIGNMENTS_F_PK' (UNIQUE)
      4        TABLE ACCESS (BY ROWID) OF 'PER_ASSIGNMENTS_F'
      6         INDEX (RANGE SCAN) OF 'PER_ASSIGNMENTS_F_PK' (UNIQUE)
      4       INDEX (UNIQUE SCAN) OF 'PAY_PEOPLE_GROUPS_PK' (UNIQUE)
      0      TABLE ACCESS (BY ROWID) OF 'PAY_ELEMENT_ENTRIES_F'
      2       INDEX (RANGE SCAN) OF 'PAY_ELEMENT_ENTRIES_F_N51' (NON-UNIQUE)
      0     TABLE ACCESS (BY ROWID) OF 'PAY_ELEMENT_ENTRY_VALUES_F'
      0      INDEX (RANGE SCAN) OF 'PAY_ELEMENT_ENTRY_VALUES_F_N50' (NON-UNIQUE)
      0    TABLE ACCESS (BY ROWID) OF 'PAY_INPUT_VALUES_F'
      0     INDEX (RANGE SCAN) OF 'PAY_INPUT_VALUES_F_PK' (UNIQUE)
 

Whilst the outer-join query ...

     SELECT ...
     FROM PER_ASSIGNMENTS_F A,
          WTE_VALUES V
     WHERE V.ASSIGNMENT_ID(+) = A.ASSIGNMENT_ID
     AND A.ASSIGNMENT_ID = 5004
 

gives statistics of ...

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.06       0.06          0          0          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch        1    138.89     175.98      31378     931451          3           2
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        3    138.95     176.04      31378     931451          3           2
 
 

and an execution plan of ...

Rows     Execution Plan
-------  ---------------------------------------------------
      0  SELECT STATEMENT   HINT: RULE
      0   MERGE JOIN (OUTER)
      3    INDEX (RANGE SCAN) OF 'PER_ASSIGNMENTS_F_PK' (UNIQUE)
   4845    SORT (JOIN)
   4845     VIEW OF 'WTE_VALUES'
   4845      NESTED LOOPS
 115032       NESTED LOOPS
  27470        NESTED LOOPS
  18068         NESTED LOOPS
   2718          TABLE ACCESS (FULL) OF 'PAY_PEOPLE_GROUPS'
  18068          TABLE ACCESS (BY ROWID) OF 'PER_ASSIGNMENTS_F'
  20786           INDEX (RANGE SCAN) OF 'PER_ASSIGNMENTS_F_FK16' (NON-UNIQUE)
  27470         TABLE ACCESS (BY ROWID) OF 'PAY_ELEMENT_ENTRIES_F'
  45538          INDEX (RANGE SCAN) OF 'PAY_ELEMENT_ENTRIES_F_N51' (NON-UNIQUE)
 115032        TABLE ACCESS (BY ROWID) OF 'PAY_ELEMENT_ENTRY_VALUES_F'
 142502         INDEX (RANGE SCAN) OF 'PAY_ELEMENT_ENTRY_VALUES_F_N50' (NON-UNIQUE)
 115032       TABLE ACCESS (BY ROWID) OF 'PAY_INPUT_VALUES_F'
 230064        INDEX (RANGE SCAN) OF 'PAY_INPUT_VALUES_F_PK' (UNIQUE)

What is going on? Why is a standard join to a view able to be integrated into the query as nested loops, but an outer join is executed using a merge sort?

When a view cannot be integrated into the query, the VIEW operation is used to return ALL rows that the view would have returned without any additional WHERE clauses ( This is not strictly true. It appears that predicates, ie. WHERE clauses which refer to constants, or bound variables will be applied, but not WHERE clauses which are join conditions. In the above example, there are no predicates on the view WTE_VALUES, so the view is unbounded, and returns 4845 rows ). This data is sorted, the data from the main query is sorted and the two result sets are merged.

Performing a direct comparison of Fetch statistics, we see ...

Query

CPU time (secs)

Elapsed time (secs)

Physical I/O

Logical I/O

Standard Join

0.01

0.15

9

29

Outer Join

138.89

175.98

31378

931454

Degradation factor

13889x

1173x

3486x

32119x

The query with the outer join is much less efficient than the query with the standard join. This is because an unbounded WTE_VALUES view returns a lot of rows and processes a lot of rows internally to do so.


Why can't Oracle integrate an outer-join of a view?

Let's consider a simpler example. Suppose we have the following tables ...

    Order table
    -----------
 
         ORDER_NO   DATE_RAIS CUSTOMER_NAME
         ---------- --------- ---------------
         00001/23   04-APR-98 Dave Wotton
         00002/07   17-MAY-98 Fred Bloggs
 
 
    Order-Line table
    ----------------
 
         ORDER_NO   LINE_NO ITEM_NO      QUANTITY
         ---------- ------- ---------- ----------
         00001/23         1 1100                2
         00001/23         2 1101                7
         00001/23         3 1102                1
         00001/23         4 1103               13
         00001/23         5 1104                6
 
 
    Item table
    ----------
 
         ITEM_NO    DESCRIPTION               PRICE
         ---------- -------------------- ----------
         1100       Deluxe Widget             12.99
         1101       Standard Whotsit           5.37
         1102       Micro Thingy                8.5
 

with indexes:

     IND1 on ORDER(ORDER_NO)
     IND2 on ORDER_LINE(ORDER_NO,LINE_NO)
     IND3 on ITEM(ITEM_NO)
 

and suppose the view ORDER_LINE_VIEW is defined as ...

   SELECT L.ORDER_NO, L.LINE_NO, I.ITEM_NO, I.DESCRIPTION,
          I.PRICE, L.QUANTITY
     FROM ITEM I,
          ORDER_LINE L
    WHERE I.ITEM_NO = L.ITEM_NO
 

Then the standard-join query ....

     SELECT O.ORDER_NO, O.DATE_RAISED,
            V.LINE_NO, V.DESCRIPTION, V.PRICE, V.QUANTITY
      FROM ORDER_LINE_VIEW V,
           ORDER O
      WHERE V.ORDER_NO = O.ORDER_NO
 

returns the following data ....

      ORDER_NO   DATE_RAIS LINE_NO DESCRIPTION               PRICE   QUANTITY
      ---------- --------- ------- -------------------- ---------- ----------
      00001/23   04-APR-98       1 Deluxe Widget             12.99          2
      00001/23   04-APR-98       2 Standard Whotsit           5.37          7
      00001/23   04-APR-98       3 Micro Thingy                8.5          1

and has an execution plan of ...

      Rows     Execution Plan
      -------  ---------------------------------------------------
            0  SELECT STATEMENT   GOAL: RULE
            3   NESTED LOOPS
            5    NESTED LOOPS
            2     TABLE ACCESS (FULL) OF 'ORDER'
            5     TABLE ACCESS (BY ROWID) OF 'ORDER_LINE'
            7      INDEX (RANGE SCAN) OF 'IND2' (NON-UNIQUE)
            3    TABLE ACCESS (BY ROWID) OF 'ITEM'
            8     INDEX (RANGE SCAN) OF 'IND3' (NON-UNIQUE)

Whilst the outer-join query ...

    SELECT O.ORDER_NO, O.DATE_RAISED,
           V.LINE_NO, V.DESCRIPTION, V.PRICE, V.QUANTITY
     FROM ORDER_LINE_VIEW V,
          ORDER O
     WHERE V.ORDER_NO(+) = O.ORDER_NO

which returns the following data ...

     ORDER_NO   DATE_RAIS LINE_NO DESCRIPTION               PRICE   QUANTITY
     ---------- --------- ------- -------------------- ---------- ----------
     00001/23   04-APR-98       1 Deluxe Widget             12.99          2
     00001/23   04-APR-98       2 Standard Whotsit           5.37          7
     00001/23   04-APR-98       3 Micro Thingy                8.5          1
     00002/07   17-MAY-98

has an execution plan of ...

     Rows     Execution Plan
     -------  ---------------------------------------------------
           0  SELECT STATEMENT   GOAL: RULE
           3   MERGE JOIN (OUTER)
           2    SORT (JOIN)
           2     TABLE ACCESS (FULL) OF 'ORDER'
           3    SORT (JOIN)
           3     VIEW OF 'ORDER_LINE_VIEW'
           3      NESTED LOOPS
           5       TABLE ACCESS (FULL) OF 'ORDER_LINE'
           3       TABLE ACCESS (BY ROWID) OF 'ITEM'
           8        INDEX (RANGE SCAN) OF 'IND3' (NON-UNIQUE)

demonstrating the same effect as in the Personnel query.

If it were possible for the optimiser to process this query as nested loops, it would be equivalent to restructuring the query into one of the three following queries. But if we look at each one, we see that none are equivalent to the original query.

Alternative 1 is:

     SELECT O.ORDER_NO, O.DATE_RAISED,
            L.LINE_NO, I.DESCRIPTION, I.PRICE, L.QUANTITY
      FROM ITEM I,
           ORDER_LINE L,
           ORDER O
      WHERE L.ORDER_NO = O.ORDER_NO
        AND L.ITEM_NO = I.ITEM_NO(+)

and produces the following results ...

     ORDER_NO   DATE_RAIS LINE_NO DESCRIPTION               PRICE   QUANTITY
     ---------- --------- ------- -------------------- ---------- ----------
     00001/23   04-APR-98       1 Deluxe Widget             12.99          2
     00001/23   04-APR-98       2 Standard Whotsit           5.37          7
     00001/23   04-APR-98       3 Micro Thingy                8.5          1
     00001/23   04-APR-98       4                                         13
     00001/23   04-APR-98       5                                          6

Alternative 2 is:

     SELECT O.ORDER_NO, O.DATE_RAISED,
            L.LINE_NO, I.DESCRIPTION, I.PRICE, L.QUANTITY
      FROM ITEM I,
           ORDER_LINE L,
           ORDER O
      WHERE L.ORDER_NO(+) = O.ORDER_NO
        AND L.ITEM_NO = I.ITEM_NO

and produces ...

     ORDER_NO   DATE_RAIS LINE_NO DESCRIPTION               PRICE   QUANTITY
     ---------- --------- ------- -------------------- ---------- ----------
     00001/23   04-APR-98       1 Deluxe Widget             12.99          2
     00001/23   04-APR-98       2 Standard Whotsit           5.37          7
     00001/23   04-APR-98       3 Micro Thingy                8.5          1

and alternative 3 is:

     SELECT O.ORDER_NO, O.DATE_RAISED,
            L.LINE_NO, I.DESCRIPTION, I.PRICE, L.QUANTITY
      FROM ITEM I,
           ORDER_LINE L,
           ORDER O
      WHERE L.ORDER_NO(+) = O.ORDER_NO
        AND L.ITEM_NO = I.ITEM_NO(+)

producing ...

     ORDER_NO   DATE_RAIS LINE_NO DESCRIPTION               PRICE   QUANTITY
     ---------- --------- ------- -------------------- ---------- ----------
     00001/23   04-APR-98       1 Deluxe Widget             12.99          2
     00001/23   04-APR-98       2 Standard Whotsit           5.37          7
     00001/23   04-APR-98       3 Micro Thingy                8.5          1
     00001/23   04-APR-98       4                                         13
     00001/23   04-APR-98       5                                          6
     00002/07   17-MAY-98

None of which match the original query.


Is there anything that can be done?

The first point to note is that the Oracle 7.3.3 optimiser can integrate outer joins to simple views made up of a single table. ( The version 7.1.6 optimiser cannot do this ). Future versions of the optimiser may be able to integrate outer joins to more complex views.

Secondly, under version 7.3.3 or earlier, it does not help using the cost-based optimiser with INDEX, FIRST_ROWS or USE_NL hints: the optimiser still cannot integrate outer joins to views of more than one table.

Finally, it is worth noting that the merge join approach adopted by default by the optimiser may occasionally be the best approach.

eg. If you are performing an outer join of a large result set against a view which efficiently returns only a handful of rows, it is probably more efficient for the optimiser to sort the result set and the view rows and merge them than to be forced into a nested loops technique which repeatedly accesses the same few rows from the view. However, as we have seen from our examples, merge join is disastrously inefficient if we are performing an outer join of a small result set against a view which returns a large number of rows and/or is inefficient when unbounded.

Where a merge join is inefficient, there are some things that can be done which can dramatically improve performance in certain situations, but nothing simple which can be generally applied.

Option 1 - using DECODE and NVL

This option only works if the join column on the driving table either contains nulls or, where it contains a value, there is always a matching row on the driven table. In this case, the outer join is only being used to handle the null values in the driving table. It is useful for table lookups but does not work for parent/child joins where parent rows may have no children ( as the join column in the driving table would then contain values which are not matched in the driven table ).

eg. Replace ...

      SELECT A.ASSIGNMENT_ID, V.WTE_VALUE
      FROM PER_ASSIGNMENTS_F A,
           WTE_VALUES V
      WHERE V.ASSIGNMENT_ID(+) = A.ASSIGNMENT_ID

By ...

      SELECT A.ASSIGNMENT_ID, DECODE(A.ASSIGNMENT_ID,NULL,NULL,V.WTE_VALUE)
      FROM PER_ASSIGNMENTS_F A,
           WTE_VALUES V
      WHERE V.ASSIGNMENT_ID = NVL(A.ASSIGNMENT_ID,5000)

ie. in the WHERE clause, we use nvl to replace the value of A.ASSIGNMENT_ID, if it is null, by a known value, 5000, which we know will return a row from WTE_VALUES. In the SELECT clause, we use decode to return NULL if the value of A.ASSIGNMENT_ID is NULL, otherwise we return the looked up WTE_VALUE.

( This is a somewhat artificial example because, of course, A.ASSIGNMENT_ID will never be null here! )

Although this handles the case where the join column in the driving table contains nulls, it fails if the join column in the driving table can contain unmatched values, as the simple join to the view will then return no row, whereas the original outer join would have returned a null row. It is also important to choose a value to replace null values which only returns a single row from the view, otherwise rows from the driving table which contain nulls in the join column will generate multiple dummy rows from the join.

Option 2 - using user-defined functions

This approach can be very efficient, but is only applicable for lookups ( ie. where the join was present simply to return a single row containing a single value ). Instead of creating an outer-join to the lookup view, we define a function which does the lookups:

     CREATE OR REPLACE FUNCTION LOOKUP_WTE (L_ASSIGNMENT_ID VARCHAR2)
            RETURN VARCHAR2 IS
     --
        RETURN_VALUE VARCHAR2(80);
     --
        CURSOR C1 IS
           SELECT WTE_VALUE
            FROM  WTE_VALUES
           WHERE  ASSIGNMENT_ID = L_ASSIGNMENT_ID;
     --
         BEGIN
     --
           RETURN_VALUE := NULL;
     --
           IF ( L_ASSIGNMENT_ID IS NOT NULL ) THEN
              OPEN C1;
              FETCH C1 INTO RETURN_VALUE;
              CLOSE C1;
           END IF;
     --
           RETURN RETURN_VALUE;
     --
        END;

Now our query becomes ...

      SELECT A.ASSIGNMENT_ID, LOOKUP_WTE(A.ASSIGNMENT_ID)
      FROM PER_ASSIGNMENTS_F A

This approach has several advantages ...

These advantages can be so great that they even provide a significant performance improvement over a query which performs standard joins to a lookup view. ( But see the disadvantages, below ).

There are a few disadvantages ...


An incorrect approach

It might be tempting to modify alternative 3 above, integrating the view conditions into the main query manually, converting all simple joins from the view into outer joins, and then including additional WHERE clauses to allow only the resultant rows which have nulls in all the columns from the view's tables, or in none of the columns.

Taking our original example, transform ...

      SELECT ...
      FROM PER_ASSIGNMENTS_F A,
           WTE_VALUES V
      WHERE V.ASSIGNMENT_ID(+) = A.ASSIGNMENT_ID

into ...

      SELECT ...
        FROM PER_ASSIGNMENTS_F               A,
             PER_ASSIGNMENTS_F               ASG,
             PAY_INPUT_VALUES_F              IVA,
             PAY_ELEMENT_ENTRY_VALUES_F      EVA,
             PAY_ELEMENT_ENTRIES_F           ENT,
             PAY_PEOPLE_GROUPS               GRP
       WHERE  ASG.ASSIGNMENT_ID(+)           = A.ASSIGNMENT_ID
         AND  ENT.ASSIGNMENT_ID(+)           = ASG.ASSIGNMENT_ID
         AND  EVA.EFFECTIVE_START_DATE(+)    = ENT.EFFECTIVE_START_DATE
         AND  EVA.ELEMENT_ENTRY_ID(+)        = ENT.ELEMENT_ENTRY_ID
         AND  IVA.INPUT_VALUE_ID(+)          = EVA.INPUT_VALUE_ID
         AND  IVA.NAME(+)                    = 'WTE'
         AND  GRP.PEOPLE_GROUP_ID(+)         = ASG.PEOPLE_GROUP_ID
         AND  ( (   ENT.ASSIGNMENT_ID        IS NULL
                AND EVA.EFFECTIVE_START_DATE IS NULL
                AND EVA.ELEMENT_ENTRY_ID     IS NULL
                AND IVA.INPUT_VALUE_ID       IS NULL
                AND IVA.NAME                 IS NULL
                AND GRP.PEOPLE_GROUP_ID      IS NULL)
              OR
              (     ENT.ASSIGNMENT_ID        IS NOT NULL
                AND EVA.EFFECTIVE_START_DATE IS NOT NULL
                AND EVA.ELEMENT_ENTRY_ID     IS NOT NULL
                AND IVA.INPUT_VALUE_ID       IS NOT NULL
                AND IVA.NAME                 = 'WTE'
                AND GRP.PEOPLE_GROUP_ID      IS NOT NULL)
              )

Apart from being messy, it also won't work: the conditions are too restrictive.

For example, there could be a row on PAY_ELEMENT_ENTRIES_F (ENT) which matches the row on PER_ASSIGNMENTS_F (ASG), but no row on PAY_ELEMENT_ENTRY_VALUES_F (EVA) which matches the values of ENT.EFFECTIVE_START_DATE and ENT.ELEMENT_ENTRY_ID. In this case, the original view, WTE_VALUES, would return a null row to the outer join, but the integrated query will not return a row at all as ASG.ASSIGNMENT_ID and ENT.ASSIGNMENT_ID are not null, but all the other join columns are null. Similar arguments can be applied to the other join conditions and so we see that no combination of IS NULL and IS NOT NULL conditions will work. But we cannot remove these conditions altogether as we have already noted that this is too permissive.


Cartesian Products

When you join tables, make sure that the number of join predicates in the search condition is one less than the number of tables in the from list

. Otherwise, you will get many more rows returned than you probably intended. For example, table english and spanish look like this:
     * select * from english;               * select * from spanish;
     -----------------------------          -----------------------------
     |tag          |name         |          |tag          |name         |
     -----------------------------          -----------------------------
     |1            |one          |          |2            |dos          |
     |2            |two          |          |3            |tres         |
     |3            |three        |          |4            |cuatro       |
     -----------------------------          -----------------------------
     3 rows selected                        3 rows selected

If you select from both tables without joining them in the where clause, you get a cartesian product, every possible combination of both:

     * select * from english, spanish;
     ---------------------------------------------------------
     |tag          |name         |tag          |name         |
     ---------------------------------------------------------
     |2            |dos          |1            |one          |
     |3            |tres         |1            |one          |
     |4            |cuatro       |1            |one          |
     |2            |dos          |2            |two          |
     |3            |tres         |2            |two          |
     |4            |cuatro       |2            |two          |
     |2            |dos          |3            |three        |
     |3            |tres         |3            |three        |
     |4            |cuatro       |3            |three        |
     ---------------------------------------------------------
     9 rows selected

Most likely, this is not what you had in mind. Since there are two tables in the from_list, one join predicated is needed:

     * select * from english, spanish
     where english.tag = spanish.tag;
     ---------------------------------------------------------
     |tag          |name         |tag          |name         |
     ---------------------------------------------------------
     |2            |dos          |2            |two          |
     |3            |tres         |3            |three        |
     ---------------------------------------------------------
     2 rows selected

 

Inner and Outer Joins

A join between two tables does not include any rows from either table that have no matching rows in the other. This is called an inner join and frequently causes confusion since fewer rows are returned than the user expects. For example, tables english and spanish look like this:

     * select * from english;               * select * from spanish;
     -----------------------------          -----------------------------
     |tag          |name         |          |tag          |name         |
     -----------------------------          -----------------------------
     |1            |one          |          |2            |dos          |
     |2            |two          |          |3            |tres         |
     |3            |three        |          |4            |cuatro       |
     -----------------------------          -----------------------------
     3 rows selected                        3 rows selected

When you join these two tables, you get only the two rows that have the same tag:

     * select e.name, e.tag, s.name
     from english e, spanish s
     where e.tag = s.tag;
     -------------------------------------------
     |name         |tag          |name         |
     -------------------------------------------
     |two          |2            |dos          |
     |three        |3            |tres         |
     -------------------------------------------
     2 rows selected

Row one in table english and row cuatro in table spanish fall into the outer joins:

                              Joins 
 
                      +--------------+
     left outer --->  |  one      1  |
                      |        +--------------+
                +-->  |  two   |  2  :  dos   |
     inner join |     |        |     :        |
                +-->  |  three |  3  :  tres  |
                      +--------|- - -+        |
                               |  4     cuatro| <--- right outer
                               +--------------+

You can select outer join rows by using not exists. This query fetches the row in english that is not in spanish (the left outer join):

     * select  e.name as English, e.tag, '--no row --' as Spanish
     from      english e
     where     not exists
               (select * from spanish s
               where e.tag=s.tag);
     -------------------------------------------
     |English      |tag          |Spanish      |
     -------------------------------------------
     |one          |1            |--no row --  |
     -------------------------------------------
     one row selected

This query fetches the row in spanish that is not in english (the right outer join):

     * select  '--no entry--' as English, s.tag, s.name as Spanish
     from      spanish s
     where     not exists
               (select * from english e
               where e.tag=s.tag);
     -------------------------------------------
     |English      |tag          |Spanish      |
     -------------------------------------------
     |--no entry-- |4            |cuatro       |
     -------------------------------------------
     one row selected

You can string all statements together with union:

     * select  e.name::text as English, e.tag, s.name::text as Spanish
     from      english e, spanish s
     where     e.tag = s.tag
     union
     select    e.name::text, e.tag, '--no entry--'::text
     from      english e
     where     not exists
               (select * from spanish s
               where e.tag=s.tag)
     union
     select    '--no entry--'::text, s.tag, s.name::text
     from      spanish s
     where     not exists
               (select * from english e
               where e.tag=s.tag)
     order by 2;
 
     -------------------------------------------
     |English      |tag          |Spanish      |
     -------------------------------------------
     |one          |1            |--no entry-- |
     |two          |2            |dos          |
     |three        |3            |tres         |
     |--no entry-- |4            |cuatro       |
     -------------------------------------------
     4 rows selected

If you think this is a lot of trouble to retrieve outer join data, there's another way to handle known joins in Illustra that will factor in outer join data. Keep reading.

Solving Outer Joins in Illustra with ref()

Confusion with outer joins was described above. This section looks at another way to resolve outer join confusions in Illustra by using ref().

We start by creating the two tables like this and inserting data:

     create table spanish of new type spanish_t
     (name  varchar(20),
      tag   integer);
 
     create table english of new type english_t
     (name  varchar(20), 
      tag   integer,
      sname ref(spanish_t));
 
     insert into english (name, tag) values ('one',    1);
     insert into english (name, tag) values ('two',    2);
     insert into english (name, tag) values ('three',  3);
 
     insert into spanish (name, tag) values ('dos',    2);
     insert into spanish (name, tag) values ('tres',   3);
     insert into spanish (name, tag) values ('cuatro', 4);

Next we update the reference in english:

     * update english 
     set sname = (select unique ref(s1) from spanish s1
               where english.tag = s1.tag);
     3 rows updated
 
     * select * from english;
     -------------------------------------------
     |name         |tag          |sname        |
     -------------------------------------------
     |one          |1            |NULL         |
     |two          |2            |202d.2001    |
     |three        |3            |202d.2002    |
     -------------------------------------------
     3 rows selected

Notice that the select from english returned the oid reference to spanish. You can dereference that oid as follows:

     * select name as english, tag, deref(sname).name as spanish from english;
     -------------------------------------------
     |english      |tag          |spanish      |
     -------------------------------------------
     |one          |1            |NULL         |
     |two          |2            |dos          |
     |three        |3            |tres         |
     -------------------------------------------
     3 rows selected

We can also take it the opposite way by updating the spanish_t type and spanish table as follows:

     * alter type spanish_t
     add column ename ref(english_t);
 
     * update spanish
     set ename = (select unique ref(e1) from english e1
               where spanish.tag = e1.tag);
     3 rows updated
 
     * select name as spanish, tag, deref(ename).name as english from spanish;
     -------------------------------------------
     |spanish      |tag          |english      |
     -------------------------------------------
     |dos          |2            |two          |
     |tres         |3            |three        |
     |cuatro       |4            |NULL         |
     -------------------------------------------
     3 rows selected

Finally, we can use union to select from both:

     * select name as english, tag, deref(sname).name as spanish from english
     union
     select   deref(ename).name as english, tag, name as spanish from spanish
     order by 2;
     -------------------------------------------
     |english      |tag          |spanish      |
     -------------------------------------------
     |one          |1            |NULL         |
     |two          |2            |dos          |
     |three        |3            |tres         |
     |NULL         |4            |cuatro       |
     -------------------------------------------
     4 rows selected

Realize that if new rows are inserted into either table, the reference must be set in the tables that references it.