Oracle Tutorials - Sub queries (ctd)

The first of our series of advanced Oracle tutorials introduced the topic of sub queries and covered non-correlated sub queries. In this tutorial (the second of our series of advanced Oracle tutorials) we'll look at correlated sub queries and sub query factoring.


Subquery factoring

Since Release 9i, Oracle has allowed SQL sub queries to be declared before the main body of a query and given a name. This is particularly useful for sub queries that will be referenced more than once in the main query, or for sub queries that are very long or particularly complex.

From the developer's point of view for the main body of the query,

it enables the sub query to be treated as if it were a physical table and in fact the sub query may be optimised as if it were a real (temporary) table. The other way that Oracle optimises it is by treating it as an in-line view.

Let's look at a few examples. For this first, very simple example, we'll use the following table definitions:

CREATE TABLE author
(author_id   NUMBER PRIMARY KEY
,author_name VARCHAR2(30)
);

and

CREATE TABLE Oracle_tutorial 
(tutorial_id   NUMBER PRIMARY KEY
,tutorial_name VARCHAR2(30)
,author_id     NUMBER REFERENCES author(author_id)
);


Oracle Tips & Tricks to SKYROCKET Your Career!

If you're not already a subscriber to our ezine you're missing out on a myriad of tips and tricks in SQL and PL/SQL to help you become a better, faster, smarter Oracle developer. Subscribe now to ignite your career.


WITH the_author AS 
(
    SELECT author_id FROM author 
    WHERE author_name = 'Andy McKay'

SELECT tutorial_name 
FROM oracle_tutorial JOIN the_author USING (author_id);


That is a very simple example which shows that we can treat the SQL sub query as if it were a table and join it to other tables. 

In Oracle SQL, we can also define multiple subqueries and a sub query can reference another sub query as in this second example which reports the names and the number of Oracle tutorials written by authors that have written less than the average number of Oracle tutorials.

WITH tutorial_count AS 
(
    SELECT count(tutorial_id) bkcount, author_name 
    FROM author JOIN oracle_tutorial
 USING (author_id)
    GROUP BY author_name

tutorial_avg AS 
(
SELECT SUM(bkcount)/COUNT(*) bkavg 
FROM tutorial_count
)
SELECT bkcount,author_name
FROM tutorial_count
WHERE bkcount < bkavg

Of course these queries could be written in other ways, but, as mentioned ealier, writing queries this way does give them some structure and in this second of advanced Oracle tutorials we're only looking at sub query factoring and correlated sub queries. One point to note is that expressions in subqueries (such as count(tutorial_id) in the sub query tutorial_count) must be given an alias if you want to refer to the expressions elsewhere in the query.

Recursive Subquery Factoring

New with the release of Oracle 11g R2 is the ability to have recursive sub query factoring (i.e. the sub query refers to itself) in SQL. This makes Oracle 11g R2 and above compliant with ANSI standard and provides a new method of performing hierarchical queries instead of using the traditional CONNECT BYclause.
A recursive sub query factoring clause in SQL has 2 parts, the anchor member, which must appear first and cannot be recursive, and the recursive member which must reference the sub query name once and once only. The 2 parts must be combined via the UNION ALL operator. 

There are a number of restrctions and options that apply to recursive sub query factors. For example, the 2nd part of the recursiive sub query factor, the member, cannot include DISTINCT, GROUP BY, model clauses or aggregate functions. You can however use analytic functions and you can specify whether the query should search depth first or breadth first. 

For example, to find out the names and employee ids of everyone who is managed by Zlotkey (employee 149) both directly and indirectly, we could run the following query.

WITH managed_by_zlotkey (emp, name, mgr, grade) AS
(

SELECT employee_id, last_name, manager_id, 0 grade
FROM    employees
WHERE  employee_id = 149

UNION ALL

SELECT employee_id, last_name, manager_id, grade+1
FROM   employees emp
     , managed_by_zlotkey z

WHERE  z.employee_id = emp.mgr_id

)
SELECT emp, name, mgr, grade 
FROM managed_by_zlotkey
ORDER BY grade, emp;


We don't have the space in this in this series of advanced Oracle tutorials to cover all of the possible permuations of recursive sub query factors - see the Oracle SQL Reference Manual available from the Oracle Technology Network for full details.

Correlated Subqueries

Having covered sub query factoring quite comprehensively, we'll continue our second tutorial in this series of advanced Oracle tutorials by looking at correlated sub queries. 

In a correlated sub query, the sub query refers to a column from a table in the parent query which means that they are executed once per row in the parent query. The difference between correlated subqueries and non-correlated subqueries is that non-correlated subqueries won't refer to a column in the parent table and are executed once for the whole statement. Both types of subqueries are used to answer multi-part questions such as "list all employess in the marketing department earning more than the avregare salary of employess in the marketing department". 

Although subqueries and joins are usually interchangeable in SQL, there may be significant differences in the performance of the query in Oracle depending on how it is written. In particular, a SQL statement using a correlated sub-query is usually much faster than one using the equivalent non-correlated sub query. 

Let's look at an example using the employees and departments in the hr schema provided with the sample Oracle database. In this case we want to know which, if any, departments have no employees assigned to them. 

It might sound as if this is just a single-part question but if we break it down we realise that we need to answer 2 questions because the only link between departments and employess is from the employees table which has a department_id as one of the columns.

In other words only by looking at the employee table can we find out which departments have employees. Therefore to determine which departments have no employees we need to first of all determine which departments have employees and exclude those departments from the set of all departments. 

Let's look at another example to see how we would translate this into SQL.

Example 3 - correlated sub query

SELECT departments.name
FROM departments d
WHERE NOT EXISTS 
   (SELECT department_id
   FROM employees e
   WHERE e.department_id = d.department_id)

When run, Oracle will query each record in the employees table to see if there are records in with this department_id i.e to see if any employees work for this department. If a match found that department is skipped and the next one is examined and so on until the whole table has been processed.

If the employees table has an index on the department_id, then this query will only need to examine the index and won't need to query the table itself making it very efficient. As a non-correlated sub query, Oracle would have to perform a full table scan on the employees table. This could be quite a slow process for a large organisation and it also means the data would not be cached for re-use by other queries.

As well as reading our other Oracle tutorials, one of the best ways to learn more about Oracle is to attend a formal training course. Our partner Smartsoft offers Oracle training in the UK on and off-site.


Looking for an Oracle training course? Learn Oracle from the experts and sky- rocket your career with Oracle training in the UK from Smartsoft. 


Return form Oracle tutorials - subsqueries (ctd) to advanced Oracle tutorials #1
Return to home page