logo for asktheoracle.net
Home
What's new?
Oracle Tutorials
PL/SQL
SQL
Oracle Tips & Tricks
Advanced Tutorials
Certification
Oracle 10g
Oracle 11g
Oracle and .Net
Oracle Tools
Disclaimer
Privacy Policy
Search This Site
Oracle Questions?
leftimage for asktheoracle.net

SQL Aggregate Functions

The sql aggregate functions are a very important part of the language as they enable us to group a set of rows together to return a single result.  There are over 40 sql aggregate functions defined for the Oracle database but there are
only five that are very frequently used:
  • SUM
  • COUNT
  • MIN
  • MAX
  • AVG
A complete list of the different  aggregate functions is available in the Oracle SQL Language Reference manual.

There are 3 places where aggregate functions can appear in
a query:-
  • in the select list (the items before the FROM clause)
  • in the ORDER BY clause
  • and in the HAVING clause
You might believe that these functions always have to appear in the select list but that is not the case, it is perfectly valid to just use the function in the ORDER BY or HAVING clauses.

For example you might want a list of departments ordered by the average salary. To do this you would select the department id (or join to the departments table to get the department name) and ORDER BY average salary like so:

SELECT department_id FROM employees
GROUP BY
department_id ORDER BY avg(salary);

SQL aggregate functions are commonly used with the GROUP BY clause in queries (or sub queries) to return constants or other columns that are not being aggregated. The GROUP BY clause is mandatory when selecting expressions that exclude an sql aggregate function when it is used elsewhere in the query as shown in the above example. It is used to tell Oracle how to group the results.

It might seem obvious that if you wish to see the average salary of employees you would group by the department id, but you could just as easily group by the manager id and/or any other column that is not being aggregated and it is not necessary to include in the select list any of the columns that you are grouping by.

We could change our previous example to this:

SELECT "X" FROM employees
GROUP BY
department_id ORDER BY avg(salary);

which would return as many "X"s as there were different department ids in the table.

Or we could change the query to do this:

SELECT department_id FROM employees
GROUP BY
department_id,manager_id ORDER BY avg(salary);

which would return one row for each combination of department ids and manager ids in the table. These particular queries may not make sense from a business point of view but they are syntactically valid and you occasionally get quirky business requirements which make these seemingly strange queries necessary.

General Rules for Sql Aggregate Functions

There are a few general rules and conditions for sql aggregate functions that we need to be aware of.

Firstly, for some of them (particularly the most used ones (SUM, COUNT, AVG, MAX and MIN), the aggregate can be calculated using either ALL values (the default) or just on DISTINCT (UNIQUE) values.

Oracle Tips & Tricks to SKYROCKET Your Career!

If you're not already a subscriber to our ezine you're missing out on a myriad of tips and tricks in SQL and PL/SQL to help you become a better, faster, smarter Oracle developer. Subscribe now and ignite your career.

For example if we wanted to calculate the average salary of employees, if we calculated this on DISTINCT values then any duplicates would be ignored, so if 2 employees had the same salary only one would be used to compute the average.

This query which looks at all salaries in the employees table:

SELECT ROUND(AVG(salary)) FROM employees
;

produces a result of 6350 when run on our test database. Whereas this next query which ignores duplicates produces a result of 6861 in our test database.

SELECT ROUND(AVG(DISTINCT salary)) FROM employees;

The DISTINCT keyword is most commonly used with the SQL COUNT function to count unique occurrences of a particular item.

What about NULL values?

In Oracle the NULL value means the value of something is undefined so it can't take place in calculations. Therefore 10+NULL has a value of NULL (i.e. undefined) because obviously if you add 10 to an undefined number (one whose value you don't know) the result is going to be undefined. This means that if you attempt to calculate the sum or average of a null value, the answer will be null.

For example the query:

SELECT SUM(salary) FROM employees;

will produce a result of NULL if any employees have a NULL salary. NULL does not mean zero and it can never be equal to any value, not even NULL (because it is undefined - it doesn't have a value). The way to overcome this is to either use the NVL function to convert NULLs to a specific value (usually zero for numbers) or to restrict the rows aggregated to those with a not NULL value.

Our previous example could therefore be amended to be either:

SELECT SUM(NVL(salary,0)) FROM employees;
or
SELECT SUM(salary) FROM employees WHERE salary IS NOT NULL;

This affects all the SQL aggregate functions except for COUNT, GROUPING and GROUPING_ID.

Nesting SQL Aggregate Functions

Aggregate functions can be nested in the same way as other functions such as  to_number or to_char, so for example we can find out which department has the highest average salary with a query such as this:

SELECT MAX(AVG(salary)) FROM employees GROUP BY department_id;

The average salary (assuming salary can't be NULL) for each department is calculated first and then the largest of these is returned as the final result.

Eliminating Groups With the Having Clause

Just as the WHERE clause is used to eliminate individual rows from the calculation, the HAVING clause is used to eliminate aggregates from the results.

The following query for example calculates the average of the maximum salaries for those departments that have a maximum salary less than 12000.

SELECT AVG(MAX(salary)) FROM employees
GROUP BY department_id HAVING MAX(salary)<12000;

In other words, the maximum salary for all departments is calculated and any with a maximum salary >= 12000 are then eliminated from the calculation of the average. Obviously you're not restricted to using the HAVING clause just when you're nesting functions. One important point to note is that the HAVING clause works at the group level - after the aggregate has been calculated - so in performance terms it is better to eliminate data from the calculation by use of the WHERE clause as this restricts the number of rows that have to be processed.

As well as reading our other Oracle tutorials, one of the best ways to learn more about Oracle is to attend a formal training course. Smartsoft Computing offer Oracle training in the UK both on and off-site as well as expert Oracle training in New Zealand .