PostgreSQL – Correlated Subqueries

What is Correlated Subqueries ?

A correlated subquery is a query nested inside of another that uses values from the outer query.It uses values from the outer query.A correlated subquery basically means that the sub query portion is correlated with the outer query.

Example : We want to get all of those employees that make more than the average of their departments. In that case we’d need to form a relationship between these two query and that will cause it to be correlated.


SELECT first_name, salary
FROM employees e1
WHERE salary > (SELECT round(AVG(salary)) FROM employees e2 WHERE e1.department = e2.department)

e1.department with e2.department and now this is considered a correlated subquery and this is often also referred to as a repeating sub query. And the reason for that is this particular query this <i>subquery needs to run for every single record of the outer query</i>.

all the records for every single record this query “SELECT round(AVG(salary)) FROM employees e2 WHERE e1.department = e2.department” would need to be executed.

Every single record of the outer queries WHERE clause is going to be executing this query over and over again every single record of the outer query.This condition needs to be checked against the record and this query would need to be executed.

I understand that the inner query is going to run for every single record of the outer query write the sub query is running for every single record of the outer query because we are forming a link between the departments that are from the outer query with departments that are part of the inner sub query and that is what is known as correlation.

We are correlating these two data sets on a common column which is the department column. we get all the employees that make more than their average departmental salary.it’s using attributes of the outer query within the inner sub query

Example 2 : we wanted to see the average salary paid out by each department as part of the select list.


SELECT first_name,department, salary,
(SELECT round(AVG(salary)) FROM employees WHERE e1.department = e2.department) as avg_department_salary
FROM employees e1

The key thing to always remember and I to repeat this again and again here is that the correlated subquery is run for every single record of the outer query and it uses. It uses information from the outer query.

Not Correlated :

This subquery is not using any values from the outer query or it’s not correlated, ex:


SELECT first_name, salary
FROM employees
WHERE salary > (SELECT round(AVG(salary)) FROM employees)

the sub query runs once, we get the average of all the employees and thats the value that gets compared with salary.

Excercise 1 : Write a query to obtain the names of departments that have more than thirty eight employees working.


SELECT department
FROM departments d
WHERE 38 < (SELECT count(*) FROM employees e WHERE e.department = d.department)

Above query can be written as :


SELECT distinct department 
FROM employees e1
WHERE 38 < (SELECT COUNT(*) 
		FROM employees e2
		WHERE e1.department = e2.department)

Above can be wirtten as using GROUP BY command :


SELECT department 
FROM employees e1
WHERE 38 < (SELECT COUNT(*) 
		FROM employees e2
		WHERE e1.department = e2.department)
GROUP BY department

In the outer query using department is the better then employees, because it has minimum records for repeating.

Excercise 1 : I’d like to see the department and then some number in the second column and it is another column with the highest paid employees salary for each one of these departments.


SELECT department, (SELECT MAX(salary) FROM employees WHERE department = d.department)
FROM departments d
WHERE 38 < (SELECT count(*) FROM employees e WHERE e.department = d.department)

Following both query return same results :


SELECT department, (SELECT MAX(salary) FROM employees WHERE department = d.department )
FROM departments d

SELECT department, max(salary)
FROM employees
GROUP BY department

Excercise 2 : We want to display the table as show below –

department first_name salary salary_in_department
Automotive Mill 162522 HIGEST SALARY
Automotive Laurie 29752 LOWEST SALARY
Beauty Orland 162845 HIGEST SALARY

the query for this output is :-


SELECT department, first_name, salary,
CASE WHEN salary = max_by_department THEN 'HIGEST SALARY'
     WHEN salary = min_by_department THEN 'LOWEST SALARY'
END as salary_in_department
FROM (
SELECT department, first_name, salary,
	(SELECT max(salary) FROM employees e2
	 WHERE e1.department = e2.department) as max_by_department,
	(SELECT min(salary) FROM employees e2
	 WHERE e1.department = e2.department) as min_by_department
FROM employees e1

) a
WHERE salary = max_by_department OR
	salary = min_by_department
order by department

Leave a Reply

Your email address will not be published. Required fields are marked *