PostgreSQL – Subqueries

what’s a subquery ?

It’s actually a query that exists within a query.So a query within a query a select statement within a select statement.

Example : so when I say that select all of the employees where department is not in this list of departments that is generated from this query that’s going to give me all of the departments that exist in the Employees table that that do not exist in the departments table.


SELECT * FROM employees
WHERE department NOT IN (SELECT department FROM departments)

We’re using a subquery in the WHERE clause we can use the subquery in the FROM clause as well.

This is not a table right this is a query but it serves as a source, a source from where the data can be pulled from.


SELECT *
FROM (SELECT * FROM employees WHERE salary > 150000) a

SELECT a.employee_name, a.yearly_salary
FROM (SELECT first_name employee_name, salary yearly_salary FROM employees WHERE salary > 150000) a

we can have multiple sources.

SELECT a.employee_name, a.yearly_salary
FROM (SELECT first_name employee_name, salary yearly_salary 
	FROM employees WHERE salary > 150000) a,
	(SELECT department FROM departments) b

Subquery in the SELECT clause, but it is limit one record, otherwise it throw error.

SELECT first_name, last_name, salary, (SELECT first_name FROM employees limit 1)
FROM employees

Excersise 1 : Write a select statement that returns all of those employees that work in the electronics division.

SELECT * FROM employees
WHERE department
	IN (SELECT depatment FROM departments WHERE division = 'Electronics')

Excersise 2 : The assignment is give me those employees that work in Asia or Canada and that make over 130 thousand dollars.

SELECT * FROM employees 
	WHERE region_id 
	IN (SELECT region_id FROM regions WHERE country IN ('Asia', 'Canada'))
	AND salary > 130000

Excersise 3 : I want to see the first name as well as a department employee works for and how much less that particular employee makes than the highest paid employee in the company. And these employees should be working in Asia and Canada.

SELECT first_name, department, ((SELECT max(salary) FROM employees ) - salary) as Salary_Diff 
	FROM employees
	WHERE region_id 
	IN (SELECT region_id FROM regions WHERE country IN ('Asia', 'Canada'))

when comparing with a single value and this subquery is going to be returning multiple values. So there is a clause that was introduced way early on in sequel and this will work in all databases and that is the all or any clause ( ALL / ANY ).

And this is basically this becomes part of the condition you first give the operator (>, < , =, >=, <= ) and then you use the any or all . Basically they could be used in the WHERE clause as well as the having clause , it could also accept this any. So what is this. Well the any operator returns true if any of the sub query values meet the condition. so if the region ID of the employee is greater than any one of these regions it is returned by the subquery then this would evaluate to true or it this entire expression would evaluate to true and only those records will be returned. That's what any is the any operator returns true if any of the sub query values meet the condition.

SELECT * FROM employees
	WHERE region_id > ANY (SELECT region_id FROM regions WHERE country = 'United States')

You can use greater than or equal to less than or equal to all of those operators are available for use with any and all. So this any and all can be used with a WHERE clause as well as the HAVING clause to keep that in mind.

Exercise 1 : Write a query that returns all of those employees that work in the kids division AND the dates at which those employees where hired is greater than all of the hire_dates of employees who work in the maintenance department.

SELECT *
FROM employees
WHERE department = ANY (SELECT department 
			FROM departments
			WHERE division = 'Kids')
AND hire_date > ALL (SELECT hire_date
			FROM employees
			WHERE department = 'Maintenance')

The employees that we want to see in the output their higher rates should be greater than all of
these higher dates . And the higher dates of all of these employees is greater than the higher rates of the employees that work in the maintenance department. And We can not use ANY in this place.

Excersise 2 : This assignment I want you to give me those salaries that appear the most frequently

SELECT salary FROM  (
SELECT salary, count (*)
FROM employees
GROUP BY Salary
ORDER BY count(*) desc, salary desc
LIMIT 1 
) a 

Excersise 3 : The assignment I’d like you to write a query that returns only unique data. So I’d like to see Frank.I’d like to see Robert only once and I’d like to see Peter and Sam only once. But here’s a twist. I’d like to also see their IDs. So I want to see the Id call them as well as the name call them but I only want to see the unique names.

CREATE TABLE dupes (id integer, name varchar(10));

INSERT INTO dupes VALUES (1, 'FRANK');
INSERT INTO dupes VALUES (2, 'FRANK');
INSERT INTO dupes VALUES (3, 'ROBERT');
INSERT INTO dupes VALUES (4, 'ROBERT');
INSERT INTO dupes VALUES (5, 'SAM');
INSERT INTO dupes VALUES (6, 'FRANK');
INSERT INTO dupes VALUES (7, 'PETER');

TRUNCATE TABLE dupes;
DROP TABLE dupes;

I just want to see just one record of frank with one of those IDs one or two one record with Robert with one of those IDs three and four. And of course Sam and Peter. There’s only one occurrence.

SELECT min(id), name
FROM dupes
GROUP BY name

Note : IN clause is expecting a series of data from one column not multiple columns.see below

SELECT * FROM dupes
WHERE id IN (
SELECT min(id), name
FROM dupes
GROUP BY name
)

ERROR:  subquery has too many columns
LINE 2: WHERE id IN (
 

^
Actual Query :

SELECT * FROM dupes
WHERE id IN (
SELECT min(id)
FROM dupes
GROUP BY name
);

Above and the below will return same output :

SELECT min(id), name
FROM dupes
GROUP BY name

Again the subquery was not necessary. But having a subquery allows for more flexibility and as you learn SQL you’re going to come across points where you need to delete data.if I needed to delete data from dupes for that case we would we would in fact need a sub query.

This basically deletes the records that are filtered through the where clause.So we’re deleting from dupes where the ID is not in one of these IDs.

DELETE FROM dupes
WHERE id NOT IN (
SELECT min(id)
FROM dupes
GROUP BY name
);

SELECT * FROM dupes , you will notice those record permanently deleted.The duplicate records have been permanently deleted from this table.

Note : Sequel is case insensitive

Excersise 4 : The assignment is I’d like you to compute the average of all the employees that we have in our company but only to exclude the minimum and maximum salaries of the employee or it so that the employee that makes the least amount of money in the company and the employee that makes the most amount of money in the company. I’d like you to exclude those results from competing the average. So the deliverable for this assignment is I expect to one number and that’s going to be the average of all of our employees in the company excluding them the highest paid and the lowest paid.

SELECT ROUND (AVG(salary))
FROM employees
WHERE salary NOT IN (
(SELECT MIN(salary) FROM employees),
(SELECT MAX(salary) FROM employees)
)

Leave a Reply

Your email address will not be published. Required fields are marked *