PostgreSQL – GROUP BY, HAVING

I wanted to get the the total salary paid by each department I could do that by doing a sum.

SELECT SUM(salary) FROM employees

if I wanted to group by the different departments and say OK this department pays out a total of 10 million dollars to their employees and this other department is paying you know $15 million
to their employees and so on if you could break it down and group it by department How would you do that?.

SELECT department, SUM(salary)
FROM employees
GROUP BY department

We use the group by command and that goes after the select statement of course after the FROM clause and if there is a where clause it goes after the WHERE clause.If we had a where clause here after the WHERE clause that’s where you would have group by.

Now the where clause is there to filter specific employees.

I know region ID exists in employees.I can say were region ID in (4,5,6,7) and highlight this to execute and notice.

SELECT department,region_id, SUM(salary)
FROM employees
WHERE region_id in (54,5,6,7)
GROUP BY department, region_id
ORDER BY region_id

Excercise :
I just wanted to group by department and get the total number of employees that work for each department.

SELECT department,count(*) as EmployeeCount
FROM employees
GROUP BY department

Now we’re using one grouping function here but let’s say if I wanted to get the average salary per department or the minimum salary for a department maximum salary for department and the average salary of a department and the total number of employees that work in the department I can get all of that information by adding those functions here.

SELECT department,count(*) TotalEmployees, round(AVG(salary)), MIN(salary), MAX(salary)
FROM employees
GROUP BY department

if we wanted to filter.

we wanted to filter for only those employees that have a salary greater than $70000.

SELECT department,count(*) TotalEmployees, round(AVG(salary)), MIN(salary), MAX(salary)
FROM employees
WHERE salary > 70000
GROUP BY department

So this grouping is happening based on the filtered records.

Any column that’s specified in the select list if it’s a non aggregate column meaning you know there’s no grouping function being run on that call and there’s no aggregation happening and it’s basically a feature column such as department or gender or region.

That’s feature information that has nothing to do with aggregation.
If those columns are being referenced in the select statement you better you better include those columns in the group by clause as well.

SELECT department, gender, count(*) 
FROM employees
GROUP BY department, gender
ORDER BY department

Any non aggregate columns that are mentioned the select list must also be mentioned in the group by clause.

HAVING
I want to see only those departments that have more than 35 employees working.How would you do that ?
InValid Query :

SELECT department,  count(*) 
FROM employees
WHERE count(*) > 35
GROUP BY department 
ORDER BY department

It’s not allowed in sequel. The reason is the where clause is used to filter records.It’s not used to filter aggregated data to filter aggregated data.

There is another command known as the having command.That comes after the group by clause and before the order by.

SELECT department,  count(*) 
FROM employees
GROUP BY department 
HAVING count(*) > 35
ORDER BY department

This is the employees table the source of where the data is coming from. If you wanted to filter specific employees you could put the condition in the WHERE clause.But if you want to filter aggregated data you do that using the having clause.

Excersise 1 : The first assignment is I want to know how many people have the same first name in the company and what are those names in their accounts.

SELECT first_name, count(first_name) Total
FROM employees 
GROUP BY first_name
HAVING count(first_name) > 1

Excersise 2 : I want to see the unique departments in the Employees table but you cannot use the distinct keyword number

SELECT department
FROM employees
GROUP BY department

Excersise 3 : this assignment I’d like you to present to me a query that can give the different domain names in our company and the total number of employees that have that given domain name.

SELECT count(*) TotalCount, SUBSTRING(email, POSITION('@' in email) + 1) AS email_domain
FROM employees
WHERE email IS NOT NULL
GROUP BY SUBSTRING(email, POSITION('@' in email) + 1)
ORDER BY count(*) desc

Excersise 4 : Showing the minimum maximum and average salaries broken down by region and gender.

SELECT gender,region_id, MIN(salary), MAX(salary), round(AVG(salary))
FROM employees
GROUP BY gender,region_id
ORDER BY gender asc,region_id asc

Debt – Reserved surplus >25 X
NO Business to Business (Supliers to TATA) X
Business to Consumer OK
Profit Ratio : (Price / EPS) < 15 Ok Book Value : Total given Loan / No of Shares , Less than BookValue ok Dont Go NBFC : Dont go Rate Sensitives (Auto, realestate)

Post Views: 2

Leave a Reply Cancel reply