MySQL – How to get Top N rows for each group

We have the following table and data:

CREATE TEMPORARY TABLE girls(
name text,
haircolor text,
score INT
);
 
INSERT INTO girls VALUES ('Megan','brunette',9);
INSERT INTO girls VALUES ('Tiffany','brunette',5);
INSERT INTO girls VALUES ('Kimberly','brunette',7);
INSERT INTO girls VALUES ('Hester','blonde',10);
INSERT INTO girls VALUES ('Caroline','blonde',5);
 
SELECT * from girls;
+----------+-----------+-------+
| name     | haircolor | score |
+----------+-----------+-------+
| Megan    | brunette  |     9 | 
| Tiffany  | brunette  |     5 | 
| Kimberly | brunette  |     7 | 
| Hester   | blonde    |    10 | 
| Caroline | blonde    |     5 | 
+----------+-----------+-------+
5 rows in set (0.00 sec)

Session Variables

MySQL, at least on the versions I've checked, does not support ROW_NUMBER() function that can assign a sequence number within a group, the MySQL session variables can be used to build a workaround. Session variables do not need to be declared first and can be used to do calculations and perform actions based on them. They appear to require initialization. For instance:

@previous_haircolor := haircolor

This code is executed for each row and stores the value of haircolor column to @current_haircolor variable.
@girl_rank := IF(@previous_haircolor = haircolor, @girl_rank + 1, 1)

In this code, if @previous_haircolor equals the haircolor on the current row we increment rank, otherwise set it to 1. For the first row @previous_haircolor is NULL, so rank is also initialized to 1.

For correct ranking, we need to have ORDER BY haircolor,score DESC

So if we run:

SELECT @previous_haircolor := null; /* Initialize first, or girl_rank will always be 1 */
SELECT
name,haircolor,score,
@girl_rank := IF(@previous_haircolor = haircolor, @girl_rank + 1, 1) AS girl_rank,
@previous_haircolor := haircolor
FROM girls
ORDER BY haircolor,score DESC

We get the list of girls ranked by their score within the haircolor group:

+----------+-----------+-------+-----------+----------------------------------+
| name     | haircolor | score | girl_rank | @previous_haircolor := haircolor |
+----------+-----------+-------+-----------+----------------------------------+
| Hester   | blonde    |    10 |         1 | blonde                           | 
| Caroline | blonde    |     5 |         2 | blonde                           | 
| Megan    | brunette  |     9 |         1 | brunette                         | 
| Kimberly | brunette  |     7 |         2 | brunette                         | 
| Tiffany  | brunette  |     5 |         3 | brunette                         | 
+----------+-----------+-------+-----------+----------------------------------+
5 rows in set (0.00 sec)

When we have a rank assigned to each girl within her haircolor group, we can request the wanted range:

-- Get top 2 for each haircolor
SELECT name, haircolor, score
FROM (/*subquery above*/) ranked
WHERE girl_rank <= 2;

So the full query is:

SELECT @previous_haircolor := null; /* Initialize first, or girl_rank will always be 1 */
SELECT name, haircolor, score
FROM (SELECT
name,haircolor,score,
@girl_rank := IF(@previous_haircolor = haircolor, @girl_rank + 1, 1) AS girl_rank,
@previous_haircolor := haircolor
FROM girls
ORDER BY haircolor,score DESC) ranked
WHERE girl_rank <= 2;

Using oracle, SQL Server and PostgreSQL: ROW_NUMBER()

In Oracle, SQL Server and PostgreSQL (version 8.4 and higher) you can achieve the same functionality using ROW_NUMBER function:

SELECT name,haircolor,score
FROM
(SELECT name,haircolor,score
ROW_NUMBER() OVER (PARTITION BY haircolor ORDER BY score DESC) as girl_rank
FROM girls) ranked
WHERE girl_rank <= 2;
© GeekLabInfo MySQL - How to get Top N rows for each group is a post from GeekLab.info. You are free to copy materials from GeekLab.info, but you are required to link back to http://www.geeklab.info

1 Star2 Stars3 Stars4 Stars5 Stars (1 votes, average: 1.00 out of 5)
Loading...

One comment on “MySQL – How to get Top N rows for each group”

Leave a Reply