601. Human Traffic of Stadium

Description of Problem

Table: Stadium

+---------------+---------+
| Column Name   | Type    |
+---------------+---------+
| id            | int     |
| visit_date    | date    |
| people        | int     |
+---------------+---------+
visit_date is the column with unique values for this table.
Each row of this table contains the visit id, the visit date, and the number of people at the stadium during that visit.
As the id increases, the date increases as well.

Write a solution to display the records that belong to a run of three or more rows with consecutive ids, where every row in the run has at least 100 people.

Return the result table ordered by visit_date in ascending order.

The result format is in the following example.

Example 1:

Input: 
Stadium table:
+------+------------+-----------+
| id   | visit_date | people    |
+------+------------+-----------+
| 1    | 2017-01-01 | 10        |
| 2    | 2017-01-02 | 109       |
| 3    | 2017-01-03 | 150       |
| 4    | 2017-01-04 | 99        |
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-09 | 188       |
+------+------------+-----------+
Output: 
+------+------------+-----------+
| id   | visit_date | people    |
+------+------------+-----------+
| 5    | 2017-01-05 | 145       |
| 6    | 2017-01-06 | 1455      |
| 7    | 2017-01-07 | 199       |
| 8    | 2017-01-09 | 188       |
+------+------------+-----------+
Explanation: 
The four rows with ids 5, 6, 7, and 8 have consecutive ids, and each visit was attended by at least 100 people. Note that row 8 is included even though its visit_date is not the day after row 7's.
The rows with ids 2 and 3 are not included because we need at least three consecutive ids.

Solution

Tags: SQL GROUP BY

Explanation

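Consider the following example, where the second row marks whether people is greater than or equal to 100: \[ \begin{matrix} \text{id}: & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ \text{people} \ge 100: & \text{No} & \text{No} & \text{Yes} & \text{Yes} & \text{Yes} & \text{Yes} & \text{No} & \text{Yes} & \text{Yes} & \text{No} \\ \end{matrix} \]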

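After filtering out the rows with fewer than 100 people and assigning ROW_NUMBER() in id order, define groupNumber = id - rowNumber. Within a run of consecutive ids, both id and rowNumber grow by 1 per row, so the difference stays constant; a gap in the ids increases it. We get: \[ \begin{aligned} \text{id}: & [\text{3 4 5 6 8 9}] \\ \text{rowNumber}: & [\text{1 2 3 4 5 6}] \\ \text{groupNumber}: & [\text{2 2 2 2 3 3}] \\ \end{aligned} \]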

Finally, keep only the groups that contain at least three rows (GROUP BY groupNumber with HAVING COUNT(*) >= 3): \[ \begin{aligned} \text{id}: & [\text{3 4 5 6}] \\ \text{groupNumber}: & [\text{2 2 2 2}] \\ \end{aligned} \]
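The three steps above can be traced outside the database. The following is a minimal Python sketch of the same idea, not part of the original solution; the variable names (`kept`, `group_numbers`, `result`) are chosen here purely for illustration.

```python
from collections import Counter

# Example from the explanation: ids 1..10 and whether people >= 100.
ids = list(range(1, 11))
at_least_100 = [False, False, True, True, True, True, False, True, True, False]

# Step 1: keep only the qualifying ids (the WHERE clause).
kept = [i for i, ok in zip(ids, at_least_100) if ok]

# Step 2: number the survivors 1, 2, 3, ... (ROW_NUMBER() OVER (ORDER BY id))
# and subtract. A run of consecutive ids yields a constant difference.
group_numbers = [i - row for row, i in enumerate(kept, start=1)]

# Step 3: keep the groups with at least three members
# (GROUP BY group_number HAVING COUNT(*) >= 3).
sizes = Counter(group_numbers)
result = [i for i, g in zip(kept, group_numbers) if sizes[g] >= 3]

print(kept)           # [3, 4, 5, 6, 8, 9]
print(group_numbers)  # [2, 2, 2, 2, 3, 3]
print(result)         # [3, 4, 5, 6]
```

The printed lists match the id, groupNumber, and final id rows shown in the explanation.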

Code (MySQL)

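-- Requires MySQL 8.0 or later (CTEs and window functions).
WITH filtered_result AS (
    -- Rows in the same run of consecutive ids share the same group_number.
    SELECT *, id - ROW_NUMBER() OVER (ORDER BY id) AS group_number
    FROM Stadium
    WHERE people >= 100
)
SELECT t1.id, t1.visit_date, t1.people
FROM filtered_result t1
WHERE t1.group_number IN (
    -- Keep only the runs that contain at least three rows.
    SELECT t2.group_number
    FROM filtered_result t2
    GROUP BY t2.group_number
    HAVING COUNT(*) >= 3
)
ORDER BY t1.visit_date;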
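To check the query against Example 1 without a MySQL server, it can be run through Python's built-in `sqlite3` module. This is only a sketch under two assumptions: SQLite stands in for MySQL here (the query uses no MySQL-specific syntax), and the bundled SQLite is version 3.25 or later, which is required for window functions. The helper name `run_example` and the column types in the `CREATE TABLE` statement are illustrative and not taken from the original.

```python
import sqlite3

# The query from the solution, unchanged apart from being held in a string.
QUERY = """
WITH filtered_result AS (
    SELECT *, id - ROW_NUMBER() OVER (ORDER BY id) AS group_number
    FROM Stadium
    WHERE people >= 100
)
SELECT t1.id, t1.visit_date, t1.people
FROM filtered_result t1
WHERE t1.group_number IN (
    SELECT t2.group_number
    FROM filtered_result t2
    GROUP BY t2.group_number
    HAVING COUNT(*) >= 3
)
ORDER BY t1.visit_date
"""

# Input rows from Example 1.
ROWS = [
    (1, "2017-01-01", 10),
    (2, "2017-01-02", 109),
    (3, "2017-01-03", 150),
    (4, "2017-01-04", 99),
    (5, "2017-01-05", 145),
    (6, "2017-01-06", 1455),
    (7, "2017-01-07", 199),
    (8, "2017-01-09", 188),
]


def run_example():
    """Load Example 1 into an in-memory SQLite database and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: dates are stored as ISO-8601 text, which sorts correctly.
        conn.execute(
            "CREATE TABLE Stadium (id INTEGER, visit_date TEXT, people INTEGER)"
        )
        conn.executemany("INSERT INTO Stadium VALUES (?, ?, ?)", ROWS)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in run_example():
        print(row)
```

On the example data, the filtered ids 2, 3, 5, 6, 7, 8 receive group numbers 1, 1, 2, 2, 2, 2, so only the rows with ids 5 through 8 are returned, as in the expected output.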

Reference

rusgurik's solution