SQL language includes a robust set of functions useful for analytical processing. If you need to generate rankings to identify the top 10 salespeople in your company or if you are generating a financial report for your customer and need to calculate three-month rolling averages, you can use SQL’s built-in analytic functions to perform these types of calculations.
select quarter(payment_date) as quarter,
monthname(payment_date) as month,
sum(amount) as monthly_sales,
***max(sum(amount)) over() as max_overall_sales,
max(sum(amount)) over(partition by quarter(payment_date)) as max_qrtr_sales***
from payment
where year(payment_date) = 2005
group by 1,2;
This is what we called grouping rows into windows, which effectively partition the data for use by the analytic function without changing the overall result set.
over () then the window should include the entire result set.
over(partition by quarter(payment_date)) specifies that window should include only rows within the same quarter.select quarter(payment_date) as quarter,
monthname(payment_date) as month,
sum(amount) as monthly_sales,
***rank() over(order by sum(amount) desc) as sales_rank***
from payment
where year(payment_date) = 2005
group by 1,2
order by 1, month(payment_date);
<aside> 💡 You will still need an order by clause at the end of your query if you want the result set to be sorted in a particular way.
</aside>
We can provide a different set of rankings per quarter:
...
sum(amount) as monthly_sales,
***rank() over(partition by quarter(payment_date) order by sum(amount) desc) as sales_rank***
from payment
...
row_number
Returns a unique number for each row, with rankings arbitrarily assigned in case of a tie
rank
Returns the same ranking in case of a tie, with gaps in the rankings
dense_rank
Returns the same ranking in case of a tie, with no gaps in the rankings
In the case of top 10 customers:
From 1 to 10, but can be a problem when same values are found in the 10th ranked customer area.
Identify customer ranked 10 or less. Maybe the best option.
Identify rank 10 or less, but can yields a lot more results because multiple people holding the same rank.
Basically multiple sets of ranking in a result set, divided by time or categories.