In previous post I’ve tried to present my solution for solving max(count()) problem. The solution was slightly suboptimal and I’ve needed to speed it up a bit, because I’m using it now in some statistical calculations and every millisecond is important.
This is original solution.
CREATE VIEW data_view_source AS
SELECT DISTINCT ON (dp_id) dp_id, ds_id FROM
( SELECT dp_id, ds_id, count FROM data_view_source_count ORDER BY dp_id, count DESC) as glob
WHERE
(dp_id, count) IN
(SELECT dp_id, max(count) as max FROM
( SELECT dp_id, ds_id, count FROM data_view_source_count ORDER BY dp_id, count DESC) as minmax
GROUP BY dp_id)
ORDER BY dp_id;
and after bit of thinking a realizing how dummy I’m
CREATE VIEW data_view_source AS
WITH tmp1 AS (
SELECT dp_id, ds_id, count FROM data_view_source_count
)
SELECT DISTINCT ON (dp_id) dp_id, ds_id FROM
( SELECT dp_id, ds_id, count FROM tmp1 ORDER BY dp_id, count DESC) as glob
WHERE
(dp_id, count) IN
(SELECT dp_id, max(count) as max FROM
( SELECT dp_id, ds_id, count FROM tmp1 ORDER BY dp_id, count DESC) as minmax
GROUP BY dp_id)
ORDER BY dp_id;
Using WITH clause removes duplicate selects and speeds up data_view_source view by cca. 15-30% (for me).