The code to compute the median is not very practical, as it employs a cross join. On one of my modest tables, the 60,000 rows will become 3.6 billion in the intermediate result set. I let it run for over 30 minutes before I terminated the query. An alternate query could be based on something like the query shown below. On my 60,000 row table, it runs in just a few seconds:
DROP TABLE events
GO
CREATE TABLE events
(
id int IDENTITY(1,1) PRIMARY KEY
, num smallint
)
GO
SET NOCOUNT ON
INSERT events (num) VALUES (1)
INSERT events (num) VALUES (2)
INSERT events (num) VALUES (3)
INSERT events (num) VALUES (4)
INSERT events (num) VALUES (5)
INSERT events (num) VALUES (6)
INSERT events (num) VALUES (7)
INSERT events (num) VALUES (8)
INSERT events (num) VALUES (9)
INSERT events (num) VALUES (10)
INSERT events (num) VALUES (11)
INSERT events (num) VALUES (12)
INSERT events (num) VALUES (13)
INSERT events (num) VALUES (14)
INSERT events (num) VALUES (15)
INSERT events (num) VALUES (16)
SET NOCOUNT OFF
SELECT
((
SELECT Max(tbl.numVal) AS mode
FROM (SELECT TOP 50 PERCENT e.num AS numVal
FROM events e
ORDER BY e.num ASC
) AS tbl
)
+
(
SELECT Min(tbl.numVal) AS mode
FROM (SELECT TOP 50 PERCENT e.num AS numVal
FROM events e
ORDER BY e.num DESC
) AS tbl
)) / 2 AS mode
Discussion on:
View:
Show:
How about a cursor ?
Create a temporary table with a primary key.
insert into it with a select distinct of the data
Get the row count halve it, fetch next [absolute] to position the mid point and off you go
You don't even need the temp table if @@rowcount returns the cardinality of the cursor for your DBMS
No min, max or explicit ordering
This sort of thing is as likely to be performed on floating point data as well.
There I'd use one identity field, and another for the value insert insert select distinct, get the max identity/ rowcount and then jump in for nth or ((nth+1) + (nth) ) / 2
Median (middle) on character data can be useful in some situations as well. Such as splitting a client list into equal batches for manual processing A - M and N - Z if evenly distributed.
Create a temporary table with a primary key.
insert into it with a select distinct of the data
Get the row count halve it, fetch next [absolute] to position the mid point and off you go
You don't even need the temp table if @@rowcount returns the cardinality of the cursor for your DBMS
No min, max or explicit ordering
This sort of thing is as likely to be performed on floating point data as well.
There I'd use one identity field, and another for the value insert insert select distinct, get the max identity/ rowcount and then jump in for nth or ((nth+1) + (nth) ) / 2
Median (middle) on character data can be useful in some situations as well. Such as splitting a client list into equal batches for manual processing A - M and N - Z if evenly distributed.
I use SQL Server 2000. I tried the suggested cursor method, and it is comparable in speed, and possibly slightly faster. Here is one version that I tested:
--Cursor Version
DROP TABLE events
GO
CREATE TABLE events
(
id int IDENTITY(1,1) PRIMARY KEY
, num smallint
)
GO
SET NOCOUNT ON
INSERT events (num) VALUES (1)
INSERT events (num) VALUES (2)
INSERT events (num) VALUES (3)
INSERT events (num) VALUES (4)
INSERT events (num) VALUES (5)
INSERT events (num) VALUES (6)
INSERT events (num) VALUES (7)
INSERT events (num) VALUES (8)
INSERT events (num) VALUES (9)
INSERT events (num) VALUES (10)
INSERT events (num) VALUES (11)
INSERT events (num) VALUES (12)
INSERT events (num) VALUES (13)
INSERT events (num) VALUES (14)
INSERT events (num) VALUES (15)
INSERT events (num) VALUES (16)
--INSERT events (num) VALUES (17)
SET NOCOUNT OFF
DECLARE @mode int
, @rowCount int
, @midpoint int
, @num2 int
SELECT @rowcount = Count(*) FROM events
DECLARE modeCsr INSENSITIVE SCROLL CURSOR
FOR SELECT num FROM events ORDER BY num
FOR READ ONLY
OPEN modeCsr
IF @rowCount % 2 = 0
BEGIN
SET @midpoint = @rowcount / 2
FETCH ABSOLUTE @midpoint FROM modeCsr INTO @mode
END
ELSE
BEGIN
SET @midpoint = (@rowcount + 1) / 2
FETCH ABSOLUTE @midpoint FROM modeCsr INTO @mode
FETCH NEXT FROM modeCsr INTO @num2
SET @mode = ( @mode + @num2 ) / 2
END
CLOSE modeCsr
DEALLOCATE modeCsr
SELECT @mode AS mode
--Cursor Version
DROP TABLE events
GO
CREATE TABLE events
(
id int IDENTITY(1,1) PRIMARY KEY
, num smallint
)
GO
SET NOCOUNT ON
INSERT events (num) VALUES (1)
INSERT events (num) VALUES (2)
INSERT events (num) VALUES (3)
INSERT events (num) VALUES (4)
INSERT events (num) VALUES (5)
INSERT events (num) VALUES (6)
INSERT events (num) VALUES (7)
INSERT events (num) VALUES (8)
INSERT events (num) VALUES (9)
INSERT events (num) VALUES (10)
INSERT events (num) VALUES (11)
INSERT events (num) VALUES (12)
INSERT events (num) VALUES (13)
INSERT events (num) VALUES (14)
INSERT events (num) VALUES (15)
INSERT events (num) VALUES (16)
--INSERT events (num) VALUES (17)
SET NOCOUNT OFF
DECLARE @mode int
, @rowCount int
, @midpoint int
, @num2 int
SELECT @rowcount = Count(*) FROM events
DECLARE modeCsr INSENSITIVE SCROLL CURSOR
FOR SELECT num FROM events ORDER BY num
FOR READ ONLY
OPEN modeCsr
IF @rowCount % 2 = 0
BEGIN
SET @midpoint = @rowcount / 2
FETCH ABSOLUTE @midpoint FROM modeCsr INTO @mode
END
ELSE
BEGIN
SET @midpoint = (@rowcount + 1) / 2
FETCH ABSOLUTE @midpoint FROM modeCsr INTO @mode
FETCH NEXT FROM modeCsr INTO @num2
SET @mode = ( @mode + @num2 ) / 2
END
CLOSE modeCsr
DEALLOCATE modeCsr
SELECT @mode AS mode
Only one pass and no aggregates. I've got used to using them whenever I need to treat a list of records as a sequence instead of a set.
- Keyboard Shortcuts:
- Prev
- Next
- Toggle









































