Question

Locked

TSQL - Select all columns for JUST DISTINCT records

By coolaid09 ·
Have a table that is something like this:

table: eventRecords
cols:
timestamp - datetime
eventid - varchar
user - varchar
filepath - varchar

I want to select one record for each unique "eventid", and I need for the data in all columns to be returned - not just eventid. The problem is:
"Select distinct eventid, timestamp, user, filepath from eventRecords" returns duplicate reports for eventid. Yes, I know this is the expected functionality for SQL. I've tried numerous variations to just return ONE record for each eventid, including:
Declare @mycount varchar
Set @mycount=(SELECT COUNT (DISTINCT [eventid])
FROM eventRecords)
And then plugging in @mycount as:
SELECT DISTINCT top (@mycount) eventid, timestamp, user, filepath from eventRecords

This brings back the right number of records, but there are duplicates of eventid still...

What step am I missing?

This conversation is currently closed to new comments.

9 total posts (Page 1 of 1)  
| Thread display: Collapse - | Expand +

All Answers

Collapse -

Your problem is which one !

by Tony Hopkinson In reply to TSQL - Select all columns ...

The most recent one, least, the first one it bumbs into?

e.g.

Select * From
(select EventID,Max([timestamp]) as MostRecentEvent from eventrecords) latestevents, eventRecords
Where latestevents.eventid = eventrecords.eventid and latestevents.MostRecentEvent = eventrecords.[timestamp]

Your table schema has no uniqueness property, if I was guessing eventid should be caseid or ticketid

HtHs

Collapse -

Tried the example...

by coolaid09 In reply to Your problem is which one ...

Tony, I tried to run your example, but an error displayed that said eventid is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. So I changed the query to:

Select * From (select eventid,Max([TimeStamp]) as MostRecentEvent
from eventrecords group by eventid)
latestevents, eventrecords
Where latestevents.eventid = eventrecords.eventid
and latestevents.MostRecentEvent = eventrecords.[timestamp]
group by eventrecords.eventid, latestevents

Now, the following displays:
Invalid column name 'latestevents'.

I'm a SQL noob - what is latestevents, and is the synatx right in your example?

I've simplified the description of the table - there is a column for "ID" (PK, char), which is unique for each record, but not for each eventid. I did not include it because when I do, even more duplicate eventids are returned.

Not sure what you mean when you say "if I was guessing eventid should be caseid or ticketid".

Collapse -

oops.

by Tony Hopkinson In reply to Tried the example...

Select * From (select eventid,Max([TimeStamp]) as MostRecentEvent
from eventrecords group by eventid)
latestevents, eventrecords
Where latestevents.eventid = eventrecords.eventid
and latestevents.MostRecentEvent = eventrecords.[timestamp]

Only need the first group by

latestevents is a an internal name for the sub query in the parentheses.

Collapse -

Still some dups, but I think I've found something that works

by coolaid09 In reply to oops.

I tried making this change - some duplicate eventids were returned. But I found a query on another forum that seems to work:

SELECT eventid, TimeStamp, user, filepath
FROM (SELECT eventid, TimeStamp, filepath, user,
ROW_NUMBER() OVER(PARTITION BY eventid
ORDER BY TimeStamp DESC)
AS seq
FROM eventrecords) AS T
WHERE seq = 1 order by eventid;

Collapse -

Hah, you know more about it than I do !

by Tony Hopkinson In reply to Still some dups, but I th ...
Collapse -

Apparently not...

by coolaid09 In reply to Hah, you know more about ...

While the dups are gone, when I compare the total count next to a COUNT DISTINCT, I'm missing 23 records...argh!!

Collapse -

maybe i've fixed it?

by coolaid09 In reply to Apparently not...

Tried including a range of timestamps in the query, but I put the range toward the end of the statement. The results are more accurate with the timestamp range in the subquery.

SELECT eventid, TimeStamp, user, filepath
FROM (SELECT eventid, TimeStamp, filepath, user,
ROW_NUMBER() OVER(PARTITION BY eventid
ORDER BY TimeStamp DESC)
AS seq
FROM eventrecords where ((TimeStamp >= '2010-4-1 00:00:00.000')
and (TimeStamp <= '2010-4-21 00:00:00.000'))) AS T
WHERE seq = 1 order by eventid;

Collapse -

Couldn't say mate

by Tony Hopkinson In reply to Apparently not...

Never used over or partition, and I outright refuse to use row_number. If I can't describe it as a set, I change my schema not my SQL.

Back to Software Forum
9 total posts (Page 1 of 1)  

Software Forums