sqlsql-servergreatest-n-per-group

How do I select only the latest entry in a table for each group so each 'latest' is a single row?


I have a 3 table SQLServer Database: two fact tables and a table to link them.

CREATE TABLE Project (
    ProjectID INT IDENTITY PRIMARY KEY,
    ProjectName VARCHAR(256)
);

CREATE TABLE Thing (
    ThingID INT IDENTITY PRIMARY KEY,
    ThingName VARCHAR(256)
);

CREATE TABLE ProjectThingLink (
    ProjectID INT NOT NULL,
    ThingID INT NOT NULL,
    CreatedDate DATE NOT NULL DEFAULT GETDATE(),
    UNIQUE (ThingID, CreatedDate),
    FOREIGN KEY (ProjectID) REFERENCES Project (ProjectID),
    FOREIGN KEY (ThingID) REFERENCES Thing (ThingID)
);

When a Thing is assigned to a Project, an entry is put in the ProjectThingLink table. Things can move between Projects. The CreatedDate is used to know which Project a Thing was last moved to. Note a Thing will never be moved between projects multiple times on the same date (hence the UNIQUE constraint).

I am trying to create a list of all Projects with which Things are currently linked to them, but my brain is failing.

Is there an easy way of doing this?

With this sample data:

INSERT INTO Project
  VALUES
('Project 1'),
('Project 2'),
('Project 3');

INSERT INTO Thing
  VALUES
('Some Thing'),
('This Thing'),
('That Thing'),
('Unassigned');

INSERT INTO ProjectThingLink
  VALUES
(1, 1, '2000-01-01'),
(2, 2, '2000-01-01'),
(2, 3, '2000-02-01'),
(1, 2, '2000-03-01');

The results should be:

Project Thing
Project 1 Some Thing
Project 1 This Thing
Project 2 That Thing
Project 3 NULL

Solution

  • select p.projectName, t.ThingName
    from projects p
    join projectThingLink l on l.projectId = p.projectId
    join thing t on t.thingId = l.thingId
    where l.createdDate =
    ( select max(l2.createdDate)
      from projectThingLink l2
      where l2.thingId = l.thingId
    );
    

    NOTE: Corrected after comment