I'm using CakePHP 4 to build an application that shows an inventory of documents and the latest status for each document.
The tables are fairly simple:
documents
: This contains a list of documents, each of which has a unique id
, a human-friendly name, a filename etc.statuses
: This includes a list of about 10 different statuses that a document can go through within the application (e.g. "uploaded", "review requested", "reviewed", "rejected" etc). Each of these has a unique id
and name
(the name being the text of the status, such as "uploaded", "review requested" etc).documents_statuses
: This is a history table that contains all of the statuses that a document has gone through. Any given document (documents.id
) can appear multiple times in this table (using a foreign key of documents_statuses.document_id
referring to the relevant documents.id
). It also has a documents_statuses.status_id
corresponding to a statuses.id
in the statuses table mentioned above. Each row here has CakePHP's conventional created
timestamp so we know when rows were inserted. This is enough to tell us dates/times about when the document got to a particular status.What I'm trying to do is output a list of documents
and show the most recent status from documents_statuses
in my table. The HTML structure of the table is simple and contains 3 headings:
Writing a query to get the data for the first 2 columns is easy as the data for that belongs in the documents
table:
// src/Controller/DocumentsController.php
public function index()
{
$documents = $this->Documents->find();
$documents = $documents->paginate($document);
$this->set('documents', $documents);
}
In my template I can then reference $documents->name
and $documents->filename
to output the respective columns from the documents
table.
I understand that I need some extra logic in this query which will JOIN
to the documents_statuses
table and then order the records in descending order with a LIMIT
of 1 to get the most recent status per document. I know I also need to do a further JOIN
such that documents_statuses.status_id
returns the corresponding statuses.name
.
I know that I can adapt my query to contain documents_statuses
and statuses
:
$documents = $this->Documents->contain(['DocumentsStatuses', 'Statuses'])->find();
But I don't know how to loop through the records in documents_statuses
in this query and do the ->orderDesc->limit(1)
to get the most recent record. Furthermore I also know that to obtain the statuses.name
I would need to get this query to join the documents_statuses.status_id
and statuses.id
to return statuses.name
(e.g. "uploaded", "review requested" etc).
The application has been bake
'd and the models associations are defined correctly.
Might something equivalent be described in the CakePHP docs?
The following SQL is equivalent to what I'm trying to write using the ORM. The problem isn't particularly understanding the SQL involved, it's writing it using CakePHP's ORM syntax. Equally, if there is a "better" way to write this query I'm interested but the purpose of this question is how to make this work using CakePHP's ORM.
SELECT
documents.name,
documents_statuses.created,
statuses.name
FROM documents
LEFT JOIN
(SELECT documents_id, MAX(created) AS created
FROM documents_statuses
GROUP BY documents_id
) recent_statuses
ON documents.id = recent_statuses.documents_id
LEFT JOIN documents_statuses
ON documents.id = documents_statuses.documents_id AND recent_statuses.created = documents_statuses.created
LEFT JOIN statuses
ON documents_statuses.status_id = statuses.id
It can be done with a slightly complex JOIN
; see the [groupwise-max]
tag or my Groupwise-Max
Alternatively, you could use a different schema pattern:
Current
-- this always has the latest 'status' (plus other info). There would be one row per document.History
-- essentially as you have it now. But this is not looked at to find the "current status". This table has many rows per document.Your code would need to INSERT INTO History
and UPDATE Current
to update the status
. (Actually the Update could be a IODKU if you need to Insert when the row does not exist yet.)
The query in question would be simply SELECT ... FROM Current ...
-- no Join needed.