CakePHP 4 - Query to find the latest status from another table

385 views Asked by At

I'm using CakePHP 4 to build an application that shows an inventory of documents and the latest status for each document.

The tables are fairly simple:

  • documents: This contains a list of documents, each of which has a unique id, a human-friendly name, a filename etc.
  • statuses: This includes a list of about 10 different statuses that a document can go through within the application (e.g. "uploaded", "review requested", "reviewed", "rejected" etc). Each of these has a unique id and name (the name being the text of the status, such as "uploaded", "review requested" etc).
  • documents_statuses: This is a history table that contains all of the statuses that a document has gone through. Any given document (documents.id) can appear multiple times in this table (using a foreign key of documents_statuses.document_id referring to the relevant documents.id). It also has a documents_statuses.status_id corresponding to a statuses.id in the statuses table mentioned above. Each row here has CakePHP's conventional created timestamp so we know when rows were inserted. This is enough to tell us dates/times about when the document got to a particular status.

What I'm trying to do is output a list of documents and show the most recent status from documents_statuses in my table. The HTML structure of the table is simple and contains 3 headings:

  • Document Name
  • Filename
  • Last status

Writing a query to get the data for the first 2 columns is easy as the data for that belongs in the documents table:

// src/Controller/DocumentsController.php
public function index()
{
    $documents = $this->Documents->find();
    $documents = $documents->paginate($document);
    $this->set('documents', $documents);
}

In my template I can then reference $documents->name and $documents->filename to output the respective columns from the documents table.

I understand that I need some extra logic in this query which will JOIN to the documents_statuses table and then order the records in descending order with a LIMIT of 1 to get the most recent status per document. I know I also need to do a further JOIN such that documents_statuses.status_id returns the corresponding statuses.name.

I know that I can adapt my query to contain documents_statuses and statuses:

$documents = $this->Documents->contain(['DocumentsStatuses', 'Statuses'])->find();

But I don't know how to loop through the records in documents_statuses in this query and do the ->orderDesc->limit(1) to get the most recent record. Furthermore I also know that to obtain the statuses.name I would need to get this query to join the documents_statuses.status_id and statuses.id to return statuses.name (e.g. "uploaded", "review requested" etc).

The application has been bake'd and the models associations are defined correctly.

Might something equivalent be described in the CakePHP docs?

Edit - Raw SQL

The following SQL is equivalent to what I'm trying to write using the ORM. The problem isn't particularly understanding the SQL involved, it's writing it using CakePHP's ORM syntax. Equally, if there is a "better" way to write this query I'm interested but the purpose of this question is how to make this work using CakePHP's ORM.

SELECT 
    documents.name, 
    documents_statuses.created, 
    statuses.name 
FROM documents
LEFT JOIN
    (SELECT documents_id, MAX(created) AS created
        FROM documents_statuses
        GROUP BY documents_id
    ) recent_statuses
ON documents.id = recent_statuses.documents_id
LEFT JOIN documents_statuses
ON documents.id = documents_statuses.documents_id AND recent_statuses.created = documents_statuses.created
LEFT JOIN statuses
ON documents_statuses.status_id = statuses.id
2

There are 2 answers

0
hitesh shah On
$documents = $this->Documents->contain(['DocumentsStatuses'=>function(Query $q){ return $q->contain(['Statuses'])->orderDesc->limit(1);}])->find();

book.cakephp.org

3
Rick James On

It can be done with a slightly complex JOIN; see the [groupwise-max] tag or my Groupwise-Max

Alternatively, you could use a different schema pattern:

  • Current -- this always has the latest 'status' (plus other info). There would be one row per document.
  • History -- essentially as you have it now. But this is not looked at to find the "current status". This table has many rows per document.

Your code would need to INSERT INTO History and UPDATE Current to update the status. (Actually the Update could be a IODKU if you need to Insert when the row does not exist yet.)

The query in question would be simply SELECT ... FROM Current ... -- no Join needed.