phpmysqlcodeigniteraggregate-functionscartesian-product

SQL query summing grouped records from two JOINed tables is over-calculating amounts


I have 3 related tables in my database and I'm trying to return filtered total of specified columns for the current date.

I need to list the sections [sec_name], Defects QTY [sum(pdr_qty)], Production QTY [sum(pg_qty)].

section Defects QTY Production QTY
Line 2 100 300

Table1: sections

sec_id sec_name
1 Line 1
2 Line 2

Table2: production_defects_report

pdr_id pdr_section_id pdr_date pdr_qty
1 2 2023-12-10 16:25:46 50
2 2 2023-12-10 16:25:47 50

Table3: production_gate

pg_id pg_sec_id pg_date pg_qty
1 2 2023-12-10 16:25:46 100
2 2 2023-12-10 16:25:47 200

My model method:


public function fetch_data($limit, $start)
{
    $today = date('Y-m-d');
    $this->db->select('sections.sec_name, SUM(production_defects_report.pdr_qty) as defect_qty, SUM(production_gate.pg_qty) as production_qty');
    $this->db->from('production_defects_report');
    $this->db->join('sections', 'sections.sec_id = production_defects_report.pdr_section_id');
    $this->db->join('production_gate', 'production_gate.pg_sec_id = production_defects_report.pdr_section_id ', 'left');
    $this->db->where('date(production_defects_report.pdr_date)',$today);
    $this->db->where('date(production_gate.pg_date)',$today);
    $this->db->group_by('sections.sec_id');
    $query = $this->db->get();

    if ($query->num_rows() > 0) {
        foreach ($query->result_array() as $row) {
            $data[] = $row;
        }
        return $data;
    }
    return false;
}

I have created a model method using CodeIgniter3, but I found the grouped totals are being miscalculated -- they are over-calculating. The problem appears to the be the result of summing a cartesian product. In other words, the JOIN clauses are causing qualifying rows to be duplicated within groups and the SUM() values are unusable.


Solution

  • I think that when you JOIN with a table you multiply your rows, so joining with 2 tables would make too many rows to sum properly.

    You can use a less efficient way, but one that works, by calculating each column using a sub-query. Don't forget to WHERE by the date.

    SELECT 
        sec_name AS 'Section',
        (SELECT COALESCE(SUM(pdr_qty), 0) FROM production_defects_report AS pdr WHERE pdr.pdr_section_id = s.sec_id) AS 'Defects QTY',
        (SELECT COALESCE(SUM(pg_qty), 0) FROM production_gate AS pg WHERE pg.pg_sec_id = s.sec_id) AS 'Production QTY'
    FROM 
        sections s;