I have two tables - Batting with Year field for each record that spans Persons career and Person has DebutYear and Final Year. All 3 of these are smallint. I am using the following query to get ONLY the Years from Batting in a specified career range between DebutYear+value to DebutYear+2nd-value. In this case it should only SUM the AB for those years but it is summing ALL the years in the Batting table. I am a little inexperienced with JOINS. Any help would be appreciated!
SELECT CONCAT(FirstName, ' ', LastName) AS Name, B.PersonID, SUM(AB) AS AB
FROM Batting B
INNER JOIN Person P ON P.PersonID = B.PersonID
AND (B.Year >= (P.DebutYear+1) AND B.Year <= (P.DebutYear+2))
WHERE B.PersonID = P.PersonID
GROUP BY B.PersonID
HAVING (AB >= 10000)
ORDER BY AB Desc, LastName, FirstName
LIMIT 1000
Sample data:
PersonID | Name | DebutYear |
---|---|---|
rosepe01 | Pete Rose | 1963 |
Batting | PersonID | Year | AB |
---|---|---|---|
1 | rosepe01 | 1963 | 623 |
2 | rosepe01 | 1964 | 516 |
3 | rosepe01 | 1965 | 670 |
4 | rosepe01 | 1966 | 654 |
5 | rosepe01 | 1967 | 585 |
6 | rosepe01 | 1968 | 626 |
The filter should be applied in the WHERE
clause, not the JOIN
.
Note: Your GROUP BY
clause could just be the B.PersonID
if you wanted, as long as it is an indexed primary key. I kept FirstName
and LastName
since they are referenced in the SELECT
statement.
SELECT CONCAT(P.FirstName, ' ', P.LastName) AS Name, B.PersonID, SUM(B.AB) AS AB
FROM Batting B
INNER JOIN Person P ON P.PersonID = B.PersonID
WHERE B.Year BETWEEN (P.DebutYear + 1) AND (P.DebutYear + 2)
GROUP BY B.PersonID, P.FirstName, P.LastName -- Could just be B.PersonID
HAVING SUM(B.AB) >= 10000
ORDER BY AB DESC, P.LastName, P.FirstName
LIMIT 1000;
ATTENTION: According to your data, Pete Rose had 1,186 at bats (AB) in his second and third year. Your threshold is checking for an AB cound at or exceeding 10,000. I think your query is off by a factor of 10. Did you get that confused with the limit?
Here is a working example that I verified. I included setup, data, and the query from above. Each step can be executed as one contiguous script within a sandbox.
Update: I tried to match your schema according to your comment.
Setup or re-create the tables:
-- Reset
DROP TABLE IF EXISTS Batting;
DROP TABLE IF EXISTS Person;
-- Schemas
CREATE TABLE Person (
PersonID VARCHAR(12) PRIMARY KEY,
FirstName VARCHAR(100),
LastName VARCHAR(100),
DebutYear SMALLINT(4),
FinalYear SMALLINT(4)
);
CREATE TABLE Batting (
BattingID INT PRIMARY KEY,
PersonID VARCHAR(12),
Year SMALLINT(4),
AB SMALLINT(3), -- At Bats
FOREIGN KEY (PersonID) REFERENCES Person(PersonID)
);
Update: I added the Pete Rose stats.
Here is the sample data to query against:
-- Sample data
INSERT INTO Person (PersonID, FirstName, LastName, DebutYear, FinalYear)
VALUES
('doejo01' , 'John' , 'Doe' , 2000 , 2010),
('smithji01' , 'Jimmy' , 'Smith' , 1995 , 2005),
('brownja01' , 'James' , 'Brown' , 1972 , 1982),
('rosepe01' , 'Pete' , 'Rose' , 1963 , 1986);
INSERT INTO Batting (BattingID, PersonID, Year, AB)
VALUES
( 1, 'doejo01', 2001, 5000), -- John Doe (2001, Year 2, total so far: 5000)
( 2, 'doejo01', 2002, 6000), -- John Doe (2002, Year 3, total so far: 11000)
( 3, 'doejo01', 2003, 7000), -- John Doe (2003, Year 4, outside range, not included)
( 4, 'smithji01', 1996, 4000), -- Jimmy Smith (1996, Year 2, total so far: 4000)
( 5, 'smithji01', 1997, 5500), -- Jimmy Smith (1997, Year 3, total so far: 9500, below threshold)
( 6, 'brownja01', 1972, 10000), -- James Brown (1973, Year 1, outside range)
( 7, 'brownja01', 1973, 6000), -- James Brown (1973, Year 2, total so far: 6000)
( 8, 'brownja01', 1974, 7000), -- James Brown (1974, Year 3, total so far: 13000)
( 9, 'rosepe01', 1963, 623), -- Pete Rose (1963, Year 1, outside range)
(10, 'rosepe01', 1964, 516), -- Pete Rose (1963, Year 2, total so far: 516)
(11, 'rosepe01', 1965, 670), -- Pete Rose (1963, Year 3, total so far: 1186, below threshold)
(12, 'rosepe01', 1966, 654), -- Pete Rose (1963, Year 4, outside range, not included)
(13, 'rosepe01', 1967, 585), -- Pete Rose (1963, Year 5, outside range, not included)
(14, 'rosepe01', 1968, 626); -- Pete Rose (1963, Year 6, outside range, not included)
Here is the query from above:
-- Query for players with an AB >= 10000 between DebutYear + 1 and DebutYear + 2
SELECT CONCAT(P.FirstName, ' ', P.LastName) AS Name, B.PersonID, SUM(B.AB) AS AB
FROM Batting B
INNER JOIN Person P ON P.PersonID = B.PersonID
WHERE B.Year BETWEEN (P.DebutYear + 1) AND (P.DebutYear + 2)
GROUP BY B.PersonID, P.FirstName, P.LastName
HAVING SUM(B.AB) >= 10000
ORDER BY AB DESC, P.LastName, P.FirstName
LIMIT 1000;
The result should only include James and John, but not Jimmy.
Name | PersonID | AB |
---|---|---|
James Brown | 3 | 13000 |
John Doe | 1 | 11000 |