sqlsql-serverjoinredundancy

SQL Server Data redundancy


I'm trying to find out the pitstop strategy followed by teams in F1 and which strategy on average yields the better winning results. However every time I try to execute the query I just get redundant data.

Dataset Example:

DriverTable

DriverId DriverRef DriverNumber
1 Lewis 22
2 Alonso 14
3 Max 1

RaceResult

ResultID RaceID DriverID Number Position Laps
1 12 1 22 1 53
1 12 2 14 6 53
1 12 3 1 2 53
2 13 1 22 2 57
2 13 2 14 6 57
2 13 3 1 1 57

Races

RaceID year CircuitID Name
12 2009 1 Monza
13 2013 2 Bahrain Sakhir

PitStops

RaceID DriverID Stop Lap
12 1 1 17
12 1 2 34
12 2 1 14
12 2 2 42
12 3 1 20
12 3 2 37
13 1 1 14
13 1 2 32
13 2 1 12
13 2 2 34
13 3 1 20
13 3 2 42

My desired result table would look something similar to this.

StrategyChoices

DriverRef RaceID CircuitID Name Stop Lap
Lewis 12 1 Monza 1 17
Lewis 12 1 Monza 2 34
Max 13 2 Bahrain Sakhir 1 20
Max 13 2 Bahrain Sakhir 2 42

The goal here find out what pit strategy did the winning driver use on a certain track.

SELECT ra.year, ra.name, d.properdriverref, (SELECT DISTINCT re.number FROM results WHERE re.position = 1), p.stop, p.lap, re.position
FROM pit_stops p
JOIN results re ON re.raceId = p.raceId
JOIN races ra ON p.raceId = ra.raceId
JOIN DriversXL d ON p.driverId = d.driverId    
WHERE ra.year >= 2018 AND re.position = 1

Was the code that I used and the data set returned a

DriverRef RaceID CircuitID Name Stop Lap
Lewis 12 1 Monza 1 17
Lewis 12 1 Monza 2 34
Max 12 1 Monza 1 17
Max 12 1 Monza 2 34
Alonso 12 1 Monza 1 17
Alonso 12 1 Monza 2 34
Max 13 2 Bahrain Sakhir 1 20
Max 13 2 Bahrain Sakhir 2 42
Lewis 13 2 Bahrain Sakhir 1 20
Lewis 13 2 Bahrain Sakhir 2 42
Alonso 13 2 Bahrain Sakhir 1 20
Alonso 13 2 Bahrain Sakhir 2 42

Solution

  • You are missing an extra join column between pit_stop and results.

    It's unclear the point of that subquery, so I have removed it

    SELECT
      d.driverref,
      r.raceId,
      r.circuitId,
      ra.year,
      ra.name,
      p.stop,
      p.lap
    FROM pit_stops p
    JOIN results re ON re.raceId = p.raceId AND re.DriverID = p.DriverID
    JOIN races ra ON p.raceId = ra.raceId
    JOIN DriversXL d ON p.driverId = d.driverId    
    WHERE re.position = 1;
    

    db<>fiddle