I am a beginner in SQL and I want to query data from an Oracle database using SQL. I have a tablespace that records the positions of many cars. Each record has a mileage and a time. The tablespace has three columns: ‘SENDTIME’, ‘MILEAGE’, and ‘PLATENO’. They represent the time, the mileage (in the timestamp format), and the plate number of the cars respectively. The values in the ‘SENDTIME’ column are in the datetime format. I want to find out when a specific car did not move for more than 30 seconds and less than 300 seconds. So i wrote a sql query:
WITH gpsinfo_cte AS (
SELECT plateno, sendtime, longitude, latitude, mileage, createdate,
FIRST_VALUE(sendtime) OVER (PARTITION BY plateno, mileage ORDER BY sendtime) AS first_sendtime,
LAST_VALUE(sendtime) OVER (PARTITION BY plateno, mileage ORDER BY sendtime) AS last_sendtime
FROM GPSINFO
WHERE plateno = '京AEW302'
)
SELECT /*+ NO_MERGE(gpsinfo_cte) */ plateno, sendtime, longitude, latitude, mileage, createdate
FROM gpsinfo_cte
WHERE (last_sendtime - first_sendtime) * 24 * 60 *60 < 300
AND (last_sendtime - first_sendtime) * 24 * 60 *60 > 30;
However, it runs rather slowly in oracle database. Acoording to web search results, I tried to use the EXPLAIN PLAN statement to generate the execution plan for your query and store it in a table called PLAN_TABLE. Here's the output:
I still have no idea to improve the performance. Anyone can help? Many thanks!
This is probably all that matters:
FROM GPSINFO
WHERE plateno = '京AEW302'
Assuming you have lots (thousands, millions maybe) of cars, and are asking for only one license plate number, you want to use an index to access only table rows for that particular car. Your execution plan shows that this isn't happening:
TABLE ACCESS FULL GPSINFO
Simply create an index on the plateno column and that should resolve your issue:
CREATE INDEX gpsinfo_plateno_idx1 on gpsinfo(plateno) compress 1
As for the rest of your query, you'll have to work on the logic to get the right results, but it is unlikely that you have a noticeable performance problem with it as long as you are working with only one car. In terms of the logic, if I understand rightly you want to know when there is a 30-300 second gap between position recordings with no mileage between them. So you need to compare a row with a neighboring row. For that, use LAG
(or LEAD
):
SELECT x.*,
(sendtime - last_sentime) * 86400 seconds_elapsed
FROM (SELECT x.*,
LAG(sendtime) OVER (ORDER BY sendtime) last_sendtime,
LAG(mileage) OVER (ORDER BY sendtime) last_mileage
FROM gpsinfo x
WHERE plateno = '京AEW302') x
WHERE (sendtime - last_sentime) * 86400 BETWEEN 30 AND 300
AND mileage = last_mileage
Note: If you are asking for only one plateno
value, there is no need to include plateno in a PARTITION BY
clause, that's redundant as there is only one. You can certainly use PARTITION BY mileage
as an alternative to my mileage = last_mileage
logic shown here, but mileage is likely to have many different values and internally grouping by that many values means a lot of tiny (single row) groups and that's not very memory/temp efficient. Yet as I said, the indexing is really your only serious issue.