I'm migrating some data from one oracle schema/table to a new schema/table on the same database.
The migration script does the following:
create table newtable as select
...
cast(ACTIVITYDATE as date) as ACTIVITY_DATE,
...
FROM oldtable where ACTIVITYDATE > sysdate - 1000;
If I look at the original data, it looks fine - here's one record:
select
activitydate,
to_char(activitydate, 'MON DD,YYYY'),
to_char(activitydate, 'DD-MON-YYYY HH24:MI:SS'),
dump(activitydate),
length(activitydate)
from orginaltable where oldpk = 1067514
Result:
18-NOV-10 NOV 18,2010 18-NOV-2010 12:59:15 Typ=12 Len=7: 120,110,11,18,13,60,16
The migrated data, showing that the data is corrupt:
select
activity_date,
to_char(activity_date, 'MON DD,YYYY'),
to_char(activity_date, 'DD-MON-YYYY HH24:MI:SS'),
dump(activity_date),
length(activity_date)
from newtable
where id = 1067514
Result:
18-NOV-10 000 00,0000 00-000-0000 00:00:00 Typ=12 Len=7: 120,110,11,18,13,0,16
Around 5000 out of 350k records show this problem.
Can anyone explain how this happened?
UPDATE:
I don't find any published reference to this specific type of DATE corruption on the Oracle support site. (It may be there, my quick searches just didn't turn it up.)
The output from the DUMP() function is showing the date value is indeed invalid:
Typ=12 Len=7: 120,110,11,18,13,0,16
We expect that the minutes byte should be a value between one and sixty, not zero.
The 7 bytes of a DATE value represent, in order, century(+100), year(+100), month, day, hour(+1), minutes(+1), seconds(+1).
The only time I have seen invalid DATE values like this when a DATE value was being supplied as a bind variable, from a Pro*C program (where the bind value is supplied in the internal 7 byte representation, entirely bypassing the normal validation routines that catch invalid dates e.g. Feb 30)
There is no reason to expect the behavior you're seeing, given the Oracle syntax you posted.
This is either a spurious anomaly (memory corruption?) or if this is repeatable, then it's a flaw (bug) in the Oracle code. If it's a flaw in the Oracle code, the most likely suspects would be "newish" features in an un-patched release.
(I know CAST is a standard SQL function that's been around for ages in other databases. I guess I'm old school, and have never introduced it into my Oracle-syntax repertoire. I don't know what version of Oracle it was that introduced the CAST, but I would have stayed away from it in the first release it appeared in.)
The big 'red flag' (that another commenter noted) is that CAST( datecol AS DATE)
.
You would expect the optimizer to treat that as equivalent to date_col ... but past experience shows us that TO_NUMBER( number_col )
is actually interpreted by the optimizer as TO_NUMBER( TO_CHAR ( number_col ) )
.
I suspect something similar might be going on with that unneeded CAST.
Based on that one record you showed, I suspect the issue is with values with a "59" value for minutes or seconds, and possibly a "23" value for hours, would be the ones that show the error.
I would try checking for places where the minutes, hour or seconds are stored as 0:
SELECT id, DUMP(activitydate)
FROM newtable
WHERE DUMP(activitydate) LIKE '%,0,%'
OR DUMP(activitydate) LIKE '%,0'