sqloracleconcurrencysql-mergecontention

Unkillable Oracle session waiting on "SQL*Net message from client" event


On Oracle 11gR2, I've recently encountered a very interesting situation involving a blocked (but idle!) MERGE statement that hangs on a "SQL*Net message from client" event, causing subsequent, concurrently executed MERGE statements to block on the first statement via "cursor: pin S wait on X" events. In Oracle Enterprise Manager, the following can be observed:

enter image description here

This situation turns even more severe, as the above Session-ID 1204 cannot be killed with either:

alter system kill session 'sid,serial#';
alter system kill session 'sid,serial#' immediate;

Our DBA is sometimes able to kill the operating system process, but often, the whole database needs to be restarted. Luckily, thus far, only on a test system, never in production.

Note:

I'm aware this is probably a similar issue as reported in this rather vague question: Oracle updates/inserts stuck, DB CPU at 100%, concurrency high, SQL*Net wait message from client. I'll still report it again, as I have a clear reproduction path, which I'll report as an answer.


Solution

  • This seems to be a bug in Oracle when CLOB data types are used as values that are passed to the MERGE statement's ON clause. Assume this database:

    CREATE TABLE t (
      v INT, 
      s VARCHAR2(400 CHAR)
    );
    

    Reproduction using inlined values

    Now, run the following statement in any Oracle client, including SQL*Plus, SQL Developer or from JDBC, which helps reproducing the issue very easily (I'm using Oracle 11g XE 11.2.0.2.0):

    MERGE INTO t                      
    USING (
      SELECT 
        1 v, 
        CAST('abc' AS CLOB) s 
      FROM DUAL
    ) s 
    ON (t.s = s.s) -- Using a CLOB here causes the bug.
    WHEN MATCHED THEN UPDATE SET
      t.v = s.v        
    WHEN NOT MATCHED THEN INSERT (v, s) 
    VALUES (s.v, s.s);
    

    The example is silly, and the CLOB was bound here by "accident". Nonetheless, such a statement should not create a zombie session in Oracle, but it's there. I'm running the above statement three times in SQL*Plus and then running this...

    SELECT 
      s.sid,
      s.serial#,
      s.sql_id,
      s.event,
      s.blocking_session,
      q.sql_text
    FROM v$session s
    JOIN v$sql q
    ON s.sql_id = q.sql_id
    WHERE s.username = 'TEST'
    AND UPPER(TRIM(q.sql_text)) LIKE 'MERGE%';
    

    ... I get:

    sid serial# sql_id          event                       blocking_session
    9   3       82a2k4sqzy1jq   cursor: pin S wait on X     92
    49  89      82a2k4sqzy1jq   cursor: pin S wait on X     92
    92  13      82a2k4sqzy1jq   db file sequential read     
    

    Notice how the reported event is different ("db file sequential read") from the original event ("SQL*Net message from client"), which was using bind variables

    Reproduction using bind values

    var v_s varchar2(50)
    exec :v_s := 'abc'
    
    MERGE INTO t                      
    USING (
      SELECT 
        1 v, 
        CAST(:v_s AS CLOB) s 
      FROM DUAL
    ) s 
    ON (t.s = s.s) -- Using a CLOB here causes the bug.
    WHEN MATCHED THEN UPDATE SET
      t.v = s.v        
    WHEN NOT MATCHED THEN INSERT (v, s) 
    VALUES (s.v, s.s);
    

    The above statement run in SQL*Plus also produces the bug:

    sid serial# sql_id          event                           blocking_session
    8   1       4w9zuxrumumgj   SQL*Net message from client     
    90  7       4w9zuxrumumgj   cursor: pin S wait on X         8
    94  21      4w9zuxrumumgj   cursor: pin S wait on X         8
    

    No reproduction in PL/SQL

    Interestingly, the bug is avoided in the following PL/SQL statement:

    DECLARE
      v_s CLOB := 'abc';
    BEGIN
      MERGE INTO t                      
      USING (
        SELECT 
          1 v, 
          CAST(v_s AS CLOB) s 
        FROM DUAL
      ) s 
      ON (t.s = s.s) -- Using a CLOB here causes the bug.
      WHEN MATCHED THEN UPDATE SET
        t.v = s.v        
      WHEN NOT MATCHED THEN INSERT (v, s) 
      VALUES (s.v, s.s);
    END;
    /
    

    I'm getting:

              CAST(v_s AS CLOB) s
              *
    ERROR at line 8:
    ORA-06550: line 8, column 11:
    PL/SQL: ORA-00932: inconsistent datatypes: expected - got CLOB
    ORA-06550: line 4, column 7:
    PL/SQL: SQL Statement ignored
    

    It looks as though the PL/SQL engine saveguards clients from this SQL engine bug.