Tags: quickfix, fix-protocol, quickfixj

How does the FIX protocol handle a message sequence number overflow?


We are currently integrating a FIX engine (QuickFixJ) into our application. We will be the initiator and will use trade capture reports to stay informed of all trades happening on the platform.

The trading (and thus the FIX session) will be running 24/7 and we are currently looking into ways to handle this properly. Our concern is that at some point we will need to reset the message sequence numbers to avoid an overflow. We would ideally not want to reset the sequence number as we need to be sure that we catch every single trade. We are worried about the following scenario:

  1. We send a SequenceReset message
  2. Our system crashes due to unrelated reasons
  3. The acceptor side sends us one or more TradeCaptureReport messages
  4. Only now does the acceptor side receive our SequenceReset message
  5. Our system has recovered and sends a ResendRequest message, with BeginSeqNo equal to 1 (because we have reset the message sequence number)
  6. We do not get the TradeCaptureReport messages from (3.)

However, we have noticed that when the message sequence number overflows, neither our engine nor the acceptor side seems to be troubled by it.

The example I have tested is simply sending heartbeats which will overflow the sequence number:

(SOH field delimiters are shown as |.)

8=FIXT.1.1|9=131|35=A|34=1|49=INITIATOR|50=INITIATOR|52=20220901-15:26:03.403|56=ACCEPTOR|98=0|108=10|141=Y|553=INITIATOR|554=password|1137=9|10=224|
8=FIXT.1.1|9=000102|35=A|49=ACCEPTOR|56=INITIATOR|34=1|57=INITIATOR|52=20220901-15:26:03.654|98=0|108=10|141=Y|1409=0|1137=9|10=212|
8=FIXT.1.1|9=90|35=4|34=2|49=INITIATOR|50=INITIATOR|52=20220901-15:26:03.718|56=ACCEPTOR|36=2147483646|123=Y|10=038|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=2|57=INITIATOR|52=20220901-15:26:13.792|10=009|
8=FIXT.1.1|9=79|35=0|34=2147483646|49=INITIATOR|50=INITIATOR|52=20220901-15:26:13.789|56=ACCEPTOR|10=044|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=3|57=INITIATOR|52=20220901-15:26:23.852|10=008|
8=FIXT.1.1|9=79|35=0|34=2147483647|49=INITIATOR|50=INITIATOR|52=20220901-15:26:23.850|56=ACCEPTOR|10=035|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=4|57=INITIATOR|52=20220901-15:26:33.896|10=018|
8=FIXT.1.1|9=80|35=0|34=-2147483648|49=INITIATOR|50=INITIATOR|52=20220901-15:26:33.892|56=ACCEPTOR|10=080|
8=FIXT.1.1|9=000070|35=0|49=ACCEPTOR|56=INITIATOR|34=5|57=INITIATOR|52=20220901-15:26:43.933|10=012|
8=FIXT.1.1|9=80|35=0|34=-2147483647|49=INITIATOR|50=INITIATOR|52=20220901-15:26:43.932|56=ACCEPTOR|10=075|

Is this a feature of the FIX protocol or is it undefined behaviour (and just works coincidentally)? And if this doesn't work (or is discouraged), is there a best way to handle ongoing FIX sessions? We have not found any usable information and most exchanges we have seen simply reset once a day.


Solution

  • I think the title of the question should rather be "how does a FIX engine handle message sequence number overflow".

    According to the FIX spec ("FIX datatypes"), the sequence number is always positive:

    Sequence of character digits without commas or decimals. Value must be positive.

    I can only speak for QuickFIX/J: internally the sequence number is a 32-bit signed Java integer, which means its maximum positive value is 2147483647 (Integer.MAX_VALUE).

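    The negative values in your log (34=-2147483648 and onwards) are simply that integer wrapping around after 2147483647. If QuickFIX/J (or any other engine) accepts or sends negative sequence numbers, that is clearly a bug and not a feature of the protocol.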
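To make the wraparound concrete, here is a minimal sketch in plain Java. `SeqNumGuard` and its methods are hypothetical names made up for this illustration; they are not part of QuickFIX/J, and this is not how the engine increments its counters internally. It only shows what a 32-bit increment does at the boundary and one possible way to detect it.

```java
// Hypothetical helper for illustration only; not a QuickFIX/J class.
final class SeqNumGuard {
    private SeqNumGuard() {}

    /** Plain int increment: silently wraps from 2147483647 to -2147483648. */
    static int uncheckedNext(int seqNum) {
        return seqNum + 1;
    }

    /** Checked increment: throws ArithmeticException instead of wrapping. */
    static int checkedNext(int seqNum) {
        return Math.addExact(seqNum, 1);
    }

    /** The FIX spec requires sequence numbers to be positive. */
    static boolean isValidSeqNum(long seqNum) {
        return seqNum > 0;
    }

    public static void main(String[] args) {
        int last = Integer.MAX_VALUE; // 2147483647, as in the 34=2147483647 heartbeat

        int wrapped = uncheckedNext(last);
        System.out.println(wrapped);                // -2147483648
        System.out.println(isValidSeqNum(wrapped)); // false

        try {
            checkedNext(last);
        } catch (ArithmeticException e) {
            // A session would have to agree on a reset before reaching this point.
            System.out.println("overflow detected");
        }
    }
}
```

The first printed value matches the 34=-2147483648 heartbeat in the question's log, which is why the session appears to "keep working" even though the values are no longer valid FIX sequence numbers.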

    You should probably ask your exchange how other clients handle this. I think at some point they have a time window in which sequence numbers can (and should) be reset. I guess the exchange handles it as outlined in "FIX session 24-hour connectivity".
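When discussing a reset window with the exchange, it may help to estimate how much headroom a 32-bit counter actually gives you. The sketch below is plain arithmetic; `SeqNumHeadroom` is a hypothetical name and the message rates are made-up examples, not figures from any exchange.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; the rates used in main are assumptions.
final class SeqNumHeadroom {
    private SeqNumHeadroom() {}

    /**
     * Rough time until a 32-bit sequence number reaches Integer.MAX_VALUE,
     * assuming a constant outgoing rate of messagesPerSecond.
     */
    static Duration timeToExhaustion(int currentSeqNum, long messagesPerSecond) {
        if (currentSeqNum < 1 || messagesPerSecond < 1) {
            throw new IllegalArgumentException("sequence number and rate must be positive");
        }
        long remaining = (long) Integer.MAX_VALUE - currentSeqNum;
        return Duration.ofSeconds(remaining / messagesPerSecond);
    }

    public static void main(String[] args) {
        // Starting from 1 at a sustained 1000 msg/s: about 24 days.
        System.out.println(timeToExhaustion(1, 1000).toDays()); // 24
        // Starting from 1 at a sustained 10 msg/s: about 2485 days (roughly 6.8 years).
        System.out.println(timeToExhaustion(1, 10).toDays());   // 2485
    }
}
```

At low message rates the counter lasts for years, so an agreed periodic reset window is usually enough to stay far away from the overflow.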