We are currently tracking purchase events from two distinct sources:
A. Our frontend application, which sends the events directly to Google Analytics 4 (GA4).
B. Our backend server, which uses the Measurement Protocol to send the events to GA4.
Both event types include the transaction_id of the order. Under normal operation, we've noticed that these purchase events - and, consequently, the revenue associated with them - are being counted twice in GA4, leading to inflated metrics.
Interestingly, this duplication issue does not occur when we're testing our implementation in debug mode - the events and associated revenue are counted correctly. We've conducted a thorough review of the transaction_id and session_id fields in our BigQuery database, which did not reveal any inconsistencies or issues that could be causing the duplicate counts.
We're seeking guidance on what might be causing this discrepancy between normal and debug mode, and how we can rectify it. Any advice or insights on how to further troubleshoot this issue would be greatly appreciated.
Note: In the MP code, we are including session_id, engagement_time_msec, and transaction_id.
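For reference, a minimal sketch of the kind of request body our backend builds for the Measurement Protocol. The helper name `build_purchase_payload` and all sample values (client ID, session ID, order ID, amount, currency) are made up for illustration; only the three parameters named in the note above come from our actual code.

```python
import json


def build_purchase_payload(client_id, session_id, transaction_id, value, currency="USD"):
    """Build a Measurement Protocol request body for a single purchase event.

    Hypothetical helper: the name and the sample values below are illustrative only.
    """
    return {
        # The Measurement Protocol requires a client_id on every request.
        "client_id": client_id,
        "events": [
            {
                "name": "purchase",
                "params": {
                    "session_id": session_id,
                    "engagement_time_msec": 1,
                    "transaction_id": transaction_id,
                    "value": value,
                    "currency": currency,
                },
            }
        ],
    }


payload = build_purchase_payload("123456.7654321", "1700000000", "T-1001", 49.90)
print(json.dumps(payload, indent=2))
```

The body would be POSTed as JSON to the Measurement Protocol collect endpoint together with the measurement ID and API secret; that part is omitted here.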
The issue occurs because GA4 uses the CID (also known as user_pseudo_id) and the transaction_id together when checking for duplicate purchases. As a result, two purchase events that share a transaction_id but carry different CIDs are both counted. App users have different CIDs than web users, since the CID changes between browsers and devices.
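The behaviour described above can be illustrated with a toy model. This is only a sketch of deduplication keyed on the (CID, transaction_id) pair, not GA4's actual implementation; the function name, field names, and sample events are all assumptions made for the example.

```python
def dedupe_by_client_and_transaction(events):
    """Keep the first event for each (client_id, transaction_id) pair.

    Toy model of pair-keyed deduplication; not GA4's real logic.
    """
    seen = set()
    kept = []
    for event in events:
        key = (event["client_id"], event["transaction_id"])
        if key in seen:
            continue  # Same CID and same transaction: treated as a duplicate.
        seen.add(key)
        kept.append(event)
    return kept


# Hypothetical events for a single order, T-1001.
events = [
    {"source": "frontend", "client_id": "web.111", "transaction_id": "T-1001", "value": 49.90},
    {"source": "backend", "client_id": "server.222", "transaction_id": "T-1001", "value": 49.90},
    # A true repeat from the backend: same CID and same transaction_id.
    {"source": "backend", "client_id": "server.222", "transaction_id": "T-1001", "value": 49.90},
]

kept = dedupe_by_client_and_transaction(events)
print(len(kept))  # 2: the frontend and backend events both survive
```

Only the third event is dropped. The first two refer to the same order but differ in CID, so a pair-keyed check counts both.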
One possible solution is to collect all data in the backend and use only one source to send the data. Another fix would be to count purchases with distinct transaction_id for reporting in BigQuery.
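The second option can be sketched as follows, assuming the purchase rows have already been pulled out of the BigQuery export into plain dictionaries. The function name, the flattened field names, and the sample rows are hypothetical; in practice the same idea would more likely be written as a query that groups by transaction_id.

```python
def revenue_by_distinct_transaction(rows):
    """Count each transaction_id once and sum its revenue once.

    Assumes every row for a given transaction_id carries the same value,
    so keeping the first one seen is enough.
    """
    first_value = {}
    for row in rows:
        # setdefault keeps the first value seen and ignores later duplicates.
        first_value.setdefault(row["transaction_id"], row["value"])
    return len(first_value), sum(first_value.values())


# Hypothetical export rows: order T-1001 was reported by both sources.
rows = [
    {"user_pseudo_id": "web.111", "transaction_id": "T-1001", "value": 50.0},
    {"user_pseudo_id": "server.222", "transaction_id": "T-1001", "value": 50.0},
    {"user_pseudo_id": "web.333", "transaction_id": "T-1002", "value": 20.0},
]

purchase_count, revenue = revenue_by_distinct_transaction(rows)
print(purchase_count, revenue)  # 2 70.0
```

A naive sum over these rows would report 3 purchases and 120.0 in revenue. Keying on transaction_id alone removes the double count regardless of which CID each event carried.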