phpmysqlsqlcivicrm

Data migration into CiviCRM - keep legacy IDs


I develop custom migration code using CiviCRM's PHP API calls like:

<?php
$result = civicrm_api3('Contact', 'create', array(
  'sequential' => 1,
  'contact_type' => "Household",
  'nick_name' => "boo",
  'first_name' => "moo",
));

There's a need to keep original IDs, but specifying 'id' or 'contact_id' above does not work. It either does not create the contact or updates an existing one. The ID is auto-incremented, for sure, but MySQL supports to insert arbitrary, unique values in that case.

How would you proceed? Hack CiviCRM to somehow pass the id to MySQL at the INSERT statement? Somehow dump the SQL after the import and manipulate the IDs in-place at the .sql textfile (hard to maintain integrity)? Any suggestions for that?

I have ~300.000 entries at least to deal with, so a fully automated and robust solution is a must. Any SQL magic potentially to do that?

For those who are not familiar with CiviCRM, the table structure is the following:

mysql> desc civicrm_contact;
+--------------------------------+------------------+------+-----+-------------------+-----------------------------+                                                                         
| Field                          | Type             | Null | Key | Default           | Extra                       |                                                                         
+--------------------------------+------------------+------+-----+-------------------+-----------------------------+                                                                         
| id                             | int(10) unsigned | NO   | PRI | NULL              | auto_increment              |                                                                         
| contact_type                   | varchar(64)      | YES  | MUL | NULL              |                             |                                                                         
| contact_sub_type               | varchar(255)     | YES  | MUL | NULL              |                             |                                                                         
| do_not_email                   | tinyint(4)       | YES  |     | 0                 |                             |                                                                         
| do_not_phone                   | tinyint(4)       | YES  |     | 0                 |                             |                                                                         
| do_not_mail                    | tinyint(4)       | YES  |     | 0                 |                             |                                                                         
| do_not_sms                     | tinyint(4)       | YES  |     | 0                 |                             |                                                                         
| do_not_trade                   | tinyint(4)       | YES  |     | 0                 |                             |                                                                         
| is_opt_out                     | tinyint(4)       | NO   |     | 0                 |                             |                                                                         
| legal_identifier               | varchar(32)      | YES  |     | NULL              |                             |                                                                         
| external_identifier            | varchar(64)      | YES  | UNI | NULL              |                             |

and we talk about the first field.


Solution

  • You should use the external_identifier field which is exactly done for what you want.

    This field is not used by CiviCRM itself so there is no risk to mess with core functionality. It's done to link with an external system (legacy for example).

    CiviCRM consider the external_identifier to be unique so it will throw an error (using API - I think) or update (using CiviCRM contact import screen) if you try to insert a contact with the same external_identifier.