I am writing software for pharmacies that validates drugs in NMVS. The workflow is: I scan the drug code with a handheld scanner, click "Verify", and the program connects to NMVS. Most of the work is done, but to verify a drug correctly I need to extract the GTIN (PC), batch number (LoT), serial number (SN) and expiry date (EXP) from the scanned code.
Here are the scan results for the test drugs:
01059099913808231003ZP082117230831210XXFAE5AWA6RF8
0105909990054152101123926172207012162RB6FBN09
010590999109968821100322567773831721093010100013978
01059099907954202190EPCNT32ZH5581004032217250331
010590999032841321YCK3EB53CNZXD1725083110C48700
0105909990071029211165895472021010MU465417241031
I know that it's the GS1 DataMatrix format: the GTIN is prefixed with 01 (the following 14 digits are the GTIN), the LoT with 10 (the following 1-20 alphanumeric characters are the LoT), the SN with 21 (the following 1-20 alphanumeric characters are the SN), and the expiry date with 17 (the following 6 digits are the EXP).
For the given examples, I should have e.g.:
[
{
"gtin": "05909991380823",
"lot": "03ZP08",
"sn": "0XXFAE5AWA6RF8",
"exp": "230831"
},
{
"gtin": "05909990054152",
"lot": "1123926",
"sn": "62RB6FBN09",
"exp": "220701"
},
{
"gtin": "05909991099688",
"lot": "100013978",
"sn": "10032256777383",
"exp": "210930"
},
{
"gtin": "05909990795420",
"lot": "040322",
"sn": "90EPCNT32ZH558",
"exp": "250331"
},
{
"gtin": "05909990328413",
"lot": "C48700",
"sn": "YCK3EB53CNZXD",
"exp": "250831"
},
{
"gtin": "05909990071029",
"lot": "10MU4654",
"sn": "116589547202",
"exp": "241031"
}
]
The problem is that these sections can be in any order and of varying lengths. Only GTIN and EXP have a fixed length.
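To make the ambiguity concrete, here is a small sketch that lists every position at which the digits "10" occur in the third test scan once the GTIN has been cut off. The helper name findAll is hypothetical and only used for illustration.

```javascript
// findAll is a hypothetical helper: returns every index at which `needle` occurs.
function findAll(text, needle) {
  const positions = [];
  let from = text.indexOf(needle);
  while (from !== -1) {
    positions.push(from);
    from = text.indexOf(needle, from + 1);
  }
  return positions;
}

// Third test scan with the leading "01" and the 14-digit GTIN removed.
const remainder = '21100322567773831721093010100013978';

console.log(findAll(remainder, '10')); // [2, 19, 24, 26]
```

Going by the expected result for this scan, only the occurrence at index 24 is the real LoT prefix. The others sit inside the serial number, the expiry date and the batch number itself, so a plain prefix search cannot tell them apart.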
I created a regex to extract these sections: ^(?=.*01(\d{14}))(?=.*10([a-zA-Z0-9]{1,20}))(?=.*17(\d{6}))(?=.*21([a-zA-Z0-9]{1,20})).*$
but unfortunately it doesn't work properly. The client is written in JavaScript (not TypeScript; AngularJS, to be exact - yes, it's a legacy project, and I'm trying to persuade the company to update it), and the server is in Java.
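As an illustration of how the lookahead regex goes wrong, the sketch below runs it (written without any stray whitespace inside the lookahead groups) against the second test scan. Each lookahead starts with a greedy .*, so it settles on the last place in the string where the prefix digits are followed by enough matching characters, not on the real element.

```javascript
// Lookahead regex from the question, with no whitespace inside "(?=".
const lookaheadPattern =
  /^(?=.*01(\d{14}))(?=.*10([a-zA-Z0-9]{1,20}))(?=.*17(\d{6}))(?=.*21([a-zA-Z0-9]{1,20})).*$/;

// Second test scan from the question.
const scan = '0105909990054152101123926172207012162RB6FBN09';
const [, gtin, lot] = scan.match(lookaheadPattern);

console.log(gtin); // "12392617220701" instead of the expected "05909990054152"
console.log(lot);  // "1123926172207012162R", which runs past the real batch number
```

The GTIN capture comes from the "01" that happens to sit inside "10" + "1123926", and the LoT capture simply takes the next 20 alphanumeric characters because nothing marks where the batch number ends.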
I'm looking for any solution - whether it's a regex, library (Javascript or Java), external API - for this problem, personally I'm running out of ideas...
Also, I'll add that the handheld scanner I'm using is the Zebra DS2208.
I would appreciate any help on this topic.
EDIT:
I tried reading the barcode scanner output character by character, but I don't see a pattern.
I did it! I noticed that GTIN and EXP are always extracted correctly, so I tried something like this:
const extractDataMatrix = (code) => {
  const response = {gtin: '', lot: '', sn: '', exp: ''};
  let responseCode = code;
  // Fixed-length elements are extracted first and cut out of the string.
  const prefixes = [
    {prefix: '01', key: 'gtin', length: 14},
    {prefix: '17', key: 'exp', length: 6}
  ];
  prefixes.forEach(({prefix, key, length}) => {
    const position = responseCode.indexOf(prefix);
    if (position !== -1) {
      const start = position + prefix.length;
      const end = start + length;
      response[key] = responseCode.substring(start, end);
      responseCode = responseCode.slice(0, position) + responseCode.slice(end);
    }
  });
  // What is left is treated as the LoT and SN elements.
  const lotAndSn = extractLotAndSn(responseCode);
  response.lot = lotAndSn.lot;
  response.sn = lotAndSn.sn;
  return response;
};

// Splits the remaining string into LoT (prefix 10) and SN (prefix 21), in either order.
const extractLotAndSn = (responseCode) => {
  const pattern = /^(10.+?)(?=10|21)(21.+?)$|^(21.+?)(?=10|21)(10.+?)$/;
  const matches = responseCode.match(pattern);
  if (!matches) return {lot: '', sn: ''};
  const [lot1, sn1, sn2, lot2] = matches.slice(1);
  // Drop the two-digit prefix from each captured element.
  const lot = (lot1 || lot2 || '').substring(2);
  const sn = (sn1 || sn2 || '').substring(2);
  return checkLotAndSn(lot, sn, responseCode);
};

// Heuristic for a doubled prefix: strips a leading "10" from the LoT
// (or a leading "21" from the SN) and appends it to the other value.
const checkLotAndSn = (lot, sn, responseCode) => {
  if (responseCode.includes("1010") && !responseCode.includes("10100")) {
    const isLotStart = lot.startsWith("10");
    if (isLotStart) {
      lot = lot.slice(2);
      sn += "10";
    }
  } else if (responseCode.includes("2121") && responseCode.includes("21210")) {
    const isSnStart = sn.startsWith("21");
    if (isSnStart) {
      sn = sn.slice(2);
      lot += "21";
    }
  }
  return {lot, sn};
};
I think it can be optimized anyway, but for now I don't care ;).
What is going on?
In extractDataMatrix, I first handle the prefixes with a fixed length (GTIN and EXP) in a forEach loop.
I remove each of them, together with its value, from responseCode.
I pass the remaining responseCode to the extractLotAndSn() function.
The checkLotAndSn() function removes a prefix from the start of one sequence and adds it to the previous sequence.
If 10100 and 21210 are part of responseCode, everything is split correctly, so I excluded that case from the swap.
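For comparison, here is a minimal sketch of a parser that does not need these heuristics, under the assumption that the scanner can be made to transmit the GS separator (ASCII 29) that GS1 places after a variable-length element when another element follows. None of the test scans in this post show that character, so this is a sketch under that assumption and not a drop-in replacement. The names AI_TABLE and parseGs1 are hypothetical, and only the four prefixes discussed here are covered.

```javascript
const GS = String.fromCharCode(29); // ASCII group separator

// Hypothetical lookup table for the four prefixes discussed in this post.
const AI_TABLE = {
  '01': {key: 'gtin', fixed: 14},
  '17': {key: 'exp', fixed: 6},
  '10': {key: 'lot', max: 20},
  '21': {key: 'sn', max: 20},
};

function parseGs1(raw) {
  const result = {gtin: '', lot: '', sn: '', exp: ''};
  let i = 0;
  while (i < raw.length) {
    if (raw[i] === GS) { // skip the separator between elements
      i += 1;
      continue;
    }
    const ai = raw.slice(i, i + 2);
    const spec = AI_TABLE[ai];
    if (!spec) throw new Error('Unknown prefix "' + ai + '" at position ' + i);
    i += 2;
    let end;
    if (spec.fixed) {
      end = i + spec.fixed; // fixed length: no separator needed
    } else {
      const gsPos = raw.indexOf(GS, i); // variable length: read up to GS or end of input
      end = gsPos === -1 ? raw.length : gsPos;
      if (end - i > spec.max) throw new Error('Value for prefix ' + ai + ' is too long');
    }
    result[spec.key] = raw.slice(i, end);
    i = end;
  }
  return result;
}

// Second test scan, with a GS inserted after the batch number (an assumption).
const sample = '0105909990054152' + '10' + '1123926' + GS + '17220701' + '2162RB6FBN09';
console.log(parseGs1(sample));
// { gtin: '05909990054152', lot: '1123926', sn: '62RB6FBN09', exp: '220701' }
```

Because the parser reads the string left to right and always knows where the current element ends, digits such as "10" or "21" inside a value are never mistaken for a prefix.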