With regard to HL7 pipe-delimited data, how exactly do the encoding characters (|^~\&) work?
Is the following example of fields, field repetitions, components and their sub-components correct when parsing raw HL7 data?
PID|1||||||||||||1234567890^somedata&moredata^TESTEMAIL@GMAIL.COM~0987654321
Field (|):
PID13 = 1234567890^somedata&moredata^TESTEMAIL@GMAIL.COM~0987654321
Field repetition (~):
PID13~1 = 1234567890^somedata&moredata^TESTEMAIL@GMAIL.COM
PID13~2 = 0987654321
Component (^):
PID13.1 = 1234567890
PID13.2 = somedata&moredata
PID13.3 = TESTEMAIL@GMAIL.COM
Sub-component (&):
PID13.2.1 = somedata
PID13.2.2 = moredata
PID13.3.1 = TESTEMAIL@GMAIL.COM
PID13.3.2 =
Without knowing the structure on the left-hand side that you're assigning these values to, it's impossible to say whether you're doing it right.
There is, however, one right way to parse the segment/field in question.
Here's a link to the specs I reference here
From section 2.5.3 of the HL7v2.7 Standard:
Each field is assigned a data type that defines the value domain of the field – the possible values that it may take.
If you pull up section 3.4.2.13 (PID-13) you'll see a breakdown of each component and subcomponent. Technically, the meaning of subcomponents and components can vary by field, but mostly they just vary by data type.
In your example, you don't treat the repetitions as separate instances of the XTN data type. I would rewrite it using array syntax, like so:
Field repetition (~):
PID13[0] = 1234567890^somedata&moredata^TESTEMAIL@GMAIL.COM
PID13[1] = 0987654321
Component (^):
PID13[0].1 = 1234567890
PID13[0].2 = somedata&moredata
PID13[0].3 = TESTEMAIL@GMAIL.COM
Sub-component (&):
PID13[0].2.1 = somedata
PID13[0].2.2 = moredata
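As a minimal sketch of that breakdown, here is one way to split a segment in Python. It assumes the default delimiters (a real message declares its own in MSH-1 and MSH-2), it does not handle the MSH segment itself or escape sequences, and the name parse_segment is made up for illustration. The segment is built from a list so that the value is guaranteed to land in PID-13.

```python
# Default HL7 v2 delimiters (an assumption; real messages declare theirs in MSH-1/MSH-2).
FIELD_SEP = "|"
REPETITION_SEP = "~"
COMPONENT_SEP = "^"
SUBCOMPONENT_SEP = "&"


def parse_segment(segment):
    """Split a non-MSH segment into fields -> repetitions -> components -> subcomponents.

    Index 0 of the result holds the segment name, so result[13] is PID-13.
    Hypothetical helper: no escape-sequence handling, no MSH special case.
    """
    parsed = []
    for field in segment.split(FIELD_SEP):
        repetitions = []
        for repetition in field.split(REPETITION_SEP):
            components = [
                component.split(SUBCOMPONENT_SEP)
                for component in repetition.split(COMPONENT_SEP)
            ]
            repetitions.append(components)
        parsed.append(repetitions)
    return parsed


# Build the example so the telecom data sits in field 13: name, PID-1, then 11 empty fields.
pid13_raw = "1234567890^somedata&moredata^TESTEMAIL@GMAIL.COM~0987654321"
segment = FIELD_SEP.join(["PID", "1"] + [""] * 11 + [pid13_raw])

pid13 = parse_segment(segment)[13]
print(pid13[0][0][0])  # PID13[0].1   -> first repetition, first component
print(pid13[0][1][1])  # PID13[0].2.2 -> first repetition, second component, second subcomponent
print(pid13[1][0][0])  # PID13[1].1   -> second repetition, first component
```

Note that the Python indexes are 0-based at every level below the field, while HL7 component and subcomponent numbers are 1-based, so PID13[0].2.2 is pid13[0][1][1] here.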
The pseudo-code in section 2.6.1 of the same specification may be helpful as well:
foreach occurrence in ( occurrences_of( field ) ) {
    construct_occurrence( occurrence );
    if not last ( populated occurrence ) insert repetition_separator;
        /* e.g., ~ */
}
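For comparison, the pseudo-code above could be rendered in Python roughly as follows. This is only a sketch of my reading of it: build_field is a hypothetical name, each occurrence is assumed to be an already-constructed string, and an "unpopulated" occurrence is assumed to be an empty string.

```python
def build_field(occurrences, repetition_separator="~"):
    """Join the occurrences of a field with the repetition separator.

    Trailing unpopulated (empty) occurrences are dropped, so no separator
    is emitted after the last populated occurrence.
    """
    populated = list(occurrences)
    while populated and populated[-1] == "":
        populated.pop()
    return repetition_separator.join(populated)


print(build_field(["1234567890^somedata&moredata^TESTEMAIL@GMAIL.COM", "0987654321"]))
```

An empty occurrence in the middle is kept, because dropping it would shift the position of the occurrences that follow it.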
It's important to remember that those components and subcomponents have different meanings because PID-13 is an XTN type.
PID-13 is a problematic example because historically, the order of PID-13 mattered. The first repetition was "primary". Over time the field has also become the landing place for e-mail addresses, pager numbers, etc. So good luck trying to make sense out of real-world data.