java xml xml-builder

Convert flat file records to XML in Java


I have a scenario where employee records are written in a flat file, something like:

flatFile.txt
============
1|name1|dept1|10000
2|name2|dept2|12000
3|name3|dept3|9500
....
....

Now I want to read this flat file and convert the employee records above into a new XML file each time, so that at the end I have an XML file with the following data:

<EMPLOYEES>
    <EMPLOYEE>
        <ID>1</ID>
        <NAME>name1</NAME>
        <DEPARTMENT>dept1</DEPARTMENT>
        <SALARY>10000</SALARY>
    </EMPLOYEE>
    <EMPLOYEE>
        <ID>2</ID>
        <NAME>name2</NAME>
        <DEPARTMENT>dept2</DEPARTMENT>
        <SALARY>12000</SALARY>
    </EMPLOYEE>
    ...
    ...
</EMPLOYEES>

To implement this I need to validate the data, for example:

  1. id and salary should be numeric
  2. name length should be less than 20
  3. each line of the flat file should contain the 4 fields above
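The three rules above could be checked per line roughly as follows. This is a minimal sketch, not a definitive implementation: the class name `RecordValidator`, the method names, and the choice to report failures as a list of element names are all assumptions made for illustration, and "numeric" is assumed to mean ASCII digits only.

```java
import java.util.ArrayList;
import java.util.List;

class RecordValidator {

    // Limits taken from the rules listed in the question.
    static final int EXPECTED_FIELDS = 4;
    static final int MAX_NAME_LENGTH = 20; // the name length must be less than this

    /**
     * Returns the names of the elements that failed validation.
     * A wrong field count is reported as "EMPLOYEE" (hypothetical convention).
     */
    static List<String> validate(String line) {
        List<String> errors = new ArrayList<>();
        // The limit -1 keeps empty trailing fields, so "1|a|b|" still yields 4 fields.
        String[] fields = line.split("\\|", -1);
        if (fields.length != EXPECTED_FIELDS) {
            errors.add("EMPLOYEE");
            return errors; // field positions are unreliable, so stop here
        }
        if (!isNumeric(fields[0])) {
            errors.add("ID");
        }
        if (fields[1].length() >= MAX_NAME_LENGTH) {
            errors.add("NAME");
        }
        if (!isNumeric(fields[3])) {
            errors.add("SALARY");
        }
        return errors;
    }

    /** Assumption: "numeric" means a non-empty run of ASCII digits. */
    static boolean isNumeric(String s) {
        return !s.isEmpty() && s.chars().allMatch(c -> c >= '0' && c <= '9');
    }

    public static void main(String[] args) {
        System.out.println(validate("1|name1|dept1|10000"));
        System.out.println(validate("2|a-name-that-is-far-too-long|dept2|12x"));
        System.out.println(validate("3|name3"));
    }
}
```

An empty result means the line is valid; the caller can combine a non-empty result with the current line number to produce the error markup.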

If any validation fails, the failure needs to be reflected in the XML file along with the line number of the error, something like:

<NAME type="Error" Line="2"></NAME> (name length is greater than 20 in the 2nd record of the flat file)

or

<EMPLOYEE type="Error" Line="1"></EMPLOYEE> (the first record doesn't contain enough fields)

The application needs to be designed so that its components are pluggable with alternatives. For example, it should be possible to replace the parser that splits an input line on a delimiter with another one that parses it as fixed-length fields.

So I will have to design this in a layered way, like

Parsing -> Validation -> Output Generation.
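One way to keep the parsing layer replaceable is to hide it behind a small interface. The sketch below is only an illustration under assumptions: `LineParser`, `DelimitedParser`, `FixedWidthParser`, and the column widths are hypothetical names and values, not part of any existing API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PluggableParsers {

    /** Hypothetical contract for the parsing layer: one input line in, raw fields out. */
    interface LineParser {
        List<String> parse(String line);
    }

    /** Splits a line on a delimiter, as in the sample flat file. */
    static final class DelimitedParser implements LineParser {
        private final String delimiterRegex;

        DelimitedParser(String delimiterRegex) {
            this.delimiterRegex = delimiterRegex;
        }

        @Override
        public List<String> parse(String line) {
            // Limit -1 keeps empty and trailing fields so a validator can count them.
            return Arrays.asList(line.split(delimiterRegex, -1));
        }
    }

    /** Cuts a line into fixed-width columns; the widths are supplied by the caller. */
    static final class FixedWidthParser implements LineParser {
        private final int[] widths;

        FixedWidthParser(int... widths) {
            this.widths = widths;
        }

        @Override
        public List<String> parse(String line) {
            List<String> fields = new ArrayList<>();
            int start = 0;
            for (int width : widths) {
                // Clamp both ends so short lines give empty fields instead of exceptions.
                int from = Math.min(start, line.length());
                int to = Math.min(start + width, line.length());
                fields.add(line.substring(from, to).trim());
                start += width;
            }
            return fields;
        }
    }

    /** Later layers depend only on LineParser, so either implementation can be plugged in. */
    static List<List<String>> parseAll(LineParser parser, List<String> lines) {
        List<List<String>> records = new ArrayList<>();
        for (String line : lines) {
            records.add(parser.parse(line));
        }
        return records;
    }

    public static void main(String[] args) {
        LineParser delimited = new DelimitedParser("\\|");
        LineParser fixed = new FixedWidthParser(5, 10, 10, 5);
        String fixedLine = String.format("%-5s%-10s%-10s%-5s", "1", "name1", "dept1", "10000");
        System.out.println(delimited.parse("1|name1|dept1|10000"));
        System.out.println(fixed.parse(fixedLine));
    }
}
```

The validation and output layers could be given similar single-method interfaces, so that each stage can be swapped without touching the others.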

To implement this, I have come up with the following approach:

  1. Read all records from the file using BufferedReader and tokenize each one using StringTokenizer.
  2. Initialize an Employee object for each record and add it to a collection (List).
  3. Keep the errors (a mismatched number of fields or any other failed validation) in a Map.
  4. Write or marshal the list to XML using some XML builder API (not clear which would be the best).

Can anyone give me a better suggestion or any hints for the implementation?


Solution

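  • I would do the following using the libraries available in the JDK/JRE since Java SE 6. (JAXB was removed from the JDK in Java 11, so on newer versions it has to be added as a separate dependency.)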

    1. Create a StAX (JSR-173) XMLStreamWriter to output the XML content to a file.
    2. Use the XMLStreamWriter to write the root element.
    3. Read the next line of your input.
    4. Convert it to an Employee object.
    5. Use JAXB (JSR-222) to marshal the object to the XMLStreamWriter.
    6. If there is another line, repeat from step 3.
    7. Use the XMLStreamWriter to end the document.
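The streaming loop in the steps above can be sketched as follows. To stay runnable on a JDK without JAXB, this sketch swaps step 5 for direct `XMLStreamWriter` calls instead of marshalling an Employee object; the class name `FlatFileToXml`, its method names, and the `|` delimiter handling are assumptions for illustration, and only the field-count check is performed.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

class FlatFileToXml {

    private static final String[] ELEMENTS = {"ID", "NAME", "DEPARTMENT", "SALARY"};

    /** Streams the input line by line, so the whole file never has to fit in memory. */
    static void convert(Reader in, Writer out) throws IOException, XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        BufferedReader reader = new BufferedReader(in);

        xml.writeStartDocument();
        xml.writeStartElement("EMPLOYEES"); // step 2: root element

        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) { // steps 3 and 6
            lineNumber++;
            String[] fields = line.split("\\|", -1);
            if (fields.length != ELEMENTS.length) {
                // Not enough (or too many) fields: emit an empty error element.
                xml.writeEmptyElement("EMPLOYEE");
                xml.writeAttribute("type", "Error");
                xml.writeAttribute("Line", String.valueOf(lineNumber));
                continue;
            }
            // Stand-in for steps 4 and 5: write the record directly instead of via JAXB.
            xml.writeStartElement("EMPLOYEE");
            for (int i = 0; i < ELEMENTS.length; i++) {
                xml.writeStartElement(ELEMENTS[i]);
                xml.writeCharacters(fields[i]); // special characters are escaped by the writer
                xml.writeEndElement();
            }
            xml.writeEndElement();
        }

        xml.writeEndDocument(); // step 7: also closes the open EMPLOYEES element
        xml.flush();
        xml.close(); // does not close the underlying Writer
    }

    /** Convenience wrapper for small inputs; wraps checked exceptions. */
    static String toXml(String flatFile) {
        StringWriter out = new StringWriter();
        try {
            convert(new StringReader(flatFile), out);
        } catch (IOException | XMLStreamException e) {
            throw new IllegalStateException("Conversion failed", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml("1|name1|dept1|10000\n2|name2"));
    }
}
```

With JAXB on the classpath, the inner block that writes the EMPLOYEE element would instead be a call to a `Marshaller` configured with `Marshaller.JAXB_FRAGMENT` set to true, so that each marshalled object does not start its own XML document.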

    UPDATE

    There are different options for performing the actual validation. Below I'll demonstrate how to represent the resulting information in an object model that a JAXB implementation could use to produce the desired result.

    Employee

    package forum12446506;
    
    import javax.xml.bind.annotation.*;
    
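    // FIELD access (as in Value below) binds the fields directly, so no getters/setters are needed.
    @XmlRootElement(name="EMPLOYEE")
    @XmlAccessorType(XmlAccessType.FIELD)
    public class Employee {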
    
        @XmlAttribute(name="Line")
        Integer line;
    
        @XmlAttribute
        String type;
    
        @XmlElement(name="ID")
        Value id;
    
        @XmlElement(name="NAME")
        Value name;
    
        @XmlElement(name="DEPARTMENT")
        Value department;
    
        @XmlElement(name="SALARY")
        Value salary;
    
        public Employee() {
        }
    
        public Employee(int line, String type) {
            this.line = line;
            this.type = type;
        }
    
    }
    

    Value

    package forum12446506;
    
    import javax.xml.bind.annotation.*;
    
    @XmlAccessorType(XmlAccessType.FIELD)
    public class Value {
    
        @XmlAttribute(name="Line")
        Integer line;
    
        @XmlAttribute
        String type;
    
        // Text content of the element; attributes left null are omitted from the output.
        @XmlValue
        String value;
    
        public Value() {
        }
    
        public Value(Integer line, String type, String value) {
            this.line = line;
            this.type = type;
            this.value = value;
        }
    
    }
    

    Output

    <EMPLOYEE>
        <ID>1</ID>
        <NAME type="Error" Line="1"/>
    </EMPLOYEE>
    
    <EMPLOYEE type="Error" Line="2"/>