javajsonjsonnode

How to generated the csv file from json data with Java?


I try to generate csv file from json type data. These are my json test data.

{
  "realtime_start":"2020-09-25",
  "realtime_end":"2020-09-25",,
  "units": "Percent",
  "seriess": [
    {
      "name": "James",
      "age": 29,
      "house": "CA"
    },
    {
      "name": "Jina",
      "age": 39,
      "house": "MA",
      "notes": "Million tonne punch"
    },
}

The problem is json array type "seriess" does not contain "notes" node in all every nodes. I made the below java codes to change this json data to csv file with header columns

JSONObject json = getJsonFileFromURL(...)

JSONArray docsArray = json.getJSONArray("seriess");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));
docsArray.put(json.get("units"));

JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
        
Builder csvSchemaBuilder = CsvSchema.builder();
for(JsonNode node : jsonTree) {
    node.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
}

CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);

But the incorrect results are shown like below,

realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes, realtime_start,.....

The generated header columns does not contain distinct values. The header columns are added in duplicate. How can I generate the distinct header like below

realtime_start,realtime_end,units,names,age,house, notes

Any idea?

Update Part

I try to extract data from the FRED (FEDERAL RESERVE BANK OF ST. LOUIS). FRED provide simple and convenient Python api like below,

from fredapi import Fred 
import pandas as pd

fred = Fred(api_key='abcdefghijklmnopqrstuvwxyz0123456789')
data_unemploy = fred.search('Unemployment Rate in California')
data_unemploy.to_csv("test_unemploy.csv")

But the java apis are deprecated, so I have to develop simple Java api which convert json values to csv file. I found the below Java codes with googling

JSONObject json = getJsonFileFromURL("https://api.stlouisfed.org/fred/series/search?search_text=Unemployment+Rate+in+California&api_key=abcdefghijklmnopqrstuvwxyz0123456789&file_type=json");
        
JSONArray docsArray = json.getJSONArray("seriess");
docsArray.put(json.get("realtime_start"));
docsArray.put(json.get("realtime_end"));

JsonNode jsonTree = new ObjectMapper().readTree(docsArray.toString());
JsonNode firstObject = jsonTree.elements().next();  // I am struggling with this line 
firstObject.fieldNames().forEachRemaining(fieldName -> {csvSchemaBuilder.addColumn(fieldName);} );
CsvSchema csvSchema = csvSchemaBuilder.build().withHeader();
        
CsvMapper csvMapper = new CsvMapper();
csvMapper.writerFor(JsonNode.class).with(csvSchema).writeValue(new File("test.csv"), jsonTree);

To extract columns from json data JsonNode firstObject = jsonTree.elements().next(); return the first json node. But this line does not return notes column. because the first line does not contain the notes key value.

So I change this code line to following lines

for(JsonNode node : jsonTree) {
    node.fieldNames().forEachRemaining(fieldName -> {
        csvSchemaBuilder.addColumn(fieldName);
    } );
}

But these lines throws the results which I do not expect. The repeated duplicated columns like below

realtime_start,realtime_end,units,names,age,house,realtime_start,realtime_end,units,names,age,house,notes, realtime_start,.....

I am totally stuck with this part.


Solution

  • Most probably it is easiest to write a bin type class like below :

    public class CsvVo {
    
        private String realtime_start;
        private String realtime_end;
        private String units;
        private String name;
        private String age;
        private String house;
        private String notes;
    
        public void setRealtime_start(String realtime_start) {
            this.realtime_start = realtime_start;
        }
    
    //Other getters and Setters
    

    Then you can Write :

    public class ConvertJsonToCSVTest {
        public static void main(String[] args) throws JSONException {
            String jsonArrayString = "{\n" +
                    "\t\"realtime_start\": \"2020-09-25\",\n" +
                    "\t\"realtime_end\": \"2020-09-25\",\n" +
                    "\t\"units\": \"Percent\",\n" +
                    "\t\"seriess\": [{\n" +
                    "\t\t\t\"name\": \"James\",\n" +
                    "\t\t\t\"age\": 29,\n" +
                    "\t\t\t\"house\": \"CA\"\n" +
                    "\t\t},\n" +
                    "\t\t{\n" +
                    "\t\t\t\"name\": \"Jina\",\n" +
                    "\t\t\t\"age\": 39,\n" +
                    "\t\t\t\"house\": \"MA\",\n" +
                    "\t\t\t\"notes\": \"Million tonne punch\"\n" +
                    "\t\t}\n" +
                    "\t]\n" +
                    "}";
            JSONObject inJson;
                List<CsvVo> list = new ArrayList<>();
                inJson = new JSONObject(jsonArrayString);
                JSONArray inJsonSeries = inJson.getJSONArray("seriess");
                for (int i = 0, size = inJsonSeries.length(); i < size; i++){
                    CsvVo line = new CsvVo();
                    line.setRealtime_start(inJson.get("realtime_start").toString());
                    line.setRealtime_end(inJson.get("realtime_end").toString());
                    line.setUnits(inJson.get("units").toString());
                    JSONObject o = (JSONObject)inJsonSeries.get(i);
                    line.setName(o.get("name").toString());
                    line.setAge(o.get("age").toString());
                    line.setHouse(o.get("house").toString());
                    try {
                        line.setNotes(o.get("notes").toString());
                    }catch (JSONException e){
                        line.setNotes("");
                    }
                    list.add(line);
                }
                String[] cols = {"realtime_start", "realtime_end", "units", "name", "age", "house", "notes"};
                CsvUtils.csvWriterUtil(CsvVo.class, list, "in/EmpDetails.csv", cols);
    
            }
        }
    

    csvWriterUtil is like below :

        public static <T> void csvWriterUtil(Class<T> beanClass, List<T> data, String outputFile, String[] columnMapping){
            try{
                Writer writer = new BufferedWriter(new FileWriter(outputFile));
                ColumnPositionMappingStrategy<T> strategy = new ColumnPositionMappingStrategy<>();
                strategy.setType(beanClass);
                strategy.setColumnMapping(columnMapping);
                StatefulBeanToCsv<T> statefulBeanToCsv =new StatefulBeanToCsvBuilder<T>(writer)
                        .withMappingStrategy(strategy)
                        .build();
                writer.write(String.join(",",columnMapping)+"\n");
                statefulBeanToCsv.write(data);
                writer.close();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (CsvRequiredFieldEmptyException e) {
                e.printStackTrace();
            } catch (CsvDataTypeMismatchException e) {
                e.printStackTrace();
            }
        }
    

    Full example is available in GitRepo