I am trying to convert an EBCDIC file to ASCII using the CobolIoProvider class from JRecord in Apache Beam.
Code that I'm using:
CobolIoProvider ioProvider = CobolIoProvider.getInstance();
AbstractLineReader reader = ioProvider.getLineReader(
        Constants.IO_FIXED_LENGTH, Convert.FMT_MAINFRAME,
        CopybookLoader.SPLIT_NONE, copybookname, cobolfilename);
The code reads and converts the file as required, but only when cobolfilename and copybookname are local paths to the EBCDIC file and the copybook respectively. When I try to read the files from GCS, it fails with FileNotFoundException: “The filename, directory name, or volume label syntax is incorrect”.
Is there a way to read a COBOL (EBCDIC) file from GCS using the CobolIoProvider class?
If not, is there any other class that can convert a COBOL (EBCDIC) file to ASCII and allows the files to be read from GCS?
Using ICobolIOBuilder:
Code that I’m using:
ICobolIOBuilder iob = JRecordInterface1.COBOL.newIOBuilder("copybook.cbl")
        .setFileOrganization(Constants.IO_FIXED_LENGTH)
        .setSplitCopybook(CopybookLoader.SPLIT_NONE);
AbstractLineReader reader = iob.newReader(bs); // bs is an InputStream over my COBOL file
However, I have a few concerns:
1) I have to keep my copybook.cbl locally. Is there any way to read the copybook file from GCS? I tried the code below, which reads my copybook from GCS into a stream and passes the stream to loadCopyBook(), but it didn’t work.
Sample code below:
InputStream bs2 = new ByteArrayInputStream(copybookfile.toString().getBytes());
LayoutDetail schema = new CobolCopybookLoader()
        .loadCopyBook( bs, " copybook.cbl",
                CopybookLoader.SPLIT_NONE, 0, "",
                Constants.USE_STANDARD_COLUMNS,
                Convert.FMT_INTEL, 0, new TextLog())
        .asLayoutDetail();
AbstractLineReader reader = LineIOProvider.getInstance().getLineReader(schema);
reader.open(inputStream, schema);
2) Reading the EBCDIC file from a stream using newReader didn’t convert my file to ASCII.
Thanks.
I do not have a full answer. If you are using a recent version of JRecord, I suggest changing the code to use JRecordInterface1. The IO-Builder is a lot more flexible than the older CobolIoProvider interface.
String encoding = "cp037"; // cp037/IBM037 - US EBCDIC; cp273 - German EBCDIC
ICobolIOBuilder iob = JRecordInterface1.COBOL
        .newIOBuilder("CopybookFile.cbl")
        .setFileOrganization(Constants.IO_FIXED_LENGTH)
        .setFont(encoding); // should set encoding if you can
AbstractLineReader reader = iob.newReader(datastream);
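To illustrate why the encoding matters, here is a minimal sketch that uses only the JDK (no JRecord) to decode a few cp037 bytes. The class and method names are made up for this example, and it assumes the runtime includes the JDK's extended charsets. It only covers plain text fields; binary and packed-decimal fields cannot be decoded this way and still need the copybook layout.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of JRecord: shows what an EBCDIC code page does to raw bytes.
final class EbcdicSketch {

    // Decodes text bytes using the named charset, e.g. "cp037" (an alias of IBM037, US EBCDIC).
    // Assumption: the JVM ships this charset (it lives in the JDK's extended charset set).
    static String decode(byte[] ebcdicBytes, String encoding) {
        return new String(ebcdicBytes, Charset.forName(encoding));
    }

    public static void main(String[] args) {
        // The word HELLO encoded in cp037.
        byte[] raw = {(byte) 0xC8, (byte) 0xC5, (byte) 0xD3, (byte) 0xD3, (byte) 0xD6};

        // Decoded with the EBCDIC code page the text is readable.
        System.out.println(decode(raw, "cp037"));

        // Decoded with a Latin-1/ASCII-style charset the same bytes are not readable text.
        System.out.println(new String(raw, StandardCharsets.ISO_8859_1));
    }
}
```

If the reader is created without an EBCDIC encoding, text fields would be interpreted roughly like the second println, which may be what "didn't convert my file" looks like in practice.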
With the IO-Builder interface you can use streams. The question "Stream file from Google Cloud Storage", which is about creating a stream from GCS, may be useful. Hopefully someone with more knowledge of GCS can help.
Alternatively, you could read from GCS directly and create data lines (data records) using the newLine method of a JRecord IO-Builder:
AbstractLine l = iob.newLine(byteArray);
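For a fixed-length file, the byte arrays could be produced by cutting the stream into record-sized chunks. The sketch below uses only the JDK (Java 11+ for readNBytes); the class name, method name, and the idea of passing the record length in as a parameter are assumptions for illustration. The real length would have to come from the copybook.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JRecord: splits any InputStream into fixed-length records.
final class FixedLengthRecords {

    // Reads the whole stream and returns one byte[] per record.
    // recordLength is assumed to be known in advance (for example from the copybook).
    static List<byte[]> readAll(InputStream in, int recordLength) throws IOException {
        if (recordLength <= 0) {
            throw new IllegalArgumentException("recordLength must be positive");
        }
        List<byte[]> records = new ArrayList<>();
        while (true) {
            // readNBytes blocks until recordLength bytes are read or the stream ends.
            byte[] record = in.readNBytes(recordLength);
            if (record.length == 0) {
                break; // clean end of stream
            }
            if (record.length < recordLength) {
                throw new IOException("Truncated trailing record: " + record.length + " bytes");
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a stream obtained from GCS: six bytes, record length three.
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5, 6});
        for (byte[] record : readAll(in, 3)) {
            // Each record could then be handed to iob.newLine(record) as in the line above.
            System.out.println(record.length);
        }
    }
}
```

Loading every record into a list keeps the sketch short; for large files, processing each record as it is read would avoid holding the whole file in memory.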
I will look at creating a basic read/write interface to JRecord so JRecord users can write their own interface to GCS, IBM's Mainframe Access (ZFile), etc. But this will take time.