I wrote a Java function that opens a file in HDFS. It uses only the WebHDFS REST API, and my code has no Hadoop dependencies. The function works well:
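import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public static void openFile()
{
    System.out.print("main for testing the Hdfs WEB API");
    try {
        // new URL(...) can throw MalformedURLException, so it sits inside the try
        URL url = new URL("http://URI/webhdfs/v1/PATH_TO_File?op=OPEN");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("GET");
        con.setDoInput(true);
        InputStream in = con.getInputStream();
        int ch;
        // Print the response body one byte at a time
        while ((ch = in.read()) != -1)
        {
            System.out.print((char) ch);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}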
I am now writing a second function that should return the list of files in an HDFS directory:
public static void ListFile()
{
    System.out.print("main for testing the Hdfs WEB API");
    try {
        URL url = new URL("http://URI/webhdfs/v1/PATH_TO_File?op=LISTSTATUS");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("GET");
        con.setDoInput(true);
        InputStream in = con.getInputStream();
        // This logs the InputStream object itself, not the listing
        logger.info("list is '{}' ", url.openStream());
    } catch (IOException e) {
        e.printStackTrace();
    }
}
Could you please help me return the list of files in HDFS by reading the response stream, for example with a Scanner? Both URLs work when I open them in the browser. Thanks in advance.
You can use the same logic as in your first function, but this time collect the full response in a StringBuilder and then parse it with a JSON library.
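InputStream in = con.getInputStream();
int ch;
StringBuilder sb = new StringBuilder();
// Casting each byte to char is only safe for single-byte characters;
// wrap the stream in an InputStreamReader if names may contain non-ASCII text
while ((ch = in.read()) != -1) {
    sb.append((char) ch);
}
String response = sb.toString();
// TODO: parse response string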
Note: libraries like Retrofit or Gson would make this more straightforward.
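
To make the idea concrete, here is a minimal sketch of the whole flow using only the JDK. It assumes the LISTSTATUS response has the usual WebHDFS shape, {"FileStatuses":{"FileStatus":[{"pathSuffix":"...", ...}]}}, and it pulls the pathSuffix values out with a regular expression instead of a real JSON parser, which is a shortcut rather than a recommendation. The class name WebHdfsList and the helpers readAll, parseFileNames and listFiles are hypothetical names made up for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of any Hadoop library.
class WebHdfsList {

    // Matches "pathSuffix":"<value>" and captures the raw value.
    // JSON escape sequences inside the value are NOT decoded.
    private static final Pattern PATH_SUFFIX =
            Pattern.compile("\"pathSuffix\"\\s*:\\s*\"((?:[^\"\\\\]|\\\\.)*)\"");

    /** Reads the whole stream as UTF-8 text and closes it. */
    static String readAll(InputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        }
        return sb.toString();
    }

    /** Extracts every pathSuffix value from a LISTSTATUS response body. */
    static List<String> parseFileNames(String json) {
        List<String> names = new ArrayList<>();
        Matcher m = PATH_SUFFIX.matcher(json);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    /** Sends a GET to the given LISTSTATUS URL and returns the entry names. */
    static List<String> listFiles(String listStatusUrl) throws IOException {
        HttpURLConnection con =
                (HttpURLConnection) URI.create(listStatusUrl).toURL().openConnection();
        con.setRequestMethod("GET");
        try (InputStream in = con.getInputStream()) {
            return parseFileNames(readAll(in));
        } finally {
            con.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Pass your own ...?op=LISTSTATUS URL as the first argument.
            System.out.println(listFiles(args[0]));
        } else {
            // Offline demo with a hand-written sample response.
            String sample = "{\"FileStatuses\":{\"FileStatus\":["
                    + "{\"pathSuffix\":\"a.patch\",\"type\":\"FILE\"},"
                    + "{\"pathSuffix\":\"bar\",\"type\":\"DIRECTORY\"}]}}";
            System.out.println(parseFileNames(sample)); // prints [a.patch, bar]
        }
    }
}
```

The returned list contains both files and directories. If you only want files, or if names can contain quotes or other escaped characters, parsing the response with a JSON library such as Gson and checking each entry's "type" field is the more robust route.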