I have created Azure blob with an excel file(.xlsx). I want to read the same using Java(Apache-poi) and Azure SDK without downloading it in locally. I can download the blob/file to a temp location and read it, but I should be able directly read the file from the blob without downloading the blob as per my requirement. Is there any way as such?
Reading excel file (.xlsx) from Azure Blob using Java and Azure SDK without downloading it locally.
You can use the below code to read the .xlsx file from Azure Blob storage using Azure Java SDK without downloading locally.
Code:
import com.azure.storage.blob.BlobClient;
import com.azure.storage.blob.BlobClientBuilder;
import org.apache.poi.ss.usermodel.*;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;
import java.io.InputStream;
public class App {
public static void main(String[] args) {
String connectionString = "xxxx";
String containerName = "test";
String blobName = "sample5000.xlsx";
BlobClient blobClient = new BlobClientBuilder()
.connectionString(connectionString)
.containerName(containerName)
.blobName(blobName)
.buildClient();
try (InputStream inputStream = blobClient.openInputStream()) {
Workbook workbook = new XSSFWorkbook(inputStream);
Sheet sheet = workbook.getSheetAt(0);
// Iterate through the rows
for (Row row : sheet) {
boolean isRowEmpty = true; // Flag to check if the row is empty
// Iterate through each cell in the row
for (Cell cell : row) {
if (cell.getCellType() != CellType.BLANK) {
isRowEmpty = false; // If we find a non-blank cell, the row is not empty
}
// Handle different cell types
switch (cell.getCellType()) {
case STRING:
System.out.print(cell.getStringCellValue() + "\t");
break;
case NUMERIC:
// Handle dates if the cell contains a date
if (DateUtil.isCellDateFormatted(cell)) {
System.out.print(cell.getDateCellValue() + "\t");
} else {
System.out.print(cell.getNumericCellValue() + "\t");
}
break;
case BOOLEAN:
System.out.print(cell.getBooleanCellValue() + "\t");
break;
default:
System.out.print("Unknown Type\t");
}
}
// Only print a new line if the row is not empty
if (!isRowEmpty) {
System.out.println();
}
}
workbook.close();
} catch (Exception e) {
e.printStackTrace();
}
}
}
The above code used BlobClient
, opens an Excel file (.xlsx
) as stream from the specified blob, and reads the first sheet. It iterates through each row and cell, printing the cell contents based on their type (string, numeric, date, or boolean). Once all the rows and cells are processed, the workbook is closed to finish up.
In my environment, the file contains 5000 rows and the above executed and successfully retrieved from Azure Blob Storage without downloading locally.
Output:
4992.0 Dorcas Darity Female United States 37.0 21/05/2015 8765.0
4993.0 Angel Sanor Male France 24.0 15/10/2017 3259.0
4994.0 Willodean Harn Female United States 39.0 16/08/2016 3567.0
4995.0 Weston Martina Male United States 26.0 21/05/2015 6540.0
4993.0 Angel Sanor Male France 24.0 15/10/2017 3259.0
4994.0 Willodean Harn Female United States 39.0 16/08/2016 3567.0
4995.0 Weston Martina Male United States 26.0 21/05/2015 6540.0
4995.0 Weston Martina Male United States 26.0 21/05/2015 6540.0
4996.0 Roma Lafollette Female United States 34.0 15/10/2017 2654.0
4997.0 Felisa Cail Female United States 28.0 16/08/2016 6525.0
4998.0 Demetria Abbey Female United States 32.0 21/05/2015 3265.0
4999.0 Jeromy Danz Male United States 39.0 15/10/2017 3265.0
5000.0 Rasheeda Alkire Female United States 29.0 16/08/2016 6125.0
Reference: Azure Storage Blob client library for Java | Microsoft Learn