Tags: java, apache-tika, last-modified, file-properties, datecreated

Extracting the created date-time and last-modified date-time of a file from its InputStream; I'm unable to get them using the Metadata class from Tika


I have some files that will be either email attachments or zip attachments, which means that I have a stream of each file rather than the file itself or its actual path. I need to get the created date-time and last-modified date-time of the file using its InputStream. I tried Metadata from Apache Tika, but it does not give me these two values, even though I can see both in the file properties. I am also able to get the created and modified date-times of the same files using BasicFileAttributes, but BasicFileAttributes works on a file path and won't work with a stream. Consider the scenario below:

Say I have a file, myTestFile.txt. For this file I can see the created and modified date-times in the file properties, and I'm able to read both using BasicFileAttributes. But when I parse the stream of the same file with Apache Tika and inspect the Metadata, it does not give me either of the two dates.
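For reference, the path-based lookup described above can be sketched roughly as follows. The class and method names (`PathTimes`, `createdAndModified`) are hypothetical and only for illustration. The point is that these two values are file-system attributes attached to a path, not part of the bytes an InputStream delivers.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.attribute.BasicFileAttributes;
import java.nio.file.attribute.FileTime;

// Hypothetical helper: reads the timestamps the file system keeps for a path.
class PathTimes {

    /** Returns { creationTime, lastModifiedTime } for the given path. */
    static FileTime[] createdAndModified(Path file) {
        try {
            BasicFileAttributes attrs = Files.readAttributes(file, BasicFileAttributes.class);
            return new FileTime[] { attrs.creationTime(), attrs.lastModifiedTime() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Any existing path works; the temp directory is used only as a stand-in.
        Path path = Paths.get(System.getProperty("java.io.tmpdir"));
        FileTime[] times = createdAndModified(path);
        System.out.println("created:  " + times[0]);
        System.out.println("modified: " + times[1]);
    }
}
```

Some file systems do not record a real creation time, in which case `creationTime()` may return the last-modified time or another implementation-specific value.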

I need a solution that gets the created and last-modified date-times from the stream rather than from the file or file path, because in the production environment I'll only have the stream and not the actual file or its path.
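For the zip-attachment case specifically, one option that needs only the stream is to read the timestamps stored in the archive's entry headers with the JDK's `java.util.zip` classes. The following is a minimal sketch under that assumption, not a replacement for Tika. `ZipEntryTimes`, `lastModifiedByEntry`, and `sampleZip` are hypothetical names, and `sampleZip` exists only so the example can run without a real attachment.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.attribute.FileTime;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: reads per-entry timestamps from a ZIP stream.
class ZipEntryTimes {

    /** Maps each entry name to the last-modified time stored in the archive. */
    static Map<String, FileTime> lastModifiedByEntry(InputStream zipStream) {
        Map<String, FileTime> result = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                // May be null if the archive does not specify a time for this entry.
                result.put(entry.getName(), entry.getLastModifiedTime());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    /** Builds a one-entry ZIP in memory, only so the sketch is runnable. */
    static byte[] sampleZip(String entryName, long lastModifiedMillis) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            ZipEntry entry = new ZipEntry(entryName);
            entry.setLastModifiedTime(FileTime.fromMillis(lastModifiedMillis));
            zout.putNextEntry(entry);
            zout.write("hello".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] zip = sampleZip("myTestFile.txt", 1_600_000_000_000L);
        lastModifiedByEntry(new ByteArrayInputStream(zip))
                .forEach((name, time) -> System.out.println(name + " last modified: " + time));
    }
}
```

`ZipEntry.getCreationTime()` exists as well, but it returns null unless the tool that built the archive stored a creation time, which many do not. Plain email attachments carry no comparable header for file-system timestamps in the attachment bytes themselves.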

Thanks


Solution

  • I found a solution. I was parsing the InputStream of the file with the Parser class to extract the file's metadata into a Metadata object, and for a few files this returned null for the creation date-time and the last-modified date-time.

    However, when I parsed the InputStream with the Tika class instead of the Parser class (both come from Apache Tika), it worked for me, and I'm now able to get all the metadata.

    The code below was my older approach, which did not give me the created and last-modified date-times.

    import java.io.IOException;
    import java.io.InputStream;
    import org.apache.tika.exception.TikaException;
    import org.apache.tika.metadata.Metadata;
    import org.apache.tika.parser.AutoDetectParser;
    import org.apache.tika.parser.ParseContext;
    import org.apache.tika.parser.Parser;
    import org.apache.tika.sax.BodyContentHandler;
    import org.xml.sax.SAXException;

    public void fetchMetaData(InputStream inputStream) {
        BodyContentHandler handler = new BodyContentHandler();
        Metadata metadata = new Metadata();
        ParseContext pcontext = new ParseContext();

        try {
            Parser parser = new AutoDetectParser();
            parser.parse(inputStream, handler, metadata, pcontext);
            System.out.println("creation date from metadata " + metadata.get("dcterms:created"));
            System.out.println("modified date from metadata " + metadata.get("dcterms:modified"));
            // Print every metadata key that was extracted, with its value
            for (String key : metadata.names()) {
                System.out.println(key + " = " + metadata.get(key));
            }
        } catch (TikaException | SAXException | IOException ex) {
            ex.printStackTrace();
        }
    }
    

    And below is the solution that worked.

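    import java.io.IOException;
    import java.io.InputStream;
    import java.io.Reader;
    import org.apache.tika.Tika;
    import org.apache.tika.metadata.Metadata;

    public void fetchMetaData(InputStream inputStream) {
        Tika tika = new Tika();
        Metadata metadata = new Metadata();
        // tika.parse returns a Reader over the extracted text; close it when done
        try (Reader reader = tika.parse(inputStream, metadata)) {
            System.out.println("creation date from metadata " + metadata.get("dcterms:created"));  // created date time
            System.out.println("modified date from metadata " + metadata.get("dcterms:modified")); // last modified date time

            // Print every metadata key that was extracted, with its value
            for (String key : metadata.names()) {
                System.out.println(key + " = " + metadata.get(key));
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }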
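The values printed above come back from `metadata.get(...)` as strings, or null when the key is absent. If a real timestamp is needed, they can be converted with `java.time`. The sketch below assumes the value is an ISO 8601 string such as `2020-01-15T10:30:00Z`, and treats a value without a zone as UTC. Both are assumptions to verify against your own files. `MetadataDates` and `toInstant` are hypothetical names.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical helper: turns a metadata date string into an Instant.
class MetadataDates {

    /** Returns the parsed instant, or empty if the value is missing or unparseable. */
    static Optional<Instant> toInstant(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Optional.empty();
        }
        String text = value.trim();
        try {
            // Handles zoned values such as 2020-01-15T10:30:00Z
            return Optional.of(Instant.parse(text));
        } catch (DateTimeParseException notZoned) {
            try {
                // Assumption: a value without a zone is interpreted as UTC
                return Optional.of(LocalDateTime.parse(text).toInstant(ZoneOffset.UTC));
            } catch (DateTimeParseException unparseable) {
                return Optional.empty();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(toInstant("2020-01-15T10:30:00Z"));
        System.out.println(toInstant(null));
    }
}
```

Returning an Optional keeps the "key was missing" case explicit, which matches the null values seen for some files with the older approach.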