I am trying to get the MIME type of files being uploaded to my server.
For .xlsx and .docx files, the detected MIME type comes back as application/zip. I tried unzipping the file and reading the entry named "_rels/.rels". My question is: when reading this entry, what maximum size should I allow for the read, and if the Target is "xl/workbook.xml", can I assume the file is an xlsx?
My code is below:
    file, fileHeader, err := r.FormFile("file")
    if err != nil {
        fmt.Println(err)
        return
    }
    defer file.Close()

    // Sniff the first 512 bytes, which is all DetectContentType looks at.
    buffer := make([]byte, 512)
    _, err = file.Read(buffer)
    if err != nil {
        fmt.Println(err)
    }
    contentType := http.DetectContentType(buffer)
    if contentType == "application/zip" {
        // zip.NewReader uses ReadAt, so the earlier Read does not affect it.
        r, err := zip.NewReader(file, fileHeader.Size)
        if err != nil {
            fmt.Println(err)
            return
        }
        for _, zf := range r.File {
            if zf.Name == "_rels/.rels" {
                fmt.Println("rels")
                rc, err := zf.Open()
                if err != nil {
                    fmt.Println("Rels errors")
                    continue
                }
                defer rc.Close()
                const BufferSize = 1000
                buffer := make([]byte, BufferSize)
                // Note: a single Read may return fewer bytes than the entry holds.
                bytesread, err := rc.Read(buffer)
                if err != nil {
                    if err != io.EOF {
                        fmt.Println(err)
                    }
                }
                fmt.Println("bytes read: ", bytesread)
                fmt.Println("bytestream to string: ", string(buffer[:bytesread]))
                fmt.Println(rc)
            }
        }
    }
    var arr []byte
    w.Header().Set("Content-Type", "application/json")
    w.Write(arr)
}
The output I get is:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/extended-properties" Target="docProps/app.xml"/><Relationship Id="rId2" Type="http://schemas.openxmlformats.org/package/2006/relationships/metadata/core-properties" Target="docProps/core.xml"/><Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument" Target="xl/workbook.xml"/></Relationships>
Any tips on how to read a .doc or .xls?
Unfortunately, DetectContentType from the http package is rather limited in the MIME types it can detect.
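For the zip-based formats in the question, one possible refinement is sketched below. It reads "_rels/.rels" through io.LimitReader so the read is capped, decodes it with encoding/xml, and looks at the Target of the officeDocument relationship. The helper names (ooxmlType, ooxmlTypeFromBytes, buildZip) and the 1 MiB cap are assumptions made for illustration, not values from any specification. Treating a Target under "xl/" as xlsx and under "word/" as docx is a convention that common producers follow, so this should be read as a heuristic rather than a guarantee.

```go
package main

import (
	"archive/zip"
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// maxRelsSize is an arbitrary cap chosen for this sketch (an assumption).
const maxRelsSize = 1 << 20

const officeDocumentRel = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument"

// relationships mirrors the parts of _rels/.rels this sketch needs.
type relationships struct {
	Relationship []struct {
		Type   string `xml:"Type,attr"`
		Target string `xml:"Target,attr"`
	} `xml:"Relationship"`
}

// ooxmlType is a hypothetical helper: it guesses a MIME type from the main
// document relationship, falling back to application/zip.
func ooxmlType(r io.ReaderAt, size int64) (string, error) {
	zr, err := zip.NewReader(r, size)
	if err != nil {
		return "", err
	}
	for _, zf := range zr.File {
		if zf.Name != "_rels/.rels" {
			continue
		}
		rc, err := zf.Open()
		if err != nil {
			return "", err
		}
		defer rc.Close()

		var rels relationships
		// LimitReader bounds how much decompressed data is ever decoded.
		if err := xml.NewDecoder(io.LimitReader(rc, maxRelsSize)).Decode(&rels); err != nil {
			return "", err
		}
		for _, rel := range rels.Relationship {
			if rel.Type != officeDocumentRel {
				continue
			}
			target := strings.TrimPrefix(rel.Target, "/")
			switch {
			case strings.HasPrefix(target, "xl/"):
				return "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet", nil
			case strings.HasPrefix(target, "word/"):
				return "application/vnd.openxmlformats-officedocument.wordprocessingml.document", nil
			}
		}
		break
	}
	return "application/zip", nil
}

// ooxmlTypeFromBytes is a convenience wrapper for in-memory data.
func ooxmlTypeFromBytes(data []byte) (string, error) {
	return ooxmlType(bytes.NewReader(data), int64(len(data)))
}

// buildZip creates a small in-memory archive, used only for the demo below.
func buildZip(files map[string]string) []byte {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for name, body := range files {
		w, err := zw.Create(name)
		if err != nil {
			panic(err)
		}
		if _, err := io.WriteString(w, body); err != nil {
			panic(err)
		}
	}
	if err := zw.Close(); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	rels := `<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships"><Relationship Id="rId1" Type="` + officeDocumentRel + `" Target="xl/workbook.xml"/></Relationships>`
	mime, err := ooxmlTypeFromBytes(buildZip(map[string]string{"_rels/.rels": rels}))
	fmt.Println(mime, err)
}
```

In the upload handler from the question, the same function could be called as ooxmlType(file, fileHeader.Size), since the multipart file implements io.ReaderAt.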
As for detecting binary formats, you don't need to read the whole file if all you need is to tell whether it is a .doc. You can just check the file signature, that is, the magic bytes at the start of the file. A good resource for looking these up is a file signatures table.
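As a minimal sketch of a signature check: legacy .doc and .xls files are stored in the OLE2 Compound File Binary container, which starts with the eight bytes D0 CF 11 E0 A1 B1 1A E1. The function name isOLECompoundFile is made up for this example. Note that the signature only identifies the container, so on its own it cannot tell a .doc from an .xls; that would need a look at the streams inside the file.

```go
package main

import (
	"bytes"
	"fmt"
)

// oleSignature is the 8-byte header of the OLE2 Compound File Binary
// container used by legacy Office formats such as .doc and .xls.
var oleSignature = []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1}

// isOLECompoundFile reports whether header begins with the OLE2 signature.
// Only the first 8 bytes of the upload are needed for this check.
func isOLECompoundFile(header []byte) bool {
	return bytes.HasPrefix(header, oleSignature)
}

func main() {
	oleHeader := []byte{0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1, 0x00, 0x00}
	zipHeader := []byte("PK\x03\x04")

	fmt.Println(isOLECompoundFile(oleHeader))
	fmt.Println(isOLECompoundFile(zipHeader))
}
```

In the handler from the question, the 512-byte buffer already read for DetectContentType could be passed to this function directly.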
If you would rather use an existing package, the options on GitHub include mimetype, magic (see man magic), and filetype.
Disclaimer: I'm the author of mimetype.