How can I use the bleve text-indexing library, https://github.com/blevesearch/bleve, to index XML content?
I thought about using code like this XML parser in Go: https://github.com/dps/go-xml-parse, but then how do I pass what is parsed to Bleve to be indexed?
Update: My XML looks like the following:
<page>
<title>Title here</title>
<image>image url here</image>
<text>A sentence of two about the topic</text>
<facts>
<fact>Fact 1</fact>
<fact>Fact 2</fact>
<fact>Fact 3</fact>
</facts>
</page>
Define a struct that mirrors the structure of your XML, then use the standard "encoding/xml" package to unmarshal the XML into that struct. From there you can index the struct with Bleve like any other Go value.
http://play.golang.org/p/IZP4nrOotW
package main

import (
	"encoding/xml"
	"fmt"
)

// Page mirrors a single <page> element.
type Page struct {
	Title string `xml:"title"`
	Image string `xml:"image"`
	Text  string `xml:"text"`
	// "facts>fact" collects every <fact> nested inside <facts>.
	Facts []string `xml:"facts>fact"`
}
func main() {
	xmlData := []byte(`<page>
<title>Title here</title>
<image>image url here</image>
<text>A sentence of two about the topic</text>
<facts>
<fact>Fact 1</fact>
<fact>Fact 2</fact>
<fact>Fact 3</fact>
</facts>
</page>`)

	inputStruct := &Page{}
	err := xml.Unmarshal(xmlData, inputStruct)
	if err != nil {
		fmt.Println("Error unmarshalling from XML.", err)
		return
	}
	fmt.Printf("%+v\n", inputStruct)
}