I'm writing a little web crawler, and a lot of the links on the sites I'm crawling are relative (/robots.txt, for example). How do I convert these relative URLs to absolute URLs (so /robots.txt => http://google.com/robots.txt)? Does Go have a built-in way to do this?

Yes, the standard library can do this with the net/url package. Example (from the standard library):

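	package main

	import (
		"fmt"
		"log"
		"net/url"
	)

	func main() {
		// The relative reference, e.g. an href found on a crawled page.
		u, err := url.Parse("../../..//search?q=dotnet")
		if err != nil {
			log.Fatal(err)
		}
		// The absolute URL of the page the reference was found on.
		base, err := url.Parse("http://example.com/directory/")
		if err != nil {
			log.Fatal(err)
		}
		// Prints: http://example.com/search?q=dotnet
		fmt.Println(base.ResolveReference(u))
	}
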
Note that you only need to parse the absolute (base) URL once. ResolveReference returns a new URL and leaves the base untouched, so you can reuse the same base for every relative link on the page.
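
For the crawler case in the question, reusing the base might look roughly like the sketch below. The helper name resolveLinks and the sample page URL and hrefs are made up for illustration; only url.Parse and ResolveReference come from net/url. It assumes you want to silently skip hrefs that fail to parse, which may or may not suit your crawler.

```go
package main

import (
	"fmt"
	"log"
	"net/url"
)

// resolveLinks is a hypothetical helper (not part of net/url). It parses the
// page URL once, then resolves every href against it. Hrefs that fail to
// parse are skipped; that policy is an assumption of this sketch.
func resolveLinks(pageURL string, hrefs []string) ([]string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return nil, err
	}
	resolved := make([]string, 0, len(hrefs))
	for _, href := range hrefs {
		ref, err := url.Parse(href)
		if err != nil {
			continue // skip malformed links
		}
		// ResolveReference returns a new URL; base is not modified.
		resolved = append(resolved, base.ResolveReference(ref).String())
	}
	return resolved, nil
}

func main() {
	// Sample inputs, invented for this example.
	links, err := resolveLinks("http://google.com/search/index.html", []string{
		"/robots.txt",             // root-relative
		"about.html",              // relative to the page's directory
		"http://example.com/page", // already absolute
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, l := range links {
		fmt.Println(l)
	}
	// Prints:
	// http://google.com/robots.txt
	// http://google.com/search/about.html
	// http://example.com/page
}
```

Links that are already absolute come back unchanged, so you can pass every href through the same call without checking first.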