XML Sitemaps: best practice when listing 'alternatives'?


I have put together an XML sitemap for Google that attempts to explain that each of my website's five main pages exists in eight languages; the other language versions have more or less identical content, just not in English. Here's a representative snippet of what I'm doing:

<url>
<loc>http://www.mywebsite.com/lang/french/</loc>
<xhtml:link 
                 rel="alternate"
                 hreflang="fr"
                 href="http://www.mywebsite.com/lang/french/"
                 />
<xhtml:link 
                 rel="alternate"
                 hreflang="af"
                 href="http://www.mywebsite.com/lang/afrikaans/"
                 />
<xhtml:link 
                 rel="alternate"
                 hreflang="en-us"
                 href="http://www.mywebsite.com/lang/us/"
                 />
<xhtml:link 
                 rel="alternate"
                 hreflang="en"
                 href="http://www.mywebsite.com/"
                 />
<xhtml:link 
                 rel="alternate"
                 hreflang="de"
                 href="http://www.mywebsite.com/lang/german/"
                 />
<xhtml:link 
                 rel="alternate"
                 hreflang="es"
                 href="http://www.mywebsite.com/lang/spanish/"
                 />
</url>

I have two questions. First, the simple one. Notice that this segment attempts to list all the locations where alternatives to the French version are located, yet its own first entry refers to its own location. In human terms, isn't that like saying "I am an alternative to me"? In other words, is the first entry:

<xhtml:link 
                 rel="alternate"
                 hreflang="fr"
                 href="http://www.mywebsite.com/lang/french/"
                 />

nonsense and needs to be removed?

My other question is a little broader in scope. What you see above is the section that lists alternatives to French. I have identical segments which, for any given language, list all the other languages as alternatives. If we were talking about human reasoning and I told you that B and C are alternative language versions of A, then you should be able to infer that A and C are alternatives to B, and so on. However, here it seems I do have to laboriously and redundantly spell all this out, or at least that's what I have so far with the assistance of an automated PHP script that may or may not be as helpful as it first seemed. Maybe Google, despite its ample capacity to reason, doesn't actually care to reason much in these cases and just applies the letter of the law. So in summary: a) do I need to reference a folder as a rel="alternate" to itself, and b) do I need to specify every language as an alternative to every other language on the site? Thanks.


Solution

  • In other words, is the first entry nonsense and needs to be removed?

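    No, keep it. Each language version is expected to list itself along with all of the others, so the self-referencing entry is correct.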

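    And do I need to specify every language as an alternative to every other language on the site?

    Unfortunately, yes. That has been explicitly recommended in Google Webmasters Hangouts. The annotations are meant to be reciprocal: if one page lists another as an alternate but is not listed back, the annotation may be ignored.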
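
    Since every <url> entry ends up carrying the same complete list of alternates (its own URL included), the redundancy is easy to generate rather than write by hand. Below is a minimal sketch in Python, not the PHP script from the question. The ALTERNATES mapping is copied from the question's snippet, and the function names build_url_entries and build_sitemap are made up for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

# hreflang code -> URL, copied from the snippet in the question.
# Extend this mapping with the remaining languages as needed.
ALTERNATES = {
    "fr": "http://www.mywebsite.com/lang/french/",
    "af": "http://www.mywebsite.com/lang/afrikaans/",
    "en-us": "http://www.mywebsite.com/lang/us/",
    "en": "http://www.mywebsite.com/",
    "de": "http://www.mywebsite.com/lang/german/",
    "es": "http://www.mywebsite.com/lang/spanish/",
}


def build_url_entries(alternates):
    """Return one <url> block per language version.

    Every block lists all versions as alternates, including the
    version the block itself describes (the self-reference).
    """
    entries = []
    for loc in alternates.values():
        lines = ["<url>", f"<loc>{escape(loc)}</loc>"]
        for hreflang, href in alternates.items():
            # quoteattr() escapes the value and adds the surrounding quotes.
            lines.append(
                f"<xhtml:link rel=\"alternate\" "
                f"hreflang={quoteattr(hreflang)} href={quoteattr(href)} />"
            )
        lines.append("</url>")
        entries.append("\n".join(lines))
    return entries


def build_sitemap(alternates):
    """Wrap the <url> blocks in a <urlset> with the xhtml namespace declared."""
    header = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"\n'
        '        xmlns:xhtml="http://www.w3.org/1999/xhtml">'
    )
    return "\n".join([header, *build_url_entries(alternates), "</urlset>"])


if __name__ == "__main__":
    print(build_sitemap(ALTERNATES))
```

    With six languages in the mapping this produces six <url> blocks of six links each, so the French block still contains its own hreflang="fr" link, and every other block links back to French.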