
Statistics about "Microformat vs HTML+RDFa" adoption


Are there any recent and reliable statistics about the "Web use" of these standards (webpages using one standard or another)?

Or any specific statistics about the scope of use of vCard (person and/or organization)?

Only statistics: this question is not about "which is the best idea?" or "how to use it?". I am looking for statistical figures to compare Microformats adoption with (any kind of) RDFa-in-HTML adoption.

For "page counting" statistics, we can consider Microdata to be a kind of RDFa-HTML.


NOTES

Explaining the context

RDFa Lite is the only W3C Recommendation when we talk about "Microdata vs Microformats", and Microdata maps better to RDFa Lite. HTML5 became a W3C Recommendation on 2014-10-28, and neither of the two was blessed by the W3C. My understanding is that schema.org is the best way to adopt RDFa (by reusing community schemas).

On the other hand, Microformats is older and the simplest, so perhaps it is the most used on the Web (is it?).

About "vCard data statistics"

If we need some scope for the statistics, let's use vCard as scope:

Other notes

Wikipedia makes an old (2012) and unverifiable assertion (no source!), "Microformats such as hCard, however, continue to be published more than schema and others on the web", and Webdatacommons is a mess, with no statistical report.

(edit) Wikipedia's citation error is now fixed.


(edit after @sashoalm's comment) A note for those who disagree that this question is valid.

This question is a software problem, not a "request for off-site resource"...

PROBLEM: to decide which library, framework, data model, etc. to use in a project, we need tools that are in use today and will be in the next few years... To make project decisions in software development, we need statistics about user trends, framework adoption, etc.

PS: here on Stack Overflow there are a lot of discussions about language statistics, which is the same "set of problems". Examples: 1, 2, 3, 4, 5, 6. See also the questions tagged with [usage-statistics].


Solution

  • Now I see that there are some statistics (!!); the Wikipedia link had been lost, and I corrected it. The data is not up to date, it is from "Winter 2013" (collected ~1.5 to 2 years ago), but it shows the reality and the trends.

    http://webdatacommons.org/structureddata/index.html#toc2

    These are the charts from the report (showing RDFa+HTML dominance!):

    [charts from the report; images not preserved]

    Interpreting:

    Here is a table for the discussion with @sashoalm, showing the percentages and totals:

    [table from the report; image not preserved]


    NOTE1: HTML5 was only released on 2014-10-28, so only around 2015-10 will we be able to check the real (definitive) impact of the new standard on the Web. An important expected impact is that Microdata was not blessed by HTML5, so the only standard is HTML+RDFa (which recommends RDFa Lite)... In the future there will perhaps be less Microdata and more schema.org.

    NOTE2: there is a methodological problem in counting web pages, namely boilerplate text with heavily cloned "semantic markup". I think the "next generation" of statistics could use some "per-domain analysis" to build URL sub-statistics (sampling) of the diversity of semantically marked-up pages. Ideally the boilerplate would be weighted (e.g. count the non-clones once and use 1+SQRT(count) for the clones).

    Conclusion

    Today some people perhaps use Microformats, but there are more pages on the Web using RDFa-HTML (Microdata, RDFa, RDFa Lite, etc.), and the trend is for that number to grow.

    If your project is for the coming years, the statistics say to use RDFa.


    NOTE

    Another interesting count for RDFa is not the use but the reuse of vocabularies (!). See Linked Open Vocabularies (LOV).

    LOV
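
A minimal sketch of the weighting idea from NOTE2, under my own assumptions: each crawled page is reduced to a (syntax, domain, markup signature) tuple, pages with the same tuple are treated as clones of one template, and "count" in 1+SQRT(count) is read as the number of extra copies beyond the first (so a page with no clones weighs exactly 1). The function name, the tuple layout and the sample data are hypothetical, they are not from the Web Data Commons report.

```python
import math
from collections import Counter, defaultdict


def weighted_page_counts(pages):
    """Return {syntax: weighted count} for (syntax, domain, signature) tuples.

    Assumption: identical tuples are clones of the same template, and a
    template with c extra copies contributes 1 + sqrt(c) instead of 1 + c.
    """
    groups = Counter(pages)  # template -> number of pages using it
    totals = defaultdict(float)
    for (syntax, _domain, _signature), n in groups.items():
        clones = n - 1  # copies beyond the first page
        totals[syntax] += 1 + math.sqrt(clones)
    return dict(totals)


# Hypothetical sample: one shop cloning a template on 10 pages,
# versus three independent blogs with one hCard page each.
sample = [("microdata", "shop.example", "product-template")] * 10 + [
    ("microformats", "blog-a.example", "hcard-1"),
    ("microformats", "blog-b.example", "hcard-2"),
    ("microformats", "blog-c.example", "hcard-3"),
]

raw = Counter(syntax for syntax, _, _ in sample)
weighted = weighted_page_counts(sample)
print(dict(raw))   # raw page counts per syntax
print(weighted)    # counts after damping the clones
```

With this sample the raw counts are 10 against 3, while the weighted counts are 4.0 against 3.0, so a single site with cloned markup no longer dominates the comparison. How to compute a good markup signature per page is left open here.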