Tags: html, security, cdn, uri-scheme, subresource-integrity

How to cryptographically verify web page requisites?


How to cryptographically verify web page requisites in HTML?

For example, if I have some external resource like an image, a style sheet or (most importantly) a script on a (potentially untrusted) content delivery network, is it possible to force the client browser to cryptographically verify the hash of the downloaded resource before usage? Is there some HTML attribute or URL scheme for this or does one manually have to write some JavaScript to do it?

The rationale is that providing the hashes in HTML served over HTTPS gives an extra defence against compromised (or faulty) CDNs.


Solution

  • As of 23 June 2016, Subresource Integrity is a W3C Recommendation which allows you to do just that for scripts and style sheets, since the integrity attribute is defined on script and link elements (draft version here). According to the Implementation Report, it is already implemented in Firefox 43 and Chrome 45.

    A simple example using subresource integrity would be something like:

    <script src="https://example.com/example.js"
        integrity="sha256-8OTC92xYkW7CWPJGhRvqCR0U1CR6L8PhhpRGGxgW4Ts="
        crossorigin="anonymous"></script>
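
The value of the integrity attribute is the algorithm name, a hyphen, and the base64-encoded digest of the exact bytes served. As a minimal sketch of how such a value could be produced, assuming Python is available at build time (the function name `sri_value` is made up for this illustration and is not part of any standard tool):

```python
import base64
import hashlib


def sri_value(data: bytes, algorithm: str = "sha384") -> str:
    """Return an integrity token of the form '<algorithm>-<base64 digest>'."""
    # Only the hash functions named by the SRI specification are accepted here.
    if algorithm not in ("sha256", "sha384", "sha512"):
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, data).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Placeholder bytes; in practice, read the exact file the CDN will serve.
    print(sri_value(b"alert('Hello, world.');", "sha256"))
```

The hash must be computed over the file exactly as it will be delivered; any later change to the file, such as re-minification, changes the digest and causes the browser to reject the resource.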
    

    It is also possible to specify multiple algorithm-hash pairs (called metadata) in the integrity attribute, separated by whitespace; invalid entries are ignored (§3.3.3). The client is expected to select the metadata values that use the strongest algorithm it supports (§3.3.4), and to compare the hash of the actual data against the hash values in that set (§3.3.5) to determine whether the resource is valid. For example:

    <script src="https://example.com/example.js"
        integrity="
           md5-kS7IA7LOSeSlQQaNSVq1cA==
           md5-pfZdWPRbfElkn7w8rizxpw==
           sha256-8OTC92xYkW7CWPJGhRvqCR0U1CR6L8PhhpRGGxgW4Ts=
           sha256-gx3NQgFlBqcbJoC6a/OLM/CHTcqDC7zTuJx3lGLzc38=
           sha384-pp598wskwELsVAzLvb+xViyFeHA4yIV0nB5Aji1i+jZkLNAHX6NR6CLiuKWROc2d
           sha384-BnYJFwkG74mEUWH4elpCm8d+RFIMDgjWWbAyaXAb8Oo//cHPOeYturyDHF/UcnUB"
        crossorigin="anonymous"></script>
    

    If the client understands SHA-256 and SHA-384, but not MD5, then it splits the value of the integrity attribute on whitespace and throws away the md5- metadata tokens as garbage. The client then determines that the strongest algorithm in the remaining metadata is SHA-384 and compares the two sha384- values to the SHA-384 hash of the data actually received; the resource is accepted if either of them matches.
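
The selection and comparison steps described above can be sketched as follows. This is only an illustration of the logic, not browser code: the names `parse_metadata`, `strongest` and `matches` and the `PRIORITY` table are assumptions made for this example, and details such as option suffixes in the metadata are left out.

```python
import base64
import hashlib
from typing import List, Tuple

# Strength order assumed for this sketch (higher number = stronger).
PRIORITY = {"sha256": 1, "sha384": 2, "sha512": 3}


def parse_metadata(integrity: str) -> List[Tuple[str, str]]:
    """Split on whitespace and keep only tokens with a supported algorithm."""
    result = []
    for token in integrity.split():
        algorithm, sep, value = token.partition("-")
        if sep and value and algorithm in PRIORITY:
            result.append((algorithm, value))
    return result


def strongest(metadata: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Keep only the entries that use the strongest algorithm present."""
    if not metadata:
        return []
    best = max(PRIORITY[algorithm] for algorithm, _ in metadata)
    return [(a, v) for a, v in metadata if PRIORITY[a] == best]


def matches(data: bytes, integrity: str) -> bool:
    """Return True if data satisfies the integrity metadata."""
    metadata = parse_metadata(integrity)
    if not metadata:
        # With no usable metadata there is nothing to enforce.
        return True
    for algorithm, expected in strongest(metadata):
        digest = hashlib.new(algorithm, data).digest()
        if base64.b64encode(digest).decode("ascii") == expected:
            return True
    return False
```

Note that a correct sha256- value does not rescue a resource when a sha384- value is present and fails: only the strongest set is consulted, which is what prevents an attacker from relying on a weaker hash.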