php wordpress html htmltidy

Prevent HTML Tidy from moving meta tags (schema markup)


I am facing a serious problem with HTML Tidy (latest version, https://html-tidy.org).

In short: HTML Tidy converts this HTML

<div class="breadcrumbs" typeof="BreadcrumbList" vocab="http://schema.org/">
<div class="wrap">
    <span property="itemListElement" typeof="ListItem">
        <a property="item" typeof="WebPage" title="Codes Category" href="https://mysite.works/codes/" class="taxonomy category">
            <span property="name">Codes</span>
        </a>
        <meta property="position" content="1">
    </span>
</div>

into the following. Note the placement of the meta tag, which has been moved outside the ListItem span:

<div class="breadcrumbs" typeof="BreadcrumbList" vocab="http://schema.org/">
<div class="wrap">
    <span property="itemListElement" typeof="ListItem">
        <a property="item" typeof="WebPage" title="Codes Category" href="https://mysite.works/codes/" class="taxonomy category">
            <span property="name">Codes</span>
        </a>
    </span>
    <meta property="position" content="1">
</div>

This causes schema validation to fail. You can check the markup here: https://search.google.com/structured-data/testing-tool/u/0/

Because of this, breadcrumb navigation for the client's site (https://techswami.in) is not visible in search results.
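The placement problem can also be checked locally, without the online testing tool. Below is a minimal sketch, assuming PHP's DOM extension is available; the function name `position_meta_inside_list_item` is hypothetical and not part of WordPress or Tidy. It reports whether every `<meta property="position">` still has an ancestor with `typeof="ListItem"`.

```php
<?php
// Hypothetical helper (not part of WordPress or Tidy): returns true only if
// every <meta property="position"> is nested inside a typeof="ListItem" element.
function position_meta_inside_list_item(string $html): bool
{
    if (trim($html) === '') {
        return false; // nothing to inspect
    }

    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $metas = $xpath->query('//meta[@property="position"]');
    if ($metas === false || $metas->length === 0) {
        return false; // no position meta found at all
    }

    foreach ($metas as $meta) {
        // Look upwards from the meta tag for a ListItem container.
        $owners = $xpath->query('ancestor::*[@typeof="ListItem"]', $meta);
        if ($owners === false || $owners->length === 0) {
            return false; // this meta tag has been moved out of its ListItem
        }
    }
    return true;
}
```

Running this on the markup before and after formatting should show whether the formatter has moved the tag.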

What am I beautifying?

My client wanted me to make their website's source code look "clean, readable and tidy".

So I am using the following code to do that for them.

Note: this code works 100% perfectly on the following WordPress setup.

Code:

// Runs everywhere except when a logged-in user is in the admin area.
if( !is_user_logged_in() || !is_admin() ) {
    // Output-buffer callback: passes the whole page through HTML Tidy.
    function callback($buffer) {
        $tidy = new Tidy();
        $options = array(
            'indent' => true,
            'markup' => true,
            'indent-spaces' => 2,
            'tab-size' => 8,
            'wrap' => 180,
            'wrap-sections' => true,
            'output-html' => true,
            'hide-comments' => true,
            'tidy-mark' => false
        );
        $tidy->parseString($buffer, $options);
        $tidy->cleanRepair();
        // Return the repaired markup as a string rather than the Tidy object.
        return tidy_get_output($tidy);
    }
    // Start buffering once WordPress has loaded; flush on shutdown.
    function buffer_start() { ob_start("callback"); }
    function buffer_end() { if (ob_get_length()) ob_end_flush(); }
    add_action('wp_loaded', 'buffer_start');
    add_action('shutdown', 'buffer_end');
}

What help do I need from you guys?

How do I prevent HTML Tidy from moving the meta tags? I am looking for the configuration options that control this.

Thanks.


Solution

  • First of all, my sincere thanks to everyone who tried to help me.

    I have found a solution, although it does not fix the HTML Tidy issue itself.

    Instead of HTML Tidy, I am now using this: https://github.com/ivanweiler/beautify-html/blob/master/beautify-html.php

    My new code is:

    // Assumption: beautify-html.php was downloaded next to this file.
    require_once __DIR__ . '/beautify-html.php';

    // Runs everywhere except when a logged-in user is in the admin area.
    if( !is_user_logged_in() || !is_admin() ) {
        // Output-buffer callback: re-indents the page without moving elements.
        function callback($buffer) {
            $beautify = new Beautify_Html(array(
              'indent_inner_html' => false,
              'indent_char' => " ",
              'indent_size' => 2,
              'wrap_line_length' => 32786,
              'unformatted' => ['code', 'pre'],
              'preserve_newlines' => false,
              'max_preserve_newlines' => 32786,
              'indent_scripts'  => 'normal' // keep|separate|normal
            ));

            return $beautify->beautify($buffer);
        }
        function buffer_start() { ob_start("callback"); }
        function buffer_end() { if (ob_get_length()) ob_end_flush(); }
        add_action('wp_loaded', 'buffer_start');
        add_action('shutdown', 'buffer_end');
    }
    

    And now every issue related to schema markup has been fixed and the client's site has beautified source code.
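For anyone who would rather keep HTML Tidy, one possible workaround is to shield the breadcrumb block from the formatter and put it back afterwards. The sketch below has not been verified against Tidy itself and rests on several assumptions: the theme wraps the block in `<!--notidy-->` and `<!--/notidy-->` marker comments (hypothetical markers), and the names `protect_blocks`, `restore_blocks` and `format_with_protection` are made up for illustration. Each marked block is swapped for a plain-text token before formatting, so the formatter never sees the meta tag.

```php
<?php
// Hypothetical helpers; none of these names come from Tidy or WordPress.

// Replace every match of $pattern with a unique text token and remember it.
function protect_blocks(string $html, string $pattern, array &$saved): string
{
    $result = preg_replace_callback($pattern, function (array $m) use (&$saved): string {
        // The token has no spaces or markup, so a formatter has nothing to rearrange.
        $token = 'PROTECTEDBLOCK' . count($saved) . 'X';
        $saved[$token] = $m[0];
        return $token;
    }, $html);

    return $result ?? $html; // fall back to the input if the regex fails
}

// Swap the tokens back for the original, untouched markup.
function restore_blocks(string $html, array $saved): string
{
    return $saved ? strtr($html, $saved) : $html;
}

// Run any formatter (for example a Tidy-based callback) around protected blocks.
function format_with_protection(string $buffer, callable $format): string
{
    $saved = [];
    // Assumed markers: <!--notidy--> ... <!--/notidy--> placed in the theme template.
    $shielded = protect_blocks($buffer, '~<!--notidy-->.*?<!--/notidy-->~s', $saved);
    return restore_blocks($format($shielded), $saved);
}
```

In the output-buffer callback, the existing Tidy code would be passed in as `$format`. Whether Tidy leaves the token where the block was depends on the surrounding markup, so the result should be checked on a real page.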