angularjs apache .htaccess mod-rewrite google-index

Angular application indexed by Google


I feel like I have attempted every single option out there and nothing has succeeded. First let me list the options I have tried:

Using prerender with Apache:

I have attempted this using the following steps:

In Angular:

$locationProvider.html5Mode(true);

In HTML, add this meta header:

<head>
    <meta name="fragment" content="!">
</head>
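For context, this meta tag opts the page into Google's AJAX crawling scheme: a crawler that honors it requests the page again with an `_escaped_fragment_` query parameter and expects a rendered snapshot in return. A minimal sketch of that URL mapping follows. `toCrawlerUrl` is a hypothetical helper name, `example.com` is a placeholder, and the escaping is deliberately simplified to the few ASCII characters that would break a query string.

```javascript
// Sketch of how a crawler following the AJAX crawling scheme rewrites a URL.
// `toCrawlerUrl` is a hypothetical helper, not part of any library.
function toCrawlerUrl(pageUrl) {
  const bang = pageUrl.indexOf('#!');
  // With html5Mode there is no "#!" part, so the fragment value is empty;
  // the meta tag alone is what triggers the second request.
  const base = bang === -1 ? pageUrl.split('#')[0] : pageUrl.slice(0, bang);
  const fragment = bang === -1 ? '' : pageUrl.slice(bang + 2);
  // Simplified escaping (assumption): control chars, space, '#', '%', '&', '+'.
  const escaped = fragment.replace(/[\x00-\x20#%&+]/g, (c) => encodeURIComponent(c));
  const separator = base.includes('?') ? '&' : '?';
  return base + separator + '_escaped_fragment_=' + escaped;
}

console.log(toCrawlerUrl('https://example.com/about'));
console.log(toCrawlerUrl('https://example.com/#!/about'));
```

The server side then only has to look for `_escaped_fragment_` in the query string, which is what the Apache rules in the next step do.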

Configure Apache:

  RewriteEngine On
  # If requested resource exists as a file or directory
  # (REQUEST_FILENAME is only relative in virtualhost context, so not usable)
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -f [OR]
    RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} -d
    # Go to it as is
    RewriteRule ^ - [L]

  # If non existent
    # If path ends with / and is not just a single /, redirect to without the trailing /
      RewriteCond %{REQUEST_URI} ^.*/$
      RewriteCond %{REQUEST_URI} !^/$
      RewriteRule ^(.*)/$ $1 [R,QSA,L]      

  # Handle Prerender.io
    RequestHeader set X-Prerender-Token "YOUR_TOKEN"

    RewriteCond %{HTTP_USER_AGENT} baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora\ link\ preview|showyoubot|outbrain|pinterest [NC,OR]
    RewriteCond %{QUERY_STRING} _escaped_fragment_

    # Proxy the request
    RewriteRule ^(.*)$ http://service.prerender.io/http://%{HTTP_HOST}$1 [P,L]

  # If non existent
    # Accept everything on index.html
    RewriteRule ^ /index.html

This did not work at all: Google was unable to read my subpages.
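The proxy decision in the rules above can be restated as a small predicate, which makes it easier to see which requests actually reach Prerender. This is only a sketch: `shouldPrerender` is a hypothetical name and the user-agent strings are made-up examples, while the bot list is copied from the `RewriteCond`. One thing it makes visible is that `googlebot` is not in that list, so under these rules a Google request is proxied only when it carries `_escaped_fragment_`.

```javascript
// Bot list copied from the RewriteCond above; the "i" flag mirrors [NC].
const BOT_PATTERN = /baiduspider|facebookexternalhit|twitterbot|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest/i;

// Hypothetical helper mirroring the two OR-ed RewriteCond lines.
function shouldPrerender(userAgent, queryString) {
  const isListedBot = BOT_PATTERN.test(userAgent || '');
  const hasEscapedFragment = (queryString || '').includes('_escaped_fragment_');
  return isListedBot || hasEscapedFragment;
}

const googlebot = 'Mozilla/5.0 (compatible; Googlebot/2.1)'; // example UA string
console.log(shouldPrerender('Twitterbot/1.0', ''));          // true
console.log(shouldPrerender(googlebot, ''));                 // false
console.log(shouldPrerender(googlebot, '_escaped_fragment_=')); // true
```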

Using Node / Phantomjs to render pages

var express = require('express');
var app = module.exports = express();
var phantom = require('node-phantom');
app.use('/', function (req, res) {
    if (typeof(req.query._escaped_fragment_) !== "undefined") {
        phantom.create(function (err, ph) {
            return ph.createPage(function (err, page) {
                return page.open("https://system.dk/#!" + req.query._esca$
                    return page.evaluate((function () {
                        return document.getElementsByTagName('html')[0].innerHT$
                    }), function (err, result) {
                        res.send(result);
                        return ph.exit();
                    });
                });
            });
        });
    } else
        res.render('index');
});

app.listen(3500);
console.log('Magic happens on port ' + 3500);

Here I created this app, then added a proxy in my Apache configuration so that all requests to my domain were forwarded to port 3500.

This did not work: at first it could not render index, and when I finally got it to send the HTML page, the JavaScript in it would not run.

Following custom snapshot guide

Then I followed this guide:

http://www.yearofmoo.com/2012/11/angularjs-and-seo.html

However, this required me to make custom snapshots of every page, which is not what I am looking for and is annoying to maintain. On top of that, the make-snapshot step did not work on my server.


Solution

  • In theory you shouldn't have to prerender your pages for the Google crawler; it executes JavaScript nowadays. See http://googlewebmastercentral.blogspot.ca/2014/05/understanding-web-pages-better.html

    Google indexes my AngularJS website just fine: https://www.google.nl/search?q=site%3Atest.coachgezocht.nu

    I use $locationProvider.html5Mode(true); together with these rewrite rules:

        RewriteEngine On
        RewriteCond %{REQUEST_FILENAME} !-f
        RewriteCond %{REQUEST_FILENAME} !-d
        RewriteCond %{REQUEST_URI} !index
        RewriteRule (.*) index.html [L]
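
To make the effect of these rules concrete, here is a rough JavaScript restatement of the decision they encode. It is an illustration rather than a description of how Apache evaluates rewrites: `resolveRequest` and the `existingPaths` set are hypothetical stand-ins for the `-f` / `-d` filesystem checks, the paths are made up, and it assumes the rules live in the document root so that `index.html` means `/index.html`.

```javascript
// Hypothetical restatement of the fallback rules; not an Apache API.
function resolveRequest(requestUri, existingPaths) {
  const path = requestUri.split('?')[0];
  // Existing file or directory: the RewriteCond checks fail, so it is served as-is.
  if (existingPaths.has(path)) return path;
  // "!index" is an unanchored pattern: any URI containing "index" is left alone.
  if (path.includes('index')) return path;
  // Everything else is handed to the Angular entry point, which routes client-side.
  return '/index.html';
}

const files = new Set(['/index.html', '/app.js']); // made-up document root contents
console.log(resolveRequest('/app.js', files));     // /app.js
console.log(resolveRequest('/coaches/42', files)); // /index.html
console.log(resolveRequest('/reindex', files));    // /reindex
```

The last call shows a side effect of the unanchored `!index` condition: a client-side route whose path happens to contain "index" is not rewritten, so it may be worth anchoring that pattern if such routes exist.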