I created a website using WordPress, and on the first day it was full of dummy content until I uploaded my own. Google indexed pages such as:
www.url.com/?cat=1
Now these pages no longer exist, and to make a removal request Google asks me to block them in robots.txt.
Should I use:
User-Agent: *
Disallow: /?cat=
or
User-Agent: *
Disallow: /?cat=*
My robots.txt file would look something like this:
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /wp-login.php
Disallow: /wp-register.php
Disallow: /author
Disallow: /?cat=
Sitemap: http://url.com/sitemap.xml.gz
Does this look fine, or could it cause any problems with search engines? Should I use Allow: / along with all the Disallow: lines?
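As a sanity check, rules like these can be tested locally. Here is a minimal sketch using Python's standard-library urllib.robotparser; note that the stdlib parser only does plain prefix matching, so it can validate the /?cat= form above but not Google's * wildcard extension.

import urllib.robotparser

# The robots.txt body from above, fed to the parser line by line.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin
Disallow: /wp-includes
Disallow: /wp-content
Disallow: /wp-login.php
Disallow: /wp-register.php
Disallow: /author
Disallow: /?cat=
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# The stale category URL should be blocked...
print(parser.can_fetch("*", "http://url.com/?cat=1"))   # False
# ...while regular pages remain crawlable.
print(parser.can_fetch("*", "http://url.com/about/"))   # True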
I would actually go with the wildcard form. Google's documentation says:
To block access to all URLs that include a question mark (?) (more specifically, any URL that begins with your domain name, followed by any string, followed by a question mark, followed by any string):
User-agent: Googlebot
Disallow: /*?
A trailing * is redundant, because robots.txt rules already match by prefix: Disallow: /?cat= and Disallow: /?cat=* behave identically. The leading wildcard is what makes the difference, since /*?cat= also catches the parameter on deeper paths such as /page/?cat=1. (And no, you don't need Allow: /; anything not matched by a Disallow rule is crawlable by default.) So for your case I would use:
User-agent: Googlebot
Disallow: /*?cat=
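For intuition, here is a minimal sketch (not Googlebot's actual implementation) of the wildcard semantics Google documents, where * matches any run of characters and a trailing $ anchors the rule to the end of the URL. The helper name google_rule_matches is just for illustration.

import re

def google_rule_matches(rule: str, path: str) -> bool:
    """Return True if a robots.txt rule matches the given URL path."""
    # Escape regex metacharacters, then restore * as "match anything".
    pattern = re.escape(rule).replace(r"\*", ".*")
    # A trailing $ anchors the rule to the end of the URL.
    if pattern.endswith(r"\$"):
        pattern = pattern[:-2] + "$"
    # Rules always match from the start of the path (prefix semantics).
    return re.match(pattern, path) is not None

print(google_rule_matches("/*?cat=", "/?cat=1"))       # True
print(google_rule_matches("/*?cat=", "/page/?cat=7"))  # True
print(google_rule_matches("/?cat=", "/?cat=1"))        # True  (prefix match)
print(google_rule_matches("/?cat=", "/page/?cat=7"))   # False (no wildcard)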