Friday 22 January 2016

Robots.txt file

Webmasters give instructions to search engine crawlers through a robots file.

It is a plain-text file containing directives for search engine crawlers, and it must be named robots.txt.

Following is a simple robots.txt file:

User-agent: *
Disallow: /

The above tells all search engine crawlers not to crawl any page of the website.
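Python's standard library can evaluate these rules. The sketch below parses the file above directly (no network fetch) and asks whether a crawler may fetch a URL; the crawler name and example.com URL are placeholders:

```python
from urllib.robotparser import RobotFileParser

# The rules shown above, as a list of lines.
rules = """
User-agent: *
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# With "Disallow: /", every path is off limits to every crawler.
print(rp.can_fetch("AnyBot", "https://example.com/index.html"))  # False
```

In practice you would point the parser at a live file with `rp.set_url("https://example.com/robots.txt")` followed by `rp.read()`.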

User-agent: *
Disallow:

The above allows all search engine crawlers to crawl every page of the website.

User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk

The above tells all search engine crawlers to stay out of the listed folders on the website server.
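Each `Disallow` line is matched as a path prefix, so URLs outside the listed folders remain crawlable. A quick check with `urllib.robotparser` (paths are placeholders):

```python
from urllib.robotparser import RobotFileParser

rules = """
User-agent: *
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /junk
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# URLs under a disallowed prefix are blocked; everything else is allowed.
print(rp.can_fetch("AnyBot", "https://example.com/cgi-bin/script.pl"))  # False
print(rp.can_fetch("AnyBot", "https://example.com/about.html"))         # True
```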

User-agent: Googlebot
Disallow: /

The above blocks only Google's crawler (Googlebot) from the website; other crawlers are unaffected.
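Because the rule names a specific user agent, other crawlers fall through to no restrictions at all. This can be verified with `urllib.robotparser` (Bingbot here stands in for any other crawler):

```python
from urllib.robotparser import RobotFileParser

rules = """
User-agent: Googlebot
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Googlebot is blocked everywhere; crawlers with no matching rules may fetch freely.
print(rp.can_fetch("Googlebot", "https://example.com/page.html"))  # False
print(rp.can_fetch("Bingbot", "https://example.com/page.html"))    # True
```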

There are more directives for the robots.txt file, which we will cover in a later post.

The robots.txt file is important: any errors in it can affect the website's positions in search engine results.
