How to prevent search engines from indexing private data or sensitive information?

August 31, 2009 by The Big SEO  
Filed under robots.txt

If you do not want search engines to index private data or sensitive information on your website, there is a way to do it. Search engines check for one file in your website's root: robots.txt.

What is robots.txt?

This simple little file can get you listed or keep you from ever being listed. Robots.txt is a text file that contains instructions for search engines to act on.

This simple file tells search engines whether to index and list your website and, if so, which pages. You must store this file in the root of your website, in the same folder as your index.html or home page.
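Because crawlers look for the file only at the root, its address can be worked out from any page URL on the site. Here is a minimal Python sketch; www.example.com is a placeholder site and robots_url is a helper name chosen for illustration:

```python
from urllib.parse import urljoin

def robots_url(page_url):
    # A leading slash makes urljoin resolve against the site root,
    # whatever folder the page itself lives in.
    return urljoin(page_url, "/robots.txt")

print(robots_url("http://www.example.com/blog/2009/post.html"))
# prints http://www.example.com/robots.txt
```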

All the major search engines adhere to the rules in this file, in keeping with good web crawling ethics.

Sometimes a webmaster needs to move to a new domain. A robots.txt file on the new domain that leaves your pages open to crawlers helps keep them in the index under the new domain name.

With robots.txt you can ban or restrict a particular search engine crawler from your website.

The general format for robots.txt is as follows:

User-agent: *
Disallow:

The above statement allows all robots to visit all files on your website.

User-agent: *
Disallow: /

The above statement keeps all robots out.
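The difference between an empty Disallow line and "Disallow: /" is easy to miss, so it can help to test both. This rough sketch uses Python's standard urllib.robotparser module; the crawler name AnyBot, the example.com URL and the helper is_allowed are made up for illustration:

```python
from urllib.robotparser import RobotFileParser

def is_allowed(rules, agent, url):
    # Parse robots.txt lines held in memory; nothing is fetched from the network.
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, url)

allow_all = ["User-agent: *", "Disallow:"]
block_all = ["User-agent: *", "Disallow: /"]
page = "http://www.example.com/index.html"  # placeholder URL

print(is_allowed(allow_all, "AnyBot", page))  # True
print(is_allowed(block_all, "AnyBot", page))  # False
```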

User-agent: *
Disallow: /Script/
Disallow: /Admin/

The above statement keeps all robots out of the Script and Admin folders.
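Folder rules like these can be tried out before uploading the file. The sketch below again uses Python's urllib.robotparser; the site address, file names and the AnyBot crawler name are assumptions made for the example:

```python
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow: /Script/",
    "Disallow: /Admin/",
]
parser = RobotFileParser()
parser.parse(rules)  # parse in-memory lines, no network access

site = "http://www.example.com"  # placeholder site
for path in ["/index.html", "/Admin/login.html", "/Script/menu.js"]:
    # can_fetch returns True when the crawler may visit the URL
    print(path, parser.can_fetch("AnyBot", site + path))
```

This parser matches rules as case-sensitive path prefixes, so a lowercase /admin/ folder would not be covered by "Disallow: /Admin/".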

User-agent: Exabot
Disallow: /

The above statement bans Exabot from all files on the server.

User-agent: Exabot
Disallow: /style.css

The above statement keeps Exabot out of style.css.
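Crawler-specific rules can be checked the same way. This small sketch with Python's urllib.robotparser tests both Exabot examples; OtherBot stands in for any other crawler, and the site address and the helper parser_for are invented for illustration:

```python
from urllib.robotparser import RobotFileParser

def parser_for(rules):
    # Build a parser from in-memory robots.txt lines (no network access).
    parser = RobotFileParser()
    parser.parse(rules)
    return parser

site = "http://www.example.com"  # placeholder site

ban_exabot = parser_for(["User-agent: Exabot", "Disallow: /"])
print(ban_exabot.can_fetch("Exabot", site + "/index.html"))    # False
print(ban_exabot.can_fetch("OtherBot", site + "/index.html"))  # True

one_file = parser_for(["User-agent: Exabot", "Disallow: /style.css"])
print(one_file.can_fetch("Exabot", site + "/style.css"))   # False
print(one_file.can_fetch("Exabot", site + "/index.html"))  # True
```

Crawlers that are not named in any User-agent line, and have no "*" record to fall back on, are left unrestricted.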
