How to prevent search engines from indexing private data or sensitive information?
August 31, 2009 by The Big SEO
Filed under robots.txt
If you do not want search engines to index private data or sensitive information on your website, there is a way to do it. Search engines check for one file in your website's root: robots.txt.
What is robots.txt?
This simple little file can get you listed or keep you from ever being listed. robots.txt is a plain text file that contains instructions for search engine crawlers to act on.
It tells search engines whether to index your website and, if so, which pages to leave out. You must store this file in the root of your website, in the same folder as your index.html or home page.
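Because crawlers look for the file only at the root, its address can be worked out from any page on the site. The following is a minimal Python sketch of that idea; the function name robots_url and the www.example.com address are placeholders chosen for illustration, not part of any crawler's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the robots.txt address for the site that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the file always sits at the root path.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Placeholder page address, used only to show the result.
print(robots_url("http://www.example.com/blog/2009/08/post.html"))
# prints http://www.example.com/robots.txt
```

A robots.txt file placed in a subfolder such as /blog/ would not be found this way, which is why the root location matters.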
All the major search engines follow the rules in this file, in keeping with good web crawling ethics. Compliance is voluntary, so the file cannot stop a crawler that chooses to ignore it.
Sometimes a webmaster needs to move a site to a new domain. In that case, copy the robots.txt file to the root of the new domain so that the same crawling rules apply under the new domain name.
With robots.txt you can also ban or restrict a particular search engine crawler from your website.
The general format of robots.txt is as follows:
User-agent: *
Disallow:
The above statement allows all robots to visit all files on your website.
User-agent: *
Disallow: /
The above statement keeps all robots out of the whole website.
User-agent: *
Disallow: /Script/
Disallow: /Admin/
The above statement keeps all robots out of the Script and Admin folders.
User-agent: Exabot
Disallow: /
The above statement bans Exabot from all files on the server.
User-agent: Exabot
Disallow: /style.css
The above statement keeps Exabot away from style.css.
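Before uploading a robots.txt file, it can help to check that the rules behave as intended. The sketch below uses the robots.txt parser from Python's standard library (urllib.robotparser) to test rules like the ones above. The helper name allowed, the crawler name AnyBot, and the www.example.com address are assumptions made for illustration, and real search engines may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, agent, url):
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


# Rule sets mirroring the examples in this post.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_FOLDERS = "User-agent: *\nDisallow: /Script/\nDisallow: /Admin/\n"
BLOCK_EXABOT = "User-agent: Exabot\nDisallow: /\n"

SITE = "http://www.example.com"  # placeholder domain

# An empty Disallow value permits everything; "Disallow: /" blocks everything.
print(allowed(ALLOW_ALL, "AnyBot", SITE + "/index.html"))        # True
print(allowed(BLOCK_ALL, "AnyBot", SITE + "/index.html"))        # False

# Folder rules block only the listed folders.
print(allowed(BLOCK_FOLDERS, "AnyBot", SITE + "/Admin/users"))   # False
print(allowed(BLOCK_FOLDERS, "AnyBot", SITE + "/index.html"))    # True

# A named rule affects only that crawler.
print(allowed(BLOCK_EXABOT, "Exabot", SITE + "/index.html"))     # False
print(allowed(BLOCK_EXABOT, "AnyBot", SITE + "/index.html"))     # True
```

The first two checks show why the slash matters: the only difference between opening the whole site and closing it is the single "/" after Disallow.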


