If you're reading this, chances are you've spotted our robot in your server logs. When we crawl to populate our index, we advertise the "User-agent" string "mozDex".
Our bot does retrieve and parse robots.txt files, and it looks for robots META tags in HTML. These are the standard mechanisms for webmasters to tell web robots which portions of a site a robot is welcome to access.
We're an open source project, so please understand that a misbehaving bot appearing with our User-agent string may not have been run by us. Our code is out there for anyone to tinker with. However, whether or not we ran the bot, we'd appreciate hearing about any bad behavior. Please let us know about it! If possible, please include the name of the domain and some representative log entries. We can be reached at bot@mozdex.com.
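If you want to collect representative log entries to send along, a small script can pull them out for you. The following is only a sketch: the function name mozdex_entries is made up for illustration, and it assumes your server writes the User-agent string somewhere on each log line (as the common "combined" access log format does).

```python
def mozdex_entries(log_lines, agent="mozDex"):
    """Return the log lines that mention the given User-agent string.

    The match is a case-insensitive substring test, so it will also pick
    up any other line that merely happens to contain the agent name.
    """
    needle = agent.lower()
    return [line.rstrip("\n") for line in log_lines if needle in line.lower()]
```

To use it, open your access log (the file name and location depend on your server), pass the file object to mozdex_entries, and print the first few results.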
Our bot follows the robots.txt exclusion standard. Depending on the configuration, our robot may obey different rules. To make it simple to send our bot away, we'll always obey rules for "mozDex". Here are the different cases.
To ban all bots from your site, place the following in your robots.txt file:
User-agent: *
Disallow: /
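To ban mozDex bots in general from your site while still admitting a bot that identifies itself as "mozDexOrg" (an empty Disallow line permits everything), place the following in your robots.txt file: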
User-agent: mozDex
Disallow: /
User-agent: mozDexOrg
Disallow:
To ban all mozDex bots from your site:
User-agent: mozDex
Disallow: /
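If you would like to check your rules before publishing them, you can try them against a robots.txt parser. The sketch below uses Python's standard urllib.robotparser as a stand-in; it is not the parser mozDex itself runs, so its User-agent matching may differ in edge cases. The helper name is_allowed and the example.com URL are placeholders chosen for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # takes an iterable of robots.txt lines
    return parser.can_fetch(user_agent, url)


# The two simplest cases from above, written as lists of lines.
BAN_ALL = ["User-agent: *", "Disallow: /"]
BAN_MOZDEX = ["User-agent: mozDex", "Disallow: /"]

if __name__ == "__main__":
    url = "http://www.example.com/index.html"  # placeholder URL
    print(is_allowed(BAN_ALL, "mozDex", url))          # False
    print(is_allowed(BAN_MOZDEX, "mozDex", url))       # False
    print(is_allowed(BAN_MOZDEX, "SomeOtherBot", url)) # True
```

Passing the rules as a list of lines keeps the check offline, so nothing has to be uploaded to your server first.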
If you do not have permission to edit the /robots.txt file on your server, you can still tell robots not to index your pages or follow your links. The standard mechanism for this is the robots META tag.
To tell mozDex, and other robots, not to index your page or follow your links, insert this META tag into the HEAD section of your HTML document:
<meta name="robots" content="noindex,nofollow">
Of course, you can control the "index" and "follow" directives independently. The keywords "all" or "none" are also allowed, meaning "index,follow" or "noindex,nofollow", respectively. Some examples are:
<meta name="robots" content="all">
<meta name="robots" content="index,follow">
<meta name="robots" content="index,nofollow">
<meta name="robots" content="noindex,follow">
<meta name="robots" content="none">
If there are no robots META tags, or if an action is not specifically prohibited (i.e. neither the matching "noindex" or "nofollow" directive nor "none" appears), mozDex will assume it is allowed to index the page or follow its links.
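The rules above can be sketched in a few lines of code. This is an illustration of the described behavior, not mozDex's actual implementation: it uses Python's standard html.parser, and the names RobotsMetaParser and robots_directives are invented for the example. Both actions start out allowed and are switched off only when a prohibiting keyword appears.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect index/follow permissions from robots META tags."""

    def __init__(self):
        super().__init__()
        # Default: anything not specifically prohibited is allowed.
        self.index = True
        self.follow = True

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)  # attribute names arrive lowercased
        if (attr.get("name") or "").lower() != "robots":
            return
        for token in (attr.get("content") or "").lower().split(","):
            token = token.strip()
            if token in ("noindex", "none"):
                self.index = False
            if token in ("nofollow", "none"):
                self.follow = False


def robots_directives(html):
    """Return (may_index, may_follow) for an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.index, parser.follow


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex,follow"></head></html>'
    print(robots_directives(page))  # (False, True)
```

Keywords such as "all", "index" and "follow" need no special handling here, because they only restate the default.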