
Detail Crawl Guide

Summary: A detailed look at the Crawl settings and features, including options for adding pages to your website to be scanned. Topics covered: Overview; Crawler Purpose; Differences between Crawl and Scan; Website Crawl – Quick Start; Default Crawl Behavior – Sitemap Import; Traditional Page Crawl; Crawler Options; Crawl Limitations; Troubleshooting a Crawl; Additional Information…


Forbidden by robots.txt

Forbidden by robots.txt is the most common crawl error type. The Pope Tech crawler respects your robots.txt file. To resolve this error, update the robots.txt file found at yourdomain.com/robots.txt. You can leave it in place to block other bots, but add a line that specifically allows the Pope Tech user agents. Example robots.txt entry: User-agent:…
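As a rough sketch of the pattern described above, a robots.txt can block all crawlers by default while allowing one through. Note that `PopeTech` below is a placeholder user-agent name, not the confirmed Pope Tech user-agent string; check Pope Tech's documentation for the exact value.

```
# Block all other crawlers by default
User-agent: *
Disallow: /

# Allow the accessibility crawler through.
# NOTE: "PopeTech" is a placeholder user-agent name --
# confirm the actual string in Pope Tech's documentation.
User-agent: PopeTech
Disallow:
```

An empty `Disallow:` directive means the named user agent may crawl the entire site, while the `User-agent: *` group continues to block everything else.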


Organization Default Crawler Settings

The defaults for two crawler settings, “Max Pages” and “Max Depth,” can be set at the organization level. Once these are set, all newly created websites will use these defaults. Setting new organization crawler defaults will not override any existing individual website’s crawler settings; you can adjust an individual website’s crawler settings in that website’s settings.…
