How to work with crawler search operators

The search on the crawling report page supports a set of operators.

The comparison operators and collectible fields are listed on the main page in the "Frequently Asked Questions -> Crawling" section.

Note: some operators accept only string values, and others accept only numeric values.

A few examples of working with operators

How to find the internal page from which the crawler discovered a given page (URL)
  1. Navigate to the crawling report page
  2. Look at the From URL column. It shows the page on which the crawler found a link to the respective page.
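
The From URL relationship can also be followed step by step to see how the crawler reached a page. The sketch below is only an illustration: it assumes the report rows are available as dictionaries with hypothetical "url" and "from_url" keys, that each URL has a single From URL, and that the start page has an empty From URL. None of these names come from the tool itself.

```python
# Hypothetical report rows; the keys "url" and "from_url" are assumptions.
rows = [
    {"url": "/", "from_url": ""},
    {"url": "/a", "from_url": "/"},
    {"url": "/a/b", "from_url": "/a"},
]

def discovery_path(rows, url):
    """Follow From URL values back from `url` to the page where crawling started."""
    source = {row["url"]: row["from_url"] for row in rows}
    path = [url]
    seen = {url}
    while source.get(path[-1]):
        parent = source[path[-1]]
        if parent in seen:  # stop if the chain loops back on itself
            break
        path.append(parent)
        seen.add(parent)
    return path

path = discovery_path(rows, "/a/b")
```

With the sample rows, the path runs from "/a/b" through "/a" to the start page "/".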

How to find all duplicate (non-unique) titles on the website

  1. Navigate to the crawling report page
  2. Specify the following values in the search menu
    Column Operator Value
    content-type start with text/html
    and title [only_duplicate]
    and http status = 200
  3. Click the Find button.
  4. The results page lists the URLs whose titles are not unique within the crawl.
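
To show what this combination of conditions selects, here is a minimal Python sketch over made-up rows. The dictionary keys and the sample data are hypothetical, and the sketch assumes that duplicates are counted among the rows that pass the other two conditions; the tool may count them across the whole crawl instead.

```python
from collections import Counter

# Hypothetical rows; the keys mirror the report columns but are assumptions.
pages = [
    {"url": "/a", "content_type": "text/html; charset=utf-8", "title": "Home", "http_status": 200},
    {"url": "/b", "content_type": "text/html", "title": "Home", "http_status": 200},
    {"url": "/c", "content_type": "text/html", "title": "About", "http_status": 200},
    {"url": "/d", "content_type": "text/html", "title": "Home", "http_status": 404},
    {"url": "/e.png", "content_type": "image/png", "title": "", "http_status": 200},
]

def duplicate_title_urls(pages):
    """Return URLs of 200-status HTML pages whose title occurs more than once."""
    candidates = [
        p for p in pages
        if p["content_type"].startswith("text/html") and p["http_status"] == 200
    ]
    counts = Counter(p["title"] for p in candidates)
    return [p["url"] for p in candidates if counts[p["title"]] > 1]

duplicates = duplicate_title_urls(pages)
```

In the sample data only "/a" and "/b" are returned: "/c" has a unique title, "/d" does not have status 200, and "/e.png" is not HTML.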

How to find pages that lack a title

  1. Navigate to the crawling report page
  2. Specify the following values in the search menu
    Column Operator Value
    content-type start with text/html
    and title empty
    and http status = 200
    or content-type start with text/html
    and title count = 0
    and http status = 200
  3. Click the Find button.
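
The filter above consists of two groups of conditions joined by "or". The sketch below illustrates that reading in Python. It assumes that "and" binds more tightly than "or", which the layout of the filter suggests but which is not stated, and it uses hypothetical keys ("title", "title_count" and so on) for the report columns.

```python
# Hypothetical rows; the key names are assumptions, not the tool's field names.
pages = [
    {"url": "/a", "content_type": "text/html", "title": "", "title_count": 1, "http_status": 200},
    {"url": "/b", "content_type": "text/html", "title": "", "title_count": 0, "http_status": 200},
    {"url": "/c", "content_type": "text/html", "title": "Home", "title_count": 1, "http_status": 200},
    {"url": "/d", "content_type": "text/html", "title": "", "title_count": 0, "http_status": 404},
]

def is_ok_html(page):
    """Conditions shared by both groups: HTML content type and status 200."""
    return page["content_type"].startswith("text/html") and page["http_status"] == 200

def lacks_title(page):
    # Group 1: the title value is empty.
    group_1 = is_ok_html(page) and not page["title"]
    # Group 2: no title tag was counted on the page.
    group_2 = is_ok_html(page) and page["title_count"] == 0
    return group_1 or group_2

matches = [p["url"] for p in pages if lacks_title(p)]
```

A page is selected when either group matches in full, so "/d" is excluded by its 404 status even though it has no title.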

How to find pages that have a title tag with empty content

  1. Navigate to the crawling report page
  2. Specify the following values in the search menu
    Column Operator Value
    content-type start with text/html
    and title count > 0
    and title str length less 1
    and http status = 200
  3. Click the Find button.
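
This filter differs from the missing-title search in that it requires at least one title tag to be present. A minimal Python sketch of the same conditions follows; the keys are hypothetical, and whether the crawler trims whitespace before measuring the length is not stated, so the sketch does not trim.

```python
# Hypothetical rows; the key names are assumptions, not the tool's field names.
pages = [
    {"url": "/a", "content_type": "text/html", "title": "", "title_count": 1, "http_status": 200},
    {"url": "/b", "content_type": "text/html", "title": "", "title_count": 0, "http_status": 200},
    {"url": "/c", "content_type": "text/html", "title": " ", "title_count": 1, "http_status": 200},
    {"url": "/d", "content_type": "text/html", "title": "Home", "title_count": 1, "http_status": 200},
]

def has_empty_title_tag(page):
    """True for 200-status HTML pages with a title tag whose text has length 0."""
    return (
        page["content_type"].startswith("text/html")
        and page["title_count"] > 0      # the tag exists
        and len(page["title"]) < 1       # but its text is empty (no trimming applied)
        and page["http_status"] == 200
    )

matches = [p["url"] for p in pages if has_empty_title_tag(p)]
```

Only "/a" matches here: "/b" has no title tag at all, and the single space in the title of "/c" counts as one character in this sketch.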

How to find pages that lack an H1

  1. Navigate to the crawling report page
  2. Specify the following values in the search menu
    Column Operator Value
    content-type start with text/html
    and h1 1 empty
    and http status = 200
    or content-type start with text/html
    and h1 count = 0
    and http status = 200
  3. Click the Find button.

How to find all duplicate (non-unique) H1 headings on the website

  1. Navigate to the crawling report page
  2. Specify the following values in the search menu
    Column Operator Value
    content-type start with text/html
    and h1 1 [only_duplicate]
    and http status = 200
  3. Click the Find button.
  4. The results page lists the URLs whose H1 is not unique within the crawl.

How to find all links on the website that use the HTTP protocol instead of HTTPS

  1. Navigate to the crawling report page
  2. Specify the following values in the search menu
    Column Operator Value
    url not start with https
  3. Click the Find button.
  4. The results page lists the URLs that do not start with https.
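
The "not start with" condition is a plain prefix test, which the following Python sketch illustrates. The sample URLs are made up, and the function name is hypothetical.

```python
# Made-up sample URLs for illustration only.
urls = [
    "https://example.com/",
    "http://example.com/a",
    "ftp://example.com/file",
]

def non_https_urls(urls):
    """Return every URL whose text does not begin with the prefix "https"."""
    return [u for u in urls if not u.startswith("https")]

result = non_https_urls(urls)
```

Because this is a prefix test and not a protocol check, any URL with another scheme (such as ftp in the sample) would also be selected if it appeared in the url column.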

In a similar manner, you can find duplicates or missing tags using the following fields

url, found on, redirect to, content-type, last modified, http version, title, h1 1, h1 2, h1 3, h1 4, h1 5, h1 6, h2 1, h2 2, h2 3, h2 4, h2 5, h2 6, h3 1, h3 2, h3 3, h3 4, h3 5, h3 6, meta descriptions, meta keywords, canonical, robots, http canonical, http robots, og title, og descriptions, metrika, google analytics, hreflang, http hreflang

For your convenience, we have prepared quick search templates that will be available in the search section on the report page:

Issues

  • 200 URLs
  • Non 200 URLs
  • 301 Redirects
  • Non-301 Redirects
  • 4xx Client errors
  • 5xx Server errors
  • Timeout
  • HTTPS to HTTP redirect

Indexation

  • HTML Canonical ≠ URL
  • HTML Canonical on non-200
  • HTTP Canonical ≠ URL
  • HTTP Canonical on non-200
  • HTML Meta noindex Pages
  • HTTP noindex Pages
  • URLs with Nosnippet attribute
  • More than one HTML canonical on page
  • HTML canonical is missing
  • HTML canonical is empty
  • HTML canonical from HTTP to HTTPS
  • HTML canonical from HTTPS to HTTP
  • HTTP canonical is missing
  • HTTP canonical is empty
  • HTTP canonical from HTTP to HTTPS
  • HTTP canonical from HTTPS to HTTP
  • HTTP canonical ≠ HTML canonical
  • HTTP robots ≠ HTML robots

Content

  • Duplicate Title
  • Duplicate H1
  • Duplicate Meta description
  • Uppercase Title
  • Uppercase H1
  • More than one Title on page
  • Title is empty
  • Title is missing
  • Title too long
  • Title too short
  • More than one H1 on page
  • H1 is empty
  • H1 is missing
  • H1 too long
  • H1 too short
  • More than one meta description on page
  • Meta description is empty
  • Meta description is missing
  • Meta description too long
  • Meta description too short
