How Scanning Works

A detailed look at how Brokenly scans your site and identifies affiliate links.

Understanding the scan process helps you get the most out of Brokenly.

The Scan Pipeline

1. Sitemap Fetch

Brokenly fetches your sitemap XML file. If it's a sitemap index (a file pointing to multiple sitemaps), we fetch all of them and merge the page list.
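
The merge step can be sketched as follows. This is a minimal illustration, not Brokenly's actual implementation: the function name `collect_page_urls` and the injected `fetch` callable (which takes a URL and returns the raw XML bytes) are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(sitemap_url: str, fetch: Callable[[str], bytes]) -> List[str]:
    """Return the page URLs in a sitemap, expanding sitemap indexes.

    `fetch` is a hypothetical helper that downloads a URL and returns
    its body as bytes; it is passed in so the logic can be tested offline.
    """
    root = ET.fromstring(fetch(sitemap_url))
    locs = [el.text.strip() for el in root.iter(f"{SITEMAP_NS}loc") if el.text]

    if root.tag == f"{SITEMAP_NS}sitemapindex":
        # Each <loc> points at another sitemap: fetch each one and merge.
        pages: List[str] = []
        for child_url in locs:
            pages.extend(collect_page_urls(child_url, fetch))
        # Drop duplicates while keeping the original order.
        return list(dict.fromkeys(pages))

    # A plain <urlset>: each <loc> is a page URL.
    return locs
```

A page listed in more than one child sitemap appears only once in the merged list.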

2. Page Crawl

Each page in your sitemap is visited with a headless browser. This ensures JavaScript-rendered content is captured, which matters for sites that use affiliate link loaders or click trackers.

3. Link Extraction

Every outbound link on the page is extracted. Brokenly identifies affiliate links using:

  • Known affiliate network URL patterns (Amazon Associates, ShareASale, CJ, Impact, etc.)
  • Custom domains or paths you configure in your site settings
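
As a rough illustration of both rules, the sketch below classifies a single URL. The patterns shown are a small, illustrative subset chosen for the example and are not Brokenly's actual rule set. The function name `is_affiliate_link` and the parameters `custom_domains` and `custom_path_prefixes` are hypothetical.

```python
from typing import Iterable
from urllib.parse import parse_qs, urlparse

# Illustrative examples only (assumed for this sketch, not an exhaustive list).
CJ_TRACKING_DOMAINS = {"anrdoezrs.net", "jdoqocy.com", "tkqlhce.com", "dpbolvw.net", "kqzyfj.com"}


def _host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def is_affiliate_link(
    url: str,
    custom_domains: Iterable[str] = (),
    custom_path_prefixes: Iterable[str] = (),
) -> bool:
    parts = urlparse(url)
    host = (parts.hostname or "").lower()
    query = parse_qs(parts.query)

    # Amazon Associates: short links, or a tag= parameter on an Amazon domain.
    if _host_matches(host, "amzn.to"):
        return True
    if "amazon" in host.split(".") and "tag" in query:
        return True

    # ShareASale: tracking links go through /r.cfm.
    if _host_matches(host, "shareasale.com") and parts.path.startswith("/r.cfm"):
        return True

    # CJ: links use dedicated tracking domains.
    if any(_host_matches(host, d) for d in CJ_TRACKING_DOMAINS):
        return True

    # User-configured domains and path prefixes (hypothetical settings).
    if any(_host_matches(host, d.lower()) for d in custom_domains):
        return True
    return any(parts.path.startswith(p) for p in custom_path_prefixes)
```

For example, `is_affiliate_link("https://example.com/go/widget", custom_path_prefixes=("/go/",))` returns `True`, while the same URL with no custom configuration returns `False`.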

4. Link Check

Each identified affiliate link is checked with an HTTP request. We follow up to 5 redirect hops and record the final destination URL and HTTP status code.
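
The hop limit can be sketched like this. It is a simplified model rather than Brokenly's real checker: `check_link`, `CheckResult`, and the injected `fetch` callable (one request, no automatic redirect following, returning the status code and any `Location` header) are names assumed for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

MAX_HOPS = 5
REDIRECT_CODES = {301, 302, 303, 307, 308}

# Hypothetical transport: performs ONE request and returns
# (status_code, Location header or None) without following redirects.
Fetch = Callable[[str], Tuple[int, Optional[str]]]


@dataclass
class CheckResult:
    final_url: str
    status_code: Optional[int]  # None when the request itself failed
    hops: int


def check_link(url: str, fetch: Fetch, max_hops: int = MAX_HOPS) -> CheckResult:
    current, hops = url, 0
    while True:
        try:
            status, location = fetch(current)
        except OSError:
            # Network-level failure (DNS, timeout, refused connection).
            return CheckResult(current, None, hops)

        if status in REDIRECT_CODES and location and hops < max_hops:
            # Location may be relative, so resolve it against the current URL.
            current = urljoin(current, location)
            hops += 1
            continue

        # Either a final response, or the hop limit was reached.
        return CheckResult(current, status, hops)
```

When a chain is longer than the limit, this sketch stops and reports the last redirect response it saw instead of continuing to follow it.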

5. Status Assignment

Based on the HTTP response, each link is assigned a status: Healthy, Broken, Redirected, or Unknown.
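
The exact rules behind each status are not spelled out here, so the mapping below is only one plausible reading, shown to make the idea concrete. The function name `assign_status`, its `hops` parameter (the number of redirects followed), and the thresholds are all assumptions.

```python
from typing import Optional


def assign_status(status_code: Optional[int], hops: int = 0) -> str:
    """Map a link check outcome to a status label (assumed rules)."""
    if status_code is None:
        # No HTTP response at all, e.g. a timeout or DNS failure.
        return "Unknown"
    if 200 <= status_code < 300:
        # Success: distinguish direct hits from links that were redirected.
        return "Redirected" if hops > 0 else "Healthy"
    if 400 <= status_code < 600:
        # Client or server error at the final destination.
        return "Broken"
    # Anything else, such as an unresolved redirect chain.
    return "Unknown"
```

Under these assumed rules, a link that returns 200 after two redirects is labeled Redirected, and a 404 is labeled Broken.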

Scan Duration

Scan time depends on:

  • Number of pages in your sitemap
  • Number of affiliate links per page
  • Response time of the affiliate networks

Most sites complete a full scan in under 10 minutes.

What Brokenly Does Not Index

  • Pages not listed in your sitemap
  • Links inside iframes
  • Links generated dynamically after page load via JavaScript (unless using a pre-rendered page)
  • Password-protected pages