How Scanning Works
A detailed look at how Brokenly scans your site and identifies affiliate links.
Understanding the scan process helps you get the most out of Brokenly.
The Scan Pipeline
1. Sitemap Fetch
Brokenly fetches your sitemap XML file. If it is a sitemap index (a file that points to multiple sitemaps), Brokenly fetches every sitemap it lists and merges their page lists.
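The fetch-and-merge step can be sketched roughly as follows. This is a minimal illustration, not Brokenly's actual code: the function name `collect_page_urls` and the injected `fetch` callable (which takes a URL and returns the XML text) are assumptions made for the example. The XML namespace is the standard one from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# Standard namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def collect_page_urls(sitemap_url: str, fetch: Callable[[str], str]) -> list[str]:
    """Return the merged, de-duplicated page list for a sitemap or sitemap index.

    `fetch` is a hypothetical helper that returns the XML text for a URL.
    """
    root = ET.fromstring(fetch(sitemap_url))
    pages: list[str] = []

    if root.tag == f"{SITEMAP_NS}sitemapindex":
        # A sitemap index lists child sitemaps: fetch each one and merge.
        for loc in root.findall(f"{SITEMAP_NS}sitemap/{SITEMAP_NS}loc"):
            child_url = (loc.text or "").strip()
            if child_url:
                pages.extend(collect_page_urls(child_url, fetch))
    else:
        # A regular <urlset>: every <url>/<loc> entry is a page.
        for loc in root.findall(f"{SITEMAP_NS}url/{SITEMAP_NS}loc"):
            page_url = (loc.text or "").strip()
            if page_url:
                pages.append(page_url)

    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys(pages))
```

Passing `fetch` in as a parameter keeps the parsing logic independent of any particular HTTP client, so it can be exercised with canned XML.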
2. Page Crawl
Each page in your sitemap is loaded in a headless browser, so JavaScript-rendered content is captured. This matters for sites that use affiliate link loaders or click trackers.
3. Link Extraction
Every outbound link on the page is extracted. Brokenly identifies affiliate links using:
- Known affiliate network URL patterns (Amazon Associates, ShareASale, CJ, Impact, etc.)
- Custom domains or paths you configure in your site settings
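As a rough sketch of how the two identification rules above might combine, the example below checks a link against a small table of network patterns and then against user-configured domains and paths. The patterns shown (an Amazon `tag` query parameter, the `shareasale.com` host) are illustrative assumptions, not Brokenly's actual rule set, and `classify_link` is a hypothetical name.

```python
from typing import Optional, Sequence
from urllib.parse import parse_qs, urlparse

# Illustrative host patterns only; the real pattern list is not documented here.
KNOWN_NETWORK_HOSTS = {
    "shareasale.com": "ShareASale",
    "amzn.to": "Amazon Associates",
}


def _host_matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


def classify_link(
    url: str,
    custom_domains: Sequence[str] = (),
    custom_paths: Sequence[str] = (),
) -> Optional[str]:
    """Return a network name, "custom", or None if the link is not an affiliate link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower().removeprefix("www.")

    # Assumed rule: an amazon.com link carrying a `tag` parameter is an Associates link.
    if _host_matches(host, "amazon.com") and "tag" in parse_qs(parsed.query):
        return "Amazon Associates"

    for domain, network in KNOWN_NETWORK_HOSTS.items():
        if _host_matches(host, domain):
            return network

    # User-configured domains or path prefixes (for example "/go/").
    if any(_host_matches(host, d.lower()) for d in custom_domains):
        return "custom"
    if any(parsed.path.startswith(prefix) for prefix in custom_paths):
        return "custom"

    return None
```

For example, `classify_link("https://example.com/go/widget", custom_paths=("/go/",))` returns `"custom"`, while a plain link with no matching pattern returns `None`.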
4. Link Verification
Each identified affiliate link is checked with an HTTP request. We follow up to 5 redirect hops and record the final destination URL and HTTP status code.
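The redirect-following behavior can be illustrated with the sketch below. It is written under stated assumptions: `verify_link`, `VerificationResult`, and the injected `fetch` callable (which returns a status code and the `Location` header, if any, without following redirects itself) are hypothetical names for this example. Only the limit of 5 hops comes from the description above.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple
from urllib.parse import urljoin

MAX_HOPS = 5  # limit stated in the documentation
REDIRECT_CODES = {301, 302, 303, 307, 308}


@dataclass
class VerificationResult:
    final_url: str           # last URL that was requested
    status_code: int         # status returned by that last request
    hops: int                # number of redirects followed
    hop_limit_reached: bool  # True if a further redirect was left unfollowed


def verify_link(
    url: str,
    fetch: Callable[[str], Tuple[int, Optional[str]]],
) -> VerificationResult:
    """Follow redirects manually, stopping after MAX_HOPS hops."""
    current = url
    hops = 0
    while True:
        status, location = fetch(current)
        is_redirect = status in REDIRECT_CODES and bool(location)
        if not is_redirect:
            return VerificationResult(current, status, hops, False)
        if hops == MAX_HOPS:
            # Another redirect is pending, but the hop budget is spent.
            return VerificationResult(current, status, hops, True)
        # Location may be relative, so resolve it against the current URL.
        current = urljoin(current, location)
        hops += 1
```

In a real checker, `fetch` would wrap an HTTP client with automatic redirect handling turned off, so that each hop can be counted and recorded.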
5. Status Assignment
Based on the HTTP response, each link is assigned a status: Healthy, Broken, Redirected, or Unknown.
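The exact rules that map a response to a status are not spelled out here, so the following is only one plausible mapping, shown to make the four statuses concrete. Every threshold in it is an assumption: 4xx and 5xx responses count as Broken, a 2xx response reached through at least one redirect counts as Redirected, a direct 2xx counts as Healthy, and anything else (including no response at all) counts as Unknown. The names `LinkStatus` and `assign_status` are hypothetical.

```python
from enum import Enum
from typing import Optional


class LinkStatus(Enum):
    HEALTHY = "Healthy"
    BROKEN = "Broken"
    REDIRECTED = "Redirected"
    UNKNOWN = "Unknown"


def assign_status(status_code: Optional[int], hops: int = 0) -> LinkStatus:
    """Map a final HTTP status and redirect count to a link status.

    Assumed rules for illustration only; Brokenly's real rules may differ.
    """
    if status_code is None:
        # No HTTP response was received (timeout, DNS failure, and so on).
        return LinkStatus.UNKNOWN
    if 400 <= status_code < 600:
        return LinkStatus.BROKEN
    if 200 <= status_code < 300:
        return LinkStatus.REDIRECTED if hops > 0 else LinkStatus.HEALTHY
    # Unfollowed redirects and other unexpected codes.
    return LinkStatus.UNKNOWN
```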
Scan Duration
Scan time depends on:
- Number of pages in your sitemap
- Number of affiliate links per page
- Response time of the affiliate networks
Most sites complete a full scan in under 10 minutes.
What Brokenly Does Not Index
- Pages not listed in your sitemap
- Links inside iframes
- Links generated dynamically by JavaScript after page load (unless the page is pre-rendered)
- Password-protected pages