
How I Test Scraper APIs

Every recommendation on this site traces back to a job I actually ran. This page is the process behind the numbers, so you can judge the testing before you trust the verdict.

What I test and why

I test scraper APIs, scraping libraries, and no-code tools against the targets people actually scrape: large retailers, search engines, marketplaces, and social platforms. A tool earns coverage here when readers ask about it or when it ranks for the queries my guides answer. I run the same job on every tool in a comparison so the results sit on equal footing.

The benchmark setup

When I benchmark a category, I send real requests under controlled conditions: the same target URLs, the same day, and the same request volume across every tool. I run each tool against an easy target and a hard one, because a service that breezes through a static page can still collapse on a site with serious anti-bot defenses. I log every response so the run can be repeated.
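As a rough illustration of the setup above, here is a minimal Python sketch of a harness that sends the same URLs to every tool and logs one record per response. It is a sketch under stated assumptions, not the harness behind the published numbers: `run_benchmark`, `write_log`, the `Record` fields, and the idea that each tool is wrapped in a `fetch(url)` callable returning `(status, body)` are all hypothetical names chosen for the example.

```python
import json
import time
from dataclasses import dataclass, asdict


@dataclass
class Record:
    tool: str         # name of the tool under test
    target: str       # target label, e.g. "easy" or "hard"
    url: str
    status: int       # HTTP status, or 0 if the call raised
    ok: bool          # True only if the body held the expected data
    latency_s: float  # end-to-end time for this request


def run_benchmark(tools, targets, is_valid, clock=time.perf_counter):
    """Run every tool against the same URLs and return one Record per request.

    tools:    dict of tool name -> fetch(url) returning (status, body)  [assumed shape]
    targets:  dict of target label -> list of URLs
    is_valid: function(body) -> bool, rejects block pages and empty bodies
    """
    records = []
    for tool_name, fetch in tools.items():
        for label, urls in targets.items():
            for url in urls:
                start = clock()
                try:
                    status, body = fetch(url)
                except Exception:
                    # A crash or timeout counts as a failed request.
                    status, body = 0, ""
                latency = clock() - start
                ok = status == 200 and is_valid(body)
                records.append(Record(tool_name, label, url, status, ok, latency))
    return records


def write_log(records, path):
    """Write one JSON object per line so a run can be inspected later."""
    with open(path, "w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(asdict(record)) + "\n")
```

Passing the fetchers in as plain callables keeps the loop identical for every tool, which is the point of the controlled setup: only the tool changes, never the URLs or the volume.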

What I measure

I record three numbers that decide whether a scraper is worth paying for:

Metric	What it tells you
Success rate	Share of requests that returned the data I asked for, not a block page or empty body.
Median latency	How long a typical request takes end to end, which sets your throughput.
Effective cost per 1,000	Real price for 1,000 successful results, after accounting for failed requests and how the plan's credits convert into requests.

Effective cost matters more than sticker price. A cheap plan with a 60% success rate often costs more per usable record than a pricier one that succeeds 95% of the time.
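The arithmetic behind that comparison can be sketched in a few lines of Python. The prices below are made-up figures for illustration, not measured results, and the formula assumes the vendor bills every request, including failed ones. Where a vendor bills only successful requests, effective cost equals the sticker price. The function names are hypothetical.

```python
from statistics import median


def success_rate(outcomes):
    """Share of requests that returned usable data (outcomes is a list of bools)."""
    return sum(outcomes) / len(outcomes)


def median_latency(latencies_s):
    """Typical end-to-end request time in seconds."""
    return median(latencies_s)


def effective_cost_per_1000(price_per_1000_requests, rate):
    """Price of 1,000 successful results, assuming failed requests are billed too."""
    if rate <= 0:
        raise ValueError("success rate must be above zero")
    return price_per_1000_requests / rate


# Hypothetical plans: a cheap one at 60% success and a pricier one at 95%.
cheap = effective_cost_per_1000(1.00, 0.60)
pricier = effective_cost_per_1000(1.50, 0.95)
print(round(cheap, 2), round(pricier, 2))  # 1.67 1.58
```

In this example the plan with the lower sticker price ends up costing about 9 cents more per 1,000 usable records.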

How I score and rank

On pages with benchmark data, tools are ordered by measured results. I score each one on price, performance, features, support, and docs, then give an overall figure. Where a tool falls short, I write down exactly where, because I don't trust a review that documents no weaknesses. A tool's commercial relationship with this site never moves it up the table; a faster competitor beats it on the numbers regardless.

When I can't run a test

Some claims can't be benchmarked from the outside, such as a vendor's total proxy pool or uptime over a year. When I use a figure I didn't measure, I attribute it to its source and label it an estimate. I never dress up someone else's number as my own test result.

How often I re-test

Pricing and block rates drift, so benchmark pages carry the month I ran them and get re-run on a schedule. When a result changes enough to change a recommendation, I update the page and move its date forward.

Marcus Reed

I've built and run web scrapers for the better part of a decade. On this site I put scraper APIs and scraping tools through real jobs against real targets, then write up what actually holds up. See also the methodology and independence page.