Build reliable web scrapers. Fast.

Crawlee is a web scraping library for JavaScript and Python. It handles blocking, crawling, proxies, and browsers for you.

Crawlee JavaScript: npx crawlee create my-crawler
Crawlee Python: pipx run crawlee create my-crawler
import { PlaywrightCrawler } from 'crawlee';

// PlaywrightCrawler crawls the web using a headless browser controlled by the Playwright library.
const crawler = new PlaywrightCrawler({
    // Use the requestHandler to process each of the crawled pages.
    async requestHandler({ request, page, enqueueLinks, pushData, log }) {
        const title = await page.title();
        log.info(`Title of ${request.loadedUrl} is '${title}'`);

        // Save results as JSON to the `./storage/datasets/default` directory.
        await pushData({ title, url: request.loadedUrl });

        // Extract links from the current page and add them to the crawling queue.
        await enqueueLinks();
    },

    // Uncomment this option to see the browser window.
    // headless: false,

    // Comment out this option to scrape the full website.
    maxRequestsPerCrawl: 20,
});

// Add first URL to the queue and start the crawl.
await crawler.run(['https://crawlee.dev']);

// Export the whole dataset to a single file in `./result.csv`.
await crawler.exportData('./result.csv');

// Or work with the data directly.
const data = await crawler.getData();
console.table(data.items);
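
To make the roles of enqueueLinks() and maxRequestsPerCrawl in the example above more concrete, here is a small dependency-free sketch of the general idea: a queue of pending URLs, a set that prevents the same URL from being enqueued twice, and a cap on the number of processed requests. This is not how Crawlee is implemented. The names crawlSketch, getLinks, and fakeSite are hypothetical and exist only for this illustration, and the "site" is an in-memory object rather than real pages.

```javascript
// Conceptual sketch only: a breadth-first crawl over an in-memory link graph.
// `crawlSketch`, `getLinks`, and `fakeSite` are hypothetical names, not Crawlee APIs.
function crawlSketch(startUrls, getLinks, maxRequests) {
    const queue = [...startUrls];      // URLs waiting to be processed
    const seen = new Set(startUrls);   // every URL already enqueued, used for deduplication
    const visited = [];                // URLs processed so far, in order

    // Stop when the queue is empty or the request cap is reached
    // (the cap plays a role similar to maxRequestsPerCrawl).
    while (queue.length > 0 && visited.length < maxRequests) {
        const url = queue.shift();
        visited.push(url);

        // Similar in spirit to enqueueLinks(): add each newly found link once.
        for (const link of getLinks(url)) {
            if (!seen.has(link)) {
                seen.add(link);
                queue.push(link);
            }
        }
    }
    return visited;
}

// A made-up four-page site: each key lists the links found on that page.
const fakeSite = {
    '/': ['/docs', '/blog'],
    '/docs': ['/', '/docs/intro'],
    '/blog': ['/docs'],
    '/docs/intro': [],
};

const visited = crawlSketch(['/'], (url) => fakeSite[url] ?? [], 3);
console.log(visited);
```

With the cap set to 3, the sketch stops before reaching '/docs/intro'. Raising the cap lets it visit all four pages, each exactly once, even though the pages link back to one another.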
Or start with a template from our CLI
$ npx crawlee create my-crawler
Built with 🤍 by Apify. Forever free and open-source.

Get started now!

Crawlee won’t fix broken selectors for you (yet), but it makes building and maintaining reliable crawlers faster and easier—so you can focus on what matters most.