Build reliable web scrapers. Fast.
Crawlee is a web scraping library for JavaScript and Python. It handles blocking, crawling, proxies, and browsers for you.
npx crawlee create my-crawler
pipx run crawlee create my-crawler
import { PlaywrightCrawler } from 'crawlee';

// PlaywrightCrawler crawls the web using a headless browser controlled by the Playwright library.
const crawler = new PlaywrightCrawler({
    // Use the requestHandler to process each of the crawled pages.
    async requestHandler({ request, page, enqueueLinks, pushData, log }) {
        const title = await page.title();
        log.info(`Title of ${request.loadedUrl} is '${title}'`);

        // Save results as JSON to the `./storage/datasets/default` directory.
        await pushData({ title, url: request.loadedUrl });

        // Extract links from the current page and add them to the crawling queue.
        await enqueueLinks();
    },

    // Uncomment this option to see the browser window.
    // headless: false,

    // Comment out this option to crawl the full website.
    maxRequestsPerCrawl: 20,
});

// Add the first URL to the queue and start the crawl.
await crawler.run(['https://crawlee.dev']);

// Export the whole dataset to a single file at `./result.csv`.
await crawler.exportData('./result.csv');

// Or work with the data directly.
const data = await crawler.getData();
console.table(data.items);
Or start with a template from our CLI
npx crawlee create my-crawler
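The comments in the example above mention a crawling queue fed by `enqueueLinks()` and a cap set by `maxRequestsPerCrawl`. As a rough illustration of that idea, here is a dependency-free sketch of a queue that skips URLs it has already seen and stops at a request limit. This is not Crawlee's implementation: `crawlSketch`, `linksOf`, and the `site` object are hypothetical names invented for this example, and the "site" is just an in-memory map rather than real pages.

```javascript
// Conceptual sketch only; NOT how Crawlee is implemented internally.
// `crawlSketch`, `linksOf`, and `site` are hypothetical names for illustration.

/**
 * Breadth-first crawl over a link graph.
 * @param {string[]} startUrls - URLs used to seed the queue.
 * @param {(url: string) => string[]} getLinks - returns the links found on a page.
 * @param {number} maxRequests - stop after processing this many pages.
 * @returns {string[]} URLs in the order they were processed.
 */
function crawlSketch(startUrls, getLinks, maxRequests = Infinity) {
    const seen = new Set(startUrls); // every URL ever enqueued, used for deduplication
    const queue = [...seen];         // pending URLs, first in, first out
    const processed = [];

    while (queue.length > 0 && processed.length < maxRequests) {
        const url = queue.shift();
        processed.push(url);

        // Enqueue each newly discovered link exactly once.
        for (const link of getLinks(url)) {
            if (!seen.has(link)) {
                seen.add(link);
                queue.push(link);
            }
        }
    }
    return processed;
}

// A tiny fake site: each key is a page, each value lists the links on that page.
const site = {
    'https://example.com/': ['https://example.com/a', 'https://example.com/b'],
    'https://example.com/a': ['https://example.com/', 'https://example.com/b'],
    'https://example.com/b': ['https://example.com/c'],
    'https://example.com/c': [],
};
const linksOf = (url) => site[url] ?? [];

// Crawl everything reachable from the start page.
console.log(crawlSketch(['https://example.com/'], linksOf));

// Crawl with a cap of two requests, similar in spirit to `maxRequestsPerCrawl`.
console.log(crawlSketch(['https://example.com/'], linksOf, 2));
```

Even though pages link back to each other, each URL is processed once, and the cap ends the crawl early while links are still waiting in the queue.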
Built with 🤍 by Apify. Forever free and open-source.
Get started now!
Crawlee won’t fix broken selectors for you (yet), but it makes building and maintaining reliable crawlers faster and easier—so you can focus on what matters most.