Firecrawl: A Simple Web Scraping Script
A command-line utility for scraping web content through the Firecrawl API with automatic file naming and content cleaning
Code is available via gist here
A command-line utility for scraping web content through the Firecrawl API. The script processes URLs—either individually or in bulk from a text file—and outputs clean, formatted content with automatic file naming based on page titles.
Beyond basic scraping, it handles the tedious aspects of web content extraction: stripping navigation elements, removing duplicate content, and managing filename collisions.
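To make the naming behavior concrete, here is a minimal sketch of title-based file naming with collision handling; the helper name and exact slug rules are illustrative, not the script's actual internals:

```python
import re
from pathlib import Path


def filename_from_title(title: str, out_dir: Path) -> Path:
    """Hypothetical helper: turn a page title into a safe, unique markdown filename."""
    # Slugify: lowercase, collapse anything non-alphanumeric into single dashes
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-") or "untitled"
    path = out_dir / f"{slug}.md"
    # On collision, append an incrementing suffix: slug-1.md, slug-2.md, ...
    counter = 1
    while path.exists():
        path = out_dir / f"{slug}-{counter}.md"
        counter += 1
    return path
```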
The script maintains a focused scope: it takes a URL or file of URLs as input, processes them through Firecrawl’s API endpoints, and saves the results in a specified directory (defaulting to ./scraped when no output location is provided).
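A rough sketch of that command-line surface, assuming argparse; the -o flag matches the usage examples below, but the long option name and the helper function are hypothetical:

```python
import argparse
from pathlib import Path


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Scrape URLs through the Firecrawl API")
    # Positional input: either a single URL or a path to a text file of URLs
    parser.add_argument("target", help="a URL or a text file containing URLs, one per line")
    # Output directory, defaulting to ./scraped as noted above
    parser.add_argument("-o", "--output", type=Path, default=Path("./scraped"),
                        help="directory to write scraped files to")
    return parser.parse_args()


def collect_urls(target: str) -> list[str]:
    """Treat the argument as a file of URLs if it exists on disk, otherwise as a single URL."""
    path = Path(target)
    if path.is_file():
        return [line.strip() for line in path.read_text().splitlines() if line.strip()]
    return [target]
```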
It also uses inline dependencies (PEP 723 script metadata) so the script is self-contained and can be run anywhere with minimal setup.
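Concretely, that means a PEP 723 metadata block at the top of the file, which uv reads and resolves into a throwaway environment before running. The requirements shown here are an assumed example, not copied from the gist:

```python
# /// script
# requires-python = ">=3.12"  # assumed minimum version
# dependencies = [
#     "requests",  # assumed HTTP client for the Firecrawl API calls
# ]
# ///
```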
With the magic of uv, you can even run this via a gist URL, making it extremely portable.
Usage
Locally
# Set your API key
export FIRE_CRAWL_API_KEY=your-api-key-here
# Run with a single URL
uv run firecrawl_scrape.py https://example.com
# Or with a file containing URLs
uv run firecrawl_scrape.py urls.txt
# Optionally specify output directory (default: ./scraped)
uv run firecrawl_scrape.py urls.txt -o ./my-scrapes
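The URLs file is assumed to be plain text with one URL per line (blank lines ignored), for example:

```text
https://example.com/first-post
https://example.com/second-post
```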
Via gist
# Set your API key
export FIRE_CRAWL_API_KEY=your-api-key-here
# Run with a single URL
uv run https://gist.githubusercontent.com/safurrier/8714235a36a5dc502a8f4b2edb98ece3/raw/969f25a37895943725e8a42cae6a219bda3565fa/firecrawl_scrape.py https://example.com
# Or with a file containing URLs
uv run https://gist.githubusercontent.com/safurrier/8714235a36a5dc502a8f4b2edb98ece3/raw/969f25a37895943725e8a42cae6a219bda3565fa/firecrawl_scrape.py urls.txt
# Optionally specify output directory (default: ./scraped)
uv run https://gist.githubusercontent.com/safurrier/8714235a36a5dc502a8f4b2edb98ece3/raw/969f25a37895943725e8a42cae6a219bda3565fa/firecrawl_scrape.py urls.txt -o ./my-scrapes
A simple script with a simple purpose. Plus, by running it via the gist URL, you have it available anywhere, anytime.