Firecrawl v2.10 Adds File Parsing, Lockdown Mode, and Smarter Scrape Formats

Firecrawl v2.10 ships a set of changes that matter across the full data-ingestion pipeline, from local files to live pages to search results. Here is what changed and what you should plug in today.

The new /parse endpoint lets you upload local files (PDF, DOCX, DOC, ODT, RTF, XLSX, XLS, and HTML) up to 50 MB and get back clean Markdown, JSON, or a summary. Tables and reading order are preserved. Enterprise plans get full Zero Data Retention support. This closes a gap that previously forced teams to run separate document-processing pipelines before feeding content to an LLM.

Lockdown Mode adds lockdown: true to /scrape. When set, results come exclusively from Firecrawl's index. Zero outbound requests are made. Zero data retention applies by default. Gated paths include HTTP fetches, robots.txt, audio downloads, and media. This is useful for regulated environments or any pipeline where you cannot afford unpredictable outbound calls. It is available in every SDK, the CLI (--lockdown), and MCP.

Two new scrape formats target token efficiency directly. The question format accepts a natural-language prompt and returns a grounded answer in data.question, using up to 100x fewer tokens per call. It runs on a managed model chain with automatic fallback and includes prompt-injection isolation via XML tagging and zero-width-space escaping. The highlights format returns the exact sentences, code blocks, and table rows that match your query, also using up to 100x fewer tokens. Consecutive sentences re-join into paragraphs, code lines wrap in fenced blocks with their original language, and table rows rebuild into Markdown tables with headers.

A video format is now available in scrape formats. It returns a signed downloadable video URL for supported sites, with cookie forwarding for authenticated downloads and explicit Lockdown gating.

Search gets two practical additions: includeDomains and excludeDomains parameters for scoping results to specific sites, and a feedback endpoint (POST /v2/search/:jobId/feedback) that refunds 1 credit per accepted rating, capped per UTC day, with idempotent retries.

On the SDK side, the official Go SDK replaces the community module and includes context-aware retry backoff and proper MapData.Links typing. Official Ruby and PHP SDKs round out the new additions. The /parse endpoint and most other features are available across JS, Python, Go, Rust, Java, .NET, PHP, Ruby, and Elixir SDKs.

The custom robotsUserAgent option lets crawl requests evaluate robots.txt rules and crawl delays against a specific agent string, separate from the ignoreRobots flag.

If you are building a RAG pipeline today, the most immediate move is to swap your document preprocessing step for the /parse endpoint and test the question or highlights formats on your highest-volume scrape calls. The token savings alone make both worth benchmarking against your current approach.