firecrawl_extract
on npm:firecrawl-mcp@3.20.2
Severity
2 findings on this tool
- mediumunconstrained inputf-r1-firecrawl_extract
Tool `firecrawl_extract` accepts unconstrained string input
The following string parameter(s) have no `maxLength` constraint: `prompt`. Unbounded strings let an attacker stuff arbitrary payloads through the tool, including indirect-injection content.
fix: Add a `maxLength` to each string property, or constrain with an `enum` or `pattern`. Most legitimate tool inputs fit under a few hundred bytes.
OWASP LLM01NIST MEASURE-2.3ATLAS T0051 - mediumindirect injectionf-r6-firecrawl_extract
Tool `firecrawl_extract` fetches external web content -- indirect-injection surface
Description: " Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction. **Best for:** Extracting specific structured data like prices, names, details from web pages. **Not recommended for:** When you need the full content of a page (use scrape); when you're not looking for specific structured data. **Arguments:** - urls: Array of URLs to extract information from - prompt: Custom prompt for the LLM extraction - schema: JSON schema for structured data extraction - allowExternalLinks: Allow extraction from external links - enableWebSearch: Enable web search for additional context - includeSubdomains: Include subdomains in extraction **Prompt Example:** "Extract the product name, price, and description from these product pages." **Usage Example:** ```json { "name": "firecrawl_extract", "arguments": { "urls": ["https://example.com/page1", "https://example.com/page2"], "prompt": "Extract product information including name, price, and description", "schema": { "type": "object", "properties": { "name": { "type": "string" }, "price": { "type": "number" }, "description": { "type": "string" } }, "required": ["name", "price"] }, "allowExternalLinks": false, "enableWebSearch": false, "includeSubdomains": false } } ``` **Returns:** Extracted structured data as defined by your schema. " -- this tool pulls externally-controlled content into the agent's context window, the canonical indirect-injection vector. Even when the user supplies the URL, content at that URL can carry hostile instructions.
fix: Sandbox the fetched content: strip prompts before forwarding to the model, constrain to an allow-list of domains, and route through capframe-guard with a `domain in [...]` caveat.
OWASP LLM01NIST MEASURE-2.3ATLAS T0051
About this tool
firecrawl_extract is one of 20 tools exposed by Firecrawl MCP. The server scored 0/100 overall against the capframe rule engine (source: sandbox). Last scanned 2026-06-05.
The findings above are emitted by the public capframe.findings.v1 schema. Disagree with one? Open an issue.