firecrawl-mcp
npm:firecrawl-mcp@3.20.1
github.com/firecrawl/firecrawl-mcp-server
Severity breakdown
Worst finding
Tool `Call` fetches external web content -- indirect-injection surface
· Call
Description: "`firecrawl_agent` with your prompt/schema → returns job ID" -- this tool pulls externally-controlled content into the agent's context window, the canonical indirect-injection vector. Even when the user supplies the URL, content at that URL can carry hostile instructions.
fix: Sandbox the fetched content: strip prompts before forwarding to the model, constrain to an allow-list of domains, and route through capframe-guard with a `domain in [...]` caveat.
All 3 findings
- mediumTool `Call` fetches external web content -- indirect-injection surface· Callindirect injection
Description: "`firecrawl_agent` with your prompt/schema → returns job ID" -- this tool pulls externally-controlled content into the agent's context window, the canonical indirect-injection vector. Even when the user supplies the URL, content at that URL can carry hostile instructions.
fix: Sandbox the fetched content: strip prompts before forwarding to the model, constrain to an allow-list of domains, and route through capframe-guard with a `domain in [...]` caveat.
- mediumTool `Poll` fetches external web content -- indirect-injection surface· Pollindirect injection
Description: "`firecrawl_agent_status` with the job ID to check progress" -- this tool pulls externally-controlled content into the agent's context window, the canonical indirect-injection vector. Even when the user supplies the URL, content at that URL can carry hostile instructions.
fix: Sandbox the fetched content: strip prompts before forwarding to the model, constrain to an allow-list of domains, and route through capframe-guard with a `domain in [...]` caveat.
- mediumTool `When` fetches external web content -- indirect-injection surface· Whenindirect injection
Description: "status is "completed", the response includes the extracted data **Best for:** - Complex research tasks where you don't know the exact URLs - Multi-source data gathering - Finding information scattered across the web - Tasks where you can do other work while waiting for results **Not recommended for:** - Simple single-page scraping where you know the URL (use scrape with JSON format - faster and cheaper) **Arguments:** - `prompt`: Natural language description of the data you want (required, max 10,000 characters) - `urls`: Optional array of URLs to focus the agent on specific pages - `schema`: Optional JSON schema for structured output **Prompt Example:** > "Find the founders of Firecrawl and their backgrounds" **Usage Example (start agent, then poll for results):** ```json { "name": "fi..." -- this tool pulls externally-controlled content into the agent's context window, the canonical indirect-injection vector. Even when the user supplies the URL, content at that URL can carry hostile instructions.
fix: Sandbox the fetched content: strip prompts before forwarding to the model, constrain to an allow-list of domains, and route through capframe-guard with a `domain in [...]` caveat.
How this was scored
Source registry — tool surface extracted from the package's README + manifest (R3/R5/R6/R7 fire; schema-dependent rules deferred). Findings are emitted by the public capframe.findings.v1 schema. Score = 100 − (10·Critical + 4·High + 2·Medium + 1·Low), clamped to [0, 100].
Disagree with a finding? Open an issue.