n8n-Tutorial: Scraping websites with n8n and MCP

In this quick tutorial, we show you how to extract content from websites by scraping. We use a locally hosted n8n workflow and the scraping solution Firecrawl, connected via an MCP server. The tutorial is aimed at ambitious beginners and advanced users.

Scraping website data with n8n and Firecrawl

Disclaimer: Website scraping requires the consent of the respective website operator. These instructions are intended solely as a technical illustration of the methodology. Make sure you have a legal basis for extracting the data before you scrape a website.

For our scraping setup, we use the automation solution n8n, which we host on our own computer. To make the scraper versatile, we set up a chat in which the user simply enters scraping instructions as a prompt. The scraping itself is then carried out by Firecrawl.

Checklist: Set up scraping workflow

Step 1: Set up a Firecrawl account and copy the API key (Firecrawl)
Step 2: Install n8n locally (instructions)
Step 3: Install n8n MCP node
Step 4: Set up n8n credentials for Firecrawl (via MCP connection)
Step 5: Set up n8n nodes
Step 6: Start chat in n8n and scrape

Result and advantages: Simple scraping of a website

At the end of the tutorial, you will be able to extract the articles of a website via chat, in the format you want, within a few seconds. Experienced users will need around 30 minutes for the setup.

Let’s start with the setup in n8n and Firecrawl.

Step 1: Set up Firecrawl

Firecrawl is a cost-effective and powerful scraping solution that is flexible and easy to use. Create a free trial account and set up an API key.

Firecrawl is a flexible and inexpensive scraping solution with good connectivity to your own tools and workflows

Set up Firecrawl API key

Create the API key in your Firecrawl account. You need it so that n8n can use Firecrawl via its API, i.e. control it remotely.

Create API key in Firecrawl
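
To illustrate what using Firecrawl "via API" means, here is a minimal sketch that builds an authenticated scrape request with Python's standard library. The endpoint URL, the payload fields and the helper name build_scrape_request are assumptions for illustration and are not taken from this tutorial; check the Firecrawl API reference before relying on them.

```python
import json
import urllib.request

# Assumption: Firecrawl's hosted REST API offers a scrape endpoint at this URL.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_request(api_key: str, page_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the API key."""
    # Assumption: the endpoint accepts a target URL and a list of output formats.
    payload = {"url": page_url, "formats": ["markdown"]}
    return urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The API key is sent as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To actually send the request, pass the returned object to urllib.request.urlopen. In the n8n workflow described here, this call is made for you by the Firecrawl MCP server.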

Step 2: Install n8n

We install the free version of n8n via Docker Desktop. Simply follow our tutorial for setting up n8n.

Select and install the n8n Docker image
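
If you prefer the command line to the Docker Desktop interface, the following sketch assembles a comparable docker run command. The image name, the volume path and the environment variable N8N_COMMUNITY_PACKAGES_ALLOW_TOOL_USAGE (which community nodes may need before an AI Agent can use them as tools) are assumptions based on the n8n and community node documentation, not part of this tutorial's setup.

```python
import shlex


def build_n8n_docker_command(port: int = 5678, volume: str = "n8n_data") -> list[str]:
    """Assemble a docker run command for a local n8n container."""
    return [
        "docker", "run", "-it", "--rm",
        "--name", "n8n",
        # Map the chosen host port to n8n's port inside the container.
        "-p", f"{port}:5678",
        # Assumption: lets community nodes be used as tools by the AI Agent.
        "-e", "N8N_COMMUNITY_PACKAGES_ALLOW_TOOL_USAGE=true",
        # Assumption: named volume so workflows and credentials survive restarts.
        "-v", f"{volume}:/home/node/.n8n",
        # Assumption: official n8n image name.
        "docker.n8n.io/n8nio/n8n",
    ]


# Print the command so it can be copied into a terminal.
print(shlex.join(build_n8n_docker_command()))
```

With the default port, n8n is then reachable at http://localhost:5678, which matches the URLs used in the following steps.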

Step 3: Install n8n node for MCP

The advantage of a locally hosted n8n instance is that you can easily install your own nodes from the community. This is not yet possible in the cloud version of n8n. We install a node that allows us to integrate additional tools such as Firecrawl via the MCP protocol.

Open the n8n settings and go to “Community Nodes”:
- URL: http://localhost:5678/settings/community-nodes
Install the MCP node: Enter “n8n-nodes-mcp” as the npm package name and install it.

In our local n8n instance, we install a community node for MCP server communication

Step 4: Create n8n credentials for Firecrawl with MCP server

We now create new credentials for Firecrawl in n8n. We use the modern MCP protocol so that we can request numerous Firecrawl options flexibly in the n8n workflow via chat without having to configure each command individually.

Command: npx
Arguments: -y firecrawl-mcp
Environments: FIRECRAWL_API_KEY=1234567890 (insert your API key here)

Create MCP server for Firecrawl in n8n
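
The three credential fields describe how the MCP server process is launched: a command, its arguments, and the environment variables it receives. The sketch below shows how such values could be turned into a launch configuration. The class StdioServerConfig and both helper functions are hypothetical, and the assumption that several variables are separated by commas or newlines should be checked against the community node's documentation.

```python
import os
from dataclasses import dataclass, field


@dataclass
class StdioServerConfig:
    """Hypothetical container mirroring the three credential fields."""
    command: str
    arguments: list[str]
    environment: dict[str, str] = field(default_factory=dict)


def parse_environments(raw: str) -> dict[str, str]:
    """Parse NAME=VALUE pairs (assumed to be comma- or newline-separated)."""
    env: dict[str, str] = {}
    for entry in raw.replace("\n", ",").split(","):
        entry = entry.strip()
        if not entry:
            continue  # skip empty segments such as trailing commas
        name, separator, value = entry.partition("=")
        if not separator or not name.strip():
            raise ValueError(f"Expected NAME=VALUE, got {entry!r}")
        env[name.strip()] = value.strip()
    return env


def build_config(command: str, arguments: str, environments: str) -> StdioServerConfig:
    """Combine the three credential fields into one launch configuration."""
    return StdioServerConfig(command, arguments.split(), parse_environments(environments))


config = build_config("npx", "-y firecrawl-mcp", "FIRECRAWL_API_KEY=1234567890")
# The server process would see the current environment plus the credential values.
child_env = {**os.environ, **config.environment}
print([config.command, *config.arguments])  # ['npx', '-y', 'firecrawl-mcp']
```

Passing the API key through the environment keeps it out of the command line and out of the workflow itself.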

Step 5: Set up n8n nodes for the scraping workflow

We create a chat trigger in n8n so that we can simply enter input directly in the chat window. The actual work is carried out by an AI agent node.

n8n scraping workflow at a glance

n8n workflow for website scraping with Firecrawl via MCP protocol and Anthropic Claude 3.7 Sonnet

Set up AI Agent:

Note: The MCP node requires a local n8n instance. It is not included as standard and must be installed as a community node (see the previous step).

Chat Model: Select a provider, e.g. OpenAI (ChatGPT) or Anthropic, and enter your API key.
Tools: In the AI Agent, we create the following MCP nodes under the “Tools” node input:
MCP node – Firecrawl – List Tools: The node lists all Firecrawl methods available via MCP.
MCP node – Firecrawl – Execute: The node executes the Firecrawl command selected by the AI agent.
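
Under the hood, the two nodes correspond to two kinds of MCP requests: listing the available tools and calling one of them. The following sketch builds both messages in the JSON-RPC 2.0 format that MCP uses. The helper functions are hypothetical, and the tool name firecrawl_scrape and its url argument are assumptions about what the Firecrawl server advertises; the List Tools node shows the actual names.

```python
import json
from itertools import count
from typing import Optional

# Each JSON-RPC request needs its own id so responses can be matched to it.
_request_ids = count(1)


def jsonrpc_request(method: str, params: Optional[dict] = None) -> dict:
    """Build a JSON-RPC 2.0 request message."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request() -> dict:
    """Roughly what the List Tools node asks the server."""
    return jsonrpc_request("tools/list")


def call_tool_request(tool_name: str, arguments: dict) -> dict:
    """Roughly what the Execute node asks the server."""
    return jsonrpc_request("tools/call", {"name": tool_name, "arguments": arguments})


print(json.dumps(list_tools_request()))
print(json.dumps(
    # Assumption: tool name and argument as advertised by the Firecrawl server.
    call_tool_request("firecrawl_scrape", {"url": "https://www.your-url.com/your-page-here"}),
    indent=2,
))
```

A real MCP session starts with an initialization handshake before these requests are sent. The community node handles that for you.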

Setting up the Execute node:

So that the AI agent can select the appropriate Firecrawl scraping tool itself, we use an n8n expression that lets the AI set the tool name. Copy the following function. It is explained well in the n8n documentation (Documentation: fromAI function for n8n).

Tool Name: {{ $fromAI("tool", "the selected tool for executing the action") }}

Set up the n8n node: In the AI Agent node, the agent independently and dynamically selects the appropriate scraping command as a tool
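
Conceptually, the expression leaves the tool name to the model, which should pick one of the names returned by the List Tools node. The sketch below shows the kind of check this implies. It is not n8n's implementation; resolve_tool_name is a hypothetical helper, and the tool names are assumed examples.

```python
def resolve_tool_name(model_choice: str, available_tools: list[str]) -> str:
    """Return the model's choice if it matches one of the advertised tools."""
    normalized = model_choice.strip()
    if normalized not in available_tools:
        raise ValueError(
            f"Unknown tool {normalized!r}; expected one of {sorted(available_tools)}"
        )
    return normalized


# Assumption: example names; the List Tools node shows the real ones.
available = ["firecrawl_scrape", "firecrawl_crawl", "firecrawl_search"]
print(resolve_tool_name(" firecrawl_scrape ", available))  # firecrawl_scrape
```

If the agent picks a tool that does not exist, a more descriptive second argument in the $fromAI expression can help steer the model towards a valid name.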

Step 6: Execute scraping via chat

Now let’s run the website scrape using our own AI blog as an example.

Show Firecrawl commands

First, we display the methods that Firecrawl supports via MCP. This is practical, as you can quickly see what is possible.

Firecrawl displays all commands that can be addressed via the MCP protocol

Start scraping via prompt

We start the chat in n8n and display the required information in a structured way.

Prompt: list the current news from “https://www.your-url.com/your-page-here” and structure by title, date, category, summary
Result: We receive the articles in the chat and can continue to use them via copy & paste, e.g. further refine them using ChatGPT, summarize the news, create images, etc.

Tip: Structure scraping results as JSON

So that we can use the results directly in n8n, it makes sense to output them as structured JSON. This makes it easy to continue using them automatically in other n8n nodes.

Result: Scraping data as a structured JSON object
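
To show why structured JSON is easier to process, here is a minimal sketch that validates such a result and turns it into typed records. The field names follow the prompt used above (title, date, category, summary). The Article class, the parse_articles helper and the sample data are hypothetical placeholders, and the assumption is that the agent returns a JSON array of objects.

```python
import json
from dataclasses import dataclass

# Fields requested in the scraping prompt.
REQUIRED_FIELDS = ("title", "date", "category", "summary")


@dataclass
class Article:
    title: str
    date: str
    category: str
    summary: str


def parse_articles(raw_json: str) -> list[Article]:
    """Parse a JSON array of article objects and check the required fields."""
    data = json.loads(raw_json)
    if not isinstance(data, list):
        raise ValueError("Expected a JSON array of articles")
    articles = []
    for index, item in enumerate(data):
        if not isinstance(item, dict):
            raise ValueError(f"Article {index} is not a JSON object")
        missing = [name for name in REQUIRED_FIELDS if name not in item]
        if missing:
            raise ValueError(f"Article {index} is missing fields: {missing}")
        articles.append(Article(**{name: str(item[name]) for name in REQUIRED_FIELDS}))
    return articles


# Placeholder data for illustration only, not a real scraping result.
sample = (
    '[{"title": "Example headline", "date": "2025-01-01", '
    '"category": "News", "summary": "Placeholder summary."}]'
)
for article in parse_articles(sample):
    print(article.title, article.date)
```

Checking the fields early makes it obvious when the model returns free text instead of the requested structure.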

Learn more

This YouTube video by AI Workshop shows exactly how to set up this workflow in n8n.

Conclusion: How scraping helps with content creation

With n8n, you can scrape website content from your blogs, news portals and other sources and process it further. Powerful scraping tools such as Firecrawl are easy to integrate, and thanks to the MCP protocol you can define the scraping logic intuitively via chat, so that exactly the right information is extracted in the right format.

What can automated website scraping be used for?

Save article overviews in Google Sheets to analyze and improve the content mix
Improve headlines
Find exciting news
Extract topics of the week
Translate into other languages
Use as a basis for graphic and video briefings
And much more
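
As a small illustration of the first use case, the following sketch converts scraped article records into CSV text that a spreadsheet tool can import. The helper articles_to_csv and the sample record are hypothetical. In n8n itself, you would more likely connect a spreadsheet node directly.

```python
import csv
import io

# Column order matches the fields requested in the scraping prompt.
COLUMNS = ["title", "date", "category", "summary"]


def articles_to_csv(articles: list[dict]) -> str:
    """Render article records as CSV text with a header row."""
    buffer = io.StringIO()
    # Extra keys are ignored; missing keys become empty cells.
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(articles)
    return buffer.getvalue()


# Placeholder record for illustration only.
print(articles_to_csv([
    {"title": "Example headline", "date": "2025-01-01",
     "category": "News", "summary": "Placeholder summary."},
]))
```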