n8n-Tutorial: Scraping websites with n8n and MCP

In this quick tutorial, we show you how to extract content from websites by scraping. We use a locally hosted n8n workflow and the scraping service Firecrawl, connected via an MCP server. Aimed at ambitious beginners and advanced users.

Scraping website data with n8n and Firecrawl

Disclaimer: Website scraping requires the consent of the respective website. These instructions are intended solely as a technical illustration of the method. Make sure you have a legal basis for extracting data before you scrape any website.

For our scraping setup, we use the automation solution n8n, which we host on our own computer. To keep the scraper versatile, we set up a chat in which the user enters the scraping instructions as a prompt. Firecrawl then carries out the scraping.

Checklist: Set up scraping workflow

  • Step 1: Set up a Firecrawl account and copy the API key (Firecrawl)
  • Step 2: Install n8n locally (instructions)
  • Step 3: Install n8n MCP node
  • Step 4: Set up n8n credentials for Firecrawl (via MCP connection)
  • Step 5: Set up n8n nodes
  • Step 6: Start chat in n8n and scrape

Result and advantages: Simple scraping of a website


At the end of the tutorial, you will be able to extract the articles of a website via chat, in the format you need, within a few seconds. Experienced users will need around 30 minutes for the setup. This is what the result will look like:

Scraping result: extracted blog articles

Let’s start with the setup in n8n and Firecrawl.

Step 1: Set up Firecrawl

Firecrawl is a cost-effective and powerful scraping solution that is flexible and easy to use. Create a free trial account and set up an API key.

Firecrawl is a flexible and inexpensive scraping solution with good connectivity to your own tools and workflows

 

Set up Firecrawl API key

Set up the API key here. You need this so that you can use Firecrawl in n8n via API, i.e. “remotely control” it.

Create API key in Firecrawl
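
To illustrate what using Firecrawl "via API" means, here is a minimal sketch that builds an authenticated scrape request without sending it. The endpoint `https://api.firecrawl.dev/v1/scrape`, the `formats` field and the helper name `build_scrape_request` are assumptions for illustration; check the Firecrawl documentation for the current API before relying on them.

```python
import json
import os
import urllib.request

# Assumption: Firecrawl's hosted REST endpoint for scraping a single page.
FIRECRAWL_SCRAPE_URL = "https://api.firecrawl.dev/v1/scrape"


def build_scrape_request(page_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a scrape request authenticated with the API key."""
    payload = json.dumps({"url": page_url, "formats": ["markdown"]}).encode("utf-8")
    return urllib.request.Request(
        FIRECRAWL_SCRAPE_URL,
        data=payload,
        method="POST",
        headers={
            # The API key travels as a bearer token in the Authorization header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    key = os.environ.get("FIRECRAWL_API_KEY", "1234567890")  # placeholder key
    request = build_scrape_request("https://www.your-url.com/your-page-here", key)
    # Sending is left out on purpose; urllib.request.urlopen(request) would do it.
    print(request.get_method(), request.full_url)
```

In the workflow of this tutorial you never write such a request yourself; the MCP server does the equivalent on behalf of the AI agent.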

 

Step 2: Install n8n

We install the free version of n8n via Docker Desktop. Simply follow our tutorial for setting up n8n.

Select and install the n8n Docker image
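
Once the container is running, a quick check can confirm that n8n answers on the local port. The sketch below assumes the default address `http://localhost:5678` used later in this tutorial and n8n's `/healthz` endpoint; the function name `n8n_is_up` is made up for this example.

```python
import urllib.request


def n8n_is_up(base_url: str = "http://localhost:5678", timeout: float = 3.0) -> bool:
    """Return True if the (assumed) health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers refused connections, timeouts and HTTP error responses.
        return False


if __name__ == "__main__":
    print("n8n reachable:", n8n_is_up())
```

If this prints `False`, check in Docker Desktop that the container is started and that port 5678 is published.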

 

Step 3: Install n8n node for MCP

The advantage of a locally hosted n8n instance is that you can easily install your own nodes from the community. This is not yet possible in the cloud version of n8n. We install a node that allows us to integrate additional tools such as Firecrawl via MCP (Model Context Protocol).

  • Open the n8n settings and go to “Community Nodes”:
    • URL: http://localhost:5678/settings/community-nodes
  • Install the MCP node: Enter “n8n-nodes-mcp” as the npm package name and install it
In our local n8n instance, we install a community node for MCP server communication

 

Step 4: Create n8n credentials for Firecrawl with MCP server

We now create new credentials for Firecrawl in n8n. We use MCP so that the agent can call the various Firecrawl tools from the chat in the n8n workflow, without us having to configure each command individually.

  • Command: npx
  • Arguments: -y firecrawl-mcp
  • Environments: FIRECRAWL_API_KEY=1234567890 (insert your API key here)
Create MCP server for Firecrawl in n8n
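
The three credential fields describe how the MCP server is started as a local process. As a rough sketch (not the community node's actual implementation), they can be read as a command line plus environment variables. The helper `build_mcp_launch` and the assumption that the environment field holds whitespace-separated `NAME=VALUE` pairs are ours.

```python
import os
import shlex


def build_mcp_launch(command: str, arguments: str, environments: str):
    """Turn the three credential fields into an argv list and an environment dict."""
    argv = [command, *shlex.split(arguments)]
    env = dict(os.environ)  # inherit PATH so that npx can be found
    for pair in shlex.split(environments):
        name, _, value = pair.partition("=")
        env[name] = value
    return argv, env


argv, env = build_mcp_launch("npx", "-y firecrawl-mcp", "FIRECRAWL_API_KEY=1234567890")
print(argv)  # ['npx', '-y', 'firecrawl-mcp']
# subprocess.Popen(argv, env=env, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
# would then start the server and exchange messages with it over stdin/stdout.
```

The `-y` flag lets npx download and run the `firecrawl-mcp` package without an interactive confirmation, which matters because n8n starts the process in the background.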

 

Step 5: Set up n8n nodes for the scraping workflow

We create a chat trigger in n8n so that we can simply enter input directly in the chat window. The actual work is carried out by an AI agent node.

n8n scraping workflow at a glance

n8n workflow for website scraping with Firecrawl via MCP protocol and Anthropic Claude Sonnet 3.7

 

Set up AI Agent:

Note: The MCP node requires a local n8n instance and is not included as standard but must be installed as a community node, see previous step.

  • Chat Model: Select a model provider, e.g. OpenAI (ChatGPT) or Anthropic, and enter your API key here
  • Tools: In the AI Agent, we create the following MCP nodes under the “Tools” node input
  • MCP node – Firecrawl – List Tools: The node lists all Firecrawl methods available via MCP.
  • MCP node – Firecrawl – Execute: The node executes the Firecrawl command automatically.
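
Under the hood, the two MCP nodes correspond to two kinds of JSON-RPC 2.0 messages defined by MCP: `tools/list` and `tools/call`. The sketch below only builds these messages to show their shape. The tool name `firecrawl_scrape` and its `url` argument are assumptions; use whatever the List Tools node actually returns.

```python
import json
from itertools import count

_ids = count(1)  # running JSON-RPC request ids


def list_tools_request() -> dict:
    """Message behind "List Tools": ask the server which tools it offers."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": "tools/list"}


def call_tool_request(tool_name: str, arguments: dict) -> dict:
    """Message behind "Execute": run one tool with the given arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


listing = list_tools_request()
# "firecrawl_scrape" is an assumed tool name, see the note above.
call = call_tool_request("firecrawl_scrape", {"url": "https://www.your-url.com/your-page-here"})
print(json.dumps(listing))
print(json.dumps(call, indent=2))
```

The agent first reads the tool list, then decides which tool name and arguments to place into the second message.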

Setting up the Execute node:

So that the AI agent can select the appropriate Firecrawl scraping tool itself, we use an n8n expression that lets the AI set the tool name. Copy the following expression. It is explained well in the n8n documentation (Documentation: fromAI function for n8n).

  • Tool Name: {{ $fromAI("tool", "the selected tool for executing the action") }}
Set up the n8n node: in the AI Agent node, the agent dynamically selects the appropriate scraping tool on its own

Step 6: Execute scraping via chat

Now let’s run the website scrape using our own AI blog as an example.

Show Firecrawl commands

First, we display the methods that Firecrawl supports via MCP. This is practical, as you can quickly see what is possible.

Firecrawl displays all commands that can be addressed via the MCP protocol

Start scraping via prompt

We start the chat in n8n and ask for the required information in a structured form.

  • Prompt: list the current news from “https://www.your-url.com/your-page-here” and structure by title, date, category, summary
  • Result: We receive the articles in the chat and can continue to use them via copy & paste, e.g. further refine them using ChatGPT, summarize the news, create images, etc.

Scraping result: extracted blog articles

Tip: Structure scraping results as JSON

So that we can use the results directly in n8n, it makes sense to output them as structured JSON. This makes it easy to continue using them automatically in other n8n nodes.

Result: Scraping data as a structured JSON object
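
As an illustration of why structured JSON is easier to process, here is a small sketch that parses and validates such a result. It assumes the agent returns a JSON array whose objects carry the four fields from the prompt above (title, date, category, summary); the sample entry is a placeholder, not real scraped data.

```python
import json
from dataclasses import dataclass

REQUIRED_FIELDS = ("title", "date", "category", "summary")


@dataclass
class Article:
    title: str
    date: str
    category: str
    summary: str


def parse_articles(raw: str) -> list[Article]:
    """Parse the agent's JSON answer and check that every article is complete."""
    items = json.loads(raw)
    if not isinstance(items, list):
        raise ValueError("expected a JSON array of articles")
    articles = []
    for item in items:
        missing = [field for field in REQUIRED_FIELDS if field not in item]
        if missing:
            raise ValueError(f"article is missing fields: {missing}")
        articles.append(Article(**{field: str(item[field]) for field in REQUIRED_FIELDS}))
    return articles


# Placeholder input in the assumed shape.
sample = (
    '[{"title": "Example headline", "date": "2025-01-01", '
    '"category": "News", "summary": "Placeholder summary."}]'
)
for article in parse_articles(sample):
    print(article.date, article.category, article.title)
```

Inside n8n, the same idea applies: once every item has the same keys, downstream nodes can address fields by name instead of parsing free text.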

 

Learn more

This YouTube video by AI Workshop shows exactly how to set up this workflow in n8n.

Legal Notice: This website ai-rockstars.com participates in the Amazon Services LLC Associates Program, an affiliate advertising program designed to provide a means for sites to earn advertising fees by advertising and linking to Amazon.com.

Conclusion: How scraping helps with content creation

With n8n, you can easily scrape website content from your own blogs, news portals and other sources and process it further. Powerful scraping tools such as Firecrawl are simple to integrate, and via MCP you can define the scraping logic intuitively in a chat prompt so that exactly the right information is extracted in the right format.

What can automated website scraping be used for?

  • Save article overviews in Google Sheets to analyze and improve the content mix
  • Improve headlines
  • Find exciting news
  • Extract topics of the week
  • Translate into other languages
  • Use as a basis for graphic and video briefings
  • And much more
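
The first use case can be sketched in a few lines: write the scraped articles to CSV text (which Google Sheets can import) and count the articles per category to see the content mix. The field names follow the prompt used above, the two sample rows are placeholders, and the helper names are made up for this example.

```python
import csv
import io
from collections import Counter

FIELDS = ["title", "date", "category", "summary"]


def to_csv(articles: list[dict]) -> str:
    """Serialize articles as CSV text that a spreadsheet can import."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(articles)
    return buffer.getvalue()


def category_mix(articles: list[dict]) -> Counter:
    """Count how many articles fall into each category."""
    return Counter(article["category"] for article in articles)


# Placeholder rows in the assumed shape.
articles = [
    {"title": "First example", "date": "2025-01-01", "category": "News", "summary": "..."},
    {"title": "Second example", "date": "2025-01-02", "category": "Tutorial", "summary": "..."},
]
print(to_csv(articles))
print(category_mix(articles).most_common())
```

In a full n8n workflow, a Google Sheets node could take over the export step; the sketch only shows the data handling involved.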