
Content Scraping Prevention

Protect your content from web scraping by reliably detecting bots and browser automation tools using Fingerprint Pro Bot Detection.

What is content scraping?

Content scraping, or web scraping, is the process of extracting valuable data from websites using automated scripts or bots. If your website contains data that is expensive to collect or compute (e.g., flight connections, real-estate listings, product prices, or user data), a bad actor or competitor could steal it and use it for nefarious purposes. Your data could also be scraped by browsing plugins of generative AI models like ChatGPT and used to answer user queries or train large language models without your knowledge.

Bots vary in their ability to scrape content and avoid detection. Simple scripts using a command-line tool like wget or an HTTP library can retrieve pages from a web server and parse information from the HTML response. They can be effective for scraping static sites but are less effective for client-rendered content. They are also easier to detect, as your website can easily test for their inability to execute JavaScript.

Headless browsers and browser automation tools like Puppeteer or Selenium are much more sophisticated. They can execute JavaScript, scroll, press buttons, wait for client-rendered content to load, and scrape it. They are full-featured browsers, only automated, which makes them more robust and harder to detect. Many also have “stealth” plugins, which try to make them resemble regular browsers.

Protect your content from sophisticated bots

A web application firewall can provide an essential layer of rule-based protection, such as blocking IP ranges, countries, and data centers known to host bots. This first line of defense is helpful but sometimes insufficient, as scrapers can use proxies to cycle through different IP addresses.
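
As a rough illustration of the rule-based layer described above, here is a minimal sketch of an IPv4 range check. It is not how any particular firewall is implemented. The function names are hypothetical, and the blocked ranges are documentation-reserved addresses chosen purely for the example.

```javascript
// Convert a dotted IPv4 string to an unsigned 32-bit integer.
function ipv4ToInt(ip) {
  const parts = ip.split(".").map(Number);
  const valid =
    parts.length === 4 &&
    parts.every((p) => Number.isInteger(p) && p >= 0 && p <= 255);
  if (!valid) {
    throw new Error(`Invalid IPv4 address: ${ip}`);
  }
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// Check whether an IPv4 address falls inside a CIDR range such as "203.0.113.0/24".
function isInCidr(ip, cidr) {
  const [base, bitsText] = cidr.split("/");
  const bits = Number(bitsText);
  // A /0 mask is handled separately because shifting by 32 bits is a no-op in JavaScript.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipv4ToInt(ip) & mask) >>> 0) === ((ipv4ToInt(base) & mask) >>> 0);
}

// Hypothetical blocklist (documentation-reserved ranges, for illustration only).
const BLOCKED_RANGES = ["203.0.113.0/24", "198.51.100.0/24"];

function isBlockedIp(ip) {
  return BLOCKED_RANGES.some((range) => isInCidr(ip, range));
}
```

A check like this only looks at the address itself, which is why a scraper rotating through proxy IPs outside the listed ranges passes it unnoticed.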

You can ask your visitors to prove they are human by completing CAPTCHA challenges, like picking all the images that contain a sombrero. This is generally effective but also disruptive to the user experience. To fight bots without bothering humans, you can use a client-side library to detect bots at runtime by analyzing the visitor’s browser.

Fingerprint Pro Bot Detection collects vast amounts of browser data that bots leak (errors, network overrides, browser attribute inconsistencies, API changes, and more) to reliably distinguish real users from headless browsers, automation tools, their derivatives, and plugins.

It is based on BotD, a free and open-source library that runs entirely in the client and detects simple bots. Fingerprint Pro Bot Detection can detect a broader range of sophisticated bots and runs the analysis on the server side, where it’s not vulnerable to tampering by the bots themselves. See our documentation for a detailed comparison of BotD and Fingerprint Pro Bot Detection. The example below uses the non-open-source version.

Integrate Fingerprint Bot Detection into your website

First, sign up for a Fingerprint account. Bot detection is included in the Pro Plus plan as one of the Smart signals alongside incognito mode detection, VPN detection, browser tampering detection, and other data points useful for securing your website.

Add the JavaScript agent to your website client. Once enabled, you can use the same JavaScript agent for visitor identification and Bot Detection. We have client libraries for all significant front-end frameworks, or you can load the script from our CDN as shown below:

// Initialize the agent
const fpPromise = import('https://fpjscdn.net/v3/<your-public-api-key>')
  .then(FingerprintJS => FingerprintJS.load({
    endpoint: 'https://metrics.yourdomain.com',
  }));

Note: For production deployments, we recommend routing requests to Fingerprint Pro APIs through your own domain, which is what the endpoint parameter is for. This prevents ad blockers from disrupting identification requests and improves accuracy. We offer a variety of proxy integration options. See Protecting your JavaScript agent from ad blockers for more details.

Let’s use an airline website as an example. The visitor picks their destination and clicks “Search flights.” Before returning the results from the server, you want to make sure they are not a bot.

Example website for searching flights

On the client, right before requesting the flight data, use the loaded fpPromise to send browser parameters to Fingerprint Pro API for analysis. You will get a requestId in the response. Include it in the search request you send to your server.

async function onClickSearchFlights(from, to) {
  // Collect browser signals for bot detection and send them
  // to Fingerprint Pro API. The response contains a requestId
  const { requestId } = await (await fpPromise).get();

  // Pass the requestId to your server alongside the flights query
  const response = await fetch(`/api/web-scraping/flights`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
    },
    body: JSON.stringify({ from, to, requestId }),
  });
}

Note: To detect bots, Fingerprint Pro needs to collect signals from the browser. Therefore, it is best used to protect data endpoints that are accessible from your website, as demonstrated in this article. It is not designed to protect server-rendered or static content that is sent to the browser on the initial page load, as browser signals are not available during server-side rendering.

On the server, send the requestId to Fingerprint Pro Server API to get your bot detection result. If the requestId is malformed or not found, your endpoint should not return the flight results. You can call the Server API REST endpoint directly or use one of our Server SDKs. Here is an example using the Node.js SDK:

import {
  FingerprintJsServerApiClient,
  Region,
} from "@fingerprintjs/fingerprintjs-pro-server-api";

export default async function getFlightsEndpoint(req, res) {
  const { from, to, requestId } = req.body;

  // requestId in the wrong format can be rejected immediately
  // (the 403 status code used for rejections is an assumption)
  if (!/^\d{13}\.[a-zA-Z0-9]{6}$/.test(requestId)) {
    res.status(403).json({
      message: "malformed requestId, potential spoofing detected",
    });
    return;
  }

  let botDetection;
  try {
    // Initialise Server API client
    const client = new FingerprintJsServerApiClient({
      region: Region.Global,
      apiKey: "<YOUR_SERVER_API_KEY>",
    });

    // Get analysis event from the Server API using the requestId
    const eventResponse = await client.getEvent(requestId);
    botDetection = eventResponse.products?.botd?.data;
  } catch (error) {
    // If getting the event fails, it's likely that the
    // requestId was spoofed, so don't return the results
    res.status(403).json({
      message: "requestId not found, potential spoofing detected",
    });
    return;
  }

  // continue processing botDetection result...
}
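
The format check at the top of the endpoint can be pulled into a small helper, sketched below. The helper name and the sample requestId values are hypothetical. The pattern is the one used above: 13 digits, a dot, then 6 alphanumeric characters.

```javascript
// Same pattern as in the endpoint above.
const REQUEST_ID_PATTERN = /^\d{13}\.[a-zA-Z0-9]{6}$/;

// Returns true only for strings in the expected format.
// The typeof guard matters: RegExp.prototype.test() coerces its argument
// to a string, so a one-element array would otherwise pass the check.
function isWellFormedRequestId(requestId) {
  return typeof requestId === "string" && REQUEST_ID_PATTERN.test(requestId);
}

console.log(isWellFormedRequestId("1679416813000.aB3dE9")); // made-up, well-formed value
console.log(isWellFormedRequestId("1679416813000-aB3dE9")); // wrong separator
```

Rejecting malformed values before calling the Server API saves a network round trip for requests that are obviously forged.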

The botDetection result returned from the Server API tells you if Fingerprint Pro detected a good bot (for example, a search engine crawler), a bad bot (an automated browser), or no bot at all.

{
  "bot": {
    "result": "bad", // or "good" or "notDetected"
    "type": "headlessChrome"
  },
  "userAgent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/110.0.5481.177 Safari/537.36",
  "url": "https://yourdomain.com/search",
  "ip": "",
  "time": "2022-03-21T16:40:13Z"
}

If the visitor is a malicious bot, return an error. Optionally, you could also update your WAF rules to block the bot’s IP address in the future.

if (botDetection?.bot?.result === 'bad') {
  res.status(403).json({ message: "Malicious bot detected, scraping flight data is not allowed." });
  return; // 403 is an assumed status code
}
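
If you want to treat the three possible results differently, one option is a small decision helper like the sketch below. The function name, the returned object shape, and the policy itself are assumptions for illustration, not part of the SDK. In particular, this sketch fails closed: it denies access when the result is missing or unrecognized, which is stricter than the check above.

```javascript
// Map a bot detection result to a hypothetical access decision.
function decideAccess(botDetection) {
  const result = botDetection?.bot?.result;
  switch (result) {
    case "notDetected":
      return { allow: true, reason: "no bot detected" };
    case "good":
      // Whether good bots (e.g. search engine crawlers) may use this
      // endpoint is a policy choice; this sketch allows them.
      return { allow: true, reason: "good bot" };
    case "bad":
      return { allow: false, reason: "malicious bot detected" };
    default:
      // Missing or unknown result: fail closed.
      return { allow: false, reason: "missing or unknown bot detection result" };
  }
}

console.log(decideAccess({ bot: { result: "bad", type: "headlessChrome" } }));
```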

Now you know that the fingerprinting request is real and it didn’t detect any malicious bots. But you need to verify that the result actually belongs to this search request. The bot could have replaced the real requestId with an old one obtained manually some time ago. To check against replay attacks, you need to verify the freshness of the fingerprinting request:

// fingerprinting event must be max 3 seconds old
if (Date.now() - Number(new Date(botDetection.time)) > 3000) {
  res.status(403).json({ message: "Old visit detected, potential replay attack." });
  return; // 403 is an assumed status code
}
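
The freshness check can also be written as a pure function, which makes it easier to test with a fixed clock. This is a minimal sketch: the function name and its parameters are hypothetical, and the 3-second window is taken from the snippet above.

```javascript
// Returns true if the event timestamp is at most maxAgeMs older than `now`.
// `now` is injectable so the check can be tested deterministically.
function isFreshEvent(eventTime, maxAgeMs = 3000, now = Date.now()) {
  const timestamp = Date.parse(eventTime);
  // An unparseable timestamp is treated as stale.
  if (Number.isNaN(timestamp)) {
    return false;
  }
  return now - timestamp <= maxAgeMs;
}

// Usage with a fixed clock one second after the event:
const eventTime = "2022-03-21T16:40:13Z";
console.log(isFreshEvent(eventTime, 3000, Date.parse(eventTime) + 1000));
```

The right window depends on how long your page takes between calling get() and sending the search request, so treat 3 seconds as a starting point rather than a fixed rule.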

You also want to verify that the origin of the fingerprinting request matches the origin of the search request itself. Usually, both will be coming from your website’s domain.

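const fpRequestOrigin = new URL(botDetection.url).origin;
// URL.origin includes the scheme, so compare against the full origin
if (
  fpRequestOrigin !== req.headers["origin"] ||
  fpRequestOrigin !== "https://yourdomain.com" ||
  req.headers["origin"] !== "https://yourdomain.com"
) {
  res.status(403).json({ message: "Origin mismatch detected, potential spoofing attack." });
  return; // 403 is an assumed status code
}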
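
One detail worth illustrating is that URL.origin returns the scheme together with the host (for example "https://yourdomain.com"), so the allowed value must be written the same way. Below is a minimal sketch of the comparison as a standalone function. The function name and the ALLOWED_ORIGIN constant are hypothetical.

```javascript
// Hypothetical allowed origin. Note the scheme: URL.origin always includes it.
const ALLOWED_ORIGIN = "https://yourdomain.com";

// Returns true only if both the fingerprinting event URL and the
// request's Origin header resolve to the allowed origin.
function originsMatch(eventUrl, requestOrigin, allowedOrigin = ALLOWED_ORIGIN) {
  let eventOrigin;
  try {
    eventOrigin = new URL(eventUrl).origin;
  } catch {
    // A missing or malformed URL cannot match.
    return false;
  }
  return eventOrigin === allowedOrigin && requestOrigin === allowedOrigin;
}

console.log(originsMatch("https://yourdomain.com/search", "https://yourdomain.com"));
```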

Finally, verify that the IP of the fingerprinting request matches the IP of the search request.

if (botDetection.ip !== req.headers["x-forwarded-for"]?.split(",")[0].trim()) {
  res.status(403).json({ message: "IP mismatch detected, potential spoofing attack." });
  return; // 403 is an assumed status code
}
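
The IP comparison can be sketched as two small helpers. The function names are hypothetical. The sketch also assumes the x-forwarded-for header is set by a proxy or load balancer you control, since a client can otherwise send any value it likes in that header.

```javascript
// Extract the first (client) address from an x-forwarded-for header value.
// Entries are comma-separated and may be padded with spaces.
function getClientIp(headers) {
  const forwarded = headers["x-forwarded-for"];
  if (typeof forwarded !== "string" || forwarded.length === 0) {
    return undefined;
  }
  return forwarded.split(",")[0].trim();
}

// Returns true only if the request has a client IP and it equals the
// IP recorded in the fingerprinting event.
function ipsMatch(eventIp, headers) {
  const clientIp = getClientIp(headers);
  return clientIp !== undefined && clientIp === eventIp;
}

console.log(ipsMatch("203.0.113.7", { "x-forwarded-for": "203.0.113.7, 10.0.0.1" }));
```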

Having verified the authenticity of the bot detection result, you can now confidently return the flights:

const flights = await getFlightResults(from, to);
res.status(200).json({ flights });

Explore our Web Scraping Prevention Demo

Visit the Web Scraping Prevention Demo we built to demonstrate the concepts above. You can explore the open-source code on GitHub or run it in your browser with StackBlitz. The core of the use case is implemented in this component and this endpoint.

To see Fingerprint Pro Bot Detection in action, you need to visit the use case website as a bot. The easiest way is to use a Browserless debugger, which allows you to control an automated browser in the cloud from your own browser.

  1. Go to Fingerprint’s Browserless instance.

  2. Switch to the Web Scraping tab for a full bot example or just copy this snippet into the code editor:

    export default async ({ page }: { page: Page }) => {
      await page.goto('https://fingerprinthub.com/web-scraping');
    };

  3. Press the “Play” button in the top right to run the bot (and wait a few seconds).

You will see that the automated browser on the right-hand side can’t access the flight search results when Bot Detection is enabled.

A screenshot of an automated browser failing to scrape a website protected by Fingerprint Pro Bot Detection.

If you prefer to explore and test locally, the demo contains end-to-end tests. Execute them to see that you can scrape the flight results with Bot Detection disabled, but not otherwise.

git clone https://github.com/fingerprintjs/fingerprintjs-pro-use-cases
cd fingerprintjs-pro-use-cases
yarn install
yarn dev

# in a second terminal window
yarn test:e2e:chrome e2e/scraping/protected.spec.js --debug
yarn test:e2e:chrome e2e/scraping/unprotected.spec.js --debug

If you have any questions, please reach out to our support.

Trouble running the demo? Try opening it on StackBlitz in a Chromium-based browser without ad blockers.
