Herr Bischoff

How to Block OpenAI Bots

Today’s instalment of “things you already know, unless you don’t” is about blocking anything OpenAI. You know who they are, and chances are that if you’re reading this, you have an opinion about them. Personally, I feel that LLMs are poised to pollute the Internet in ways that will eventually make the public-facing World Wide Web almost unusable.

This is how you instruct their crawlers not to ingest your site’s content. Whether they actually comply (or even send the correct user agent) is a different ballgame, even if you also block their IP ranges.

The robots.txt directives:

User-agent: GPTBot
User-agent: ChatGPT-User
Disallow: /
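
If you want to sanity-check those directives before deploying them, here is a minimal sketch using Python’s standard-library urllib.robotparser. The helper name is_allowed and the example.com URL are my own placeholders, not anything OpenAI documents. It only shows how a well-behaved parser reads the rules; it says nothing about what the crawlers actually do.

```python
from urllib.robotparser import RobotFileParser

# The directives from above, inlined so the check runs offline.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ChatGPT-User
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration. Pass the bare product token
    (e.g. "GPTBot"), not a full browser-style User-Agent header.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ChatGPT-User", "SomeOtherBot"):
        verdict = "allowed" if is_allowed(agent, "https://example.com/") else "blocked"
        print(f"{agent}: {verdict}")
```

Both OpenAI agents should come out as blocked, while any agent not named in the file stays allowed, since there is no catch-all rule.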

The IP ranges (as of today):

OpenAI publish their IP ranges: https://openai.com/gptbot-ranges.txt

ChatGPT-User requests originate from the range.
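
To act on a published list of ranges, you can test a client address against it. The following is a rough sketch with Python’s standard-library ipaddress module. It assumes the list is plain text with one CIDR range per line, which you should verify against the actual file. The ranges in SAMPLE_RANGES are reserved documentation networks used as placeholders, not OpenAI’s real ranges, and the function names are hypothetical.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation networks), NOT OpenAI's real ones.
# Replace with the contents of the published ranges file.
SAMPLE_RANGES = """\
192.0.2.0/24
198.51.100.16/28
"""


def parse_ranges(text: str) -> list:
    """Parse one CIDR range per line, skipping blank lines and # comments."""
    networks = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # strict=False tolerates entries that have host bits set.
        networks.append(ipaddress.ip_network(line, strict=False))
    return networks


def is_blocked(ip: str, networks: list) -> bool:
    """Return True if ip falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)


if __name__ == "__main__":
    networks = parse_ranges(SAMPLE_RANGES)
    for ip in ("192.0.2.55", "198.51.100.17", "203.0.113.1"):
        print(ip, "blocked" if is_blocked(ip, networks) else "not blocked")
```

In practice you would more likely feed the ranges into your firewall or web server configuration. A check like this is mostly handy for scanning access logs to see whether requests claiming to be GPTBot really come from the published ranges.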