Deploy Anti AI Scraper Measures #15
Reference
opencommit/roadmap#15
1. What
The implementation of a multi-layered defensive perimeter around OpenCommit to identify, challenge, and mitigate unauthorized automated data extraction by AI-related crawlers (e.g., GPTBot, CCBot, AnthropicAI). This includes deploying technical controls at the application level (via robots.txt), the network/proxy level (via User-Agent filtering or rate limiting), and, if required, an identity challenge layer (such as Anubis) to verify human-driven traffic.
2. Why
To protect the intellectual property and privacy of the repositories hosted on this instance by preventing uncompensated use of source code in Large Language Model (LLM) training sets. Additionally, this initiative aims to reduce infrastructure resource consumption and "noise" caused by high-frequency automated requests, ensuring higher availability and performance for legitimate human users and authorized integrations.
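As a rough illustration of the first two layers named under "What" (robots.txt directives and User-Agent filtering), a minimal Python sketch follows. The function names `render_robots_txt` and `is_ai_crawler` are hypothetical, and the token list simply reuses the crawler names mentioned in this issue; the exact User-Agent strings each crawler sends should be checked against the operators' own documentation before anything is deployed. Note that robots.txt is advisory only, which is why the issue also considers proxy-level filtering and a challenge layer.

```python
# Hypothetical sketch, not a deployed configuration.
# Assumption: these tokens are copied from the crawler names listed in this issue.
AI_CRAWLER_TOKENS = ("GPTBot", "CCBot", "AnthropicAI")


def render_robots_txt(tokens=AI_CRAWLER_TOKENS):
    """Build robots.txt text that disallows every path for each listed crawler."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    return "\n\n".join(groups) + "\n"


def is_ai_crawler(user_agent, tokens=AI_CRAWLER_TOKENS):
    """Return True if the User-Agent header contains a listed token.

    Matching is a case-insensitive substring check; a missing header
    (None or empty string) is treated as "not a listed crawler".
    """
    ua = (user_agent or "").lower()
    return any(token.lower() in ua for token in tokens)


if __name__ == "__main__":
    print(render_robots_txt())
    print(is_ai_crawler("Mozilla/5.0 (compatible; GPTBot/1.0)"))
```

A proxy or middleware could call something like `is_ai_crawler` on each request and respond with a block or a stricter rate limit. Crawlers that spoof a browser User-Agent would pass this check, which is the case the optional challenge layer is meant to cover.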
3. Boundaries
robots.txt directives. (Robots.txt Traefik plugin?)
4. Definition of Done