In an era where AI models devour terabytes of data, the European Union has decided to say: “Learning is fine, but get permission first.” The new Code of Practice for general-purpose AI models doesn’t work miracles, but it lays out clear guidelines to ensure that AI training doesn’t cross the copyright line, legally or ethically.
The Rules of the Game: Law Before Innovation
The Code’s mission? To create a unified digital market that encourages innovation while safeguarding health, safety, the environment, and fundamental rights. In short: Yes to AI, but not at any cost. Particular attention is given to compliance with Articles 53 and 55 of the AI Act, turning abstract obligations into concrete steps.
Copyright Isn’t Just Window Dressing
The copyright chapter reminds us that EU law is not optional. AI models operating in the EU market must respect the limitations set for text and data mining (TDM), which means:
- No bypassing paywalls;
- No scraping from websites known to be persistent copyright infringers;
- Respecting rights reservations via robots.txt or other machine-readable signals.
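To make the three rules concrete, here is a minimal sketch of a pre-crawl gate in Python. Everything in it is an assumption for illustration: the PageInfo fields, the skip_reason function, and the BLOCKED_DOMAINS set are hypothetical names, and the Code itself does not prescribe any particular implementation. How a crawler detects a paywall or a rights reservation is left to other parts of the pipeline.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical blocklist; a real one would come from a maintained list
# of sites recognised as persistent copyright infringers.
BLOCKED_DOMAINS = frozenset({"pirate.example"})


@dataclass
class PageInfo:
    url: str
    behind_paywall: bool   # assumed to be detected elsewhere in the pipeline
    rights_reserved: bool  # e.g. robots.txt or another machine-readable opt-out


def skip_reason(page: PageInfo, blocked_domains=BLOCKED_DOMAINS) -> Optional[str]:
    """Return why a page must be skipped, or None if it may be collected."""
    host = (urlsplit(page.url).hostname or "").lower()
    if page.behind_paywall:
        return "paywall"
    # Match the blocked domain itself and any of its subdomains.
    if any(host == d or host.endswith("." + d) for d in blocked_domains):
        return "blocked domain"
    if page.rights_reserved:
        return "rights reserved"
    return None
```

A crawler would call skip_reason before downloading a page and log the returned reason, which also leaves an audit trail if a rightsholder later asks why a page was or was not collected.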
“robots.txt” Is the New “Do Not Disturb”
One particularly interesting requirement is using crawlers that read and obey machine-readable instructions such as robots.txt. If a site says “don’t crawl me,” the crawler collecting training data must listen. Ignore that, and you’re walking into a legal minefield.
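What does “listening” look like in practice? The sketch below uses Python’s standard-library urllib.robotparser to check whether a given crawler may fetch a URL. The robots.txt content and the user-agent name ExampleAIBot are made up for illustration; a real crawler would download the file from the site and identify itself with its own published user-agent string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks ExampleAIBot everywhere,
# and blocks all other crawlers only under /private/.
SAMPLE_ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ExampleAIBot", "OtherBot"):
        allowed = may_fetch(SAMPLE_ROBOTS_TXT, agent, "https://example.com/articles/1")
        print(agent, "allowed" if allowed else "blocked")
```

Note that robots.txt is only one way a site can signal a rights reservation, so a check like this covers that one signal and is not a complete compliance test.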
Responsibility Isn’t for Outsourcing
Signatories of the Code can’t hide behind their tech. They must:
- Have an internal copyright policy;
- Publish a summary of that policy;
- Provide a clear contact point for copyright complaints;
- And, crucially, implement safeguards to avoid generating outputs that replicate copyrighted content.
Why This Matters
Because uncontrolled AI can become a copyright cowboy, harming both creators and public trust. The EU is aiming for balance: don’t stifle innovation, but don’t trample the law either. It’s a clear signal to AI developers: the Wild West days are over.
If AI wants to be a digital citizen of the EU, it has to follow the rules. Not just in spirit, but through tangible mechanisms, policies, and accountability.
Europe says: “Didn’t train legally? No certificate.”
And that’s a message the AI industry would do well to take seriously. Regulation is coming. And it’s bringing robots.txt with it.



