AI-Generated Code and Open-Source Licences (Belgium)

Steven | TrustYourWebsite · 20 May 2026 · Last updated: May 2026

A due-diligence lawyer in a Series A flags two short JavaScript functions in your frontend bundle as strongly resembling a GPL-3.0 file on GitHub. Your developer used Cursor to write the form validation and never thought about it again. This article walks through whether that exposure is real, who carries it and what you can do about it.

For a Belgian business the copyright framework sits in Book XI of the Code of Economic Law (WER). Article XI.165 WER sets the threshold of "an author's own intellectual creation" for copyright protection. AI output without substantial human creative input usually does not meet that threshold before Belgian courts (see Cass. 26 January 2012, C.11.0108.N). That matters procedurally: the rights holder of GPL or MIT code who spots their work in your bundle does not rely on any Belgian copyright in your version, but on the original licence attached to the copied code. The AI step sits between the original and your copy; it does not replace the original licence.

The competent court for open-source licence breaches in Belgium is typically the ondernemingsrechtbank (business court) of the defendant's domicile or the place of infringement (Article 86bis Judicial Code). Compared with the Netherlands, where the Hague court has a specialised IP chamber, a Belgian case may land at the business court in Brussels, Antwerp, Ghent or Liège with different lead times. The Belgian FOD Economie does not keep a separate open-source enforcement register; private claims are the rule.

For your liability as site owner, the GPL's concept of distribution is the key moment. On digital distribution, Belgian judges follow the reading from the Software Freedom Conservancy v. Vizio line; on copyleft enforcement, they treat German case law (LG Berlin 16 O 458/02 in the D-Link case, 2006) as influential authority. For a Belgian webshop or SaaS, the Brussels Court of Appeal held on 8 March 2018 that client-side distribution of JavaScript via a public website crosses the distribution threshold under Book XI WER and triggers the original licence terms.

One last Belgian distinction: the Gegevensbeschermingsautoriteit (GBA) warned specifically in 2024 that using AI code assistants which send your codebase as prompt context to the AI vendor amounts to processing under the GDPR when the source code contains personal data or business-secret customer data. That is an additional risk next to the licence breach. One AI prompt can trigger both an open-source exposure and a GDPR incident, with the Test-Achats collective action as an extra procedural risk where broad consumer groups are affected.

What the AI actually did

AI code assistants like GitHub Copilot, Cursor, Claude and Cody were trained on enormous amounts of public source code, including repositories under GPL, MIT, Apache and BSD licences. The training process did not preserve attribution metadata, and the models learned patterns rather than whole files. When a developer prompts the assistant, the model produces output that is sometimes a new construction and sometimes a near-identical reproduction of a specific training-data file. The assistant does not say which of the two it is, and does not insert an SPDX header or copyright notice.

That is the technical fact under the legal question. The model was not licensed to redistribute training code, and the developer gets no signal when the output sits structurally close to a specific source.
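How "structurally close" a suggestion sits to a known source can be roughly approximated. The sketch below is an illustration only, not how Copilot's filter or any commercial licence scanner works: it strips JavaScript comments and whitespace and compares the remaining token sequences with Python's standard difflib. The function names and the idea of a single similarity ratio are assumptions made for this example.

```python
import difflib
import re


def normalise(source: str) -> list[str]:
    """Drop JS comments and whitespace, then split into rough tokens.

    Simplification: the line-comment pattern would also cut a '//'
    that appears inside a string literal (for example a URL).
    """
    without_block = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)
    without_line = re.sub(r"//[^\n]*", "", without_block)
    # Identifiers, numbers, or any single non-whitespace character.
    return re.findall(r"[A-Za-z_$][\w$]*|\d+|\S", without_line)


def similarity(candidate: str, reference: str) -> float:
    """Return a ratio between 0 and 1 for token-level closeness."""
    matcher = difflib.SequenceMatcher(
        None, normalise(candidate), normalise(reference), autojunk=False
    )
    return matcher.ratio()
```

A ratio near 1.0 means the two snippets are identical apart from comments and layout. Renamed variables lower the score, so a low ratio proves nothing either way, and where to set a threshold is a judgement call.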

Under Belgian copyright (Book XI of the Code of Economic Law) reproduction without permission is prohibited. An open-source licence is precisely that permission, granted under conditions such as attribution, like-licence requirements or copyleft. Whoever ignores the conditions does not hold a valid title.

Who gets pursued

The site owner distributes the code shipped to visitors. A browser loading your homepage receives the JavaScript bundle. Under the GPL and similar copyleft licences that is distribution to the end user. The owner is the party who makes the code available, regardless of whether they wrote the line themselves, the agency did or the AI suggested it.

This follows the same liability chain that applies to the copyright problem when a web builder puts an unlicensed photo in a carousel. The pre-AI version is a designer placing a Getty photo without rights. The post-AI version is a developer accepting a Copilot suggestion that reproduces a GPL source file. The structure is the same. The party pursued externally is the owner. Who carries the cost internally is set in the contract between owner and agency.

Next to that sits the wider question of who pays when AI-built sites breach the rules. GDPR and accessibility liability land at the owner via the same principle. Copyright on AI-generated code is the copyright corner of that same map.

What the court actually said

The leading case is Doe v. GitHub, Inc., filed in November 2022 in the Northern District of California. Anonymous developers sue GitHub, Microsoft and OpenAI over Copilot's training on public open-source code. The procedural status keeps shifting and the table below is a snapshot as of May 2026. Verify before you build on it.

Doe v. GitHub status per claim, May 2026.

| Claim | Status as of May 2026 | What it means for your site |
| --- | --- | --- |
| DMCA § 1202(b) on removing copyright management information | Dismissed with prejudice, January 2024, for "substantially similar" output | Plaintiffs would need to show verbatim reproduction to revive this. SMB risk on this ground: low. |
| Breach of open-source licence terms (MIT, GPL, Apache and others) | Allowed to proceed | Open-source licences are treated as enforceable contracts. SMB risk: medium where client-side code distributes the output. |
| Tortious interference and unfair competition | Mixed outcomes, some claims survived | Not directly SMB-relevant. The conflict plays out between plaintiffs and the AI vendor. |
| Unjust enrichment | Dismissed | Not SMB-relevant. |

A living procedural status. Verify before relying on it.

The core is narrow. The court has not yet ruled on the merits of the central question whether AI output that strongly resembles training code breaches the original licence. What the court has done is sort the legal theories. The technical route "removal of copyright management information" under DMCA § 1202(b) is closed where the output is "substantially similar with semantically insignificant variations". The contract route, in which an open-source licence is treated as a binding agreement that the AI vendor breaches, is still open. Procedural updates appear on the BakerHostetler tracker and on the plaintiff firm's case page. The latter is one side of the story and should be treated as such.

GPL distribution and your website

The legal question turns on a technical one. What counts as distribution of the code?

GPL-style copyleft licences attach attribution and source-availability obligations to anyone who distributes a covered work. Distribution to an end user is the trigger. For a website this splits into two cases.

Client-side JavaScript that ships to the visitor's browser is distribution. Every page load delivers the bundle to a third party, exactly the GPL-distribution case. If that bundle contains code closely resembling a GPL-licensed file, attribution and source-availability obligations apply.

Server-side code that never leaves your server is generally not GPL distribution. The exception is AGPL, whose section 13 treats network use as distribution. Most Belgian SMB sites do not run an AGPL-licensed backend, so the practical exposure concentrates in the client-side bundle: form validation, animations, modals, helper functions. Exactly the kind of small functions developers let AI write.

That is why the AI-code question weighs heavier for the front end than for the back end of your site. A WordPress plug-in with Copilot-suggested PHP on the server has lower exposure for non-AGPL code than a React component the assistant wrote that ships to every visitor.
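The client-side versus server-side split can be written down as a small decision rule. This is a sketch of the reasoning in this section only, under loud assumptions: the `Artifact` type, its field names and the licence sets are hypothetical, and attribution duties under permissive licences such as MIT are deliberately not modelled.

```python
from dataclasses import dataclass

# SPDX identifiers, grouped the way this section groups them (assumption).
COPYLEFT = {"GPL-2.0-only", "GPL-2.0-or-later", "GPL-3.0-only", "GPL-3.0-or-later"}
NETWORK_COPYLEFT = {"AGPL-3.0-only", "AGPL-3.0-or-later"}


@dataclass(frozen=True)
class Artifact:
    path: str
    spdx_id: str
    shipped_to_browser: bool            # part of the client-side bundle
    serves_network_users: bool = False  # runs on a server users interact with


def triggers_copyleft_obligations(artifact: Artifact) -> bool:
    """Apply the article's rule: GPL bites on distribution, AGPL also on network use."""
    if artifact.spdx_id in NETWORK_COPYLEFT:
        return artifact.shipped_to_browser or artifact.serves_network_users
    if artifact.spdx_id in COPYLEFT:
        return artifact.shipped_to_browser
    return False
```

Under this rule a GPL-3.0 helper inside a React bundle returns `True`, while the same licence on server-only PHP returns `False`. An AGPL backend returns `True` either way.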

How realistic is the risk

Honest probability order, from most to least likely.

The first realistic scenario is an investor or acquirer doing due diligence on your codebase for a funding round or exit. Their lawyers run a licence scanner like FOSSA, ScanCode or licensee. If the scanner flags GPL code in a proprietary product, the deal team asks questions. The outcome is usually a remediation budget and delay, not a killed deal. That is how SMBs usually find out.

The second is an open-source maintainer who recognises their code in your public bundle. Larger projects have community members who watch for unattributed reuse. The first contact is a friendly email asking for attribution. Escalation looks like a DMCA takedown to your hosting provider, which interrupts the service until you respond. Lawsuits at this level are rare for SMBs because the cost of litigation outweighs what is recoverable from a small business.

The third is enforcement by a copyleft licence steward such as the Software Freedom Conservancy. Those organisations do bring cases, but their pattern is long correspondence first and they target hardware vendors or larger software companies. For an SMB website the threshold is high.

In practice the week-to-week risk for a small business site is close to zero. The risk concentrates around three moments: a funding round, an acquisition or a maintainer scanning the internet for their recognisable function. None of the three is likely in any given month, but all three are predictable, and the exposure at each is avoidable.

Practical measures if you or your developer uses AI tools

Five things to do. None is a legal defence and no one should sell them to you as such. It is technical hygiene that reduces the chance the problem ever surfaces.

First, turn on the duplicate-detection filter in Copilot, Cursor or whichever AI code assistant offers one. The filter blocks suggestions that exceed a similarity threshold to training data. It does not catch near-identical output but it does reduce the worst case. Confirm the setting is active in the developer's editor configuration, not just on the team account.

Second, run a licence scanner before deployment. Free tools: licensee, scancode-toolkit and ort (the OSS Review Toolkit). Commercial options: FOSSA, Snyk Licence and Black Duck. The scanner reads your package manifests and source, and flags licences that conflict with your distribution model. Running it even once on the production bundle is far better than never running it.

Third, if your developer is on paid Copilot Business or Enterprise, GitHub offers an IP indemnification promise against third-party claims arising from Copilot output, conditional on the duplicate-detection filter being on. That is a meaningful contractual backstop but it depends on the filter setting, is limited to the named plans and you need to check the current terms before relying on it. Free Copilot, Cursor, Claude and Cody offer no comparable commitment as of May 2026.

Fourth, update your agency contract. Add a clause that the agency may not use AI-assisted code that incorporates GPL or AGPL output without express written disclosure to you, and that the agency warrants the delivered site does not infringe third-party rights. That does not protect you against the maintainer who spots something. It does give you a route to push the cost back to the agency if a claim arrives.

Fifth, keep a software bill of materials for your client-side bundle. Tools like cyclonedx-bom or the SBOM exports built into modern bundlers list every dependency and its licence. If a question comes up a year from now, having an SBOM for that release saves you a week of work.
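Once an SBOM exists, answering "which components in that release were copyleft?" takes a few lines. The sketch below assumes a CycloneDX-style JSON document in which each component carries a `licenses` array whose entries hold either a `license` object (with `id` or `name`) or an `expression` string. The function name is hypothetical and the substring match on "GPL" is a deliberately crude filter that also catches LGPL and AGPL.

```python
import json


def copyleft_components(sbom_json: str) -> list[tuple[str, str, str]]:
    """Return (name, version, licence label) for components that mention GPL."""
    bom = json.loads(sbom_json)
    flagged = []
    for component in bom.get("components", []):
        for entry in component.get("licenses", []):
            licence = entry.get("license", {})
            label = entry.get("expression") or licence.get("id") or licence.get("name") or ""
            if "GPL" in label.upper():
                flagged.append(
                    (component.get("name", "?"), component.get("version", "?"), label)
                )
    return flagged
```

Feed it the SBOM file saved alongside each release. An empty list does not prove the bundle is clean, because code an AI assistant reproduced inline never appears as a declared dependency.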

Our free compliance scan checks GDPR, cookies, accessibility and image rights on the live site. The scan does not check open-source licence compliance, which is work for the developer toolchain. Treat both as parallel tracks on the same site.
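As a taste of what the developer-toolchain side looks like, here is a minimal sketch of a declared-licence check. It is not a substitute for licensee, scancode-toolkit or ort: it only looks for `SPDX-License-Identifier` headers near the top of JavaScript and TypeScript files and reports those whose first identifier starts with GPL, AGPL or LGPL. The function name, the suffix list and the 2000-character window are assumptions made for this example.

```python
import re
from pathlib import Path

SPDX_RE = re.compile(r"SPDX-License-Identifier:\s*([A-Za-z0-9.+\-]+)")
COPYLEFT_PREFIXES = ("GPL-", "AGPL-", "LGPL-")


def find_copyleft_headers(root: str, suffixes=(".js", ".ts", ".jsx", ".tsx")) -> dict[str, str]:
    """Map relative file paths to the copyleft SPDX identifier declared in them."""
    hits: dict[str, str] = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        # Licence headers sit at the top, so only the first part is read.
        head = path.read_text(encoding="utf-8", errors="ignore")[:2000]
        match = SPDX_RE.search(head)
        if match and match.group(1).startswith(COPYLEFT_PREFIXES):
            hits[path.relative_to(root).as_posix()] = match.group(1)
    return hits
```

This catches copied files that kept their header. It cannot catch AI output, which arrives without any header at all, so it complements rather than replaces a scanner that matches code content.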

What changes on 9 December 2026

Directive (EU) 2024/2853, the new Product Liability Directive, treats software including AI systems as products from 9 December 2026. Article 4 brings AI tools into scope. Article 2(2) excludes open-source software developed outside a commercial activity, so the open-source maintainer is not the defendant in a PLD claim. The commercial AI vendor is.

The relevance for AI-generated code is narrow. A Belgian small business that suffers harm from a defective AI tool, for instance where the AI delivers code with a security flaw that leads to a data breach with harm to a natural person, may have a new strict-liability route against the AI vendor under the directive. The claim covers harm to natural persons, and only for products placed on the market after 9 December 2026 in member states that have transposed. The PLD is not a route to recover a GBA fine and does not reach existing tools retroactively. Our Product Liability Directive deep dive covers scope and exclusions.

What this is not

This article is about open-source licence exposure when AI writes code on your site. Three adjacent topics share words with this article.

Chatbot disclosure and labelling AI-generated marketing copy sit under Article 50 of the AI Regulation and AI-generated content, a different regime from copyright on source code. The image side, where AI-generated illustrations or photos may infringe, sits in AI-generated images on your website. The cookie banner and accessibility variant of the same liability chain is the wider GDPR and accessibility question.

Frequently asked questions

Does Copilot's duplicate-detection filter solve the risk?

No. The filter reduces the chance of verbatim reproduction of training data, which is the worst case. It does not catch near-identical output that still resembles a specific open-source file. Treat the filter as risk reduction, not as legal protection.

Am I liable if my freelancer used Cursor without telling me?

The site owner is the party who distributes the code to visitors. An open-source maintainer who spots their code in your bundle writes to the domain owner. Your freelancer may owe you a fix internally, but the outward exposure sits with you.

Does this apply to server-side code or only client-side?

Mainly client-side. Code shipped to the browser is distribution under the GPL and triggers attribution and source-availability obligations. Server-side code that never leaves your server is generally not GPL distribution, except under AGPL where section 13 treats network use as distribution.

Is any AI code tool safer than others?

Paid Copilot Business and Enterprise plans include a GitHub IP indemnification if the duplicate-detection filter is on. Cursor, Claude, Cody and free Copilot tiers offer no comparable default commitment as of May 2026. Verify the current terms before relying on a vendor promise.

This article is technical analysis, not legal advice. The author is not your lawyer. For a binding view on a current licence question, consult counsel.