Having made over a decade's worth of open source software freely available online, I actually really appreciate the value that AI and LLMs have provided me.
The thing that leaves a bad taste in my mouth is that my works were likely included in the training data and, even if that doesn't violate my licenses (GPL v2/v3), it certainly feels against the spirit of what I intended when distributing my works.
I was made redundant recently "due to AI" (questionable), and it feels like my works contributed in some way to my redundancy: they fed the profits made by these AI megacorps while I am left a victim.
I wish I could be provided a dividend or royalty, however small, for my contribution to these LLMs but that will never happen.
I've been looking for a copy-left "source available" license that allows me to distribute code openly but has a clause that says "if you would like to use these sources to train an LLM, please contact me and we'll work something out". I haven't yet found that.
I'm guessing that such a license would not be enforceable because I am not in the US, but at least it would be nice to declare my intent and who knows what the future looks like.
I don't think there's a meaningful case, by the letter of the law, that training on data including GPL-licensed software doesn't obligate every producer of such models to make both the models and the software stack supporting them available under the same terms. Of course, it also seems clear in the present landscape that the law often depends more on the convenience of the powerful than on its actual construction and intent, but I would love to be proven wrong about that, and this kind of outcome would help.
Sure, but that's more a result of policy decisions than an inevitable result of some natural law. Corporate lawlessness has been reined in before and it can be again
Reading this I hear The Roots playing The Seed 2.0[1] in my mind.
It’s a wild thought to think that of all the things that will remain on this earth after you’re gone, it’ll be your GPL contributions reconstituting themselves as an LLM’s hallucinations.
Taken to a hallucinated but logical conclusion, we might define a word such as "cene" to riff off of "meme" and "gene".
The c is for code. If adopted, we could spend forever arguing how the c is pronounced and whether the original had a cedilla, a circumflex, or rhymes with bollocks, which seems somehow appropriate. Everyone uses xene instead. x is chi, but most people don't notice.
Me too, and I use LLMs often for personal and professional work. Knowing that colleagues are burning through $700/day worth of tokens, a small fraction of them likely derived from my work, while I get made redundant is a bit shite.
> I've been looking for a copy-left "source available" license that allows me to distribute code openly but has a clause that says "if you would like to use these sources to train an LLM, please contact me and we'll work something out". I haven't yet found that
Frankly, do you think AI companies have even the remotest respect for these licenses anyway? They will simply take your code if it's publicly scrapeable and train their models, exactly as they have so far. Then it will be up to you to chase them down and try to sue or whatever. And good luck proving the license violation.
I dunno. I just don't really believe that many tech companies these days are behaving even remotely ethically. I don't have much hope that will change anytime soon
All the infrastructure that runs the whole AI-over-the-internet juggernaut is essentially all open source.
Heck, even Claude Code would be far less useful without grep, diff, git, head, etc., etc., etc. And one can easily see a day where something like a local sort of Claude Code talking to open-weight and open-source models is the core dev tool.
> All the infrastructure that runs the whole AI-over-the-internet juggernaut is essentially all open source.
Exactly.
> Heck, even Claude Code would be far less useful without grep, diff, git, head, etc.
It wouldn't even work. It's constantly using those.
I remember reading a Claude Code CLI install doc and the first thing was "we need ripgrep" with zero shame.
All these tools also basically run on top of Linux, with Claude Code actually installing a full Linux VM on Windows and macOS systems.
It's all open-source command-line tools and an open-source OS, piping one program into the other. I've been on Linux on the desktop (and servers, of course) since the Slackware days... and I was right all along.
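As a toy illustration of the point above (this is a sketch, not Claude Code's actual implementation), an agent "tool call" frequently bottoms out in exactly these open-source utilities, invoked as subprocesses:

```python
# Hypothetical sketch: how an agent-style tool might lean on grep and diff.
# Both utilities here are classic open-source command-line programs.
import pathlib
import subprocess
import tempfile

workdir = pathlib.Path(tempfile.mkdtemp())
(workdir / "old.txt").write_text("alpha\nbeta\ngamma\n")
(workdir / "new.txt").write_text("alpha\nBETA\ngamma\n")

# Locate the line to change with grep -n (line-numbered matches)...
hit = subprocess.run(["grep", "-n", "beta", "old.txt"],
                     cwd=workdir, capture_output=True, text=True)
# ...then inspect the edit with diff (exit code 1 just means "files differ").
delta = subprocess.run(["diff", "old.txt", "new.txt"],
                       cwd=workdir, capture_output=True, text=True)

print(hit.stdout.strip())            # 2:beta
print(delta.stdout.splitlines()[0])  # 2c2
```

Strip away the model and this is most of what's left: locate, compare, patch, all with tools that predate the hype by decades.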
It’s such a fun time to have 1+ decade(s) of experience in software. Knowing what simple and good are (for me), and being able to articulate it has let me create so much personal software for myself and my family. It has really felt like turning ideas into reality, about as fast as I can think of them or they can suggest them. And adding specific features, just for our needs. The latest one was a slack canvas replacement, as we moved from slack to self-hosted matrix + element but missed the multiplayer, persistent monthly notes file we used. Even getting matrix set up in the first place was a breeze.
$20/month with your provider of choice unlocks a lot.
Edit: the underlying point being, yes to the article. Either building upon the foundations of open source to make personal things, or just modifying a fork for my own needs.
I don’t know what SaaS has to do with FOSS. The point of FOSS was to allow me to modify the software I run on my system. If the device drivers for some hardware I depend on are no longer supported by the company I bought it from, if it’s open source, I can modify and extend the software myself.
Copyleft licenses ensure that I share my modifications back if I distribute them. It's a thing for the public good.
Agent-based software development walls people off from that. Mostly by ensuring that the provenance of the code it generates is not known and by deskilling people so that they don’t know what to prompt or how to fix their code.
I’m not so sure… what I see as more likely is that coding agents will just strip parts from open source libraries to build bespoke applications for users. Users will be ecstatic because they get exactly what they want and they don’t have to worry about upstream supply chain attacks. Maintainers get screwed because no one contributes back to the main code base. In the end open source software becomes critical to the ecosystem, but gets none of the credit.
But the users would have to maintain their own forks then. Unless you stream back patches into your forks, which implies there's some upstream being maintained. Software doesn't interoperate and maintain itself for free - somebody's gotta put in the time for that.
I think as long as AI isn't literal AGI, social pressures will keep projects alive, in some state. There definitely is something scary about stealing entire products as a means of new market domination: e.g. steal Linux, make a corporate Linux, and force everybody to contribute to the corporate Linux only (many Linux contributors are paid by corporations, after all), making that the new central point of reference. The worst-case scenario might be Microsoft then, in collusion (which I admit is far-fetched, but definitely possible), completely adopting Linux for servers and headless compute while enforcing hardware restrictions so strict that only Windows works.
5 years ago, I set out to build an open-source, interoperable marketplace powered by open-source SaaS. It felt like a pipe dream, but AI has brought the dream to fruition. People are underestimating how much AI is a threat to rent-seeking middlemen in every industry.
> SaaS scaled by exploiting a licensing loophole that let vendors avoid sharing their modifications.
AI is going to exploit even more: "Given the repository -> Construct tech spec -> Build project based on tech spec"
At this stage, I want everyone to just close their source and stop working on open source until this licensing issue gets resolved.
Any improvement you make to open source code will be leveraged in ways you didn't intend, eventually making you redundant in the process.
Maybe, but I don't really believe users can or want to start designing software, even if it were possible, which today it isn't unless you already have software dev skills.
That would basically make each user a product manager and UX designer, roles most aren't really capable of filling currently. At most they will discover that what they think they want isn't what they actually want.
The real unlock here isn't users becoming devs, it's maintainers becoming 10x more productive. Most OSS projects die because the maintainer burned out fixing bugs nobody wants to fix. If agents can handle the boring parts (triage, repro, patch obvious stuff) the maintainer can focus on design decisions and reviewing PRs instead of drowning in issues. That changes the economics completely.
This feels like an AI-generated comment, but I'll reply anyway. AI has been a massive negative for open source, since every project is now drowning in AI-generated PRs that don't work, reports of issues that don't exist, and a general mountain of time-wasting automated slop.
We are getting to the point where many projects may have to close submissions from the general public since they waste far more time than they help.
I’m impressed by how current times make us consider so many completely opposite scenarios. I think it can indeed foster progress, but it can also have negative impacts.
It compares and contrasts open source and free software, and then gives an example of how free software is better than closed software.
But if the premise of the article, that the agent will take the package you pick and adapt it to your needs, is correct, then honestly the agent won't give a rat's ass whether the starting point was free software or merely open source.
First of all, free software still matters. And being a slave to a $200 subscription to an oligarch's application that launders other people's copyright is not what Stallman envisioned.
The AI propaganda articles are getting more devious by the minute. It's not just propaganda; it's Bernays-level manipulation!
Unfortunately for me, I have reason to believe that the algorithms won't allow me to get exposure for my work no matter how good it is so there is literally no benefit for me to do open source. Though I would love to, I'm not in a position to work for free. Exposure is required to monetize open source. It has to reach a certain scale of adoption.
The worst part is building something open source, getting positive feedback, and helping a couple of startups, and then some big corporation comes along and implements a similar product. Everyone gets forced by their bosses to use the corporate product against their will, and people eventually forget your product exists because there are no high-paying jobs that let people use it.
I think the opposite. It will make all software matter less.
If trendlines continue, it will be faster for AI to vibe-code said software to your customized specifications than to sign up for a SaaS and learn it.
"Claude, create a project management tool that simplifies Jira; customize it to my workflow."
So a lot of apps will actually become closed source personalized builds.
I can already build a ticket tracker in a weekend. I’ve been on many teams that used Jira, nobody loves Jira, none of us ever bothered to DIY something good enough.
Why?
Because it’s a massive distraction. It’s really fun to build all these side apps, but then you have to maintain them.
I’m guessing a lot of vibeware will be abandoned rather than maintained.
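For a sense of scale, the "weekend ticket tracker" claim above is plausible because the core really is tiny. A minimal illustrative sketch (in-memory only; persistence, auth, and search — the parts you end up maintaining — are deliberately omitted):

```python
# Minimal illustrative ticket-tracker core: the easy part a weekend
# (or a vibe-coding session) buys you. Everything hard is left out.
import itertools
from dataclasses import dataclass, field


@dataclass
class Tracker:
    _ids: itertools.count = field(default_factory=itertools.count, repr=False)
    tickets: dict = field(default_factory=dict)

    def open(self, title: str) -> int:
        """Create a ticket and return its id."""
        tid = next(self._ids)
        self.tickets[tid] = {"title": title, "status": "open"}
        return tid

    def close(self, tid: int) -> None:
        self.tickets[tid]["status"] = "closed"

    def open_tickets(self) -> list:
        return [t["title"] for t in self.tickets.values()
                if t["status"] == "open"]


t = Tracker()
a = t.open("Fix login bug")
t.open("Write docs")
t.close(a)
print(t.open_tickets())  # ['Write docs']
```

The gap between this sketch and Jira is exactly the maintenance burden the comment above describes.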
And then you get a new hire who already knows the common SaaS products but has to relearn your vibe-coded version that no one else uses and for which no information exists online.
There is a reason why large proprietary products remain prevalent even when cheaper better alternatives exist. Being "industry standard" matters more than being the best.
But then all your local stuff is based on open-source software, unlike the SaaS which is probably not all the way open.
I've always preferred my stack to be on the thinner, more vanilla, less prebuilt side than those around me, and it seems like LLMs are reinforcing that approach now.
There's too much value in familiar UX. "Don't make the user think" is the golden rule these days. People used to have mental bandwidth for learning new interfaces... But now people expect uniformity
Due to copyright law and piracy bleed-through, one can't safely license "AI" output for some other use case without the risk of getting sued or hit with DMCA strikes. You can't make it GPL, or closed source... because it is not legally yours, even if you paid someone for tokens.
Like all the code generators that came before, the current crop of LLMs will end up a niche product after the hype cycle ends. "AI" only works if the models are fed other people's real works, and the web is already >52% nonsense now. They add the Claude contributor flag to Git projects so the scrapers don't consume as much of their own slop. ymmv =3
tl;dr: didn't finish the article, but I absolutely do this already. Much of the software I use is FOSS, and Codex adjusts it to my needs. Sometimes it's really good software and I end up adding something that already exists. Whatever, tokens are free...
Are you saying you believe that, untested but technically, models trained on GPL sources need to be distributed under the GPL?
[1]: https://youtu.be/ojC0mg2hJCc
There are also people who want to be eaten by a literal cannibal. I say, no thanks.
Take a litigious company like Nintendo. If one were to train an LLM on their works and the LLM produced an emulator, that would force a lawsuit.
If Nintendo wins, then LLMs are stealing. If Nintendo loses, then we can decompile everything.
Without the ability to string together the basic utilities into a much greater sum, Unix would have been another blip.