no chats / 2022-12-06
Avoiding ChatGPT; updates on economics (trends going in good directions), law (it’s messy), explainability (nothing good).
If you're off Twitter, congrats, you're a better person than me. (Though I did finally re-join Mastodon this week!) That means you missed the blizzard of experimentation with – and commentary on – ChatGPT, a new user interface for GPT-3.5 from OpenAI. I'd promised to do a round-up of it but... have decided against it, because it's been mostly done to death. So instead, some other open-adjacent ML news from this week.
Economics
The economics of ML are still challenging for open(ish), which has tended to assume cheap, decentralized computation. Two announcements this week suggest that costs continue to trend in a favorable direction:
- Training on commodity hardware? State-of-the-art training techniques, developed in high-end labs, assume extremely high bandwidth between GPUs. A new algorithm trains models over much slower commodity network connections, suggesting that the cost of training may fall in the future.
- Executing on commodity hardware: Apple's latest operating system betas feature a 20x speedup for Stable Diffusion. When Apple is using your code as a testbed, you've caught the wave.
Licensing and law
Open(ish) legal issues continue to burble.
- RAIL continues to fragment: Stable Diffusion 2.0, released hours after the last newsletter, uses a very slightly different version of the RAIL license, dropping a requirement that new versions of the model be used. (The person who apparently suggested the change was... not impressed.)
- "Pay to Train": A new license addendum that attempts to be "human only", i.e., humans can use it but if you want to train code-generation on it, you have to pay. I have many questions (including about who is behind it), but the actual license text requires going through some GitHub hoops so I have not seen it yet. (Related, the SPDX license metadata stnadard has a proposal to broaden how it supports license exceptions—of which this could become one.)
- Registration: Lawyer Van Lindberg is working to defend the right of a human to register a copyright on a work developed with ML tooling. This seems essentially correct to me—there may be some line where ML does so much of the creative work that the output can't be copyrighted, but in the subject work it is very clear that lots of human creativity was involved. (Alex Champandard, I think correctly, predicts that ML tools will eventually provide logging of 'creative effort' to help with copyright registration—perhaps akin to the logs researchers are encouraged to keep for patenting purposes.)
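For readers unfamiliar with the SPDX mechanism mentioned above: SPDX license expressions already model exceptions with the `WITH` operator, pairing a base license identifier with an exception identifier from SPDX's official lists, and expressions compose with `AND`/`OR`. A "pay to train" carve-out could, in principle, become one more exception identifier. A few illustrative expressions (the first two use real SPDX identifiers; the structure, not the specific proposal, is the point):

```
GPL-2.0-only WITH Classpath-exception-2.0
Apache-2.0 WITH LLVM-exception
MIT OR (GPL-3.0-or-later WITH GCC-exception-3.1)
```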
Transparency and Explainability
- Hallucination: I am tickled that hallucination is essentially an ML term of art, meaning "it's impossible for an AI to remember everything, so sometimes it just makes stuff up". One wonders if transparency/explainability techniques will get more focus now that much of the critique of ChatGPT and Galactica has homed in on this problem.
- Governance as a service: Model cards are now well-understood enough as a concept to be featured in an AWS re:Invent keynote.
- Explainability may be overrated: I've been intrigued (as I wrote last week) by the potential for transparency to be an appropriate focus for ML licensing (as a subset of ML governance). But this paper suggests that explainability, while nice in theory, isn't (yet?) that useful. It can be read, usefully, in conversation with this piece on how explainability can be just a dodge to avoid the real, harder questions of governance—which is why I would again stress that transparency is important but can't be viewed as sufficient.
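As a concrete reference point for the model-card discussion above: the most widely deployed model-card format is probably Hugging Face's, where the card is a README with YAML metadata at the top. A minimal sketch (all field values here are illustrative, not from any real model):

```yaml
---
# YAML front matter of a Hugging Face-style model card (illustrative values)
license: openrail        # machine-readable license identifier
tags:
  - text-to-image
datasets:
  - example-org/example-dataset   # hypothetical training-data pointer
---
```

The prose sections below the metadata (intended use, limitations, bias, evaluation) carry the governance substance; the front matter is what makes cards machine-readable enough to surface in tooling like the AWS offering mentioned above.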
Misc.
- Mozilla acquihires a team to "spearhead [their] efforts in applied ethical machine learning".
- Creative Commons has a year-in-review post on their efforts throughout the year on AI. There's a lot, grounded in principles of originality and reuse—old CC themes, applied to new techniques. Kudos to that team.
- An "attempts to avoid AI responsibility" bingo card.
- Ethical schism time! I said at some point early on that I am not particularly interested in spending time on "are we building Skynet", but didn't elaborate on why. I think this essay on that topic is very good, and pretty much endorse the "reform" position it details.