
Live from Sweden / 2023-04-19

Upcoming public panel moderation; shifting power via foundation models (open or not); public ability to understand via great WaPo visualization; modularity; and more.
Pencil sketch of a "swedish post" via Midjourney: sailboats in front of a church, with trees and birds.

Intro

I’ve been doing a lot of speaking on ML of late (I think four streams/podcasts and one conference track in four weeks?), so the newsletter has suffered. Thanks to all of you who have referred me to speaking opportunities; it’s been fun! Today I'm coming to you live from Göteborg, Sweden, where tomorrow I'll co-lead a conference track on machine learning and open.

Events

(All streaming unless otherwise noted)

Values

In this section: what values have helped define open? Are we seeing them in ML?

Improves public ability to understand

  • The Washington Post has a great visualization and report on what data is used in one of the key semi-public data sets, C4. This is the kind of democratic oversight that (1) is extremely necessary and (2) can only happen when data sets are open(ish) enough to be accessible and legible to the media. (A quick illustration follows this list.)
  • On the flip side, #5 in this terrific list of “Eight Things To Know About LLMs” is that “Experts are not yet able to interpret the inner workings of LLMs”. This is a nice, concise summary of the research in this area—and suggests that, at least for the moment, making models available to researchers is not a panacea for interpretability.
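
To make the C4 point above concrete: because the data set is published, anyone can stream a sample and tally which domains it draws on, which is essentially what the Post did at scale. A minimal sketch, assuming the Hugging Face "allenai/c4" mirror; the config and field names below match that release, but treat the details as illustrative:

```python
# Minimal sketch: tallying which web domains a C4 sample draws on.
# Assumes the Hugging Face "allenai/c4" mirror; the "en" config and
# "url" field match that release, but verify before relying on them.
from collections import Counter
from urllib.parse import urlparse

from datasets import load_dataset

# Stream, so we never download the full multi-hundred-gigabyte set.
c4 = load_dataset("allenai/c4", "en", split="train", streaming=True)

domain_counts = Counter(
    urlparse(record["url"]).netloc
    for record in c4.take(10_000)  # small sample, illustration only
)

for domain, count in domain_counts.most_common(10):
    print(f"{count:5d}  {domain}")
```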

Shifts power

I’ve mused here before that the “foundation models” approach is an important one to understand, not just technically but because whoever provides and controls those models will have a lot of power.

Daniel Jeffries, formerly (briefly?) of StabilityAI, muses at length on who will “win” in foundation models. His take: there will be “Foundation Model as a Service companies who basically offer intelligence as a service but even more importantly they offer peace of mind: Guardrails, logic fixes, safety measures, upgrades, rapid response to problems.” But getting there will be costly and error-prone, because making the wrong choices at the beginning will mean throwing everything away to retrain. The essay ends with a long section on open business models in this space that is particularly worth reading.

One oversight in the Jeffries essay is the regulatory environment. I think this may push towards open (or at least transparent) in a way that regulation of traditional software has not. Besides the safety considerations I’ve already covered here repeatedly, there’s also a growing push within academia to do research based on open models. If you’re interested in reading more on that, here’s a long read focused on natural language processing research, and a more recent editorial in Nature. It will be interesting to see if this advocacy succeeds and tips the general policy balance in favor of open foundation models.

Techniques

In this section: open software defined, and was defined by, new techniques in software development. What parallels are happening in ML?

Model improvement

This paper is a very deep dive (with an excellent, short executive summary) on what terminology and techniques we might use to discuss safety and security in ML models. Highly recommended for anyone thinking about this; the comparisons to old software-security techniques are problematic, and we need to build a better vocabulary if we want to get this right.

Modularity

New sub-section here: modularity is a key open source software technique, enabled by low-friction licensing. Are we seeing it in ML? One candidate pattern is sketched below.
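
If a parallel does emerge, the most obvious candidate right now is the adapter ecosystem, where a small fine-tuned module is trained, distributed, and licensed separately from the frozen base model it plugs into. A minimal sketch, assuming the Hugging Face transformers and peft libraries; both repository IDs are hypothetical placeholders:

```python
# Minimal sketch of a "modular" pattern in ML: a small LoRA adapter
# loaded on top of a frozen base model. Both repository IDs below are
# hypothetical placeholders, not real releases.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("example-org/base-model")
model = PeftModel.from_pretrained(base, "example-org/task-adapter")

# The adapter is a separately trained, separately distributed (and
# potentially separately licensed) artifact, loosely analogous to a
# library linked into a larger program.
```

Whether this becomes modularity in the open source sense will depend as much on licensing friction as on the technology.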

Changes

In this section: ML is going to change open—not just how we understand it, but how we practice it.

Creating new things

New unquestionably open models continue to proliferate. From the past few weeks:

And data sets too. This week it is RedPajama, a new data set explicitly aimed at duplicating the LLaMA training data, so that others can reproduce Facebook's LLaMA model. Note that the funding is a mix of academic, government, and startup money, suggesting that the “everyone finds something” economic model of open source software will have at least some applicability in open(ish) ML.
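
The data is already browsable, too. A minimal sketch of sampling one slice, assuming the "togethercomputer/RedPajama-Data-1T" release on Hugging Face; the config and field names match that release as of this writing, but may change:

```python
# Minimal sketch: peeking at one slice of RedPajama.
# Assumes the "togethercomputer/RedPajama-Data-1T" Hugging Face
# release; the "arxiv" config and "text" field may change.
from datasets import load_dataset

redpajama = load_dataset(
    "togethercomputer/RedPajama-Data-1T",
    "arxiv",
    split="train",
    streaming=True,  # avoid downloading the full ~1T-token corpus
)

for record in redpajama.take(3):
    print(record["text"][:200])  # first 200 characters of each document
```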

Ethically-focused practitioners

Changing regulatory landscape

Collaborative tooling

  • In anti-collaboration news, I increasingly think that before LLMs create impactful “misinfo” themselves, they’ll accidentally create a misinfo crisis by burning out every human moderator on every platform, allowing human-generated misinfo to flourish. Relevant to open: GitHub will be one of the first victims of moderator burnout.

Misc.

Closing note

I’ve been re-reading classic machine-learning-related science fiction; please leave comments or ping me if you have suggestions!

One thing that has jumped out at me is that in Stephenson’s The Diamond Age, the ML-like software is referred to as “pseudo-intelligence”. I really like this; it captures the almost-but-not-quite-thereness.