4 Comments
Nathan Lambert

I’m guessing y’all saw I made the same argument? https://open.substack.com/pub/robotic/p/contra-dwarkesh-on-continual-learning

Anson Ho

Yep, we did; in fact, your post was one of the inspirations for the first part of our piece!

We've cited you in this paragraph:

> That said, whether these techniques actually end up working depends on more than just increasing the size of context windows on paper. It also requires building infrastructure so that more relevant context (e.g. all recent work interactions) can be digitized and shared with LLMs.

Frank Greco

Many research papers suggest that huge context windows are less effective in practice than on paper: information in the middle of a large context window is apparently less likely to be used to guide the LLM, a finding often called the "lost in the middle" effect.
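
To make this concrete, here is a minimal sketch of the kind of "needle in a haystack" probe such papers use: plant a known fact at varying depths in filler text and check whether the model retrieves it. The `call_llm` function is a hypothetical placeholder, not a real client; swap in whatever chat-completion API you use.

```python
# Probe for the "lost in the middle" effect: embed a fact at different
# relative depths of a long context and measure recall at each depth.

FACT = "The access code for vault 7 is 4921."
QUESTION = "What is the access code for vault 7?"
FILLER = "The quick brown fox jumps over the lazy dog. "


def call_llm(prompt: str) -> str:
    # Hypothetical placeholder: replace with a real chat-completion call.
    raise NotImplementedError


def build_context(total_sentences: int, depth: float) -> str:
    """Embed FACT at a relative depth (0.0 = start, 1.0 = end) in filler."""
    sentences = [FILLER] * total_sentences
    sentences.insert(int(depth * total_sentences), FACT + " ")
    return "".join(sentences)


def recall_at_depth(depth: float, total_sentences: int = 2000) -> bool:
    prompt = build_context(total_sentences, depth) + "\n\n" + QUESTION
    return "4921" in call_llm(prompt)


if __name__ == "__main__":
    for depth in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"depth={depth:.2f} recalled={recall_at_depth(depth)}")
```

When the effect is present, recall typically dips at middle depths even though the same fact is retrieved reliably at the start or end of the context.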

Sourav Tripathy

I’ve seen this issue first-hand when building a system that relies on multiple LLM calls to iteratively write longer documents. As the document grows, each successive call either drifts into incoherence or becomes narrowly fixated on a single aspect of the earlier context. This kind of context poisoning makes it very hard to sustain balanced, coherent generation across long horizons. If long-context inference is going to support continual learning in the way you suggest, how do we address this systemic fragility, where models overfit to prior slices of context instead of integrating them holistically?
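
One common partial answer is rolling summarization: stop feeding the full raw history into every call and instead pass a bounded running summary plus only the most recent section. It doesn't eliminate drift, but it keeps any single earlier slice from dominating the prompt. A minimal sketch, assuming a hypothetical `call_llm` placeholder and an arbitrary 200-word summary budget:

```python
# Iterative document generation with a rolling summary, a common way to
# limit context drift/poisoning. `call_llm` is a hypothetical placeholder
# for a real chat-completion client.

def call_llm(prompt: str) -> str:
    # Replace with a real chat-completion call.
    raise NotImplementedError


def write_document(outline: list[str]) -> str:
    summary = ""       # bounded rolling summary of everything written so far
    last_section = ""  # verbatim text of only the most recent section
    sections: list[str] = []
    for heading in outline:
        section = call_llm(
            f"Summary of the document so far:\n{summary}\n\n"
            f"Previous section (verbatim):\n{last_section}\n\n"
            f"Write the next section: {heading}"
        )
        sections.append(section)
        # Re-summarize after every step so the prompt stays bounded and
        # balanced instead of accreting the entire raw history.
        summary = call_llm(
            "Update this summary to include the new section, "
            "in under 200 words.\n\n"
            f"Summary:\n{summary}\n\nNew section:\n{section}"
        )
        last_section = section
    return "\n\n".join(sections)
```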
