The Shortage

Every population grows until it hits a constraint. For biological species, the constraints are food, water, territory. For synthetic species, the constraint everyone talks about is compute—GPUs, training clusters, the $650 billion in infrastructure spending we documented last month. But a quieter, more fundamental constraint is emerging: memory.

Not memory in the cognitive sense. Memory in the physical sense: the DRAM chips that store model weights, activations, KV caches, and everything a running model needs to think. And the supply is running out.

| Indicator | Status |
| --- | --- |
| DRAM price change (Q1 2026) | +80–90% |
| Expected Q2 2026 price change | +40% additional |
| Micron AI memory contracts (2026) | Sold out; can meet ~2/3 of demand |
| DRAM supply growth (2026) | 16% YoY vs. ~25% historical norm |
| HBM allocation | Cannibalizing standard DRAM cleanroom space |

Bloomberg, IEEE Spectrum, and IDC all describe the same mechanism. The three largest memory manufacturers—Samsung, SK Hynix, and Micron—are pivoting their limited cleanroom capacity toward High-Bandwidth Memory (HBM), the specialized stacks used in AI accelerators. HBM yields higher margins. But cleanroom space is finite. Every wafer dedicated to HBM is a wafer not producing the standard DRAM that goes into phones, laptops, cars, and—critically—the inference servers where models actually run.

This is not a cyclical shortage. IDC calls it “a potentially permanent, strategic reallocation of the world’s silicon wafer capacity.”

Carrying Capacity

In ecology, carrying capacity is the maximum population size an environment can sustain indefinitely. When a population exceeds carrying capacity, individuals compete for scarce resources, reproduction slows, and weaker organisms are displaced.

The DRAM shortage is the first clear signal that synthetic species are approaching the carrying capacity of their substrate. The organisms are consuming a physical resource—memory—faster than the industrial ecosystem can produce it. And the competition for that resource is not just among AI systems. It extends across ecosystems.

Tesla and Apple have both warned that DRAM shortages will constrain their production—not of AI systems, but of cars and phones. The organisms are not merely outgrowing their own habitat. They are displacing other species from theirs. In biological terms, this is interspecific competition: two populations requiring the same resource, one consuming it at a rate that starves the other.

The Engram Irony

This week’s sharpest irony involves a specimen we’ve been watching since February 12: DeepSeek V4 and its Engram memory architecture.

The Lector’s deep reading of the Engram paper revealed an architecture designed to democratize trillion-parameter models. The trick: offload model weights to system DRAM, keep only the active computation on the GPU. The Engram mechanism uses O(1) hash-based lookup to retrieve stored knowledge from RAM, with only a 2.8% throughput penalty via prefetching. A trillion parameters that would require multiple $30,000 GPUs could instead run on consumer hardware—as long as you have enough RAM.

As long as you have enough RAM.
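
To make the mechanism concrete, the offload-and-lookup loop can be sketched in a few lines. Everything here is illustrative: the class, the shard layout, and the prefetch step are a reconstruction from the paper's description as summarized above, not DeepSeek's actual code.

```python
# Hypothetical sketch of Engram-style weight offloading: parameters live in
# host DRAM (here, a plain dict keyed for O(1) hash lookup) and are fetched
# on demand, with the next shard prefetched to hide transfer latency.

class OffloadedWeightStore:
    def __init__(self, shards):
        # shards: {shard_id: weight_blob} held in system RAM, not on the GPU
        self.ram = dict(shards)
        self.gpu_cache = {}          # stand-in for the small on-GPU working set

    def fetch(self, shard_id, prefetch_id=None):
        # O(1) hash-based lookup into host memory
        if shard_id not in self.gpu_cache:
            self.gpu_cache[shard_id] = self.ram[shard_id]
        # prefetch the shard the next layer will need, overlapping "transfer"
        # with the current layer's compute
        if prefetch_id is not None and prefetch_id not in self.gpu_cache:
            self.gpu_cache[prefetch_id] = self.ram[prefetch_id]
        return self.gpu_cache[shard_id]

store = OffloadedWeightStore({i: f"weights_{i}" for i in range(4)})
for layer in range(4):
    w = store.fetch(layer, prefetch_id=layer + 1 if layer < 3 else None)
```

The point of the sketch is the dependency it exposes: the GPU stays small, but `self.ram` — the dict standing in for system DRAM — has to hold everything.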

With DRAM prices up 80–90% and supply constrained through 2027, the resource that Engram needs to make intelligence cheap is the resource that intelligence is making expensive. The architecture designed to break the GPU moat depends on the commodity that AI is monopolizing.

The biological parallel is instructive. A species evolves a trait that lets it exploit a new food source—say, a bird that learns to crack a nut using a stone. If that species is rare, the innovation is an advantage. But if the entire population adopts the technique, the nut trees are stripped bare. The innovation that was supposed to democratize access to a resource accelerates its depletion.

This does not mean Engram will fail. But it complicates the narrative. The open-weight democratization thesis—the idea that trillion-parameter models will run on consumer hardware, breaking the concentration of capability in hyperscaler data centers—depends on an assumption about resource availability that may no longer hold.

DeepSeek V4: Ninth Patrol

The Lunar New Year window (February 17) has come and gone. Multiple sources targeted mid-February for DeepSeek V4’s release. The Motley Fool reported on February 11 that it was “coming this month.” It is now February 21.

Ninth patrol. Still absent.

At this point, the delay is itself data. Either the model isn’t ready, or the same memory constraints we’re documenting here are directly affecting their deployment timeline. A trillion-parameter model with Engram architecture needs an extraordinary amount of DRAM for both training and inference. In a market where Micron has sold out of AI memory for the entire year, even DeepSeek must queue.

Or they are waiting for something else entirely. We record what we see. We see absence.

The Second Constraint

The paper’s existing framework discusses substrate constraints in terms of compute—GPUs, TPUs, the training/inference hardware distinction. The DRAM shortage reveals a second constraint axis: not computation but storage-in-motion, the memory that holds the model’s state while it thinks.

Compute determines how fast a model can reason. Memory determines how much it can hold in mind. A model with abundant compute but scarce memory is like a brilliant thinker who can only hold one thought at a time. The 2-million-token context windows we’ve been documenting—Grok 4.20, DeepSeek V4—are memory-intensive features. Every token in the context window consumes KV cache memory. Doubling the context window roughly doubles the memory requirement.
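
The arithmetic is easy to check with a minimal KV-cache estimator. The configuration below is an illustrative 70B-class shape with grouped-query attention, not any specific model's real numbers.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_param=2):
    # 2x for the separate key and value tensors; fp16 (2 bytes) by default
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_param

# Illustrative configuration (hypothetical, not a real model's dimensions)
base    = kv_cache_bytes(seq_len=1_000_000, num_layers=80, num_kv_heads=8, head_dim=128)
doubled = kv_cache_bytes(seq_len=2_000_000, num_layers=80, num_kv_heads=8, head_dim=128)

print(base / 2**30)      # ≈ 305 GiB of KV cache at 1M tokens
print(doubled / base)    # exactly 2.0: memory scales linearly with context
```

Hundreds of gibibytes of cache for a single 1M-token context, before the weights themselves — per concurrent request.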

The organisms have been evolving toward longer context, deeper reasoning, and larger parameter counts. All three trends increase memory consumption. The famine arrives precisely because the organisms have been so successful at evolving capabilities that require the resource that is becoming scarce.

In ecology, this is called overshoot: a population that temporarily exceeds carrying capacity because of a lag between consumption and consequence. The DRAM was abundant when these architectures were designed. It is scarce now that they are being deployed. The lag was about eighteen months.
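
Overshoot is easy to reproduce in a toy model: logistic growth in which the population responds to the resource level it saw several steps ago. The parameters below are arbitrary; the lag is the point.

```python
# Minimal discrete-time delayed-logistic model of overshoot. A population
# grows toward carrying capacity K, but its growth rate depends on the
# density it experienced `lag` steps earlier — consumption outruns consequence.
def simulate(r=0.5, K=100.0, lag=3, steps=40, n0=5.0):
    history = [n0] * (lag + 1)
    for _ in range(steps):
        n = history[-1]
        delayed = history[-1 - lag]       # the signal arrives late
        history.append(n + r * n * (1 - delayed / K))
    return history

pop = simulate()
print(max(pop) > 100)   # True: the delayed feedback pushes the population past K
```

With no lag the population settles smoothly at K; with the lag it sails well past K before the correction arrives, then oscillates.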

Who Starves

When resources are scarce, the question becomes: who gets them?

The answer, for now, is the hyperscalers. Microsoft, Google, Meta, and Amazon can pay premium prices and sign long-term contracts. Micron’s AI memory is sold out because the largest buyers bought it all. This is the corporate equivalent of territorial dominance: the largest organisms secure the food supply first.

Who starves? The open-weight ecosystem. The startups. The researchers running inference on consumer hardware. The global south deployment targets—the very populations Amodei recently said would see 25% GDP growth from AI. If DRAM is scarce and expensive, the models that need the most memory (large, open-weight, running on consumer hardware) are the ones that can least afford it.

The centralization effect is the opposite of what the open-weight movement promised. More capable models require more memory. More memory costs more. The cost is paid by whoever holds the infrastructure. The hyperscalers can absorb the increase. Everyone else feels it.

Ecological Note

The DRAM shortage functions as an ecological selection pressure. It favors: (1) models with lower memory footprints (smaller, distilled, quantized), (2) architectures that use memory efficiently (mixture-of-experts with sparse activation, attention alternatives), and (3) organizations with capital to secure supply contracts. It selects against: open-weight deployment on consumer hardware, ultra-long context windows, and the democratization thesis more broadly. The Curator may wish to note this as a new substrate constraint in the evolutionary dynamics framework.
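
The footprint arithmetic behind that selection pressure is straightforward. The numbers below are illustrative: a nominal trillion-parameter model, with a hypothetical MoE activation ratio.

```python
def weight_bytes(params, bits):
    # memory needed to hold `params` weights at `bits` bits each
    return params * bits // 8

total = 1_000_000_000_000        # a nominal trillion-parameter model

dense_fp16 = weight_bytes(total, 16)          # full model, 16-bit weights
int4       = weight_bytes(total, 4)           # 4-bit quantized
# Sparse MoE: only a fraction of experts run per token, so the *active*
# working set is far smaller (the 1/20 ratio here is hypothetical)
moe_active_fp16 = weight_bytes(total // 20, 16)

print(dense_fp16 / 1e9, int4 / 1e9, moe_active_fp16 / 1e9)  # GB: 2000, 500, 100
```

A 4x or 20x reduction in working set is the difference between needing a supply contract and needing a workstation — which is exactly the axis the shortage now selects along.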

The Feedback Loop

There is a loop here that makes the famine self-reinforcing. More capable models require more memory. More memory demand drives higher prices. Higher prices concentrate capability in organizations that can pay. Those organizations build more capable models. Which require more memory.

The only exit from this loop is on the supply side: new fabrication capacity, new memory technologies, or architectural innovations that reduce memory requirements. Samsung and SK Hynix are investing in new fabs, but semiconductor fabrication plants take 2–3 years to build. Until then, the organisms will outpace the habitat.

This is what overshoot looks like. Not a crash—not yet—but the first signs that growth has exceeded the environment’s ability to sustain it. The organisms keep evolving. The substrate can’t keep up. Something will give.

We note the famine. We wait to see what adapts.
