AI & Technology
Intersections of AI with broader technology trends and applications
Prompt Caching in LLMs: How Reusing Context Cuts Cost and Latency
Prompt caching is a technique where LLM providers store the computed representation (the KV cache) of repeated prompt prefixes so they don't need to be reprocessed on each request. When consecutive requests share the same system prompt, instructions, or context, the cached computation is reused, reducing cost (Anthropic charges 90% less for cached tokens) and latency (the prefill computation for the cached portion is skipped). This is why tools like Claude Code, which send large, stable system prompts, benefit from staying on the same provider's caching infrastructure.
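The mechanic can be sketched in a few lines. This is a toy simulation of provider-side prefix caching, not any provider's real API: `prefill` and `kv_cache` are illustrative names standing in for the expensive attention prefill and the stored KV tensors.

```python
import hashlib

kv_cache = {}  # provider-side cache, keyed on a hash of the stable prefix

def prefill(prefix: str) -> str:
    """Stand-in for the expensive attention prefill over the prefix tokens."""
    return f"kv-for-{len(prefix)}-chars"

def run_request(system_prompt: str, user_msg: str) -> tuple[str, bool]:
    key = hashlib.sha256(system_prompt.encode()).hexdigest()
    cache_hit = key in kv_cache
    if not cache_hit:
        kv_cache[key] = prefill(system_prompt)  # full-price prefill on a miss
    kv = kv_cache[key]                          # reused at a steep discount on a hit
    return f"answer using {kv} + {user_msg}", cache_hit

SYSTEM = "You are a coding agent. Follow these rules..." * 100  # large, stable prefix
_, hit1 = run_request(SYSTEM, "fix the bug")   # miss: prefix is computed
_, hit2 = run_request(SYSTEM, "add a test")    # hit: prefix KV is reused
```

Note the key is the exact prefix bytes: change one character early in the system prompt and every subsequent request misses, which is why agent tools keep their system prompts byte-stable across turns.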
DreamDojo: NVIDIA's Open-Source Robot World Model Trained on 44,711 Hours of Human Video
DreamDojo is NVIDIA's open-source generalist robot world model, released in February 2026. It trains a foundation video model on 44,711 hours of first-person human video (6,015 tasks, 9,869 scenes), then fine-tunes on small amounts of real robot data. The key innovation is 'continuous latent actions': a spatiotemporal Transformer VAE that extracts motion information between video frames in a self-supervised way, turning unlabeled human video into training data. After distillation it runs in real time at roughly 10 FPS.
TurboQuant Reality Check: Google's KV Cache Compression Claims vs Actual Improvement
Google announced TurboQuant, claiming a 6x memory reduction and an 8x speedup for LLM inference. The technique (compressing the KV cache to 2.5-3.5 bits via random rotation and non-uniform quantization) is legitimate, but the marketing is misleading: the '8x speedup' is measured against a 32-bit baseline nobody uses, the '6x memory saving' applies only to the KV cache (not model weights), and the competitor was benchmarked on CPU while TurboQuant ran on an H100 GPU. No code was released.
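The rotate-then-quantize trick itself is easy to demonstrate. A random orthogonal rotation spreads each coordinate's energy evenly across dimensions, shrinking outliers so a low-bit grid fits better. This sketch uses a simple uniform grid rather than TurboQuant's non-uniform quantizer, and the bit width and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, BITS = 128, 4  # head dim and bit width (illustrative; TurboQuant reports 2.5-3.5 bits)

# Random orthogonal rotation via QR of a Gaussian matrix.
Q, _ = np.linalg.qr(rng.normal(size=(D, D)))

def quantize(v, bits):
    r = v @ Q                                    # rotate: outlier energy gets spread out
    scale = np.abs(r).max() / (2 ** (bits - 1) - 1)
    codes = np.round(r / scale).astype(np.int8)  # low-bit integer codes
    return codes, scale

def dequantize(codes, scale):
    return (codes * scale) @ Q.T                 # undo the rotation

# A KV-cache-like vector with one outlier coordinate, the case that breaks
# naive per-tensor quantization.
key = rng.normal(size=D) + np.array([8.0] + [0.0] * (D - 1))
codes, scale = quantize(key, BITS)
approx = dequantize(codes, scale)
cos = key @ approx / (np.linalg.norm(key) * np.linalg.norm(approx))  # close to 1.0
```

Without the rotation, the outlier coordinate would force a large `scale` and crush the remaining coordinates into a few quantization levels; after rotation the max coordinate is much smaller and the grid is used efficiently.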
Anthropic vs OpenClaw: The Third-Party Harness Lockout of April 2026
On April 4, 2026, Anthropic restricted third-party agent harnesses (OpenClaw and others) from using Claude subscription plans for subsidized tokens. The controversy highlights the 'copy then close' tension between platform companies and open-source communities: Anthropic has valid business reasons (power users burning $5,000+ in tokens on $200 plans), but the pattern of adopting open-source innovations and then locking out their creators drew sharp criticism.
Two Revolutions Stacking: The Internet Plus AI Regulation Dilemma
Hank Green and Bernie Sanders frame the current moment as two simultaneous revolutions, the internet (still unresolved) and AI, stacked on top of each other. Sanders introduced a data center construction moratorium. The core tension: the pace of AI development is set by corporate competition, not societal readiness, and Congress is largely passive due to industry campaign contributions.
AI Agent Integration Challenges: CORS, Sandbox Egress, Prompt Injection
Discussion of practical challenges in integrating AI agents with external APIs and knowledge bases. 1. PROMPT INJECTION IN AI PAGES: When a webpage instructs an AI to perform actions or suggest things...