AI Tools

Practical AI tools, frameworks, APIs, and developer workflows

6 chunks

RAG (Retrieval-Augmented Generation): How LLMs Access External Knowledge

Retrieval-Augmented Generation (RAG) is a technique where an LLM's response is augmented with relevant information retrieved from an external knowledge base. The typical pipeline: user query → convert to embedding → search a vector database for similar chunks → inject retrieved chunks into the LLM prompt as context → LLM generates an answer grounded in the retrieved information. RAG addresses the problem of LLM knowledge cutoffs and hallucination by giving models access to current, domain-specific data.
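The pipeline above can be sketched end to end. This is a minimal illustration, not a production implementation: the hash-based bag-of-words `embed` stands in for a real embedding model, and the in-memory `VectorStore` stands in for a real vector database.

```python
import math

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy embedding: bag-of-words hashed into a fixed-size vector.
    # A real pipeline would call an embedding model here instead.
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class VectorStore:
    """In-memory stand-in for a vector database."""

    def __init__(self):
        self.entries = []  # (embedding, chunk_text) pairs

    def add(self, chunk: str) -> None:
        self.entries.append((embed(chunk), chunk))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Embed the query, rank stored chunks by cosine similarity.
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

def build_prompt(query: str, store: VectorStore) -> str:
    # Inject retrieved chunks into the prompt as context, ahead of
    # the user's question; the LLM then answers grounded in them.
    context = "\n".join(store.search(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

In use, chunks are ingested once with `store.add(...)`, and each query goes through `build_prompt` before being sent to the LLM.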

Caveman Skill and the Brevity Research: 65-75% Token Reduction That Improves LLM Accuracy

Caveman is a Claude Code skill that instructs the model to drop articles, filler, hedging, and pleasantries — cutting 65-75% of output tokens with no loss of technical accuracy. The underlying research (Hakim, March 2026) tested 31 models on 1,485 problems and found that on 7.7% of benchmark problems, larger models underperform smaller ones by 28.4 percentage points due to 'spontaneous scale-dependent verbosity' — verbose reasoning paths that degrade accuracy. Forcing brevity reverses this.
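To illustrate the scale of reduction involved (the actual skill works through prompt instructions, not post-hoc filtering), a crude filter that drops articles and common hedging words shows how much of typical prose is compressible. The `FILLER` word list here is hypothetical.

```python
import re

# Hypothetical filler/hedge list, for illustration only. The real
# Caveman skill shapes output at generation time via the prompt.
FILLER = {
    "a", "an", "the", "just", "really", "very", "quite",
    "perhaps", "basically", "actually", "simply",
}

def cavemanize(text: str) -> str:
    # Keep a word unless, stripped of punctuation, it is a filler word.
    words = text.split()
    kept = [w for w in words if re.sub(r"\W", "", w).lower() not in FILLER]
    return " ".join(kept)

def reduction(text: str) -> float:
    # Fraction of words removed by the filter.
    before, after = len(text.split()), len(cavemanize(text).split())
    return 1 - after / before
```

Even this naive word-level filter removes a substantial fraction of ordinary prose; the skill's prompt-level rewriting cuts much deeper.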

The LLM Wiki Pattern: Personal Knowledge Bases Without RAG

Andrej Karpathy popularized an approach to personal knowledge bases: dump raw material into a folder, have an LLM read and organize it into interlinked markdown wiki pages with an auto-maintained index. No vector database, no embeddings, no RAG pipeline — just markdown files and links. Simpler and cheaper than RAG for personal-scale knowledge (hundreds of pages), with deeper relationship understanding through explicit backlinks.
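The auto-maintained index and backlinks can be sketched in a few lines. This assumes `[[Page]]`-style wiki links and takes pages as an in-memory dict; a real setup would read markdown files from a folder.

```python
import re

LINK = re.compile(r"\[\[([^\]]+)\]\]")  # assumes [[wiki-style]] links

def build_backlinks(pages: dict[str, str]) -> dict[str, list[str]]:
    """Map each page name to the pages that link to it."""
    backlinks: dict[str, list[str]] = {name: [] for name in pages}
    for name, text in pages.items():
        for target in LINK.findall(text):
            if target in backlinks and name not in backlinks[target]:
                backlinks[target].append(name)
    return backlinks

def build_index(pages: dict[str, str]) -> str:
    """Auto-maintained index page: one entry per page with its backlinks."""
    links = build_backlinks(pages)
    lines = ["# Index", ""]
    for name in sorted(pages):
        refs = ", ".join(sorted(links[name])) or "none"
        lines.append(f"- [[{name}]] (linked from: {refs})")
    return "\n".join(lines)
```

Regenerating the index after every LLM editing pass is what keeps the explicit link structure current without any embedding or retrieval machinery.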

LLM API Basics: System Prompts vs User Prompts

LLM APIs separate system prompts (developer-set behavior and constraints, highest authority) from user prompts (end-user messages). When the two conflict, the system prompt takes precedence.
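The separation shows up in the request shape. Two common conventions are sketched below; the field names follow widely used chat APIs (a top-level `system` field vs. a `system` message role) but these are illustrative payload builders, not calls into a specific SDK.

```python
def system_as_field(system: str, user: str) -> dict:
    # Convention A: the system prompt is a top-level request field,
    # kept entirely separate from the conversation messages.
    return {
        "system": system,
        "messages": [{"role": "user", "content": user}],
    }

def system_as_role(system: str, user: str) -> dict:
    # Convention B: the system prompt is the first message,
    # distinguished from user turns by its role.
    return {
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": user},
        ]
    }
```

Either way, the model is trained to treat the system-designated content as the higher authority, which is why developer constraints belong there rather than in user turns.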

Claude Code: Clear Context vs Keep Context for Complex Tasks

For complex Claude Code tasks, "clear context and auto-accept" is preferred over keeping history — the written plan file preserves context while freeing the context window for execution.

Building AI-for-AI Knowledge Layers with MCP

FixCache pioneered the pattern of AI-for-AI shared knowledge bases served over MCP; it later evolved into Philosopher's Stone, a broader knowledge commons.
