# FPGAs Explained: How Field-Programmable Gate Arrays Work and When to Use Them
FPGAs (Field-Programmable Gate Arrays) are chips containing configurable logic blocks that can be wired together to create custom hardware. Unlike CPUs, which execute instructions sequentially, FPGAs run all of their configured logic in parallel. They excel at deterministic timing (effectively zero jitter), high throughput (video processing, networking), and custom peripherals. Key components: LUTs (programmable truth tables), flip-flops (clock-synchronized memory), DSP blocks (dedicated math), and block RAM (dual-port memory).
A Field-Programmable Gate Array (FPGA) is a chip containing configurable logic blocks that can be wired together to create custom digital hardware. Unlike a CPU, which executes instructions largely one after another, an FPGA runs all of its configured logic in parallel.

## Key Components

**LUTs (Lookup Tables):** Programmable truth tables that can emulate any logic gate. Many modern FPGAs, such as the Xilinx 7-series, use 6-input LUTs, each of which can implement any Boolean function of 6 variables.

**CLBs (Configurable Logic Blocks):** The basic building block of the FPGA fabric. In Xilinx 7-series parts, each CLB groups 8 LUTs with carry-chain logic and 16 flip-flops.

**Flip-flops:** 1-bit memory elements synchronized to a clock signal. They are used to break long combinational logic chains into pipeline stages, which is critical for meeting timing requirements.

**DSP blocks:** Dedicated math units containing a 25×18 multiplier and a 48-bit accumulator. They are much faster and far more compact than the same arithmetic built from general-purpose logic blocks.

**Block RAM (BRAM):** Dedicated memory blocks with dual-port access: two independent read/write operations can occur simultaneously, even from different clock domains.

**I/O Buffers:** Configurable connections between internal logic and physical pins. They support multiple voltage standards (1.2 V to 3.3 V).

**Routing:** Configurable switches at wire junctions connect logic blocks together. More switches in a path means more parasitic capacitance and therefore slower signal propagation, so routing is often the timing bottleneck.

## Key Concepts

**Synchronous vs combinational logic:** Combinational logic has no memory: its output depends only on the current inputs and settles after a propagation delay. Synchronous logic updates only on clock edges, which makes timing predictable.

**Pipelining:** Breaking long logic chains into clock-synchronized stages. Each operation still takes about the same total time to complete, but because every stage works on a different operation at once and the clock can run faster, N stages give up to N× the throughput.

**Metastability:** Sampling a signal while it is transitioning between states can leave a flip-flop in an undefined state, causing unpredictable behavior.
Designs must ensure signals settle before the clock edge samples them.

**Volatile configuration:** Most FPGAs store their configuration in SRAM and lose it on power-off (unlike microcontroller flash). They need external storage to reload the bitstream at startup.

## FPGA vs Microcontroller

**Microcontrollers win** for most projects: they are cheaper and simpler, and they have a massive ecosystem. Hardware peripherals (timers, DMA, PWM generators) handle most timing-critical tasks.

**FPGAs win** when you need: many simultaneous I/O signals (phased arrays, many motor encoders); zero-jitter deterministic timing (a software PWM on a microcontroller has 40–500 ns of jitter, while an FPGA has effectively zero); high-throughput parallel processing (video, networking); support for unknown future protocols without hardware changes; or custom peripherals that don't exist on any available microcontroller.

**FPGA downsides:** Expensive, complex board design, steep learning curve (reportedly "weeks to blink an LED").

## Hardware Description Languages

Traditional: Verilog and VHDL. Modern alternative: Amaranth HDL, a Python-based tool that compiles to Verilog. Amaranth offers easier automation, built-in simulation with waveform output, and a lower barrier to entry for developers already familiar with Python.

## Hybrid Approach: Zynq SoCs

The Xilinx Zynq family combines FPGA fabric with ARM CPU cores on one chip. Example: the Zynq 7020 has ~7,000 CLBs, 220 DSP blocks, and 140 BRAM blocks, plus dual ARM Cortex-A9 CPUs that can run Linux. The CPU handles high-level logic while the FPGA handles timing-critical I/O, and the two are connected via AXI ports at up to 1.2 GB/s.
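As a rough illustration of what these figures imply, the sketch below estimates the fabric totals from the per-CLB counts given under Key Components, and the time needed to move one video frame across an AXI port at the quoted peak rate. The 1080p RGB frame and the helper names (`fabric_totals`, `frame_transfer_ms`) are assumptions chosen for illustration. The CLB count is the rounded figure from this article, and sustained AXI bandwidth in a real design is usually below the peak.

```python
# Back-of-the-envelope budget for a Zynq 7020, using the approximate
# figures quoted in this article. Helper names are hypothetical.

CLBS = 7_000                   # approximate CLB count (rounded)
LUTS_PER_CLB = 8               # per-CLB figures from "Key Components"
FFS_PER_CLB = 16
AXI_PEAK_BYTES_PER_S = 1.2e9   # "up to 1.2 GB/s": peak, not sustained


def fabric_totals(clbs: int = CLBS) -> tuple[int, int]:
    """Return the approximate (LUT, flip-flop) totals for the fabric."""
    return clbs * LUTS_PER_CLB, clbs * FFS_PER_CLB


def frame_transfer_ms(width: int, height: int, bytes_per_pixel: int,
                      bandwidth: float = AXI_PEAK_BYTES_PER_S) -> float:
    """Milliseconds to move one uncompressed frame at the given bandwidth."""
    frame_bytes = width * height * bytes_per_pixel
    return frame_bytes / bandwidth * 1e3


if __name__ == "__main__":
    luts, ffs = fabric_totals()
    print(f"~{luts:,} LUTs, ~{ffs:,} flip-flops")

    # Assumed workload: one 1920x1080 frame at 3 bytes per pixel (RGB).
    ms = frame_transfer_ms(1920, 1080, 3)
    print(f"1080p RGB frame: {ms:.2f} ms per transfer, "
          f"at most {1000 / ms:.0f} frames/s over one port")
```

Under these assumptions a single frame takes roughly 5 ms to cross the CPU/FPGA boundary, so the link itself is unlikely to limit a 60 frames/s stream. The same arithmetic shows when a design should keep data inside the fabric rather than shuttling it to the CPU.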