Hardcoded Inference on FPGA
A Short Primer on My Ongoing Research at Vicharak #
During my R&D internship at Vicharak, I worked on a project that rethinks how Deep Neural Networks (DNNs) can be deployed efficiently on FPGAs, with a particular focus on low-bit quantized models and hardcoded inference flows.
DNNs have revolutionized fields like computer vision and natural language processing, but deploying them on resource-constrained edge devices remains a challenge.
GPUs and NPUs, though powerful, suffer from the von Neumann bottleneck: the cost of frequently moving data between memory and compute units.
This becomes especially inefficient for low-bit models like Binary Neural Networks (BNNs), where memory access can negate the advantages of quantization in terms of performance and energy savings.
Key Challenge #
The goal is to design a system that eliminates memory access during inference by embedding the model directly into the FPGA’s logic fabric.
The Core Idea: Dataflow-Style Inference on FPGAs #
FPGAs (Field-Programmable Gate Arrays) are reconfigurable chips that allow defining custom data paths using logic gates and Look-Up Tables (LUTs).
Unlike CPUs or GPUs, FPGAs can evaluate logic in parallel without fetching instructions, which makes them well suited to static, hardwired models.
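As a rough mental model (not vendor tooling), a k-input LUT can be thought of as a 2^k-entry truth table that is indexed by its input bits. The sketch below models that in Python. The `lut` helper, the `*_INIT` constants, and the bit ordering (first input is the least significant index bit) are assumptions made for illustration; real FPGA primitives define their own conventions.

```python
def lut(init: int, *inputs: int) -> int:
    """Evaluate a k-input LUT whose truth table is packed into `init`.

    Bit i of `init` is the output for the input combination whose
    binary encoding is i (inputs[0] is the least significant bit).
    """
    index = 0
    for position, bit in enumerate(inputs):
        index |= (bit & 1) << position
    return (init >> index) & 1


# Hypothetical truth tables for a 2-input LUT with inputs (a, b),
# where index = a + 2*b.
AND_INIT = 0b1000   # only index 3 (a=1, b=1) outputs 1
XOR_INIT = 0b0110   # indices 1 and 2 output 1
XNOR_INIT = 0b1001  # indices 0 and 3 output 1


def all_outputs(init: int) -> list:
    """List the LUT output for (a, b) = (0,0), (1,0), (0,1), (1,1)."""
    return [lut(init, a, b) for b in (0, 1) for a in (0, 1)]
```

Changing the function a LUT computes only means changing its `init` bits; no instruction is fetched and nothing is read from external memory at evaluation time. A 2-input LUT has 4 truth-table bits, so it can realize any of the 16 two-input Boolean functions.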
We experimented with an architecture that:
- Hardcodes model weights into LUTs instead of storing them in memory
- Computes each layer with combinational logic (e.g., XNOR-popcount for BNNs and plain logic gates for Logic Gate Networks, LGNs)
- Executes inference in a fully pipelined, streaming fashion, which suits real-time, low-power applications
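To make the XNOR-popcount idea concrete, here is a minimal Python sketch of one binarized neuron whose weights are baked in as a constant. It is a software model only, not the project's HDL. The names `encode`, `xnor_popcount`, and `make_bnn_neuron` are hypothetical, and the threshold activation is an assumption, since the activation used in the actual design is not described here.

```python
def encode(values):
    """Pack a list of +1/-1 values into an int (+1 -> bit 1, -1 -> bit 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v == 1:
            bits |= 1 << i
    return bits


def xnor_popcount(input_bits: int, weight_bits: int, n: int) -> int:
    """Count the positions where input and weight bits agree."""
    mask = (1 << n) - 1
    agree = ~(input_bits ^ weight_bits) & mask  # XNOR, limited to n bits
    return bin(agree).count("1")                # popcount


def make_bnn_neuron(weights, threshold):
    """Return a neuron function with its weights fixed as a constant.

    This mirrors the hardcoding idea: the weights are not looked up
    at inference time, they are part of the function itself.
    """
    n = len(weights)
    weight_bits = encode(weights)

    def neuron(input_bits: int) -> int:
        matches = xnor_popcount(input_bits, weight_bits, n)
        return 1 if matches >= threshold else 0  # assumed activation

    return neuron


def reference_dot(weights, inputs):
    """Plain +1/-1 dot product, used only to check the bitwise version."""
    return sum(w * x for w, x in zip(weights, inputs))


weights = [1, -1, 1, 1]
inputs = [1, 1, 1, -1]
matches = xnor_popcount(encode(inputs), encode(weights), len(weights))
# For +1/-1 vectors of length n: dot product = 2 * matches - n
assert 2 * matches - len(weights) == reference_dot(weights, inputs)
```

Because the weight bits are constants, a synthesis tool can fold each XNOR into the surrounding logic, which is what lets the weights live in LUT contents rather than in a memory block.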
Why This Matters #
This work demonstrates that ultra-efficient inference is possible without relying on GPUs or large memory systems. The resulting design has:
- No dynamic memory
- No instruction fetch
- Only hardwired logic
It paves the way for deploying AI in:
- Drones, robotics, and smart sensors
- Offline medical devices
- Space and aerospace systems
Project setup #
Hardware used: Vaaman (FPGA board), FTDI (USB-to-UART converter), JTAG debugger/programmer
For a brief overview, watch "Training Logic Gate Networks" and go through my Bachelor's Thesis Presentation.