Bare-Bones GPT-2 Inference from Scratch
I built a from-scratch GPT-2 inference engine for educational purposes — no frameworks, no shortcuts. Rust handles the tensor math (via PyO3), Python handles everything else: tokenizer, transforme...
In a previous post I argued that machine-readable specifications are the foundation for AI-augmented hardware design. This post digs into a specific tool for that job: TLA+ and its procedural fron...
Every hardware team has lived this: a 400-page PDF spec says one thing, the RTL says another, and the testbench assumes a third. The bug that falls out is nobody’s fault and everybody’s problem. ...
Architectural decisions for a high-performance I/O-memory fabric should be grounded in data, not gut feel. It sounds obvious, but in practice the pressure to “just pick something reasonable” is re...
Welcome to schmole.com — a space for engineering notes, deep dives, and tools from my work in silicon hardware architecture. Posts here will cover fabric architecture trade-offs, formal specificat...