A 25-year-old computer just ran a modern AI model, proving that cutting-edge tech doesn't always need cutting-edge hardware. With just a Pentium II and 128 MB of RAM, EXO Labs pulled off a remarkable AI feat.
The 1B-parameter version of Llama 3.2 was slower still, managing only 0.0093 tokens per second, since the model could only be run partially, with the rest of its data read from disk.
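To put that rate in perspective, here is a rough back-of-the-envelope calculation (the 100-token reply length is an assumed example, not a figure from the report): 100 tokens ÷ 0.0093 tokens per second ≈ 10,750 seconds, or roughly three hours for a single short response.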
I mean, cool? They got a C inference library to compile under an older C standard, and the 1B model predictably runs like trash. At that rate it would take hours to do anything meaningful.