New QPU macro assembler

Since Broadcom released complete documentation for the VideoCore IV GPU back in February 2014, we’ve seen a number of fun uses of our 24 GFLOPS of QPU compute, from Andrew Holme’s FFT library to Pete Warden’s deep learning experiments. It’s not unusual to see a 10x increase in performance over the...