Multiplying Matrices with AVX. For Fun*

(Cross-posted from GitHub)

Having previously tinkered only very briefly with assembly, I was keen to try my hand at more.

I do best with a practical, well-defined problem to solve. I have been using more or less the same unrolled-loop implementation of 4×4 matrix multiplication since I wrote it at university, so it seemed a good candidate for a 21st-century update using Advanced Vector Extensions (AVX), which first shipped with Sandy Bridge processors in 2011. Non-trivial, but tractable.
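For context, a 4×4 product in plain C might look like the sketch below. It is not the original university code, which is not shown here; the name `mat4_mul` and the row-major `float[16]` layout are assumptions made purely for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not from the original post):
   out = a * b for 4x4 matrices stored row-major in float[16].
   out must not alias a or b. */
void mat4_mul(const float a[16], const float b[16], float out[16])
{
    for (size_t i = 0; i < 4; ++i) {
        for (size_t j = 0; j < 4; ++j) {
            /* Dot product of row i of a with column j of b,
               with the sum over k written out by hand. */
            out[4 * i + j] = a[4 * i + 0] * b[0 + j]
                           + a[4 * i + 1] * b[4 + j]
                           + a[4 * i + 2] * b[8 + j]
                           + a[4 * i + 3] * b[12 + j];
        }
    }
}
```

Written this way, each row of the result is a weighted sum of the four rows of `b`, and each row is four contiguous floats. That is the shape of data a vector instruction set works on, which is part of what makes the problem a tractable first exercise.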

*Performance was never the motivation for this side project (the problem is too small), but there wouldn’t be much point if the result were slower. And it isn’t: on my (Ivy Bridge) MacBook Pro it executes in half as many cycles as my previous unrolled-loop implementation, and on a Haswell Ultrabook in slightly more than two-thirds as many.

It is still not faster than DirectXMath’s XMMatrixMultiply, though.

