Multiplying Matrices with AVX. For Fun*

(Cross-posted from GitHub)

Having previously tinkered only very briefly, in assembly, I was keen to try my hand at more.

I do best with a practical, defined problem to solve; having used more or less the same unrolled-loop implementation of a 4×4 matrix multiplication I wrote in university, it seemed a good candidate for a 21st Century update, using Advanced Vector Extensions (AVX) which first shipped with Sandy Bridge processors in 2011. Non-trivial, but tractable.

*Performance was never a motivation of this side project – the problem is too small – but there wouldn’t be much point if the output were slower. And it isn’t: on my (Ivy Bridge) Macbook Pro, it executes in half as many cycles as my previous unrolled-loop implementation and in slightly more than two-thirds as many cycles on a Haswell Ultrabook.

But not faster than XMMatrixMultiply.

Advertisements

0 Responses to “Multiplying Matrices with AVX. For Fun*”



  1. Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s





%d bloggers like this: