Posts Tagged 'ML64.exe'

Multiplying Matrices with AVX. For Fun*

(Cross-posted from GitHub)

Having previously tinkered only very briefly in assembly, I was keen to try my hand at more.

I do best with a practical, well-defined problem to solve. I have been using more or less the same unrolled-loop implementation of a 4×4 matrix multiplication since I wrote it at university, so it seemed a good candidate for a 21st-century update using Advanced Vector Extensions (AVX), which first shipped with Sandy Bridge processors in 2011. Non-trivial, but tractable.

*Performance was never the motivation for this side project (the problem is too small), but there wouldn’t be much point if the result were slower. And it isn’t: on my Ivy Bridge MacBook Pro it executes in half as many cycles as my previous unrolled-loop implementation, and on a Haswell Ultrabook in slightly more than two-thirds as many.

But not faster than XMMatrixMultiply.
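
For readers who want the shape of the problem, here is a minimal scalar sketch of a 4×4 product, written so that each output row is a weighted sum of the rows of the right-hand matrix. That is the form a vectorised version would typically map onto (broadcast one element, multiply by a row, accumulate). It is not the implementation from the post; the names `Mat4` and `multiply` and the row-major layout are assumptions made for illustration.

```cpp
#include <array>
#include <cstddef>

// Row-major 4x4 matrix of floats (layout assumed for this sketch).
using Mat4 = std::array<std::array<float, 4>, 4>;

// Computes c = a * b.
// Row i of c is the sum over k of a[i][k] times row k of b, so the inner
// loop touches one whole row of b at a time.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};  // value-initialised, so every element starts at zero
    for (std::size_t i = 0; i < 4; ++i) {
        for (std::size_t k = 0; k < 4; ++k) {
            const float scale = a[i][k];
            for (std::size_t j = 0; j < 4; ++j) {
                c[i][j] += scale * b[k][j];
            }
        }
    }
    return c;
}
```

Multiplying any matrix by the identity should hand the same matrix back, which makes a convenient sanity check before comparing cycle counts.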


Using Intel’s Secure Key (RDRAND) in MS Visual C++ 2010

UPDATE (29/05/2016): added a function to use RDRAND to generate a random value within a specified range, and refactored the logic into a static library and wrapped it with a dynamic library for use with P/Invoke.
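
One way to turn raw 32-bit values into a value within a range is rejection sampling, sketched below. This is an assumption about how such a function could work, not the code from the project: `RawSource` and `random_in_range` are hypothetical names, and the raw source is passed in as a callable so the logic can be exercised without RDRAND hardware.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical source of raw 32-bit values. In a real build this role would
// be played by a wrapper around RDRAND.
using RawSource = std::function<std::uint32_t()>;

// Returns a value in [low, high] inclusive, assuming low <= high.
// Raw values that fall in the incomplete final bucket are discarded so that
// every value in the range is equally likely (no modulo bias).
std::uint32_t random_in_range(const RawSource& next,
                              std::uint32_t low,
                              std::uint32_t high) {
    const std::uint64_t span = static_cast<std::uint64_t>(high) - low + 1;
    const std::uint64_t total = std::uint64_t{1} << 32;   // number of raw values
    const std::uint64_t limit = total - (total % span);   // multiple of span
    std::uint64_t raw = 0;
    do {
        raw = next();
    } while (raw >= limit);
    return low + static_cast<std::uint32_t>(raw % span);
}
```

A plain `raw % span` would be shorter, but it slightly favours the low end of the range whenever the span does not divide 2^32 evenly.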

Among the features added to Intel’s 3rd-Generation Core i* processors is a Digital Random Number Generator (DRNG) backed by an on-die hardware entropy source. This new hardware feature is made available to software via the also-new RDRAND instruction.

If you’re still using the compiler that shipped with Visual C++ 2010, it seems the only ways to use the DRNG are via a third-party library (the one available from Intel’s website is, as of writing, broken) or by dipping into assembly programming. The latter comes with a couple of catches: the mnemonic/intrinsic for the instruction is not available in older assemblers/compilers, and the assembly is slightly different for 32- and 64-bit environments.

The sample project illustrates testing whether the host processor supports the RDRAND instruction as well as invoking it (via assembly). When built for 32-bit CPUs, the assembly is inlined; when built for 64-bit CPUs, the assembly is linked in via an external module (the 64-bit compiler in Visual Studio 2010 does not support inline assembly).
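
The support test comes down to reading CPUID leaf 1 and checking bit 30 of ECX. The sketch below shows that check using compiler intrinsics rather than the project's assembly; `ecx_reports_rdrand` and `cpu_supports_rdrand` are names made up for this example, and on non-x86 targets the function simply reports no support.

```cpp
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
#include <intrin.h>   // __cpuid
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>    // __get_cpuid
#endif

// CPUID leaf 1 reports RDRAND support in bit 30 of ECX.
constexpr std::uint32_t kRdrandBit = 1u << 30;

constexpr bool ecx_reports_rdrand(std::uint32_t ecx) {
    return (ecx & kRdrandBit) != 0;
}

// Queries the host CPU. Returns false when CPUID leaf 1 is unavailable or
// when this is not an x86 build.
bool cpu_supports_rdrand() {
#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
    int regs[4] = {0, 0, 0, 0};   // EAX, EBX, ECX, EDX
    __cpuid(regs, 1);
    return ecx_reports_rdrand(static_cast<std::uint32_t>(regs[2]));
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0) {
        return false;
    }
    return ecx_reports_rdrand(ecx);
#else
    return false;
#endif
}
```

Keeping the bit test separate from the CPUID call means the mask can be checked on any machine, including ones without RDRAND.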

For the most part, the project simply follows the Software Implementation Guide from Intel. Additionally, it demonstrates invoking the instruction via its opcode, and linking a module implemented in assembly into a VC++ project.
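
Intel's guide recommends retrying a failed RDRAND a small number of times (it suggests ten) before giving up, since the instruction signals failure by clearing the carry flag. A minimal sketch of that loop follows. The single attempt is passed in as a callable so the logic can be tested without the hardware; `RdrandStep` and `rdrand_with_retry` are hypothetical names, not the ones used in the project.

```cpp
#include <cstdint>
#include <functional>

// One RDRAND attempt: returns true and writes *out on success, mirroring the
// carry flag the instruction sets. In the real project this would be the
// assembly routine; where the assembler lacks the mnemonic, `rdrand eax`
// can be emitted as the bytes 0F C7 F0.
using RdrandStep = std::function<bool(std::uint32_t*)>;

// Calls `step` until it succeeds or `max_attempts` tries have been made.
// Returns true if a value was written to *out.
bool rdrand_with_retry(const RdrandStep& step,
                       std::uint32_t* out,
                       int max_attempts = 10) {
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        if (step(out)) {
            return true;
        }
    }
    return false;
}
```

If the loop does return false, the caller should treat it as a hard failure rather than fall back to whatever happens to be in the output variable.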


