VOLK: Vector-Optimized Library of Kernels

We have just pushed a new library into the GNU Radio tree. It's called VOLK (Vector-Optimized Library of Kernels) and it is designed to help us work with the processor's SIMD instruction sets. These are very powerful vector operations that can give signal processing a huge boost in performance. We have done hand-optimization for the FIR filters in the past, and you can bet that FFTW uses SIMD heavily for its performance. Yet we never had a convenient way to really use SIMD in everyday signal processing in GNU Radio.

Volk helps us address this issue. It's a framework for adding SIMD functionality as we need it. It consists of a set of functions, say a vector multiplier for complex floats, that we can drop into our code, for example into the FFT filters to replace the inner multiply loop. But to use SIMD code, we need something that is processor independent and that also integrates easily with our code. That is what volk gives us. Where you want to multiply two vectors, you can call:

volk_32fc_multiply_aligned16(c, a, b, N)

Here, a and b are the two vectors we want to multiply, c is the output of the multiply, and N is the number of items to multiply. The key issue is that a, b, and c must all be 16-byte aligned. I'll go over that more in a later post.
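To make the alignment requirement a bit more concrete, here is a minimal sketch in plain C11. It does not call volk at all: alloc_aligned16, is_aligned16, and multiply_32fc_sketch are hypothetical names made up for this illustration, and the loop is only a stand-in that takes its arguments in the same (c, a, b, N) order as the call above.

```c
#include <assert.h>
#include <complex.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper, not part of volk: allocate room for n complex
 * floats on a 16-byte boundary. C11 aligned_alloc expects the size to be
 * a multiple of the alignment, so the byte count is rounded up to 16. */
float complex *alloc_aligned16(size_t n)
{
    size_t bytes = n * sizeof(float complex);
    bytes = (bytes + 15u) & ~(size_t)15u;
    if (bytes == 0)
        bytes = 16;
    return aligned_alloc(16, bytes);
}

/* Returns 1 if p sits on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Stand-in for the volk call: a plain C loop with the same argument
 * order (output first, then the two inputs, then the item count). It
 * refuses to run on buffers that are not 16-byte aligned. */
void multiply_32fc_sketch(float complex *c, const float complex *a,
                          const float complex *b, unsigned int n)
{
    assert(is_aligned16(c) && is_aligned16(a) && is_aligned16(b));
    for (unsigned int i = 0; i < n; i++)
        c[i] = a[i] * b[i];
}
```

Buffers that come from alloc_aligned16 are released with free() as usual. The point of the sketch is only that the alignment has to be arranged when the buffers are allocated, before the multiply is ever called.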

Behind the scenes, volk knows that your processor can handle some set of SIMD instructions. If you run an Intel processor, it can do MMX, SSE, SSE2, SSE3 and maybe SSE4. Volk then has a list of routines that do complex multiplies. It will always have a generic routine, which is a standard C for loop that will run on any computer. But it will also have other multiply routines that are designed for different SIMD instruction sets. Without you knowing or caring about the SIMD architecture or how to write code for it, volk selects the best version of the multiplier for your processor, say the SSE3 one.
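The selection idea can be sketched in a few lines of C. This is not volk's actual dispatch code, only an illustration under some assumptions: the ARCH_* flags, the kernel table, and select_multiply are hypothetical names, and the "SSE3" entry is a portable placeholder rather than a real hand-written SSE3 routine, so that the sketch builds on any machine.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical feature flags; a real library would fill these in by
 * querying the processor at run time. */
enum { ARCH_GENERIC = 0, ARCH_SSE = 1 << 0, ARCH_SSE3 = 1 << 1 };

typedef void (*multiply_fn)(float complex *c, const float complex *a,
                            const float complex *b, unsigned int n);

/* The generic routine: a standard C for loop that runs anywhere. */
void multiply_generic(float complex *c, const float complex *a,
                      const float complex *b, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++)
        c[i] = a[i] * b[i];
}

/* Placeholder standing in for an SSE3 routine. A real one would use
 * SIMD instructions; this one computes the same product from the real
 * and imaginary parts in ordinary C. */
void multiply_sse3_placeholder(float complex *c, const float complex *a,
                               const float complex *b, unsigned int n)
{
    for (unsigned int i = 0; i < n; i++) {
        float ar = crealf(a[i]), ai = cimagf(a[i]);
        float br = crealf(b[i]), bi = cimagf(b[i]);
        c[i] = (ar * br - ai * bi) + (ar * bi + ai * br) * I;
    }
}

struct kernel_entry {
    const char *name;
    unsigned int required; /* feature flags this routine needs */
    multiply_fn fn;
};

/* Ordered from most to least preferred; the generic entry comes last
 * and requires nothing. */
static const struct kernel_entry multiply_kernels[] = {
    { "sse3",    ARCH_SSE3,    multiply_sse3_placeholder },
    { "generic", ARCH_GENERIC, multiply_generic },
};

/* Return the first routine whose requirements the processor meets. */
const struct kernel_entry *select_multiply(unsigned int cpu_features)
{
    size_t count = sizeof multiply_kernels / sizeof multiply_kernels[0];
    /* Table invariant: the last entry must run on any processor. */
    assert(multiply_kernels[count - 1].required == ARCH_GENERIC);
    for (size_t i = 0; i < count; i++) {
        unsigned int need = multiply_kernels[i].required;
        if ((cpu_features & need) == need)
            return &multiply_kernels[i];
    }
    return &multiply_kernels[count - 1];
}
```

The caller would do the lookup once and then go through the returned function pointer, for example select_multiply(features)->fn(c, a, b, N), so the code calling the multiply never mentions a specific instruction set.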

I made this change to the FFT filters on my system and achieved a 10% boost in speed. Not bad, considering that the multiply is not the biggest part of the FFT routine and that it took me about a half-dozen lines of code to do, including the headers and other necessary setup.

That's the basic introduction to Volk for now. I'll post more about how to use it later. For now, I just wanted to alert everyone that it's available and will be built with GNU Radio's "next" branch. Also, volk is built as its own autotools project, which means it has its own configure and bootstrap. These are called automatically by GNU Radio's configure, so you don't have to do anything. You'll see volk's configure being run by itself during GNU Radio's configure. The upshot is that you can take the volk directory as a separate project from GNU Radio and configure, build, and produce libvolk tarballs that you can then use in your own projects.

Much of the early work on volk is public domain code, so if you don't see a GPL notice and copyright header in a file, it's public domain. The rest of it is GPLv3 and copyright FSF like the rest of GNU Radio. And any changes we make from the GNU Radio side of things will also be GPL'd. Just so you have some understanding of the state of things when using it in your own code.