c++ - Strange error in AVX loop vectorization
When I try to unroll the simplest loop with AVX, I get a runtime error (segmentation fault):
const int sz = 9;
float *src = (float *)_mm_malloc(sz*sizeof(float), 16);
float *dest = (float *)_mm_malloc(sz*sizeof(float), 16);
for(int i=0; i<8; i+=8) {
    __m256 buffer = _mm256_load_ps(src+i);
    _mm256_store_ps(dest+i, buffer);
}
_mm_free(src);
_mm_free(dest);
Interesting: if sz=8 or sz>=13, the program does not crash at runtime. Otherwise a segmentation fault occurs.
what's wrong?
compiler - gcc 4.7.
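Raising the alignment to 32 makes the symptom go away.
I'm not well versed in these intrinsics, but that is what I'd expect: _mm256_load_ps and _mm256_store_ps move 256-bit (32-byte) vectors and require 32-byte-aligned addresses, regardless of whether the CPU is 64-bit.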
#include <mm_malloc.h>
#include <immintrin.h>

int main() {
    const int sz = 9;
    // 32-byte alignment, as the aligned 256-bit load/store require
    float *src = (float *)_mm_malloc(sz*sizeof(float), 32);
    float *dest = (float *)_mm_malloc(sz*sizeof(float), 32);
    for(int i=0; i<8; i+=8) {
        __m256 buffer = _mm256_load_ps(src+i);
        _mm256_store_ps(dest+i, buffer);
    }
    _mm_free(src);
    _mm_free(dest);
}
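To see why a 16-byte-aligned pointer only works some of the time, here is a small sketch that checks alignment directly, without touching AVX at all. The helper names `is_aligned` and `sixteen_but_not_thirtytwo` are made up for illustration; they are not part of any intrinsics header. An address that is a multiple of 16 is also a multiple of 32 only about half the time, which would be consistent with the crash appearing for some sizes and not others, depending on what the allocator happens to return.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if the address p is a multiple of `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical demo: builds an address that satisfies 16-byte alignment
// but not the 32-byte alignment that _mm256_load_ps expects.
inline bool sixteen_but_not_thirtytwo() {
    alignas(32) static float buffer[16] = {};
    const float* p = buffer + 4;  // 4 floats = 16 bytes past a 32-byte boundary
    return is_aligned(p, 16) && !is_aligned(p, 32);
}
```

A check such as `assert(is_aligned(src, 32))` before the loop would catch the problem early. If 32-byte alignment cannot be guaranteed, the unaligned variants `_mm256_loadu_ps` and `_mm256_storeu_ps` accept any address.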