c++ - Strange error in AVX loop vectorization


When I try to unroll the simplest possible loop with AVX, I get a runtime error (segmentation fault):

    const int sz = 9;
    float *src   = (float *)_mm_malloc(sz*sizeof(float), 16);
    float *dest  = (float *)_mm_malloc(sz*sizeof(float), 16);

    for(int i=0; i<8; i+=8)
    {
        __m256 buffer = _mm256_load_ps(src+i);
        _mm256_store_ps(dest+i, buffer);
    }

    _mm_free(src);
    _mm_free(dest);

Interestingly, if sz = 8 or sz >= 13, the program does not crash. For any other value, the segmentation fault occurs.

What's wrong?

Compiler: gcc 4.7.

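Raising the alignment to 32 makes the symptom go away.

I'm not well versed in these intrinsics, but the aligned AVX load and store (_mm256_load_ps and _mm256_store_ps) operate on 32-byte __m256 values and require a 32-byte-aligned address. The 16-byte alignment requested from _mm_malloc is only enough for the 128-bit SSE variants.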

    #include <mm_malloc.h>
    #include <immintrin.h>

    int main()
    {
        const int sz = 9;
        float *src   = (float *)_mm_malloc(sz*sizeof(float), 32);
        float *dest  = (float *)_mm_malloc(sz*sizeof(float), 32);

        for(int i=0; i<8; i+=8)
        {
            __m256 buffer = _mm256_load_ps(src+i);
            _mm256_store_ps(dest+i, buffer);
        }

        _mm_free(src);
        _mm_free(dest);
    }
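
The dependence on sz is probably a matter of luck: an allocation that only asks for 16-byte alignment can still land on a 32-byte boundary for some sizes, and then the aligned load happens to work. A minimal sketch of how one might check this in a debug build follows. The names is_aligned, require_avx_alignment and Block are hypothetical helpers invented for illustration, not part of any intrinsics header, and the sketch deliberately avoids AVX itself so it compiles without -mavx.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the address `p` is a multiple of
// `alignment` bytes.
inline bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical debug-time guard that could be called on a pointer before
// passing it to _mm256_load_ps or _mm256_store_ps.
inline void require_avx_alignment(const void* p) {
    assert(is_aligned(p, 32) && "aligned AVX load/store needs a 32-byte-aligned address");
}

// A 32-byte-aligned block of floats, used here only to produce addresses
// whose alignment is known in advance.
struct alignas(32) Block {
    float data[16];
};
```

With a Block b, the address b.data is 32-byte aligned, while b.data + 4 (16 bytes further on) is 16-byte aligned but not 32-byte aligned, which is the situation the 16-byte _mm_malloc call can produce. If 32-byte alignment cannot be guaranteed, the unaligned variants _mm256_loadu_ps and _mm256_storeu_ps accept any address.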
