Example 2: Vector
The main vector type in ViennaCL is vector<T, alignment>, which represents a chunk of memory on the compute device. T is the underlying scalar type (either float or double; complex types are not supported in ViennaCL 1.0.x), and alignment specifies the memory alignment of the vector in multiples of sizeof(T). For example, a vector with 55 entries and an alignment of 16 resides in a block of memory large enough for 64 entries, the next multiple of 16. Memory alignment is fully transparent to the end user, however: it does not change how the vector is used and serves only to tune the library for maximum speed on the available compute device.
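The padding arithmetic in the 55-entry example can be checked with a few lines of plain C++. This is only a sketch of the rounding rule described in the text; padded_size is a hypothetical helper introduced here for illustration, not a ViennaCL function, and it makes no claim about how the library computes its internal buffer sizes.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of ViennaCL): round `size` up to the next
// multiple of `alignment`. Both values are counted in entries, that is,
// in units of sizeof(T).
std::size_t padded_size(std::size_t size, std::size_t alignment)
{
    assert(alignment > 0);  // an alignment of zero entries is meaningless
    return ((size + alignment - 1) / alignment) * alignment;
}
```

Under this rule, padded_size(55, 16) evaluates to 64, matching the example above, while a size that is already a multiple of the alignment, such as padded_size(64, 16), is left unchanged.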
Handling vectors in ViennaCL
typedef float ScalarType;
//typedef double ScalarType; //use this if your GPU supports double precision

// Scalars used below: plain CPU values and their GPU counterparts
ScalarType s1, s3;
viennacl::scalar<ScalarType> vcl_s1;
viennacl::scalar<ScalarType> vcl_s2;
viennacl::scalar<ScalarType> vcl_s3 = ScalarType(1.0); //must be nonzero, used as a divisor below

// Define a few CPU vectors using the STL
std::vector<ScalarType> std_vec1(10);
std::vector<ScalarType> std_vec2(10);
std::vector<ScalarType> std_vec3(10);

// Define a few GPU vectors using ViennaCL
viennacl::vector<ScalarType> vcl_vec1(10);
viennacl::vector<ScalarType> vcl_vec2(10);
viennacl::vector<ScalarType> vcl_vec3(10);

// Fill the vectors with random values:
// (random<> is a helper function defined elsewhere)
for (unsigned int i = 0; i < 10; ++i)
{
  std_vec1[i] = random<ScalarType>();
  vcl_vec2[i] = random<ScalarType>(); //also works for GPU vectors, but is slow!
  std_vec3[i] = random<ScalarType>();
}

// Copy the CPU vectors to the GPU vectors and vice versa
viennacl::copy(std_vec1.begin(), std_vec1.end(), vcl_vec1.begin()); //either the STL way
viennacl::copy(vcl_vec2.begin(), vcl_vec2.end(), std_vec2.begin());
viennacl::copy(std_vec3, vcl_vec3); //or using the shorthand notation
viennacl::copy(vcl_vec2, std_vec2);

// Compute the inner product of two GPU vectors and write the result to either CPU or GPU
vcl_s1 = viennacl::linalg::inner_prod(vcl_vec1, vcl_vec2);
s1 = viennacl::linalg::inner_prod(vcl_vec1, vcl_vec2);

// Compute norms:
s1 = viennacl::linalg::norm_1(vcl_vec1);
vcl_s2 = viennacl::linalg::norm_2(vcl_vec2);
s3 = viennacl::linalg::norm_inf(vcl_vec3);

// Use viennacl::vector via the overloaded operators just as you would write it on paper:
vcl_vec1 = vcl_s1 * vcl_vec2 / vcl_s3;
vcl_vec1 = vcl_vec2 / vcl_s1 + vcl_s2 * (vcl_vec1 - vcl_s2 * vcl_vec2);
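To check the GPU results after copying them back to the host, it can help to have plain CPU versions of the same quantities. The following is a minimal sketch using only the C++ standard library; the *_ref function names are hypothetical and chosen here for illustration, they are not part of ViennaCL. They compute the inner product, the 1-norm (sum of absolute values), the 2-norm (Euclidean length) and the infinity norm (largest absolute value) of std::vector data.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical host-side reference helpers (not part of ViennaCL).

// Inner product: sum over i of x[i] * y[i]
template <typename T>
T inner_prod_ref(const std::vector<T>& x, const std::vector<T>& y)
{
    assert(x.size() == y.size());
    T sum = T(0);
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += x[i] * y[i];
    return sum;
}

// 1-norm: sum of absolute values
template <typename T>
T norm_1_ref(const std::vector<T>& x)
{
    T sum = T(0);
    for (std::size_t i = 0; i < x.size(); ++i)
        sum += std::fabs(x[i]);
    return sum;
}

// 2-norm: square root of the sum of squares
template <typename T>
T norm_2_ref(const std::vector<T>& x)
{
    return std::sqrt(inner_prod_ref(x, x));
}

// Infinity norm: largest absolute value (0 for an empty vector)
template <typename T>
T norm_inf_ref(const std::vector<T>& x)
{
    T largest = T(0);
    for (std::size_t i = 0; i < x.size(); ++i)
        if (std::fabs(x[i]) > largest)
            largest = std::fabs(x[i]);
    return largest;
}
```

A typical use would be to copy a GPU vector into a std::vector, call the matching helper, and compare against the value computed on the device within a small floating-point tolerance, since the order of summation on the GPU generally differs from the sequential loop used here.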