About ViennaCL
The Vienna Computing Library (ViennaCL) is a free, open-source scientific computing library written in C++ that provides CUDA, OpenCL and OpenMP computing backends. It enables simple, high-level access to the vast computing resources available on parallel architectures such as GPUs and focuses primarily on common sparse and dense linear algebra operations (BLAS levels 1, 2 and 3). It also provides iterative solvers with optional preconditioners for large systems of equations.
Core Features
All core features are considered mature and available on all computing backends.
- Convenient C++ API for common dense and sparse linear algebra operations
- Operations on BLAS levels 1, 2 and 3
- Multiple compute backends: CUDA, OpenCL, and OpenMP
- Integer and floating point arithmetic supported
- Support for multithreading across different backends and OpenCL contexts
- Sparse matrix types: CSR, COO, ELL, HYB, Sliced-ELL (implemented as proposed by Kreutzer et al.)
- Fast sparse matrix-vector products: Implementation derived from CSR-adaptive as proposed by Greathouse and Daga
- Fast sparse matrix-matrix products: Implementation derived from RMerge as proposed by Gremse et al.
- Interface similar to Boost.uBLAS
- Convenient data transfer from and to STL, uBLAS, Armadillo, Eigen and MTL4 objects
- Lanczos method and Power Iteration for eigenvalue computations
- QR factorization (for least-squares problems)
- Header-only library
- MIT (X11) open source license
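To make the CSR (compressed sparse row) format from the list above concrete, here is a minimal sketch of a sparse matrix-vector product in plain C++. It is not ViennaCL code and does not use the ViennaCL API: `CsrMatrix` and `csr_multiply` are hypothetical names chosen for illustration, and the loop is the basic one-row-at-a-time kernel, not the CSR-adaptive scheme the library's implementation is derived from.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration only; not a ViennaCL type.
struct CsrMatrix {
  std::size_t rows = 0;
  std::size_t cols = 0;
  std::vector<std::size_t> row_ptr;  // size rows + 1; row i owns entries [row_ptr[i], row_ptr[i + 1])
  std::vector<std::size_t> col_idx;  // column index of each stored entry
  std::vector<double> values;        // value of each stored entry
};

// Computes y = A * x by visiting only the stored (nonzero) entries of each row.
std::vector<double> csr_multiply(const CsrMatrix& A, const std::vector<double>& x) {
  assert(A.row_ptr.size() == A.rows + 1);
  assert(A.col_idx.size() == A.values.size());
  assert(x.size() == A.cols);

  std::vector<double> y(A.rows, 0.0);
  for (std::size_t i = 0; i < A.rows; ++i) {
    double sum = 0.0;
    for (std::size_t k = A.row_ptr[i]; k < A.row_ptr[i + 1]; ++k) {
      sum += A.values[k] * x[A.col_idx[k]];
    }
    y[i] = sum;
  }
  return y;
}
```

Each row is independent of the others, which is what makes this operation a natural fit for parallel backends. Rows with very different numbers of nonzeros lead to uneven work per row, which is the kind of problem that load-balancing schemes such as CSR-adaptive address.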
Iterative Solvers
- Conjugate Gradient (CG) - Pipelined or preconditioned
- Mixed-Precision CG
- Stabilized Bi-Conjugate Gradient (BiCGStab) - Pipelined or preconditioned
- Generalized Minimum Residual (GMRES) - Pipelined or preconditioned
Iterative solvers can also be used directly with C++ STL, uBLAS, Armadillo, Eigen and MTL4 objects.
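As a rough illustration of what a solver such as CG does, the following is a minimal sketch of the textbook, unpreconditioned conjugate gradient method operating on `std::vector<double>`. It is not ViennaCL's implementation and does not show the ViennaCL solver interface: the names `Vec`, `dot` and `conjugate_gradient`, the callable-based matrix argument, and the default tolerance and iteration limit are all assumptions made for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<double>;

// Inner product of two vectors of equal length.
double dot(const Vec& a, const Vec& b) {
  assert(a.size() == b.size());
  double s = 0.0;
  for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
  return s;
}

// Solves A x = b for a symmetric positive definite A. The matrix is passed
// only as a callable computing A * v, so any storage format can be used.
Vec conjugate_gradient(const std::function<Vec(const Vec&)>& apply_A, const Vec& b,
                       double rel_tol = 1e-10, std::size_t max_iters = 1000) {
  Vec x(b.size(), 0.0);
  Vec r = b;  // residual b - A x for the initial guess x = 0
  Vec p = r;  // first search direction
  double rr = dot(r, r);
  const double stop = rel_tol * rel_tol * rr;  // squared relative residual threshold

  for (std::size_t it = 0; it < max_iters && rr > stop; ++it) {
    const Vec Ap = apply_A(p);
    const double alpha = rr / dot(p, Ap);  // step length along p
    for (std::size_t i = 0; i < x.size(); ++i) {
      x[i] += alpha * p[i];
      r[i] -= alpha * Ap[i];
    }
    const double rr_new = dot(r, r);
    const double beta = rr_new / rr;  // makes the next direction A-conjugate to p
    for (std::size_t i = 0; i < p.size(); ++i) p[i] = r[i] + beta * p[i];
    rr = rr_new;
  }
  return x;
}
```

The solver touches the matrix only through matrix-vector products, plus a few inner products and vector updates per iteration. Because each inner product is a synchronization point on parallel hardware, solver variants that restructure the iteration to reduce the number of such points can be attractive on GPUs.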
Preconditioners
- Incomplete Cholesky (ICHOL) factorization
- Fine-grained parallel ICHOL factorization (as proposed by Chow and Patel)
- Incomplete LU factorization with static pattern (ILU0)
- Fine-grained parallel ILU0 factorization (as proposed by Chow and Patel)
- Incomplete LU factorization with threshold (ILUT)
- Block-ILU preconditioner (with ILU0 or ILUT)
- Algebraic Multigrid (as proposed by Bell et al.)
- Jacobi
- Row normalization
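The simplest entry in the list above, the Jacobi preconditioner, approximates the inverse of the system matrix by the inverse of its diagonal. The sketch below shows that idea in plain C++. It is an illustration under assumed names (`JacobiPreconditioner`, `apply`), not the ViennaCL class or its interface.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical Jacobi preconditioner: stores 1 / a_ii once and then
// applies z = D^{-1} r, where D is the diagonal of the system matrix.
class JacobiPreconditioner {
 public:
  explicit JacobiPreconditioner(const std::vector<double>& diagonal) {
    inv_diag_.reserve(diagonal.size());
    for (double d : diagonal) {
      assert(d != 0.0);  // a zero diagonal entry cannot be inverted
      inv_diag_.push_back(1.0 / d);
    }
  }

  // Returns the preconditioned residual z with z[i] = r[i] / a_ii.
  std::vector<double> apply(const std::vector<double>& r) const {
    assert(r.size() == inv_diag_.size());
    std::vector<double> z(r.size());
    for (std::size_t i = 0; i < r.size(); ++i) z[i] = inv_diag_[i] * r[i];
    return z;
  }

 private:
  std::vector<double> inv_diag_;
};
```

Applying it is a single element-wise scaling with no dependencies between entries, so it parallelizes trivially. The incomplete factorizations in the list usually reduce the iteration count more, but they need triangular solves, which are harder to parallelize.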
Additional Features
Additional features are available only on some computing backends. Their interfaces may still change as they mature into core functionality.
- Singular value decomposition and nonnegative matrix factorization (both experimental)
- Sparse approximate inverse preconditioner (experimental)
- Fast Fourier transform (experimental)
- Structured matrix types for efficient operations: Circulant, Hankel, Toeplitz, Vandermonde (experimental)
- Reordering algorithms for sparse systems of linear equations: Cuthill-McKee, Gibbs-Poole-Stockmeyer (both experimental)