Commit 54e949e

Author: Kent Knox
Merge branch 'develop' into master

Bump version to v2.2

Conflicts:
    README.md
    src/CMakeLists.txt
2 parents ac0cb67 + f2de5e7 commit 54e949e

140 files changed, with 4417 additions and 1294 deletions

.travis.yml

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+language: cpp
+
+compiler:
+- gcc
+
+before_install:
+- sudo apt-get update -qq
+- sudo apt-get install -qq fglrx opencl-headers libboost-program-options-dev libgtest-dev
+# Uncomment below to help verify the installs above work
+# - ls -la /usr/lib/libboost*
+# - ls -la /usr/include/boost
+# - ls -la /usr/src/gtest
+
+install:
+- mkdir -p bin/gTest
+- cd bin/gTest
+- cmake -DCMAKE_BUILD_TYPE=Release /usr/src/gtest
+- make
+- sudo mv libg* /usr/lib
+
+before_script:
+- cd ${TRAVIS_BUILD_DIR}
+- mkdir -p bin/clBLAS
+- cd bin/clBLAS
+- cmake -DCMAKE_BUILD_TYPE=Release -DBUILD_TEST=OFF -DBUILD_CLIENT=ON ../../src
+
+script:
+- make install
+# - ls -Rla package
+# Run a simple test to validate that the build works; CPU device in a VM
+- cd package/bin
+- export LD_LIBRARY_PATH=${TRAVIS_BUILD_DIR}/bin/clBLAS/package/lib64:${LD_LIBRARY_PATH}
+- ./client --cpu
+
+after_success:
+- cd ${TRAVIS_BUILD_DIR}/bin/clBLAS
+- make package
+
+notifications:
+email:
+
+on_success: change
+on_failure: always
+

CHANGELOG

Lines changed: 0 additions & 31 deletions
@@ -243,34 +243,3 @@ For example:
 ./example_sgemm
 - Run a simple client; one example is provided for each supported main
 BLAS function family.
-_______________________________________________________________________________
-(C) 2010-2013 Advanced Micro Devices, Inc. All rights reserved. AMD, the AMD
-Arrow logo, ATI, the ATI logo, Radeon, FireStream, FireGL, Catalyst, and
-combinations thereof are trademarks of Advanced Micro Devices, Inc. Microsoft
-(R), Windows, and Windows Vista (R) are registered trademarks of Microsoft
-Corporation in the U.S. and/or other jurisdictions. OpenCL and the OpenCL logo
-are trademarks of Apple Inc. used by permission by Khronos. Other names are for
-informational purposes only and may be trademarks of their respective owners.
-
-The contents of this document are provided in connection with Advanced Micro
-Devices, Inc. ("AMD") products. AMD makes no representations or warranties with
-respect to the accuracy or completeness of the contents of this publication and
-reserves the right to make changes to specifications and product descriptions
-at any time without notice. The information contained herein may be of a
-preliminary or advance nature and is subject to change without notice. No
-license, whether express, implied, arising by estoppel or otherwise, to any
-intellectual property rights is granted by this publication. Except as set forth
-in AMD's Standard Terms and Conditions of Sale, AMD assumes no liability
-whatsoever, and disclaims any express or implied warranty, relating to its
-products including, but not limited to, the implied warranty of
-merchantability, fitness for a particular purpose, or infringement of any
-intellectual property right.
-
-AMD's products are not designed, intended, authorized or warranted for use as
-components in systems intended for surgical implant into the body, or in other
-applications intended to support or sustain life, or in any other application
-in which the failure of AMD's product could create a situation where personal
-injury, death, or severe property or environmental damage may occur. AMD
-reserves the right to discontinue or make changes to its products at any time
-without notice.
-_______________________________________________________________________________

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
@@ -8,7 +8,7 @@ Firstly, in order to contribute code to this project, a contributor must have a
 * After forking, the contributor [clones their repository](https://help.github.com/articles/create-a-repo) locally on their machine
 * Code is developed and checked into the contributor's repository. These commits are eventually pushed upstream to their GitHub repository
 * The contributor then issues a [pull-request](https://help.github.com/articles/using-pull-requests) against the **develop** branch of this repository, which is the [git flow](http://nvie.com/posts/a-successful-git-branching-model/) workflow which is well suited for working with GitHub
-* A [git extention](https://github.com/nvie/gitflow) has been developed to ease the use of the 'git flow' methodology, but requires manual installation by the user. Refer to the projects wiki
+* A [git extension](https://github.com/nvie/gitflow) has been developed to ease the use of the 'git flow' methodology, but requires manual installation by the user. Refer to the projects wiki
 
 At this point, the repository maintainers will be notified by GitHub that a 'pull request' exists pending against their repository. A code review should be completed within a few days, depending on the scope of submitted code, and the code will either be accepted, rejected or commented on for extra feedback.
 
@@ -32,5 +32,5 @@ guidelines over time
 Pull requests will be reviewed by the set of collaborators that are assigned for the repository. Pull requests may be accepted, declined or a conversation may start on the pull request thread with feedback. If the pull request is trivial and all the submission guidelines defined above are honored, the pull request may be accepted without delay. If the pull request is good, but the guidelines defined above are not followed, the collaborators may leave feedback on the pull request and engage in a conversation with the contributor with what they can do to improve the pull request. At any time, collaborators may decline a pull request if they decide the contribution is not appropriate for the project, or the feedback from reviewers on a pull request is not being addressed in an appropriate amount of time.
 
 ## Is it possible to become an official collaborator of the repository?
-Yes, we hope to promote trusted members of the community, who have proven themselves to be competent and request to take on the extra responsibility to be official collaborators of the project. When an individual requests to be an official collaborator, current project collaborators will browse through the history of the requester's prior pull requests and take a vote amongst themselves if the requester should be promoted to collaborator. These individuals will then have the right to approve/decline pull requests and help shape the path that the project goes. It is worth noting, that on GitHub everybody has read-only access to the source and that everybody has the ability to issue a pull request to contribute to the project. The benefit of being a repository collaborator allows you to be able to be able to manage other peoples pull requests.
+Yes, we hope to promote trusted members of the community, who have proven themselves to be competent and request to take on the extra responsibility to be official collaborators of the project. When an individual requests to be an official collaborator, current project collaborators will browse through the history of the requester's prior pull requests and take a vote amongst themselves if the requester should be promoted to collaborator. These individuals will then have the right to approve/decline pull requests and help shape the path that the project goes. It is worth noting, that on GitHub everybody has read-only access to the source and that everybody has the ability to issue a pull request to contribute to the project. The benefit of being a repository collaborator allows you to be able to manage other peoples pull requests.
 
README.md

Lines changed: 101 additions & 46 deletions
@@ -1,78 +1,110 @@
 clBLAS
 =====
+[![Build Status](https://travis-ci.org/clMathLibraries/clBLAS.png)](https://travis-ci.org/clMathLibraries/clBLAS)
+
+
+This repository houses the code for the OpenCL™ BLAS portion of clMath.
+The complete set of BLAS level 1, 2 & 3 routines is implemented. Please
+see Netlib BLAS for the list of supported routines. In addition to GPU
+devices, the library also supports running on CPU devices to facilitate
+debugging and multicore programming. APPML 1.10 is the most current
+generally available pre-packaged binary version of the library available
+for download for both Linux and Windows platforms.
+
+The primary goal of clBLAS is to make it easier for developers to
+utilize the inherent performance and power efficiency benefits of
+heterogeneous computing. clBLAS interfaces do not hide nor wrap OpenCL
+interfaces, but rather leaves OpenCL state management to the control of
+the user to allow for maximum performance and flexibility. The clBLAS
+library does generate and enqueue optimized OpenCL kernels, relieving
+the user from the task of writing, optimizing and maintaining kernel
+code themselves.
 
-clMATH is a software library containing FFT and BLAS functions written in OpenCL. In addition to GPU devices, the libraries also support running on CPU devices to facilitate debugging and multicore programming.
+## clBLAS library user documentation
 
-<a href="http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-math-libraries/">APPML 1.10</a> is the most current generally available version of the library, and pre-built binaries are available for download on both Linux and Windows platforms.
+[Library and API documentation][] for developers is available online as
+a GitHub Pages website
 
-This repository houses the code for the OpenCL™ BLAS portion of APPML. The complete set of BLAS level 1, 2 & 3 routines has been implemented. Please see <a href="http://www.netlib.org/blas/index.html"> Netlib BLAS </a> for the list of routines. For more information on supported graphics cards, see the <a href="http://developer.amd.com/tools-and-sdks/heterogeneous-computing/amd-accelerated-parallel-processing-app-sdk/system-requirements-driver-compatibility/">AMD APP System Requirements</a>.
+### Google Groups
 
-The primary goal of clBLAS is to make it easier for developers to utilize the inherent performance and power efficiency benefits of heterogeneous computing. clBLAS interfaces do not hide nor wrap OpenCL interfaces, but rather leaves OpenCL state management to the control of the user to allow for maximum performance and flexibility. The clBLAS library does generate and enqueue optimized OpenCL kernels, relieving the user from the task of writing, optimizing and maintaining kernel code themselves.
+Two mailing lists have been created for the clMath projects:
 
-## clBLAS library user documentation
-[Library and API documentation]( http://clmathlibraries.github.io/clBLAS/ ) for developers is available online as a GitHub Pages website
+- [[email protected]][] - group whose focus is to answer
+questions on using the library or reporting issues
+
+- [[email protected]][] - group whose focus is for
+developers interested in contributing to the library code itself
 
 ## clBLAS Wiki
-The [project wiki](https://github.com/clMathLibraries/clBLAS/wiki) contains helpful documentation, including a [build primer](https://github.com/clMathLibraries/clBLAS/wiki/Build)
+
+The [project wiki][] contains helpful documentation, including a [build
+primer][]
 
 ## Contributing code
-Please refer to and read the [Contributing](CONTRIBUTING.md) document for guidelines on how to contribute code to this open source project
+
+Please refer to and read the [Contributing][] document for guidelines on
+how to contribute code to this open source project. The code in the
+/master branch is considered to be stable, and all pull-requests should
+be made against the /develop branch.
 
 ## License
-The source for clFFT is licensed under the [Apache License, Version 2.0]( http://www.apache.org/licenses/LICENSE-2.0 )
+
+The source for clBLAS is licensed under the [Apache License, Version
+2.0][]
 
 ## Example
-The simple example below shows how to use clBLAS to compute an OpenCL accelerated SGEMM
 
-```c
-#include <sys/types.h>
-#include <stdio.h>
+The simple example below shows how to use clBLAS to compute an OpenCL
+accelerated SGEMM
 
-/* Include the clBLAS header. It includes the appropriate OpenCL headers
+#include <sys/types.h>
+#include <stdio.h>
+
+/* Include the clBLAS header. It includes the appropriate OpenCL headers
 */
-#include <clBLAS.h>
+#include <clBLAS.h>
 
-/* This example uses predefined matrices and their characteristics for
+/* This example uses predefined matrices and their characteristics for
 * simplicity purpose.
 */
 
-#define M 4
-#define N 3
-#define K 5
+#define M 4
+#define N 3
+#define K 5
 
-static const cl_float alpha = 10;
+static const cl_float alpha = 10;
 
-static const cl_float A[M*K] = {
+static const cl_float A[M*K] = {
 11, 12, 13, 14, 15,
 21, 22, 23, 24, 25,
 31, 32, 33, 34, 35,
 41, 42, 43, 44, 45,
-};
-static const size_t lda = K; /* i.e. lda = K */
+};
+static const size_t lda = K; /* i.e. lda = K */
 
-static const cl_float B[K*N] = {
+static const cl_float B[K*N] = {
 11, 12, 13,
 21, 22, 23,
 31, 32, 33,
 41, 42, 43,
 51, 52, 53,
-};
-static const size_t ldb = N; /* i.e. ldb = N */
+};
+static const size_t ldb = N; /* i.e. ldb = N */
 
-static const cl_float beta = 20;
+static const cl_float beta = 20;
 
-static cl_float C[M*N] = {
+static cl_float C[M*N] = {
 11, 12, 13,
 21, 22, 23,
 31, 32, 33,
 41, 42, 43,
-};
-static const size_t ldc = N; /* i.e. ldc = N */
+};
+static const size_t ldc = N; /* i.e. ldc = N */
 
-static cl_float result[M*N];
+static cl_float result[M*N];
 
-int main( void )
-{
+int main( void )
+{
 cl_int err;
 cl_platform_id platform = 0;
 cl_device_id device = 0;
@@ -138,25 +170,48 @@ int main( void )
 clReleaseContext( ctx );
 
 return ret;
-}
-```
+}
 
 ## Build dependencies
+
 ### Library for Windows
-* Windows® 7/8
-* Visual Studio 2010 SP1
-* An OpenCL SDK, such as APP SDK 2.8
-* Latest CMake
+
+- Windows® 7/8
+
+- Visual Studio 2010 SP1, 2012
+
+- An OpenCL SDK, such as APP SDK 2.9
+
+- Latest CMake
 
 ### Library for Linux
-* GCC 4.6 and onwards
-* An OpenCL SDK, such as APP SDK 2.8
-* Latest CMake
+
+- GCC 4.6 and onwards
+
+- An OpenCL SDK, such as APP SDK 2.9
+
+- Latest CMake
+
+### Library for Mac OSX
+
+- Recommended to generate Unix makefiles with cmake
 
 ### Test infrastructure
-* Latest Googletest
-* Latest ACML
-* Latest Boost
+
+- Googletest v1.6
+
+- ACML on windows/linux; Accelerate on Mac OSX
+
+- Latest Boost
 
 ### Performance infrastructure
-* Python
+
+- Python
+
+[Library and API documentation]: http://clmathlibraries.github.io/clBLAS/
+[[email protected]]: https://groups.google.com/forum/#!forum/clmath
+[[email protected]]: https://groups.google.com/forum/#!forum/clmath-developers
+[project wiki]: https://github.com/clMathLibraries/clBLAS/wiki
+[build primer]: https://github.com/clMathLibraries/clBLAS/wiki/Build
+[Contributing]: CONTRIBUTING.md
+[Apache License, Version 2.0]: http://www.apache.org/licenses/LICENSE-2.0
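
Note on the README example: the hunks above elide the body of `main`, so the SGEMM call itself is not visible in this diff. As a rough aid for checking the sample's output, the sketch below computes the same update, C = alpha * A * B + beta * C, on the CPU in plain C. It assumes row-major storage with no transposition, which the values `lda = K`, `ldb = N` and `ldc = N` suggest but which this diff does not confirm. The helper name `ref_sgemm_row_major` is hypothetical and is not part of clBLAS.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for C = alpha * A * B + beta * C.
 * Assumption: row-major storage, no transposition.
 * ref_sgemm_row_major is a hypothetical helper, not a clBLAS function. */
static void ref_sgemm_row_major(size_t m, size_t n, size_t k,
                                float alpha, const float *a, size_t lda,
                                const float *b, size_t ldb,
                                float beta, float *c, size_t ldc)
{
    /* In row-major order a leading dimension is the stride between rows,
     * so it must be at least the number of columns of its matrix. */
    assert(lda >= k && ldb >= n && ldc >= n);

    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            /* Dot product of row i of A with column j of B. */
            for (size_t p = 0; p < k; ++p) {
                acc += a[i * lda + p] * b[p * ldb + j];
            }
            /* Scale the product and fold in the existing value of C. */
            c[i * ldc + j] = alpha * acc + beta * c[i * ldc + j];
        }
    }
}
```

With the matrices from the README (M = 4, N = 3, K = 5, alpha = 10, beta = 20), the first element works out to 10 * 2115 + 20 * 11 = 21370. Under the row-major assumption, that value could be compared against `result[0]` after the sample reads the buffer back from the device.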

doc/clBLAS.doxy

Lines changed: 13 additions & 13 deletions
@@ -52,7 +52,7 @@ PROJECT_LOGO =
 # If a relative path is entered, it will be relative to the location
 # where doxygen was started. If left blank the current directory will be used.
 
-OUTPUT_DIRECTORY = F:\code\git-svn\clBLAS.head\bin\master\vs10x64.superbuild\docs
+OUTPUT_DIRECTORY = ..\..\bin\clBLAS.doxy
 
 # If the CREATE_SUBDIRS tag is set to YES, then doxygen will create
 # 4096 sub-directories (in 2 levels) under the output directory of each output
@@ -651,17 +651,17 @@ WARN_LOGFILE =
 # directories like "/usr/src/myproject". Separate the files or directories
 # with spaces.
 
-INPUT = clBLAS.h \
-include/cltypes.h \
-include/kerngen.h \
-include/solver.h \
-include/mempat.h \
-src/blas/gens/blas_kgen.h \
-src/blas/include/clblas-internal.h \
-src/blas/include/kernel_extra.h \
-src/blas/include/solution_seq.h \
-include/granulation.h \
-src/tools/ktest/step.h
+INPUT = ../src/clBLAS.h \
+../src/include/cltypes.h \
+../src/include/kerngen.h \
+../src/include/solver.h \
+../src/include/mempat.h \
+../src/library/gens/blas_kgen.h \
+../src/library/include/clblas-internal.h \
+../src/library/include/kernel_extra.h \
+../src/library/include/solution_seq.h \
+../src/include/granulation.h \
+../src/library/tools/ktest/step.h
 
 # This tag can be used to specify the character encoding of the source files
 # that doxygen parses. Internally doxygen uses the UTF-8 encoding, which is
@@ -721,7 +721,7 @@ EXCLUDE_SYMBOLS =
 # directories that contain example code fragments that are included (see
 # the \include command).
 
-EXAMPLE_PATH = samples
+EXAMPLE_PATH = ../src/samples
 
 # If the value of the EXAMPLE_PATH tag contains directories, you can use the
 # EXAMPLE_PATTERNS tag to specify one or more wildcard pattern (like *.cpp
