Commit 1f3de2a

Author: Kent Knox
Merge pull request #295 from kknox/2.12
2.12
2 parents d16f7b3 + 88afc1d commit 1f3de2a

245 files changed, +3102 -2459 lines changed

.gitignore

Lines changed: 2 additions & 2 deletions
@@ -24,5 +24,5 @@
 # vim temp files
 .*.swp
 
-src/build/
-
+# python compiled files
+*.pyc

.travis.yml

Lines changed: 9 additions & 7 deletions
@@ -113,19 +113,21 @@ install:
 - if [ ${TRAVIS_OS_NAME} == "linux" ]; then
 mkdir -p ${OPENCL_ROOT};
 pushd ${OPENCL_ROOT};
-wget ${OPENCL_REGISTRY}/specs/opencl-icd-1.2.11.0.tgz;
-tar -xf opencl-icd-1.2.11.0.tgz;
-mv ./icd/* .;
-mkdir -p inc/CL;
+travis_retry git clone --depth 1 https://github.com/KhronosGroup/OpenCL-ICD-Loader.git;
+mv ./OpenCL-ICD-Loader/* .;
+travis_retry git clone --depth 1 https://github.com/KhronosGroup/OpenCL-Headers.git inc/CL;
 pushd inc/CL;
-wget -r -w 1 -np -nd -nv -A h,hpp https://www.khronos.org/registry/cl/api/1.2/;
-wget -w 1 -np -nd -nv -A h,hpp https://www.khronos.org/registry/cl/api/2.1/cl.hpp;
+travis_retry wget -w 1 -np -nd -nv -A h,hpp ${OPENCL_REGISTRY}/api/2.1/cl.hpp;
 popd;
 mkdir -p lib;
 pushd lib;
 cmake -G "Unix Makefiles" ..;
 make;
-cp ../bin/libOpenCL.so .;
+cp ./bin/libOpenCL.so .;
+popd;
+pushd inc/CL;
+travis_retry git fetch origin opencl12:opencl12;
+git checkout opencl12;
 popd;
 mv inc/ include/;
 popd;
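
The .travis.yml change above wraps each network step (`git clone`, `wget`, `git fetch`) in `travis_retry`, a helper that Travis CI provides and documents as re-running a failing command up to three times. For running the same steps outside Travis, a rough stand-in could look like the sketch below. The function name `retry` and the `RETRY_DELAY` variable are hypothetical and not part of this commit or of Travis CI.

```shell
#!/bin/sh
# Hypothetical stand-in for Travis CI's travis_retry (not part of clBLAS):
# run a command up to three times and stop at the first success.
# RETRY_DELAY (assumed name) is the pause in seconds between attempts.
retry() {
    attempt=1
    max_attempts=3
    while :; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "command failed after ${max_attempts} attempts: $*" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "${RETRY_DELAY:-1}"
    done
}

# Example use, mirroring one step from the diff:
#   retry git clone --depth 1 https://github.com/KhronosGroup/OpenCL-Headers.git inc/CL
```

Only the steps that touch the network are wrapped in the diff; local steps such as `cmake` and `make` fail deterministically, so retrying them would only slow the build down.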

CONTRIBUTING.md

Lines changed: 2 additions & 2 deletions
@@ -19,9 +19,9 @@ We want to ensure that the project code base maintains a level of quality over t
 guidelines over time
 * separate check-ins that modify a files style from the ones that add/change/delete code.
 * target the **develop** branch in the repository
-* ensure that the [code properly builds]( https://github.com/kknox/clBLAS/wiki/Build )
+* ensure that the [code properly builds]( https://github.com/clMathLibraries/clBLAS/wiki/Build )
 * cannot break existing test cases
-* we encourage contributors to [run the test-short]( https://github.com/kknox/clBLAS/wiki/Testing ) suite of tests on their end before the pull-request
+* we encourage contributors to [run the test-short]( https://github.com/clMathLibraries/clBLAS/wiki/Testing ) suite of tests on their end before the pull-request
 * if possible, upload the test results associated with the pull request to a personal [gist repository]( https://gist.github.com/ ) and insert a link to the test results in the pull request so that collaborators can browse the results
 * if no test results are provided with the pull request, official collaborators will run the test suite on their test machines against the patch before we will accept the pull-request
 * if we detect failing test cases, we will request that the code associated with the pull request be fixed before the pull request will be merged

README.md

Lines changed: 11 additions & 8 deletions
@@ -10,7 +10,7 @@ This repository houses the code for the OpenCL™ BLAS portion of clMath.
 The complete set of BLAS level 1, 2 & 3 routines is implemented. Please
 see Netlib BLAS for the list of supported routines. In addition to GPU
 devices, the library also supports running on CPU devices to facilitate
-debugging and multicore programming. APPML 1.10 is the most current
+debugging and multicore programming. APPML 1.12 is the most current
 generally available pre-packaged binary version of the library available
 for download for both Linux and Windows platforms.
 
@@ -23,13 +23,12 @@ library does generate and enqueue optimized OpenCL kernels, relieving
 the user from the task of writing, optimizing and maintaining kernel
 code themselves.
 
-## clBLAS update notes 09/2015
-
-- Introducing [AutoGemm](http://github.com/clMathLibraries/clBLAS/wiki/AutoGemm)
-- clBLAS's Gemm implementation has been comprehensively overhauled to use AutoGemm. AutoGemm is a suite of python scripts which generate optimized kernels and kernel selection logic, for all precisions, transposes, tile sizes and so on.
-- CMake is configured to use AutoGemm for clBLAS so the build and usage experience of Gemm remains unchanged (only performance and maintainability has been improved). Kernel sources are generated at build time (not runtime) and can be configured within CMake to be pre-compiled at build time.
-- clBLAS users with unique Gemm requirements can customize AutoGemm to their needs (such as non-default tile sizes for very small or very skinny matrices); see [AutoGemm](http://github.com/clMathLibraries/clBLAS/wiki/AutoGemm) documentation for details.
+## clBLAS update notes 01/2017
 
+- v2.12 is a bugfix release as a rollup of all fixes in /develop branch
+- Thanks to @pavanky, @iotamudelta, @shahsan10, @psyhtest, @haahh, @hughperkins, @tfauck
+@abhiShandy, @IvanVergiliev, @zougloub, @mgates3 for contributions to clBLAS v2.12
+- Summary of fixes available to read on the releases tab
 
 ## clBLAS library user documentation
 
@@ -197,8 +196,12 @@ The simple example below shows how to use clBLAS to compute an OpenCL accelerate
 
 ### Test infrastructure
 * Googletest v1.6
-* ACML on windows/linux; Accelerate on Mac OSX
 * Latest Boost
+* CPU BLAS
+- Netlib CBLAS (recommended)
+Ubuntu: install by "apt-get install libblas-dev"
+Windows: download & install lapack-3.6.0 which comes with CBLAS
+- or ACML on windows/linux; Accelerate on Mac OSX
 
 ### Performance infrastructure
 * Python

appveyor.yml

Lines changed: 10 additions & 7 deletions
@@ -40,26 +40,29 @@ install:
 - ps: mkdir $env:OPENCL_ROOT
 - ps: pushd $env:OPENCL_ROOT
 - ps: $opencl_registry = $env:OPENCL_REGISTRY
-# This downloads the source to the example/demo icd library
-- ps: wget $opencl_registry/specs/opencl-icd-1.2.11.0.tgz -OutFile opencl-icd-1.2.11.0.tgz
-- ps: 7z x opencl-icd-1.2.11.0.tgz
-- ps: 7z x opencl-icd-1.2.11.0.tar
-- ps: mv .\icd\* .
+# This downloads the source to the Khronos ICD library
+- git clone --depth 1 https://github.com/KhronosGroup/OpenCL-ICD-Loader.git
+- ps: mv ./OpenCL-ICD-Loader/* .
 # This downloads all the opencl header files
 # The cmake build files expect a directory called inc
 - ps: mkdir inc/CL
-- ps: wget $opencl_registry/api/1.2/ | select -ExpandProperty links | where {$_.href -like "*.h*"} | select -ExpandProperty outerText | foreach{ wget $opencl_registry/api/1.2/$_ -OutFile inc/CL/$_ }
+- git clone --depth 1 https://github.com/KhronosGroup/OpenCL-Headers.git inc/CL
+- ps: wget $opencl_registry/api/2.1/cl.hpp -OutFile inc/CL/cl.hpp
 # - ps: dir; if( $lastexitcode -eq 0 ){ dir include/CL } else { Write-Output boom }
 # Create the static import lib in a directory called lib, so findopencl() will find it
 - ps: mkdir lib
 - ps: pushd lib
 - cmake -G "NMake Makefiles" ..
 - nmake
 - ps: popd
+# Switch to OpenCL 1.2 headers
+- ps: pushd inc/CL
+- git fetch origin opencl12:opencl12
+- git checkout opencl12
+- ps: popd
 # Rename the inc directory to include, so FindOpencl() will find it
 - ps: ren inc include
 - ps: popd
-- ps: popd
 
 # before_build is used to run configure steps
 before_build:

src/CMakeLists.txt

Lines changed: 49 additions & 30 deletions
@@ -1,12 +1,12 @@
 # ########################################################################
 # Copyright 2013 Advanced Micro Devices, Inc.
-#
+#
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
-#
+#
 # http://www.apache.org/licenses/LICENSE-2.0
-#
+#
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
@@ -18,7 +18,7 @@ cmake_minimum_required(VERSION 2.8)
 
 #User toggle-able options that can be changed on the command line with -D
 option( BUILD_RUNTIME "Build the BLAS runtime library" ON )
-option( BUILD_TEST "Build the library testing suite (dependency on google test, Boost, and ACML)" ON )
+option( BUILD_TEST "Build the library testing suite (dependency on google test, Boost, and ACML/NETLIB BLAS)" ON )
 option( BUILD_PERFORMANCE "Copy the performance scripts that can measure and graph performance" OFF )
 option( BUILD_SAMPLE "Build the sample programs" OFF )
 option( BUILD_CLIENT "Build a command line clBLAS client program with a variety of configurable parameters (dependency on Boost)" OFF )
@@ -41,33 +41,33 @@ set( OPENCL_OFFLINE_BUILD_TAHITI_KERNEL OFF)
 #use dynamic generated kernels
 # MESSAGE(STATUS "Build dynamic Hawaii kernels.")
 # MESSAGE(STATUS "Check OPENCL_OFFLINE_BUILD_HAWAII_KERNEL to build kernls at compile-time. This will eliminates clBuildProgram() overhead and better kernel performance with certain driver.")
-add_definitions(-DCLBLAS_HAWAII_DYNAMIC_KERNEL)
+add_definitions(-DCLBLAS_HAWAII_DYNAMIC_KERNEL)
 #else()
 # MESSAGE(STATUS "Build static Hawaii kernels.")
 # MESSAGE(STATUS "Uncheck OPENCL_OFFLINE_BUILD_HAWAII_KERNEL to build kernls at run-time")
-# MESSAGE(STATUS "Please ensure the presence of Hawaii device in the system. With certain driver/compiler flags, this might result in compile-time error.")
+# MESSAGE(STATUS "Please ensure the presence of Hawaii device in the system. With certain driver/compiler flags, this might result in compile-time error.")
 #endif( )
 
 #if( NOT OPENCL_OFFLINE_BUILD_BONAIRE_KERNEL )
 #use dynamic generated kernels
 # MESSAGE(STATUS "Build dynamic Bonaire kernels.")
 # MESSAGE(STATUS "Check OPENCL_OFFLINE_BUILD_BONAIRE_KERNEL to build kernls at compile-time. This will eliminates clBuildProgram() overhead and better kernel performance with certain driver.")
-add_definitions(-DCLBLAS_BONAIRE_DYNAMIC_KERNEL)
+add_definitions(-DCLBLAS_BONAIRE_DYNAMIC_KERNEL)
 #else()
 # MESSAGE(STATUS "Build static Bonaire kernels.")
 # MESSAGE(STATUS "Uncheck OPENCL_OFFLINE_BUILD_BONAIRE_KERNEL to build kernls at run-time")
-# MESSAGE(STATUS "Please ensure the presence of Bonaire device in the system. With certain driver/compiler flags, this might result in compile-time error.")
+# MESSAGE(STATUS "Please ensure the presence of Bonaire device in the system. With certain driver/compiler flags, this might result in compile-time error.")
 #endif( )
 
 #if( NOT OPENCL_OFFLINE_BUILD_TAHITI_KERNEL )
 #use dynamic generated kernels
 # MESSAGE(STATUS "Build dynamic Tahiti kernels.")
 # MESSAGE(STATUS "Check OPENCL_OFFLINE_BUILD_TAHITI_KERNEL to build kernls at compile-time. This will eliminates clBuildProgram() overhead and better kernel performance with certain driver.")
-add_definitions(-DCLBLAS_TAHITI_DYNAMIC_KERNEL)
+add_definitions(-DCLBLAS_TAHITI_DYNAMIC_KERNEL)
 #else( )
 # MESSAGE(STATUS "Build static Tahiti kernels.")
 # MESSAGE(STATUS "Uncheck OPENCL_OFFLINE_BUILD_TAHITI_KERNEL to build kernls at run-time")
-# MESSAGE(STATUS "Please ensure the presence of Tahiti device in the system. With certain driver/compiler flags, this might result in compile-time error.")
+# MESSAGE(STATUS "Please ensure the presence of Tahiti device in the system. With certain driver/compiler flags, this might result in compile-time error.")
 #endif( )
 
 
@@ -108,7 +108,7 @@ if( NOT DEFINED clBLAS_VERSION_MAJOR )
 endif( )
 
 if( NOT DEFINED clBLAS_VERSION_MINOR )
-set( clBLAS_VERSION_MINOR 10 )
+set( clBLAS_VERSION_MINOR 12 )
 endif( )
 
 if( NOT DEFINED clBLAS_VERSION_PATCH )
@@ -135,8 +135,8 @@ if(NOT CMAKE_BUILD_TYPE)
 FORCE)
 endif()
 
-# These variables are meant to contain string which should be appended to the installation paths
-# of library and executable binaries, respectively. They are meant to be user configurable/overridable.
+# These variables are meant to contain string which should be appended to the installation paths
+# of library and executable binaries, respectively. They are meant to be user configurable/overridable.
 set( SUFFIX_LIB_DEFAULT "" )
 set( SUFFIX_BIN_DEFAULT "" )
 
@@ -170,8 +170,9 @@ if( MSVC_IDE )
 endif( )
 
 # add the math library for Linux
-if( UNIX )
+if( UNIX )
 set(MATH_LIBRARY "m")
+set(THREAD_LIBRARY "pthread")
 endif()
 
 # set the path to specific OpenCL compiler
@@ -220,7 +221,7 @@ if( BUILD_TEST )
 else()
 message(WARNING "Cannot find acml.h")
 endif()
-
+
 if( UNIX )
 find_library(ACML_LIBRARIES acml_mp
 HINTS
@@ -238,7 +239,7 @@ if( BUILD_TEST )
 )
 mark_as_advanced(_acml_mv_library)
 endif( )
-
+
 if(WIN32)
 find_library(ACML_LIBRARIES libacml_mp_dll
 HINTS
@@ -248,7 +249,7 @@ if( BUILD_TEST )
 $ENV{ACML_ROOT}/${ACML_SUBDIR}/lib
 )
 endif( )
-
+
 if( NOT ACML_LIBRARIES )
 message(WARNING "Cannot find libacml")
 endif( )
@@ -265,15 +266,23 @@ if( BUILD_TEST )
 endif( )
 endif( )
 
+if( BUILD_CLIENT )
+if( NETLIB_FOUND )
+else( )
+message( WARNING "Not find Netlib; BUILD_CLIENT needs the Netlib CBLAS library" )
+endif()
+endif()
+
+
 # This will define OPENCL_FOUND
-find_package( OpenCL )
+find_package( OpenCL ${OPENCL_VERSION} )
 
 # Find Boost on the system, and configure the type of boost build we want
 set( Boost_USE_MULTITHREADED ON )
 set( Boost_USE_STATIC_LIBS ON )
 set( Boost_DETAILED_FAILURE_MSG ON )
-set( Boost_DEBUG ON )
-set( Boost_ADDITIONAL_VERSIONS "1.44.0" "1.44" "1.47.0" "1.47" )
+# set( Boost_DEBUG ON )
+set( Boost_ADDITIONAL_VERSIONS "1.44.0" "1.44" "1.47.0" "1.47" "1.60.0" "1.60" )
 
 find_package( Boost 1.33.0 COMPONENTS program_options )
 message(STATUS "Boost_PROGRAM_OPTIONS_LIBRARY: ${Boost_PROGRAM_OPTIONS_LIBRARY}")
@@ -288,26 +297,36 @@ endif()
 
 # Turn on maximum compiler verbosity
 if(CMAKE_COMPILER_IS_GNUCXX)
-add_definitions(-pedantic -Wall -Wextra
+add_definitions(# -pedantic -Wall -Wextra
 -D_POSIX_C_SOURCE=199309L -D_XOPEN_SOURCE=500
 )
 set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -std=c99 -Wstrict-prototypes" CACHE STRING
 "Default CFLAGS" FORCE)
 # Don't use -rpath.
 set(CMAKE_SKIP_RPATH ON CACHE BOOL "Skip RPATH" FORCE)
 
-set(CMAKE_C_FLAGS "-m${TARGET_PLATFORM} ${CMAKE_C_FLAGS}")
-set(CMAKE_CXX_FLAGS "-m${TARGET_PLATFORM} ${CMAKE_CXX_FLAGS}")
-set(CMAKE_Fortran_FLAGS "-m${TARGET_PLATFORM} ${CMAKE_Fortran_FLAGS}")
+# Need to determine the target machine of the C compiler, because
+# the '-m32' and '-m64' flags are supported on x86 but not on e.g. ARM.
+exec_program( "${CMAKE_C_COMPILER} -dumpmachine"
+OUTPUT_VARIABLE CMAKE_C_COMPILER_MACHINE )
+message( STATUS "CMAKE_C_COMPILER_MACHINE: ${CMAKE_C_COMPILER_MACHINE}" )
+# The "86" regular expression matches x86, x86_64, i686, etc.
+if(${CMAKE_C_COMPILER_MACHINE} MATCHES "86")
+set(CMAKE_C_FLAGS "-m${TARGET_PLATFORM} ${CMAKE_C_FLAGS}")
+set(CMAKE_CXX_FLAGS "-m${TARGET_PLATFORM} ${CMAKE_CXX_FLAGS}")
+set(CMAKE_Fortran_FLAGS "-m${TARGET_PLATFORM} ${CMAKE_Fortran_FLAGS}")
+endif()
 
 if(TARGET_PLATFORM EQUAL 32)
 set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -fno-builtin")
 endif()
+elseif(CMAKE_CXX_COMPILER_ID MATCHES "Clang")
+set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wno-narrowing")
 elseif( MSVC )
 # CMake sets huge stack frames for windows, for whatever reason. We go with compiler default.
 string( REGEX REPLACE "/STACK:[0-9]+" "" CMAKE_EXE_LINKER_FLAGS "${CMAKE_EXE_LINKER_FLAGS}" )
 string( REGEX REPLACE "/STACK:[0-9]+" "" CMAKE_SHARED_LINKER_FLAGS "${CMAKE_SHARED_LINKER_FLAGS}" )
-string( REGEX REPLACE "/STACK:[0-9]+" "" CMAKE_MODULE_LINKER_FLAGS "${CMAKE_MODULE_LINKER_FLAGS}" )
+string( REGEX REPLACE "/STACK:[0-9]+" "" CMAKE_MODULE_LINKER_FLAGS "${CMAKE_MODULE_LINKER_FLAGS}" )
 endif( )
 
 if (WIN32)
if (WIN32)
@@ -320,13 +339,13 @@ add_definitions( -DCL_USE_DEPRECATED_OPENCL_1_1_APIS )
320339
configure_file( "${PROJECT_SOURCE_DIR}/clBLAS.version.h.in" "${PROJECT_BINARY_DIR}/include/clBLAS.version.h" )
321340

322341
# configure a header file to pass the CMake version settings to the source, and package the header files in the output archive
323-
install( FILES
324-
"clBLAS.h"
342+
install( FILES
343+
"clBLAS.h"
325344
"clAmdBlas.h"
326345
"clAmdBlas.version.h"
327346
"clBLAS-complex.h"
328347
"${PROJECT_BINARY_DIR}/include/clBLAS.version.h"
329-
DESTINATION
348+
DESTINATION
330349
"./include" )
331350

332351

@@ -351,7 +370,7 @@ if( BUILD_SAMPLE AND IS_DIRECTORY "${PROJECT_SOURCE_DIR}/samples" )
351370
add_subdirectory( samples )
352371
endif( )
353372

354-
# The build server is not supposed to build or package any of the tests; build server script will define this on the command line with
373+
# The build server is not supposed to build or package any of the tests; build server script will define this on the command line with
355374
# cmake -G "Visual Studio 10 Win64" -D BUILDSERVER:BOOL=ON ../..
356375
if( BUILD_TEST )
357376
if( IS_DIRECTORY "${PROJECT_SOURCE_DIR}/tests" )
@@ -386,7 +405,7 @@ install(FILES ${CMAKE_CURRENT_BINARY_DIR}/clBLASConfigVersion.cmake
386405
DESTINATION ${destdir})
387406

388407

389-
# The following code is setting variables to control the behavior of CPack to generate our
408+
# The following code is setting variables to control the behavior of CPack to generate our
390409
if( WIN32 )
391410
set( CPACK_SOURCE_GENERATOR "ZIP" )
392411
set( CPACK_GENERATOR "ZIP" )
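
The src/CMakeLists.txt change above only adds `-m${TARGET_PLATFORM}` when the output of `${CMAKE_C_COMPILER} -dumpmachine` matches "86". The same decision can be sketched in shell to see which target triplets pass the check. The helper names `is_x86_triplet` and `arch_flag` are hypothetical and exist only for illustration; the triplet strings are typical examples, not values taken from this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of clBLAS): succeeds when a compiler target
# triplet contains "86", mirroring the CMake test `MATCHES "86"`.
is_x86_triplet() {
    case "$1" in
        *86*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Hypothetical helper: print the -m<bits> flag only for x86-family targets,
# and print nothing for every other target.
arch_flag() {
    triplet=$1
    bits=$2
    if is_x86_triplet "$triplet"; then
        printf '%s\n' "-m${bits}"
    fi
}

arch_flag "x86_64-linux-gnu" 64      # prints -m64
arch_flag "i686-pc-linux-gnu" 32     # prints -m32
arch_flag "arm-linux-gnueabihf" 32   # prints nothing
```

Because the test is a plain substring match, any triplet that happens to contain "86" would also receive the flag; the common ARM triplets (`arm-*`, `aarch64-*`) do not.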

src/FindNetlib.cmake

Lines changed: 19 additions & 0 deletions
@@ -100,6 +100,25 @@ if( NOT contains_BLAS EQUAL -1 )
 FIND_PACKAGE_HANDLE_STANDARD_ARGS( NETLIB DEFAULT_MSG Netlib_BLAS_LIBRARY )
 endif( )
 
+
+#look for netlib cblas header
+if( UNIX )
+find_path(Netlib_INCLUDE_DIRS cblas.h
+HINTS
+/usr/include
+)
+else()
+find_path(Netlib_INCLUDE_DIRS cblas.h
+HINTS
+${Netlib_ROOT}/CBLAS/include/
+)
+endif()
+
+if( Netlib_INCLUDE_DIRS )
+else()
+message(WARNING "Cannot find cblas.h")
+endif()
+
 if( NETLIB_FOUND )
 list( APPEND Netlib_LIBRARIES ${Netlib_BLAS_LIBRARY} )
 else( )
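
The src/FindNetlib.cmake change above looks for `cblas.h` under `/usr/include` on UNIX and under `${Netlib_ROOT}/CBLAS/include/` elsewhere, and warns when the header is missing. To check by hand whether that header is present before configuring, a simplified shell sketch might look like this. The function name `find_cblas_dir` is hypothetical, and unlike CMake's `find_path` it searches only the directories passed to it, not CMake's default search locations.

```shell
#!/bin/sh
# Hypothetical helper (not part of clBLAS): print the first directory among
# the arguments that contains cblas.h; fail if none of them does.
find_cblas_dir() {
    for dir in "$@"; do
        if [ -f "${dir}/cblas.h" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Example use with the two hint locations from the diff:
#   find_cblas_dir /usr/include "${Netlib_ROOT}/CBLAS/include" \
#       || echo "Cannot find cblas.h" >&2
```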
