#[=======================================================================[.rst:
FindCUDA
--------

.. deprecated:: 3.10

It is no longer necessary to use this module or call ``find_package(CUDA)``
for compiling CUDA code. Instead, list ``CUDA`` among the languages named
in the top-level call to the :command:`project` command, or call the
:command:`enable_language` command with ``CUDA``.
Then one can add CUDA (``.cu``) sources to programs directly
in calls to :command:`add_library` and :command:`add_executable`.
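
As a rough sketch of that approach (the project name ``demo`` and the source
file names are placeholders, not part of this module):

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.10)
  # Listing CUDA as a project language lets CMake compile .cu files natively.
  project(demo LANGUAGES CXX CUDA)
  add_executable(demo main.cpp kernel.cu)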

.. versionadded:: 3.17
  To find and use the CUDA toolkit libraries the :module:`FindCUDAToolkit`
  module has superseded this module.  It works whether or not the ``CUDA``
  language is enabled.
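
A minimal sketch of the newer approach, assuming a hypothetical target ``app``
that needs the CUDA runtime and cuFFT:

.. code-block:: cmake

  cmake_minimum_required(VERSION 3.17)
  project(app LANGUAGES CXX)
  find_package(CUDAToolkit REQUIRED)
  add_executable(app main.cpp)
  # Imported targets provided by FindCUDAToolkit.
  target_link_libraries(app PRIVATE CUDA::cudart CUDA::cufft)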

Documentation of Deprecated Usage
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Tools for building CUDA C files: libraries and build dependencies.

This script locates the NVIDIA CUDA C tools.  It should work on Linux,
Windows, and macOS and should be reasonably up to date with CUDA C
releases.

.. versionadded:: 3.19
  QNX support.

This script makes use of the standard :command:`find_package` arguments of
``<VERSION>``, ``REQUIRED`` and ``QUIET``.  ``CUDA_FOUND`` will report if an
acceptable version of CUDA was found.

The script will prompt the user to specify ``CUDA_TOOLKIT_ROOT_DIR`` if
the prefix cannot be determined by the location of nvcc in the system
path and ``REQUIRED`` is specified to :command:`find_package`.  To use
a different installed version of the toolkit set the environment variable
``CUDA_BIN_PATH`` before running cmake (e.g.
``CUDA_BIN_PATH=/usr/local/cuda1.0`` instead of the default
``/usr/local/cuda``) or set ``CUDA_TOOLKIT_ROOT_DIR`` after configuring.  If
you change the value of ``CUDA_TOOLKIT_ROOT_DIR``, various components that
depend on the path will be relocated.

It might be necessary to set ``CUDA_TOOLKIT_ROOT_DIR`` manually on certain
platforms, or to use a CUDA runtime not installed in the default
location.  In newer versions of the toolkit the CUDA library is
included with the graphics driver -- be sure that the driver version
matches what is needed by the CUDA runtime version.
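
For illustration, a configuration that requests a minimum version and points
the search at a specific install might look like this (the path
``/opt/cuda-10.2`` and the version ``9.0`` are hypothetical values):

.. code-block:: cmake

  # Seed the cache entry before the search; it could equally be passed on the
  # command line as -DCUDA_TOOLKIT_ROOT_DIR=/opt/cuda-10.2.
  set(CUDA_TOOLKIT_ROOT_DIR "/opt/cuda-10.2" CACHE PATH "CUDA toolkit root")
  find_package(CUDA 9.0 REQUIRED)
  message(STATUS "CUDA ${CUDA_VERSION} found in ${CUDA_TOOLKIT_ROOT_DIR}")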

Input Variables
"""""""""""""""

The following variables affect the behavior of the macros in the
script (in alphabetical order).  Note that any of these flags can be
changed multiple times in the same directory before calling
``cuda_add_executable()``, ``cuda_add_library()``, ``cuda_compile()``,
``cuda_compile_ptx()``, ``cuda_compile_fatbin()``, ``cuda_compile_cubin()``
or ``cuda_wrap_srcs()``:

``CUDA_64_BIT_DEVICE_CODE`` (Default: host bit size)
  Set to ``ON`` to compile for 64 bit device code, OFF for 32 bit device code.
  Note that making this different from the host code when generating object
  or C files from CUDA code just won't work, because size_t gets defined by
  nvcc in the generated source.  If you compile to PTX and then load the
  file yourself, you can mix bit sizes between device and host.

``CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE`` (Default: ``ON``)
  Set to ``ON`` if you want the custom build rule to be attached to the source
  file in Visual Studio.  Turn OFF if you add the same cuda file to multiple
  targets.

  This allows the user to build the target from the CUDA file; however, bad
  things can happen if the CUDA source file is added to multiple targets.
  When performing parallel builds it is possible for the custom build
  command to be run more than once and in parallel causing cryptic build
  errors.  VS runs the rules for every source file in the target, and a
  source can have only one rule no matter how many projects it is added to.
  When the rule is run from multiple targets race conditions can occur on
  the generated file.  Eventually everything will get built, but if the user
  is unaware of this behavior, there may be confusion.  It would be nice if
  this script could detect the reuse of source files across multiple targets
  and turn the option off for the user, but no good solution could be found.

``CUDA_BUILD_CUBIN`` (Default: ``OFF``)
  Set to ``ON`` to enable an extra compilation pass with the ``-cubin`` option in
  Device mode. The output is parsed, and register and shared memory usage are
  printed during the build.

``CUDA_BUILD_EMULATION`` (Default: ``OFF`` for device mode)
  Set to ``ON`` for Emulation mode. ``-D_DEVICEEMU`` is defined for CUDA C files
  when ``CUDA_BUILD_EMULATION`` is ``TRUE``.

``CUDA_LINK_LIBRARIES_KEYWORD`` (Default: ``""``)
  .. versionadded:: 3.9

  The ``<PRIVATE|PUBLIC|INTERFACE>`` keyword to use for internal
  :command:`target_link_libraries` calls. The default is to use no keyword which
  uses the old "plain" form of :command:`target_link_libraries`. Note that this matters
  because whatever is used inside the ``FindCUDA`` module must also be used
  outside - the two forms of :command:`target_link_libraries` cannot be mixed.
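
  A sketch of keeping the two call sites consistent (``mylib``, ``kernels.cu``
  and ``otherlib`` are hypothetical names):

  .. code-block:: cmake

    set(CUDA_LINK_LIBRARIES_KEYWORD PRIVATE)
    cuda_add_library(mylib kernels.cu)
    # A keyword form is needed here too, because FindCUDA now uses one internally.
    target_link_libraries(mylib PRIVATE otherlib)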

``CUDA_GENERATED_OUTPUT_DIR`` (Default: :variable:`CMAKE_CURRENT_BINARY_DIR`)
  Set to the path you wish to have the generated files placed.  If it is
  blank output files will be placed in :variable:`CMAKE_CURRENT_BINARY_DIR`.
  Intermediate files will always be placed in
  ``CMAKE_CURRENT_BINARY_DIR/CMakeFiles``.

``CUDA_HOST_COMPILATION_CPP`` (Default: ``ON``)
  Set to ``OFF`` for C compilation of host code.

``CUDA_HOST_COMPILER`` (Default: ``CMAKE_C_COMPILER``)
  Set the host compiler to be used by nvcc.  Ignored if ``-ccbin`` or
  ``--compiler-bindir`` is already present in the ``CUDA_NVCC_FLAGS`` or
  ``CUDA_NVCC_FLAGS_<CONFIG>`` variables.  For Visual Studio targets,
  the host compiler is constructed with one or more visual studio macros
  such as ``$(VCInstallDir)``, that expands out to the path when
  the command is run from within VS.

  .. versionadded:: 3.13
    If the :envvar:`CUDAHOSTCXX` environment variable is set it will
    be used as the default.

``CUDA_NVCC_FLAGS``, ``CUDA_NVCC_FLAGS_<CONFIG>``
  Additional NVCC command line arguments.  NOTE: multiple arguments must be
  semi-colon delimited (e.g. ``--compiler-options;-Wall``)

  .. versionadded:: 3.6
    Contents of these variables may use
    :manual:`generator expressions <cmake-generator-expressions(7)>`.
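
  For example, the following sketch appends two arguments as separate list
  elements (the specific flags are only illustrative):

  .. code-block:: cmake

    # list(APPEND) stores the items semicolon-separated, as described above.
    list(APPEND CUDA_NVCC_FLAGS --compiler-options -Wall)
    # Per-configuration variant, quoted so the semicolon is kept literally.
    set(CUDA_NVCC_FLAGS_DEBUG "-g;-G")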

``CUDA_PROPAGATE_HOST_FLAGS`` (Default: ``ON``)
  Set to ``ON`` to propagate :variable:`CMAKE_{C,CXX}_FLAGS <CMAKE_<LANG>_FLAGS>` and their configuration
  dependent counterparts (e.g. ``CMAKE_C_FLAGS_DEBUG``) automatically to the
  host compiler through nvcc's ``-Xcompiler`` flag.  This helps make the
  generated host code match the rest of the system better.  Sometimes
  certain flags give nvcc problems, and this will help you turn the flag
  propagation off.  This does not affect the flags supplied directly to nvcc
  via ``CUDA_NVCC_FLAGS`` or through the ``OPTION`` flags specified through
  ``cuda_add_library()``, ``cuda_add_executable()``, or ``cuda_wrap_srcs()``.  Flags used for
  shared library compilation are not affected by this flag.

``CUDA_SEPARABLE_COMPILATION`` (Default: ``OFF``)
  If set this will enable separable compilation for all CUDA runtime object
  files.  If used outside of ``cuda_add_executable()`` and ``cuda_add_library()``
  (e.g. calling ``cuda_wrap_srcs()`` directly),
  ``cuda_compute_separable_compilation_object_file_name()`` and
  ``cuda_link_separable_compilation_objects()`` should be called.
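
  A sketch of the common case, in which the helper functions are invoked
  automatically (the target and file names are hypothetical):

  .. code-block:: cmake

    # Must be set before the target is created.
    set(CUDA_SEPARABLE_COMPILATION ON)
    cuda_add_library(devlib a.cu b.cu)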

``CUDA_SOURCE_PROPERTY_FORMAT``
  .. versionadded:: 3.3

  If this source file property is set, it can override the format specified
  to ``cuda_wrap_srcs()`` (``OBJ``, ``PTX``, ``CUBIN``, or ``FATBIN``).  If an input source file
  is not a ``.cu`` file, setting this property will cause it to be treated as a ``.cu``
  file. See the documentation for :command:`set_source_files_properties` on how to set
  this property.
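
  A sketch, assuming a hypothetical source ``kernel.cpp`` that contains CUDA
  code and should be compiled by nvcc into an object file:

  .. code-block:: cmake

    set_source_files_properties(kernel.cpp
      PROPERTIES CUDA_SOURCE_PROPERTY_FORMAT OBJ)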

``CUDA_USE_STATIC_CUDA_RUNTIME`` (Default: ``ON``)
  .. versionadded:: 3.3

  When enabled the static version of the CUDA runtime library will be used
  in ``CUDA_LIBRARIES``.  If the version of CUDA configured doesn't support
  this option, then it will be silently disabled.

``CUDA_VERBOSE_BUILD`` (Default: ``OFF``)
  Set to ``ON`` to see all the commands used when building the CUDA file.  When
  using a Makefile generator the value defaults to ``VERBOSE`` (run
  ``make VERBOSE=1`` to see output), although setting ``CUDA_VERBOSE_BUILD`` to ``ON`` will
  always print the output.

Commands
""""""""

The script creates the following functions and macros (in alphabetical order):

.. code-block:: cmake

  cuda_add_cufft_to_target(<cuda_target>)

Adds the cufft library to the target (can be any target).  Handles whether
you are in emulation mode or not.

.. code-block:: cmake

  cuda_add_cublas_to_target(<cuda_target>)

Adds the cublas library to the target (can be any target).  Handles
whether you are in emulation mode or not.

.. code-block:: cmake

  cuda_add_executable(<cuda_target> <file>...
                      [WIN32] [MACOSX_BUNDLE] [EXCLUDE_FROM_ALL] [OPTIONS ...])

Creates an executable ``<cuda_target>`` which is made up of the files
specified.  All of the non CUDA C files are compiled using the standard
build rules specified by CMake and the CUDA files are compiled to object
files using nvcc and the host compiler.  In addition ``CUDA_INCLUDE_DIRS`` is
added automatically to :command:`include_directories`.  Some standard CMake target
calls can be used on the target after calling this macro
(e.g. :command:`set_target_properties` and :command:`target_link_libraries`), but setting
properties that adjust compilation flags will not affect code compiled by
nvcc.  Such flags should be modified before calling ``cuda_add_executable()``,
``cuda_add_library()`` or ``cuda_wrap_srcs()``.
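
A usage sketch, assuming hypothetical sources ``main.cpp`` and ``kernel.cu``;
note that the nvcc flags are adjusted before the call:

.. code-block:: cmake

  find_package(CUDA REQUIRED)
  # Flags for nvcc must be in place before the target is created.
  list(APPEND CUDA_NVCC_FLAGS -lineinfo)
  cuda_add_executable(app main.cpp kernel.cu)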

.. code-block:: cmake

  cuda_add_library(<cuda_target> <file>...
                   [STATIC | SHARED | MODULE] [EXCLUDE_FROM_ALL] [OPTIONS ...])

Same as ``cuda_add_executable()`` except that a library is created.

.. code-block:: cmake

  cuda_build_clean_target()

Creates a convenience target that deletes all the dependency files
generated.  You should make clean after running this target to ensure the
dependency files get regenerated.

.. code-block:: cmake

  cuda_compile(<generated_files> <file>... [STATIC | SHARED | MODULE]
               [OPTIONS ...])

Returns a list of generated files from the input source files to be used
with :command:`add_library` or :command:`add_executable`.
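
A sketch of that workflow (``kernel.cu``, ``main.cpp`` and ``app`` are
hypothetical names):

.. code-block:: cmake

  cuda_compile(KERNEL_OBJS kernel.cu)
  add_executable(app main.cpp ${KERNEL_OBJS})
  # cuda_compile() does not create a target, so link the runtime explicitly.
  target_link_libraries(app ${CUDA_LIBRARIES})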

.. code-block:: cmake

  cuda_compile_ptx(<generated_files> <file>... [OPTIONS ...])

Returns a list of ``PTX`` files generated from the input source files.

.. code-block:: cmake

  cuda_compile_fatbin(<generated_files> <file>... [OPTIONS ...])

.. versionadded:: 3.1

Returns a list of ``FATBIN`` files generated from the input source files.

.. code-block:: cmake

  cuda_compile_cubin(<generated_files> <file>... [OPTIONS ...])

.. versionadded:: 3.1

Returns a list of ``CUBIN`` files generated from the input source files.

.. code-block:: cmake

  cuda_compute_separable_compilation_object_file_name(<output_file_var>
                                                      <cuda_target>
                                                      <object_files>)

Computes the name of the intermediate link file used for separable
compilation.  This file name is typically passed into
``cuda_link_separable_compilation_objects()``.  ``<output_file_var>`` is produced
based on ``<cuda_target>`` and the list of object files that need separable
compilation as specified by ``<object_files>``.  If the ``<object_files>`` list is
empty, then ``<output_file_var>`` will be empty.  This function is called
automatically for ``cuda_add_library()`` and ``cuda_add_executable()``.  Note that
this is a function and not a macro.

.. code-block:: cmake

  cuda_include_directories(path0 path1 ...)

Sets the directories that should be passed to nvcc
(e.g. ``nvcc -Ipath0 -Ipath1 ...``). These paths usually contain other ``.cu``
files.

.. code-block:: cmake

  cuda_link_separable_compilation_objects(<output_file_var> <cuda_target>
                                          <nvcc_flags> <object_files>)

Generates the link object required by separable compilation from the given
object files.  This is called automatically for ``cuda_add_executable()`` and
``cuda_add_library()``, but can be called manually when using ``cuda_wrap_srcs()``
directly.  When called from ``cuda_add_library()`` or ``cuda_add_executable()`` the
``<nvcc_flags>`` passed in are the same as the flags passed in via the ``OPTIONS``
argument.  The only nvcc flag added automatically is the bitness flag as
specified by ``CUDA_64_BIT_DEVICE_CODE``.  Note that this is a function
instead of a macro.

.. code-block:: cmake

  cuda_select_nvcc_arch_flags(<out_variable> [<target_CUDA_architecture> ...])

Selects GPU arch flags for nvcc based on ``target_CUDA_architecture``.

Values for ``target_CUDA_architecture``:

* ``Auto``: detects local machine GPU compute arch at runtime.
* ``Common`` and ``All``: cover common and entire subsets of architectures.
* ``<name>``: one of ``Fermi``, ``Kepler``, ``Maxwell``, ``Kepler+Tegra``, ``Kepler+Tesla``, ``Maxwell+Tegra``, ``Pascal``.
* ``<ver>``, ``<ver>(<ver>)``, ``<ver>+PTX``, where ``<ver>`` is one of
  ``2.0``, ``2.1``, ``3.0``, ``3.2``, ``3.5``, ``3.7``, ``5.0``, ``5.2``, ``5.3``, ``6.0``, ``6.2``.

Returns list of flags to be added to ``CUDA_NVCC_FLAGS`` in ``<out_variable>``.
Additionally, sets ``<out_variable>_readable`` to the resulting numeric list.

Example::

  cuda_select_nvcc_arch_flags(ARCH_FLAGS 3.0 3.5+PTX 5.2(5.0) Maxwell)
  list(APPEND CUDA_NVCC_FLAGS ${ARCH_FLAGS})

More info on CUDA architectures: https://en.wikipedia.org/wiki/CUDA.
Note that this is a function instead of a macro.

.. code-block:: cmake

  cuda_wrap_srcs(<cuda_target> <format> <generated_files> <file>...
                 [STATIC | SHARED | MODULE] [OPTIONS ...])

This is where all the magic happens.  ``cuda_add_executable()``,
``cuda_add_library()``, ``cuda_compile()``, and ``cuda_compile_ptx()`` all call this
function under the hood.

Given the list of files ``<file>...`` this macro generates
custom commands that generate either PTX or linkable objects (use ``PTX`` or
``OBJ`` for the ``<format>`` argument to switch).  Files that don't end with ``.cu``
or have the ``HEADER_FILE_ONLY`` property are ignored.

The arguments passed in after ``OPTIONS`` are extra command line options to
give to nvcc.  You can also specify per configuration options by
specifying the name of the configuration followed by the options.  General
options must precede configuration specific options.  Not all
configurations need to be specified, only the ones provided will be used.
For example:

.. code-block:: cmake

  cuda_add_executable(...
    OPTIONS -DFLAG=2 "-DFLAG_OTHER=space in flag"
    DEBUG -g
    RELEASE --use_fast_math
    RELWITHDEBINFO --use_fast_math;-g
    MINSIZEREL --use_fast_math)

For certain configurations (namely VS generating object files with
``CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE`` set to ``ON``), no generated file will
be produced for the given cuda file.  This is because when you add the
cuda file to Visual Studio it knows that this file produces an object file
and will link in the resulting object file automatically.

This script will also generate a separate cmake script that is used at
build time to invoke nvcc.  This is for several reasons:

* nvcc can return negative numbers as return values which confuses
  Visual Studio into thinking that the command succeeded.  The script now
  checks the error codes and produces errors when there was a problem.

* nvcc has been known to not delete incomplete results when it
  encounters problems.  This confuses build systems into thinking the
  target was generated when in fact an unusable file exists.  The script
  now deletes the output files if there was an error.

* By putting all the options that affect the build into a file and then
  making the build rule dependent on the file, the output files will be
  regenerated when the options change.

This script also looks at optional arguments ``STATIC``, ``SHARED``, or ``MODULE`` to
determine when to target the object compilation for a shared library.
:variable:`BUILD_SHARED_LIBS` is ignored in ``cuda_wrap_srcs()``, but it is respected in
``cuda_add_library()``.  On some systems special flags are added for building
objects intended for shared libraries.  A preprocessor macro,
``<target_name>_EXPORTS`` is defined when a shared library compilation is
detected.

Flags passed into :command:`add_definitions` with ``-D`` or ``/D`` are passed along to nvcc.

Result Variables
""""""""""""""""

The script defines the following variables:

``CUDA_VERSION_MAJOR``
  The major version of cuda as reported by nvcc.

``CUDA_VERSION_MINOR``
  The minor version.

``CUDA_VERSION``, ``CUDA_VERSION_STRING``
  Full version in the ``X.Y`` format.

``CUDA_HAS_FP16``
  .. versionadded:: 3.6
    Whether a short float (``float16``, ``fp16``) is supported.

``CUDA_TOOLKIT_ROOT_DIR``
  Path to the CUDA Toolkit (defined if not set).

``CUDA_SDK_ROOT_DIR``
  Path to the CUDA SDK.  Use this to find files in the SDK.  This script will
  not directly support finding specific libraries or headers, as that isn't
  supported by NVIDIA.  If you want to change libraries when the path changes
  see the ``FindCUDA.cmake`` script for an example of how to clear these
  variables.  There are also examples of how to use the ``CUDA_SDK_ROOT_DIR``
  to locate headers or libraries, if you so choose (at your own risk).

``CUDA_INCLUDE_DIRS``
  Include directory for cuda headers.  Added automatically
  for ``cuda_add_executable()`` and ``cuda_add_library()``.

``CUDA_LIBRARIES``
  Cuda RT library.

``CUDA_CUFFT_LIBRARIES``
  Device or emulation library for the Cuda FFT implementation (alternative to
  ``cuda_add_cufft_to_target()`` macro)

``CUDA_CUBLAS_LIBRARIES``
  Device or emulation library for the Cuda BLAS implementation (alternative to
  ``cuda_add_cublas_to_target()`` macro).
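
  The two alternatives can be sketched as follows (``app`` is a hypothetical
  target created earlier):

  .. code-block:: cmake

    # Either let the module add the library ...
    cuda_add_cublas_to_target(app)
    # ... or link the variable directly instead:
    # target_link_libraries(app ${CUDA_CUBLAS_LIBRARIES})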

``CUDA_cudart_static_LIBRARY``
  Statically linkable cuda runtime library.
  Only available for CUDA version 5.5+.

``CUDA_cudadevrt_LIBRARY``
  .. versionadded:: 3.7
    Device runtime library.  Required for separable compilation.

``CUDA_cupti_LIBRARY``
  CUDA Profiling Tools Interface library.
  Only available for CUDA version 4.0+.

``CUDA_curand_LIBRARY``
  CUDA Random Number Generation library.
  Only available for CUDA version 3.2+.

``CUDA_cusolver_LIBRARY``
  .. versionadded:: 3.2
    CUDA Direct Solver library.
    Only available for CUDA version 7.0+.

``CUDA_cusparse_LIBRARY``
  CUDA Sparse Matrix library.
  Only available for CUDA version 3.2+.

``CUDA_npp_LIBRARY``
  NVIDIA Performance Primitives lib.
  Only available for CUDA version 4.0+.

``CUDA_nppc_LIBRARY``
  NVIDIA Performance Primitives lib (core).
  Only available for CUDA version 5.5+.

``CUDA_nppi_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 5.5 - 8.0.

``CUDA_nppial_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppicc_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppicom_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0 - 10.2.
  Replaced by nvjpeg.

``CUDA_nppidei_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppif_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppig_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppim_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppist_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppisu_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_nppitc_LIBRARY``
  NVIDIA Performance Primitives lib (image processing).
  Only available for CUDA version 9.0.

``CUDA_npps_LIBRARY``
  NVIDIA Performance Primitives lib (signal processing).
  Only available for CUDA version 5.5+.

``CUDA_nvcuvenc_LIBRARY``
  CUDA Video Encoder library.
  Only available for CUDA version 3.2+.
  Windows only.

``CUDA_nvcuvid_LIBRARY``
  CUDA Video Decoder library.
  Only available for CUDA version 3.2+.
  Windows only.

``CUDA_nvToolsExt_LIBRARY``
  .. versionadded:: 3.16
    NVIDIA CUDA Tools Extension library.
    Available for CUDA version 5+.

``CUDA_OpenCL_LIBRARY``
  .. versionadded:: 3.16
    NVIDIA CUDA OpenCL library.
    Available for CUDA version 5+.

#]=======================================================================]

#   James Bigler, NVIDIA Corp (nvidia.com - jbigler)
#   Abe Stephens, SCI Institute -- http://www.sci.utah.edu/~abe/FindCuda.html
#
#   Copyright (c) 2008 - 2009 NVIDIA Corporation.  All rights reserved.
#
#   Copyright (c) 2007-2009
#   Scientific Computing and Imaging Institute, University of Utah
#
#   This code is licensed under the MIT License.  See the FindCUDA.cmake script
#   for the text of the license.

# The MIT License
#
# License for the specific language governing rights and limitations under
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included
# in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
# OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
#
###############################################################################

# FindCUDA.cmake

# This macro helps us find the location of helper files we will need the full path to
macro(CUDA_FIND_HELPER_FILE _name _extension)
  set(_full_name "${_name}.${_extension}")
  # CMAKE_CURRENT_LIST_FILE contains the full path to the file currently being
  # processed.  Using this variable, we can pull out the current path, and
  # provide a way to get access to the other files we need local to here.
  get_filename_component(CMAKE_CURRENT_LIST_DIR "${CMAKE_CURRENT_LIST_FILE}" PATH)
  set(CUDA_${_name} "${CMAKE_CURRENT_LIST_DIR}/FindCUDA/${_full_name}")
  if(NOT EXISTS "${CUDA_${_name}}")
    set(error_message "${_full_name} not found in ${CMAKE_CURRENT_LIST_DIR}/FindCUDA")
    if(CUDA_FIND_REQUIRED)
      message(FATAL_ERROR "${error_message}")
    else()
      if(NOT CUDA_FIND_QUIETLY)
        message(STATUS "${error_message}")
      endif()
    endif()
  endif()
  # Set this variable as internal, so the user isn't bugged with it.
  set(CUDA_${_name} ${CUDA_${_name}} CACHE INTERNAL "Location of ${_full_name}" FORCE)
endmacro()

#####################################################################
## CUDA_INCLUDE_NVCC_DEPENDENCIES
##

# So we want to try and include the dependency file if it exists.  If
# it doesn't exist then we need to create an empty one, so we can
# include it.

# If it does exist, then we need to check to see if all the files it
# depends on exist.  If they don't then we should clear the dependency
# file and regenerate it later.  This covers the case where a header
# file has disappeared or moved.

macro(CUDA_INCLUDE_NVCC_DEPENDENCIES dependency_file)
  set(CUDA_NVCC_DEPEND)
  set(CUDA_NVCC_DEPEND_REGENERATE FALSE)


  # Include the dependency file.  Create it first if it doesn't exist.  The
  # INCLUDE puts a dependency that will force CMake to rerun and bring in the
  # new info when it changes.  DO NOT REMOVE THIS (as I did and spent a few
  # hours figuring out why it didn't work).
  if(NOT EXISTS ${dependency_file})
    file(WRITE ${dependency_file} "#FindCUDA.cmake generated file.  Do not edit.\n")
  endif()
  # Always include this file to force CMake to run again next
  # invocation and rebuild the dependencies.
  #message("including dependency_file = ${dependency_file}")
  include(${dependency_file})

  # Now we need to verify the existence of all the included files
  # here.  If they aren't there we need to just blank this variable and
  # make the file regenerate again.
#   if(DEFINED CUDA_NVCC_DEPEND)
#     message("CUDA_NVCC_DEPEND set")
#   else()
#     message("CUDA_NVCC_DEPEND NOT set")
#   endif()
  if(CUDA_NVCC_DEPEND)
    #message("CUDA_NVCC_DEPEND found")
    foreach(f ${CUDA_NVCC_DEPEND})
      # message("searching for ${f}")
      if(NOT EXISTS ${f})
        #message("file ${f} not found")
        set(CUDA_NVCC_DEPEND_REGENERATE TRUE)
      endif()
    endforeach()
  else()
    #message("CUDA_NVCC_DEPEND false")
    # No dependencies, so regenerate the file.
    set(CUDA_NVCC_DEPEND_REGENERATE TRUE)
  endif()

  #message("CUDA_NVCC_DEPEND_REGENERATE = ${CUDA_NVCC_DEPEND_REGENERATE}")
  # No incoming dependencies, so we need to generate them.  Make the
  # output depend on the dependency file itself, which should cause the
  # rule to re-run.
  if(CUDA_NVCC_DEPEND_REGENERATE)
    set(CUDA_NVCC_DEPEND ${dependency_file})
    #message("Generating an empty dependency_file: ${dependency_file}")
    file(WRITE ${dependency_file} "#FindCUDA.cmake generated file.  Do not edit.\n")
  endif()

endmacro()

###############################################################################
###############################################################################
# Setup variables' defaults
###############################################################################
###############################################################################

# Allow the user to specify if the device code is supposed to be 32 or 64 bit.
if(CMAKE_SIZEOF_VOID_P EQUAL 8)
  set(CUDA_64_BIT_DEVICE_CODE_DEFAULT ON)
else()
  set(CUDA_64_BIT_DEVICE_CODE_DEFAULT OFF)
endif()
option(CUDA_64_BIT_DEVICE_CODE "Compile device code in 64 bit mode" ${CUDA_64_BIT_DEVICE_CODE_DEFAULT})

# Attach the build rule to the source file in VS.
option(CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE "Attach the build rule to the CUDA source file.  Enable only when the CUDA source file is added to at most one target." ON)

# Prints out extra information about the cuda file during compilation
option(CUDA_BUILD_CUBIN "Generate and parse .cubin files in Device mode." OFF)

# Set whether we are using emulation or device mode.
option(CUDA_BUILD_EMULATION "Build in Emulation mode" OFF)

# Where to put the generated output.
set(CUDA_GENERATED_OUTPUT_DIR "" CACHE PATH "Directory to put all the output files.  If blank it will default to the CMAKE_CURRENT_BINARY_DIR")

# Parse HOST_COMPILATION mode.
option(CUDA_HOST_COMPILATION_CPP "Generated file extension" ON)

# Extra user settable flags
cmake_initialize_per_config_variable(CUDA_NVCC_FLAGS "Semi-colon delimit multiple arguments.")

if(DEFINED ENV{CUDAHOSTCXX})
  set(CUDA_HOST_COMPILER "$ENV{CUDAHOSTCXX}" CACHE FILEPATH "Host side compiler used by NVCC")
elseif(CMAKE_GENERATOR MATCHES "Visual Studio")
  set(_CUDA_MSVC_HOST_COMPILER "$(VCInstallDir)Tools/MSVC/$(VCToolsVersion)/bin/Host$(Platform)/$(PlatformTarget)")
  if(MSVC_VERSION LESS 1910)
    set(_CUDA_MSVC_HOST_COMPILER "$(VCInstallDir)bin")
  endif()

  set(CUDA_HOST_COMPILER "${_CUDA_MSVC_HOST_COMPILER}" CACHE FILEPATH "Host side compiler used by NVCC")

else()
  if(APPLE
      AND "${CMAKE_C_COMPILER_ID}" MATCHES "Clang"
      AND "${CMAKE_C_COMPILER}" MATCHES "/cc$")
    # Using cc which is symlink to clang may let NVCC think it is GCC and issue
    # unhandled -dumpspecs option to clang. Also in case neither
    # CMAKE_C_COMPILER is defined (project does not use C language) nor
    # CUDA_HOST_COMPILER is specified manually we should skip -ccbin and let
    # nvcc use its own default C compiler.
    # Only care about this on APPLE with clang to avoid
    # following symlinks to things like ccache
    if(DEFINED CMAKE_C_COMPILER AND NOT DEFINED CUDA_HOST_COMPILER)
      get_filename_component(c_compiler_realpath "${CMAKE_C_COMPILER}" REALPATH)
      # if the real path does not end up being clang then
      # go back to using CMAKE_C_COMPILER
      if(NOT "${c_compiler_realpath}" MATCHES "/clang$")
        set(c_compiler_realpath "${CMAKE_C_COMPILER}")
      endif()
    else()
      set(c_compiler_realpath "")
    endif()
    set(CUDA_HOST_COMPILER "${c_compiler_realpath}" CACHE FILEPATH "Host side compiler used by NVCC")
  elseif(MSVC AND "${CMAKE_C_COMPILER}" MATCHES "clcache|sccache")
    # NVCC does not think it will work if it is passed clcache.exe or sccache.exe
    # as the host compiler, which means that builds with CC=cl.exe won't work.
    # Best to just feed it whatever the actual cl.exe is as the host compiler.
    set(CUDA_HOST_COMPILER "cl.exe" CACHE FILEPATH "Host side compiler used by NVCC")
  else()
    set(CUDA_HOST_COMPILER "${CMAKE_C_COMPILER}"
      CACHE FILEPATH "Host side compiler used by NVCC")
  endif()
endif()

# Propagate the host flags to the host compiler via -Xcompiler
option(CUDA_PROPAGATE_HOST_FLAGS "Propagate C/CXX_FLAGS and friends to the host compiler via -Xcompiler" ON)

# Enable CUDA_SEPARABLE_COMPILATION
option(CUDA_SEPARABLE_COMPILATION "Compile CUDA objects with separable compilation enabled.  Requires CUDA 5.0+" OFF)

# Specifies whether the commands used when compiling the .cu file will be printed out.
option(CUDA_VERBOSE_BUILD "Print out the commands run while compiling the CUDA source file.  With the Makefile generator this defaults to VERBOSE variable specified on the command line, but can be forced on with this option." OFF)

mark_as_advanced(
  CUDA_64_BIT_DEVICE_CODE
  CUDA_ATTACH_VS_BUILD_RULE_TO_CUDA_FILE
  CUDA_GENERATED_OUTPUT_DIR
  CUDA_HOST_COMPILATION_CPP
  CUDA_NVCC_FLAGS
  CUDA_PROPAGATE_HOST_FLAGS
  CUDA_BUILD_CUBIN
  CUDA_BUILD_EMULATION
  CUDA_VERBOSE_BUILD
  CUDA_SEPARABLE_COMPILATION
  )

# Single config generators like Makefiles or Ninja don't usually have
# CMAKE_CONFIGURATION_TYPES defined (but note that it can be defined if set by
# projects or developers). Even CMAKE_BUILD_TYPE might not be defined for
# single config generators (and should not be defined for multi-config
# generators). To ensure we get a complete superset of all possible
# configurations, we combine CMAKE_CONFIGURATION_TYPES, CMAKE_BUILD_TYPE and
# all of the standard configurations, then weed out duplicates with
# list(REMOVE_DUPLICATES). Looping over the unique set then ensures we have
# each configuration-specific set of nvcc flags defined and marked as advanced.
set(CUDA_configuration_types ${CMAKE_CONFIGURATION_TYPES} ${CMAKE_BUILD_TYPE} Debug MinSizeRel Release RelWithDebInfo)
list(REMOVE_DUPLICATES CUDA_configuration_types)

###############################################################################
###############################################################################
# Locate CUDA, Set Build Type, etc.
###############################################################################
###############################################################################

macro(cuda_unset_include_and_libraries)
  unset(CUDA_TOOLKIT_INCLUDE CACHE)
  unset(CUDA_CUDART_LIBRARY CACHE)
  unset(CUDA_CUDA_LIBRARY CACHE)
  # Make sure you run this before you unset CUDA_VERSION.
  if(CUDA_VERSION VERSION_EQUAL "3.0")
    # This only existed in the 3.0 version of the CUDA toolkit
    unset(CUDA_CUDARTEMU_LIBRARY CACHE)
  endif()
  unset(CUDA_cudart_static_LIBRARY CACHE)
  unset(CUDA_cudadevrt_LIBRARY CACHE)
  unset(CUDA_cublas_LIBRARY CACHE)
  unset(CUDA_cublas_device_LIBRARY CACHE)
  unset(CUDA_cublasemu_LIBRARY CACHE)
  unset(CUDA_cufft_LIBRARY CACHE)
  unset(CUDA_cufftemu_LIBRARY CACHE)
  unset(CUDA_cupti_LIBRARY CACHE)
  unset(CUDA_curand_LIBRARY CACHE)
  unset(CUDA_cusolver_LIBRARY CACHE)
  unset(CUDA_cusparse_LIBRARY CACHE)
  unset(CUDA_npp_LIBRARY CACHE)
  unset(CUDA_nppc_LIBRARY CACHE)
  unset(CUDA_nppi_LIBRARY CACHE)
  unset(CUDA_npps_LIBRARY CACHE)
  unset(CUDA_nvcuvenc_LIBRARY CACHE)
  unset(CUDA_nvcuvid_LIBRARY CACHE)
  unset(CUDA_nvToolsExt_LIBRARY CACHE)
  unset(CUDA_OpenCL_LIBRARY CACHE)
  unset(CUDA_GPU_DETECT_OUTPUT CACHE)
endmacro()

# Check to see if the CUDA_TOOLKIT_ROOT_DIR and CUDA_SDK_ROOT_DIR have changed,
# if they have then clear the cache variables, so that they will be detected again.
if(NOT "${CUDA_TOOLKIT_ROOT_DIR}" STREQUAL "${CUDA_TOOLKIT_ROOT_DIR_INTERNAL}")
  unset(CUDA_TOOLKIT_TARGET_DIR CACHE)
  unset(CUDA_NVCC_EXECUTABLE CACHE)
  cuda_unset_include_and_libraries()
  unset(CUDA_VERSION CACHE)
endif()

if(NOT "${CUDA_TOOLKIT_TARGET_DIR}" STREQUAL "${CUDA_TOOLKIT_TARGET_DIR_INTERNAL}")
  cuda_unset_include_and_libraries()
endif()

#
#  End of unset()
#

#
#  Start looking for things
#

# Search for the cuda distribution.
if(NOT CUDA_TOOLKIT_ROOT_DIR AND NOT CMAKE_CROSSCOMPILING)
  # Search in the CUDA_BIN_PATH first.
  find_program(CUDA_TOOLKIT_ROOT_DIR_NVCC
    NAMES nvcc nvcc.exe
    PATHS
      ENV CUDA_TOOLKIT_ROOT
      ENV CUDA_PATH
      ENV CUDA_BIN_PATH
    PATH_SUFFIXES bin bin64
    DOC "Toolkit location."
    NO_DEFAULT_PATH
    )

  # Now search default paths
  find_program(CUDA_TOOLKIT_ROOT_DIR_NVCC
    NAMES nvcc nvcc.exe
    PATHS /opt/cuda/bin
    PATH_SUFFIXES cuda/bin
    DOC "Toolkit location."
    )

  if (CUDA_TOOLKIT_ROOT_DIR_NVCC)
    # Given that NVCC can be provided by multiple different sources (NVIDIA HPC SDK, CUDA Toolkit, distro)
    # each of which has a different layout, we need to extract the CUDA toolkit root from the compiler
    # itself, allowing us to support numerous different scattered toolkit layouts
    execute_process(COMMAND ${CUDA_TOOLKIT_ROOT_DIR_NVCC} "-v" "__cmake_determine_cuda"
      OUTPUT_VARIABLE _CUDA_NVCC_OUT ERROR_VARIABLE _CUDA_NVCC_OUT)
    if(_CUDA_NVCC_OUT MATCHES "TOP=([^\r\n]*)")
      get_filename_component(CUDA_TOOLKIT_ROOT_DIR "${CMAKE_MATCH_1}" ABSOLUTE CACHE)
    else()
      get_filename_component(CUDA_TOOLKIT_ROOT_DIR "${CUDA_TOOLKIT_ROOT_DIR_NVCC}" DIRECTORY)
      get_filename_component(CUDA_TOOLKIT_ROOT_DIR "${CUDA_TOOLKIT_ROOT_DIR}" DIRECTORY CACHE)
    endif()
    unset(_CUDA_NVCC_OUT)

    string(REGEX REPLACE "[/\\\\]?bin[64]*[/\\\\]?$" "" CUDA_TOOLKIT_ROOT_DIR ${CUDA_TOOLKIT_ROOT_DIR})
    # We need to force this back into the cache.
    set(CUDA_TOOLKIT_ROOT_DIR ${CUDA_TOOLKIT_ROOT_DIR} CACHE PATH "Toolkit location." FORCE)
    set(CUDA_TOOLKIT_TARGET_DIR ${CUDA_TOOLKIT_ROOT_DIR})
  endif()
  unset(CUDA_TOOLKIT_ROOT_DIR_NVCC CACHE)

  if (NOT EXISTS ${CUDA_TOOLKIT_ROOT_DIR})
    if(CUDA_FIND_REQUIRED)
      message(FATAL_ERROR "Specify CUDA_TOOLKIT_ROOT_DIR")
    elseif(NOT CUDA_FIND_QUIETLY)
      message("CUDA_TOOLKIT_ROOT_DIR not found or specified")
    endif()
  endif ()
endif ()

if(CMAKE_CROSSCOMPILING)
  SET (CUDA_TOOLKIT_ROOT $ENV{CUDA_TOOLKIT_ROOT})
  if(CMAKE_SYSTEM_PROCESSOR STREQUAL "armv7-a")
    # Support for NVPACK
    set (CUDA_TOOLKIT_TARGET_NAME "armv7-linux-androideabi")
  elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "arm")
    # Support for arm cross compilation
    set(CUDA_TOOLKIT_TARGET_NAME "armv7-linux-gnueabihf")
  elseif(CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")
    # Support for aarch64 cross compilation
    if (ANDROID_ARCH_NAME STREQUAL "arm64")
      set(CUDA_TOOLKIT_TARGET_NAME "aarch64-linux-androideabi")
    elseif (CMAKE_SYSTEM_NAME STREQUAL "QNX")
      set(CUDA_TOOLKIT_TARGET_NAME "aarch64-qnx")
    else()
      set(CUDA_TOOLKIT_TARGET_NAME "aarch64-linux")
    endif (ANDROID_ARCH_NAME STREQUAL "arm64")
  endif()

  if (EXISTS "${CUDA_TOOLKIT_ROOT}/targets/${CUDA_TOOLKIT_TARGET_NAME}")
    set(CUDA_TOOLKIT_TARGET_DIR "${CUDA_TOOLKIT_ROOT}/targets/${CUDA_TOOLKIT_TARGET_NAME}" CACHE PATH "CUDA Toolkit target location.")
    SET (CUDA_TOOLKIT_ROOT_DIR ${CUDA_TOOLKIT_ROOT})
    mark_as_advanced(CUDA_TOOLKIT_TARGET_DIR)
  endif()

  # add known CUDA target root path to the set of directories we search for programs, libraries and headers
  set( CMAKE_FIND_ROOT_PATH "${CUDA_TOOLKIT_TARGET_DIR};${CMAKE_FIND_ROOT_PATH}")
  macro( cuda_find_host_program )
    if (COMMAND find_host_program)
      find_host_program( ${ARGN} )
    else()
      find_program( ${ARGN} )
    endif()
  endmacro()
else()
  # for non-cross-compile, find_host_program == find_program and CUDA_TOOLKIT_TARGET_DIR == CUDA_TOOLKIT_ROOT_DIR
  macro( cuda_find_host_program )
    find_program( ${ARGN} )
  endmacro()
  SET (CUDA_TOOLKIT_TARGET_DIR ${CUDA_TOOLKIT_ROOT_DIR})
endif()


# CUDA_NVCC_EXECUTABLE
if(DEFINED ENV{CUDA_NVCC_EXECUTABLE})
  set(CUDA_NVCC_EXECUTABLE "$ENV{CUDA_NVCC_EXECUTABLE}" CACHE FILEPATH "The CUDA compiler")
else()
  cuda_find_host_program(CUDA_NVCC_EXECUTABLE
    NAMES nvcc
    PATHS "${CUDA_TOOLKIT_ROOT_DIR}"
    ENV CUDA_PATH
    ENV CUDA_BIN_PATH
    PATH_SUFFIXES bin bin64
    NO_DEFAULT_PATH
    )
  # Search default search paths, after we search our own set of paths.
  cuda_find_host_program(CUDA_NVCC_EXECUTABLE nvcc)
endif()
mark_as_advanced(CUDA_NVCC_EXECUTABLE)

if(CUDA_NVCC_EXECUTABLE AND NOT CUDA_VERSION)
  # Compute the version.
  execute_process (COMMAND ${CUDA_NVCC_EXECUTABLE} "--version" OUTPUT_VARIABLE NVCC_OUT)
  string(REGEX REPLACE ".*release ([0-9]+)\\.([0-9]+).*" "\\1" CUDA_VERSION_MAJOR ${NVCC_OUT})
  string(REGEX REPLACE ".*release ([0-9]+)\\.([0-9]+).*" "\\2" CUDA_VERSION_MINOR ${NVCC_OUT})
  set(CUDA_VERSION "${CUDA_VERSION_MAJOR}.${CUDA_VERSION_MINOR}" CACHE STRING "Version of CUDA as computed from nvcc.")
  mark_as_advanced(CUDA_VERSION)
else()
  # Need to set these based off of the cached value
  string(REGEX REPLACE "([0-9]+)\\.([0-9]+).*" "\\1" CUDA_VERSION_MAJOR "${CUDA_VERSION}")
  string(REGEX REPLACE "([0-9]+)\\.([0-9]+).*" "\\2" CUDA_VERSION_MINOR "${CUDA_VERSION}")
endif()


# Always set this convenience variable
set(CUDA_VERSION_STRING "${CUDA_VERSION}")

# CUDA_TOOLKIT_INCLUDE
find_path(CUDA_TOOLKIT_INCLUDE
  device_functions.h # Header included in toolkit
  PATHS ${CUDA_TOOLKIT_TARGET_DIR}
  ENV CUDA_PATH
  ENV CUDA_INC_PATH
  PATH_SUFFIXES include
  NO_DEFAULT_PATH
  )
# Search default search paths, after we search our own set of paths.
find_path(CUDA_TOOLKIT_INCLUDE device_functions.h)
mark_as_advanced(CUDA_TOOLKIT_INCLUDE)

if (CUDA_VERSION VERSION_GREATER "7.0" OR EXISTS "${CUDA_TOOLKIT_INCLUDE}/cuda_fp16.h")
  set(CUDA_HAS_FP16 TRUE)
else()
  set(CUDA_HAS_FP16 FALSE)
endif()

# Set the user list of include dir to nothing to initialize it.
set (CUDA_NVCC_INCLUDE_DIRS_USER "")
set (CUDA_INCLUDE_DIRS ${CUDA_TOOLKIT_INCLUDE})

macro(cuda_find_library_local_first_with_path_ext _var _names _doc _path_ext )
  if(CMAKE_SIZEOF_VOID_P EQUAL 8)
    # CUDA 3.2+ on Windows moved the library directories, so we need the new
    # and old paths.
    set(_cuda_64bit_lib_dir "${_path_ext}lib/x64" "${_path_ext}lib64" "${_path_ext}libx64" )
  endif()
  # CUDA 3.2+ on Windows moved the library directories, so we need the new
  # (lib/Win32) and the old path (lib).
  find_library(${_var}
    NAMES ${_names}
    PATHS "${CUDA_TOOLKIT_TARGET_DIR}"
    ENV CUDA_PATH
    ENV CUDA_LIB_PATH
    PATH_SUFFIXES ${_cuda_64bit_lib_dir} "${_path_ext}lib/Win32" "${_path_ext}lib" "${_path_ext}libWin32"
    DOC ${_doc}
    NO_DEFAULT_PATH
    )
  if (NOT CMAKE_CROSSCOMPILING)
    # Search default search paths, after we search our own set of paths.
    find_library(${_var}
      NAMES ${_names}
      PATHS "/usr/lib/nvidia-current"
      DOC ${_doc}
      )
  endif()
endmacro()

macro(cuda_find_library_local_first _var _names _doc)
  cuda_find_library_local_first_with_path_ext( "${_var}" "${_names}" "${_doc}" "" )
endmacro()

macro(find_library_local_first _var _names _doc )
  cuda_find_library_local_first( "${_var}" "${_names}" "${_doc}" "" )
endmacro()


# CUDA_LIBRARIES
cuda_find_library_local_first(CUDA_CUDART_LIBRARY cudart "\"cudart\" library")
if(CUDA_VERSION VERSION_EQUAL "3.0")
  # The cudartemu library only existed for the 3.0 version of CUDA.
  cuda_find_library_local_first(CUDA_CUDARTEMU_LIBRARY cudartemu "\"cudartemu\" library")
  mark_as_advanced(
    CUDA_CUDARTEMU_LIBRARY
    )
endif()

if(NOT CUDA_VERSION VERSION_LESS "5.5")
  cuda_find_library_local_first(CUDA_cudart_static_LIBRARY cudart_static "static CUDA runtime library")
  mark_as_advanced(CUDA_cudart_static_LIBRARY)
endif()


if(CUDA_cudart_static_LIBRARY)
  # If static cudart available, use it by default, but provide a user-visible option to disable it.
  option(CUDA_USE_STATIC_CUDA_RUNTIME "Use the static version of the CUDA runtime library if available" ON)
else()
  # If not available, silently disable the option.
  set(CUDA_USE_STATIC_CUDA_RUNTIME OFF CACHE INTERNAL "")
endif()

if(CUDA_USE_STATIC_CUDA_RUNTIME)
  set(CUDA_CUDART_LIBRARY_VAR CUDA_cudart_static_LIBRARY)
else()
  set(CUDA_CUDART_LIBRARY_VAR CUDA_CUDART_LIBRARY)
endif()

if(NOT CUDA_VERSION VERSION_LESS "5.0")
  cuda_find_library_local_first(CUDA_cudadevrt_LIBRARY cudadevrt "\"cudadevrt\" library")
  mark_as_advanced(CUDA_cudadevrt_LIBRARY)
endif()

if(CUDA_USE_STATIC_CUDA_RUNTIME)
  if(UNIX)
    # Check for the dependent libraries.

    # Many of the FindXYZ modules CMake comes with make use of try_compile with int main(){return 0;}
    # as the source file.  Unfortunately this causes a warning with -Wstrict-prototypes and
    # -Werror causes the try_compile to fail.  We will just temporarily disable other flags
    # when doing the find_package command here.
    set(_cuda_cmake_c_flags ${CMAKE_C_FLAGS})
    set(CMAKE_C_FLAGS "-fPIC")
    find_package(Threads REQUIRED)
    set(CMAKE_C_FLAGS ${_cuda_cmake_c_flags})

    if(NOT APPLE AND NOT (CMAKE_SYSTEM_NAME STREQUAL "QNX"))
      # On Linux, you must link against librt when using the static cuda runtime.
      find_library(CUDA_rt_LIBRARY rt)
      if (NOT CUDA_rt_LIBRARY)
        message(WARNING "Expecting to find librt for libcudart_static, but didn't find it.")
      endif()
    endif()
  endif()
endif()

# CUPTI library showed up in cuda toolkit 4.0
if(NOT CUDA_VERSION VERSION_LESS "4.0")
  cuda_find_library_local_first_with_path_ext(CUDA_cupti_LIBRARY cupti "\"cupti\" library" "extras/CUPTI/")
  mark_as_advanced(CUDA_cupti_LIBRARY)
endif()

# Set the CUDA_LIBRARIES variable.  This is the set of stuff to link against if you are
# using the CUDA runtime.  For the dynamic version of the runtime, most of the
# dependencies are brought in, but for the static version there are additional libraries
# and linker commands needed.
# Initialize to empty
set(CUDA_LIBRARIES)

# If we are using emulation mode and we found the cudartemu library then use
# that one instead of cudart.
if(CUDA_BUILD_EMULATION AND CUDA_CUDARTEMU_LIBRARY)
  list(APPEND CUDA_LIBRARIES ${CUDA_CUDARTEMU_LIBRARY})
elseif(CUDA_USE_STATIC_CUDA_RUNTIME AND CUDA_cudart_static_LIBRARY)
  list(APPEND CUDA_LIBRARIES ${CUDA_cudart_static_LIBRARY})
  if (TARGET Threads::Threads)
    list(APPEND CUDA_LIBRARIES Threads::Threads)
  endif()
  list(APPEND CUDA_LIBRARIES ${CMAKE_DL_LIBS})
  if (CUDA_rt_LIBRARY)
    list(APPEND CUDA_LIBRARIES ${CUDA_rt_LIBRARY})
  endif()
  if(APPLE)
    # We need to add the default path to the driver (libcuda.dylib) as an rpath, so that
    # the static cuda runtime can find it at runtime.
    list(APPEND CUDA_LIBRARIES -Wl,-rpath,/usr/local/cuda/lib)
  endif()
else()
  list(APPEND CUDA_LIBRARIES ${CUDA_CUDART_LIBRARY})
endif()

# 1.1 toolkit on linux doesn't appear to have a separate library on
# some platforms.
cuda_find_library_local_first(CUDA_CUDA_LIBRARY cuda "\"cuda\" library (older versions only).")

mark_as_advanced(
  CUDA_CUDA_LIBRARY
  CUDA_CUDART_LIBRARY
  )

#######################
# Look for some of the toolkit helper libraries
macro(FIND_CUDA_HELPER_LIBS _name)
  cuda_find_library_local_first(CUDA_${_name}_LIBRARY ${_name} "\"${_name}\" library")
  mark_as_advanced(CUDA_${_name}_LIBRARY)
endmacro()

#######################
# Disable emulation for v3.1 onward
if(CUDA_VERSION VERSION_GREATER "3.0")
  if(CUDA_BUILD_EMULATION)
    message(FATAL_ERROR "CUDA_BUILD_EMULATION is not supported in version 3.1 and onwards.  You must disable it to proceed.  You have version ${CUDA_VERSION}.")
  endif()
endif()

# Search for additional CUDA toolkit libraries.
if(CUDA_VERSION VERSION_LESS "3.1")
  # Emulation libraries aren't available in version 3.1 onward.
  find_cuda_helper_libs(cufftemu)
  find_cuda_helper_libs(cublasemu)
endif()
find_cuda_helper_libs(cufft)
find_cuda_helper_libs(cublas)
if(NOT CUDA_VERSION VERSION_LESS "3.2")
  # cusparse showed up in version 3.2
  find_cuda_helper_libs(cusparse)
  find_cuda_helper_libs(curand)
  if (WIN32)
    find_cuda_helper_libs(nvcuvenc)
    find_cuda_helper_libs(nvcuvid)
  endif()
endif()
if(CUDA_VERSION VERSION_GREATER "5.0" AND CUDA_VERSION VERSION_LESS "9.2")
  # In CUDA 9.2 cublas_device was deprecated
  find_cuda_helper_libs(cublas_device)
endif()

if(NOT CUDA_VERSION VERSION_LESS "5.0")
  find_cuda_helper_libs(nvToolsExt)
  find_cuda_helper_libs(OpenCL)
endif()

if(NOT CUDA_VERSION VERSION_LESS "9.0")
  # In CUDA 9.0 nppi was removed and split into the nppi* libraries below
  find_cuda_helper_libs(nppc)
  find_cuda_helper_libs(nppial)
  find_cuda_helper_libs(nppicc)
  if(CUDA_VERSION VERSION_LESS "11.0")
    find_cuda_helper_libs(nppicom)
  endif()
  find_cuda_helper_libs(nppidei)
  find_cuda_helper_libs(nppif)
  find_cuda_helper_libs(nppig)
  find_cuda_helper_libs(nppim)
  find_cuda_helper_libs(nppist)
  find_cuda_helper_libs(nppisu)
  find_cuda_helper_libs(nppitc)
  find_cuda_helper_libs(npps)
  set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppial_LIBRARY};${CUDA_nppicc_LIBRARY};${CUDA_nppicom_LIBRARY};${CUDA_nppidei_LIBRARY};${CUDA_nppif_LIBRARY};${CUDA_nppig_LIBRARY};${CUDA_nppim_LIBRARY};${CUDA_nppist_LIBRARY};${CUDA_nppisu_LIBRARY};${CUDA_nppitc_LIBRARY};${CUDA_npps_LIBRARY}")
elseif(CUDA_VERSION VERSION_GREATER "5.0")
  # In CUDA 5.5 NPP was split into 3 separate libraries.
  find_cuda_helper_libs(nppc)
  find_cuda_helper_libs(nppi)
  find_cuda_helper_libs(npps)
  set(CUDA_npp_LIBRARY "${CUDA_nppc_LIBRARY};${CUDA_nppi_LIBRARY};${CUDA_npps_LIBRARY}")
elseif(NOT CUDA_VERSION VERSION_LESS "4.0")
  find_cuda_helper_libs(npp)
endif()
if(NOT CUDA_VERSION VERSION_LESS "7.0")
  # cusolver showed up in version 7.0
  find_cuda_helper_libs(cusolver)
endif()

if (CUDA_BUILD_EMULATION)
  set(CUDA_CUFFT_LIBRARIES ${CUDA_cufftemu_LIBRARY})
  set(CUDA_CUBLAS_LIBRARIES ${CUDA_cublasemu_LIBRARY})
else()
  set(CUDA_CUFFT_LIBRARIES ${CUDA_cufft_LIBRARY})
  set(CUDA_CUBLAS_LIBRARIES ${CUDA_cublas_LIBRARY} ${CUDA_cublas_device_LIBRARY})
endif()

########################
# Look for the SDK stuff.  As of CUDA 3.0 NVSDKCUDA_ROOT has been replaced with
# NVSDKCOMPUTE_ROOT with the old CUDA C contents moved into the C subdirectory
find_path(CUDA_SDK_ROOT_DIR common/inc/cutil.h
 HINTS
  "$ENV{NVSDKCOMPUTE_ROOT}/C"
  ENV NVSDKCUDA_ROOT
  "[HKEY_LOCAL_MACHINE\\SOFTWARE\\NVIDIA Corporation\\Installed Products\\NVIDIA SDK 10\\Compute;InstallDir]"
 PATHS
  "/Developer/GPU\ Computing/C"
  )

# Keep the CUDA_SDK_ROOT_DIR first in order to be able to override the
# environment variables.
set(CUDA_SDK_SEARCH_PATH
  "${CUDA_SDK_ROOT_DIR}"
  "${CUDA_TOOLKIT_ROOT_DIR}/local/NVSDK0.2"
  "${CUDA_TOOLKIT_ROOT_DIR}/NVSDK0.2"
  "${CUDA_TOOLKIT_ROOT_DIR}/NV_CUDA_SDK"
  "$ENV{HOME}/NVIDIA_CUDA_SDK"
  "$ENV{HOME}/NVIDIA_CUDA_SDK_MACOSX"
  "/Developer/CUDA"
  )

# Example of how to find an include file from the CUDA_SDK_ROOT_DIR

# find_path(CUDA_CUT_INCLUDE_DIR
#   cutil.h
#   PATHS ${CUDA_SDK_SEARCH_PATH}
#   PATH_SUFFIXES "common/inc"
#   DOC "Location of cutil.h"
#   NO_DEFAULT_PATH
#   )
# # Now search system paths
# find_path(CUDA_CUT_INCLUDE_DIR cutil.h DOC "Location of cutil.h")

# mark_as_advanced(CUDA_CUT_INCLUDE_DIR)


# Example of how to find a library in the CUDA_SDK_ROOT_DIR

# # cutil library is called cutil64 for 64 bit builds on windows.  We don't want
# # to get these confused, so we are setting the name based on the word size of
# # the build.

# if(CMAKE_SIZEOF_VOID_P EQUAL 8)
#   set(cuda_cutil_name cutil64)
# else()
#   set(cuda_cutil_name cutil32)
# endif()

# find_library(CUDA_CUT_LIBRARY
#   NAMES cutil ${cuda_cutil_name}
#   PATHS ${CUDA_SDK_SEARCH_PATH}
#   # The new version of the sdk shows up in common/lib, but the old one is in lib
#   PATH_SUFFIXES "common/lib" "lib"
#   DOC "Location of cutil library"
#   NO_DEFAULT_PATH
#   )
# # Now search system paths
# find_library(CUDA_CUT_LIBRARY NAMES cutil ${cuda_cutil_name} DOC "Location of cutil library")
# mark_as_advanced(CUDA_CUT_LIBRARY)
# set(CUDA_CUT_LIBRARIES ${CUDA_CUT_LIBRARY})



#############################
# Check for required components
set(CUDA_FOUND TRUE)

set(CUDA_TOOLKIT_ROOT_DIR_INTERNAL "${CUDA_TOOLKIT_ROOT_DIR}" CACHE INTERNAL
  "This is the value of the last time CUDA_TOOLKIT_ROOT_DIR was set successfully." FORCE)
set(CUDA_TOOLKIT_TARGET_DIR_INTERNAL "${CUDA_TOOLKIT_TARGET_DIR}" CACHE INTERNAL
  "This is the value of the last time CUDA_TOOLKIT_TARGET_DIR was set successfully." FORCE)
set(CUDA_SDK_ROOT_DIR_INTERNAL "${CUDA_SDK_ROOT_DIR}" CACHE INTERNAL
  "This is the value of the last time CUDA_SDK_ROOT_DIR was set successfully." FORCE)

include(${CMAKE_CURRENT_LIST_DIR}/FindPackageHandleStandardArgs.cmake)

find_package_handle_standard_args(CUDA
  REQUIRED_VARS
    CUDA_TOOLKIT_ROOT_DIR
    CUDA_NVCC_EXECUTABLE
    CUDA_INCLUDE_DIRS
    ${CUDA_CUDART_LIBRARY_VAR}
  VERSION_VAR
    CUDA_VERSION
  )



###############################################################################
###############################################################################
# Macros
###############################################################################
###############################################################################

###############################################################################
# Add include directories to pass to the nvcc command.
macro(CUDA_INCLUDE_DIRECTORIES)
  foreach(dir ${ARGN})
    list(APPEND CUDA_NVCC_INCLUDE_DIRS_USER ${dir})
  endforeach()
endmacro()


##############################################################################
cuda_find_helper_file(parse_cubin cmake)
cuda_find_helper_file(make2cmake cmake)
cuda_find_helper_file(run_nvcc cmake)
include("${CMAKE_CURRENT_LIST_DIR}/FindCUDA/select_compute_arch.cmake")

##############################################################################
# Separate the OPTIONS out from the sources
#
macro(CUDA_GET_SOURCES_AND_OPTIONS _sources _cmake_options _options)
  set( ${_sources} )
  set( ${_cmake_options} )
  set( ${_options} )
  set( _found_options FALSE )
  foreach(arg ${ARGN})
    if("x${arg}" STREQUAL "xOPTIONS")
      set( _found_options TRUE )
    elseif(
        "x${arg}" STREQUAL "xWIN32" OR
        "x${arg}" STREQUAL "xMACOSX_BUNDLE" OR
        "x${arg}" STREQUAL "xEXCLUDE_FROM_ALL" OR
        "x${arg}" STREQUAL "xSTATIC" OR
        "x${arg}" STREQUAL "xSHARED" OR
        "x${arg}" STREQUAL "xMODULE"
        )
      list(APPEND ${_cmake_options} ${arg})
    else()
      if ( _found_options )
        list(APPEND ${_options} ${arg})
      else()
        # Assume this is a file
        list(APPEND ${_sources} ${arg})
      endif()
    endif()
  endforeach()
endmacro()
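
# Illustrative sketch of how the macro above splits its arguments.  The
# variable names (_srcs, _cmake_opts, _nvcc_opts) and file names are
# hypothetical and chosen only for this example:
#
#   cuda_get_sources_and_options(_srcs _cmake_opts _nvcc_opts
#     a.cu b.cpp SHARED OPTIONS -O3 DEBUG -g)
#   # _srcs       -> a.cu;b.cpp     (everything before OPTIONS that is not a keyword)
#   # _cmake_opts -> SHARED         (keywords forwarded to add_library/add_executable)
#   # _nvcc_opts  -> -O3;DEBUG;-g   (everything after OPTIONS, still unparsed)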

##############################################################################
# Parse the OPTIONS from ARGN and set the variables prefixed by _option_prefix
#
macro(CUDA_PARSE_NVCC_OPTIONS _option_prefix)
  set( _found_config )
  foreach(arg ${ARGN})
    # Determine if we are dealing with a per-configuration flag
    foreach(config ${CUDA_configuration_types})
      string(TOUPPER ${config} config_upper)
      if (arg STREQUAL "${config_upper}")
        set( _found_config _${arg})
        # Set arg to nothing to keep it from being processed further
        set( arg )
      endif()
    endforeach()

    if ( arg )
      list(APPEND ${_option_prefix}${_found_config} "${arg}")
    endif()
  endforeach()
endmacro()
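
# Illustrative sketch of the per-configuration split done above, assuming
# Debug and Release are in CUDA_configuration_types and that the
# hypothetical _demo* variables start out empty:
#
#   cuda_parse_nvcc_options(_demo -arch=sm_50 DEBUG -g -G RELEASE -O3)
#   # _demo         -> -arch=sm_50  (flags before any configuration keyword)
#   # _demo_DEBUG   -> -g;-G
#   # _demo_RELEASE -> -O3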

##############################################################################
# Helper to add the include directory for CUDA only once
function(CUDA_ADD_CUDA_INCLUDE_ONCE)
  get_directory_property(_include_directories INCLUDE_DIRECTORIES)
  set(_add TRUE)
  if(_include_directories)
    foreach(dir ${_include_directories})
      if("${dir}" STREQUAL "${CUDA_INCLUDE_DIRS}")
        set(_add FALSE)
      endif()
    endforeach()
  endif()
  if(_add)
    include_directories(${CUDA_INCLUDE_DIRS})
  endif()
endfunction()

function(CUDA_BUILD_SHARED_LIBRARY shared_flag)
  set(cmake_args ${ARGN})
  # If SHARED, MODULE, or STATIC aren't already in the list of arguments, then
  # add SHARED or STATIC based on the value of BUILD_SHARED_LIBS.
  list(FIND cmake_args SHARED _cuda_found_SHARED)
  list(FIND cmake_args MODULE _cuda_found_MODULE)
  list(FIND cmake_args STATIC _cuda_found_STATIC)
  if( _cuda_found_SHARED GREATER -1 OR
      _cuda_found_MODULE GREATER -1 OR
      _cuda_found_STATIC GREATER -1)
    set(_cuda_build_shared_libs)
  else()
    if (BUILD_SHARED_LIBS)
      set(_cuda_build_shared_libs SHARED)
    else()
      set(_cuda_build_shared_libs STATIC)
    endif()
  endif()
  set(${shared_flag} ${_cuda_build_shared_libs} PARENT_SCOPE)
endfunction()
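
# Illustrative sketch of the helper above (the _flag variable and foo.cu are
# hypothetical names used only for this example):
#
#   cuda_build_shared_library(_flag foo.cu STATIC)
#   # _flag is left unset, because STATIC was given explicitly
#   cuda_build_shared_library(_flag foo.cu)
#   # _flag -> SHARED if BUILD_SHARED_LIBS is true, otherwise STATIC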

##############################################################################
# Helper to avoid clashes of files with the same basename but different paths.
# This doesn't attempt to do exactly what CMake internals do, which is to only
# add this path when there is a conflict, since by the time a second collision
# in names is detected it's already too late to fix the first one.  For
# consistency's sake the relative path will be added to all files.
function(CUDA_COMPUTE_BUILD_PATH path build_path)
  #message("CUDA_COMPUTE_BUILD_PATH([${path}] ${build_path})")
  # Only deal with CMake style paths from here on out
  file(TO_CMAKE_PATH "${path}" bpath)
  if (IS_ABSOLUTE "${bpath}")
    # Absolute paths are generally unnecessary, especially if something like
    # file(GLOB_RECURSE) is used to pick up the files.

    string(FIND "${bpath}" "${CMAKE_CURRENT_BINARY_DIR}" _binary_dir_pos)
    if (_binary_dir_pos EQUAL 0)
      file(RELATIVE_PATH bpath "${CMAKE_CURRENT_BINARY_DIR}" "${bpath}")
    else()
      file(RELATIVE_PATH bpath "${CMAKE_CURRENT_SOURCE_DIR}" "${bpath}")
    endif()
  endif()

  # This recipe is from cmLocalGenerator::CreateSafeUniqueObjectFileName in the
  # CMake source.

  # Remove leading /
  string(REGEX REPLACE "^[/]+" "" bpath "${bpath}")
  # Avoid absolute paths by removing ':'
  string(REPLACE ":" "_" bpath "${bpath}")
  # Avoid relative paths that go up the tree
  string(REPLACE "../" "__/" bpath "${bpath}")
  # Avoid spaces
  string(REPLACE " " "_" bpath "${bpath}")

  # Strip off the filename.  We wait until here to do it, since removing the
  # basename can make a path that looked like path/../basename turn into
  # path/.. (notice the missing trailing slash), which the "../" replacement
  # above would no longer catch.
  get_filename_component(bpath "${bpath}" PATH)

  set(${build_path} "${bpath}" PARENT_SCOPE)
  #message("${build_path} = ${bpath}")
endfunction()
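# Usage sketch (illustrative only; `_bp` and the paths are hypothetical),
# showing the result expected from the steps above for relative inputs:
#
#   CUDA_COMPUTE_BUILD_PATH("src/kernels/foo.cu" _bp)  # _bp: src/kernels
#   CUDA_COMPUTE_BUILD_PATH("../common/bar.cu" _bp)    # _bp: __/common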

##############################################################################
# This helper macro populates the following variables and sets up custom
# commands and targets to invoke the nvcc compiler to generate C or PTX source
# dependent upon the format parameter.  The compiler is invoked once with -M
# to generate a dependency file and a second time with -cuda or -ptx to generate
# a .cpp or .ptx file.
# INPUT:
#   cuda_target         - Target name
#   format              - PTX, CUBIN, FATBIN or OBJ
#   FILE1 .. FILEN      - The remaining arguments are the sources to be wrapped.
#   OPTIONS             - Extra options to NVCC
# OUTPUT:
#   generated_files     - List of generated files
##############################################################################
##############################################################################

macro(CUDA_WRAP_SRCS cuda_target format generated_files)

  # Put optional arguments in list.
  set(_argn_list "${ARGN}")
  # If one of the given optional arguments is "PHONY", make a note of it, then
  # remove it from the list.
  list(FIND _argn_list "PHONY" _phony_idx)
  if("${_phony_idx}" GREATER "-1")
    set(_target_is_phony true)
    list(REMOVE_AT _argn_list ${_phony_idx})
  else()
    set(_target_is_phony false)
  endif()

  # If CMake doesn't support separable compilation, complain
  if(CUDA_SEPARABLE_COMPILATION AND CMAKE_VERSION VERSION_LESS "2.8.10.1")
    message(SEND_ERROR "CUDA_SEPARABLE_COMPILATION isn't supported for CMake versions less than 2.8.10.1")
  endif()

  # Set up all the command line flags here, so that they can be overridden on a per target basis.

  set(nvcc_flags "")

  # Emulation if the card isn't present.
  if (CUDA_BUILD_EMULATION)
    # Emulation.
    set(nvcc_flags ${nvcc_flags} --device-emulation -D_DEVICEEMU -g)
  else()
    # Device mode.  No flags necessary.
  endif()

  if(CUDA_HOST_COMPILATION_CPP)
    set(CUDA_C_OR_CXX CXX)
  else()
    if(CUDA_VERSION VERSION_LESS "3.0")
      set(nvcc_flags ${nvcc_flags} --host-compilation C)
    else()
      message(WARNING "--host-compilation flag is deprecated in CUDA version >= 3.0.  Removing --host-compilation C flag" )
    endif()
    set(CUDA_C_OR_CXX C)
  endif()

  set(generated_extension ${CMAKE_${CUDA_C_OR_CXX}_OUTPUT_EXTENSION})

  if(CUDA_64_BIT_DEVICE_CODE)
    set(nvcc_flags ${nvcc_flags} -m64)
  else()
    set(nvcc_flags ${nvcc_flags} -m32)
  endif()

  if(CUDA_TARGET_CPU_ARCH)
    set(nvcc_flags ${nvcc_flags} "--target-cpu-architecture=${CUDA_TARGET_CPU_ARCH}")
  endif()

  # This needs to be passed in at this stage, because VS needs to fill out the
  # various macros from within VS.  Note that CCBIN is only used if
  # -ccbin or --compiler-bindir isn't used and CUDA_HOST_COMPILER matches
  # _CUDA_MSVC_HOST_COMPILER
  if(CMAKE_GENERATOR MATCHES "Visual Studio")
    set(ccbin_flags -D "\"CCBIN:PATH=${_CUDA_MSVC_HOST_COMPILER}\"" )
  else()
    set(ccbin_flags)
  endif()

  # Figure out which configuration we will use and pass that in as an argument to
  # the script.  We need to defer the decision until compilation time, because
  # for VS projects we won't know if we are making a debug or release build
  # until build time.
  if(CMAKE_GENERATOR MATCHES "Visual Studio")
    set( CUDA_build_configuration "$(ConfigurationName)" )
  else()
    set( CUDA_build_configuration "${CMAKE_BUILD_TYPE}")
  endif()

  # Initialize our list of includes with the user ones followed by the CUDA system ones.
  set(CUDA_NVCC_INCLUDE_DIRS ${CUDA_NVCC_INCLUDE_DIRS_USER} "${CUDA_INCLUDE_DIRS}")
  if(_target_is_phony)
    # If the passed in target name isn't a real target (i.e., this is from a call to one of the
    # cuda_compile_* functions), need to query directory properties to get include directories
    # and compile definitions.
    get_directory_property(_dir_include_dirs INCLUDE_DIRECTORIES)
    get_directory_property(_dir_compile_defs COMPILE_DEFINITIONS)

    list(APPEND CUDA_NVCC_INCLUDE_DIRS "${_dir_include_dirs}")
    set(CUDA_NVCC_COMPILE_DEFINITIONS "${_dir_compile_defs}")
  else()
    # Append the include directories for this target via generator expression, which is
    # expanded by the FILE(GENERATE) call below.  This generator expression captures all
    # include dirs set by the user, whether via directory properties or target properties
    list(APPEND CUDA_NVCC_INCLUDE_DIRS "$<TARGET_PROPERTY:${cuda_target},INCLUDE_DIRECTORIES>")

    # Do the same thing with compile definitions
    set(CUDA_NVCC_COMPILE_DEFINITIONS "$<TARGET_PROPERTY:${cuda_target},COMPILE_DEFINITIONS>")
  endif()


  # Reset these variables
  set(CUDA_WRAP_OPTION_NVCC_FLAGS)
  foreach(config ${CUDA_configuration_types})
    string(TOUPPER ${config} config_upper)
    set(CUDA_WRAP_OPTION_NVCC_FLAGS_${config_upper})
  endforeach()

  CUDA_GET_SOURCES_AND_OPTIONS(_cuda_wrap_sources _cuda_wrap_cmake_options _cuda_wrap_options ${_argn_list})
  CUDA_PARSE_NVCC_OPTIONS(CUDA_WRAP_OPTION_NVCC_FLAGS ${_cuda_wrap_options})

  # Figure out if we are building a shared library.  BUILD_SHARED_LIBS is
  # respected in CUDA_ADD_LIBRARY.
  set(_cuda_build_shared_libs FALSE)
  # SHARED, MODULE
  list(FIND _cuda_wrap_cmake_options SHARED _cuda_found_SHARED)
  list(FIND _cuda_wrap_cmake_options MODULE _cuda_found_MODULE)
  if(_cuda_found_SHARED GREATER -1 OR _cuda_found_MODULE GREATER -1)
    set(_cuda_build_shared_libs TRUE)
  endif()
  # STATIC
  list(FIND _cuda_wrap_cmake_options STATIC _cuda_found_STATIC)
  if(_cuda_found_STATIC GREATER -1)
    set(_cuda_build_shared_libs FALSE)
  endif()

  # CUDA_HOST_FLAGS
  if(_cuda_build_shared_libs)
    # If we are setting up code for a shared library, then we need to add extra flags for
    # compiling objects for shared libraries.
    set(CUDA_HOST_SHARED_FLAGS ${CMAKE_SHARED_LIBRARY_${CUDA_C_OR_CXX}_FLAGS})
  else()
    set(CUDA_HOST_SHARED_FLAGS)
  endif()
  # Only add the CMAKE_{C,CXX}_FLAGS if we are propagating host flags.  We
  # always need to set the SHARED_FLAGS, though.
  if(CUDA_PROPAGATE_HOST_FLAGS)
    set(_cuda_host_flags "set(CMAKE_HOST_FLAGS ${CMAKE_${CUDA_C_OR_CXX}_FLAGS} ${CUDA_HOST_SHARED_FLAGS})")
  else()
    set(_cuda_host_flags "set(CMAKE_HOST_FLAGS ${CUDA_HOST_SHARED_FLAGS})")
  endif()

  set(_cuda_nvcc_flags_config "# Build specific configuration flags")
  # Loop over all the configuration types to generate appropriate flags for run_nvcc.cmake
  foreach(config ${CUDA_configuration_types})
    string(TOUPPER ${config} config_upper)
    # CMAKE_FLAGS are strings and not lists.  By not putting quotes around CMAKE_FLAGS
    # we convert the strings to lists (like we want).

    if(CUDA_PROPAGATE_HOST_FLAGS)
      # nvcc chokes on -g3 in versions previous to 3.0 (and in 4.1 and 4.2), so replace it with -g
      set(_cuda_fix_g3 FALSE)

      if(CMAKE_COMPILER_IS_GNUCC)
        if (CUDA_VERSION VERSION_LESS  "3.0" OR
            CUDA_VERSION VERSION_EQUAL "4.1" OR
            CUDA_VERSION VERSION_EQUAL "4.2"
            )
          set(_cuda_fix_g3 TRUE)
        endif()
      endif()
      if(_cuda_fix_g3)
        string(REPLACE "-g3" "-g" _cuda_C_FLAGS "${CMAKE_${CUDA_C_OR_CXX}_FLAGS_${config_upper}}")
      else()
        set(_cuda_C_FLAGS "${CMAKE_${CUDA_C_OR_CXX}_FLAGS_${config_upper}}")
      endif()

      string(APPEND _cuda_host_flags "\nset(CMAKE_HOST_FLAGS_${config_upper} ${_cuda_C_FLAGS})")
    endif()

    # Note that if we ever want CUDA_NVCC_FLAGS_<CONFIG> to be a string (instead of a list
    # like it is currently), we can remove the quotes around the
    # ${CUDA_NVCC_FLAGS_${config_upper}} variable like the CMAKE_HOST_FLAGS_<CONFIG> variable.
    string(APPEND _cuda_nvcc_flags_config "\nset(CUDA_NVCC_FLAGS_${config_upper} ${CUDA_NVCC_FLAGS_${config_upper}} ;; ${CUDA_WRAP_OPTION_NVCC_FLAGS_${config_upper}})")
  endforeach()

  # Process the C++11 flag.  If the host sets the flag, we need to add it to nvcc and
  # remove it from the host. This is because -Xcompiler -std=c++11 will choke nvcc (it uses
  # the C preprocessor).  In order to get this to work correctly, we need to use nvcc's
  # specific c++11 flag.
  if( "${_cuda_host_flags}" MATCHES "-std=c\\+\\+11")
    # Add the c++11 flag to nvcc if it isn't already present.  Note that we only look at
    # the main flag instead of the configuration specific flags.