# Distributed under the OSI-approved BSD 3-Clause License.  See accompanying
# file Copyright.txt or https://cmake.org/licensing for details.

#[=======================================================================[.rst:
ExternalData
------------

.. only:: html

   .. contents::

Manage data files stored outside the source tree.

Introduction
^^^^^^^^^^^^

Use this module to unambiguously reference data files stored outside
the source tree and fetch them at build time from arbitrary local and
remote content-addressed locations.  Functions provided by this module
recognize arguments with the syntax ``DATA{<name>}`` as references to
external data, replace them with full paths to local copies of those
data, and create build rules to fetch and update the local copies.

For example:

.. code-block:: cmake

 include(ExternalData)
 set(ExternalData_URL_TEMPLATES "file:///local/%(algo)/%(hash)"
                                "file:////host/share/%(algo)/%(hash)"
                                "http://data.org/%(algo)/%(hash)")
 ExternalData_Add_Test(MyData
   NAME MyTest
   COMMAND MyExe DATA{MyInput.png}
   )
 ExternalData_Add_Target(MyData)

When test ``MyTest`` runs, the ``DATA{MyInput.png}`` argument will be
replaced by the full path to a real instance of the data file
``MyInput.png`` on disk.  If the source tree contains a content link
such as ``MyInput.png.md5`` then the ``MyData`` target creates a real
``MyInput.png`` in the build tree.

Module Functions
^^^^^^^^^^^^^^^^

.. command:: ExternalData_Expand_Arguments

  The ``ExternalData_Expand_Arguments`` function evaluates ``DATA{}``
  references in its arguments and constructs a new list of arguments::

    ExternalData_Expand_Arguments(
      <target>   # Name of data management target
      <outVar>   # Output variable
      [args...]  # Input arguments, DATA{} allowed
      )

  It replaces each ``DATA{}`` reference in an argument with the full path of
  a real data file on disk that will exist after the ``<target>`` builds.
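
  As a minimal sketch (``MyExe``, ``RunMyExe``, and ``expanded_args`` are
  hypothetical names, not part of this module), the expanded list can be
  handed to another command such as :command:`add_custom_target`:

  .. code-block:: cmake

    ExternalData_Expand_Arguments(MyData expanded_args
      --input DATA{MyInput.png} --verbose)
    add_custom_target(RunMyExe COMMAND MyExe ${expanded_args})
    # The data files exist only after the data management target builds.
    add_dependencies(RunMyExe MyData)

  This assumes ``ExternalData_Add_Target(MyData)`` is called later in the
  same project so that the ``MyData`` target exists.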

.. command:: ExternalData_Add_Test

  The ``ExternalData_Add_Test`` function wraps around the CMake
  :command:`add_test` command but supports ``DATA{}`` references in
  its arguments::

    ExternalData_Add_Test(
      <target>   # Name of data management target
      ...        # Arguments of add_test(), DATA{} allowed
      )

  It passes its arguments through ``ExternalData_Expand_Arguments`` and then
  invokes the :command:`add_test` command using the results.

.. command:: ExternalData_Add_Target

  The ``ExternalData_Add_Target`` function creates a custom target to
  manage local instances of data files stored externally::

    ExternalData_Add_Target(
      <target>                  # Name of data management target
      [SHOW_PROGRESS <ON|OFF>]  # Show progress during the download
      )

  It creates custom commands in the target as necessary to make data
  files available for each ``DATA{}`` reference previously evaluated by
  other functions provided by this module.
  Data files may be fetched from one of the URL templates specified in
  the ``ExternalData_URL_TEMPLATES`` variable, or may be found locally
  in one of the paths specified in the ``ExternalData_OBJECT_STORES``
  variable.

  .. versionadded:: 3.20
    The ``SHOW_PROGRESS`` argument may be passed to enable or suppress progress
    information during the download of objects.  If not provided, it defaults
    to ``OFF`` for the :generator:`Ninja` and :generator:`Ninja Multi-Config`
    generators and to ``ON`` otherwise.

  Typically only one target is needed to manage all external data within
  a project.  Call this function once at the end of configuration after
  all data references have been processed.
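
  For example, a project that wants download progress shown regardless of
  the generator might call the following (a sketch; ``MyData`` is the target
  name used in the introduction):

  .. code-block:: cmake

    ExternalData_Add_Target(MyData SHOW_PROGRESS ON)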

Module Variables
^^^^^^^^^^^^^^^^

The following variables configure behavior.  They should be set before
calling any of the functions provided by this module.

.. variable:: ExternalData_BINARY_ROOT

  The ``ExternalData_BINARY_ROOT`` variable may be set to the directory to
  hold the real data files named by expanded ``DATA{}`` references.  The
  default is ``CMAKE_BINARY_DIR``.  The directory layout will mirror that of
  content links under ``ExternalData_SOURCE_ROOT``.

.. variable:: ExternalData_CUSTOM_SCRIPT_<key>

  .. versionadded:: 3.2

  Specify a full path to a ``.cmake`` custom fetch script identified by
  ``<key>`` in entries of the ``ExternalData_URL_TEMPLATES`` list.
  See `Custom Fetch Scripts`_.

.. variable:: ExternalData_LINK_CONTENT

  The ``ExternalData_LINK_CONTENT`` variable may be set to the name of a
  supported hash algorithm to enable automatic conversion of real data
  files referenced by the ``DATA{}`` syntax into content links.  For each
  such ``<file>`` a content link named ``<file><ext>`` is created.  The
  original file is renamed to the form ``.ExternalData_<algo>_<hash>`` to
  stage it for future transmission to one of the locations in the list
  of URL templates (by means outside the scope of this module).  The
  data fetch rule created for the content link will use the staged
  object if it cannot be found using any URL template.
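
  A minimal sketch, assuming a real file ``MyInput.png`` sits in the current
  source directory and ``SHA256`` is the chosen algorithm (``MyExe`` and
  ``MyTest`` are the names from the introduction):

  .. code-block:: cmake

    set(ExternalData_LINK_CONTENT SHA256)
    # Evaluating the reference below writes MyInput.png.sha256 and renames
    # the original file to .ExternalData_SHA256_<hash> in the same directory.
    ExternalData_Add_Test(MyData
      NAME MyTest
      COMMAND MyExe DATA{MyInput.png}
      )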

.. variable:: ExternalData_NO_SYMLINKS

  .. versionadded:: 3.3

  The real data files named by expanded ``DATA{}`` references may be made
  available under ``ExternalData_BINARY_ROOT`` using symbolic links on
  some platforms.  The ``ExternalData_NO_SYMLINKS`` variable may be set
  to disable use of symbolic links and enable use of copies instead.

.. variable:: ExternalData_OBJECT_STORES

  The ``ExternalData_OBJECT_STORES`` variable may be set to a list of local
  directories that store objects using the layout ``<dir>/%(algo)/%(hash)``.
  These directories will be searched first for a needed object.  If the
  object is not available in any store then it will be fetched remotely
  using the URL templates and added to the first local store listed.  If
  no stores are specified the default is a location inside the build
  tree.
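
  For example, to share downloaded objects across several build trees (the
  store paths here are illustrative assumptions):

  .. code-block:: cmake

    set(ExternalData_OBJECT_STORES
      # Searched first; newly fetched objects are added here.
      "$ENV{HOME}/.ExternalData"
      # Per-build store searched second.
      "${CMAKE_BINARY_DIR}/ExternalData/Objects"
      )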

.. variable:: ExternalData_SERIES_PARSE
              ExternalData_SERIES_PARSE_PREFIX
              ExternalData_SERIES_PARSE_NUMBER
              ExternalData_SERIES_PARSE_SUFFIX
              ExternalData_SERIES_MATCH

  See `Referencing File Series`_.

.. variable:: ExternalData_SOURCE_ROOT

  The ``ExternalData_SOURCE_ROOT`` variable may be set to the highest source
  directory containing any path named by a ``DATA{}`` reference.  The
  default is ``CMAKE_SOURCE_DIR``.  ``ExternalData_SOURCE_ROOT`` and
  ``CMAKE_SOURCE_DIR`` must refer to directories within a single source
  distribution (e.g.  they come together in one tarball).

.. variable:: ExternalData_TIMEOUT_ABSOLUTE

  The ``ExternalData_TIMEOUT_ABSOLUTE`` variable sets the download
  absolute timeout, in seconds, with a default of ``300`` seconds.
  Set to ``0`` to disable enforcement.

.. variable:: ExternalData_TIMEOUT_INACTIVITY

  The ``ExternalData_TIMEOUT_INACTIVITY`` variable sets the download
  inactivity timeout, in seconds, with a default of ``60`` seconds.
  Set to ``0`` to disable enforcement.
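
  For example, a project fetching large objects over a slow link might relax
  both limits (the values are illustrative assumptions):

  .. code-block:: cmake

    # Give up after 120 seconds without any transfer activity.
    set(ExternalData_TIMEOUT_INACTIVITY 120)
    # Do not enforce a limit on the total download time.
    set(ExternalData_TIMEOUT_ABSOLUTE 0)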

.. variable:: ExternalData_URL_ALGO_<algo>_<key>

  .. versionadded:: 3.3

  Specify a custom URL component to be substituted for URL template
  placeholders of the form ``%(algo:<key>)``, where ``<key>`` is a
  valid C identifier, when fetching an object referenced via hash
  algorithm ``<algo>``.  If not defined, the default URL component
  is just ``<algo>`` for any ``<key>``.

.. variable:: ExternalData_URL_TEMPLATES

  The ``ExternalData_URL_TEMPLATES`` variable may be set to provide a list
  of URL templates using the placeholders ``%(algo)`` and ``%(hash)``
  in each template.  Data fetch rules try each URL template in order
  by substituting the hash algorithm name for ``%(algo)`` and the hash
  value for ``%(hash)``.  Alternatively one may use ``%(algo:<key>)``
  with ``ExternalData_URL_ALGO_<algo>_<key>`` variables to gain more
  flexibility in remote URLs.
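
  As a sketch, a server that expects lower-case algorithm names could be
  addressed with a hypothetical key ``lower`` (the host name is also an
  assumption):

  .. code-block:: cmake

    set(ExternalData_URL_TEMPLATES
      "https://data.example.org/%(algo:lower)/%(hash)")
    set(ExternalData_URL_ALGO_MD5_lower md5)
    set(ExternalData_URL_ALGO_SHA256_lower sha256)
    # An MD5 object is then requested from .../md5/<hash>.  Algorithms with
    # no matching variable fall back to the plain name, e.g. SHA512.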

Referencing Files
^^^^^^^^^^^^^^^^^

Referencing Single Files
""""""""""""""""""""""""

The ``DATA{}`` syntax is literal and the ``<name>`` is a full or relative path
within the source tree.  The source tree must contain either a real
data file at ``<name>`` or a "content link" at ``<name><ext>`` containing a
hash of the real file using a hash algorithm corresponding to ``<ext>``.
For example, the argument ``DATA{img.png}`` may be satisfied by either a
real ``img.png`` file in the current source directory or a ``img.png.md5``
file containing its MD5 sum.

.. versionadded:: 3.8
  Multiple content links of the same name with different hash algorithms
  are supported (e.g. ``img.png.sha256`` and ``img.png.sha1``) so long as
  they all correspond to the same real file.  This allows objects to be
  fetched from sources indexed by different hash algorithms.
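
A content link can also be written by hand.  The following is a sketch of a
script, run with ``cmake -P``, that is assumed to live in the same directory
as a hypothetical ``img.png``; it relies only on the :command:`file` command:

.. code-block:: cmake

 # Compute the SHA-256 digest of the real file.
 file(SHA256 "${CMAKE_CURRENT_LIST_DIR}/img.png" digest)
 # Store the digest in a content link next to it.
 file(WRITE "${CMAKE_CURRENT_LIST_DIR}/img.png.sha256" "${digest}\n")

The real file must still be made available at one of the configured URL
templates or object stores by means outside the scope of this module.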

Referencing File Series
"""""""""""""""""""""""

The ``DATA{}`` syntax can be told to fetch a file series using the form
``DATA{<name>,:}``, where the ``:`` is literal.  If the source tree
contains a group of files or content links named like a series then a
reference to one member adds rules to fetch all of them.  Although all
members of a series are fetched, only the file originally named by the
``DATA{}`` argument is substituted for it.  The default configuration
recognizes file series names ending with ``#.ext``, ``_#.ext``, ``.#.ext``,
or ``-#.ext`` where ``#`` is a sequence of decimal digits and ``.ext`` is
any single extension.  Configure it with a regex that parses ``<number>``
and ``<suffix>`` parts from the end of ``<name>``::

 ExternalData_SERIES_PARSE = regex of the form (<number>)(<suffix>)$

For more complicated cases set::

 ExternalData_SERIES_PARSE = regex with at least two () groups
 ExternalData_SERIES_PARSE_PREFIX = <prefix> regex group number, if any
 ExternalData_SERIES_PARSE_NUMBER = <number> regex group number
 ExternalData_SERIES_PARSE_SUFFIX = <suffix> regex group number

Configure series number matching with a regex that matches the
``<number>`` part of series members named ``<prefix><number><suffix>``::

 ExternalData_SERIES_MATCH = regex matching <number> in all series members

Note that the ``<suffix>`` of a series does not include a hash-algorithm
extension.
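
For example, assuming the source tree holds content links for hypothetical
files ``Frames/Frame_000.png`` through ``Frames/Frame_009.png``, a single
reference adds rules to fetch all ten files while passing only the named one
on the command line:

.. code-block:: cmake

 ExternalData_Add_Test(MyData
   NAME SeriesTest
   COMMAND MyExe DATA{Frames/Frame_000.png,:}
   )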

Referencing Associated Files
""""""""""""""""""""""""""""

The ``DATA{}`` syntax can alternatively match files associated with the
named file and contained in the same directory.  Associated files may
be specified by options using the syntax
``DATA{<name>,<opt1>,<opt2>,...}``.  Each option may specify one file by
name or specify a regular expression to match file names using the
syntax ``REGEX:<regex>``.  For example, the arguments::

 DATA{MyData/MyInput.mhd,MyInput.img}                   # File pair
 DATA{MyData/MyFrames00.png,REGEX:MyFrames[0-9]+\\.png} # Series

will pass ``MyInput.mhd`` and ``MyFrames00.png`` on the command line but
ensure that the associated files are present next to them.

Referencing Directories
"""""""""""""""""""""""

The ``DATA{}`` syntax may reference a directory using a trailing slash and
a list of associated files.  The form ``DATA{<name>/,<opt1>,<opt2>,...}``
adds rules to fetch any files in the directory that match one of the
associated file options.  For example, the argument
``DATA{MyDataDir/,REGEX:.*}`` will pass the full path to a ``MyDataDir``
directory on the command line and ensure that the directory contains
files corresponding to every file or content link in the ``MyDataDir``
source directory.

.. versionadded:: 3.3
  In order to match associated files in subdirectories,
  specify a ``RECURSE:`` option, e.g. ``DATA{MyDataDir/,RECURSE:,REGEX:.*}``.
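
For example, a test that reads a whole tree of inputs might be declared as
follows (``MyExe``, ``DirTest``, and the ``--data-dir`` option are
hypothetical):

.. code-block:: cmake

 ExternalData_Add_Test(MyData
   NAME DirTest
   COMMAND MyExe --data-dir DATA{MyDataDir/,RECURSE:,REGEX:.*}
   )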

Hash Algorithms
^^^^^^^^^^^^^^^

The following hash algorithms are supported::

 %(algo)     <ext>     Description
 -------     -----     -----------
 MD5         .md5      Message-Digest Algorithm 5, RFC 1321
 SHA1        .sha1     US Secure Hash Algorithm 1, RFC 3174
 SHA224      .sha224   US Secure Hash Algorithms, RFC 4634
 SHA256      .sha256   US Secure Hash Algorithms, RFC 4634
 SHA384      .sha384   US Secure Hash Algorithms, RFC 4634
 SHA512      .sha512   US Secure Hash Algorithms, RFC 4634
 SHA3_224    .sha3-224 Keccak SHA-3
 SHA3_256    .sha3-256 Keccak SHA-3
 SHA3_384    .sha3-384 Keccak SHA-3
 SHA3_512    .sha3-512 Keccak SHA-3

.. versionadded:: 3.8
  Added the ``SHA3_*`` hash algorithms.

Note that the hashes are used only for unique data identification and
download verification.

.. _`ExternalData Custom Fetch Scripts`:

Custom Fetch Scripts
^^^^^^^^^^^^^^^^^^^^

.. versionadded:: 3.2

When a data file must be fetched from one of the URL templates
specified in the ``ExternalData_URL_TEMPLATES`` variable, it is
normally downloaded using the :command:`file(DOWNLOAD)` command.
One may specify usage of a custom fetch script by using a URL
template of the form ``ExternalDataCustomScript://<key>/<loc>``.
The ``<key>`` must be a C identifier, and the ``<loc>`` must
contain the ``%(algo)`` and ``%(hash)`` placeholders.
A variable corresponding to the key, ``ExternalData_CUSTOM_SCRIPT_<key>``,
must be set to the full path to a ``.cmake`` script file.  The script
will be included to perform the actual fetch, and provided with
the following variables:

.. variable:: ExternalData_CUSTOM_LOCATION

  When a custom fetch script is loaded, this variable is set to the
  location part of the URL, which will contain the substituted hash
  algorithm name and content hash value.

.. variable:: ExternalData_CUSTOM_FILE

  When a custom fetch script is loaded, this variable is set to the
  full path to a file in which the script must store the fetched
  content.  The name of the file is unspecified and should not be
  interpreted in any way.

The custom fetch script is expected to store fetched content in the
file or set a variable:

.. variable:: ExternalData_CUSTOM_ERROR

  When a custom fetch script fails to fetch the requested content,
  it must set this variable to a short one-line message describing
  the reason for failure.
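
The following sketch shows both halves of such a setup.  The key
``MyFetch``, the script path, and the ``/mnt/archive`` mount point are
assumptions made for illustration:

.. code-block:: cmake

 # In CMakeLists.txt, before calling ExternalData_Add_Target():
 set(ExternalData_URL_TEMPLATES
   "ExternalDataCustomScript://MyFetch/objects/%(algo)/%(hash)")
 set(ExternalData_CUSTOM_SCRIPT_MyFetch
   "${CMAKE_CURRENT_SOURCE_DIR}/cmake/MyFetch.cmake")

.. code-block:: cmake

 # In cmake/MyFetch.cmake: copy the object from a mounted archive.
 set(_src "/mnt/archive/${ExternalData_CUSTOM_LOCATION}")
 if(EXISTS "${_src}")
   execute_process(
     COMMAND "${CMAKE_COMMAND}" -E copy "${_src}" "${ExternalData_CUSTOM_FILE}"
     RESULT_VARIABLE _result)
   if(NOT _result EQUAL 0)
     set(ExternalData_CUSTOM_ERROR "copy of ${_src} failed")
   endif()
 else()
   set(ExternalData_CUSTOM_ERROR "no such object: ${_src}")
 endif()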

#]=======================================================================]

function(ExternalData_add_test target)
  # Expand all arguments as a single string to preserve escaped semicolons.
  ExternalData_expand_arguments("${target}" testArgs "${ARGN}")
  add_test(${testArgs})
endfunction()

function(ExternalData_add_target target)
  if(NOT ExternalData_URL_TEMPLATES AND NOT ExternalData_OBJECT_STORES)
    message(FATAL_ERROR
      "Neither ExternalData_URL_TEMPLATES nor ExternalData_OBJECT_STORES is set!")
  endif()
  if(NOT ExternalData_OBJECT_STORES)
    set(ExternalData_OBJECT_STORES ${CMAKE_BINARY_DIR}/ExternalData/Objects)
  endif()
  set(_ExternalData_CONFIG_CODE "")

  cmake_parse_arguments(PARSE_ARGV 1 _ExternalData_add_target
    ""
    "SHOW_PROGRESS"
    "")
  if (_ExternalData_add_target_UNPARSED_ARGUMENTS)
    message(AUTHOR_WARNING
      "Ignoring unrecognized arguments passed to ExternalData_add_target: "
      "`${_ExternalData_add_target_UNPARSED_ARGUMENTS}`")
  endif ()

  # Turn `SHOW_PROGRESS` into a boolean
  if (NOT DEFINED _ExternalData_add_target_SHOW_PROGRESS)
    # The default setting
    if (CMAKE_GENERATOR MATCHES "Ninja")
      set(_ExternalData_add_target_SHOW_PROGRESS OFF)
    else ()
      set(_ExternalData_add_target_SHOW_PROGRESS ON)
    endif ()
  elseif (_ExternalData_add_target_SHOW_PROGRESS)
    set(_ExternalData_add_target_SHOW_PROGRESS ON)
  else ()
    set(_ExternalData_add_target_SHOW_PROGRESS OFF)
  endif ()

  # Store custom script configuration.
  foreach(url_template IN LISTS ExternalData_URL_TEMPLATES)
    if("${url_template}" MATCHES "^ExternalDataCustomScript://([^/]*)/(.*)$")
      set(key "${CMAKE_MATCH_1}")
      if(key MATCHES "^[A-Za-z_][A-Za-z0-9_]*$")
        if(ExternalData_CUSTOM_SCRIPT_${key})
          if(IS_ABSOLUTE "${ExternalData_CUSTOM_SCRIPT_${key}}")
            string(CONCAT _ExternalData_CONFIG_CODE "${_ExternalData_CONFIG_CODE}\n"
              "set(ExternalData_CUSTOM_SCRIPT_${key} \"${ExternalData_CUSTOM_SCRIPT_${key}}\")")
          else()
            message(FATAL_ERROR
              "ExternalData_CUSTOM_SCRIPT_${key} is not set to a full path:\n"
              " ${ExternalData_CUSTOM_SCRIPT_${key}}")
          endif()
        else()
          message(FATAL_ERROR
            "No ExternalData_CUSTOM_SCRIPT_${key} is set for URL template:\n"
            " ${url_template}")
        endif()
      else()
        message(FATAL_ERROR
          "Bad ExternalDataCustomScript key '${key}' in URL template:\n"
          " ${url_template}\n"
          "The key must be a valid C identifier.")
      endif()
    endif()

    # Store custom algorithm name to URL component maps.
    if("${url_template}" MATCHES "%\\(algo:([^)]*)\\)")
      set(key "${CMAKE_MATCH_1}")
      if(key MATCHES "^[A-Za-z_][A-Za-z0-9_]*$")
        string(REPLACE "|" ";" _algos "${_ExternalData_REGEX_ALGO}")
        foreach(algo ${_algos})
          if(DEFINED ExternalData_URL_ALGO_${algo}_${key})
            string(CONCAT _ExternalData_CONFIG_CODE "${_ExternalData_CONFIG_CODE}\n"
              "set(ExternalData_URL_ALGO_${algo}_${key} \"${ExternalData_URL_ALGO_${algo}_${key}}\")")
          endif()
        endforeach()
      else()
        message(FATAL_ERROR
          "Bad %(algo:${key}) in URL template:\n"
          " ${url_template}\n"
          "The transform name must be a valid C identifier.")
      endif()
    endif()
  endforeach()

  # Store configuration for use by build-time script.
  set(config ${CMAKE_CURRENT_BINARY_DIR}/${target}_config.cmake)
  configure_file(${_ExternalData_SELF_DIR}/ExternalData_config.cmake.in ${config} @ONLY)

  set(files "")

  # Set a "_ExternalData_FILE_${file}" variable for each output file to avoid
  # duplicate entries within this target.  Set a directory property of the same
  # name to avoid repeating custom commands with the same output in this directory.
  # Repeating custom commands with the same output across directories or across
  # targets in the same directory may be a race, but this is likely okay because
  # we use atomic replacement of output files.
  #
  # Use local data first to prefer real files over content links.

  # Custom commands to copy or link local data.
  get_property(data_local GLOBAL PROPERTY _ExternalData_${target}_LOCAL)
  foreach(entry IN LISTS data_local)
    string(REPLACE "|" ";" tuple "${entry}")
    list(GET tuple 0 file)
    list(GET tuple 1 name)
    if(NOT DEFINED "_ExternalData_FILE_${file}")
      set("_ExternalData_FILE_${file}" 1)
      get_property(added DIRECTORY PROPERTY "_ExternalData_FILE_${file}")
      if(NOT added)
        set_property(DIRECTORY PROPERTY "_ExternalData_FILE_${file}" 1)
        add_custom_command(
          COMMENT "Generating ${file}"
          OUTPUT "${file}"
          COMMAND ${CMAKE_COMMAND} -Drelative_top=${CMAKE_BINARY_DIR}
                                   -Dfile=${file} -Dname=${name}
                                   -DExternalData_ACTION=local
                                   -DExternalData_SHOW_PROGRESS=${_ExternalData_add_target_SHOW_PROGRESS}
                                   -DExternalData_CONFIG=${config}
                                   -P ${_ExternalData_SELF}
          MAIN_DEPENDENCY "${name}"
          )
      endif()
      list(APPEND files "${file}")
    endif()
  endforeach()

  # Custom commands to fetch remote data.
  get_property(data_fetch GLOBAL PROPERTY _ExternalData_${target}_FETCH)
  foreach(entry IN LISTS data_fetch)
    string(REPLACE "|" ";" tuple "${entry}")
    list(GET tuple 0 file)
    list(GET tuple 1 name)
    list(GET tuple 2 exts)
    string(REPLACE "+" ";" exts_list "${exts}")
    list(GET exts_list 0 first_ext)
    set(stamp "-hash-stamp")
    if(NOT DEFINED "_ExternalData_FILE_${file}")
      set("_ExternalData_FILE_${file}" 1)
      get_property(added DIRECTORY PROPERTY "_ExternalData_FILE_${file}")
      if(NOT added)
        set_property(DIRECTORY PROPERTY "_ExternalData_FILE_${file}" 1)
        add_custom_command(
          # Users care about the data file, so hide the hash/timestamp file.
          COMMENT "Generating ${file}"
          # The hash/timestamp file is the output from the build perspective.
          # List the real file as a second output in case it is a broken link.
          # The files must be listed in this order so CMake can hide from the
          # make tool that a symlink target may not be newer than the input.
          OUTPUT "${file}${stamp}" "${file}"
          # Run the data fetch/update script.
          COMMAND ${CMAKE_COMMAND} -Drelative_top=${CMAKE_BINARY_DIR}
                                   -Dfile=${file} -Dname=${name} -Dexts=${exts}
                                   -DExternalData_ACTION=fetch
                                   -DExternalData_SHOW_PROGRESS=${_ExternalData_add_target_SHOW_PROGRESS}
                                   -DExternalData_CONFIG=${config}
                                   -P ${_ExternalData_SELF}
          # Update whenever the object hash changes.
          MAIN_DEPENDENCY "${name}${first_ext}"
          )
      endif()
      list(APPEND files "${file}${stamp}")
    endif()
  endforeach()

  # Custom target to drive all update commands.
  add_custom_target(${target} ALL DEPENDS ${files})
endfunction()

function(ExternalData_expand_arguments target outArgsVar)
  # Replace DATA{} references with real arguments.
  set(data_regex "DATA{([^;{}\r\n]*)}")
  set(other_regex "([^D]|D[^A]|DA[^T]|DAT[^A]|DATA[^{])+|.")
  set(outArgs "")
  # This list expansion un-escapes semicolons in list element values so we
  # must re-escape them below anywhere a new list expansion will occur.
  foreach(arg IN LISTS ARGN)
    if("x${arg}" MATCHES "${data_regex}")
      # Re-escape in-value semicolons before expansion in foreach below.
      string(REPLACE ";" "\\;" tmp "${arg}")
      # Split argument into DATA{}-pieces and other pieces.
      string(REGEX MATCHALL "${data_regex}|${other_regex}" pieces "${tmp}")
      # Compose output argument with DATA{}-pieces replaced.
      set(outArg "")
      foreach(piece IN LISTS pieces)
        if("x${piece}" MATCHES "^x${data_regex}$")
          # Replace this DATA{}-piece with a file path.
          _ExternalData_arg("${target}" "${piece}" "${CMAKE_MATCH_1}" file)
          string(APPEND outArg "${file}")
        else()
          # No replacement needed for this piece.
          string(APPEND outArg "${piece}")
        endif()
      endforeach()
    else()
      # No replacements needed in this argument.
      set(outArg "${arg}")
    endif()
    # Re-escape in-value semicolons in resulting list.
    string(REPLACE ";" "\\;" outArg "${outArg}")
    list(APPEND outArgs "${outArg}")
  endforeach()
  set("${outArgsVar}" "${outArgs}" PARENT_SCOPE)
endfunction()

#-----------------------------------------------------------------------------
# Private helper interface

set(_ExternalData_REGEX_ALGO "MD5|SHA1|SHA224|SHA256|SHA384|SHA512|SHA3_224|SHA3_256|SHA3_384|SHA3_512")
set(_ExternalData_REGEX_EXT "md5|sha1|sha224|sha256|sha384|sha512|sha3-224|sha3-256|sha3-384|sha3-512")
set(_ExternalData_SELF "${CMAKE_CURRENT_LIST_FILE}")
get_filename_component(_ExternalData_SELF_DIR "${_ExternalData_SELF}" PATH)

function(_ExternalData_compute_hash var_hash algo file)
  if("${algo}" MATCHES "^(${_ExternalData_REGEX_ALGO})$")
    file("${algo}" "${file}" hash)
    set("${var_hash}" "${hash}" PARENT_SCOPE)
  else()
    message(FATAL_ERROR "Hash algorithm ${algo} unimplemented.")
  endif()
endfunction()

function(_ExternalData_random var)
  string(RANDOM LENGTH 6 random)
  set("${var}" "${random}" PARENT_SCOPE)
endfunction()

function(_ExternalData_exact_regex regex_var string)
  string(REGEX REPLACE "([][+.*()^])" "\\\\\\1" regex "${string}")
  set("${regex_var}" "${regex}" PARENT_SCOPE)
endfunction()

function(_ExternalData_atomic_write file content)
  _ExternalData_random(random)
  set(tmp "${file}.tmp${random}")
  file(WRITE "${tmp}" "${content}")
  file(RENAME "${tmp}" "${file}")
endfunction()

function(_ExternalData_link_content name var_ext)
  if("${ExternalData_LINK_CONTENT}" MATCHES "^(${_ExternalData_REGEX_ALGO})$")
    set(algo "${ExternalData_LINK_CONTENT}")
  else()
    message(FATAL_ERROR
      "Unknown hash algorithm specified by ExternalData_LINK_CONTENT:\n"
      "  ${ExternalData_LINK_CONTENT}")
  endif()
  _ExternalData_compute_hash(hash "${algo}" "${name}")
  get_filename_component(dir "${name}" PATH)
  set(staged "${dir}/.ExternalData_${algo}_${hash}")
  string(TOLOWER ".${algo}" ext)
  _ExternalData_atomic_write("${name}${ext}" "${hash}\n")
  file(RENAME "${name}" "${staged}")
  set("${var_ext}" "${ext}" PARENT_SCOPE)

  file(RELATIVE_PATH relname "${ExternalData_SOURCE_ROOT}" "${name}${ext}")
  message(STATUS "Linked ${relname} to ExternalData ${algo}/${hash}")
endfunction()

function(_ExternalData_arg target arg options var_file)
  # Separate data path from the options.
  string(REPLACE "," ";" options "${options}")
  list(GET options 0 data)
  list(REMOVE_AT options 0)

  # Interpret trailing slashes as directories.
  set(data_is_directory 0)
  if("x${data}" MATCHES "^x(.*)([/\\])$")
    set(data_is_directory 1)
    set(data "${CMAKE_MATCH_1}")
  endif()

  # Convert to full path.
  if(IS_ABSOLUTE "${data}")
    set(absdata "${data}")
  else()
    set(absdata "${CMAKE_CURRENT_SOURCE_DIR}/${data}")
  endif()
  get_filename_component(absdata "${absdata}" ABSOLUTE)

  # Convert to relative path under the source tree.
  if(NOT ExternalData_SOURCE_ROOT)
    set(ExternalData_SOURCE_ROOT "${CMAKE_SOURCE_DIR}")
  endif()
  set(top_src "${ExternalData_SOURCE_ROOT}")
  file(RELATIVE_PATH reldata "${top_src}" "${absdata}")
  if(IS_ABSOLUTE "${reldata}" OR "${reldata}" MATCHES "^\\.\\./")
    message(FATAL_ERROR "Data file referenced by argument\n"
      "  ${arg}\n"
      "does not lie under the top-level source directory\n"
      "  ${top_src}\n")
  endif()
  if(data_is_directory AND NOT IS_DIRECTORY "${top_src}/${reldata}")
    message(FATAL_ERROR "Data directory referenced by argument\n"
      "  ${arg}\n"
      "corresponds to source tree path\n"
      "  ${reldata}\n"
      "that does not exist as a directory!")
  endif()
  if(NOT ExternalData_BINARY_ROOT)
    set(ExternalData_BINARY_ROOT "${CMAKE_BINARY_DIR}")
  endif()
  set(top_bin "${ExternalData_BINARY_ROOT}")

  # Handle in-source builds gracefully.
  if("${top_src}" STREQUAL "${top_bin}")
    if(ExternalData_LINK_CONTENT)
      message(WARNING "ExternalData_LINK_CONTENT cannot be used in-source")
      set(ExternalData_LINK_CONTENT 0)
    endif()
    set(top_same 1)
  endif()

  set(external "") # Entries external to the source tree.
  set(internal "") # Entries internal to the source tree.
  set(have_original ${data_is_directory})
  set(have_original_as_dir 0)

  # Process options.
  set(series_option "")
  set(recurse_option "")
  set(associated_files "")
  set(associated_regex "")
  foreach(opt ${options})
    # Regular expression to match associated files.
    if("x${opt}" MATCHES "^xREGEX:([^:/]+)$")
      list(APPEND associated_regex "${CMAKE_MATCH_1}")
    elseif(opt STREQUAL ":")
      # Activate series matching.
      set(series_option "${opt}")
    elseif(opt STREQUAL "RECURSE:")
      # Activate recursive matching in directories.
      set(recurse_option "${opt}")
    elseif("x${opt}" MATCHES "^[^][:/*?]+$")
      # Specific associated file.
      list(APPEND associated_files "${opt}")
    else()
      message(FATAL_ERROR "Unknown option \"${opt}\" in argument\n"
        "  ${arg}\n")
    endif()
  endforeach()

  if(series_option)
    if(data_is_directory)
      message(FATAL_ERROR "Series option \"${series_option}\" not allowed with directories.")
    endif()
    if(associated_files OR associated_regex)
      message(FATAL_ERROR "Series option \"${series_option}\" not allowed with associated files.")
    endif()
    if(recurse_option)
      message(FATAL_ERROR "Recurse option \"${recurse_option}\" allowed only with directories.")
    endif()
    # Load a whole file series.
    _ExternalData_arg_series()
  elseif(data_is_directory)
    if(associated_files OR associated_regex)
      # Load listed/matching associated files in the directory.
      _ExternalData_arg_associated()
    else()
      message(FATAL_ERROR "Data directory referenced by argument\n"
        "  ${arg}\n"
        "must list associated files.")
    endif()
  else()
    if(recurse_option)
      message(FATAL_ERROR "Recurse option \"${recurse_option}\" allowed only with directories.")
    endif()
    # Load the named data file.
    _ExternalData_arg_single()
    if(associated_files OR associated_regex)
      # Load listed/matching associated files.
      _ExternalData_arg_associated()
    endif()
  endif()

  if(NOT have_original)
    if(have_original_as_dir)
      set(msg_kind FATAL_ERROR)
      set(msg "that is a directory instead of a file!")
    else()
      set(msg_kind AUTHOR_WARNING)
      set(msg "that does not exist as a file (with or without an extension)!")
    endif()
    message(${msg_kind} "Data file referenced by argument\n"
      "  ${arg}\n"
      "corresponds to source tree path\n"
      "  ${reldata}\n"
      "${msg}")
  endif()

  if(external)
    # Make the series available in the build tree.
    set_property(GLOBAL APPEND PROPERTY
      _ExternalData_${target}_FETCH "${external}")
    set_property(GLOBAL APPEND PROPERTY
      _ExternalData_${target}_LOCAL "${internal}")
    set("${var_file}" "${top_bin}/${reldata}" PARENT_SCOPE)
  else()
    # The whole series is in the source tree.
    set("${var_file}" "${top_src}/${reldata}" PARENT_SCOPE)
  endif()
endfunction()

macro(_ExternalData_arg_associated)
  # Associated files lie in the same directory.
  if(data_is_directory)
    set(reldir "${reldata}")
  else()
    get_filename_component(reldir "${reldata}" PATH)
  endif()
  if(reldir)
    string(APPEND reldir "/")
  endif()
  _ExternalData_exact_regex(reldir_regex "${reldir}")
  if(recurse_option)
    set(glob GLOB_RECURSE)
    string(APPEND reldir_regex "(.+/)?")
  else()
    set(glob GLOB)
  endif()

  # Find files named explicitly.
  foreach(file ${associated_files})
    _ExternalData_exact_regex(file_regex "${file}")
    _ExternalData_arg_find_files(${glob} "${reldir}${file}"
      "${reldir_regex}${file_regex}")
  endforeach()

  # Find files matching the given regular expressions.
  set(all "")
  set(sep "")
  foreach(regex ${associated_regex})
    string(APPEND all "${sep}${reldir_regex}${regex}")
    set(sep "|")
  endforeach()
  _ExternalData_arg_find_files(${glob} "${reldir}" "${all}")
endmacro()

macro(_ExternalData_arg_single)
  # Match only the named data by itself.
  _ExternalData_exact_regex(data_regex "${reldata}")
  _ExternalData_arg_find_files(GLOB "${reldata}" "${data_regex}")
endmacro()

macro(_ExternalData_arg_series)
  # Configure series parsing and matching.
  set(series_parse_prefix "")
  set(series_parse_number "\\1")
  set(series_parse_suffix "\\2")
  if(ExternalData_SERIES_PARSE)
    if(ExternalData_SERIES_PARSE_NUMBER AND ExternalData_SERIES_PARSE_SUFFIX)
      if(ExternalData_SERIES_PARSE_PREFIX)
        set(series_parse_prefix "\\${ExternalData_SERIES_PARSE_PREFIX}")
      endif()
      set(series_parse_number "\\${ExternalData_SERIES_PARSE_NUMBER}")
      set(series_parse_suffix "\\${ExternalData_SERIES_PARSE_SUFFIX}")
    elseif(NOT "x${ExternalData_SERIES_PARSE}" MATCHES "^x\\([^()]*\\)\\([^()]*\\)\\$$")
      message(FATAL_ERROR
        "ExternalData_SERIES_PARSE is set to\n"
        "  ${ExternalData_SERIES_PARSE}\n"
        "which is not of the form\n"
        "  (<number>)(<suffix>)$\n"
        "Fix the regular expression or set variables\n"
        "  ExternalData_SERIES_PARSE_PREFIX = <prefix> regex group number, if any\n"
        "  ExternalData_SERIES_PARSE_NUMBER = <number> regex group number\n"
        "  ExternalData_SERIES_PARSE_SUFFIX = <suffix> regex group number\n"
        )
    endif()
    set(series_parse "${ExternalData_SERIES_PARSE}")
  else()
    set(series_parse "([0-9]*)(\\.[^./]*)$")
  endif()
  if(ExternalData_SERIES_MATCH)
    set(series_match "${ExternalData_SERIES_MATCH}")
  else()
    set(series_match "[_.-]?[0-9]*")
  endif()

  # Parse the base, number, and extension components of the series.
  string(REGEX REPLACE "${series_parse}" "${series_parse_prefix};${series_parse_number};${series_parse_suffix}" tuple "${reldata}")
  list(LENGTH tuple len)
  if(NOT "${len}" EQUAL 3)
    message(FATAL_ERROR "Data file referenced by argument\n"
      "  ${arg}\n"
      "corresponds to path\n"
      "  ${reldata}\n"
      "that does not match regular expression\n"
      "  ${series_parse}")
  endif()
  list(GET tuple 0 relbase)
  list(GET tuple 2 ext)

  # Glob files that might match the series.
  # Then match base, number, and extension.
  _ExternalData_exact_regex(series_base "${relbase}")
  _ExternalData_exact_regex(series_ext "${ext}")
  _ExternalData_arg_find_files(GLOB "${relbase}*${ext}"
    "${series_base}${series_match}${series_ext}")
endmacro()

function(_ExternalData_arg_find_files glob pattern regex)
  cmake_policy(PUSH)
  cmake_policy(SET CMP0009 NEW)
  file(${glob} globbed RELATIVE "${top_src}" "${top_src}/${pattern}*")
  cmake_policy(POP)
  set(externals_count -1)
  foreach(entry IN LISTS globbed)
    if("x${entry}" MATCHES "^x(.*)(\\.(${_ExternalData_REGEX_EXT}))$")
      set(relname "${CMAKE_MATCH_1}")
      set(alg "${CMAKE_MATCH_2}")
    else()
      set(relname "${entry}")
      set(alg "")
    endif()
    if("x${relname}" MATCHES "^x${regex}$" # matches
        AND NOT "x${relname}" MATCHES "(^x|/)\\.ExternalData_" # not staged obj
        )
      if(IS_DIRECTORY "${top_src}/${entry}")
        if("${relname}" STREQUAL "${reldata}")
          set(have_original_as_dir 1)
        endif()
      else()
        set(name "${top_src}/${relname}")
        set(file "${top_bin}/${relname}")
        if(alg)
          if(NOT "${external_${externals_count}_file_name}" STREQUAL "${file}|${name}")
            math(EXPR externals_count "${externals_count} + 1")
            set(external_${externals_count}_file_name "${file}|${name}")
          endif()
          list(APPEND external_${externals_count}_algs "${alg}")
        elseif(ExternalData_LINK_CONTENT)
          _ExternalData_link_content("${name}" alg)
          list(APPEND external "${file}|${name}|${alg}")
        elseif(NOT top_same)
          list(APPEND internal "${file}|${name}")
        endif()
        if("${relname}" STREQUAL "${reldata}")
          set(have_original 1)
        endif()
      endif()
    endif()
  endforeach()
  if(${externals_count} GREATER -1)
    foreach(ii RANGE ${externals_count})
      string(REPLACE ";" "+" algs_delim "${external_${ii}_algs}")
      list(APPEND external "${external_${ii}_file_name}|${algs_delim}")
      unset(external_${ii}_algs)
      unset(external_${ii}_file_name)
    endforeach()
  endif()
  set(external "${external}" PARENT_SCOPE)
  set(internal "${internal}" PARENT_SCOPE)
  set(have_original "${have_original}" PARENT_SCOPE)
  set(have_original_as_dir "${have_original_as_dir}" PARENT_SCOPE)
endfunction()

#-----------------------------------------------------------------------------
# Private script mode interface

if(CMAKE_GENERATOR OR NOT ExternalData_ACTION)
  return()
endif()

if(ExternalData_CONFIG)
  include(${ExternalData_CONFIG})
endif()
if(NOT ExternalData_URL_TEMPLATES AND NOT ExternalData_OBJECT_STORES)
  message(FATAL_ERROR
    "Neither ExternalData_URL_TEMPLATES nor ExternalData_OBJECT_STORES is set!")
endif()

function(_ExternalData_link_or_copy src dst)
  # Create a temporary file first.
  get_filename_component(dst_dir "${dst}" PATH)
  file(MAKE_DIRECTORY "${dst_dir}")
  _ExternalData_random(random)
  set(tmp "${dst}.tmp${random}")
  if(UNIX AND NOT ExternalData_NO_SYMLINKS)
    # Create a symbolic link.
    set(tgt "${src}")
    if(relative_top)
      # Use relative path if files are close enough.
      file(RELATIVE_PATH relsrc "${relative_top}" "${src}")
      file(RELATIVE_PATH reldst "${relative_top}" "${dst}")
      if(NOT IS_ABSOLUTE "${relsrc}" AND NOT "${relsrc}" MATCHES "^\\.\\./" AND
          NOT IS_ABSOLUTE "${reldst}" AND NOT "${reldst}" MATCHES "^\\.\\./")
        file(RELATIVE_PATH tgt "${dst_dir}" "${src}")
      endif()
    endif()
    # Create link (falling back to copying if there's a problem).
    file(CREATE_LINK "${tgt}" "${tmp}" RESULT result COPY_ON_ERROR SYMBOLIC)
  else()
    # Create a copy.
    file(COPY_FILE "${src}" "${tmp}" RESULT result)
  endif()
  if(result)
    file(REMOVE "${tmp}")
    message(FATAL_ERROR "Failed to create:\n  \"${tmp}\"\nfrom:\n  \"${src}\"\nwith error:\n  ${result}")
  endif()

  # Atomically create/replace the real destination.
  file(RENAME "${tmp}" "${dst}")
endfunction()

function(_ExternalData_download_file url file err_var msg_var)
  set(retry 3)
  while(retry)
    math(EXPR retry "${retry} - 1")
    if(ExternalData_TIMEOUT_INACTIVITY)
      set(inactivity_timeout INACTIVITY_TIMEOUT ${ExternalData_TIMEOUT_INACTIVITY})
    elseif(NOT "${ExternalData_TIMEOUT_INACTIVITY}" EQUAL 0)
      set(inactivity_timeout INACTIVITY_TIMEOUT 60)
    else()
      set(inactivity_timeout "")
    endif()
    if(ExternalData_TIMEOUT_ABSOLUTE)
      set(absolute_timeout TIMEOUT ${ExternalData_TIMEOUT_ABSOLUTE})
    elseif(NOT "${ExternalData_TIMEOUT_ABSOLUTE}" EQUAL 0)
      set(absolute_timeout TIMEOUT 300)
    else()
      set(absolute_timeout "")
    endif()
    set(show_progress_args)
    if (ExternalData_SHOW_PROGRESS)
      list(APPEND show_progress_args SHOW_PROGRESS)
    endif ()
    file(DOWNLOAD "${url}" "${file}" STATUS status LOG log ${inactivity_timeout} ${absolute_timeout} ${show_progress_args})
    list(GET status 0 err)
    list(GET status 1 msg)
    if(err)
      if("${msg}" MATCHES "HTTP response code said error" AND
          "${log}" MATCHES "error: 503")
        set(msg "temporarily unavailable")
      endif()
    elseif("${log}" MATCHES "\nHTTP[^\n]* 503")
      set(err TRUE)
      set(msg "temporarily unavailable")
    endif()
    if(NOT err OR NOT "${msg}" MATCHES "partial|timeout|temporarily")
      break()
    elseif(retry)
      message(STATUS "[download terminated: ${msg}, retries left: ${retry}]")
    endif()
  endwhile()
  set("${err_var}" "${err}" PARENT_SCOPE)
  set("${msg_var}" "${msg}" PARENT_SCOPE)
endfunction()

function(_ExternalData_custom_fetch key loc file err_var msg_var)
  if(NOT ExternalData_CUSTOM_SCRIPT_${key})
    set(err 1)
    set(msg "No ExternalData_CUSTOM_SCRIPT_${key} set!")
  elseif(NOT EXISTS "${ExternalData_CUSTOM_SCRIPT_${key}}")
    set(err 1)
    set(msg "No '${ExternalData_CUSTOM_SCRIPT_${key}}' exists!")
  else()
    set(ExternalData_CUSTOM_LOCATION "${loc}")
    set(ExternalData_CUSTOM_FILE "${file}")
    unset(ExternalData_CUSTOM_ERROR)
    include("${ExternalData_CUSTOM_SCRIPT_${key}}")
    if(DEFINED ExternalData_CUSTOM_ERROR)
      set(err 1)
      set(msg "${ExternalData_CUSTOM_ERROR}")
    else()
      set(err 0)
      set(msg "no error")
    endif()
  endif()
  set("${err_var}" "${err}" PARENT_SCOPE)
  set("${msg_var}" "${msg}" PARENT_SCOPE)
endfunction()

function(_ExternalData_get_from_object_store hash algo var_obj var_success)
  # Search all object stores for an existing object.
  foreach(dir ${ExternalData_OBJECT_STORES})
    set(obj "${dir}/${algo}/${hash}")
    if(EXISTS "${obj}")
      message(STATUS "Found object: \"${obj}\"")
      set("${var_obj}" "${obj}" PARENT_SCOPE)
      set("${var_success}" 1 PARENT_SCOPE)
      return()
    endif()
  endforeach()
endfunction()

function(_ExternalData_download_object name hash algo var_obj var_success var_errorMsg)
  # Search all object stores for an existing object.
  set(success 1)
  foreach(dir ${ExternalData_OBJECT_STORES})
    set(obj "${dir}/${algo}/${hash}")
    if(EXISTS "${obj}")
      message(STATUS "Found object: \"${obj}\"")
      set("${var_obj}" "${obj}" PARENT_SCOPE)
      set("${var_success}" "${success}" PARENT_SCOPE)
      return()
    endif()
  endforeach()

  # Download object to the first store.
  list(GET ExternalData_OBJECT_STORES 0 store)
  set(obj "${store}/${algo}/${hash}")

  _ExternalData_random(random)
  set(tmp "${obj}.tmp${random}")
  set(found 0)
  set(tried "")
  foreach(url_template IN LISTS ExternalData_URL_TEMPLATES)
    string(REPLACE "%(hash)" "${hash}" url_tmp "${url_template}")
    string(REPLACE "%(algo)" "${algo}" url "${url_tmp}")
    if(url MATCHES "^(.*)%\\(algo:([A-Za-z_][A-Za-z0-9_]*)\\)(.*)$")
      set(lhs "${CMAKE_MATCH_1}")
      set(key "${CMAKE_MATCH_2}")
      set(rhs "${CMAKE_MATCH_3}")
      if(DEFINED ExternalData_URL_ALGO_${algo}_${key})
        set(url "${lhs}${ExternalData_URL_ALGO_${algo}_${key}}${rhs}")
      else()
        set(url "${lhs}${algo}${rhs}")
      endif()
    endif()
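    # Illustrative sketch, not part of the upstream logic.  Assuming the
    # hypothetical settings
    #   ExternalData_URL_TEMPLATES    = "https://example.invalid/%(algo:dir)/%(hash)"
    #   ExternalData_URL_ALGO_MD5_dir = "md5-objects"
    # an MD5 content link would resolve to
    #   https://example.invalid/md5-objects/<hash>
    # while an algorithm with no ExternalData_URL_ALGO_<algo>_dir mapping
    # falls back to its own name, e.g.
    #   https://example.invalid/SHA256/<hash>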
    string(REGEX REPLACE "((https?|ftp)://)([^@]+@)?(.*)" "\\1\\4" secured_url "${url}")
    message(STATUS "Fetching \"${secured_url}\"")
    if(url MATCHES "^ExternalDataCustomScript://([A-Za-z_][A-Za-z0-9_]*)/(.*)$")
      _ExternalData_custom_fetch("${CMAKE_MATCH_1}" "${CMAKE_MATCH_2}" "${tmp}" err errMsg)
    else()
      _ExternalData_download_file("${url}" "${tmp}" err errMsg)
    endif()
    string(APPEND tried "\n  ${url}")
    if(err)
      string(APPEND tried " (${errMsg})")
    else()
      # Verify downloaded object.
      _ExternalData_compute_hash(dl_hash "${algo}" "${tmp}")
      if("${dl_hash}" STREQUAL "${hash}")
        set(found 1)
        break()
      else()
        string(APPEND tried " (wrong hash ${algo}=${dl_hash})")
        if("$ENV{ExternalData_DEBUG_DOWNLOAD}" MATCHES ".")
          file(RENAME "${tmp}" "${store}/${algo}/${dl_hash}")
        endif()
      endif()
    endif()
    file(REMOVE "${tmp}")
  endforeach()

  get_filename_component(dir "${name}" PATH)
  set(staged "${dir}/.ExternalData_${algo}_${hash}")

  set(success 1)
  if(found)
    # Atomically create the object.  If we lose a race with another process,
    # do not replace it.  Content-addressing ensures it has what we expect.
    file(RENAME "${tmp}" "${obj}" NO_REPLACE RESULT result)
    if (result STREQUAL "NO_REPLACE")
      file(REMOVE "${tmp}")
    elseif (result)
      message(FATAL_ERROR "Failed to rename:\n  \"${tmp}\"\nto:\n  \"${obj}\"\nwith error:\n  ${result}")
    endif()
    message(STATUS "Downloaded object: \"${obj}\"")
  elseif(EXISTS "${staged}")
    set(obj "${staged}")
    message(STATUS "Staged object: \"${obj}\"")
  else()
    if(NOT tried)
      set(tried "\n  (No ExternalData_URL_TEMPLATES given)")
    endif()
    set(success 0)
    set("${var_errorMsg}" "Object ${algo}=${hash} not found at:${tried}" PARENT_SCOPE)
  endif()

  set("${var_obj}" "${obj}" PARENT_SCOPE)
  set("${var_success}" "${success}" PARENT_SCOPE)
endfunction()

if("${ExternalData_ACTION}" STREQUAL "fetch")
  foreach(v ExternalData_OBJECT_STORES file name exts)
    if(NOT DEFINED "${v}")
      message(FATAL_ERROR "No \"-D${v}=\" value provided!")
    endif()
  endforeach()

  string(REPLACE "+" ";" exts_list "${exts}")
  set(succeeded 0)
  set(errorMsg "")
  set(hash_list )
  set(algo_list )
  set(hash )
  set(algo )
  foreach(ext ${exts_list})
    file(READ "${name}${ext}" hash)
    string(STRIP "${hash}" hash)

    if("${ext}" MATCHES "^\\.(${_ExternalData_REGEX_EXT})$")
      string(TOUPPER "${CMAKE_MATCH_1}" algo)
      string(REPLACE "-" "_" algo "${algo}")
    else()
      message(FATAL_ERROR "Unknown hash algorithm extension \"${ext}\"")
    endif()

    list(APPEND hash_list ${hash})
    list(APPEND algo_list ${algo})
  endforeach()

  list(LENGTH exts_list num_extensions)
  math(EXPR exts_range "${num_extensions} - 1")
  foreach(ii RANGE 0 ${exts_range})
    list(GET hash_list ${ii} hash)
    list(GET algo_list ${ii} algo)
    _ExternalData_get_from_object_store("${hash}" "${algo}" obj succeeded)
    if(succeeded)
      break()
    endif()
  endforeach()
  if(NOT succeeded)
    foreach(ii RANGE 0 ${exts_range})
      list(GET hash_list ${ii} hash)
      list(GET algo_list ${ii} algo)
      _ExternalData_download_object("${name}" "${hash}" "${algo}"
        obj succeeded algoErrorMsg)
      string(APPEND errorMsg "\n${algoErrorMsg}")
      if(succeeded)
        break()
      endif()
    endforeach()
  endif()
  if(NOT succeeded)
    message(FATAL_ERROR "${errorMsg}")
  endif()
  # Check if file already corresponds to the object.
  set(stamp "-hash-stamp")
  set(file_up_to_date 0)
  if(EXISTS "${file}" AND EXISTS "${file}${stamp}")
    file(READ "${file}${stamp}" f_hash)
    string(STRIP "${f_hash}" f_hash)
    if("${f_hash}" STREQUAL "${hash}")
      set(file_up_to_date 1)
    endif()
  endif()

  if(file_up_to_date)
    # Touch the file to convince the build system it is up to date.
    file(TOUCH "${file}")
  else()
    _ExternalData_link_or_copy("${obj}" "${file}")
  endif()

  # Atomically update the hash/timestamp file to record the object referenced.
  _ExternalData_atomic_write("${file}${stamp}" "${hash}\n")
elseif("${ExternalData_ACTION}" STREQUAL "local")
  foreach(v file name)
    if(NOT DEFINED "${v}")
      message(FATAL_ERROR "No \"-D${v}=\" value provided!")
    endif()
  endforeach()
  _ExternalData_link_or_copy("${name}" "${file}")
else()
  message(FATAL_ERROR "Unknown ExternalData_ACTION=[${ExternalData_ACTION}]")
endif()