Make ScanInclusiveByKey and ScanExclusiveByKey void functions.
These two algorithms do not return a meaningful value. The generic interface and implementation are both already void. Remove the erroneous return type and return statement from the CUDA backend so that it matches.
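
For illustration only, here is a minimal serial sketch of the void convention described above: the result is delivered through the output array and nothing is returned. The function names, the use of `std::vector`, and the sum operator are assumptions made for this example; this is not the actual device adapter code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, not the real backend: inclusive scan by key.
// The running sum restarts whenever the key changes. Returns void;
// the only result is the contents of `out`.
template <typename Key, typename Value>
void ScanInclusiveByKeySketch(const std::vector<Key>& keys,
                              const std::vector<Value>& values,
                              std::vector<Value>& out)
{
  assert(keys.size() == values.size());
  out.resize(values.size());
  for (std::size_t i = 0; i < values.size(); ++i)
  {
    const bool continuesSegment = (i > 0) && (keys[i] == keys[i - 1]);
    out[i] = continuesSegment ? out[i - 1] + values[i] : values[i];
  }
}

// Hypothetical stand-in: exclusive scan by key. Each segment starts at
// `initial` and accumulates the values that precede the current element.
template <typename Key, typename Value>
void ScanExclusiveByKeySketch(const std::vector<Key>& keys,
                              const std::vector<Value>& values,
                              const Value& initial,
                              std::vector<Value>& out)
{
  assert(keys.size() == values.size());
  out.resize(values.size());
  for (std::size_t i = 0; i < values.size(); ++i)
  {
    const bool continuesSegment = (i > 0) && (keys[i] == keys[i - 1]);
    out[i] = continuesSegment ? out[i - 1] + values[i - 1] : initial;
  }
}
```

Because neither function returns anything, callers cannot accidentally rely on a value that one backend happens to produce and another does not.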
Edited by Li-Ta Lo