Fix Parallel Scan implementation for TBB Device
The previous implementation assumed the identity value of the scan operator to be zero, which is wrong for operators such as multiplication, whose identity is one. Exclusive Scan with a custom operator now requires an explicit initial value (TBB Device only, for now).
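
For reference, a minimal serial sketch of why the initial value matters. This is not the TBB device code; the name ExclusiveScan and its signature are hypothetical and chosen only for illustration. It seeds the running value with the caller-supplied initial value instead of a hard-coded zero.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical serial reference for an exclusive scan with a custom operator.
// Assumption: `initial` is the identity of `op` (0 for plus, 1 for multiplies).
template <typename T, typename BinaryOp>
std::vector<T> ExclusiveScan(const std::vector<T>& in, BinaryOp op, T initial)
{
  std::vector<T> out(in.size());
  T running = initial; // seed with the caller's value, not an assumed zero
  for (std::size_t i = 0; i < in.size(); ++i)
  {
    out[i] = running;             // output excludes the current element
    running = op(running, in[i]); // fold the current element in afterwards
  }
  return out;
}
```

With std::multiplies<int> and an initial value of 1, the input {2, 3, 4} scans to {1, 2, 6}. Seeding the same call with 0, as the old code effectively did, yields {0, 0, 0}.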