Use std::min/max over fmin/fmax
We had a report that vtkm::Min/Max was significantly slower than other products. This was traced back to the fact that these functions were not being fully inlined: they called fmin or fmax, which resulted in an actual C library call. It turns out that using the templated std::min and std::max functions is faster.
With this change, the VTK-m min/max functions use the std versions in almost all circumstances. The one exception (so far) is CUDA devices, where fmin and fmax are still used because the std functions are not declared to run on the device and the nvcc compiler treats fmin and fmax specially.
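The dispatch described above might look roughly like the following. This is a minimal sketch, not the actual VTK-m implementation: the `sketch` namespace and the `SKETCH_EXEC_CONT` macro are hypothetical names invented for illustration, and only the float and double overloads are shown. It assumes `__CUDA_ARCH__` is defined only while nvcc compiles device code, so host builds take the std path.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical macro for this sketch: mark functions as callable from both
// host and device when compiled by nvcc, and as plain functions otherwise.
#ifdef __CUDACC__
#define SKETCH_EXEC_CONT __host__ __device__
#else
#define SKETCH_EXEC_CONT
#endif

namespace sketch
{

SKETCH_EXEC_CONT inline float Min(float x, float y)
{
#ifdef __CUDA_ARCH__
  // Device code: std::min is not declared for the device, so use fminf.
  return fminf(x, y);
#else
  // Host code: the templated std::min inlines to a simple comparison.
  // The parentheses guard against a min() macro being defined.
  return (std::min)(x, y);
#endif
}

SKETCH_EXEC_CONT inline double Min(double x, double y)
{
#ifdef __CUDA_ARCH__
  return fmin(x, y);
#else
  return (std::min)(x, y);
#endif
}

SKETCH_EXEC_CONT inline float Max(float x, float y)
{
#ifdef __CUDA_ARCH__
  return fmaxf(x, y);
#else
  return (std::max)(x, y);
#endif
}

SKETCH_EXEC_CONT inline double Max(double x, double y)
{
#ifdef __CUDA_ARCH__
  return fmax(x, y);
#else
  return (std::max)(x, y);
#endif
}

} // namespace sketch
```

Callers use the same name on either path, for example `sketch::Min(a, b)`, and the preprocessor selects the implementation for the current compilation pass.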