This simplifies the host-side logic for determining grid and block sizes, and keeps the library side smaller by moving some logic into compiled-in functions.
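As a rough illustration of the idea, here is a minimal sketch of a compiled-in helper that computes a launch configuration. It assumes a 1D launch over `n` elements. The names `ceil_div`, `LaunchDims`, and `launch_dims_for`, and the default block size of 256, are hypothetical and not taken from this change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the source): ceiling division written to
// avoid the overflow that (n + d - 1) / d can hit for very large n.
constexpr std::size_t ceil_div(std::size_t n, std::size_t d) {
    return n / d + (n % d != 0 ? 1 : 0);
}

// Hypothetical 1D launch configuration: number of blocks and threads per block.
struct LaunchDims {
    std::size_t grid;
    std::size_t block;
};

// Defined inline in a header, so the sizing logic is compiled into the
// caller instead of living inside the library binary.
// The default block size of 256 is an assumption for illustration only.
inline LaunchDims launch_dims_for(std::size_t n, std::size_t block = 256) {
    assert(block > 0);
    return LaunchDims{ceil_div(n, block), block};
}

// Compile-time check: 1000 elements at 256 threads per block need 4 blocks.
static_assert(ceil_div(1000, 256) == 4, "ceil_div rounds up");
```

With a helper like this, the host code only needs to call `launch_dims_for(n)` and pass the result to the launch. Note that `n == 0` yields a grid size of 0, which the caller would need to handle before launching.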