VTK-m merge request !134

[WIP] OpenMP Backend

Status: Closed. Will Usher requested to merge Twinklebear/vtk-m:backend-openmp into master on Aug 18, 2015.

This is some initial work on putting together an OpenMP backend for VTK-m.

Specialized Algorithms:

  • Schedule: Just a #pragma omp parallel for around a for loop calling the kernel.

  • Synchronize: Does nothing. The backend follows a fork/join model for each algorithm, so no tasks are left running in the background after a method returns.

  • ScanInclusive: This is a port of a hand-vectorized scan optimized for MIC by Jeff Inman at LANL. It performs well but is ~35x slower than Intel's TBB scan on MIC, so there's still some work left to do here to catch up to TBB. I took a quick glance at their code but would like to find a paper or a better description of what they've implemented, since it looks a bit complicated. The plots below show the speedup of the scan vs. the serial backend on CPU and MIC when working on 4194304 vtkm::Float32 values.

[Plots: cpu_scan_speedup, mic_scan_speedup]

  • Sort: This is a parallel quicksort. TBB also uses a parallel quicksort and we get similar performance, but there may be a few bugs to sort out here. If I reduce the grain size so that more than one thread is used in the Sort unit test, it seems to hang after finishing the sort. I'm not sure why at the moment.
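
For readers unfamiliar with the pattern, the Schedule and scan items above can be sketched roughly as follows. This is a minimal illustration and not the code in this branch: the names ScheduleSketch and ScanInclusiveSketch and the numBlocks parameter are hypothetical, and the scan shown is a plain blocked two-pass scan, not the hand-vectorized MIC scan described above. Both functions fork at each parallel loop and join before returning, which is why a Synchronize would have nothing to wait for.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for Schedule: call kernel(i) once per index.
// The parallel region ends with the loop, so all threads have joined
// by the time this function returns.
template <typename Kernel>
void ScheduleSketch(Kernel kernel, long numInstances)
{
#pragma omp parallel for
  for (long i = 0; i < numInstances; ++i)
  {
    kernel(i);
  }
}

// Blocked two-pass inclusive scan (sum) over `values`, in place.
// `numBlocks` is an assumed tuning parameter, typically the thread count.
template <typename T>
void ScanInclusiveSketch(std::vector<T>& values, long numBlocks)
{
  const long n = static_cast<long>(values.size());
  if (n == 0 || numBlocks <= 0)
  {
    return;
  }
  numBlocks = std::min(numBlocks, n);
  const long blockSize = (n + numBlocks - 1) / numBlocks;
  std::vector<T> blockSums(numBlocks, T());

  // Pass 1: scan each block independently and record its total.
#pragma omp parallel for
  for (long b = 0; b < numBlocks; ++b)
  {
    const long begin = b * blockSize;
    const long end = std::min(begin + blockSize, n);
    T running = T();
    for (long i = begin; i < end; ++i)
    {
      running += values[i];
      values[i] = running;
    }
    blockSums[b] = running;
  }

  // Serial exclusive scan of the block totals: blockSums[b] becomes the
  // sum of everything that precedes block b.
  T offset = T();
  for (long b = 0; b < numBlocks; ++b)
  {
    const T total = blockSums[b];
    blockSums[b] = offset;
    offset += total;
  }

  // Pass 2: add each block's offset to its elements (block 0 needs none).
#pragma omp parallel for
  for (long b = 1; b < numBlocks; ++b)
  {
    const long begin = b * blockSize;
    const long end = std::min(begin + blockSize, n);
    for (long i = begin; i < end; ++i)
    {
      values[i] += blockSums[b];
    }
  }
}
```

Compiled without OpenMP enabled, the pragmas are ignored and both functions run serially with the same results, which makes the sketch easy to check against a serial reference.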

Let me know what comments you have; there's still a good bit of work left on this backend.

Source branch: backend-openmp