
Optimized mask clustering

Merged: Daniel Hugo Campora Perez requested to merge optimized_mask_clustering into master
  • It now finds 0.019493% more clusters (down from 0.07%)
  • The algorithm should be a tad faster
  • Added support for CMAKE_BUILD_TYPE option. Available options:
    • RelWithDebInfo (default)
    • Release
    • Debug
  • Changed the EstimateInputSize logic for adding candidates: it now uses masks.
  • Removed sp_size from the GPU (it was unused).
  • Added constant candidate_ks for finding out active pixel numbers in a four-bit number (EstimateInputSize optimization).
  • Prefix Sum has been optimized, following the strategy of Merrill’s 2–level upsweep/downsweep.
  • When profiling Clustering alone, or the whole application, throughput is now more stable and less sensitive to synchronization details.
  • A Handler class has been created, holding the stream, blocks and threads attributes. Any Handler should inherit from it.
  • Consolidate tracks is on by default now.
  • Found a good configuration for the EstimateInputSize call: 70 kHz on a 1080 Ti.
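
For readers unfamiliar with the two optimizations named above (the four-bit lookup constant and the two-level prefix sum), here is a minimal host-side sketch of the general ideas. It is not the code from this merge request: `candidate_ks_sketch`, `count_active_pixels` and `two_level_exclusive_scan` are hypothetical names, the 16-bit mask width is an assumption, and the scan runs serially on the CPU purely to show the data flow (per-block reduction, scan of block totals, per-block scan seeded with the block offset). On the GPU each block would be handled by a separate thread block.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for candidate_ks: entry i holds the number of set
// bits (active pixels) in the four-bit value i.
constexpr std::array<std::uint8_t, 16> candidate_ks_sketch = {
  0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4};

// Count active pixels in a 16-bit mask (assumed width), one nibble at a time.
inline unsigned count_active_pixels(std::uint16_t mask) {
  unsigned total = 0;
  for (int shift = 0; shift < 16; shift += 4) {
    total += candidate_ks_sketch[(mask >> shift) & 0xF];
  }
  return total;
}

// Two-level exclusive prefix sum, written serially for clarity.
// Overwrites `data` with its exclusive scan and returns the grand total.
inline unsigned two_level_exclusive_scan(std::vector<unsigned>& data,
                                         std::size_t block_size) {
  if (block_size == 0) block_size = 1;  // guard against division by zero
  const std::size_t n = data.size();
  const std::size_t num_blocks = (n + block_size - 1) / block_size;
  std::vector<unsigned> block_sums(num_blocks, 0);

  // Upsweep: each block reduces its own tile to a single total.
  for (std::size_t b = 0; b < num_blocks; ++b) {
    const std::size_t end = std::min(n, (b + 1) * block_size);
    for (std::size_t i = b * block_size; i < end; ++i) {
      block_sums[b] += data[i];
    }
  }

  // Top level: exclusive scan of the block totals gives each block's offset.
  unsigned running = 0;
  for (std::size_t b = 0; b < num_blocks; ++b) {
    const unsigned sum = block_sums[b];
    block_sums[b] = running;
    running += sum;
  }

  // Downsweep: each block scans its tile, seeded with its block offset.
  for (std::size_t b = 0; b < num_blocks; ++b) {
    const std::size_t end = std::min(n, (b + 1) * block_size);
    unsigned offset = block_sums[b];
    for (std::size_t i = b * block_size; i < end; ++i) {
      const unsigned value = data[i];
      data[i] = offset;
      offset += value;
    }
  }
  return running;
}
```

In a pipeline of this shape, the per-element counts produced by a size-estimation step would be scanned to obtain each element's output offset, and the returned total gives the overall buffer size needed.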
Edited by Daniel Hugo Campora Perez
