Optimize tdma_solver_vec and make it optionally asynchronous


Dmitry Alexeev requested to merge nvidia-optimize-tdma into main on Nov 07, 2024


What is the bug

This is a small optimization of tdma_solver_vec that improves its performance and makes it possible to call it completely asynchronously. It is a prerequisite for the JSBACH CUDA graph implementation: https://gitlab.dkrz.de/jsbach/jsbach/-/merge_requests/173
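
For readers unfamiliar with the routine, the sketch below shows the kind of computation a batched tridiagonal (Thomas algorithm) solver performs. It assumes that tdma_solver_vec solves many independent tridiagonal systems, which the name suggests but this description does not state. The function names `tdma_solve` and `tdma_solve_batched` and the argument layout are hypothetical and are not the libiconmath interface or the code changed in this merge request.

```python
def tdma_solve(a, b, c, d):
    """Solve one system a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i].

    a[0] and c[-1] are ignored. No pivoting is done, so this sketch
    assumes a well-conditioned (e.g. diagonally dominant) matrix.
    """
    n = len(d)
    cp = [0.0] * n  # modified upper-diagonal coefficients
    dp = [0.0] * n  # modified right-hand side

    # Forward sweep: eliminate the lower diagonal.
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom

    # Back substitution: recover x from the last row upwards.
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x


def tdma_solve_batched(a, b, c, d):
    """Solve many independent systems; each argument holds one row per system."""
    return [tdma_solve(ai, bi, ci, di) for ai, bi, ci, di in zip(a, b, c, d)]
```

The sweeps within one system are sequential, but the systems are independent of each other, so the batch dimension is the natural candidate for parallel execution on an accelerator. How the actual routine is offloaded and queued asynchronously is not shown here.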

How do you fix it

N/A

How urgent is the bugfix

I need it as soon as possible
I can wait for a couple of days
None of my current codes is directly affected

Mandatory steps before review

GitLab CI passes (Hint: use make format for linting)
Bugfix is covered by additional unit tests
Mark the merge request as ready by removing Draft:

Mandatory steps before merge

Reviewed by a maintainer
Incorporate review suggestions
Remember to edit the commit message and select the proper changelog category (feature/bugfix/other)
Prior to merging, please remove any boilerplate from the MR description, retaining only the What is the bug and How do you fix it section to maintain

You are not supposed to merge this request by yourself, the maintainers of libiconmath take care of this action!

Edited Nov 19, 2024 by Pradipta Samanta
