Performance optimization for NEC SX AURORA
What is the bug
The subroutines init_zero_4d_[dp,sp,i4] do not vectorize properly on NEC SX AURORA.
How do you fix it
Loop collapsing is enforced by a compiler directive for the subroutines init_zero_4d_[dp,sp,i4], in order to ensure proper vectorization on NEC SX AURORA.
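The following is a minimal sketch of this kind of change, not the actual fortran-support source. The module name init_zero_sketch and the routine name init_zero_4d_dp_sketch are hypothetical, and the directive spelling `!NEC$ collapse` is an assumption based on the NEC Fortran compiler's directive syntax; the directive used in this merge request may differ. The reasoning behind it, also an assumption here, is that merging the four nested loops into one long loop gives the vectorizer a long trip count even when the innermost extent is short.

```fortran
module init_zero_sketch
  use, intrinsic :: iso_fortran_env, only: dp => real64
  implicit none

contains

  ! Hypothetical stand-in for init_zero_4d_dp: sets every element of a
  ! 4D double-precision array to zero.
  subroutine init_zero_4d_dp_sketch(a)
    real(dp), intent(out) :: a(:, :, :, :)
    integer :: i1, i2, i3, i4

    ! Assumed directive spelling: asks the NEC compiler to collapse the
    ! loop nest below into a single loop. Other compilers treat this
    ! line as an ordinary comment, so the code stays portable.
    !NEC$ collapse
    do i4 = 1, size(a, 4)
      do i3 = 1, size(a, 3)
        do i2 = 1, size(a, 2)
          do i1 = 1, size(a, 1)
            a(i1, i2, i3, i4) = 0.0_dp
          end do
        end do
      end do
    end do
  end subroutine init_zero_4d_dp_sketch

end module init_zero_sketch
```

A caller would use the module and invoke `call init_zero_4d_dp_sketch(a)` on an allocated rank-4 array. Whether the loops were actually collapsed and vectorized is best confirmed from the compiler's diagnostic listing rather than assumed from the directive alone.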
How urgent is the bugfix
- I need it as soon as possible
- I can wait for a couple of days
- None of my current codes is directly affected
Mandatory steps before review
- GitLab CI passes (Hint: use make format for linting)
- Bugfix is covered by additional unit tests
- Mark the merge request as ready by removing Draft:
Mandatory steps before merge
- Test coverage does not decrease
- Reviewed by a maintainer
- Incorporate review suggestions
- Prior to merging, please remove any boilerplate from the MR description, retaining only the What is the bug and How do you fix it sections to maintain
- Remember to edit the commit message and select the proper changelog category (feature/bugfix/other)
You are not supposed to merge this request yourself; the maintainers of fortran-support take care of this action!
Edited by Yen-Chen Chen