Verified commit 2cbccd73, authored by Daniel Reinert, committed by Yen-Chen Chen

Performance optimization for NEC SX AURORA (!97)


The subroutines `init_zero_4d_[dp,sp,i4]` do not vectorize properly on NEC SX AURORA.

Loop collapsing is enforced by a compiler directive for the subroutines `init_zero_4d_[dp,sp,i4]`, in order to ensure proper vectorization on NEC SX AURORA.

Approved-by: Yen-Chen Chen <yen-chen.chen@kit.edu>
Merged-by: Yen-Chen Chen <yen-chen.chen@kit.edu>
Changelog: feature
parent 0c423645
@@ -1201,6 +1201,7 @@ CONTAINS
#else
!$omp do collapse(4)
#endif
!NEC$ forced_collapse
DO i4 = 1, m4
DO i3 = 1, m3
DO i2 = 1, m2
@@ -1235,6 +1236,7 @@ CONTAINS
#else
!$omp do collapse(4)
#endif
!NEC$ forced_collapse
DO i4 = 1, m4
DO i3 = 1, m3
DO i2 = 1, m2
@@ -1269,6 +1271,7 @@ CONTAINS
#else
!$omp do collapse(4)
#endif
!NEC$ forced_collapse
DO i4 = 1, m4
DO i3 = 1, m3
DO i2 = 1, m2
......
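
For orientation, here is a minimal, self-contained sketch of the pattern these hunks touch: a four-deep zeroing loop nest with the OpenMP `collapse(4)` clause and the `!NEC$ forced_collapse` directive placed directly above the outermost `DO`. The names `demo_forced_collapse`, `zero_4d_dp`, `field`, and `init_var` are hypothetical stand-ins, not the actual `init_zero_4d_dp` source, and the preprocessor branches (`#else` / `#endif`) from the diff are omitted. Compilers other than NEC's treat the `!NEC$` line as an ordinary comment, so the sketch should build elsewhere too.

```fortran
! Simplified, hypothetical stand-in for one of the init_zero_4d_* routines.
PROGRAM demo_forced_collapse
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp = SELECTED_REAL_KIND(12, 307)
  REAL(dp) :: field(4, 3, 2, 5)

  field = 1.0_dp
  CALL zero_4d_dp(field)
  PRINT *, 'max |field| after zeroing:', MAXVAL(ABS(field))

CONTAINS

  SUBROUTINE zero_4d_dp(init_var)
    REAL(dp), INTENT(out) :: init_var(:,:,:,:)
    INTEGER :: i1, i2, i3, i4, m1, m2, m3, m4

    m1 = SIZE(init_var, 1)
    m2 = SIZE(init_var, 2)
    m3 = SIZE(init_var, 3)
    m4 = SIZE(init_var, 4)

    ! Orphaned worksharing construct: runs serially unless the routine
    ! is called from inside an OpenMP parallel region.
!$omp do collapse(4)
    ! Directive added by this commit; a plain comment for non-NEC compilers.
!NEC$ forced_collapse
    DO i4 = 1, m4
      DO i3 = 1, m3
        DO i2 = 1, m2
          DO i1 = 1, m1
            init_var(i1, i2, i3, i4) = 0.0_dp
          END DO
        END DO
      END DO
    END DO
!$omp end do
  END SUBROUTINE zero_4d_dp

END PROGRAM demo_forced_collapse
```

Built with any Fortran compiler, with or without OpenMP enabled, the program should report a maximum absolute value of zero. Whether the nest is actually collapsed and vectorized on the NEC target is best confirmed from the compiler's diagnostic listing, not assumed from the directive alone.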