Matches in SemOpenAlex for { <https://semopenalex.org/work/W4236561641> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4236561641 abstract "In the multicore era it was possible to exploit the increase in on-chip parallelism by simply running multiple MPI processes per chip. Unfortunately, manycore processors' greatly increased thread- and data-level parallelism coupled with a reduced memory capacity demand an altogether different approach. In this paper we explore augmenting two NWChem modules, triples correction of the CCSD(T) and Fock matrix construction, with OpenMP in order that they might run efficiently on future manycore architectures. As the next NERSC machine will be a self-hosted Intel MIC (Xeon Phi) based supercomputer, we leverage an existing MIC testbed at NERSC to evaluate our experiments. In order to proxy the fact that future MIC machines will not have a host processor, we run all of our experiments in native mode. We found that while straightforward application of OpenMP to the deep loop nests associated with the tensor contractions of CCSD(T) was sufficient in attaining high performance, significant effort was required to safely and efficiently thread the TEXAS integral package when constructing the Fock matrix. Ultimately, our new MPI+OpenMP hybrid implementations attain up to 65x better performance for the triples part of the CCSD(T) due in large part to the fact that the limited on-card memory limits the existing MPI implementation to a single process per card. Additionally, we obtain up to 1.6x better performance on Fock matrix constructions when compared with the best MPI implementations running multiple processes per card." @default.
- W4236561641 created "2022-05-12" @default.
- W4236561641 creator A5006616066 @default.
- W4236561641 creator A5010296675 @default.
- W4236561641 creator A5034688782 @default.
- W4236561641 creator A5057932953 @default.
- W4236561641 date "2014-10-10" @default.
- W4236561641 modified "2023-09-25" @default.
- W4236561641 title "Thread-Level Parallelization and Optimization of NWChem for the Intel MIC Architecture" @default.
- W4236561641 doi "https://doi.org/10.2172/1163233" @default.
- W4236561641 hasPublicationYear "2014" @default.
- W4236561641 type Work @default.
- W4236561641 citedByCount "0" @default.
- W4236561641 crossrefType "report" @default.
- W4236561641 hasAuthorship W4236561641A5006616066 @default.
- W4236561641 hasAuthorship W4236561641A5010296675 @default.
- W4236561641 hasAuthorship W4236561641A5034688782 @default.
- W4236561641 hasAuthorship W4236561641A5057932953 @default.
- W4236561641 hasBestOaLocation W42365616413 @default.
- W4236561641 hasConcept C111919701 @default.
- W4236561641 hasConcept C118524514 @default.
- W4236561641 hasConcept C119857082 @default.
- W4236561641 hasConcept C138101251 @default.
- W4236561641 hasConcept C145108525 @default.
- W4236561641 hasConcept C149635348 @default.
- W4236561641 hasConcept C153083717 @default.
- W4236561641 hasConcept C165696696 @default.
- W4236561641 hasConcept C173608175 @default.
- W4236561641 hasConcept C199360897 @default.
- W4236561641 hasConcept C26713055 @default.
- W4236561641 hasConcept C2780513914 @default.
- W4236561641 hasConcept C38652104 @default.
- W4236561641 hasConcept C41008148 @default.
- W4236561641 hasConcept C78766204 @default.
- W4236561641 hasConcept C83283714 @default.
- W4236561641 hasConcept C96972482 @default.
- W4236561641 hasConceptScore W4236561641C111919701 @default.
- W4236561641 hasConceptScore W4236561641C118524514 @default.
- W4236561641 hasConceptScore W4236561641C119857082 @default.
- W4236561641 hasConceptScore W4236561641C138101251 @default.
- W4236561641 hasConceptScore W4236561641C145108525 @default.
- W4236561641 hasConceptScore W4236561641C149635348 @default.
- W4236561641 hasConceptScore W4236561641C153083717 @default.
- W4236561641 hasConceptScore W4236561641C165696696 @default.
- W4236561641 hasConceptScore W4236561641C173608175 @default.
- W4236561641 hasConceptScore W4236561641C199360897 @default.
- W4236561641 hasConceptScore W4236561641C26713055 @default.
- W4236561641 hasConceptScore W4236561641C2780513914 @default.
- W4236561641 hasConceptScore W4236561641C38652104 @default.
- W4236561641 hasConceptScore W4236561641C41008148 @default.
- W4236561641 hasConceptScore W4236561641C78766204 @default.
- W4236561641 hasConceptScore W4236561641C83283714 @default.
- W4236561641 hasConceptScore W4236561641C96972482 @default.
- W4236561641 hasLocation W42365616411 @default.
- W4236561641 hasLocation W42365616412 @default.
- W4236561641 hasLocation W42365616413 @default.
- W4236561641 hasOpenAccess W4236561641 @default.
- W4236561641 hasPrimaryLocation W42365616411 @default.
- W4236561641 hasRelatedWork W1578182049 @default.
- W4236561641 hasRelatedWork W2117500226 @default.
- W4236561641 hasRelatedWork W2150756254 @default.
- W4236561641 hasRelatedWork W2170268965 @default.
- W4236561641 hasRelatedWork W2548780957 @default.
- W4236561641 hasRelatedWork W2564473412 @default.
- W4236561641 hasRelatedWork W2785520363 @default.
- W4236561641 hasRelatedWork W2981664121 @default.
- W4236561641 hasRelatedWork W3170632927 @default.
- W4236561641 hasRelatedWork W4289791131 @default.
- W4236561641 isParatext "false" @default.
- W4236561641 isRetracted "false" @default.
- W4236561641 workType "report" @default.