Publications
-
Performance Analysis of Pure MPI versus MPI+OpenMP for Jacobi
Iteration and a 3D FFT on the Cray XT5
(Glenn R. Luecke, Olga Weiss, Marina Kraeva,
James Coyle, James Hoekstra)
Proceedings of the Cray Users Group, Edinburgh, Scotland, May 2010.
-
Evaluating the Capability of Compilers and Tools to Detect
Serial and Run-time Errors
(Glenn R. Luecke, James Coyle, James Hoekstra,
Marina Kraeva, Elizabeth Kleiman, Olga Weiss, Mi-Young Park, Andre Wehe, Melissa Yahya)
refereed poster at Supercomputing 2009, Portland, Oregon, November 2009.
-
Evaluating Error Detection Capabilities of UPC Compilers
(with James Coyle, James Hoekstra,
Marina Kraeva, Elizabeth Kleiman, Indranil Roy)
preprint.
-
The Importance of Run-Time Error Detection
(with James Coyle, James Hoekstra,
Marina Kraeva, Ying Xu, Mi-Young Park, Elizabeth Kleiman, Olga Weiss, Andre
Wehe, Melissa Yahya), Tools for High Performance Computing 2009, Proceedings of the
3rd International Workshop on Parallel Tools for High Performance Computing,
September 2009, ZIH, Dresden, Springer 2010.
-
Evaluating the Capability of Compilers and Tools to
Detect Serial and Parallel Run-time Errors
(with James Coyle, James Hoekstra,
Marina Kraeva, Ying Xu, Mi-Young Park, Elizabeth Kleiman, Olga Weiss, Andre
Wehe, Melissa Yahya), refereed poster at Supercomputing 2009, Portland, Oregon
November 2009.
-
Evaluating Error Detection Capabilities of UPC
Run-time Systems
(with James Coyle, James Hoekstra,
Marina Kraeva, Ying Xu, Mi-Young Park, Elizabeth Kleiman, Olga Weiss, Andre
Wehe, Melissa Yahya), Proceedings of the Partitioned Global Address Space (PGAS)
2009 Conference, Washington DC,
October 2009.
-
Measuring MPI Latency and Communication Rates for Small Messages
(with James Coyle, James Hoekstra, Marina Kraeva and
Ying Xu from the Shanghai Supercomputing Center, Shanghai, China),
March 31, 2009. Submitted for publication.
-
Using Nodes in a Cluster Efficiently
(with
Ying Li, and Martin Cuma from the University of Utah),
Benchmarking: An International Journal, Vol. 14, No. 6,
2007.
-
A Survey of Systems for Detecting Serial Run-Time Errors
(with
James Coyle, Jim Hoekstra, Marina Kraeva, Ying Li, Yanmei Wang, and Olga
Taborskaia), Concurrency and Computation:
Practice and Experience, Volume 18, pp 1885-1907, 2006.
-
Sending Non-Contiguous Data in MPI Programs
(with Yanmei Wang),
IT Technical Report, April 2005.
-
MPI-CHECK for
C/C++ MPI Programs (with Pavel Krusina), IT Technical Report, November
2003.
-
Evaluating the
Performance of MPI-2 One-Sided Routines on a Cray SV1
(with Wei
Hu), IT Technical Report, December 2002.
-
The
Performance and Scalability of the NAS Parallel Benchmarks on a Cray SV1
(with Ying Li), technical report, June 7, 2002.
-
Scalability and Performance of MPI, HPF, and OpenMP on an SGI Origin 2000
(with Zhe Guan, Thomas Brandes), IT Technical Report, June 2002.
-
Deadlock Detection In MPI Programs
(with Yan Zou, James Coyle,
Jim Hoekstra, Marina Kraeva), Concurrency and Computation: Practice and
Experience. 2002, vol.14, pp 911-932.
-
MPI-CHECK: a Tool for Checking Fortran 90 MPI Programs
(with Hua
Chen, James Coyle, Jim Hoekstra, Marina Kraeva, Yan Zou), Concurrency and
Computation: Practice and Experience. 2003, vol. 15, pp 93-100.
- The Performance and Scalability of SHMEM and MPI-2 One-Sided
Routines on a SGI Origin 2000 and a Cray T3E-600
(with Silvia Spanoyannis, Marina Kraeva), Performance Evaluation &
Modeling of Computer Systems, December, 2002, http://dsg.port.ac.uk/Journals/PEMCS/
-
Performance and Scalability of MPI on PC Clusters
(with Jing Yuan, Silvia Spanoyannis, Marina Kraeva), Performance
Evaluation & Modeling of Computer Systems, November, 2002, and Concurrency
and Computation: Practice and Experience, 2004, vol 16, pages 79-107.
- Comparing the Performance of MPICH with Cray's MPI and with SGI's
MPI
(with Lili Ju, Marina Kraeva), Performance Evaluation & Modeling of
Computer Systems, August, 2002, and Concurrency and Computation: Practice
and Experience, 2003, volume 15, pages 779-802.
-
Scalability and Performance of OpenMP and MPI on a 128-Processor SGI
Origin2000
(with Wei-Hua Lin), Concurrency and Computation: Practice and
Experience, 2001; 13, pp 905-928.
-
Evaluating the Performance of High Performance Fortran Compilers on a NEC
Cenju-4, Cray T3E, an IBM SP and an SGI Origin 2000
(with Ying Li
and Jen-Yao Hsu), IT Technical Report, August 2000.
- Comparing the Communication Performance and Scalability of a Linux
and an NT Cluster of PCs, a Cray Origin 2000, an IBM SP and a Cray T3E-600
(with B. Raffin, J. Coyle), Proceedings of IEEE Computer Society
International Workshop on Cluster Computing, pp 26-35, December 2-4, 1999,
Melbourne, Australia.
- Comparing the Communication Performance and Scalability of a SGI
Origin 2000, a Cluster of Origin 2000's, and a Cray T3E-1200 Using SHMEM and
MPI Routines
(with B. Raffin, J. Coyle), Performance Evaluation & Modeling of
Computer Systems, October, 1999.
- The Performance of the MPI Collective Communication Routines for
Large Messages on the Cray T3E-600, the Cray Origin 2000, and the IBM SP
(with B. Raffin, J. Coyle), Performance Evaluation & Modeling of
Computer Systems, July, 1999.
- Comparing the Scalability of the Cray T3E-600 and the Cray Origin
2000 Using SHMEM Routines
(with J. Coyle and B. Raffin), Performance Evaluation & Modeling of
Computer Systems, December, 1998.
- Comparing the Performance of MPI on the Cray T3E-900, the Cray
Origin 2000 and the IBM P2SC
(with J. Coyle), Performance Evaluation & Modeling of Computer
Systems, June, 1998.
- Performance Comparison of MPI, PGHPF/CRAFT and HPF Implementations
of the Cholesky Factorization on the Cray T3E-600 and IBM SP-2
(with Ying Li), Performance Evaluation & Modeling of Computer
Systems, Jan. 1998.
- Comparing the Performance of MPI on the Cray Research T3E-600 and
IBM SP-2
(with J. Coyle, W. Haque), Performance Evaluation & Modeling of
Computer Systems, September 1997.
-
High
Performance Fortran Versus Explicit Message Passing on the IBM SP-2 for the
Parallel LU, QR, and Cholesky Factorizations
(with J. Coyle)
Supercomputer, Vol. XIII(2), pp. 4-14, 1997.
-
Performance Comparison of Workstation Clusters for Scientific Computing
(with J. Coyle, W. Haque, J. Hoekstra, H. Jespersen), Supercomputer,
vol 12(2), pp. 4-20, Mar 1996.
- Parallel Inverse Iteration for Eigenvalue and Singular Value
Decompositions (with Hanson), SIAM News, pp. 14-16, May/June, 1994.
- Performance of Workstations and Mainframes for Scientific
Computing (with W. Haque, J. Coyle, J. Hoekstra, H. Jespersen),
Supercomputer, vol 9(6), pp. 10-19, Nov 1992.
- Parallel Cholesky Factorization Algorithms Using BLAS,
Journal of Supercomputing, Vol 6, 1992.
- Evaluation of Fortran Vector Compilers and Preprocessors
(with W. Haque, J. Coyle, J. Hoekstra, H. Jespersen), SOFTWARE - Practice
and Experience, vol 21(9), pp. 891-905, Sep 1991.
- I/O Considerations for Performance Enhancement under the MVS
Operating System (with W. Haque, J. Hoekstra, H. Jespersen),
Supercomputer, vol 8(5), pp. 41-50, Sep 1991.
- Performance Comparisons of the Cholesky Factorization Algorithms
Using Serial & Parallel BLAS on the Cray-2, Cray X/Y-MP, IBM 3090J, & the
HDS EX 60 (with Yun & P. Smith), IMSL Technical Report 9003, 1990.
- Performance of Cholesky Factorization Algorithms using Level 2 & 3
BLAS on the Cray 2, Cray X-MP, IBM 3090 & HDS AS/EX Computers (with
Yun & P. Smith), Intern. Conf. on Supercomputing, Crete, Greece, June, 1989.
- Performance of the Blocked Cholesky Factorization Algorithm Using
Level 3 BLAS on the HDS/XL Vector Computer, IMSL Technical Report
8802, 1989.
- A Comparative Study of KAP and VAST-2: Two Automatic Vector
Preprocessors with Fortran 8x Output, Supercomputer 28, vol 5(6), pp.
15-25, Nov 1988.