Eigenvalue SoLvers for Petaflop-Applications (ELPA)
- http://elpa.mpcdf.mpg.de
- The ELPA library was originally created by the ELPA consortium, consisting of the following organizations:
- Max Planck Computing and Data Facility (MPCDF) formerly known as Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
- Bergische Universität Wuppertal, Lehrstuhl für angewandte Informatik,
- Technische Universität München, Lehrstuhl für Informatik mit Schwerpunkt Wissenschaftliches Rechnen,
- Fritz-Haber-Institut, Berlin, Abt. Theorie,
- Max-Planck-Institut für Mathematik in den Naturwissenschaften, Leipzig, Abt. Komplexe Strukturen in Biologie und Kognition, and
- IBM Deutschland GmbH
Some parts and enhancements of ELPA have been contributed and authored by the Intel Corporation and Nvidia Corporation, which are not part of the ELPA consortium.
Maintenance and development of the ELPA library is done by the Max Planck Computing and Data Facility (MPCDF).
Further support of the ELPA library is provided by the ELPA-AEO consortium, consisting of the following organizations:
- Max Planck Computing and Data Facility (MPCDF) formerly known as Rechenzentrum Garching der Max-Planck-Gesellschaft (RZG),
- Bergische Universität Wuppertal, Lehrstuhl für angewandte Informatik,
- Technische Universität München, Lehrstuhl für Informatik mit Schwerpunkt Wissenschaftliches Rechnen,
- Technische Universität München, Lehrstuhl für theoretische Chemie,
- Fritz-Haber-Institut, Berlin, Abt. Theorie
Contributions to the ELPA source have been authored by (in alphabetical order):
- T. Auckenthaler, Volker Blum, A. Heinecke, L. Huedepohl, R. Johanni, Werner Jürgens, Pavel Kus, and A. Marek
All the important information is in the elpa_api::elpa_t derived type
In MPI builds, ELPA requires the matrix to be block-cyclic distributed, and the user has to set up this distribution before calling ELPA. Experience shows that it is very important to check the return code of 'descinit' to verify that the block-cyclic distribution is valid. ELPA relies on a valid block-cyclic distribution and may show unexpected behavior if this has not been ensured before calling ELPA.
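To illustrate what a block-cyclic distribution implies for the local matrix dimensions, here is a minimal C sketch of the block-counting rule used by ScaLAPACK's NUMROC routine. The function names local_extent and extents_sum_to_n are hypothetical helpers and not part of ELPA or ScaLAPACK. In a real program one would typically obtain these values from ScaLAPACK itself and still check the info value returned by 'descinit'.

```c
#include <assert.h>

/* Number of rows (or columns) of a dimension of length n that end up on
 * process iproc when blocks of size nb are dealt out round-robin over
 * nprocs processes, starting with process isrc.
 * Hypothetical helper that mirrors the logic of ScaLAPACK's NUMROC. */
int local_extent(int n, int nb, int iproc, int isrc, int nprocs)
{
    assert(n >= 0 && nb > 0 && nprocs > 0);
    assert(iproc >= 0 && iproc < nprocs && isrc >= 0 && isrc < nprocs);

    int mydist  = (nprocs + iproc - isrc) % nprocs; /* distance from the source process */
    int nblocks = n / nb;                           /* number of complete blocks */
    int extent  = (nblocks / nprocs) * nb;          /* blocks every process receives */
    int extra   = nblocks % nprocs;                 /* leftover complete blocks */

    if (mydist < extra)
        extent += nb;       /* this process gets one more complete block */
    else if (mydist == extra)
        extent += n % nb;   /* this process gets the trailing partial block */
    return extent;
}

/* Sanity check: the local extents of all processes must add up to n.
 * Returns 1 if they do, 0 otherwise. */
int extents_sum_to_n(int n, int nb, int nprocs)
{
    int total = 0;
    for (int p = 0; p < nprocs; ++p)
        total += local_extent(n, nb, p, 0, nprocs);
    return total == n;
}
```

For example, with na = 10, nblk = 2 and three process rows, the process rows would hold 4, 4 and 2 matrix rows. Values computed this way correspond to what the synopses below pass as "local_nrows" and "local_ncols".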
A typical usage of ELPA might look like this:
Fortran synopsis
use elpa
class(elpa_t), pointer :: elpainstance
integer :: success

! api_version: the ELPA API version this code was written against
! (the value is not given in this section)
if (elpa_init(api_version) /= elpa_ok) then
  print *, "ELPA API version not supported"
  stop
endif

elpainstance => elpa_allocate(success)
if (success /= elpa_ok) then
  print *, "Could not allocate ELPA"
endif

! set the parameters describing the matrix and its MPI distribution
call elpainstance%set("na", na, success)
if (success /= elpa_ok) then
  print *, "Could not set entry"
endif
call elpainstance%set("nev", nev, success)
call elpainstance%set("local_nrows", na_rows, success)
call elpainstance%set("local_ncols", na_cols, success)
call elpainstance%set("nblk", nblk, success)
call elpainstance%set("mpi_comm_parent", mpi_comm_world, success)
call elpainstance%set("process_row", my_prow, success)
call elpainstance%set("process_col", my_pcol, success)

! set up the ELPA object
success = elpainstance%setup()
if (success /= elpa_ok) then
  print *, "Could not setup ELPA object"
endif

! optional run-time settings
call elpainstance%set("gpu", 1, success)
if (my_rank .eq. 0) call elpainstance%set("use_gpu_id", 1, success)
call elpainstance%set("solver", elpa_solver_2stage, success)
call elpainstance%set("real_kernel", elpa_2stage_real_avx_block2, success)
! ... set and get all other options that are desired

! store the settings of the ELPA object
call elpainstance%store_settings("save_to_disk.txt", success)

! solve the eigenvalue problem
call elpainstance%eigenvectors(a, ev, z, success)

! clean up
call elpa_deallocate(elpainstance, success)
call elpa_uninit(success)
C synopsis
#include <stdio.h>
#include <stdlib.h>
#include <elpa/elpa.h>

elpa_t handle;
int error;

/* api_version: the ELPA API version this code was written against
   (the value is not given in this section) */
if (elpa_init(api_version) != ELPA_OK) {
  fprintf(stderr, "Error: ELPA API version not supported\n");
  exit(1);
}

handle = elpa_allocate(&error);
if (error != ELPA_OK) {
  fprintf(stderr, "Error: could not allocate ELPA\n");
  exit(1);
}

/* set the parameters describing the matrix and its MPI distribution */
elpa_set(handle, "na", na, &error);
elpa_set(handle, "nev", nev, &error);
elpa_set(handle, "local_nrows", na_rows, &error);
elpa_set(handle, "local_ncols", na_cols, &error);
elpa_set(handle, "nblk", nblk, &error);
elpa_set(handle, "mpi_comm_parent", MPI_Comm_c2f(MPI_COMM_WORLD), &error);
elpa_set(handle, "process_row", my_prow, &error);
elpa_set(handle, "process_col", my_pcol, &error);

/* set up the ELPA object */
error = elpa_setup(handle);

/* optional run-time settings */
elpa_set(handle, "gpu", 1, &error);
if (my_rank == 0) elpa_set(handle, "use_gpu_id", 1, &error);
elpa_set(handle, "solver", ELPA_SOLVER_2STAGE, &error);
elpa_set(handle, "real_kernel", ELPA_2STAGE_REAL_AVX_BLOCK2, &error);

/* ... set and get all other options that are desired */

/* store the settings of the ELPA object */
elpa_store_settings(handle, "save_to_disk.txt", &error);

/* solve the eigenvalue problem */
elpa_eigenvectors(handle, a, ev, z, &error);

/* clean up */
elpa_deallocate(handle, &error);
elpa_uninit(&error);
The autotuning could be used like this:
Fortran synopsis
use elpa
class(elpa_t), pointer :: e
class(elpa_autotune_t), pointer :: tune_state
integer :: success, iter

! api_version: the ELPA API version this code was written against
! (the value is not given in this section)
if (elpa_init(api_version) /= elpa_ok) then
  print *, "ELPA API version not supported"
  stop
endif

e => elpa_allocate(success)

! set the parameters describing the matrix and its MPI distribution
call e%set("na", na, success)
call e%set("nev", nev, success)
call e%set("local_nrows", na_rows, success)
call e%set("local_ncols", na_cols, success)
call e%set("nblk", nblk, success)
call e%set("mpi_comm_parent", mpi_comm_world, success)
call e%set("process_row", my_prow, success)
call e%set("process_col", my_pcol, success)

! set up the ELPA object
success = e%setup()

! create the autotune object
tune_state => e%autotune_setup(elpa_autotune_fast, elpa_autotune_domain_real, success)

! optionally set some options explicitly
call e%set("solver", elpa_solver_2stage, success)
call e%set("real_kernel", elpa_2stage_real_avx_block2, success)

! ... set and get all other options that are desired

iter = 0
do while (e%autotune_step(tune_state, success))
  iter = iter + 1
  call e%eigenvectors(a, ev, z, success)

  ! if needed, save the autotune state and stop after max_iter steps
  if (iter > max_iter) then
    call e%autotune_save_state(tune_state, "autotune_checkpoint.txt", success)
    exit
  endif
enddo

! set the object to the best settings found
call e%autotune_set_best(tune_state, success)

! store the settings of the autotuned object
call e%store_settings("autotuned_object.txt", success)

! deallocate the autotune object
call elpa_autotune_deallocate(tune_state, success)
More examples, in both Fortran and C, can be found in the folder "test".