Debrecen2

This system is maintained by NIIF.

Detailed info about the debrecen2 Nvidia GPU cluster is available here.

Detailed info about the debrecen3 Intel Phi cluster is available here.

Specification

 Name                  Debrecen2 (Leo)                                  Debrecen3 Phi (Apollo)
 Type                  HP SL250s                                        HP Apollo 8000
 Cores / node          16                                               24
 Accelerators / node   3x Nvidia K20x in 68 nodes (K40x in 16 nodes)    2x Intel(R) Xeon Phi(TM) MIC SE10/7120D
 Memory / node         120 GB                                           120 GB
 CPU                   2× Intel Xeon E5-2650 v2 @ 2.60GHz               2× Intel Xeon E5-2670 v3 @ 2.30GHz
 Architecture          x86_64 / intel64 / em64t, little-endian          x86_64 / intel64 / em64t, little-endian
 Scheduler             Slurm                                            Slurm
 MPI                   Intel MPI (impi), Open MPI (ompi)                Intel MPI (impi), Open MPI (ompi)

Logging in

Set up SSH access to debrecen2 from Skynet and mount its storage on Skynet. Log into Skynet and type:

 cd ~/shf3/mid/ssh
 cp niif/debrecen2 debrecen2
 cd ~/shf3/key/ssh

Then place your private NIIF key here and rename it:

 mv <YOUR_NIIF_KEY> debrecen.sec

If you used PuTTYgen to generate the keypair, you might need to export the private key from PuTTY in OpenSSH format.
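
SSH refuses keys that are readable by other users, so make sure the key directory and file have strict permissions (a general SSH requirement, not something specific to this cluster):

 chmod 700 ~/shf3/key/ssh
 chmod 600 ~/shf3/key/ssh/debrecen.sec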

Precompiled Environment

From Skynet, sync up the precompiled environment:

 cd /share/niif/debrecen2
 sshput -m debrecen2 -s .

Log into debrecen2 (you can also use PuTTY, etc., if you prefer not to log into debrecen2 from Skynet):

 sshto -m debrecen2

Then add the following into your .bash_profile:

 PATH=$PATH:$HOME/bin
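
.bash_profile is only read at login, so either log in again or source it to apply the change in the current shell:

 source ~/.bash_profile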

VASP

We use Intel MPI for VASP.

First, please transfer the compiled VASP binaries and projectors from Skynet. Log into Skynet and type:

 cd /share/niif/
 sshput -t 2 -m debrecen2 -s vasp/5.4.1.03082016.impi
 sshput -t 2 -m debrecen2 -s vasp/5.4.1.03082016.impi.7.0cuda
 sshput -t 2 -m debrecen2 -s vasp/proj
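
To verify that the transfer succeeded, you can log into debrecen2 and list the directories; the target path below is an assumption based on the VASP_PROJ_HOME setting in the next step:

 sshto -m debrecen2
 ls $HOME/vasp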

Then add the following into your .bash_profile:

 export PATH=$PATH:$HOME/bin
 export VASP_PROJ_HOME=$HOME/vasp/proj
 
 module load intel/compiler/2016.1.150
 module load intel/mkl/2016.1.150
 module load intel/ipp/2016.1.150
 module load intel/daal/2016.1.150
 module load intel/tbb/2016.1.150
 module load intel/mpi/2016.1.150
 module load intel/mpi/4.1.0.027
 LD_LIBRARY_PATH=/opt/intel/compilers_and_libraries_2016.1.150/linux/mpi/intel64/lib:$LD_LIBRARY_PATH
 
 module unload cuda/6.5
 module load cuda/7.0.28
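
After the next login you can quickly check that the environment resolved as expected; these are generic checks, not part of the original instructions:

 module list          # should show the Intel 2016.1.150 modules and cuda/7.0.28
 which mpirun         # should point into the Intel MPI tree
 nvcc --version       # should report CUDA release 7.0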

debrecen3 with CPUs only

Log into "debrecen2" You can find a sample job of an ozone molecule in $HOME/jobsamples

 cd $HOME/jobsamples/ozone

Please fill in your email address in the debrecen2 jobfile,

 mcedit debrecen2_cpu

then submit it:

 sbatch debrecen2_cpu

This job should finish in mere seconds. For actual large-scale runs, please change the partition from "test" to the production partition:

 #SBATCH --partition=prod-phi

You can easily increase the number of nodes:

 #SBATCH --nodes=4

Since we cannot utilize the Xeon Phi accelerators, please only use this machine as long as there are no users with Xeon Phi-accelerated programs.
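
For reference, a jobfile along the lines of debrecen2_cpu might look like the sketch below. Only the partition names, the 24 cores per node and the Intel MPI environment come from this page; the VASP binary name, the time limit and the remaining header lines are assumptions, so always start from the provided sample.

 #!/bin/bash
 # Sketch of a CPU jobfile (assumptions marked below); start from the provided debrecen2_cpu sample.
 #SBATCH --job-name=ozone_cpu
 # "test" partition for the sample run; switch to the production partition for large runs
 #SBATCH --partition=test
 #SBATCH --nodes=1
 # debrecen3 nodes have 24 cores
 #SBATCH --ntasks-per-node=24
 # assumed time limit; the ozone sample finishes in seconds
 #SBATCH --time=00:30:00
 #SBATCH --mail-type=END
 #SBATCH --mail-user=<YOUR_EMAIL>
 
 # the binary name is an assumption; use the VASP binary from the precompiled environment
 mpirun vasp_std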

debrecen2 GPU port

Log into "debrecen2" You can find a sample job of an ozone molecule in $HOME/jobsamples

 cd $HOME/jobsamples/gpu

Please fill in your email address in the debrecen2 jobfile,

 mcedit debrecen2_gpu

then submit it:

 sbatch debrecen2_gpu

This job should finish within an hour.

You can easily increase the number of nodes:

 #SBATCH --nodes=4

Tips:

- The GPU port launches 3 MPI processes per node: one MPI process per physical GPU.

- There is no Gamma-only version available yet; you have to use the "standard" version "vasp_gpu" for Gamma-only calculations. The non-collinear binary is called "vasp_gpu_ncl".

- Use one GPU for each k-point; this improves the performance by ~30%, as the connection between the GPUs seems to be too slow.

- Typically one GPU node is roughly equivalent to 2-6 CPU nodes with ~20 Intel CPU cores each.

- You can use both PBE and hybrid (HSE06) functionals.

- The prod-gpu-k40 queue contains GPUs with 12 GB of on-board memory, while the prod-gpu-k20 GPUs have only 6 GB.
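
Putting these tips together, a jobfile along the lines of debrecen2_gpu might look like the sketch below. The 3 MPI processes per node, the partition names and the vasp_gpu / vasp_gpu_ncl binaries come from this page; the gres request, the time limit and the remaining header lines are assumptions, so always start from the provided sample.

 #!/bin/bash
 # Sketch of a GPU jobfile (assumptions marked below); start from the provided debrecen2_gpu sample.
 #SBATCH --job-name=ozone_gpu
 # prod-gpu-k20 (6 GB cards) or prod-gpu-k40 (12 GB cards)
 #SBATCH --partition=prod-gpu-k20
 #SBATCH --nodes=1
 # one MPI process per physical GPU, 3 GPUs per node
 #SBATCH --ntasks-per-node=3
 # the gres request is an assumption; check the provided jobfile for the exact resource line
 #SBATCH --gres=gpu:3
 # the ozone sample finishes within an hour
 #SBATCH --time=01:00:00
 #SBATCH --mail-type=END
 #SBATCH --mail-user=<YOUR_EMAIL>
 
 # use vasp_gpu_ncl for non-collinear runs
 mpirun vasp_gpu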

Job Monitoring