Computing Service
This page contains outdated content. Please update it if you can.
Introduction
Security Tips
First, please follow our security advice:
- Use a Unix-based operating system: OS X, SUSE, or Ubuntu.
- Enable automatic updates.
- Use a strict firewall and strong passwords. A console-based password manager can generate random passwords for you.
- Keep your SSH private keys safe and protect them with strong passphrases as well.
- Use PGP to encrypt mail and chat.
- Use strong encryption for private data.
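As a sketch of the password advice above, a strong random password can be generated directly from the shell. This example assumes only the standard openssl tool; whichever console password manager you use (not specified on this page) can store the result:

```shell
# Generate a 24-character random password from the OpenSSL CSPRNG.
# base64 of 18 random bytes yields exactly 24 characters, no padding.
openssl rand -base64 18
```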
OS Security Lists
Wigner RCP Systems
Skynet
This is our main SGI cluster, installed in May 2011. It is a late descendant of the venerable TPA batch systems and the CEDRUS environment.
SGI Manuals
Specification
Scheduler: Slurm
Type | SGI Rackable | SGI Rackable | SGI UV2000 | GPU node |
---|---|---|---|---|
# of nodes | 36 (cluster) | 4 (cluster) | 16 blades (1 SMP machine) | 1 (one machine) |
# of CPUs / node | 2 | 2 | 1 | 2 |
# of cores / CPU | 4 | 6 | 6 (96 total) | 4 |
Memory / node | 36 GB | 64 GB | 64 GB (1024 GB total) | 36 GB |
Memory / core | 4 GB | 5 GB | 10 GB | 4 GB |
CPU | Intel Xeon E5620 @ 2.40 GHz | Intel Xeon E5-2620 @ 2.00 GHz | Intel Xeon E5-4610 @ 2.40 GHz | Intel Xeon E5620 @ 2.40 GHz |
Architecture | x86_64 / intel64 / em64t little-endian | x86_64 / intel64 / em64t little-endian | x86_64 / intel64 / em64t little-endian | x86_64 / intel64 / em64t little-endian |
Interconnect between nodes | InfiniBand QDR 4X 40 Gb/s | InfiniBand QDR 4X 40 Gb/s | NUMAlink 6 | - |
Rmax | 2.8 TFlops | 0.8 TFlops | 1.8 TFlops | 0.1 TFlops + 1 TFlops from GPUs |
# of GPUs | none | none | none | 2 × Nvidia M2075 |
MPI | SGI MPT 2.08 | SGI MPT 2.08 | SGI MPT 2.08 | none (use Open MPI) |
Purpose | Large-scale MPI parallel jobs | Big-memory jobs | SMP-only jobs | GPU jobs |
Assignments
Project ID | Priority | Description | Participants |
---|---|---|---|
diamond | high | Nano diamonds | Hugo Pinto, Tamás Simon, Gergő Thiering |
sic | normal | Silicon Carbide | Viktor Ivády, Krisztián Szász, Bálint Somogyi, Tamás Hornos |
diavib | low | | Márton Vörös, Tamás Demjén |
solar | low | Solar cells | Márton Vörös |
Before Login
SSH Client Setup
At the first login you have to accept the host key. Please check the host fingerprint to avoid man-in-the-middle (MITM) attacks! On your client machine, set the following in $HOME/.ssh/config:
VisualHostKey yes
Genuine fingerprint of the system: File:Skynet fp.gif File:Fingerprint skynet.gif
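A minimal $HOME/.ssh/config entry might look like the following sketch; the Host alias, HostName, and User values are hypothetical placeholders — only the VisualHostKey setting comes from this page:

```
# Hypothetical host entry; replace HostName and User with real values.
Host skynet
    HostName skynet.example.org
    User YOUR_USERNAME
    VisualHostKey yes
```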
Explicit Connection Check
In order to have an explicit connection check before every login, set the following in the corresponding MID file:
mid_ssh_port_check="ping nmap"
The former will check the connection and the latter the port state. To check the connection without login:
sshmgr -c MID
After Login
Install Shell Framework
cd $HOME
git clone git://github.com/hornos/shf3.git
Source and setup the Shell Framework in $HOME/.profile:
source $HOME/shf3/bin/shfrc
# set the prompt
shf3/ps1 SKYNET[\\h]
# set framework features
shf3/alias yes
shf3/screen yes
shf3/mc/color yes
# screen workaround
if shf3/is/screen ; then
  source "/etc/profile.d/modules.sh"
fi
# tab complete
source $HOME/shf3/bin/complete
Parallel Compressor
Enable the parallel compressor for the framework:
cd $HOME
echo "sys_zip_xg_gz=pigz" > shf3/lib/sys/zip/xg/config.$USER
Module Environment
ESZR is our unified computing environment. Enable the ESZR system modules in $HOME/.profile:
# common
module use /site/eszr/mod/common
module load eszr/site
module load eszr/sys/wrcp/skynet
module load eszr/sys/wrcp/skynet.mpt
# site specific
module use /site/eszr/mod/site
module load sgi/2011
source ${ESZR_ROOT}/env/alias
Available module commands:
Command | Alias | Description |
---|---|---|
module avail | mla | Show available modules |
module list | mls | List loaded modules |
module display | mdp | About the module |
module load/unload MODULE | mld/mlu MODULE | Load / unload MODULE |
ESZR
ESZR is the unified directory structure and module environment. The environment can be checked with:
eszr
Directories
Accessing the scratch:
cd $ESZR_SCRATCH
Accessing the storage:
cd $ESZR_DATA
Synchronizing a directory to the storage:
dirsync DIR $ESZR_DATA
Sharing a directory with your Unix group:
dirshare DIR
Backup
Remote backup via ssh can be made by editing the MID:
sshmgr -e MID
Enter backup variables in the MID file:
mid_ssh_backup_dir="${ESZR_DATA}/MID/${mid_ssh_user}"
mid_ssh_backup_src="LIST"
where LIST is a space separated list of directories in your remote home to be saved. Then run:
sshtx backup MID
Compressing
Always compress data files to save storage space. You can use the following compression programs:
Mode | Compress File | Compress Directory | Extract File | Extract Directory |
---|---|---|---|---|
Serial | gzip -9 FILE | gzip -9 -r DIR | gzip -d FILE.gz | gzip -d -r DIR |
Parallel | pigz -9 FILE | pigz -9 -r DIR | pigz -d FILE.gz | pigz -d -r DIR |
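Note that recursive gzip/pigz compresses each file in a directory individually. To produce a single compressed archive instead, you can pipe tar through the compressor; this sketch falls back to serial gzip when pigz is not installed:

```shell
# Compress DIR into a single archive, using pigz if available, else gzip.
DIR=mydata
if command -v pigz >/dev/null 2>&1 ; then ZIP=pigz ; else ZIP=gzip ; fi
tar -cf - "$DIR" | "$ZIP" -9 > "$DIR.tar.gz"
# Extract the archive again:
"$ZIP" -dc "$DIR.tar.gz" | tar -xf -
```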
SSH File Transfer
Copy files or directories to a MID:
sshtx put MID space separated list of files
Receive files or directories from a MID:
sshtx get MID space separated list of files
SSH Mount
Mount:
sshmount MID
Unmount:
sshumount MID
Scheduler
The job scheduler is Slurm. In Slurm, each user is assigned one or more accounts, which you have to set in the queue file.
General information about the partitions:
sinfo -l
Partition | Allowed Groups | Purpose |
---|---|---|
devel | pdevel | Development (1 node) |
batch | pbatch | Production |
General information on jobs:
squeue -l or sjstat or qstat
Pending job priorities:
sprio -l
Slurm accounts and priorities:
sshare -l
Job accounting:
sacct
Detailed user statistics for the last month:
eszracct -u userid
Job Setup
Setup the Queue file and edit the parameters:
cd $HOME/shf3/mid/que
cp templates/wrcp/skynet .
mcedit skynet
Job template is in $HOME/shf3/mid/que/templates/wrcp/skynet.job
Interactive Jobs
There are two ways of running interactive multi-threaded jobs in the queue: (i) array jobs (many single-threaded tasks) and (ii) OpenMP jobs. To run an array job:
runarr queue:sockets:cores command
where queue is the queue MID, sockets is the number of CPU sockets in a node, and cores is the number of cores per socket. The command can be a shell script as well. In the shell script/program you can read the local rank from the environment:
RANK=$SLURM_LOCALID
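For example, a worker script launched as the array-job command could use the local rank to pick its own input file. The input.N naming scheme here is a hypothetical illustration, not a convention of the framework:

```shell
#!/bin/sh
# Each array task reads its local rank from Slurm (defaults to 0
# outside the queue) and processes a rank-specific input file.
RANK=${SLURM_LOCALID:-0}
echo "task with local rank $RANK would process input.$RANK"
```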
OpenMP jobs are very similar:
runomp queue:sockets:cores command
Be aware that jobs run in your shell like any other program but are executed by the queue on a compute node. Do not run large, long workloads as interactive jobs!
Job Monitoring
Average node utilization of a job:
jobmon JOBID
Per node utilization:
pcpview -j JOBID
Check the last 3 columns of cpu:
us - user load
sy - system load
id - idle
The user load should be close to the maximum and the other two close to 0. The maximum utilization is 100.
Node utilization chart:
pcpview -c -j JOBID
Maximum utilization is 8 (# of cores per node).
Parallel Modes
The parallel mode is set by the MODE key in the job file. MPI modes have an MPI selector for the corresponding MPI subsystem.
MODE | Description |
---|---|
omp | OpenMP only |
mpi/MPI | MPI-only with the selected MPI subsystem |
mpiomp/MPI | MPI-OMP hybrid with the selected MPI subsystem |
where
MPI | Description |
---|---|
mpt | SGI MPT MPI. [Manual] |
impi | Intel MPI. [Manual] |
ompi | Open MPI [Manual] |
Resource Specification
Three types of parallel mode are supported: MPI-only, OMP-only, and MPI-OMP hybrid. The Shell Framework sets environment variables for OMP and parameters for mpirun according to the following table. The number of OMP threads can be overridden with THRDS. With SGE you can also specify the total number of slots per node with SLTPN. Under ESZR, setting the resource keys to eszr makes the framework use the default ESZR system settings, so you only need to set the MPI mode.
Parallel Mode | # of MPI procs | # of MPI procs per node | # of OMP threads per MPI proc. |
---|---|---|---|
MPI-only (mpi) | NODES × SCKTS × CORES | SCKTS × CORES | 1 |
OMP-only (omp) | -- | -- | SCKTS × CORES |
MPI-OMP hybrid (mpiomp) | NODES × SCKTS | SCKTS | CORES |
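As a sketch of the arithmetic in the table, with hypothetical values NODES=4, SCKTS=2, CORES=6 the three modes work out as follows:

```shell
NODES=4 SCKTS=2 CORES=6
# mpi:    NODES x SCKTS x CORES MPI processes, 1 OMP thread each
echo "mpi:    $(( NODES * SCKTS * CORES )) procs"
# omp:    a single process with SCKTS x CORES OMP threads
echo "omp:    $(( SCKTS * CORES )) threads"
# mpiomp: NODES x SCKTS MPI processes, CORES OMP threads each
echo "mpiomp: $(( NODES * SCKTS )) procs x $CORES threads"
```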
CPU binding
With the BIND key you can bind processes to CPUs. Please refer to the manual of the MPI subsystem. Usually it is enough to set the parameters in the table below.
MPI | MPI-only | MPI-OMP hybrid |
---|---|---|
SGI MPT (mpt) | dplace -s 1 | omplace -s 1 |
Intel MPI (impi) | -binding pin=yes | -binding pin=yes |
Open MPI (ompi) | -bind-to-core -bycore | -- |
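For instance, a job file for an MPI-only SGI MPT run might contain the following fragment; the key names come from this page, the combination is a sketch:

```
MODE=mpi/mpt
BIND="dplace -s 1"
```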
Compiling with Intel Compilers
The Fortran compiler for serial code is ifort; for MPI-parallel code use mpiifort. Load the corresponding compiler and/or parallel-environment module with:
mld MODULE
Parallel environments are mutually exclusive. You can check the compiler by:
which mpiifort
MODULE | Mode | Parallel Environment | Target Systems |
---|---|---|---|
sgi/mpt/2.04 | mpt | SGI MPT | skynet, debrecen, pecs |
intel/mpi/4.0.3.008 | impi | Intel MPI | skynet, szeged, budapest |
MODULE | Compiler | Target Systems | Recommended Options |
---|---|---|---|
intel/2011sp1u2 | Intel 2011 SP 1 Update 2 | szeged, budapest | -O2 -xSSE2 -ip -vec-report0 |
intel/2011sp1u2 | Intel 2011 SP 1 Update 2 | skynet, debrecen, pecs | -O2 -xSSE4.2 -ip -vec-report0 -override_limits |
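A Makefile fragment for skynet using the recommended options might look like this sketch; the program and source file names are placeholders:

```
FC     = mpiifort
FFLAGS = -O2 -xSSE4.2 -ip -vec-report0 -override_limits

prog: prog.f90
	$(FC) $(FFLAGS) -o prog prog.f90
```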
Static Linking Intel 2011
Makefile parameters for some of the most common cases; you can also use the Intel link advisor.
Link parameters for Intel MKL Lapack:
MKL_PATH = $(MKLROOT)/lib/intel64
IFC_PATH = $(INTEL_IFORT_HOME)/lib/intel64
# link flags
LDFLAGS = $(MKL_PATH)/libmkl_lapack95_lp64.a \
  -Wl,--start-group \
  $(MKL_PATH)/libmkl_intel_lp64.a \
  $(MKL_PATH)/libmkl_intel_thread.a \
  $(MKL_PATH)/libmkl_core.a \
  -Wl,--end-group \
  $(IFC_PATH)/libiomp5.a -lpthread
Link parameters for Intel MKL Scalapack with SGI MPT:
MKL_PATH = $(MKLROOT)/lib/intel64
IFC_PATH = $(INTEL_IFORT_HOME)/lib/intel64
# link flags
LDFLAGS = $(MKL_PATH)/libmkl_scalapack_lp64.a \
  $(MKL_PATH)/libmkl_blacs_sgimpt_lp64.a \
  $(MKL_PATH)/libmkl_lapack95_lp64.a \
  -Wl,--start-group \
  $(MKL_PATH)/libmkl_intel_lp64.a \
  $(MKL_PATH)/libmkl_intel_thread.a \
  $(MKL_PATH)/libmkl_core.a \
  -Wl,--end-group \
  $(IFC_PATH)/libiomp5.a -lpthread
Compile and link parameters for Intel MKL FFT:
FFTW_PATH = $(INTEL_MKL_HOME)
FFTW_INC = $(FFTW_PATH)/include/fftw
FFLAGS = -I$(FFTW_INC)
LDFLAGS = $(FFTW_PATH)/lib/intel64/libfftw3xf_intel.a
Profiling with Amplifier
To profile an OMP program, set in the job file:
PROF="amplxe-cl -collect hotspots"
To check the collected data:
amplxe-cl -report hotspots -r r000hs
Static Linking Intel 10.1
Makefile parameters for some of the most common cases.
Link parameters for Intel MKL Lapack:
MKL_PATH = $(INTEL_MKL_HOME)/lib/em64t
IFC_PATH = $(INTEL_IFORT_HOME)/lib
# link flags
LDFLAGS = $(MKL_PATH)/libmkl_lapack95_lp64.a \
  -Wl,--start-group \
  $(MKL_PATH)/libmkl_intel_lp64.a \
  $(MKL_PATH)/libmkl_intel_thread.a \
  $(MKL_PATH)/libmkl_core.a \
  -Wl,--end-group \
  $(IFC_PATH)/libiomp5.a -lpthread
Link parameters for Intel MKL Scalapack with SGI MPT:
MKL_PATH = $(INTEL_MKL_HOME)/lib/em64t
IFC_PATH = $(INTEL_IFORT_HOME)/lib
# link flags
LDFLAGS = $(MKL_PATH)/libmkl_scalapack_lp64.a \
  $(MKL_PATH)/libmkl_blacs_sgimpt_lp64.a \
  $(MKL_PATH)/libmkl_lapack95_lp64.a \
  -Wl,--start-group \
  $(MKL_PATH)/libmkl_intel_lp64.a \
  $(MKL_PATH)/libmkl_intel_thread.a \
  $(MKL_PATH)/libmkl_core.a \
  -Wl,--end-group \
  $(IFC_PATH)/libiomp5.a -lpthread
Compile and link parameters for Intel MKL FFT:
FFTW_PATH = $(INTEL_MKL_HOME)
FFTW_INC = $(FFTW_PATH)/include/fftw
FFLAGS = -I$(FFTW_INC)
LDFLAGS = $(FFTW_PATH)/lib/em64t/libfftw3xf_intel.a
Visualization
Application | Required Modules | Shell Manager | Description |
---|---|---|---|
VMD | cuda/4.1.28 vmd/1.9.1 | vmdmgr | General visualizer |
NSC Systems
Triolith
This system is maintained by NSC.
Specification
Type | HP Proliant SL230s |
# of CPUs / node | 2 |
# of cores / CPU | 8 |
Memory / node | 32/128 GB |
Memory / core | 2/8 GB |
CPU | Intel Xeon E5-2660 @ 2.2 GHz |
Architecture | x86_64 / intel64 / em64t little-endian |
Scheduler | Slurm |
MPI | Open MPI (ompi) |
ESZR | none |
Shell Environment
Precompiled Environment
From Skynet sync up the precompiled environment:
cd /share/nsc/triolith
sshput
The precompiled environment does not contain the program binaries and data libraries. To transfer a PROGRAM:
cd /share/nsc/triolith/pkg
sshtx put triolith PROGRAM
Shell Framework
Login to triolith:
sshin triolith
On triolith:
cd $HOME
git clone git://github.com/hornos/shf3.git
Source and setup the Shell Framework in $HOME/.bash_profile:
source $HOME/shf3/bin/shfrc
# set the prompt
shf3/ps1 TRIOLITH[\\h]
# set framework features
shf3/alias yes
shf3/screen yes
shf3/mc/color yes
function status() {
  sjstat -c
  sinfo
}
Parallel Compressor
Enable the parallel compressor for the framework:
cd $HOME
echo "sys_zip_xg_gz=pigz" > shf3/lib/sys/zip/xg/config.$USER
Module Environment
Set modules in $HOME/.bash_profile:
module use $HOME/local/modulefiles
alias mla="module avail"
alias mls="module list"
alias mld="module load"
module load impi/4.0.3.008
module load sys/triolith
Logout and login again.
Job Setup
Setup the Queue file and edit the parameters:
cd $HOME/shf3/mid/que
cp templates/nsc/triolith .
mcedit triolith
Job template is in $HOME/shf3/mid/que/templates/nsc/triolith.job
NIIF Systems
NIIF systems are maintained by NIIF. There are useful guides about them on their wiki, but mostly in Hungarian.
Debrecen
This system is maintained by NIIF.
Specification
Type | SGI ICE8400EX |
# of CPUs / node | 2 |
# of cores / CPU | 6 |
Memory / node | 47 GB |
Memory / core | 3.9 GB |
CPU | Intel Xeon X5680 @ 3.33 GHz SMT on |
Architecture | x86_64 / intel64 / em64t little-endian |
Scheduler | Slurm |
MPI | SGI MPT (mpt) |
ESZR | local |
Shell Environment
Precompiled Environment
From Skynet sync up the precompiled environment:
cd /share/niif/debrecen
sshput
The precompiled environment does not contain the program binaries and data libraries. To transfer a PROGRAM:
cd /share/niif/debrecen/pkg
sshtx put debrecen PROGRAM
Shell Framework
Login to debrecen:
sshin debrecen
On debrecen:
cd $HOME
git clone git://github.com/hornos/shf3.git
Source and setup the Shell Framework in $HOME/.profile:
source $HOME/shf3/bin/shfrc
# set the prompt
shf3/ps1 DEBRECEN[\\h]
# set framework features
shf3/alias yes
shf3/screen yes
shf3/mc/color yes
# screen workaround
if shf3/is/screen ; then
  source "/etc/profile.d/modules.sh"
fi
# tab complete
source $HOME/shf3/bin/complete
Parallel Compressor
Enable the parallel compressor for the framework:
cd $HOME
echo "sys_zip_xg_gz=pigz" > shf3/lib/sys/zip/xg/config.$USER
Module Environment
ESZR is our unified computing environment. Enable the ESZR system modules in $HOME/.profile:
# reset
module purge
module use ${HOME}/site/eszr/mod/common
module load eszr/local
module load eszr/env/local
module load eszr/sys/niif/debrecen
module load eszr/sys/niif/debrecen.mpt
module use ${HOME}/site/eszr/mod/local
module load sgi/2011
source ${ESZR_ROOT}/env/alias
Logout and login again.
Scheduler
The job scheduler is SGE.
General information about the queues:
qstat -g c
Queue | Allowed Groups | Purpose |
---|---|---|
test.q | ALL | Test queue (2 nodes) |
debrecen.q | ALL | Production |
General information on jobs:
qstat -u "*"
Job Setup
Setup the Queue file and edit the parameters:
cd $HOME/shf3/mid/que
cp templates/niif/debrecen .
mcedit debrecen
Job template is in $HOME/shf3/mid/que/templates/niif/debrecen.job
Job Monitoring
Average node utilization of a job:
jobmon JOBID
Per node utilization:
pcpview -j JOBID
Check the last 3 columns of cpu:
us - user load
sy - system load
id - idle
The user load should be close to the maximum and the other two close to 0. Node utilization chart:
pcpview -c -j JOBID
Maximum utilization is 50% since SMT is enabled; in the chart it is 12 (# of cores per node).
Special Options
If you need to allocate a full node but want to start an arbitrary number of MPI processes, set in the job file:
SLTPN=12
which will specify the total number of SGE slots per node; the total number of slots will be NODES*SLTPN.
Pécs
The system is maintained by NIIF.
Specification
Type | SGI UV 1000 |
# of CPUs / node | 2 |
# of cores / CPU | 6 |
Memory | 6 TB |
CPU | Intel Xeon X7542 @ 2.66 GHz SMT off |
Architecture | x86_64 / intel64 / em64t little-endian |
Scheduler | Slurm |
MPI | SGI MPT (mpt) |
ESZR | local |
Shell Environment
Precompiled Environment
From Skynet sync up the precompiled environment:
cd /share/niif/pecs
sshput
The precompiled environment does not contain the program binaries and data libraries. To transfer a PROGRAM:
cd /share/niif/pecs/pkg
sshtx put pecs PROGRAM
Shell Framework
Login to pecs:
sshin pecs
On pecs:
cd $HOME
git clone git://github.com/hornos/shf3.git
Source and setup the Shell Framework in $HOME/.profile:
source $HOME/shf3/bin/shfrc
# set the prompt
shf3/ps1 PECS[\\h]
# set framework features
shf3/alias yes
shf3/screen yes
shf3/mc/color yes
# screen workaround
if shf3/is/screen ; then
  source "/etc/profile.d/modules.sh"
fi
# tab complete
source $HOME/shf3/bin/complete
Parallel Compressor
Enable the parallel compressor for the framework:
cd $HOME
echo "sys_zip_xg_gz=pigz" > shf3/lib/sys/zip/xg/config.$USER
Module Environment
ESZR is our unified computing environment. Enable the ESZR system modules in $HOME/.profile:
# reset
module purge
module use ${HOME}/site/eszr/mod/common
module load eszr/local
module load eszr/env/local
module load eszr/sys/niif/pecs
module load eszr/sys/niif/pecs.mpt
module use ${HOME}/site/eszr/mod/local
module load sgi/2011
source ${ESZR_ROOT}/env/alias
Logout and login again.
Scheduler
The job scheduler is SGE.
General information about the queues:
qstat -g c
Queue | Allowed Groups | Purpose |
---|---|---|
test.q | ALL | Test queue |
pecs.q | ALL | Production |
General information on jobs:
qstat -u "*"
Job Setup
Setup the Queue file and edit the parameters:
cd $HOME/shf3/mid/que
cp templates/niif/pecs .
mcedit pecs
Job template is in $HOME/shf3/mid/que/templates/niif/pecs.job
Job Monitoring
Currently, monitoring is possible only by chart. You also have to enable the numainfo script in the queue file. In the job's submit directory:
pcpview -c -j StdOut
Maximum utilization in the chart is 6 (# of cores per node).
Special Options
The UV is a ccNUMA SMP machine, so you allocate CPU sockets and cores on a single node. It is mandatory to set in the job file:
NODES=1
The total number of SGE slots will be SCKTS*CORES.
Szeged
The system is maintained by NIIF.
Specification
Type | HP CP4000BL |
# of CPUs / node | 4 |
# of cores / CPU | 12 |
Memory / node | 132 GB |
Memory / core | 2.75 GB |
CPU | AMD Opteron 6174 @ 2.2GHz |
Architecture | x86_64 / intel64 / em64t little-endian |
Scheduler | Slurm |
MPI | Intel (impi) |
ESZR | local |
Shell Environment
Precompiled Environment
From Skynet sync up the precompiled environment:
cd /share/niif/szeged
sshput
The precompiled environment does not contain the program binaries and data libraries. To transfer a PROGRAM:
cd /share/niif/szeged/pkg
sshtx put szeged PROGRAM
Shell Framework
Login to szeged:
sshin szeged
On szeged:
cd $HOME
git clone git://github.com/hornos/shf3.git
Source and setup the Shell Framework in $HOME/.bash_profile:
source $HOME/shf3/bin/shfrc
# set the prompt
shf3/ps1 SZEGED[\\h]
# set framework features
shf3/alias yes
shf3/screen yes
shf3/mc/color yes
# screen workaround
if shf3/is/screen ; then
  source "/etc/profile.d/modules.sh"
fi
# tab complete
source $HOME/shf3/bin/complete
Parallel Compressor
Enable the parallel compressor for the framework:
cd $HOME
echo "sys_zip_xg_gz=pigz" > shf3/lib/sys/zip/xg/config.$USER
Module Environment
ESZR is our unified computing environment. Enable the ESZR system modules in $HOME/.bash_profile:
# reset
module purge
module use ${HOME}/site/eszr/mod/common
module load eszr/local
module load eszr/env/local
module load eszr/sys/niif/szeged
module use ${HOME}/site/eszr/mod/local
source ${ESZR_ROOT}/env/alias
Scheduler
The job scheduler is SGE.
General information about the queues:
qstat -g c
Queue | Allowed Groups | Purpose |
---|---|---|
test.q | ALL | Test queue (2 nodes) |
szeged.q | ALL | Production |
General information on jobs:
qstat -u "*"
Job Setup
Setup the Queue file and edit the parameters:
cd $HOME/shf3/mid/que
cp templates/niif/szeged .
mcedit szeged
Job template is in $HOME/shf3/mid/que/templates/niif/szeged.job
Special Options
If you need to allocate a full node but want to start an arbitrary number of MPI processes, set in the job file:
SLTPN=48
which will specify the total number of SGE slots per node; the total number of slots will be NODES*SLTPN. You can use the following combinations in MPI-OMP mode if you run out of memory. Set hybrid mode in the job file:
MODE=mpiomp/impi
and the socket/core numbers according to your needs.
SCKTS (# of MPI proces / node) | CORES (# of OMP threads / MPI proc) | Memory / MPI proc |
---|---|---|
2 | 24 | 66 GB |
4 | 12 | 33 GB |
8 | 6 | 16.5 GB |
12 | 4 | 8.3 GB |
24 | 2 | 4.3 GB |
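For example, following the table's 4 x 12 row, a job file aiming at roughly 33 GB per MPI process on szeged would contain this sketch of a fragment:

```
MODE=mpiomp/impi
SCKTS=4
CORES=12
```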
Budapest
The system is maintained by NIIF.
Specification
Type | HP CP4000BL |
# of CPUs / node | 2 |
# of cores / CPU | 12 |
Memory / node | 66 GB |
Memory / core | 2.75 GB |
CPU | AMD Opteron 6174 @ 2.2GHz |
Architecture | x86_64 / intel64 / em64t little-endian |
Scheduler | Slurm |
MPI | Intel (impi) |
ESZR | local |
Shell Environment
Precompiled Environment
From Skynet sync up the precompiled environment:
cd /share/niif/budapest
sshput
The precompiled environment does not contain the program binaries and data libraries. To transfer a PROGRAM:
cd /share/niif/budapest/pkg
sshtx put budapest PROGRAM
Shell Framework
Login to budapest:
sshin budapest
On budapest:
cd $HOME
git clone git://github.com/hornos/shf3.git
Source and setup the Shell Framework in $HOME/.bash_profile:
source $HOME/shf3/bin/shfrc
# set the prompt
shf3/ps1 BUDAPEST[\\h]
# set framework features
shf3/alias yes
shf3/screen yes
shf3/mc/color yes
# screen workaround
if shf3/is/screen ; then
  source "/etc/profile.d/modules.sh"
fi
# tab complete
source $HOME/shf3/bin/complete
Parallel Compressor
Enable the parallel compressor for the framework:
cd $HOME
echo "sys_zip_xg_gz=pigz" > shf3/lib/sys/zip/xg/config.$USER
Module Environment
ESZR is our unified computing environment. Enable the ESZR system modules in $HOME/.bash_profile:
# reset
module purge
module use ${HOME}/site/eszr/mod/common
module load eszr/local
module load eszr/env/local
module load eszr/sys/niif/budapest
module use ${HOME}/site/eszr/mod/local
source ${ESZR_ROOT}/env/alias
Scheduler
The job scheduler is SGE.
General information about the queues:
qstat -g c
Queue | Allowed Groups | Purpose |
---|---|---|
test.q | ALL | Test queue (2 nodes) |
budapest.q | ALL | Production |
General information on jobs:
qstat -u "*"
Job Setup
Setup the Queue file and edit the parameters:
cd $HOME/shf3/mid/que
cp templates/niif/budapest .
mcedit budapest
Job template is in $HOME/shf3/mid/que/templates/niif/budapest.job
Special Options
If you need to allocate a full node but want to start an arbitrary number of MPI processes, set in the job file:
SLTPN=24
which will specify the total number of SGE slots per node; the total number of slots will be NODES*SLTPN. You can use the following combinations in MPI-OMP mode if you run out of memory. Set hybrid mode in the job file:
MODE=mpiomp/impi
and the socket/core numbers according to your needs.
SCKTS (# of MPI proc / node) | CORES (# of OMP threads / MPI proc) | Memory / MPI proc |
---|---|---|
2 | 12 | 33 GB |
4 | 6 | 16.5 GB |
8 | 3 | 8.3 GB |