Parallel computing on HPC clusters

工欲善其事，必先利其器

Parallel computing with Julia

Julia has become a popular programming language. In this article, I will use multi-threading both on multi-core PCs and HPC clusters.

First, I give an example of parallel computing with for loop:

# Julia code
using CSV, DataFrames, Base.Threads, BenchmarkTools

M = 50
n = 100
d = 3
sig = 1

error = zeros(M)
@btime @threads for i in 1:M
    error[i] = simulation(n, d, sig)
end

df = DataFrame(error, :auto)

CSV.write("measure$(n)_$(d)_$(sig).csv", df)

In this example, we implement 50 Monte Calo simulations with function simulation() in parallel. @btime a benmark to print out the runtime of the for loop.

On a multi-core PC, we can run the Julia file code.jl with 8 threads as follows

julia -t 8 code.jl

On a HPC cluster, we can run Julia file with 8 threads as follows

#!/bin/bash
#SBATCH -p batch
#SBATCH -t 00:10:00
#SBATCH -N 1       
#SBATCH -n 1     
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=4G
#SBATCH --output=code_%A.out
#SBATCH--mail-type=FAIL
#SBATCH--mail-type=END
#SBATCH--mail-user=xxxx@gmail.com

echo "SLURM_JOBID: " $SLURM_JOBID
echo "SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
echo "SLURM_ARRAY_JOB_ID: " $SLURM_ARRAY_JOB_ID

module load julia
# Run Julia
srun julia -t $SLURM_CPUS_PER_TASK code.jl

In this slurm file, please modify xxxx@gmail.com to your email address where you can receive a notification when the computation is done.

Parallel computing with Python

Python is also a very popular programming language. In the example, I will use multi-processing on HPC clusters. On multi-core PCs, we can simply set cpus = 4 in the Python file.

First, I give an example of parallel computing with map function:

import numpy as np
import pandas as pd
import random
import os
from multiprocessing import Pool

if 'SLURM_CPUS_PER_TASK' in os.environ:
    cpus = int(os.environ['SLURM_CPUS_PER_TASK'])
    print("Dectected %s CPUs through slurm"%cpus)
else:
    # None means that it will auto-detect based on os.cpu_count()
    cpus = None
    print("Running on default number of CPUs (default: all=%s)"%os.cpu_count())

def process(arg):

    mse = simulation(n, d, sig)
    return mse

if __name__ == '__main__':
    random.seed(0)
    n=100
    d=3
    sig = 1

    with Pool(cpus) as p:
        df = pd.DataFrame(p.map(process, range(50)))
        df.to_csv('mse' + '{0}_{1}_{2}.csv'.format(n, d, sig))

In this example, we implement 50 Monte Calo simulations with function simulation() in parallel.

On a HPC cluster, we can run Python file as follows

#!/bin/bash
#SBATCH -p batch
#SBATCH -t 00:10:00
#SBATCH -N 1       
#SBATCH -n 1     
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=4G
#SBATCH --output=code_%A.out
#SBATCH--mail-type=FAIL
#SBATCH--mail-type=END
#SBATCH--mail-user=xxxx@gmail.com

echo "SLURM_JOBID: " $SLURM_JOBID
echo "SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
echo "SLURM_ARRAY_JOB_ID: " $SLURM_ARRAY_JOB_ID

module purge
module load anaconda
# Run Python
srun python3 code.py