Comtrade workshop, 15.3.2017

This is an ARC workshop. Grid jobs will be tested on the Arnes cluster Jost using the ARC middleware, which is supported on most clusters in Sling.

Install the ARC client

The ARC client is available for most Linux distributions and for macOS. On Windows it works only partially, and only when the HTTPS protocol is supported. See this howto to install the client on your machine. Then follow the instructions here.

Install your certificate

Request your certificate on the SiGNET webpage.

Install the certificate on your computer by using this script. To install the certificate manually, follow the instructions below:

# mkdir ~/.arc
# openssl pkcs12 -in certificate.p12 -clcerts -nokeys -out usercert.pem
# openssl pkcs12 -in certificate.p12 -nocerts -out userkey.pem
# chmod 400 userkey.pem
# chmod 644 usercert.pem
# mv user*.pem ~/.arc
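The permissions set by the commands above can be verified with a short script. The following Python sketch is only an illustration and is not part of the ARC client; the function name check_credentials is hypothetical, and the script assumes the two files live in ~/.arc as in the manual steps.

```python
import os
import stat


def check_credentials(arc_dir):
    """Return a list of problems found with the certificate files in arc_dir."""
    # Expected modes follow the chmod commands in the manual installation steps.
    expected = {"userkey.pem": 0o400, "usercert.pem": 0o644}
    problems = []
    for name, mode in expected.items():
        path = os.path.join(arc_dir, name)
        if not os.path.isfile(path):
            problems.append("%s is missing" % name)
            continue
        actual = stat.S_IMODE(os.stat(path).st_mode)
        if actual != mode:
            problems.append("%s has mode %o, expected %o" % (name, actual, mode))
    return problems


if __name__ == "__main__":
    for problem in check_credentials(os.path.expanduser("~/.arc")):
        print(problem)
```

An empty result means both files exist with the expected permissions.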

Virtual organization and authorization

To access the grid, each user has to be a member of a virtual organization (VO). A virtual organization assigns roles to users and defines policy. Different clusters support different VOs. In Sling, all clusters support the national VO, called

If you already have your own SiGNET certificate, join the VO on this webpage:

User interface

You can use the ARC client on your own computer, or you can use the client installed on a virtual machine provided by Arnes for this workshop. The staff will assign you a username and password.

Connect to the virtual machine using your credentials:

  ssh demo$

Save your certificate file and key to the ~/.arc/ folder, as described at the beginning of this workshop.


ARC client settings

In order to make the submission process as easy as possible, settings can be saved in  ~/.arc/client.conf. To use ARC at Arnes, use the following configuration:

  vi .arc/client.conf



If you want to use HTTPS instead of the GridFTP protocol, use these settings:


You can check other possible settings here.

To specify the protocol while submitting the job, use the -S switch:

#to use GRIDFTP protocol
arcsub -c -S org.nordugrid.gridftpjob test.xrsl

#to use HTTPS protocol
arcsub -c -S org.ogf.glue.emies.activitycreation test.xrsl
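When jobs are submitted from a script, the choice of protocol can be kept in one place. The following Python sketch only builds the arcsub argument list and does not run anything; the function name arcsub_command and the host cluster.example.org are placeholders introduced for illustration, and the interface names are the two used with the -S switch above.

```python
import shlex

# Interface names as used with the -S switch in this workshop.
INTERFACES = {
    "gridftp": "org.nordugrid.gridftpjob",
    "https": "org.ogf.glue.emies.activitycreation",
}


def arcsub_command(cluster, xrsl_file, protocol="gridftp", debug=False):
    """Build the arcsub argument list for the given cluster and protocol."""
    args = ["arcsub", "-c", cluster, "-S", INTERFACES[protocol], xrsl_file]
    if debug:
        args += ["-d", "DEBUG"]
    return args


if __name__ == "__main__":
    # "cluster.example.org" is a placeholder, not a real cluster name.
    command = arcsub_command("cluster.example.org", "test.xrsl", protocol="https")
    print(" ".join(shlex.quote(part) for part in command))
```

The returned list can be passed to subprocess.run on a machine where the ARC client is installed.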

To see the supported protocols on the cluster, use arcinfo:

arcinfo -c settings

mkdir -p ~/.arc/vomses/
cat <<end > ~/.arc/vomses/
"" "" "15001" \
mkdir -p ~/.arc/vomsdir
cat <<end > ~/.arc/vomsdir/

If your certificate was installed successfully, you are ready to go.

Useful commands

arcproxy #create proxy
arcproxy -S
arcsub  #send grid job and input files to the cluster
arcsub -c test.xrsl
arcstat #check the status of your job
arcstat <JOBID>
arcstat -a #all
arccat #show the current output of a job (stdout, stderr, gm.log)
arcget #transfer results
arcget -a #all
arcget <JOBID> #single job by id
arcget -c #all jobs running on Jost
arcget -S FINISHED #all finished jobs
arckill #cancel the job
arckill <JOBID>
arcls  #show directories on storage
arcrenew #renew your proxy (while still active)
arcsync #sync your joblist with the list on the server
arccp #copy files to the external storage
arcrm #remove a file from storage

Some useful examples:

$ arcproxy -S #create proxy for
$ arcproxy -I #check proxy information
$ arcinfo #check cluster status
$ arcsub -c test.xrsl #send a test job to the cluster
$ arcsub -c test.xrsl -d DEBUG #submit in debug mode
$ arcstat JOBID #check job status (or arcstat -a for all jobs)
$ arccat JOBID #show current job output (or arccat -a for all jobs)
$ arcget JOBID #transfer results (or arcget -a for all jobs)


Simple grid job submission

Before running your grid jobs on the cluster, it is useful to get some general information about the nodes. This test job prints the environment variables on the worker node where your job will be executed.


  &
  (executable = /usr/bin/env)
  (jobname = "test")
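An xRSL job description is a list of (attribute = value) pairs joined by a leading "&". As a minimal sketch of that structure, the following Python function renders such a description from a list of pairs. The function name render_xrsl is hypothetical, and the stdout attribute is an assumption added here so that the output of /usr/bin/env lands in a file; it is not part of the description above.

```python
def render_xrsl(attributes):
    """Render (name, value) pairs as an xRSL job description string."""
    lines = ["&"]  # "&" joins all attributes that follow
    for name, value in attributes:
        lines.append('(%s = "%s")' % (name, value))
    return "\n".join(lines)


job = render_xrsl([
    ("executable", "/usr/bin/env"),
    ("jobname", "test"),
    ("stdout", "test.log"),  # assumption: not in the original description
])
print(job)
```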


Send the test job to the cluster:

arcsub -c -S org.nordugrid.gridftpjob test.xrsl

The same can be achieved with the arctest command:

arctest -c -J 2

The arctest command is used for basic testing of the ARC client and server.

  • Test job 1: calculates prime numbers for a number of minutes (-r 5) and writes the result to stderr. The source is downloaded from an HTTP/FTP server and the program is compiled before running.
  • Test job 2: lists all environment variables on the worker node
  • Test job 3: copies a file from HTTP into a local file
  • arctest --certificate prints basic information about your certificate

Job with input files

First, let’s create two input files, file1 and file2:

echo "This is file 1" >> file1
echo "This is file 2" >> file2

Then create a bash script using those two input files:

cat file1 file2 > file3

Now, let’s write a description file (file.xrsl) for this job:

(inputFiles=("file1" "")("file2" "")("" ""))
(outputFiles=("file3" ""))
(walltime="5 minutes")
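Each entry in inputFiles and outputFiles is a pair: the file name on the cluster and a location. An empty location means that an input file is uploaded from the submission directory and an output file is kept for download with arcget. The following Python sketch formats such lists; the helper name file_list is hypothetical and the sketch covers only the two named files from this exercise.

```python
def file_list(attribute, pairs):
    """Format an xRSL file attribute such as inputFiles or outputFiles."""
    entries = "".join('("%s" "%s")' % (name, location) for name, location in pairs)
    return "(%s=%s)" % (attribute, entries)


print(file_list("inputFiles", [("file1", ""), ("file2", "")]))
print(file_list("outputFiles", [("file3", "")]))
```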

Send the job to the cluster, using debug mode:

arcsub -c -S org.nordugrid.gridftpjob file.xrsl -d DEBUG

Job with software

This is an example of a simple program that will be sent to the cluster with the job.

Job description sum.xrsl

("" "")
("" "")
(outputfiles=("sum.out" ""))
(runtimeenvironment = "APPS/COMTRADE/DEFAULT")


# Parenthesized print works in both Python 2 and Python 3
total = 0
print("Display numbers: ")
for x in ["1", "1050", "164999"]:
    print(x)
print("Calculate the numbers ")
for y in [1, 1050, 164999]:
    total = total + y
print(total)

Execution script

python > sum.out


Programs are installed on the cluster on shared storage and can be used in your job by specifying the runtime environment (RTE). To see the available runtime environments on the cluster, use this command:

ldapsearch -x -h -p 2135 -b 'Mds-Vo-name=local,o=grid' \
| grep nordugrid-cluster-runtimeenvironment

They are also displayed on the grid monitor, see and click on the cluster.
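The raw ldapsearch output can be turned into a clean list with a few lines of Python. This is a sketch under stated assumptions: the function name parse_runtime_environments is hypothetical, and the sample text is constructed by hand from the RTE names used in this workshop, not copied from the cluster.

```python
PREFIX = "nordugrid-cluster-runtimeenvironment:"


def parse_runtime_environments(ldap_output):
    """Extract RTE names from ldapsearch output, sorted and without duplicates."""
    names = set()
    for line in ldap_output.splitlines():
        line = line.strip()
        if line.startswith(PREFIX):
            names.add(line[len(PREFIX):].strip())
    return sorted(names)


# Hand-written sample; the RTE names are the ones used in this workshop.
sample = """\
nordugrid-cluster-runtimeenvironment: APPS/COMTRADE/OPENMPI-2.0.2
nordugrid-cluster-runtimeenvironment: APPS/COMTRADE/DEFAULT
nordugrid-cluster-runtimeenvironment: APPS/COMTRADE/GPU
"""
print(parse_runtime_environments(sample))
```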

We will test jobs with RTE requirements in the next two exercises, when specifying the MPI environment for the job.

Remember, you can also specify multiple runtime environments in the same job description.

Parallel job with OPENMP

First we will run a hello-world OpenMP job on a single server. We will use 8 threads to run the program.

First we need the program hello-omp.c:

#include <omp.h>
#include <stdio.h>
#include <stdlib.h>

int main (int argc, char *argv[])
{
int nthreads, tid;

/* Fork a team of threads giving them their own copies of variables */
#pragma omp parallel private(nthreads, tid)
  {
  /* Obtain thread number */
  tid = omp_get_thread_num();
  printf("Hello World from thread = %d\n", tid);

  /* Only master thread does this */
  if (tid == 0)
    {
    nthreads = omp_get_num_threads();
    printf("Number of threads = %d\n", nthreads);
    }

  }  /* All threads join master thread and disband */

return 0;
}


Then we need an execution script:

mpicc -fopenmp hello-omp.c -o hellomp
mpirun -np 1 hellomp > hello-omp.out

Now we need a description file hello-omp.xrsl:

(count = 8)
(countpernode = 8)
("" "")
("hello-omp.c" "hello-omp.c")
(outputfiles=("hello-omp.out" ""))
(runtimeenvironment = "APPS/COMTRADE/OPENMPI-2.0.2")

This is the result of the job:

Hello World from thread = 3
Hello World from thread = 5
Hello World from thread = 7
Hello World from thread = 6
Hello World from thread = 1
Hello World from thread = 2
Hello World from thread = 4
Hello World from thread = 0
Number of threads = 8


Parallel job with MPI

Job description hellompi.xrsl:

(count = 4)
(jobname = "hellompi")
(inputfiles =
  ("" "")
  ("hellompi.c" ""))
(outputfiles =
  ("hellompi.out" ""))
(executable = "")
(stdout = "hellompi.log")
(join = yes)
(walltime = "15 minutes")
(gmlog = log)
(memory = 2000)
(runtimeenvironment = "APPS/COMTRADE/OPENMPI-2.0.2")

Program hellompi.c:

/* C Example */
#include <stdio.h>
#include <mpi.h> 

int main (int argc, char *argv[])
{
  int rank, size;

  MPI_Init (&argc, &argv);      /* starts MPI */
  MPI_Comm_rank (MPI_COMM_WORLD, &rank);        /* get current process id */
  MPI_Comm_size (MPI_COMM_WORLD, &size);        /* get number of processes */
  printf( "Hello world from process %d of %d\n", rank, size );
  MPI_Finalize();               /* shuts down MPI */
  return 0;
}

Execution script

echo "Compiling example"
mpicc -o hello hellompi.c
echo "Done."
echo "Running example:"
mpiexec -np 1 ${PWD}/hello > hellompi.out
echo "Done."

AD1: There is more. Try the same job using the InfiniBand network and specify that the program should use 2 cores per node.

Advanced exercises

Massive job submission and Arcrunner

The first example of massive job submission is as follows:

We need an xRSL template, which will be included in the submission script:


import os, sys

jobDescription = '''&(
(cpuTime='5 minutes')
(count = 1)
(inputFiles=('' ''))

This is the python script to submit the jobs:


import os, sys

jobDescription = '''&(
(cpuTime='5 minutes')
(inputFiles=('' ''))

totalJobs = 4

for i in range(totalJobs):
	# Removing newlines from jobDescription and convert
	# to a string for use with arcsub
	jobDescriptionString = "".join(jobDescription.split("\n"))
	os.system('arcsub -c -S org.nordugrid.gridftpjob\
 -o joblist.xml --jobdescrstring="%s"' \
% (jobDescriptionString % i))

The job name is adapted for each job: job0000 to job000(n-1).
totalJobs is set to 4, so 4 grid jobs will be sent.

To run the command, we use a for loop:

for i in range(totalJobs):

The xRSL is converted to a single-line string, which is passed to the arcsub command:

jobDescriptionString = "".join(jobDescription.split("\n"))

We can save the job IDs to the job.list file and then monitor the job status with: arcstat -j job.list

Submit the jobs to the system:

os.system('arcsub -c --jobdescrstring="%s"' \
% (jobDescriptionString % i))

The % operator inserts the job index i into the job name.
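The two string operations used by the submission script can be tried without submitting anything. The following Python sketch is an illustration only: the template is hypothetical (the executable name run.sh and the job name pattern job%04d are assumptions chosen to match the naming described above), and the function job_strings is introduced here for the example.

```python
# Hypothetical template: the executable and job name are illustrative only.
job_description = '''&(executable="run.sh")
(cpuTime="5 minutes")
(jobName="job%04d")'''

total_jobs = 4


def job_strings(template, count):
    """Return one single-line xRSL string per job index."""
    # Remove newlines so the description can be passed as one argument.
    single_line = "".join(template.split("\n"))
    # "%04d" in the template is replaced by the zero-padded job index.
    return [single_line % i for i in range(count)]


for xrsl in job_strings(job_description, total_jobs):
    print(xrsl)
```

Each printed line is what would be handed to arcsub through --jobdescrstring.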

echo "This is a massive job submission test."

We can submit the jobs:


Check the status:

arcstat -i joblist.xml

Download the results:

arcget -i joblist.xml


CSC in Finland wrote a simple submission script, called Arcrunner, that handles massive job submission, monitors the jobs, and retrieves the results when the jobs are finished. It can be downloaded from here.

First unzip the program:

cd arcrunner/bin

Change the jobmanager path to the location where you extracted the arcrunner file:

set jobmanagerpath=("~/arcrunner")

Then add it to your PATH:

 export PATH=~/arcrunner/bin:$PATH

The minimum input to use it is:

 arcrunner -xrsl job_descriptionfile.xrsl

These are the options:

arcrunner options:
Option             Description
-xrsl file_name    The common xrsl file name that defines the jobs.
-R file_name       Text file containing the names of the clusters to be used.
-W integer         Maximum number of jobs in the grid waiting to run.
-Q integer         The max time a job stays in a queue before being resubmitted.
-S integer         The max time a job stays in submitted state before being resubmitted.
-J integer         Maximum number of simultaneous jobs running in the grid.


Sending a helloworld job using CUDA.

First we need the program:

// This is the REAL "hello world" for CUDA!
// It takes the string "Hello ", prints it, then passes it to CUDA with an array
// of offsets. Then the offsets are added in parallel to produce the string "World!"
// By Ingemar Ragnemalm 2010
#include <stdio.h>

const int N = 7;
const int blocksize = 7;

__global__
void hello(char *a, int *b)
{
 a[threadIdx.x] += b[threadIdx.x];
}

int main()
{
 char a[N] = "Hello ";
 int b[N] = {15, 10, 6, 0, -11, 1, 0};

 char *ad;
 int *bd;
 const int csize = N*sizeof(char);
 const int isize = N*sizeof(int);

 printf("%s", a);

 cudaMalloc( (void**)&ad, csize );
 cudaMalloc( (void**)&bd, isize );
 cudaMemcpy( ad, a, csize, cudaMemcpyHostToDevice );
 cudaMemcpy( bd, b, isize, cudaMemcpyHostToDevice );

 dim3 dimBlock( blocksize, 1 );
 dim3 dimGrid( 1, 1 );
 hello<<<dimGrid, dimBlock>>>(ad, bd);
 cudaMemcpy( a, ad, csize, cudaMemcpyDeviceToHost );
 cudaFree( ad );
 cudaFree( bd );

 printf("%s\n", a);
 return 0;
}
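To see why the kernel turns "Hello " into "World!", the same arithmetic can be done on the CPU. The following Python sketch is only a sequential illustration of what each GPU thread computes for its own index; the function name add_offsets is introduced here for the example.

```python
def add_offsets(text, offsets):
    """Add each offset to the character code at the same index."""
    return "".join(chr(ord(ch) + off) for ch, off in zip(text, offsets))


a = "Hello \0"                   # seven bytes, like char a[N] in the C code
b = [15, 10, 6, 0, -11, 1, 0]    # same offsets as int b[N]
result = add_offsets(a, b)
print(result.rstrip("\0"))       # prints World!
```

For example, 'H' has code 72 and 72 + 15 = 87, which is 'W'. On the GPU the seven additions run in parallel, one per thread.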

We need a script to run the program:

nvcc -o helloworld

Now we prepare a job description file cuda.xrsl:

(jobname = "hellocuda")
(inputfiles =
  ("" "")
  ("" ""))
(outputfiles =
  ("hellocuda.out" ""))
(executable = "")
(stdout = "hellocuda.log")
(join = yes)
(walltime = "15 minutes")
(gmlog = log)
(memory = 2000)
(runtimeenvironment = "APPS/COMTRADE/GPU")


Grid job in Singularity container

RTEs on the grid cover many scientific use cases, but they are limited. We have implemented lightweight virtualization on the cluster so that other operating systems are also supported and grid users have more flexibility.

We will run a test job on the cluster in a Singularity container. Let’s create a simple bash script to check the environment in the container:

cat /etc/lsb-release

And now continue with your job description:

(inputFiles=("" ""))
(walltime="5 minutes")

Send the job to the cluster and see the results.
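The job output is the content of /etc/lsb-release from inside the container, a list of KEY=value lines. As a small sketch of how the result could be inspected after arcget, the following Python function parses such text into a dictionary. The function name parse_lsb_release is hypothetical, and the sample text is made up for illustration; it is not output from the workshop cluster.

```python
def parse_lsb_release(text):
    """Parse KEY=value lines, stripping optional double quotes."""
    info = {}
    for line in text.splitlines():
        if "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        info[key.strip()] = value.strip().strip('"')
    return info


# Made-up sample output, not taken from the workshop cluster.
sample = 'DISTRIB_ID=Ubuntu\nDISTRIB_RELEASE=16.04\nDISTRIB_DESCRIPTION="Ubuntu 16.04 LTS"\n'
print(parse_lsb_release(sample)["DISTRIB_ID"])
```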