Logging into apcssh without a password
You already have an APC account with a login name, which I shall call APClogin. I assume you have already successfully logged into your APC account using your password. Write down your password somewhere, in case you need it.
Why would you need to avoid entering a password? Typing in your password each time you log in is tedious. You may also need to call ssh from within a shell script.
If you are using Mac OS X, you can do all of the steps below in Terminal. If you are using Windows, you need an ssh client such as PuTTY.
LocalLogin stands for the login name on your local machine.
Here’s how to do it.
1. Generate the authentication keys
Type the following in your terminal window.
$ ssh-keygen -t rsa
You will get a message saying:
Generating public/private rsa key pair.
Enter file in which to save the key (/Users/LocalLogin/.ssh/id_rsa):
If you wish to change the default location, go ahead and specify a file path. Better to keep it simple, and just press Enter.
You will get this message asking for a password (“passphrase”). Do not enter one. Just press Enter, twice.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
If you did everything properly you will get a message giving the file path to the keys, and the key fingerprint:
Your identification has been saved in /Users/LocalLogin/.ssh/id_rsa.
Your public key has been saved in /Users/LocalLogin/.ssh/id_rsa.pub.
The key fingerprint is:
The key's randomart image is:
| . |
| . . O .|
| . o o O + |
| S o B * =|
| ... . + o = = |
| Ho. . o = . .|
|o=+O.o .. o + o |
|o=O+*. ..+ .|
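If you need to do this step from a script, the same key can be generated without the interactive prompts. This is only a sketch: generate_key_if_missing is a hypothetical helper name, and it assumes you want the default file name and an empty passphrase, as above. It refuses to touch a key that already exists.

```shell
# generate_key_if_missing is a hypothetical helper, not part of OpenSSH.
generate_key_if_missing() {
    key_file="${1:-$HOME/.ssh/id_rsa}"   # default path suggested by ssh-keygen
    if [ -f "$key_file" ]; then
        echo "Key already exists at $key_file, leaving it untouched"
        return 0
    fi
    mkdir -p "$(dirname "$key_file")"
    # -N "" sets an empty passphrase, -f names the output file, -q silences the banner.
    ssh-keygen -q -t rsa -N "" -f "$key_file"
}

# Usage:
# generate_key_if_missing
```

The public key ends up next to the private one, with .pub appended to the file name.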
2. Create a .ssh directory on apcssh
Next, you need to create a .ssh directory on apcssh.in2p3.fr by typing:
$ ssh APClogin@apcssh.in2p3.fr mkdir -p .ssh
You will be asked for your password (that is why you need to have it written down somewhere). Type it in.
3. Append your local public key to the authorised keys on apcssh
Enter the line below. You will then be asked for your password, which you need to enter.
$ cat ~/.ssh/id_rsa.pub | ssh APClogin@apcssh.in2p3.fr 'cat >> .ssh/authorized_keys'
Now you should be able to log into apcssh.in2p3.fr using the usual ssh command without entering a password.
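A quick way to check that the key is really being used is to forbid ssh from asking for a password at all. The sketch below does this with the BatchMode option; check_passwordless_login is a hypothetical helper name, and the 10 second timeout is an arbitrary choice.

```shell
# check_passwordless_login is a hypothetical helper, not part of OpenSSH.
check_passwordless_login() {
    target="$1"   # e.g. APClogin@apcssh.in2p3.fr
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if ssh -o BatchMode=yes -o ConnectTimeout=10 "$target" true; then
        echo "key-based login to $target works"
    else
        echo "key-based login to $target failed" >&2
        return 1
    fi
}

# Usage:
# check_passwordless_login APClogin@apcssh.in2p3.fr
```

The return status makes it usable inside a shell script, which is one of the reasons for setting up the keys in the first place.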
Doing the same for the APC cluster
If the above works, you can log into apcclm following the same steps, except that you need to log into apcssh first.
1. Log into apcssh (which you can now do without a password)
2. Generate the authentication keys
3. Create a .ssh directory on apcclm by typing
$ ssh APClogin@apcclm mkdir -p .ssh
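4. Append your public key (the one just generated on apcssh) to the authorised keys on apcclm, as in step 3 above
5. And you’re done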
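The directory creation and the key append can be folded into one helper, to be run on apcssh once the key pair exists there. This is a sketch under assumptions: install_public_key is a hypothetical name, the key is assumed to have the default file name, and the chmod calls are my addition (sshd usually ignores an authorized_keys file that other users can write to). You will be asked for your password twice.

```shell
# install_public_key is a hypothetical helper, not part of OpenSSH.
install_public_key() {
    target="$1"                            # e.g. APClogin@apcclm
    pubkey="${2:-$HOME/.ssh/id_rsa.pub}"   # public key to install
    # Create the remote .ssh directory and make it private.
    ssh "$target" 'mkdir -p .ssh && chmod 700 .ssh' || return 1
    # Append the key (read from stdin) and restrict the file permissions.
    ssh "$target" 'cat >> .ssh/authorized_keys && chmod 600 .ssh/authorized_keys' < "$pubkey"
}

# Usage, on apcssh:
# install_public_key APClogin@apcclm
```

Because the key is appended with >>, any keys already listed in authorized_keys are kept.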
Montepython on the APC cluster
The official documentation is here http://monte-python.readthedocs.io/en/latest but it glosses over some important details. You may find more information here: http://www.iac.es/congreso/cosmo2017/media/montepython.pdf
Installing Montepython is quite straightforward if you follow the installation guide. Just make sure that your version of Python is 2.7. There are some syntax changes in Python 3 which prevent the code from installing.
Running Montepython on your local machine is easy if you follow the official documentation. For the code to be any use, however, you need to output chains with thousands of points. And that means running it on the APC cluster.
Here are some helpful tips.
The graphical backend
Montepython and the CLASS Python wrapper use Matplotlib. You need to log in with the -Y option for both apcssh and apcclm:
$ ssh -Y APClogin@apcssh.in2p3.fr
$ ssh -Y apcclm
When you run Montepython on the cluster using a script, no display is forwarded, so you will need to set the MPLBACKEND environment variable in the script itself (see the example script below).
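If the same commands are used both interactively and in cluster jobs, the backend can be picked automatically. The sketch below assumes that an empty DISPLAY variable means no X11 forwarding is available; choose_mpl_backend is a hypothetical helper name, while MPLBACKEND and Agg are the standard Matplotlib names.

```shell
# choose_mpl_backend is a hypothetical helper name.
choose_mpl_backend() {
    # DISPLAY is set by ssh -Y when X11 forwarding is active.
    if [ -z "${DISPLAY:-}" ]; then
        # No display: fall back to Agg, which only renders to files.
        export MPLBACKEND=Agg
    fi
    echo "DISPLAY=${DISPLAY:-unset} MPLBACKEND=${MPLBACKEND:-matplotlib default}"
}

# Usage:
# choose_mpl_backend
```

Call it once at the top of a script, before any Python code that imports Matplotlib is started.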
External programs within CLASS
If you modify CLASS by calling an external program (let’s call it PowerSpectrumExtension.py) to calculate some quantity, remember to make it executable by running
chmod +x PowerSpectrumExtension.py
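The executable bit alone may not be enough. If CLASS starts the file directly (an assumption; it depends on how you wrote the command in your CLASS input), the file also needs a shebang line such as #!/usr/bin/env python as its first line. The sketch below checks both; check_external_program is a hypothetical helper name.

```shell
# check_external_program is a hypothetical helper name.
check_external_program() {
    prog="$1"
    # The file must carry the executable bit (set with chmod +x).
    if [ ! -x "$prog" ]; then
        echo "$prog: not executable, run chmod +x" >&2
        return 1
    fi
    # The first line must be a shebang so the system knows which interpreter to use.
    if ! head -n 1 "$prog" | grep -q '^#!'; then
        echo "$prog: first line is not a shebang (#!...)" >&2
        return 1
    fi
    echo "$prog: ok"
}

# Usage:
# check_external_program PowerSpectrumExtension.py
```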
To run Montepython on the cluster, you need to write a job script and submit it to the scheduler. This is described here: https://www.apc.univ-paris7.fr/FACeWiki/pmwiki.php?n=Apc-cluster.Scheduler
When you run jobs on a cluster, you are sharing resources with the other users. If you ask for resources (memory, number of nodes) that are unavailable, or ask for too much, your job will be sent to the back of the queue, or aborted.
Here’s an example of a message for an aborted run:
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
Either request fewer slots for your application, or make more slots available
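This message appears when mpirun is asked for more processes (-np) than the scheduler allocated. One way to avoid it is to count the allocated slots inside the job. The sketch below assumes a Torque/PBS-style scheduler, where the file named by PBS_NODEFILE holds one line per allocated slot; count_slots is a hypothetical helper name.

```shell
# count_slots is a hypothetical helper name.
count_slots() {
    # PBS_NODEFILE is set by the scheduler inside a running job.
    nodefile="${1:-$PBS_NODEFILE}"
    # Count the non-empty lines: one per allocated slot.
    grep -c . "$nodefile"
}

# Usage inside a job script:
# /usr/local/openmpi/bin/mpirun -np "$(count_slots)" ...
```

With this, the number of MPI processes follows whatever was requested in the nodes and ppn settings.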
You also need to set the right environment variables for the required libraries.
This is an example of a script which ran successfully on the APC cluster:
#PBS -N JOBNAME
#PBS -o $PBS_JOBID.out
#PBS -e $PBS_JOBID.err
#PBS -q furious
#PBS -m bea
#PBS -M firstname.lastname@example.org
#PBS -l nodes=1:ppn=32,mem=64GB,walltime=200:00:00
/usr/local/openmpi/bin/mpirun -np 4 env MPLBACKEND=Agg montepython/MontePython.py run -p input/lcdm.param -o chains/planck/lcdm -N 20000 --silent
The --silent option suppresses Montepython’s screen output (which you don’t need when you submit a cluster job).
Here are some good resources explaining qsub settings:
Analysing the chains
Once the run has terminated, output the plots and information by running:
env MPLBACKEND=Agg montepython/MontePython.py info [path]/[to]/[chains]/*.txt --want-covmat
The option --want-covmat outputs the covariance matrix.
Make sure to include env MPLBACKEND=Agg or you will get the usual matplotlib display problems.
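Before analysing, it can be useful to check how many points the chains actually contain. The sketch below assumes the usual chain layout, where each non-comment line is one accepted point and its first column is the multiplicity (the number of steps the chain stayed there); count_chain_points is a hypothetical helper name.

```shell
# count_chain_points is a hypothetical helper name.
count_chain_points() {
    # Skip comment and empty lines; the first column is assumed to be the multiplicity.
    awk '!/^#/ && NF > 0 { points += 1; steps += $1 }
         END { printf "%d accepted points, %d steps\n", points, steps }' "$@"
}

# Usage:
# count_chain_points chains/planck/lcdm/*.txt
```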