This lesson is being piloted (Beta version)

Hands On DM@LHC : Pre-workshop Material

Before working through the tutorial, please read this page carefully and make sure you are familiar with all of these computing skills. They are essential for any HEP physicist and will benefit you throughout your career. If you are pressed for time in the coming days, at least install Docker and familiarize yourself with it.

Install This Now! : Docker

Docker is a powerful tool that lets you virtualize your environment entirely in software. It lets you bundle up an installation of tools so that others can use them in a uniform way, and we will be using it throughout this bootcamp. Installing Docker is absolutely necessary, and there are directions below for each operating system. If you use Windows, already have Docker running, and are comfortable using it, that is fine. If not, be aware that using Docker on Windows can be challenging and that none of the tutors know how to use such a setup. We therefore highly recommend that you reconsider your decision to use the Windows operating system as a high energy physicist.

It is highly recommended that you DO NOT use Windows. Few individuals use this OS within the HEP community as most tools are designed for Unix-based systems. If you do have a Windows machine, consider making your computer a dual-boot machine - Link to Directions

Download Docker for Windows instructions.

Docker Desktop for Windows is the Community Edition (CE) of Docker for Microsoft Windows. To download Docker Desktop for Windows, head to Docker Hub.

Please read the relevant information on these pages; it should take no more than 5 minutes.


Download Docker for MacOS instructions.

Docker is a full development platform for creating containerized apps, and Docker Desktop for Mac is the best way to get started with Docker on a Mac. To download Docker Desktop for MacOS, head to Docker Hub.

Please read the relevant information on these pages; it should take no more than 5 minutes.
Another common way to install packages on macOS is via the Homebrew package manager. Once Homebrew is set up, you can install Docker by executing brew install --cask docker (older Homebrew versions used brew cask install docker).


Downloading and installing Docker on Linux may be slightly more difficult. If you run into any problems, please contact the organisers or tutors as soon as possible so they can help.

Here are the instructions for two popular Linux distributions:

  • CentOS
  • Ubuntu

  • Instructions for other Linux distributions can be found on the Docker docs pages.

    Be sure to read the Docker documentation on post-installation steps for Linux and managing Docker as a non-root user. This will allow you to edit files when you have started up the Docker container from your image.
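As a minimal sketch of how you might check this on Linux, the snippet below reports whether your user already belongs to the docker group used by the post-installation steps. The variable name IN_DOCKER_GROUP is only illustrative, and the printed commands are the ones the Docker documentation suggests; they need sudo, so the sketch prints them instead of running them.

```shell
# Check whether the current user is in the "docker" group, which the
# Docker post-installation steps use to allow running docker without sudo.
if id -nG | tr ' ' '\n' | grep -qx docker; then
    IN_DOCKER_GROUP=yes
    echo "You are in the docker group."
else
    IN_DOCKER_GROUP=no
    # These commands require sudo, so they are printed, not executed.
    echo "Not in the docker group yet. The Docker docs suggest:"
    echo "  sudo groupadd docker"
    echo "  sudo usermod -aG docker \$USER"
    echo "  newgrp docker   # or log out and back in"
fi
```

After following the documented steps, running the check again in a fresh login session should report that you are in the group.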

    Know the Basics

    There is a wealth of documentation on the basics of Docker all over the internet. However, a local tutorial written by Matthew Feickert, a HEP colleague, can be found here - Link To Tutorial. Please take 2 hours to work through the basics there. Then confirm that you understand what the following commands do:

    docker pull smeehan12/dmatlhc2019-tutorial
    docker run --rm -it -v "$PWD":/contur/local smeehan12/dmatlhc2019-tutorial bash
    

    It is a good idea to spend some time thinking about this, since you will need both of these commands for the tutorial itself.
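To make the two commands concrete, here is an annotated sketch that wraps them in a shell function. The function name start_tutorial and the variables IMAGE and MOUNT_POINT are illustrative choices and not part of the tutorial; the flags themselves are standard docker run options.

```shell
# Image name and mount point are taken from the two commands above.
IMAGE="smeehan12/dmatlhc2019-tutorial"
MOUNT_POINT="/contur/local"

# Hypothetical helper: fetch the image, then open a shell inside it.
start_tutorial() {
    # Download the image from the registry (or update the local copy).
    docker pull "$IMAGE"

    # --rm : delete the container once you exit it
    # -it  : keep stdin open and allocate a terminal (interactive session)
    # -v   : mount the current host directory at $MOUNT_POINT in the container
    # bash : the program to start inside the container
    docker run --rm -it -v "$PWD":"$MOUNT_POINT" "$IMAGE" bash
}

echo "Run 'start_tutorial' from the directory you want to share."
```

Because of the -v mount, files you create under /contur/local inside the container appear in the host directory you launched from, and they remain there after the container itself has been removed.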

    Know This Stuff Before the Tutorial

    Throughout this workshop we will assume previous knowledge of Unix, Python and C++, as well as of a new tool called Docker, which you have hopefully installed above. Luckily, these topics are already heavily documented and a lot of material is available, so if you do not feel confident with any of these prerequisites, please read the links below and work through the examples before you arrive. You do not need to be an expert, but it would be beneficial to have some experience and to feel comfortable working with them. For many of you this may just mean confirming your familiarity with these topics; for others, please take this as an opportunity (or excuse) to learn these fundamental skills.

    Prerequisite Checklist

    • Unix and the Terminal : Review and understand the contents of the Software Carpentry Tutorial.
    • Essential C++ : Ensure you understand the concepts in the “Bare Minimum” checklist.
    • Python : Review and understand the contents of the Software Carpentry Tutorial.
    • Git : The GitHub service is very common and you should know it (and sign up for an account if you haven’t already). The service used by CERN is GitLab and if you have a CERN computing account, you can log on here as well.

    Unix and Shells

    In Unix operating systems like Linux or macOS, the shell (“the terminal”) is a program that interprets commands and acts as an intermediary between the user and the inner workings of the operating system. It can help you navigate your files and directories. You can create (touch and mkdir), copy (cp) and delete (rm) files and directories. Even more powerful things are possible, such as working remotely with ssh, transferring files with scp, and scripting tasks that you do regularly.

    What are those commands?

    If you are wondering what any of the commands in the above paragraph do, you can always use the shell itself to learn more via the man command. Read more about the ssh command by trying the following:

    man ssh
    

    More importantly, if you are wondering how any of these commands work, then it is imperative that you spend a day reviewing how the shell works prior to the bootcamp.

    Software Carpentry provides two very nice in-depth tutorials on the shell

    A few other very good go-to resources that are good to work through

    Please take some time before the workshop to read through these.

    The bare minimum

    Although what counts as essential, or the “bare minimum”, is very subjective, a few of the most basic concepts that you should be comfortable with are the following.

    • Navigating files and directories (pwd, ls and cd)
    • Creating, copying, moving and deleting files and directories (mkdir, touch, cp, mv and rm)
    • Executing shell scripts (source script.sh)
    • Environment variables (echo $ENVVAR)

    You can use this as the “bare minimum” checklist for this bootcamp and you will be expected to be familiar with these concepts throughout the bootcamp.
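The checklist can be exercised end to end with a short session like the sketch below. It works in a throwaway directory, and the names WORKDIR, analysis and notes.txt are purely illustrative. The portable form ". ./script.sh" is used here; in bash, "source script.sh" does the same thing.

```shell
# Work in a throwaway directory so none of your own files are touched.
WORKDIR="$(mktemp -d)"
cd "$WORKDIR"
pwd                      # print the current directory

# Create a directory and an empty file, then list them.
mkdir analysis
touch notes.txt
ls

# Copy, move (rename) and delete files.
cp notes.txt analysis/notes_copy.txt
mv notes.txt analysis/notes.txt
rm analysis/notes_copy.txt
ls analysis              # only notes.txt remains

# Write a small script that sets an environment variable...
echo 'export ENVVAR="hello from script.sh"' > script.sh

# ...and execute it in the current shell so the variable stays defined.
. ./script.sh
echo "$ENVVAR"
```

If every line here makes sense to you, you have the shell basics needed for the bootcamp. Note that sourcing a script runs it in your current shell, which is why ENVVAR is still set afterwards; running it as a separate program would not change your environment.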

    Essential C++

    Throughout your studies and career in particle physics, you will find that a large amount of your code and software is written in C++. Within ATLAS, for example, the software is heavily C++ based and many analysis frameworks are built in C++. The problem with teaching C++ in the context of research is that the language is so broad that it is very difficult to adequately explain “the essential” things you need to know. Enrolling in a formal university-level introductory C++ course would be well worth your while and will pay dividends in your research. Fortunately, there is a large amount of C++ material available, so you can gain a basic understanding of the essentials if you are not already familiar with them. For the purposes of this bootcamp, a certain amount of prior knowledge is necessary, as highlighted below. A few good references are:

    The bare minimum

    Although what counts as essential, or the “bare minimum”, is very subjective, a few of the most basic concepts that you should be comfortable with are the following.

    You can use this as the “bare minimum” checklist for this bootcamp and you will be expected to be familiar with these concepts throughout the bootcamp.

    Python

    Python is a popular interpreted programming language. This means that, unlike C++, you do not need to explicitly compile the code before running it; the code can be executed as soon as it is written. This makes it well suited to writing scripts and testing things quickly. Python was designed for readability and has some similarities to the English language, with influence from mathematics (e.g. sets). Furthermore, it is possible to link compiled C++ libraries with Python code, forming an “interface” between the two and thereby taking advantage of the best of both worlds.
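As a small illustration of the "no compile step" point, the sketch below writes a tiny Python script from the shell and runs it straight away. It assumes python3 is available on your PATH; the file name hello.py and its contents are purely illustrative.

```shell
# Write a small Python script into a throwaway directory.
SCRIPT_DIR="$(mktemp -d)"
cat > "$SCRIPT_DIR/hello.py" <<'EOF'
# Illustrative values only: a set silently drops the duplicate entry.
masses = {125.0, 91.2, 125.0}
print(sorted(masses))
EOF

# No compilation needed: the interpreter executes the source file directly.
PY_OUTPUT="$(python3 "$SCRIPT_DIR/hello.py")"
echo "$PY_OUTPUT"        # prints [91.2, 125.0]
```

The equivalent C++ program would need to be compiled into an executable before it could be run.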

    It is rare not to come across Python in some form when working in particle physics and data analysis, particularly when using advanced machine learning techniques. In the ATLAS experiment, Python is used in many places, most notably as the mechanism by which we “steer” our C++ code. This is done in the form of jobOptions or a steering macro. In this tutorial, however, we will not explicitly use jobOptions to execute our code; that is left for the ATLAS software tutorial.

    Software Carpentry provides a very nice introductory tutorial to python - Link to Tutorial. All participants should review this tutorial and ensure they are comfortable with its contents prior to the bootcamp.

    Beyond this, a few additional resources that are very useful

    MadGraph is written in python, and so being familiar with the syntax will be very useful here!

    GitHub and GitLab

    You may be familiar with Git (if not, you will learn about it from Software Carpentry) and may even have a GitHub account. Many equivalent services for hosting remote repositories exist (e.g. Bitbucket, from Atlassian), and the one used at CERN is GitLab.

    Log in to GitLab

    Confirm that your credentials work properly by logging into GitLab on your web browser. For this, just use the same username and password that you use for everything else CERN-related.

    Once you do this, explore around the site a bit. Fill in your profile information and upload a picture to serve as your avatar.