34329 execution environments for parallel architectures



course goals

The basic goal of the course is to study and design the services that the system should offer in order to achieve efficient parallel execution. We will examine application needs as a function of the application's execution patterns and of the architecture on which the application runs. The main discussion centers on the operating environment and how it can affect application performance. Other topics include: which services are most appropriate; how these services should be offered; and which configuration parameter values are best for a given environment and how these parameters can be modified. Each edition of the course focuses on some emerging technologies, such as SMPs and cluster computing, multicores, or heterogeneous multiprocessors.


course outline

 


course methodology

This course has an important experimental component. The main basic concepts are assumed to be already known, having been studied in depth in a previous introductory course. Under this assumption, from the beginning of the course we will discuss the main existing research works, those needed to acquire a good horizontal background in parallel architectures.

Approximately half of the classes will be guided by the professors' presentations and the rest by student participation. Professors will propose research papers for group discussion, as well as projects to be done individually or in groups. Both the paper discussions and the shepherding of these projects will take place during the practical classes.


course material

·         introduction to parallel execution environments

o      @InCollection{feitelson04,  Author = "Dror G. Feitelson and Larry Rudolph and Uwe Schwiegelshohn",  Title = "Parallel Job Scheduling --- A Status Report",  BookTitle = "Job Scheduling Strategies for Parallel Processing",  Publisher = "Springer Verlag",  Year = "2004",  Editor = "Dror G. Feitelson and Larry Rudolph and Uwe Schwiegelshohn",  Pages = "1--16",  Note = "Lect. Notes Comput. Sci. vol.~3277",}

·         OS architectures, abstractions and evolution

·         runtime for parallel systems

slides

http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP1.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP1-4.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP2.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP2-4.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP3.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP3-4.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP4.pdf
http://docencia.ac.upc.edu/doctorat/ENGRAP/EEAP/EEAP4-4.pdf
papers

o       http://www.openmp.org and the OpenMP specification: http://www.openmp.org/drupal/mp-documents/spec25.pdf

o      M. Gonzalez, J. Oliver, X. Martorell, E. Ayguadé, J. Labarta and N. Navarro. OpenMP Extensions for Thread Groups and Their Run-time Support. 13th International Workshop on Languages and Compilers for Parallel Computing (LCPC'2000), New York (USA). pp. 317-331. August, 2000 (associated)

·         virtualization support

papers

o      "Virtualizing I/O Devices on VMware Workstation's Hosted Virtual Machine Monitor" http://docencia.ac.upc.edu/doctorat/ENGRAP/sugerman.pdf

o      "Constructing Services with Interposable Virtual Hardware"
http://denali.cs.washington.edu/pubs/distpubs/papers/denali_nsdi.pdf

o      "Xen and the Art of Virtualization"
http://docencia.ac.upc.edu/doctorat/ENGRAP/2003-xensosp.pdf

o      "Singularity: Rethinking the Software Stack"
http://delivery.acm.org/10.1145/1250000/1243424/p37-hunt.pdf?key1=1243424&key2=3440300911&coll=GUIDE&dl=GUIDE&CFID=35424782&CFTOKEN=44158952

·         job scheduling in SMP clusters  

papers

o      @inproceedings{689506,  author = {Joseph Skovira and Waiman Chan and Honbo Zhou and David A. Lifka},  title = {The EASY - LoadLeveler API Project},  booktitle = {IPPS '96: Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing},  year = {1996},  isbn = {3-540-61864-3},  pages = {41--47},  publisher = {Springer-Verlag},  address = {London, UK}, } http://www.cs.huji.ac.il/~feit/parsched/jsspp96/p-96-3.pdf

o      @inproceedings{ feitelson96packing,    author = "Dror G. Feitelson",    title = "{Packing Schemes for Gang Scheduling}",    booktitle = "Job Scheduling Strategies for Parallel Processing~-- Proceedings of the {IPPS}'96 Workshop",    volume = "1162",   publisher = "Springer",    editor = "Dror G. Feitelson and Larry Rudolph",    pages = "89--110",    year = "1996",    url = "citeseer.ist.psu.edu/feitelson96packing.html" }

o      @misc{ jette02slurm,  author = "M. Jette and M. Grondona",  title = "SLURM: Simple Linux Utility for Resource Management",  text = "Jette, M.A., Grondona, M.A.: SLURM: Simple Linux Utility for Resource Management.    Lawrence Livermore National Laboratory, CA, UCRL-MA-147996-REV (August 28,    2002)",  year = "2002",  url = "citeseer.ist.psu.edu/jette02slurm.html" }

o      @InCollection{frachtenberg05,Author = "Eitan Frachtenberg and Dror G. Feitelson",   Title = "Pitfalls in Parallel Job Scheduling Evaluation",   BookTitle = "Job Scheduling Strategies for Parallel Processing",  Publisher = "Springer Verlag",  Year = "2005",  Editor = "Dror G. Feitelson and Eitan Frachtenberg and Larry Rudolph and Uwe Schwiegelshohn",  Pages = "257--282",  Note = "Lect. Notes Comput. Sci. vol.~3834",}

o      @InCollection{tsafrir05,  Author = "Dan Tsafrir and Yoav Etsion and Dror G. Feitelson",  Title = "Modeling User Runtime Estimates",  BookTitle = "Job Scheduling Strategies for Parallel Processing",  Publisher = "Springer Verlag",  Year = "2005",  Editor = "Dror G. Feitelson and Eitan Frachtenberg and Larry Rudolph and Uwe Schwiegelshohn",  Pages = "1--35",  Note = "Lect. Notes Comput. Sci. vol.~3834",}

slides

                Job Scheduling in HPC Systems


evaluation

The course evaluation will be based on student participation in paper discussions and, mainly, on the completion of the projects. These projects will be specified during the first weeks of the course, carried out (individually or in groups, depending on the project) during the course, and presented to the rest of the class during the last two weeks. All parts of the project (specification, implementation, and presentation) are components of the evaluation.

·         Research works

o      Implementation of a CpuManager in Altix

o      Dynamic interposition tools/mechanism in Altix

o      Implementing a job scheduler simulator

o      Adding local policies as plug-ins in SLURM

 


how to learn