Theory and practice of deploying linear solvers on large heterogeneous computational systems

Speaker: Dr Katarzyna Świrydowicz (Pacific Northwest National Laboratory, USA)
Date and time: 8 January 2024, 10:30

Abstract: A need to solve an extremely large, sparse linear system commonly occurs in applications that are otherwise HPC-ready. While often overlooked, the repeated solution of linear systems can account for ninety percent or more of the total application time. In the talk, I will share the experience I gained while working on linear solvers developed during multiple subprojects of the Exascale Computing Project, a large scientific effort spanning multiple US universities and national laboratories, sponsored by the US Department of Energy. I will explain the limitations and challenges of using linear solvers in applications. I will also present a variety of theoretical and implementation-based techniques that have led to reliable, GPU-ready linear solvers.
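As an illustration of the kind of kernel that dominates such applications, here is a minimal conjugate gradient sketch in pure Python (illustrative only, not one of the solvers discussed in the talk). The matrix is accessed solely through a user-supplied matrix-vector product, the typical interface of large sparse solvers; production GPU solvers add preconditioning and careful data layout on top of this pattern.

```python
def conjugate_gradient(matvec, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A, given only a
    matrix-vector product. Iterates until the squared residual norm
    drops below tol or max_iter iterations are reached."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                      # residual b - A x (x = 0 initially)
    p = list(r)                      # search direction
    rs = sum(ri * ri for ri in r)
    for _ in range(max_iter):
        Ap = matvec(p)
        alpha = rs / sum(pi * api for pi, api in zip(p, Ap))
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = sum(ri * ri for ri in r)
        if rs_new < tol:
            break
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x
```

For a 2x2 SPD system the method converges in at most two iterations; for large sparse systems, the `matvec` call is exactly the operation that GPU-ready solvers optimize.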


Scheduling malleable tasks

Speaker: Prof. dr hab. inż. Maciej Drozdowski (Politechnika Poznańska)
Date and time: 31 January 2023, 12:00

Abstract: The talk will address scheduling problems for malleable tasks, i.e. tasks that can be executed on multiple processors simultaneously, with the number of processors in use allowed to change over time. A general formulation, selected polynomial cases, open and NP-hard problems, and the approaches applied will be presented.
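For intuition, one of the simplest polynomial cases of this model can be stated in closed form. Assuming (purely for this sketch, not the general model of the talk) linear speedup up to a per-task processor limit and allotments that may change at any moment, the optimal makespan is the larger of the average load and the longest task run at its maximum parallelism, attained by a fluid, water-filling schedule.

```python
def malleable_makespan(work, cap, m):
    """Optimal makespan for fully malleable tasks with linear speedup:
    task j has total work work[j] and may use at most cap[j] of the m
    processors at once, with the allotment changeable over time."""
    total = sum(work)                         # no processor can be idle
    longest = max(w / c for w, c in zip(work, cap))  # per-task limit
    return max(total / m, longest)
```

For example, two tasks of work 6 each, both capped at 2 processors, on 4 processors finish at time 3; a single task of work 8 capped at 1 processor forces makespan 8 no matter how many processors are available.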


Task scheduling with a k-partite constraint graph

Speaker: Dr Krzysztof Turowski (Uniwersytet Jagielloński w Krakowie)
Date and time: 13 December 2022, 10:30

Abstract: The problem of scheduling with a constraint graph, introduced by Hans Bodlaender, Klaus Jansen and Gerhard Woeginger in 1994, consists in assigning tasks to machines under the additional condition that tasks adjacent in the constraint graph must not be assigned to the same machine. The talk will present problems and results obtained for the special case in which the constraint graph is a complete k-partite graph, across various machine models (identical, uniform, unrelated), task types (unit-length, arbitrary-length), criteria (makespan and mean completion time), and treatments of the number of partitions in the graph (as part of the instance vs. as a problem parameter). Proofs and the various techniques used to derive them (including dynamic programming and linear programming) will be presented, together with a number of open problems that still await solution.
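A minimal sketch of one special case gives the flavor of such results (an illustration, not any algorithm from the talk). With unit tasks on identical machines, tasks from different partitions of a complete k-partite constraint graph conflict, so each machine may receive tasks of one partition only; the minimum makespan can then be found by binary search on the answer.

```python
import math

def min_makespan(partition_sizes, m):
    """Minimum makespan for unit tasks whose constraint graph is complete
    k-partite, on m identical machines. A makespan T is feasible iff
    sum(ceil(n_i / T)) machines suffice; returns None if m < k, since
    every machine is confined to a single partition."""
    if m < len(partition_sizes):
        return None
    lo, hi = 1, max(partition_sizes)
    while lo < hi:
        mid = (lo + hi) // 2
        if sum(math.ceil(n / mid) for n in partition_sizes) <= m:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For instance, partitions of sizes 5 and 3 on three machines yield makespan 3: two machines share the first partition and one takes the second.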

Inhibitors in the RNA World

Speaker: Mgr Jarosław Synak (Instytut Chemii Bioorganicznej PAN)
Date and time: 6 December 2022, 10:30

Abstract: Modern cells possess a vast number of regulatory mechanisms that allow them to adapt to changing external conditions, control division, and stabilize the levels of individual substances. Like everything, this must once have had a beginning, which is why we propose a concept of such a mechanism in the RNA World. It is simple enough to have emerged spontaneously, yet complex enough to perform its function: regulating the levels of reactants. The presented results are co-authored by Agnieszka Rybarczyk and Jacek Błażewicz.


Using Unused: Non-Invasive Dynamic FaaS Infrastructure with HPC-Whisk

Speaker: Dr Bartłomiej Przybylski (Pracownia Algorytmiki)
Date and time: 8 November 2022, 11:00

Abstract: Modern HPC workload managers and their careful tuning contribute to the high utilization of HPC clusters. However, due to inevitable uncertainty, it is impossible to avoid node idleness completely. Although such idle slots are usually too short for any HPC job, they are too long to ignore. The Function-as-a-Service (FaaS) paradigm promises to fill this gap and can be a good match, as typical FaaS functions last seconds, not hours. Here we show how to build a FaaS infrastructure on idle nodes in an HPC cluster in such a way that it does not significantly affect the performance of the HPC jobs. We dynamically adapt to a changing set of idle physical machines by integrating the open-source software Slurm and OpenWhisk. We designed and implemented a prototype solution that allowed us to cover up to 90% of the idle time slots on a 50k-core cluster running production workloads.

This work is going to be presented at the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC’22).

Automatic generation of FPTASes for stochastic monotone dynamic programs made easier, or Delegating algorithm design to dynamic programming (re)formulations

Speaker: Dr Nir Halman (Bar-Ilan University, Israel)
Date and time: 7 September 2022, 12:00

Abstract: In this lecture, we go one step further in the automatic generation of FPTASes for multi-stage stochastic dynamic programs with scalar state and action spaces, where the cost-to-go functions have a monotone structure in the state variable. While there exist a few frameworks for the automatic generation of FPTASes, so far none of them is general and simple enough to be used extensively. We believe that our framework has these two attributes and has great potential to attract interest from both the operations research and theoretical computer science communities. (Joint work with Tzvi Alon.)
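The core idea behind such frameworks, keeping only a geometrically spaced sample of a monotone function, can be sketched as follows (an illustrative simplification, not Halman's actual K-approximation-set machinery). A nondecreasing positive sequence is compressed to the points where its value first grows by more than a factor (1 + eps); querying the compressed set underestimates the true value by at most that factor, while the number of stored points is logarithmic in the function's range.

```python
import bisect

def compress_monotone(values, eps):
    """Keep (index, value) pairs of a nondecreasing positive sequence
    at which the value exceeds the last kept value by a factor > 1+eps."""
    kept = [(0, values[0])]
    for i, v in enumerate(values):
        if v > (1 + eps) * kept[-1][1]:
            kept.append((i, v))
    return kept

def query(kept, i):
    """Value of the last kept point at index <= i; underestimates the
    true value by at most a factor (1 + eps)."""
    pos = bisect.bisect_right([idx for idx, _ in kept], i) - 1
    return kept[pos][1]
```

Applied to the cost-to-go functions of a monotone dynamic program stage by stage, this kind of compression is what turns an exponentially large state space into an FPTAS-sized one.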


A log-linear (2+5/6)-approximation algorithm for parallel machine scheduling with a single orthogonal resource

Speaker: Dr Bartłomiej Przybylski (Pracownia Algorytmiki)
Date and time: 16 November 2021, 10:30

Abstract: As the gap between compute and I/O performance tends to grow, modern High-Performance Computing (HPC) architectures include a new resource type: an intermediate persistent fast memory layer, called burst buffers. This is just one of many kinds of renewable resources which are orthogonal to the processors themselves, such as network bandwidth or software licenses. Ignoring orthogonal resources while making scheduling decisions just for processors may lead to unplanned delays of jobs whose resource requirements cannot be immediately satisfied. We focus on a classic problem of makespan minimization for parallel-machine scheduling of independent sequential jobs with additional requirements on the amount of a single renewable orthogonal resource. We present an easily implementable log-linear algorithm that we prove to be a (2 + 5/6)-approximation. In simulation experiments, we compare our algorithm to standard greedy list-scheduling heuristics and show that, compared to LPT, resource-based algorithms generate significantly shorter schedules. (Joint work with A. Naruszko and K. Rządca.)
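For context, here is a sketch of the kind of greedy list-scheduling baseline mentioned above, extended with a single renewable resource (illustrative only; this is not the (2 + 5/6)-approximation algorithm from the talk). Jobs are taken in LPT order, and a job starts as soon as a machine and enough of the resource are simultaneously free.

```python
import heapq

def lpt_resource_schedule(jobs, m, R):
    """Event-driven LPT list scheduling on m machines with one renewable
    resource of capacity R. jobs: list of (p_j, r_j) pairs. Returns the
    resulting makespan."""
    pending = sorted(jobs, reverse=True)      # LPT order
    free_machines, free_res = m, R
    events = []                               # heap of (finish_time, r_j)
    t = 0.0
    makespan = 0.0
    while pending or events:
        started = []
        for p, r in pending:                  # start everything that fits
            if free_machines >= 1 and r <= free_res:
                heapq.heappush(events, (t + p, r))
                free_machines -= 1
                free_res -= r
                started.append((p, r))
        for j in started:
            pending.remove(j)
        if events:                            # advance to next completion
            t, r = heapq.heappop(events)
            free_machines += 1
            free_res += r
            makespan = max(makespan, t)
        elif pending:
            raise ValueError("a job demands more resource than capacity R")
    return makespan
```

On the instance jobs = [(3, 2), (2, 2), (2, 1)], m = 2, R = 3, the resource blocks the second-longest job until time 3 even though a machine is free at time 2, yielding makespan 5; this is exactly the kind of delay that resource-aware algorithms aim to reduce.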

This is a reprise of the talk given at the 27th International European Conference on Parallel and Distributed Computing (Euro-Par).

Maximizing the total weight of on-time jobs on parallel machines subject to a conflict graph

Speaker: Dr Joanna Berlińska (Pracownia Algorytmiki)
Date and time: 9 November 2021, 10:30

Abstract: We consider scheduling on parallel machines under the constraint that some pairs of jobs cannot be processed concurrently. Each job has an associated weight, and all jobs share a common deadline. The objective is to maximize the total weight of on-time jobs. The problem is known to be strongly NP-hard in general. A polynomial-time algorithm for scheduling unit execution time jobs on two machines is proposed. The performance of a broad family of approximation algorithms for scheduling unit execution time jobs on more than two machines is analyzed. For the case of arbitrary job processing times, two integer linear programming formulations are proposed and compared with two formulations known from the earlier literature. An iterated variable neighborhood search algorithm is also proposed and evaluated by means of computational experiments.
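For tiny instances, the problem can be stated executably. The exhaustive sketch below (an illustration of the problem definition, not any algorithm from the talk) handles unit-time jobs with a common deadline d on m machines: each job is placed in one of d time slots or rejected, no slot holds more than m jobs, and conflicting jobs never share a slot.

```python
from itertools import combinations, product

def best_on_time_weight(weights, conflicts, m, d):
    """Maximum total weight of on-time unit jobs: m machines, common
    deadline d, and pairs in `conflicts` may not run concurrently.
    Brute force over all slot assignments (slot d means rejection)."""
    n = len(weights)
    best = 0
    for assign in product(range(d + 1), repeat=n):
        feasible = True
        for s in range(d):
            slot = [j for j in range(n) if assign[j] == s]
            if len(slot) > m or any(
                (a, b) in conflicts or (b, a) in conflicts
                for a, b in combinations(slot, 2)
            ):
                feasible = False
                break
        if feasible:
            best = max(best, sum(weights[j] for j in range(n)
                                 if assign[j] < d))
    return best
```

With weights (3, 2, 2), job 0 conflicting with both others, two machines, and deadline 1, the optimum rejects the heavy job and runs the compatible pair, for total weight 4.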


Data-driven scheduling in serverless computing to reduce response time

Speaker: Dr Bartłomiej Przybylski (Pracownia Algorytmiki)
Date and time: 29 June 2021, 12:30

Abstract: In Function as a Service (FaaS), a serverless computing variant, customers deploy functions instead of complete virtual machines or Linux containers. It is the cloud provider who maintains the runtime environment for these functions. FaaS products are offered by all major cloud providers (e.g. Amazon Lambda, Google Cloud Functions, Azure Functions), as well as by standalone open-source software (e.g. Apache OpenWhisk) and its commercial variants (e.g. Adobe I/O Runtime or IBM Cloud Functions). We take the bottom-up perspective of a single node in a FaaS cluster. We assume that all the execution environments for a set of functions assigned to this node have already been installed. Our goal is to schedule individual invocations of functions, passed on by a load balancer, so as to minimize performance metrics related to response time. Deployed functions are usually executed repeatedly in response to multiple invocations made by end users. Thus, our scheduling decisions are based on the information gathered locally: the recorded call frequencies and execution times. We propose a number of heuristics, and we also adapt some theoretically grounded ones such as SEPT or SERPT. Our simulations use a recently published Azure Functions trace. We show that, compared to the baseline FIFO or round-robin, our data-driven scheduling decisions significantly improve performance.
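SEPT (Shortest Expected Processing Time first) adapts naturally to this setting: order queued invocations by the mean of locally recorded execution times of each function. A minimal sketch follows; the default estimate for functions with no history is an assumption of this sketch, not a detail from the paper.

```python
def sept_order(queue, history):
    """Order queued function invocations by expected processing time.
    queue: list of function names awaiting execution.
    history: function name -> list of locally recorded execution times.
    Functions never seen before get a default estimate (assumed here)."""
    DEFAULT_ESTIMATE = 1.0  # seconds; placeholder for cold functions

    def expected(fn):
        times = history.get(fn)
        return sum(times) / len(times) if times else DEFAULT_ESTIMATE

    return sorted(queue, key=expected)
```

Because deployed functions are invoked repeatedly, the per-function means converge quickly, which is what makes such locally gathered statistics usable for scheduling.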

This is a reprise of the talk given at the 21st IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid 2021).