Jordà Polo


Senior Researcher and lead of the Data-Centric Computing group at the Barcelona Supercomputing Center (BSC). Aiming to improve the efficiency of datacenters and optimize the performance of data-driven workloads.

Research interests


Multi-constraint Scheduling of MapReduce Workloads. Ph.D. Dissertation.

Projects and collaborations

Intel: Edge Cloud (2018-2021)

Performance Characterization of Video Analytics Workloads in Heterogeneous Edge Infrastructures. CCPE, 2021.

Created a framework to maximize the efficiency of Edge Cloud deployments in collaboration with Intel. The framework automated the characterization of Edge workloads, connecting hardware-level and application-level configuration choices and KPIs. Currently used by Intel to shape future Edge platforms and onboard new clients up to 10x faster.

IBM Research: Cloud (2014-2019)

Topology-aware GPU Scheduling for Learning Workloads in Cloud Environments. SC, 2017.

Research collaboration with the Container Cloud Platform group at IBM T.J. Watson Research Center in NY. Designed novel scheduling policies for heterogeneous Cloud environments. These policies introduced topology and disaggregation awareness, and resulted in up to 30% higher throughput and QoS.

IBM Research: Serverless (2019-2021)

Performance Evaluation of Data-Centric Workloads in Serverless. CLOUD, 2021.

Led the effort to enable data-intensive applications in Serverless environments in collaboration with IBM Research. This included the creation of performance models and scheduling policies aware of data exchanges between functions in Kubernetes/Knative, resulting in performance improvements of up to 4.32x.

BSC Life Sciences: Workload optimization (2015-2019)

Considerations in using OpenCL on GPUs and FPGAs for Throughput-oriented Genomics Workloads. FGCS, 2019.


Designed a new algorithm for SMUFIN, a method to find cancer mutations. The original version required more than 10 hours and multiple TBs of memory per patient, running on tens of nodes. The new algorithm exploits data parallelism and modern memory/storage stacks, resulting in up to 7.5x higher throughput and 5.5x lower energy consumption.

MIT: Codesign (2017-2019)

Enabling Genomics Pipelines in Commodity Personal Computers with Flash Storage. Frontiers in Genetics, 2021.

Research collaboration with Prof. Arvind and his team at MIT, making it possible to run the SMUFIN algorithm on an affordable commodity machine. Achieved the same throughput at about one third (36%) of the hardware cost and less than half (45%) of the energy of an enterprise-class server.

Intel: Disaggregation (2016-2018)

Disaggregating Non-volatile Memory for Throughput-oriented Workloads. EuroPar Workshop, 2018.

Explored the impact of disaggregation in the context of Rack Scale Design and NVMe over Fabrics, and proposed techniques to adapt workload bandwidth and maximize efficiency while preserving QoS.

Ph.D. Students


Last update: September 2021