Performance Evaluation of Serverless Applications and Infrastructures

Thesis Preview: This page summarizes the contributions of my upcoming PhD thesis; its content is not yet finalized.

My PhD thesis aims to enable reproducible performance evaluation of serverless applications and their underlying cloud infrastructure.

Included Papers

[𝛂] Function-as-a-Service Performance Evaluation

Function-as-a-Service Performance Evaluation: A Multivocal Literature Review

This JSS'20 journal paper describes a multivocal literature review (MLR) covering 112 performance studies of Function-as-a-Service (FaaS) platforms. It consolidates the results from 61 industrial and 51 academic performance studies and provides actionable recommendations on reproducible FaaS experimentation. The study concludes that future work needs to go beyond over-simplified micro-benchmarks and focus on more realistic application-level benchmarks and workloads.

[𝛃] Serverless Application Characteristics

The State of Serverless Applications: Collection, Characterization, and Community Consensus

This TSE'21 journal paper (extending the IEEE Software article and technical report) studies the state of serverless applications. It contributes the largest collection of serverless applications to date (89 applications), systematically characterizes them along 16 characteristics, and presents a meta-study across 10 related studies towards building a community consensus about typical serverless applications.

[𝛄] Serverless Application Benchmark

Let’s Trace It: Fine-Grained Serverless Benchmarking using Synchronous and Asynchronous Orchestrated Applications

This contribution, under submission at an A* conference, proposes a comprehensive application-level benchmark suite, designs novel algorithms for fine-grained latency breakdown analysis based on distributed tracing, conducts a large-scale empirical performance study covering five common performance factors, and releases a FAIR replication package of the software, data, and results. It addresses research gaps identified in Paper 𝛂 by presenting solutions that build on the insights from Paper 𝛃. The results show that the median end-to-end latency of serverless applications is often dominated not by function computation but by external service calls, orchestration, or trigger-based coordination.
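
To illustrate the general idea of a trace-based latency breakdown (this is a minimal sketch, not the paper's actual algorithm; span names and timestamps are made up), the end-to-end latency of one causal path can be split into per-span compute time and the gaps between consecutive spans, which is where trigger and orchestration delays surface:

```python
from dataclasses import dataclass

@dataclass
class Span:
    """One span of a distributed trace (fields are illustrative)."""
    name: str
    start: float  # epoch seconds
    end: float    # epoch seconds

def breakdown(spans: list[Span]) -> dict[str, float]:
    """Split end-to-end latency into time spent inside spans and the gaps
    between consecutive spans (e.g., trigger or orchestration delay).
    Assumes non-overlapping spans on a single causal path."""
    ordered = sorted(spans, key=lambda s: s.start)
    parts = {f"{s.name}:compute": s.end - s.start for s in ordered}
    for prev, nxt in zip(ordered, ordered[1:]):
        parts[f"{prev.name}->{nxt.name}:gap"] = max(0.0, nxt.start - prev.end)
    parts["end_to_end"] = ordered[-1].end - ordered[0].start
    return parts

# Example: an asynchronous trigger adds roughly 120 ms between invoker and receiver.
print(breakdown([Span("invoker", 0.00, 0.05), Span("receiver", 0.17, 0.26)]))
```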

[𝛅] Cross-provider Application Benchmarking

CrossFit: Fine-grained Benchmarking of Serverless Application Performance across Cloud Providers

This contribution, under submission at a conference, presents an approach for detailed and fair cross-provider performance benchmarking of serverless applications based on a provider-independent tracing model. Further, an empirical study demonstrates how detailed distributed tracing enables drill-down analysis to explain performance differences between two leading cloud providers. It addresses research gaps identified in Paper 𝛂 and refines a specific application scenario from Paper 𝛄. The results for an asynchronous application reveal extensive trigger delays and show how increasing and bursty workloads affect performance stability, median latency, and tail latency.
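
As a rough illustration of what a provider-independent tracing model can look like (the field names and mapping below are my assumptions for this sketch, not the model from the paper), provider-specific trace documents such as AWS X-Ray segments can be normalized into a common span format before any cross-provider analysis:

```python
from dataclasses import dataclass

@dataclass
class NormalizedSpan:
    """Provider-independent span (field names are illustrative assumptions)."""
    trace_id: str
    name: str
    kind: str        # e.g., "function", "trigger", "external_service"
    start: float     # epoch seconds, UTC
    end: float
    provider: str    # e.g., "aws", "azure"

def from_xray_segment(seg: dict) -> NormalizedSpan:
    """Map one raw AWS X-Ray segment document onto the normalized model."""
    return NormalizedSpan(
        trace_id=seg["trace_id"],
        name=seg["name"],
        kind="function" if seg.get("origin") == "AWS::Lambda::Function" else "external_service",
        start=seg["start_time"],
        end=seg["end_time"],
        provider="aws",
    )
```

A second mapping (e.g., for Azure Application Insights telemetry) would produce the same NormalizedSpan objects, so the drill-down analysis only ever sees one schema.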

[𝛆] Serverless Function Trigger Benchmark

TriggerBench: A Performance Benchmark for Serverless Function Triggers

This contribution, under submission as a short paper at a conference, quantifies the latency overhead of serverless function triggers. Trigger latency is the delay of the transition from an invoker function to a receiver function for a given trigger type. It addresses a gap identified in Paper 𝛂 and raised as a performance problem in Papers 𝛄 and 𝛅.
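
A minimal sketch of how such a measurement could be derived from timestamps (assuming synchronized clocks; this is not the benchmark's actual implementation, and the numbers are made up):

```python
import statistics

def trigger_latencies_ms(publish_ts: list[float], receive_ts: list[float]) -> dict[str, float]:
    """Per-invocation trigger latency: receiver start time minus the moment the
    invoker emitted the event (both in epoch seconds, clocks assumed in sync)."""
    deltas = sorted((r - p) * 1000.0 for p, r in zip(publish_ts, receive_ts))
    return {
        "median_ms": statistics.median(deltas),
        "p99_ms": deltas[int(0.99 * (len(deltas) - 1))],
    }

# Example: three invocations through a hypothetical queue trigger.
print(trigger_latencies_ms([10.000, 11.000, 12.000], [10.120, 11.090, 12.450]))
```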

[𝛇] Cloud Benchmark Suite

A Cloud Benchmark Suite Combining Micro and Applications Benchmarks

This QUDOS'18 workshop paper presents a new execution methodology that combines micro and application benchmarks into an integrated benchmark suite for IaaS clouds and reports results on the cost-performance tradeoff, performance stability, and resource utilization.
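
For instance, a cost-performance tradeoff can be summarized by relating the price of an instance type to the application throughput it achieves; the sketch below uses made-up numbers and is not the metric from the paper:

```python
def cost_performance(results: dict[str, dict[str, float]]) -> dict[str, float]:
    """Cost-performance score per instance type: hourly price divided by
    application throughput (lower is better). Inputs are illustrative."""
    return {
        instance: r["price_per_hour"] / r["throughput_rps"]
        for instance, r in results.items()
    }

# Example with made-up numbers for two instance types.
print(cost_performance({
    "small": {"price_per_hour": 0.05, "throughput_rps": 120.0},
    "large": {"price_per_hour": 0.20, "throughput_rps": 400.0},
}))
# Here the small instance offers the better cost-performance ratio.
```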

[𝛈] Cloud Application Performance Estimation

Estimating Cloud Application Performance Based on Micro-Benchmark Profiling

This CLOUD'18 conference paper develops a cloud benchmarking methodology that uses micro-benchmarks to profile applications and subsequently predicts how an application performs on a wide range of cloud services. A study with a leading cloud provider quantitatively evaluates the estimation model with 38 metrics from 23 micro-benchmarks and 2 applications from different domains. It builds upon the benchmark suite from Paper 𝛇 and highlights the connection between micro- and application benchmarks discussed in Paper 𝛂.
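
The general idea of estimating application performance from micro-benchmark profiles can be sketched as a simple regression over profiled instance types (all numbers are made up; this is not the paper's actual estimation model):

```python
import numpy as np

# Rows = cloud instance types that were already profiled; columns =
# micro-benchmark metrics (e.g., CPU, memory, and disk scores). Made-up values.
micro = np.array([
    [1.0, 0.8, 0.6],
    [1.4, 1.1, 0.7],
    [2.0, 1.6, 1.2],
])
app_latency = np.array([210.0, 160.0, 110.0])  # measured application metric per instance

# Fit a simple linear model: application metric ~ weighted sum of micro-benchmark scores.
X = np.column_stack([np.ones(len(micro)), micro])
coef, *_ = np.linalg.lstsq(X, app_latency, rcond=None)

# Estimate the application metric on an unseen instance type from its
# micro-benchmark profile alone, without deploying the application there.
new_profile = np.array([1.7, 1.3, 1.0])
estimate = coef[0] + new_profile @ coef[1:]
print(round(float(estimate), 1))
```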

[𝛉] Software Microbenchmarking in the Cloud

Software microbenchmarking in the cloud. How bad is it really?

This EMSE'19 journal paper quantifies how cloud environments affect the variability of software performance test results and to what extent slowdowns can still be reliably detected in a public cloud. It presents large-scale experiments across multiple providers, programming languages, software microbenchmarks, instance types, and execution methods that reveal substantial differences in variability between benchmarks and instance types. This contribution focuses on reproducibility concerns raised by Paper 𝛂.
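
One common way to decide whether a slowdown is detectable under such variability is to compare bootstrap confidence intervals of the medians of two result sets; the sketch below illustrates this idea and is not the paper's exact procedure:

```python
import random
import statistics

def bootstrap_ci(samples: list[float], reps: int = 1000, conf: float = 0.95):
    """Bootstrap confidence interval of the median of a sample."""
    medians = sorted(
        statistics.median(random.choices(samples, k=len(samples)))
        for _ in range(reps)
    )
    lo = medians[int((1 - conf) / 2 * (reps - 1))]
    hi = medians[int((1 + conf) / 2 * (reps - 1))]
    return lo, hi

def slowdown_detected(baseline: list[float], candidate: list[float]) -> bool:
    """Flag a slowdown when the candidate's interval lies entirely above the
    baseline's interval (i.e., execution times are reliably larger)."""
    _, b_hi = bootstrap_ci(baseline)
    c_lo, _ = bootstrap_ci(candidate)
    return c_lo > b_hi
```

When the variability between cloud instances is high, the intervals widen and only larger slowdowns remain detectable, which is exactly the tradeoff the paper quantifies.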