An exploratory study of trade-offs in traditional vs. serverless stream processing.

Potential supervisors: 
Description: 

Motivation

Streaming applications continuously process data to deliver streams of up-to-date results. When run by Stream Processing Engines (SPEs) (e.g., Apache Flink), streaming applications are defined as graphs of streams and operators. Such a graph-based model of streaming analysis can be encapsulated in the model of serverless analysis, such as the one defined by Apache OpenWhisk, which executes functions (fx) in response to events at any scale. A benefit of platforms such as OpenWhisk is their transparent management of infrastructure, servers, and scaling using containers (e.g., Docker), which can ease the integration of streaming pipelines with other analysis tools and platforms.
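
To make the mapping from operators to functions more concrete, the following minimal sketch shows how a single stateful operator (a running count per key) could be written as a function invoked once per event. The entry point follows the dict-in, dict-out convention of OpenWhisk Python actions. The names count_per_key and STATE, and the event layout with a "key" field, are hypothetical and chosen only for illustration. The in-memory dictionary is an assumption standing in for an external state store, since a serverless platform does not guarantee that memory survives between invocations.

```python
from typing import Any, Dict

# Hypothetical stand-in for an external state store (assumption):
# a real serverless deployment cannot rely on in-process memory
# being preserved across invocations.
STATE: Dict[str, int] = {}


def count_per_key(event: Dict[str, Any], state: Dict[str, int]) -> Dict[str, Any]:
    """Stateful operator: maintain a running count of events per key."""
    key = event["key"]
    state[key] = state.get(key, 0) + 1
    return {"key": key, "count": state[key]}


def main(args: Dict[str, Any]) -> Dict[str, Any]:
    """Entry point in the style of an OpenWhisk Python action: one event in, one result out."""
    return count_per_key(args, STATE)


if __name__ == "__main__":
    # Local simulation of three events arriving on a stream.
    for event in [{"key": "a"}, {"key": "b"}, {"key": "a"}]:
        print(main(event))
```

In an SPE, the engine would own the operator state and the routing of events by key. In the serverless variant, both concerns move outside the function, which is one of the trade-offs this thesis sets out to explore.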

Challenge

This thesis targets the design and implementation of stateful streaming applications both on existing SPEs and on top of Apache OpenWhisk, with an empirical evaluation of their performance. During the thesis, you will (1) get familiar with the data stream processing paradigm and related concepts, (2) design and implement benchmark streaming applications, and (3) test and compare their results on different platforms.

You can conduct this thesis individually or as a team of two students. The content can be adapted accordingly, depending also on the total number of students interested in this thesis. For details and further questions, please contact us.

Attachments: 
Date range: 
September 2021 to October 2025