Workshop Agenda

Note: The workshop starts at 9:00 am Coimbra Time (WEST, UTC+1) on April 16, 2023.


Session 1: Load Testing Analytics
09:00 - 09:10  Opening
09:10 - 10:00  Keynote: Weiyi Shang (Concordia University)
Advancing State-of-the-art Log Analytics Infrastructures
Abstract: Software systems usually record important runtime information in their logs. Logs help practitioners understand system runtime behaviors and diagnose field failures. As logs are usually very large in size, automated log analysis is needed to assist practitioners in their software operation and maintenance efforts. The success of adopting log analysis in practice often depends on sophisticated infrastructures. In particular, to enable fast queries and to save storage space, such infrastructures split log data into small blocks (e.g., 16KB), then index and compress each block separately. Afterwards, a log parsing step converts the raw logs from unstructured text to a structured format before applying subsequent log analysis techniques. My lab specializes in the development of approaches for advancing the infrastructures of log analysis. In this talk, I provide an overview of some of our recent work on automated techniques for log compression and parsing.
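As a minimal illustration of the block-based pipeline sketched in the abstract, the following Python snippet compresses a log stream in independent 16KB blocks and applies a naive parsing step that masks variable tokens. The masking rule and sample log line are assumptions for illustration only, not the speaker's tooling.

import re
import zlib

BLOCK_SIZE = 16 * 1024  # e.g., 16KB blocks, compressed independently

def compress_blocks(raw_logs: bytes):
    """Split the raw log stream into blocks and compress each one separately,
    so a query only needs to decompress the blocks it touches."""
    return [zlib.compress(raw_logs[i:i + BLOCK_SIZE])
            for i in range(0, len(raw_logs), BLOCK_SIZE)]

def parse_line(line: str) -> str:
    """Very naive parsing: replace numeric and hex-like tokens with <*>
    to recover a structured event template."""
    return re.sub(r"\b(?:0x[0-9a-fA-F]+|\d+)\b", "<*>", line).strip()

logs = b"disk 3 failed at block 0x1f4a\n" * 2000
print(len(compress_blocks(logs)), "compressed blocks")
print(parse_line("disk 3 failed at block 0x1f4a"))  # -> disk <*> failed at block <*>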

Bio: Weiyi Shang is a Concordia University Research Chair at the Department of Computer Science. His research interests include AIOps, big data software engineering, software log analytics, and software performance engineering. He serves as a steering committee member of the SPEC Research Group. He is ranked among the top worldwide established SE research stars in a recent bibliometrics assessment of software engineering scholars. He is a recipient of various premium awards, including the CSCAN Outstanding Early-Career Computer Science Researcher Prize, the SIGSOFT Distinguished Paper Award at ICSE 2013, the Best Paper Award at WCRE 2011, and the Distinguished Reviewer Award for the Empirical Software Engineering journal. His research has been adopted by industrial collaborators (e.g., BlackBerry and Ericsson) to improve the quality and performance of their software systems, which are used by millions of users worldwide. Contact him at shang@encs.concordia.ca; http://users.encs.concordia.ca/~shang.
Slides
10:00 - 10:30  Research Talk: Sebastian Frank (University of Hamburg / University of Stuttgart), Alireza Hakamian (University of Stuttgart), Denis Zahariev (University of Stuttgart) and André van Hoorn (University of Hamburg)
Verifying Transient Behavior Specifications in Chaos Engineering Using Metric Temporal Logic and Property Specification Patterns
Abstract: Chaos Engineering is an approach for assessing the resilience of software systems, i.e., their ability to withstand unexpected events, adapt accordingly, and return to a steady state. The traditional Chaos Engineering approach only verifies whether the system is in a steady state and considers no statements about state changes over time and timing. Thus, Chaos Engineering conceptually does not consider transient behavior hypotheses, i.e., specifications regarding the system behavior during the transition between steady states after a failure has been injected. We aim to extend the Chaos Engineering approach and tooling to support the specification of transient behavior hypotheses and their verification. We interviewed three Chaos Engineering practitioners to elicit requirements for extending the Chaos Engineering process. Our concept uses Metric Temporal Logic and Property Specification Patterns to specify transient behavior hypotheses. We then developed a prototype that can be used stand-alone or to complement the established Chaos Engineering framework Chaos Toolkit. We successfully conducted a correctness evaluation comprising 160 test cases from the Timescales benchmark and demonstrated the prototype's applicability in chaos experiment settings by executing three chaos experiments.
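To make the notion of a transient behavior hypothesis concrete, here is a small Python sketch (independent of the authors' prototype) that checks a bounded-recovery property over a timestamped response-time series; the threshold, recovery deadline, and sample data are assumed values.

def recovers_within(samples, t_inject, threshold, deadline):
    """samples: list of (timestamp_s, response_time_ms), sorted by time.
    True if the metric returns below the steady-state threshold within
    `deadline` seconds after fault injection."""
    for t, value in samples:
        if t_inject <= t <= t_inject + deadline and value <= threshold:
            return True  # steady state re-established in time
    return False

samples = [(0, 80), (10, 400), (20, 350), (35, 90), (50, 85)]
print(recovers_within(samples, t_inject=5, threshold=100, deadline=25))  # False: not back by t=30
print(recovers_within(samples, t_inject=5, threshold=100, deadline=30))  # True: back below threshold at t=35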

Slides
10:30 - 11:00  Coffee Break
Session 2: Cloud Performance and Benchmarking
11:00 - 11:30  Industry Talk: Boris Zibitsker (BEZNext) and Alex Lupersolsky (BEZNext)
Modeling Expands Benchmarking Results to Optimize Performance and Financial Decisions in the Hybrid Multi-Cloud World
Abstract: During the journey to the cloud, many organizations use the industry-standard TPC and customized benchmarks to determine how differences in cloud architecture, configuration, and the sophistication of DBMS optimizers affect the CPU service time, the number of I/O operations, and the megabytes processed per request on different cloud platforms compared with the on-premises environment. Unfortunately, the benchmarks do not represent customer workloads well and do not answer the important "what-if" questions needed to make the right decisions during the journey to the cloud. Almost 80% of organizations migrating to the cloud do not finish the migration on time and within budget. Some legacy applications need so many changes in their database design and codebase that they are left running on-premises. Meanwhile, lines of business and business departments are developing new analytic applications incorporating machine learning and artificial intelligence algorithms. During testing these applications work well, but after deployment they work slower, do not scale, and cost more than expected. Data scientists supporting a specific line of business often demand services offered only by a specific cloud data platform. As a result, some business workloads stay on-premises, and others operate on multiple cloud data platforms. Performance and financial optimization in this type of complex hybrid multi-cloud environment brings many challenges and a high risk of performance and financial surprises. This paper reviews the BEZNext methodology and the results of implementing software that optimizes performance and financial decisions during different phases of the journey to the cloud. The methodology includes three phases: observe and inform, optimize and recommend, and verify and control. During the observe and inform phase, collected measurement data are aggregated into business workloads (groups of users and applications). For each business workload, performance and financial anomalies, root causes, and seasonality patterns are determined on each cloud data platform. This phase also extracts configuration parameters such as cloud instance types, the number of instances, the virtual warehouse (VW) type, and the number of VWs. Some of these parameters change often, depending on the number of concurrent queries or the utilization of resources, so compute and storage configuration changes are auto-discovered hour by hour. Results of the observe and inform performance analysis include hourly profiles of each workload's performance, resource utilization, and data usage. Workload forecasting estimates the expected growth in the number of users and the volume of data based on the analysis of historical data and customer business plans; for new applications, we use the information provided by customers. One of the most critical parameters is each workload's Service Level Goal (SLG). Typically, the primary performance SLG is the average response time.
The optimize and recommend phase uses the output of the observe and inform phase to optimize recommendations. Modeling and optimization are used to evaluate options and present recommendations with realistic expectations: the technology predicts the minimum configuration and cost needed to meet the organization's SLGs for each business workload, and compares the available cloud providers' options, including instance and storage types and pricing models. Several examples are based on implementing our software at Fortune 100 companies. We review how modeling and optimization algorithms expand the benchmark results to continuously meet the SLGs for all business workloads at the lowest cost. We demonstrate how to optimize development and operational decisions before deploying new applications, and discuss how modeling and optimization assist with selecting the most appropriate cloud platform and optimizing cloud migration decisions. We also review how to evaluate options and optimize dynamic capacity management decisions, and we calculate the power consumption and carbon footprint of the recommended configuration for each cloud data platform. During the verify and control phase, we compare the actual results with the expected ones and organize a continuous performance and financial control process. One of the examples shows the verification of results and the organization of a closed-loop, feedback-controlled system; the results of implementing the recommendations are verified during the control step.
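The following Python sketch, a hedged illustration rather than BEZNext's software, mirrors two steps of the described methodology: aggregating measurements into business workloads and checking an average-response-time SLG, then picking the cheapest hypothetical configuration whose predicted response time still meets that SLG.

from collections import defaultdict
from statistics import mean

requests = [  # (workload, response_time_s) -- made-up measurement data
    ("reporting", 2.1), ("reporting", 2.7), ("ml_scoring", 0.6), ("ml_scoring", 0.9),
]
slg = {"reporting": 2.0, "ml_scoring": 1.0}  # average response-time goals (s)

# Observe and inform: aggregate per-request measurements into business workloads.
by_workload = defaultdict(list)
for name, rt in requests:
    by_workload[name].append(rt)

for name, rts in by_workload.items():
    avg = mean(rts)
    status = "meets" if avg <= slg[name] else "violates"
    print(f"{name}: avg={avg:.2f}s {status} SLG={slg[name]}s")

# Optimize and recommend: cheapest candidate whose predicted response time meets the SLG.
# Candidates: (label, hourly_cost, predicted_avg_response_time_s) -- assumed values.
candidates = [("small", 4.0, 2.6), ("medium", 7.5, 1.8), ("large", 12.0, 1.1)]
feasible = [c for c in candidates if c[2] <= slg["reporting"]]
print("cheapest config meeting the reporting SLG:", min(feasible, key=lambda c: c[1])[0])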
Slides
11:30 - 12:30  Keynote: Samuel Kounev (University of Wuerzburg)
Serverless Computing Revisited: Evolution, State-of-the-Art, and Performance Challenges
Abstract: Market analysts agree that serverless computing has strong market potential, with projected compound annual growth rates varying between 21% and 28% through 2028 and a projected market value of $36.8 billion by that time. Although serverless computing has gained significant attention in industry and academia over the past years, there is still no consensus about its unique distinguishing characteristics and no precise understanding of how these characteristics differ from classical cloud computing. For example, there is no wide agreement on whether serverless is solely a set of requirements from the cloud user's perspective or whether it should also mandate specific implementation choices on the provider side, such as implementing an autoscaling mechanism to achieve elasticity. Similarly, there is no agreement on whether serverless covers just the operational part or whether it should also include specific programming models, interfaces, or calling protocols.
In this talk, we seek to dispel this confusion by evaluating the essential conceptual characteristics of serverless computing as a paradigm, while putting the various terms around it into perspective. We examine how the term serverless computing, and related terms, are used today. We explain the historical evolution leading to serverless computing, starting with mainframe virtualization in the 1960s, through Grid and cloud computing, all the way up to today. We review existing cloud computing service models, including IaaS, PaaS, SaaS, CaaS, FaaS, and BaaS, discussing how they relate to the serverless paradigm. In the second part of the talk, we focus on performance challenges in serverless computing both from the user's perspective (finding the optimal size of serverless functions) and from the provider's perspective (ensuring predictable and fast container start times coupled with fine-granular and accurate elastic scaling mechanisms).
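As a toy example of the user-side sizing challenge mentioned above, the following Python sketch picks the cheapest serverless function memory size that still meets a latency goal; the measured latencies and the per-GB-second price are assumed values loosely modeled on common FaaS pricing.

PRICE_PER_GB_SECOND = 0.0000166667          # assumption, roughly typical FaaS pricing
measurements = {128: 3.9, 256: 2.0, 512: 1.05, 1024: 0.6, 2048: 0.55}  # MB -> measured seconds

def cost_per_invocation(memory_mb, duration_s):
    """Cost model: allocated memory (GB) * duration (s) * price per GB-second."""
    return (memory_mb / 1024) * duration_s * PRICE_PER_GB_SECOND

latency_goal_s = 1.2
feasible = {m: cost_per_invocation(m, d)
            for m, d in measurements.items() if d <= latency_goal_s}
best = min(feasible, key=feasible.get)
print(f"cheapest size meeting {latency_goal_s}s: {best} MB "
      f"(~${feasible[best]:.8f} per invocation)")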

Bio: Samuel Kounev is a Professor of Computer Science holding the Chair of Software Engineering at the University of Würzburg. His research is aimed at the engineering of software for building dependable, efficient, and resilient distributed systems, including cloud-based systems, cyber-physical systems, and scientific computing applications. Research topics of the Chair span the areas of: (1) Software Architecture, focussing on the design, modeling, and simulation of distributed system architectures; (2) Systems Benchmarking, focussing on experimental analysis of performance, scalability, energy efficiency, dependability, and resilience properties; (3) Cyber Security, focussing on the design, testing, and evaluation of adaptive security architectures and homomorphic computing techniques; and (4) Predictive Data Analytics, focussing on the software engineering of workflows and tools for time series forecasting, anomaly detection, and critical event prediction. Kounev's research is inspired by the vision of self-aware computing systems, and he has been one of the major contributors shaping its development.
Samuel Kounev studied Mathematics and Computer Science at the University of Sofia, from which he holds an MSc degree with distinction (2000). He moved to TU Darmstadt (Germany) in 2001 to start a PhD (Dr.-Ing.) in computer science, which he completed in 2005 with distinction (summa cum laude) and an award for outstanding scientific achievements from the "Vereinigung von Freunden der TU Darmstadt." He was a research fellow at the University of Cambridge (2006-2008) and a Visiting Professor at UPC Barcelona (summer 2006 and 2007). In 2009, Kounev received the DFG Emmy Noether Career Award (1M EUR) for excellent young scientists, establishing his research group "Descartes" at the Karlsruhe Institute of Technology (KIT). Since 2014, Samuel Kounev has been a Full Professor holding the Chair of Software Engineering at the University of Würzburg, where he has served in various roles including Dean (2019-2021) and Vice Dean (2017-2019) of the Faculty of Mathematics and Computer Science, Managing Director of the Institute of Computer Science (2016-2017), and Member of the Faculty Board (2015-2021).
Samuel Kounev is Founder and Elected Chair of the SPEC Research Group within the Standard Performance Evaluation Corporation (SPEC). This group has over 50 member organizations from around the world and serves as a platform for collaborative research efforts in the area of quantitative system evaluation and analysis, fostering the interaction between academia and industry. He has also served as Co-Founder and Steering Committee Chair of several conferences in the field, including the ACM/SPEC International Conference on Performance Engineering (ICPE) and the IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). His research has led to over 300 publications (with an h-index of 45) and multiple scientific and industrial awards, including 7 Best Paper Awards, the SPEC Presidential Award for "Excellence in Research", a Google Research Award, an ABB Research Award, and a VMware Academic Research Award.
Slides
12:30 - 14:00  Lunch Break
Session 3: Load Testing and Performance Culture
14:00 - 14:30  Industry Talk: Filipe Oliveira (Redis)
Simple Ways to Jumpstart a Performance Culture
Abstract: This presentation focuses on simple ways to jumpstart a performance culture. It aims to provide insights and practical strategies for creating a performance-driven organizational culture. It will cover key concepts such as setting performance goals, establishing clear performance expectations, providing ongoing feedback, recognizing and rewarding top performers, and creating a continuous improvement mindset. The presentation will also discuss the importance of leaders in fostering a performance culture and the role of communication in promoting transparency and accountability. You don't need deep performance expertise to attend, but we believe that even advanced performance teams will benefit from this presentation and the final discussion.
Slides
14:30 - 15:00  Industry Talk: Andrei Pokhilko (Komodor)
Anatomy and Classification of Load Testing Tools
Abstract: Load testing has become an integral part of modern software production, and industry has created a number of tools to address the need. The variety of tools makes it difficult for a regular user to choose the right one for a particular use case. Additionally, developers of load testing tools often don't provide state-of-the-art capabilities in their products. In this talk, the author reveals the typical functional components of load testing tools and attempts a classification of tool types. Tool developers may use the functionality breakdown to improve their product roadmaps, while users can apply the classification to make educated choices of their performance measurement instruments.
Slides
15:00 - 15:30  Industry Talk: Vishnu Murty K (Dell Technologies)
Distributed WorkLoad Generator for Performance/Load Testing Using Opensource Technologies
Abstract: In the DellEMC Enterprise Servers Validation Organization, we perform load testing using different workloads (Web, FTP, Database, Mail, etc.) on servers and storage devices to identify their performance under heavy load. Knowing how DellEMC Enterprise Servers perform under heavy load (% CPU, % Memory, % Network, % Disk) is extremely valuable and critical. This is achieved with the help of load testing tools. Load testing tools available in the market come with their own challenges, such as cost, learning curve, and workload support. In this talk, we demonstrate how we built JAAS (JMeter As A Service), a distributed workload testing solution using containers and open-source tools (JMeter and Python), and how this solution plays a crucial role in delivering server validation efforts.
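For illustration only (this is not Dell's JAAS implementation), the following Python sketch shows how a distributed JMeter run can be driven programmatically using JMeter's standard non-GUI and remote-worker options; the worker hostnames and test plan path are placeholders.

import subprocess

WORKERS = ["loadgen-1", "loadgen-2", "loadgen-3"]  # hypothetical container hosts

def run_distributed_test(test_plan="webtest.jmx", results="results.jtl"):
    """Launch a JMeter controller that drives the remote workers."""
    cmd = [
        "jmeter",
        "-n",                      # non-GUI mode
        "-t", test_plan,           # test plan to execute
        "-R", ",".join(WORKERS),   # remote JMeter workers that generate the load
        "-l", results,             # aggregated results file
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    run_distributed_test()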
Slides
15:30 - 16:00  Coffee Break
Session 4: Database Performance
16:00 - 17:00  Keynote: David Daly (MongoDB)
Understanding and Improving Software Performance at MongoDB
Abstract: It is important for developers to understand the performance of a software project as they develop new features, fix bugs, and try to generally improve the product. At MongoDB we have invested in building a performance infrastructure to support our developers. It automates the provisioning of systems under test, the running of performance tests against those systems, the collection of many metrics from the tests and the system under test, and the task of making sense of all the results. Our performance infrastructure and processes are continually changing. As the system has become more powerful, we have used it more and more: adding new tests, new configurations, and new measurements. Tools and processes that work at one scale of use start to break down at higher scales, and we must adapt and update. If we do a good job, we keep pace with the rising constraints. If we do a great job, we make the system fundamentally better even as we scale it.
In this talk, we describe our performance testing environment at MongoDB and its evolution over time. The core of our environment is a focus on automating everything, integrating into our continuous integration (CI) system (Evergreen), controlling as many factors as possible, and making everything as repeatable and consistent as possible. After describing that core, we will discuss the scaling challenges we have faced, before relating what we have done to address those scaling challenges and improve the system overall.
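As a simplified illustration of the kind of automated result checking such an infrastructure performs (not MongoDB's actual change-detection logic), the following Python sketch flags a benchmark run as a regression when its throughput falls below a trailing baseline by more than an assumed noise margin.

from statistics import mean

def is_regression(history_ops_per_s, new_ops_per_s, margin=0.05):
    """history_ops_per_s: recent per-commit throughput samples for one
    test/configuration; margin: assumed tolerance for run-to-run noise."""
    baseline = mean(history_ops_per_s)
    return new_ops_per_s < baseline * (1 - margin)

history = [10_250, 10_180, 10_320, 10_290]
print(is_regression(history, 10_100))  # False: within the noise margin
print(is_regression(history, 9_300))   # True: roughly 10% below the baseline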

Bio: David is a staff engineer at MongoDB focused on server performance. He works on increasing the understanding of how MongoDB's software performs for its customers. In practice this includes: asking hard questions about MongoDB performance and then trying to answer them (or having someone else try to answer them); challenging assumptions and commonly accepted wisdom around MongoDB performance; encouraging everyone at MongoDB to think about performance, including adding new performance tests relevant to their ongoing work (e.g., adding new performance tests for new features or refactors); and explaining the current state of performance to others. He helped build and design MongoDB's performance testing infrastructure from the bottom up. At various times this required focusing on complete end-to-end automation, control of test noise and variability, working around test noise, and building processes to make sure that issues identified by the infrastructure were properly recognized and addressed.
Slides
17:00 - 17:30  Research Talk: Jörg Domaschka (Institute of Information Resource Management, Ulm University | benchANT GmbH), Simon Volpert (Institute of Information Resource Management, Ulm University), Kevin Maier (Institute of Information Resource Management, Ulm University), Georg Eisenhart (Institute of Information Resource Management, Ulm University) and Daniel Seybold (benchANT GmbH)
Using eBPF for Database Workload Tracing: An Explorative Study
Abstract: Database management systems (DBMS) are crucial architectural components of any modern distributed application. Yet, ensuring the smooth, high-performance operation of a DBMS is a black art that requires tweaking many knobs and depends heavily on the workload experienced. Misconfigurations in production systems have a heavy impact on the overall delivered service quality and hence should be avoided at all costs. Replaying production workload on test and staging systems to estimate the ideal configuration is a valid approach; yet, this requires traces from the production systems.
While many DBMSs have built-in support for capturing such traces, these mechanisms have a non-negligible impact on performance. eBPF is a Linux kernel feature claiming to enable low-overhead observability and application tracing. In this paper, we evaluate different eBPF-based approaches to DBMS workload tracing for PostgreSQL and MySQL. The results show that using eBPF causes lower overhead than the built-in mechanisms. Hence, eBPF can be a viable baseline for building a generic tracing framework. Yet, our current results also show that additional optimization and fine-tuning are needed to further lower the performance overhead.
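To give a flavor of eBPF-based tracing (this is not the paper's tooling), the following Python sketch uses the bcc bindings to attach a uprobe that counts PostgreSQL query executions per process; the binary path and the exec_simple_query symbol are assumptions that depend on the PostgreSQL build, and running it requires root privileges.

from bcc import BPF
import time

PROG = r"""
BPF_HASH(calls, u32, u64);

int on_query(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    calls.increment(pid);   // count one query execution for this backend PID
    return 0;
}
"""

b = BPF(text=PROG)
# Attach a uprobe to the (assumed) simple-protocol query entry point.
b.attach_uprobe(name="/usr/lib/postgresql/15/bin/postgres",
                sym="exec_simple_query", fn_name="on_query")

print("Tracing PostgreSQL query executions for 10 s ...")
time.sleep(10)
for pid, count in b["calls"].items():
    print(f"pid {pid.value}: {count.value} queries")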
Slides

Call for Papers

Software systems (e.g., smartphone apps, desktop applications, telecommunication infrastructures, and enterprise systems) have strict requirements on software performance. Failing to meet these requirements may cause business losses, customer defection, brand damage, and other serious consequences. In addition to conventional functional testing, the performance of these systems must be verified through load testing or benchmarking to ensure quality of service.

Load testing examines the behavior of a system by simulating hundreds or thousands of users performing tasks at the same time. Benchmarking compares the system's performance against other similar systems in the domain. The workshop is not limited to traditional load testing; it is open to any ideas for re-inventing and extending load testing, as well as any other way to ensure system performance and resilience under load, including any kind of performance testing, resilience / reliability / high availability / stability testing, operational profile testing, stress testing, A/B and canary testing, volume testing, and chaos engineering.
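As a minimal illustration of what such a simulation involves, the following Python sketch spawns many concurrent simulated users that repeatedly request an endpoint and reports the average response time; the URL, user count, and request count are placeholders.

import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

URL = "http://localhost:8080/health"   # hypothetical system under test
USERS, REQUESTS_PER_USER = 100, 20

def simulated_user(_):
    """One virtual user: issue requests in a loop and record latencies."""
    latencies = []
    for _ in range(REQUESTS_PER_USER):
        start = time.perf_counter()
        try:
            urllib.request.urlopen(URL, timeout=5).read()
        except OSError:
            continue  # this sketch only records successful requests
        latencies.append(time.perf_counter() - start)
    return latencies

with ThreadPoolExecutor(max_workers=USERS) as pool:
    all_latencies = [rt for user in pool.map(simulated_user, range(USERS)) for rt in user]

if all_latencies:
    print(f"{len(all_latencies)} requests, avg latency {mean(all_latencies) * 1000:.1f} ms")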

Load testing and benchmarking software systems are difficult tasks that require a deep understanding of the system under test and customer behavior. Practitioners face many challenges such as tooling (choosing and implementing the testing tools), environments (software and hardware setup), and time (limited time to design, test, and analyze). Yet, little research is done in the software engineering domain concerning this topic.

Adjusting load testing to recent industry trends, such as cloud computing, agile / iterative development, continuous integration / delivery, microservices, serverless computing, AI/ML services, and containers, poses major challenges that are not yet fully addressed.

This one-day workshop brings together software testing and software performance researchers, practitioners, and tool developers to discuss the challenges and opportunities of conducting research on load testing and benchmarking software systems. Our ultimate goal is to grow an active community around this important and practical research topic.

We solicit two tracks of submissions:

  1. Research or industry papers:
  2. Presentation track for industry or research talks:
Research/industry papers should follow the standard ACM SIG proceedings format and need to be submitted electronically via EasyChair (LTB 2023 track). Extended abstracts for the presentation track need to be submitted as "abstract only" submissions via EasyChair as well. Accepted papers will be published in the ICPE 2023 Companion Proceedings. Submissions can be research papers, position papers, case studies, or experience reports addressing issues including but not limited to the following:


Instructions for Authors from ACM

By submitting your article to an ACM Publication, you are hereby acknowledging that you and your co-authors are subject to all ACM Publications Policies, including ACM's new Publications Policy on Research Involving Human Participants and Subjects. Alleged violations of this policy or any ACM Publications Policy will be investigated by ACM and may result in a full retraction of your paper, in addition to other potential penalties, as per ACM Publications Policy.

Please ensure that you and your co-authors obtain an ORCID ID, so you can complete the publishing process for your accepted paper. ACM has been involved in ORCID from the start and we have recently made a commitment to collect ORCID IDs from all of our published authors. The collection process has started and will roll out as a requirement throughout 2022. We are committed to improving author discoverability, ensuring proper attribution, and contributing to ongoing community efforts around name normalization; your ORCID ID will help in these efforts.

Important Dates

Paper Track (research and industry papers):

Abstract submission: January 23, 2023, AOE (extended from January 16, 2023);
Paper submission: January 23, 2023, AOE;
Author notification: February 13, 2023;
Camera-ready version: February 20, 2023

Presentation Track:

Extended abstract submission: February 17, 2023, AOE;
Author notification: February 24, 2023, AOE;
Workshop date: April 16, 2023


Organization:

Chairs:

Alexander Podelko, AWS, USA
Heng Li, Polytechnique Montréal, Canada
Changyuan Lin, York University, Canada


Program Committee:

Jinfu Chen, Wuhan University, China
Lizhi Liao, Concordia University, Canada
Diego Elias Costa, UQAM, Canada
Michele Tucci, Charles University, Czech Republic
Daniele Di Pompeo, University of L'Aquila, Italy
Yiming Tang, Concordia University, Canada
Sen He, Augusta University, USA
Klaus-Dieter Lange, HPE, USA
Gerson Sunyé, University of Nantes, France


Past LTB Workshops: