ELTE Eötvös Loránd University
ANNALES Universitatis Scientiarum Budapestinensis de Rolando Eötvös Nominatae
Sectio Computatorica

Volume 55 (2023)

https://doi.org/10.71352/ac.55.019

Performance impact of network security features on log processing with Spark

Attila Péter Boros, Péter Lehotay–Kéry and Attila Kiss

Abstract. Various industries maintain a large number of machines to run their production lines and services. These systems process and produce massive amounts of data to provide high quality and availability for customer services. They should therefore be inspected constantly, not only to sustain the service levels already achieved but also to be upgraded to keep up with market competition. Our aim is to examine Apache Spark and to find one of the most suitable configurations for our challenges, one that can also be applied in real, live scenarios. In addition, although several studies have already been done in this field, none of them considers the security factor of Spark during computation when predicting run time.
The presented work tests Apache Spark for log processing in standalone cluster setups with a varying number of workers and different submitted tasks. We also examine, in these setups, the performance impact of enabling authentication in the network communication between cluster nodes. Our results show that increasing the number of executor nodes and simplifying the underlying algorithm do not always improve performance as expected. Furthermore, securing network communication between Spark processes noticeably increases the overall execution time of submitted jobs.
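
As an illustration of the kind of comparison described above, the following minimal sketch builds two `spark-submit` command lines for a standalone cluster, one with network authentication off and one with it on. It is not the authors' setup. The properties `spark.authenticate` and `spark.authenticate.secret` are standard Spark configuration keys, while the master URL, the job file name `log_job.py`, the secret value and the helper `build_submit_command` are hypothetical placeholders. In a standalone deployment the same secret would presumably also have to be configured on the master and on every worker.

```python
import shlex


def build_submit_command(master_url, app_path, authenticate, secret=None):
    """Return a spark-submit argument list for a standalone cluster.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # spark.authenticate toggles authentication of Spark's internal connections.
    conf = {"spark.authenticate": "true" if authenticate else "false"}
    if authenticate:
        if not secret:
            raise ValueError("a shared secret is required when authentication is on")
        # Shared secret used by the nodes to authenticate each other.
        conf["spark.authenticate.secret"] = secret

    args = ["spark-submit", "--master", master_url]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


if __name__ == "__main__":
    # Placeholder master URL, job file and secret (assumptions, not from the paper).
    for flag in (False, True):
        cmd = build_submit_command(
            "spark://master-host:7077", "log_job.py", flag, secret="change-me"
        )
        print(shlex.join(cmd))
```

Running the same job with both command lines and timing each run would give a rough picture of the authentication overhead on a given cluster.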
