SmartMonit: Real-time Big Data Monitoring System

Abstract

Modern big data processing systems are becoming very complex in terms of scale, concurrency and multi-tenancy. As a result, many failures and performance degradations happen only at run-time and are very difficult to capture. Moreover, some issues may be triggered only when certain components are executed. To analyze the root cause of these types of issues, we have to capture the dependencies of each component in real time. In this paper, we propose SmartMonit, a real-time big data monitoring system that collects infrastructure information such as the process status of each task. At the same time, we develop a real-time stream processing framework to analyze the coordination between tasks, and between infrastructure and tasks. This coordination information is essential for troubleshooting the causes of failures and performance degradation, especially those propagated from other causes.

Publication
The 38th International Symposium on Reliable Distributed Systems (SRDS 2019) (Demo). [CCF B; Core A]

Related