Multi-Layer Framework for Fault Management in IoT Systems
dependability, fault tolerance, fault management, multi-layer, IoT.
The Internet of Things (IoT) has attracted attention by projecting a vision in which everything is
connected, but it also poses many challenges. In this domain, systems are highly distributed, heterogeneous,
and large scale, so the presence of faults is practically inevitable. In the human
body, a disease causes a biological change in the person's state, altering, impairing,
or disabling certain functions of the organism. In the medical sciences, pathology
studies the nature, origin, causes, and effects of a disease. When this concept is applied to computer
systems, specifically IoT, failures can result in a deviation from the correct functioning
of the system. Consequently, it becomes vital to identify possible sources of faults and
their effects to develop or apply suitable approaches to avoid, treat, and tolerate them.
Although considerable effort has been made to enhance dependability in IoT, there is
a lack of understanding of its architectural nature and multidisciplinarity. Therefore, a
restrictive systemic view of the occurrence of failures represents a relevant direction for
IoT dependability. In this thesis, we explore fault pathology in IoT systems in order to understand it.
We first propose a fault origin model for IoT systems. In addition, the multidisciplinarity
and complexity of IoT systems create a need to model how faults propagate
between architectural components and lead to a possible system failure. The resulting propagation model details
how a failure interacts at the intra-layer, inter-layer, and inter-system levels. Besides identifying the
fault origin and propagation, it is important to use strategies that support the existing fault
tolerance mechanisms. However, given the complexity and heterogeneity present in IoT
systems, proposing a unified way to manage faults remains a challenge. Thus, this
thesis proposes a transversal framework for fault management that allows interoperability
and integration (exchange of information) between all abstractions present in the system
stack.