Integrating Edge Intelligence with Optimized AI Deployment on a Serverless Platform for Smart Cities
Keywords: Smart Cities, Edge Computing, Fog Computing, Serverless Middleware.
The development of Smart City applications requires distributed architectures capable of handling real-time data, high load variability, and limited computational resources at the edge. Running Artificial Intelligence (AI) models on edge devices is essential for low latency, but it remains challenging due to the constrained processing, memory, and energy capabilities of these devices. Although approaches such as TinyML and FogML support embedded execution, fog and cloud layers are still required for more demanding tasks such as model reconfiguration and complex processing. In this context, this dissertation presents SAPPARCHI 2.0, an evolution of a serverless middleware platform designed for Smart City applications. The proposal integrates Edge Intelligence, resilient over-the-air (OTA) updates, and a dynamic offloading mechanism to optimize the use of distributed computational resources across the Edge, Fog, and Cloud layers. To this end, four research questions were defined, addressing the local execution of machine learning models, continuous updates on IoT devices, adaptive task allocation, and system evaluation metrics. The adopted methodology comprised a systematic literature review, experimental platform development, and performance evaluation based on controlled tests. The SAPPARCHI 2.0 architecture enables local processing through Actions (serverless functions), organized into Tasks, Microservices, and Applications, supporting an osmotic execution model. Furthermore, an OTA update mechanism was implemented with rollback support and digital signature validation, ensuring application continuity even during firmware replacement. Experimental results demonstrated that local execution significantly reduces latency compared to Fog and Cloud execution, while remaining feasible even on constrained devices such as the ESP32.
The system also proved resilient to failures during OTA updates and capable of performing latency-based offloading to maintain quality of service. The evaluation metrics included latency, energy consumption, and scalability.
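The latency-based offloading mentioned above can be sketched as a simple tier-selection heuristic. This is a minimal illustration only: the names (`Tier`, `choose_tier`) and the threshold logic are assumptions for exposition, not SAPPARCHI 2.0's actual API or policy.

```python
from enum import Enum

class Tier(Enum):
    EDGE = "edge"
    FOG = "fog"
    CLOUD = "cloud"

def choose_tier(edge_latency_ms: float, sla_ms: float,
                fog_rtt_ms: float, cloud_rtt_ms: float) -> Tier:
    """Pick the first tier whose expected latency meets the SLA.

    Prefers local (Edge) execution; offloads only when the locally
    measured latency exceeds the SLA, falling back Fog -> Cloud.
    Illustrative heuristic, not the platform's actual policy.
    """
    if edge_latency_ms <= sla_ms:
        return Tier.EDGE      # local execution is fast enough
    if fog_rtt_ms <= sla_ms:
        return Tier.FOG       # offload to the nearest fog node
    return Tier.CLOUD         # last resort: cloud execution
```

For example, with a 100 ms SLA, a device measuring 120 ms locally but reaching a fog node in 40 ms would offload to the Fog tier.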