TY - JOUR
T1 - Meta Reinforcement Learning for UAV-Assisted Energy Harvesting IoT Devices in Disaster-Affected Areas
AU - Dhuheir, Marwan
AU - Erbad, Aiman
AU - Al-Fuqaha, Ala
AU - Seid, Abegaz Mohammed
N1 - Publisher Copyright:
© 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Over the past decade, Unmanned Aerial Vehicles (UAVs) have attracted significant attention due to their potential in emergency-response applications, including wireless power transfer (WPT) and data collection from Internet of Things (IoT) devices in disaster-affected areas. UAVs are more attractive than traditional techniques due to their maneuverability, flexibility, and low deployment costs. However, using UAVs for such critical tasks comes with challenges, including limited resources, energy constraints, and the need to complete missions within strict time frames. IoT devices in disaster areas have limited resources (e.g., computation, energy), so they depend on the UAVs' resources to accomplish vital missions. To address these resource problems in a disaster scenario, we propose a meta-reinforcement learning (meta-RL)-based energy harvesting (EH) framework. Our system model considers a swarm of UAVs that navigate an area, providing wireless power to, and collecting data from, IoT devices on the ground. The primary objective is to enhance the quality of service for strategic locations while allowing UAVs to dynamically join and leave the swarm (e.g., for recharging). In this context, we formulate the problem as a non-linear programming (NLP) optimization problem that maximizes the total energy harvested by the IoT devices and determines the optimal trajectories for the UAVs, subject to constraints on the maximum mission duration, the UAVs' maximum energy consumption, and the minimum data rate required for reliable transmission. Due to the combinatorial nature of the formulated problem and the difficulty of obtaining the optimal solution with conventional optimization methods, we propose a lightweight meta-RL solution that solves the problem by learning the system dynamics.
We conducted extensive simulations and compared our approach with three baselines: a traditional RL approach represented by a deep Q-network (DQN) algorithm, a Particle Swarm Optimization (PSO) algorithm, and a greedy solution. Our simulation results show that the proposed meta-RL algorithm outperforms the DQN, PSO, and greedy solutions in total IoT energy harvested by 25%, 32%, and 45%, respectively. The results also demonstrate that our approach outperforms the competing solutions in efficiently covering strategic locations with a high satisfaction rate and high accuracy.
AB - Over the past decade, Unmanned Aerial Vehicles (UAVs) have attracted significant attention due to their potential in emergency-response applications, including wireless power transfer (WPT) and data collection from Internet of Things (IoT) devices in disaster-affected areas. UAVs are more attractive than traditional techniques due to their maneuverability, flexibility, and low deployment costs. However, using UAVs for such critical tasks comes with challenges, including limited resources, energy constraints, and the need to complete missions within strict time frames. IoT devices in disaster areas have limited resources (e.g., computation, energy), so they depend on the UAVs' resources to accomplish vital missions. To address these resource problems in a disaster scenario, we propose a meta-reinforcement learning (meta-RL)-based energy harvesting (EH) framework. Our system model considers a swarm of UAVs that navigate an area, providing wireless power to, and collecting data from, IoT devices on the ground. The primary objective is to enhance the quality of service for strategic locations while allowing UAVs to dynamically join and leave the swarm (e.g., for recharging). In this context, we formulate the problem as a non-linear programming (NLP) optimization problem that maximizes the total energy harvested by the IoT devices and determines the optimal trajectories for the UAVs, subject to constraints on the maximum mission duration, the UAVs' maximum energy consumption, and the minimum data rate required for reliable transmission. Due to the combinatorial nature of the formulated problem and the difficulty of obtaining the optimal solution with conventional optimization methods, we propose a lightweight meta-RL solution that solves the problem by learning the system dynamics.
We conducted extensive simulations and compared our approach with three baselines: a traditional RL approach represented by a deep Q-network (DQN) algorithm, a Particle Swarm Optimization (PSO) algorithm, and a greedy solution. Our simulation results show that the proposed meta-RL algorithm outperforms the DQN, PSO, and greedy solutions in total IoT energy harvested by 25%, 32%, and 45%, respectively. The results also demonstrate that our approach outperforms the competing solutions in efficiently covering strategic locations with a high satisfaction rate and high accuracy.
KW - Energy harvesting
KW - UAVs
KW - UAV positions
KW - energy consumption
KW - meta-reinforcement learning
KW - strategic locations
UR - http://www.scopus.com/inward/record.url?scp=85188449316&partnerID=8YFLogxK
U2 - 10.1109/OJCOMS.2024.3377706
DO - 10.1109/OJCOMS.2024.3377706
M3 - Article
AN - SCOPUS:85188449316
SN - 2644-125X
VL - 5
SP - 2145
EP - 2163
JO - IEEE Open Journal of the Communications Society
JF - IEEE Open Journal of the Communications Society
ER -