Multi-Armed Bandit Learning for Full-Duplex UAV Relay Positioning for Vehicular Communications
Using unmanned aerial vehicles (UAVs) in wireless communications can improve the capacity of terrestrial networks. This paper proposes a novel method for positioning a UAV at an optimal location to relay information from a vehicle to a base station (BS). The proposed method defines a set of candidate UAV locations in advance and treats them as the actions (arms) of a multi-armed bandit (MAB) framework. The upper confidence bound (UCB) algorithm is used to solve the MAB problem. The results show that this method can identify a UAV location that maximizes the sum rate of the network.
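The idea of treating candidate UAV locations as bandit arms can be illustrated with a minimal sketch. It uses the standard UCB1 index rule, which may differ from the exact UCB variant used in the paper. The four locations, their mean sum rates in `TRUE_MEAN_RATE`, the Gaussian noise model, and the function names `ucb1` and `noisy_sum_rate` are all hypothetical and chosen only for illustration; they are not taken from the paper's system model.

```python
import math
import random


def ucb1(num_arms, pull, rounds, c=2.0):
    """Run UCB1 for `rounds` steps; return per-arm pull counts and mean rewards."""
    counts = [0] * num_arms
    means = [0.0] * num_arms
    for t in range(1, rounds + 1):
        if t <= num_arms:
            # Initialization: try every candidate location once.
            arm = t - 1
        else:
            # Pick the arm with the largest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm is sampled more often.
            arm = max(
                range(num_arms),
                key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        # Incremental update of the empirical mean reward.
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means


# Hypothetical normalized mean sum rate at each predefined UAV location
# (illustrative values only, not results from the paper).
TRUE_MEAN_RATE = [0.3, 0.5, 0.8, 0.4]
rng = random.Random(0)


def noisy_sum_rate(arm):
    """Assumed reward model: true mean sum rate plus Gaussian noise."""
    return TRUE_MEAN_RATE[arm] + rng.gauss(0.0, 0.1)


counts, means = ucb1(len(TRUE_MEAN_RATE), noisy_sum_rate, rounds=2000)
best = max(range(len(counts)), key=lambda a: counts[a])
print("pulls per location:", counts)
print("most-visited location index:", best)
```

In this toy setting the learner spends most of its rounds at the location with the highest mean reward, which mirrors the abstract's claim that the bandit approach can identify the location that maximizes the sum rate. In a real deployment the reward would come from measured or simulated link rates rather than from a fixed table.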