Learning With the Bandit: A Cooperative Spectrum Selection Scheme for Cognitive Radio Networks
Distributed spectrum allocation in Cognitive Radio (CR) systems requires each Secondary User (SU) to learn the optimal spectrum policy that maximizes network performance while minimizing the impact on the Primary Users (PUs). To this end, each SU must rely on local sensing information, which can however be biased by interference and fading effects on the received signal. Thus, if each SU works in isolation, convergence to the system-wide optimal policy cannot be guaranteed. In this paper, the authors formulate the spectrum allocation problem as a cooperative learning task in which each SU learns the availability of each channel and shares this knowledge with the other SUs.
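To make the cooperative idea concrete, the following is a minimal sketch and not the authors' scheme: the abstract does not state the learning rule, so a standard UCB1 bandit index is assumed here, applied to sensing counts pooled across all SUs. All names (`SecondaryUser`, `pooled_counts`, `ucb1_choice`, `simulate`) and parameters (`p_idle`, `sense_error`) are hypothetical. Sharing is assumed to be instantaneous and error-free, and collisions between SUs are ignored.

```python
import math
import random


class SecondaryUser:
    """One SU keeping per-channel counts of its own sensing outcomes."""

    def __init__(self, n_channels):
        self.trials = [0] * n_channels  # times each channel was sensed
        self.idle = [0] * n_channels    # times it was sensed idle

    def record(self, channel, sensed_idle):
        self.trials[channel] += 1
        self.idle[channel] += int(sensed_idle)


def pooled_counts(users):
    """Cooperation step (assumed): sum the counts shared by all SUs."""
    n = len(users[0].trials)
    trials = [sum(u.trials[c] for u in users) for c in range(n)]
    idle = [sum(u.idle[c] for u in users) for c in range(n)]
    return trials, idle


def ucb1_choice(trials, idle):
    """Pick a channel by the UCB1 index computed on the given counts."""
    for c, t in enumerate(trials):
        if t == 0:
            return c  # sense every channel at least once
    total = sum(trials)
    scores = [
        idle[c] / trials[c] + math.sqrt(2 * math.log(total) / trials[c])
        for c in range(len(trials))
    ]
    return max(range(len(trials)), key=scores.__getitem__)


def simulate(p_idle, n_users=4, rounds=2000, sense_error=0.1, seed=0):
    """Toy run: p_idle[c] is the true idle probability of channel c."""
    rng = random.Random(seed)
    users = [SecondaryUser(len(p_idle)) for _ in range(n_users)]
    for _ in range(rounds):
        for u in users:
            # Each SU decides on the pooled knowledge, not only its own.
            trials, idle = pooled_counts(users)
            c = ucb1_choice(trials, idle)
            truly_idle = rng.random() < p_idle[c]
            # Local sensing reports the wrong state with prob. sense_error.
            wrong = rng.random() < sense_error
            u.record(c, truly_idle != wrong)
    return pooled_counts(users)
```

Calling `simulate([0.2, 0.5, 0.9])` returns the pooled `(trials, idle)` counts after the run. Passing a single SU's own counts to `ucb1_choice` instead of the pooled ones gives the isolated baseline that the abstract contrasts against.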