Date Added: Oct 2009
This paper considers the following stochastic control problem, which arises in opportunistic spectrum access: a system consists of n channels whose states ("Good" or "Bad") evolve as independent and identically distributed Markov processes. In each time slot, a user selects exactly k channels to sense and then accesses them based on the sensing results. A reward is obtained whenever the user senses and accesses a "Good" channel. The objective is to design a channel selection policy that maximizes the expected total discounted reward accrued over a finite or infinite horizon.
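The problem setup can be made concrete with a small simulation. The following is a minimal sketch under several assumptions that the abstract does not state: sensing is error-free, each "Good" channel accessed yields a reward of 1, the transition probabilities are named `p11` (Good to Good) and `p01` (Bad to Good), and the initial belief is the stationary probability of "Good". The greedy rule used here (sense the k channels with the highest belief) is only an illustrative placeholder policy, not necessarily the policy the paper analyzes.

```python
import random


def update_belief(belief, p11, p01):
    """Return the probability that a channel is Good in the next slot,
    given the current probability `belief` that it is Good now."""
    return belief * p11 + (1.0 - belief) * p01


def simulate(n, k, p11, p01, horizon, beta, seed=0):
    """Simulate one sample path and return the discounted total reward.

    Assumptions (hypothetical, for illustration only):
      - perfect sensing, unit reward per Good channel accessed;
      - greedy selection of the k channels with the highest belief;
      - 1 - p11 + p01 > 0, so the stationary distribution is defined.
    """
    rng = random.Random(seed)
    pi_good = p01 / (1.0 - p11 + p01)  # stationary probability of Good
    states = [rng.random() < pi_good for _ in range(n)]  # True means Good
    beliefs = [pi_good] * n
    total = 0.0
    for t in range(horizon):
        # Choose the k channels currently believed most likely to be Good.
        chosen = sorted(range(n), key=lambda i: beliefs[i], reverse=True)[:k]
        # Collect one unit of reward for each sensed channel that is Good.
        reward = sum(1 for i in chosen if states[i])
        total += (beta ** t) * reward
        # Sensing reveals the true state of the chosen channels.
        for i in chosen:
            beliefs[i] = 1.0 if states[i] else 0.0
        # Every channel evolves independently with the same Markov chain.
        states = [rng.random() < (p11 if s else p01) for s in states]
        # Propagate all beliefs one step forward.
        beliefs = [update_belief(b, p11, p01) for b in beliefs]
    return total
```

For example, `simulate(n=5, k=2, p11=0.8, p01=0.2, horizon=50, beta=0.95)` returns the discounted reward of a single sample path; averaging over many seeds would estimate the expected value that the control problem seeks to maximize.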