Date Added: Apr 2012
The authors consider a large-scale cyber network with N components. Each component is either in a healthy state (0) or an abnormal state (1). Due to intrusions, the state of each component transits from 0 to 1 over time according to an arbitrary stochastic process. The objective is to design a dynamic probing strategy that minimizes the long-term network cost incurred at all abnormal components. They formulate the problem as a Restless Multi-Armed Bandit (RMAB) process. They show that this class of RMAB is indexable and Whittle index can be obtained in closed-form. For homogeneous networks, they show that Whittle index policy achieves the optimal performance with a simple structure that does not require any prior knowledge on the intrusion processes.