Online Reinforcement Learning for Dynamic Multimedia Systems
Source: Cornell University
In the previous work, the authors proposed a systematic cross-layer framework for dynamic multimedia systems, which allows each layer to make autonomous and foresighted decisions that maximize the system's long-term performance, while meeting the application's real-time delay constraints. The proposed solution solved the cross-layer optimization offline, under the assumption that the multimedia system's probabilistic dynamics were known a priori, by modeling the system as a layered Markov decision process. In practice, however, these dynamics are unknown a priori and therefore must be learned online. In this paper, they address this problem by allowing the multimedia system layers to learn, through repeated interactions with each other, to autonomously optimize the system's long-term performance at run-time.