Recent work has validated the possibility of improving energy efficiency in Radio Access Networks (RANs) by dynamically turning some Base Stations (BSs) on and off. This paper extends the analysis of BS switching operations, which should match traffic load variations. The authors first formulate the traffic variations as a Markov decision process. Then, to minimize the energy consumption of RANs, they design a BS switching operation scheme based on a reinforcement learning framework.