Highlights
• We propose a novel learning-based algorithm for selecting combinations of sub-frames. A portion of the total sub-frames forms the protective sub-frame (PSF) matrix, from which combinations of sub-frames are selected according to their probability of communicating with the edge low-QoS UEs. The dimension of the PSF matrix depends on the number of sub-frames and the number of low-QoS UEs.
• We introduce a cooperative game of learning automata, which suits cooperative multi-agent systems. Each UE learns the sub-frame combination history of neighboring UEs at time t and updates its selection probabilities, which reflect the stochastic characteristics of the selected sub-frames, to influence the selection at t + 1. This assumes that most UEs in a HetNet do not change location radically over a short network time interval.
• Power allocation for the selected combination of sub-frames is formulated so that maximum transmission power is allocated to the sub-frames with lower interference while the high-interference HPN/RRH is muted, thereby increasing the EE.
• The energy optimization problem is formulated as a non-convex problem subject to interference and limited available power. Nonlinear fractional programming is used to solve the non-convex problem, upon which we develop an efficient iterative algorithm.
• Simulation results evaluate the proposed sub-frame selection algorithm and the power allocation strategy for energy efficiency, as well as the overall EE, convergence, and system capacity of the heterogeneous multi-cloud RAN.

Abstract
Future networks in real environments are expected to involve a radio access network comprising several clouds, as opposed to the single-cloud scenario considered in the recent cloud radio access network (CRAN) literature.
Hence, this research presents the more favorable multi-cloud scenario to address the limited coverage and processing abilities of CRAN, in which inter-tier and inter-cloud interference are considered and resource allocation is designed to enhance both the spectral efficiency and the energy efficiency (EE). To mitigate interference and enhance both efficiencies, we adopt protective sub-frame (PSF) based interference mitigation: a portion of the total sub-frames forms the PSF matrix, from which combinations of sub-frames are selected to communicate with the edge users that have low quality of service (QoS). In line with this, we introduce a novel sub-frame selection algorithm based on a cooperative game of learning automata, in which UEs learn the past sub-frame selection history of neighboring UEs and update their selection probabilities, which reflect the stochastic characteristics of the selected sub-frames, to influence the next sub-frame selection. Power allocation for the selected combination of sub-frames is then formulated to allocate transmission power based on the received SINR. Furthermore, the problem is formulated as a non-convex energy-efficient resource assignment problem, which is converted into a tractable convex problem using nonlinear fractional programming, upon which we develop an efficient iterative algorithm. Finally, we evaluate the proposed model in terms of convergence, EE, and overall system capacity. Simulation results confirm that the cooperative learning automata-based sub-frame selection enhances the EE significantly, with faster convergence and improved system capacity. [ABSTRACT FROM AUTHOR]
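The abstract does not state which probability update rule the learning automata use. As an illustration only, the sketch below assumes the standard linear reward-inaction scheme: a UE keeps one probability per candidate sub-frame combination and reinforces a combination whenever it, or a neighboring UE, reports a reward for it. The names `lri_update`, `cooperative_step`, and `select`, the learning rate, and the reward scale in [0, 1] are hypothetical and not taken from the paper.

```python
import random


def lri_update(probs, chosen, reward, rate=0.1):
    """Linear reward-inaction update (an assumed rule, not the paper's).

    probs  : selection probabilities, one per sub-frame combination
    chosen : index of the combination that was used
    reward : feedback in [0, 1]; 0 leaves the probabilities unchanged
    """
    return [
        p + rate * reward * (1.0 - p) if i == chosen else p - rate * reward * p
        for i, p in enumerate(probs)
    ]


def cooperative_step(probs, observations, rate=0.1):
    """Apply the update for the UE's own and its neighbors' reports.

    observations : iterable of (chosen_index, reward) pairs
    """
    for chosen, reward in observations:
        probs = lri_update(probs, chosen, reward, rate)
    return probs


def select(probs, rng=random):
    """Draw the next sub-frame combination according to the probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]
```

Starting from a uniform vector over four combinations, a call such as `cooperative_step([0.25] * 4, [(2, 1.0), (2, 0.5)])` shifts probability mass toward combination 2 while keeping the vector normalized, which is how history at time t can bias the selection at t + 1.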
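Nonlinear fractional programming of the kind mentioned in the abstract is commonly solved with a Dinkelbach-type iteration, which replaces the ratio rate/power by a sequence of subtractive problems. The sketch below applies that idea to a deliberately simplified single-link model, EE(p) = log2(1 + g·p) / (P_c + p) with 0 ≤ p ≤ p_max. The model, the symbols `gain`, `p_circuit`, and `p_max`, and the function names are assumptions for illustration; the paper's actual problem also involves interference constraints and multiple clouds, which are not modeled here.

```python
import math


def rate(p, gain):
    """Achievable rate in bit/s/Hz for transmit power p (toy model)."""
    return math.log2(1.0 + gain * p)


def best_power(q, gain, p_max):
    """Maximize rate(p) - q * (p_circuit + p) over [0, p_max].

    The objective is concave in p, so the stationary point
    p = 1 / (q ln 2) - 1 / gain is clipped to the feasible interval.
    """
    if q <= 0.0:
        return p_max  # with no power penalty, more power is always better
    p = 1.0 / (q * math.log(2.0)) - 1.0 / gain
    return min(max(p, 0.0), p_max)


def dinkelbach(gain, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Return (power, energy efficiency) for the toy single-link problem.

    p_circuit must be positive so the ratio is well defined.
    """
    q = 0.0
    p = p_max
    for _ in range(max_iter):
        p = best_power(q, gain, p_max)
        gap = rate(p, gain) - q * (p_circuit + p)  # zero at the optimum
        q = rate(p, gain) / (p_circuit + p)        # updated EE estimate
        if gap < tol:
            break
    return p, q
```

Calling `dinkelbach(gain=10.0, p_circuit=1.0, p_max=5.0)` returns a power level inside the feasible interval together with the corresponding energy efficiency; the loop stops once the subtractive objective reaches zero, which is the optimality condition of the fractional problem.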