Abstract: Dynamic pricing is one of the most effective ways to encourage customers to change their consumption patterns. Therefore, a Reinforcement Learning-based Optimizing Dynamic Pricing (RLODP) algorithm is proposed for energy management in a hierarchical electricity market, considering both the service provider's (SP's) profit and the customers' costs. Using reinforcement learning, the SP can adaptively determine the retail electricity price. The dynamic pricing problem is formulated as a discrete finite Markov Decision Process (MDP), and Q-learning is adopted to solve this decision-making problem. Simulation results show that the RLODP algorithm can reduce energy costs for customers and balance energy supply and demand in the electricity market.
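To make the MDP and Q-learning formulation concrete, the following is a minimal sketch of tabular Q-learning applied to a toy pricing problem. It is not the RLODP algorithm itself: the price levels, wholesale prices, baseline demands, linear demand response, quadratic dissatisfaction term, and weight `RHO` are all hypothetical values and modelling assumptions chosen for illustration. The state is taken to be the time slot only, so the next state does not depend on the chosen price, which is a deliberate simplification.

```python
import random

# All numbers below are hypothetical and chosen only for illustration.
PRICES = [2.0, 3.0, 4.0, 5.0]      # discrete retail price levels (actions)
WHOLESALE = [1.0, 1.5, 2.0]        # wholesale price in each time slot (states)
BASE_DEMAND = [4.0, 6.0, 10.0]     # customer demand before any price response
ELASTICITY = 0.1                   # assumed fraction of demand curtailed per unit of price
DISSATISFACTION = 1.0              # assumed weight on the customer's curtailment discomfort
RHO = 0.7                          # assumed trade-off between SP profit and customer cost


def step(slot, action):
    """Return (reward, next_slot) for charging PRICES[action] in a time slot."""
    price = PRICES[action]
    demand = BASE_DEMAND[slot] * (1.0 - ELASTICITY * price)
    curtailed = BASE_DEMAND[slot] - demand
    sp_profit = (price - WHOLESALE[slot]) * demand
    customer_cost = price * demand + DISSATISFACTION * curtailed ** 2
    reward = RHO * sp_profit - (1.0 - RHO) * customer_cost
    return reward, (slot + 1) % len(WHOLESALE)


def train(steps=60000, lr=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(PRICES) for _ in WHOLESALE]
    slot = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(PRICES))  # explore
        else:
            action = max(range(len(PRICES)), key=lambda a: q[slot][a])  # exploit
        reward, next_slot = step(slot, action)
        # Q-learning update: bootstrap from the best action in the next state.
        target = reward + gamma * max(q[next_slot])
        q[slot][action] += lr * (target - q[slot][action])
        slot = next_slot
    return q


def greedy_prices(q):
    """Read the learned retail price for each time slot off the Q-table."""
    return [PRICES[max(range(len(PRICES)), key=lambda a: row[a])] for row in q]


if __name__ == "__main__":
    print(greedy_prices(train()))
```

Calling `greedy_prices(train())` returns one retail price per time slot. Changing `RHO` shifts the balance between the SP's profit and the customers' costs, which is the trade-off the abstract describes. A fuller model would also let the state carry information that the price influences, such as deferred demand.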