Abstract: Wireless sensor networks (WSNs) in the power system can sense and collect equipment status and environmental data in real time, making them an important technology for promoting the development of the smart grid. To address the special requirements on network survival time, transmission delay, and packet loss rate of WSNs in substation scenarios, a WSN routing scheme based on reinforcement learning is proposed. The packet-sending process in the WSN is modeled as a Markov decision process (MDP), the rewards are set according to the optimization objectives, and a Q-learning-based method for solving the optimal route is given. Simulation results and numerical analysis show that the proposed scheme outperforms the benchmark scheme in terms of network survival time, transmission delay, and packet loss rate.
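The MDP formulation summarized above can be illustrated with a minimal tabular Q-learning sketch. It is not the scheme evaluated in the paper: the five-node topology, link delays, loss probabilities, residual energies, reward weights, and hyperparameters are all hypothetical values chosen for illustration. The state is the node currently holding the packet, the action is the choice of next hop, and the reward penalizes link delay, link loss, and low residual energy at the next hop, with a bonus for reaching the sink.

```python
import random

SINK = 4
# Hypothetical symmetric links: (u, v) -> (delay in ms, packet loss probability)
LINKS = {(0, 1): (2.0, 0.05), (0, 2): (1.0, 0.02),
         (1, 3): (2.0, 0.05), (2, 3): (1.0, 0.02), (3, 4): (1.0, 0.02)}
# Hypothetical normalized residual energy per node (node 2 is nearly depleted)
ENERGY = {0: 1.0, 1: 0.9, 2: 0.2, 3: 0.8, 4: 1.0}
# Assumed reward weights; the abstract does not specify them
W_DELAY, W_LOSS, W_ENERGY, SINK_BONUS = 0.1, 10.0, 2.0, 10.0


def build_neighbors(links):
    """Expand the undirected link list into a per-node neighbor table."""
    nbrs = {}
    for (u, v), metrics in links.items():
        nbrs.setdefault(u, {})[v] = metrics
        nbrs.setdefault(v, {})[u] = metrics
    return nbrs


NEIGHBORS = build_neighbors(LINKS)


def reward(u, v):
    """Negative weighted cost of forwarding u -> v, plus a bonus at the sink."""
    delay, loss = NEIGHBORS[u][v]
    cost = W_DELAY * delay + W_LOSS * loss + W_ENERGY * (1.0 - ENERGY[v])
    return -cost + (SINK_BONUS if v == SINK else 0.0)


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, max_hops=20, seed=0):
    """Tabular Q-learning: state = current node, action = next hop."""
    rng = random.Random(seed)
    q = {u: {v: 0.0 for v in NEIGHBORS[u]} for u in NEIGHBORS if u != SINK}
    sources = sorted(q)
    for _ in range(episodes):
        node = rng.choice(sources)  # each episode routes one packet
        for _ in range(max_hops):
            if rng.random() < epsilon:  # explore a random neighbor
                nxt = rng.choice(sorted(q[node]))
            else:  # exploit the current best next hop
                nxt = max(q[node], key=q[node].get)
            future = 0.0 if nxt == SINK else max(q[nxt].values())
            target = reward(node, nxt) + gamma * future
            q[node][nxt] += alpha * (target - q[node][nxt])
            if nxt == SINK:
                break
            node = nxt
    return q


def greedy_route(q, source, max_hops=10):
    """Follow the highest-valued next hop from source until the sink."""
    route = [source]
    while route[-1] != SINK and len(route) <= max_hops:
        route.append(max(q[route[-1]], key=q[route[-1]].get))
    return route


if __name__ == "__main__":
    q_table = train()
    print(greedy_route(q_table, 0))
```

With these assumed numbers, the route learned from node 0 passes through node 1 rather than the low-energy node 2, even though the links through node 2 have lower delay and loss. This shows how the reward weights trade network survival time against delay and packet loss.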