Francisco Rodríguez Sánchez | Tecnológico Nacional de México/IT Tuxtla Gutiérrez |
Ildeberto Santos-Ruiz | Tecnológico Nacional de México/IT Tuxtla Gutiérrez |
Joaquín Domínguez-Zenteno | Tecnológico Nacional de México/IT Tuxtla Gutiérrez |
Francisco-Ronay López-Estrada | Tecnológico Nacional de México/IT Tuxtla Gutiérrez |
https://doi.org/10.58571/CNCA.AMCA.2022.019
Abstract: This article presents the general formulation and terminology of reinforcement learning (RL) from the perspective of Bellman's equations based on a reward function, together with its learning methods and algorithms. The key element in RL is the computation of the state-value and action-value (state-action) functions, which are used to find, compare, and improve policies for the learning agent through value-based and policy-based methods such as Q-learning. The deep deterministic policy gradient (DDPG) algorithm, based on an actor-critic structure, is also described as one way of training the RL agent. RL algorithms can be used to design closed-loop controllers. As an example, the DDPG algorithm is applied in simulation to the inverted pendulum, showing that training is completed in a reasonable time and illustrating the role of RL algorithms as tools that, combined with control, can address this type of problem.
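As a rough illustration of the value-based idea named in the abstract, the sketch below runs tabular Q-learning, which repeatedly moves Q(s, a) toward the Bellman target r + γ max Q(s', a'). The five-state chain environment, the hyperparameters, and the function name `q_learning` are assumptions chosen for illustration. They are not taken from the paper, whose simulation example uses DDPG on an inverted pendulum.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.1, max_steps=1000, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Reaching the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]

    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if s == goal:
                break
            # Epsilon-greedy action selection; ties are broken at random.
            if rng.random() < epsilon or q[s][0] == q[s][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1

            # Deterministic transition with a wall at state 0.
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0

            # Bellman target: no bootstrap term at the terminal state.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q_table = q_learning()
# Greedy policy for every non-terminal state (1 means "move right").
greedy_policy = [max((0, 1), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
print(greedy_policy)
```

A table of this kind only works for small discrete state and action sets. For continuous actions, as in the pendulum example, an actor-critic method such as DDPG replaces the table with function approximators.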
How to cite
Francisco Rodríguez Sánchez, Ildeberto Santos-Ruiz, Joaquín Domínguez-Zenteno & Francisco-Ronay López-Estrada. Control Applications Using Reinforcement Learning: An Overview. Memorias del Congreso Nacional de Control Automático, pp. 67-72, 2022. https://doi.org/10.58571/CNCA.AMCA.2022.019
Keywords
Intelligent Control; Neural Networks; Computing for Control