Learning Differentiable Decision Trees for Reinforcement Learning: Q-Learning or Policy Gradient?
In an earlier post, I laid out reasons why we might want to use a differentiable decision tree for reinforcement learning. In this post, I cover some experiments comparing two approaches for learning the parameters of these models, Q-Learning and Policy Gradient, and show that Q-Learning may not be a great choice if you want to apply differentiable decision trees in your own work.