A Smarter Solution to Speeding Up AI Training | HackerNoon

We show that classical value iteration (VI) is suboptimal and that adding an anchoring mechanism accelerates VI to an optimal rate, matching a complexity lower bound up to a constant factor of 4.
Our results suggest that the classical foundations of dynamic programming and reinforcement learning may be improved by examining them through the lens of optimization complexity theory.
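
To make the idea concrete, here is a minimal sketch that runs classical VI next to an anchored variant. The summary above does not spell out the anchoring update, so the form used here is an assumption: each step blends the starting point back in, U^k = beta_k * U^0 + (1 - beta_k) * T(U^(k-1)), with beta_k = 1 / sum_{i=0..k} gamma^(-2i). The two-state MDP, its rewards, and all function names are hypothetical and exist only for illustration.

```python
# Hypothetical 2-state, 2-action MDP, invented purely for illustration.
GAMMA = 0.9
# P[s][a] lists next-state probabilities; R[s][a] is the immediate reward.
P = [[[0.8, 0.2], [0.1, 0.9]],
     [[0.5, 0.5], [0.0, 1.0]]]
R = [[1.0, 0.0],
     [0.0, 2.0]]


def bellman(v):
    """Apply the Bellman optimality operator T to the value vector v."""
    return [
        max(
            R[s][a] + GAMMA * sum(p * v[t] for t, p in enumerate(P[s][a]))
            for a in range(2)
        )
        for s in range(2)
    ]


def value_iteration(v0, iters):
    """Classical VI: repeatedly apply T."""
    v = list(v0)
    for _ in range(iters):
        v = bellman(v)
    return v


def anchored_value_iteration(v0, iters):
    """Anchored VI (assumed form): pull each iterate back toward v0.

    beta_k = 1 / sum_{i=0}^{k} GAMMA**(-2*i) is an assumed schedule,
    not something stated in the text above.
    """
    v = list(v0)
    weight_sum = 1.0  # the i = 0 term of the sum
    for k in range(1, iters + 1):
        weight_sum += GAMMA ** (-2 * k)
        beta = 1.0 / weight_sum
        tv = bellman(v)
        v = [beta * anchor + (1.0 - beta) * t for anchor, t in zip(v0, tv)]
    return v


def bellman_error(v):
    """Sup-norm of T(v) - v, the quantity the complexity bounds refer to."""
    return max(abs(t - x) for t, x in zip(bellman(v), v))


if __name__ == "__main__":
    start = [0.0, 0.0]
    for n in (5, 10, 20):
        print(n,
              bellman_error(value_iteration(start, n)),
              bellman_error(anchored_value_iteration(start, n)))
```

Running the script prints the Bellman error of both methods after a few iteration counts, so you can compare them on this toy problem. A single small example like this says nothing about worst-case rates, which is what the optimality claim is about.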