Primal-Dual Method for Reinforcement Learning and Markov Decision Processes