Predictive Representation Learning For Language Modeling