Latent Thinking Optimization

Code for supervising and improving latent reasoning by treating hidden-state correctness signals as a latent reward model.

Latent Thinking Optimization investigates reasoning systems whose intermediate thoughts remain in hidden representations rather than being verbalized as chain-of-thought text.

The project trains a classifier to distinguish latent trajectories that lead to correct and incorrect answers, then uses that classifier as a latent reward model. Its probabilistic optimization procedure steers the model’s hidden reasoning process using those learned signals.

GitHub repo Read paper