Discretizing Reward Models

(arxiv.org)

2 points | by gmays 4 hours ago

0 comments