See other notes on the PDF on the e-reader.

This paper is an overview of the problems that arise during RLHF. It classifies them by where in the process they occur and by how tractable they are.

The paper highlights the limitations of RLHF and suggests that multiple layers are needed. It approaches the topic from an AI alignment perspective.

Paper club discussion

Structure of Review

Notes

Questions