RLHF (reinforcement learning from human feedback)
Published
RLHF, or reinforcement learning from human feedback, is a training method where human judgments of a model's outputs are used to align the model with what people prefer. The human-judgment part needs no coding: anyone can compare answers and pick the better one. That preference data is what researchers later use to align the model.
Learn more: Tap to Train.