Tuning a model using human preference rankings to make its outputs more helpful and aligned.