Reinforcement Learning from Human Feedback (RLHF)
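In RLHF, people rate or compare AI responses, for example with a thumbs up or thumbs down. Those ratings train a reward model that captures what 'good' means, and the language model is then fine-tuned to produce responses that score well. This is a large part of how ChatGPT became polite and helpful: not through hand-written rules, but through human feedback.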
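For readers who want to see the idea in code, here is a minimal sketch of the "learning what 'good' means" step only. It assumes a made-up feedback log in which each response is reduced to two hypothetical features (is it polite, does it answer the question) and labeled thumbs up (1) or thumbs down (0). Real RLHF systems typically learn from comparisons between full text responses and then fine-tune the model with reinforcement learning; none of that is shown here, and all names below are illustrative.

```python
import math

# Hypothetical feedback log (an assumption for illustration):
# ([is_polite, answers_question], thumbs_up) where thumbs_up is 1 or 0.
feedback = [
    ([1.0, 1.0], 1),
    ([1.0, 0.0], 0),
    ([0.0, 1.0], 1),
    ([0.0, 0.0], 0),
    ([1.0, 1.0], 1),
    ([0.0, 0.0], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, bias, features):
    """Predicted probability that a human would give a thumbs up."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


def train_reward_model(data, epochs=500, lr=0.5):
    """Fit a tiny logistic-regression reward model with gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            error = score(weights, bias, features) - label
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


weights, bias = train_reward_model(feedback)

# Score two new candidate responses with the learned reward model.
helpful = score(weights, bias, [1.0, 1.0])    # polite and answers the question
unhelpful = score(weights, bias, [1.0, 0.0])  # polite but does not answer

print(f"helpful response reward:   {helpful:.2f}")
print(f"unhelpful response reward: {unhelpful:.2f}")
```

In this toy data the thumbs-up label tracks whether the response answers the question, so the learned reward favors the first candidate. In a full RLHF pipeline, scores like these would steer further training of the language model.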