RT @_philschmid: Mini-R1: Reproduce @deepseek_ai R1 “aha moment”, an RL tutorial! Recreate an RL “aha moment” using Group Relative Policy Opt…
— Abhay 🇸🇬🇮🇳 (@Abhay08)
Jan 31, 2025