generated from fastai/nbdev_template
Pull requests: huggingface/trl
fix bug: When chat_template is not defined exception will occur
#3340
opened Apr 22, 2025 by
dignfei
2 tasks
Fix train and eval mode checking in GRPOTrainer
#3337
opened Apr 22, 2025 by
I-l-l-I
1 of 5 tasks
[GRPO] adds experimental support for the SSR replay buffer
#3325
opened Apr 18, 2025 by
edbeeching
Draft
[vllm] support base_url parameter for vLLM client initialization
#3324
opened Apr 18, 2025 by
re-imagined
Allow for saving the PPOTrainer value model (critic model)
#3308
opened Apr 16, 2025 by
AMindToThink
[DPO] Model forward pass padding side fix
#3307
opened Apr 16, 2025 by
LeonEricsson
2 of 5 tasks
PPO value_model can't be None, so it shouldn't be Optional
#3300
opened Apr 15, 2025 by
AMindToThink
Modified GRPOTrainer to accumulate gradient within a single training batch
#3288
opened Apr 13, 2025 by
jarrelscy
3 of 5 tasks
[NOT MEANT TO BE MERGED] Log correct/incorrect lengths
#3263
opened Apr 8, 2025 by
qgallouedec
Draft
[🐯+GRPO] Support FSDP + Fix bug when using LigerGRPO with DDP
#3260
opened Apr 8, 2025 by
shivam15s
1 of 5 tasks
feat(trainer): Support multi-role & consecutive turns in DataCollatorForCompletionOnlyLM (#3223)
#3224
opened Apr 3, 2025 by
Kirili4ik
4 tasks done