⏩ Train on completion only #3329

qgallouedec · 2025-04-20T23:37:01Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline,
Pull Request section?
Was this discussed/approved via a GitHub issue? Please add a link
to it if that's the case.
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

HuggingFaceDocBuilderDev · 2025-04-20T23:41:31Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

LeonEricsson · 2025-04-21T11:09:08Z

Suggestion to update the DataCollator example to:

    Examples:
    ```python
    >>> from trl import DataCollatorForLanguageModeling
    >>> collator = DataCollatorForLanguageModeling(pad_token_id=0)
    >>> examples = [
    ...     {"input_ids": [1, 2, 3, 4], "completion_mask": [0, 0, 1, 1]},
    ...     {"input_ids": [5, 6, 7], "completion_mask": [0, 1, 1]}
    ... ]
    >>> collator(examples)
    {'input_ids': tensor([[1, 2, 3, 4],
                          [5, 6, 7, 0]]),
     'attention_mask': tensor([[1, 1, 1, 1],
                               [1, 1, 1, 0]]),
     'labels': tensor([[-100, -100,    3,    4],
                       [-100,    6,    7, -100]])}
    ```

given that completion_only_loss is true by default
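
To make the expected `labels` above easier to check by hand, here is a minimal pure-Python sketch of the masking logic, using lists instead of tensors. The function name `collate_completion_only` and its parameters are hypothetical and only illustrate the idea; this is not the actual `DataCollatorForLanguageModeling` implementation.

```python
def collate_completion_only(examples, pad_token_id=0, ignore_index=-100):
    """Pad a batch to a common length and keep labels only on completion tokens.

    Hypothetical helper for illustration; works on plain lists, not tensors.
    """
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in examples:
        ids = ex["input_ids"]
        mask = ex["completion_mask"]
        pad = max_len - len(ids)
        # Prompt tokens (mask == 0) are replaced by ignore_index in the labels.
        labels = [tok if m == 1 else ignore_index for tok, m in zip(ids, mask)]
        # Right-pad everything; padded positions are ignored by the loss too.
        batch["input_ids"].append(ids + [pad_token_id] * pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * pad)
        batch["labels"].append(labels + [ignore_index] * pad)
    return batch


examples = [
    {"input_ids": [1, 2, 3, 4], "completion_mask": [0, 0, 1, 1]},
    {"input_ids": [5, 6, 7], "completion_mask": [0, 1, 1]},
]
batch = collate_completion_only(examples)
print(batch["labels"])
```

Positions set to -100 are skipped by the loss, so only completion tokens contribute to training.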

qgallouedec · 2025-04-22T22:30:59Z

tests/test_sft_trainer.py

-    def test_train_model_wrong_torch_dtype(self):
-        # Get the dataset
-        dataset = load_dataset("trl-internal-testing/zen", "standard_language_modeling", split="train")
-
-        with tempfile.TemporaryDirectory() as tmp_dir:
-            # Initialize the trainer
-            training_args = SFTConfig(output_dir=tmp_dir, model_init_kwargs={"torch_dtype": -1}, report_to="none")
-            with self.assertRaises(ValueError) as context:
-                SFTTrainer(
-                    model="trl-internal-testing/tiny-Qwen2ForCausalLM-2.5", args=training_args, train_dataset=dataset
-                )
-            self.assertIn(
-                "Invalid `torch_dtype` passed to `SFTConfig`. Expected either 'auto' or a string representing "
-                "a `torch.dtype` (e.g., 'float32'), but got -1.",
-                str(context.exception),
-            )
-


This is not related to the core change of this PR.
With the new serialisation logic of TrainingArguments, passing a wrong dtype already fails when TrainingArguments is instantiated, so this test is no longer needed.

qgallouedec · 2025-04-22T22:32:31Z

trl/trainer/sft_trainer.py

-            # If the dataset is prompt-completion, convert it to language modeling type
-            first_example = next(iter(dataset))
-            if "prompt" in first_example.keys() and "completion" in first_example.keys():
-                key = "messages" if is_conversational(first_example) else "text"
-
-                def concat_prompt_completion(example):
-                    return {key: example["prompt"] + example["completion"]}
-
-                dataset = dataset.map(concat_prompt_completion, remove_columns=["prompt", "completion"])
-


This concatenation needs to be removed, as we lose the information about where the completion starts. The completion is now handled in tokenize.
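
A rough sketch of why keeping prompt and completion separate matters: if the prompt is tokenized on its own, its length tells us where the completion starts, and a completion mask can be built from that. The `toy_tokenize` stand-in and the function `tokenize_prompt_completion` are assumptions made for illustration, not the code in this PR; a real tokenizer may also merge tokens across the prompt/completion boundary, which this sketch ignores.

```python
def toy_tokenize(text):
    # Stand-in tokenizer (assumption): one "token id" per whitespace-separated
    # word, using the word length as the id.
    return [len(word) for word in text.split()]


def tokenize_prompt_completion(example):
    """Tokenize prompt and prompt+completion, then mark completion positions."""
    prompt_ids = toy_tokenize(example["prompt"])
    full_ids = toy_tokenize(example["prompt"] + example["completion"])
    # Everything past the prompt length belongs to the completion.
    n_completion = len(full_ids) - len(prompt_ids)
    completion_mask = [0] * len(prompt_ids) + [1] * n_completion
    return {"input_ids": full_ids, "completion_mask": completion_mask}


example = {"prompt": "The sky is", "completion": " blue today"}
tokenized = tokenize_prompt_completion(example)
print(tokenized)
```

Had the two fields been concatenated into a single text column first, the prompt length would no longer be recoverable and the mask could not be computed.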

qgallouedec · 2025-04-22T22:43:09Z

Suggestion to update the DataCollator example to:

Thanks! Done in 7f7f2a4

qgallouedec and others added 3 commits April 20, 2025 23:20

Train on completion only

538ec20

a bit of documentation

34b7017

Merge branch 'fix-add_special_tokens' into train-completion-only

3a2788b

qgallouedec requested review from kashif, edbeeching, lewtun and shirinyamani April 20, 2025 23:42

HERIUN mentioned this pull request Apr 21, 2025

[SFT] only completion loss #3316

Closed

5 tasks

qgallouedec added 5 commits April 22, 2025 19:17

minor refinement and fix test

9821faa

now it's fixed!

5ea8495

allow none

d8b3e47

minors

a5dd899

title

0de43b8

qgallouedec commented Apr 22, 2025

update example

7f7f2a4

shirinyamani approved these changes Apr 23, 2025

Merge branch 'fix-add_special_tokens' into train-completion-only

eb9d7d1

qgallouedec merged commit 9497527 into fix-add_special_tokens Apr 23, 2025
10 checks passed

qgallouedec deleted the train-completion-only branch April 23, 2025 00:49
