feat: add a toggle for automatic rank/world_size discovery #3633

Open
wants to merge 1 commit into
base: main
from

Jay-ju
Contributor

@Jay-ju Jay-ju commented Apr 2, 2025

I am working on converting some index-building tasks that currently run on a single GPU into a single-machine multi-GPU scheme. However, I found that the current TorchDataset cannot be switched between single-machine single-GPU and single-machine multi-GPU modes. Therefore, this PR adds a switch for automatic rank/world_size discovery.

@github-actions github-actions bot added the bug and python labels Apr 2, 2025
@Jay-ju Jay-ju changed the title fix: add a toggle for automatic rank/world_size discovery feat: add a toggle for automatic rank/world_size discovery Apr 2, 2025
@github-actions github-actions bot added the enhancement label Apr 2, 2025
@eddyxu eddyxu requested review from jmhsieh and chebbyChefNEQ April 14, 2025 15:43
Comment on lines 296 to 297
rank = get_global_rank()
world_size = get_global_world_size()
Contributor

Could we just take rank and world_size as arguments? If both are None, infer them; if they are set, use the provided values.
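A minimal sketch of this suggestion, under stated assumptions: `infer_rank_and_world_size` is a hypothetical stand-in for the `get_global_rank()` / `get_global_world_size()` calls quoted above, and here it simply reads the `RANK` and `WORLD_SIZE` environment variables that launchers such as torchrun set. The function names, and the choice to reject a half-specified pair, are illustrative and not taken from the PR.

```python
import os
from typing import Optional, Tuple


def infer_rank_and_world_size() -> Tuple[int, int]:
    """Hypothetical stand-in for get_global_rank()/get_global_world_size().

    Assumption: the launcher exports RANK and WORLD_SIZE; fall back to a
    single-process setup (rank 0 of 1) when they are absent.
    """
    rank = int(os.environ.get("RANK", "0"))
    world_size = int(os.environ.get("WORLD_SIZE", "1"))
    return rank, world_size


def resolve_rank_and_world_size(
    rank: Optional[int] = None,
    world_size: Optional[int] = None,
) -> Tuple[int, int]:
    """Use explicit values when given, otherwise fall back to inference."""
    if rank is None and world_size is None:
        # Neither value supplied: keep the automatic behaviour.
        return infer_rank_and_world_size()
    if rank is None or world_size is None:
        # Assumption: a half-specified pair is treated as a caller error.
        raise ValueError("rank and world_size must be set together")
    if world_size < 1 or not 0 <= rank < world_size:
        raise ValueError(f"invalid rank={rank} for world_size={world_size}")
    return rank, world_size
```

With this shape, a caller that wants single-GPU behaviour inside a multi-GPU job could pass `rank=0, world_size=1` explicitly, while existing callers that pass nothing keep the inferred values.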

Contributor

Or, use a NO_AUTO_INIT = -1 value to avoid this branch?
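One possible reading of the sentinel idea, sketched under assumptions: passing `NO_AUTO_INIT` for either value disables discovery and treats the process as a single worker, so no separate boolean toggle is needed. Only the name `NO_AUTO_INIT` and its value `-1` come from the comment; `resolve_with_sentinel`, `shard`, and the use of the `RANK` / `WORLD_SIZE` environment variables for discovery are hypothetical.

```python
import os
from typing import List, Optional, Sequence, Tuple

# Sentinel from the review comment: "do not auto-discover".
NO_AUTO_INIT = -1


def resolve_with_sentinel(
    rank: Optional[int] = None,
    world_size: Optional[int] = None,
) -> Tuple[int, int]:
    """Resolve (rank, world_size), honouring the NO_AUTO_INIT sentinel."""
    if rank == NO_AUTO_INIT or world_size == NO_AUTO_INIT:
        # Assumption: opting out means "behave like a single process".
        return 0, 1
    if rank is None:
        # Hypothetical discovery via launcher-provided environment variables.
        rank = int(os.environ.get("RANK", "0"))
    if world_size is None:
        world_size = int(os.environ.get("WORLD_SIZE", "1"))
    return rank, world_size


def shard(items: Sequence[int], rank: int, world_size: int) -> List[int]:
    """Round-robin split: each rank takes every world_size-th item."""
    return list(items[rank::world_size])
```

Under this sketch, `shard(fragments, *resolve_with_sentinel(NO_AUTO_INIT, NO_AUTO_INIT))` returns every item regardless of how the process was launched, whereas the default call splits the items across ranks.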

Signed-off-by: jukejian <jukejian@bytedance.com>
@Jay-ju
Contributor Author

Jay-ju commented Apr 20, 2025

@chebbyChefNEQ I made some updates. Do you have time to take a look?

Labels
bug, enhancement, python
2 participants