Feature request
I would like to request a single Docker image for both the CPU and GPU cases. This could be done by combining Dockerfile and Dockerfile-cuda-all. An entrypoint.sh could choose between the CPU and GPU binaries based on the availability of CUDA drivers or on "CUDA_VISIBLE_DEVICES".
Please let me know your thoughts on this.
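To make the idea concrete, here is a minimal sketch of what such an entrypoint.sh could look like. It is only an illustration: the binary paths `CPU_BIN` and `GPU_BIN` are hypothetical placeholders rather than the project's actual layout, and using `nvidia-smi -L` as the driver check is an assumption about how GPU availability might be probed.

```shell
#!/bin/sh
# Sketch only: CPU_BIN and GPU_BIN are hypothetical paths, not the real image layout.
CPU_BIN="${CPU_BIN:-/usr/local/bin/router-cpu}"
GPU_BIN="${GPU_BIN:-/usr/local/bin/router-cuda}"

# Returns 0 when a GPU looks usable, 1 otherwise.
has_gpu() {
    # If CUDA_VISIBLE_DEVICES is set to an empty value or -1, treat the GPUs
    # as deliberately hidden (a common way to disable CUDA devices).
    if [ "${CUDA_VISIBLE_DEVICES+set}" = "set" ]; then
        case "$CUDA_VISIBLE_DEVICES" in
            ""|-1) return 1 ;;
        esac
    fi
    # Assumption: nvidia-smi is present and succeeds only when the driver
    # and at least one device are exposed to the container.
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1
}

# Prints the path of the binary that should be started.
select_binary() {
    if has_gpu; then
        echo "$GPU_BIN"
    else
        echo "$CPU_BIN"
    fi
}

echo "selected: $(select_binary)"
```

A real entrypoint would end with `exec "$(select_binary)" "$@"` instead of the final `echo`, so that the chosen binary replaces the shell and receives the container arguments.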
Motivation
This would remove the need to choose which image to deploy based on the available resources. The image would be slightly larger, but I think that is a decent trade-off.
Your contribution
I would like to help contribute this with a PR, if it's an acceptable feature.
Currently we're not really keen on doing that. We already have an image covering all CUDA devices, but merging every path, including CPU (and most likely the various CPU backends), would mean adding both compile-time and runtime checks, which complicates the code quite a bit.
To take an example: let's say you want to run TEI on GPU on a CUDA-enabled device, but you make a mistake in your deployment and forget to provide the GPUs to the pod (by forgetting --gpus=all or --device=nvidia.com/gpu=all, for instance, or failing to install the proper CNI on the node). With an all-in-one image you'll end up running the CPU version instead, because we wouldn't be able to find the GPU. Everything will be rather slow, but you'd likely have no idea what the problem is.
We could have a flag of some kind to help choose which kind of accelerator you expect, but that would come down to something similar to choosing the correct image in the first place (like CUDA_VISIBLE_DEVICES).
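For illustration, the kind of flag described above could be sketched as follows. This is not an existing TEI option: the function `check_accelerator` and its `cuda`/`cpu`/`auto` values are hypothetical names made up for this sketch. It shows how an explicit expectation turns the silent CPU fallback into a hard error, at the cost of having to state the expectation up front.

```shell
#!/bin/sh
# Sketch only: check_accelerator and its accepted values are hypothetical.
#
# Usage: check_accelerator EXPECTED GPU_FOUND
#   EXPECTED  - "cuda", "cpu", or "auto"
#   GPU_FOUND - "yes" or "no" (result of whatever GPU probe is used)
# Prints the backend to start, or returns 1 with a message on stderr.
check_accelerator() {
    expected="$1"
    gpu_found="$2"
    case "$expected" in
        cuda)
            if [ "$gpu_found" = "yes" ]; then
                echo cuda
            else
                # Fail fast instead of silently running the slow CPU path.
                echo "error: cuda requested but no GPU is visible (missing --gpus=all?)" >&2
                return 1
            fi
            ;;
        cpu)
            echo cpu
            ;;
        auto)
            if [ "$gpu_found" = "yes" ]; then
                echo cuda
            else
                echo "warning: no GPU visible, falling back to CPU" >&2
                echo cpu
            fi
            ;;
        *)
            echo "error: unknown accelerator '$expected'" >&2
            return 1
            ;;
    esac
}
```

With `cuda` the misconfigured deployment from the example stops at startup, while `auto` reproduces the silent fallback. Setting the value correctly is roughly as much work as picking the right image tag.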