Skip to content

Intel-Gpu-Levelzero: Failed to get Indices from levelZero on WSL2 on Intel Lunar Lake Core Ultra 9 processor #2039

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
shailesh837 opened this issue Apr 11, 2025 · 4 comments · May be fixed by #2043
Labels
bug Something isn't working gpu GPU device plugin related issue

Comments

@shailesh837
Copy link

Describe the bug
We have Customer called CelebalTech based in Dubai, Finland, Bangalore, Targeting there Customer contact Call center Framework running on AIPC running single node rke2 kuberenetes cluster using OPEA-ERAG based RAG solution.
WSL2 : Ubuntu – 24.04
Document I used to reproduce the issue.

Currently we have the Latest the whole OPEA-Based RAG running usign docker-compose on LLM Model : llama3.2 running on iGPU (/dev/dri) exposure for iGPU (Windows driver : version):
After applying side-car (WSL2 : Ubuntu 24-04 gpu device plugin) there is level zero crash or failure inside pod:

Image

To Reproduce
Steps to reproduce the behavior:

  1. RKE2 version 1.31.4 Installtion
     sudo curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_VERSION=v1.31.4+rke2r1 sh -
  2. Enable the rke2-server service
     systemctl enable rke2-server.service
  3. Start the service
     systemctl start rke2-server.service
  4. Follow the logs, if you like
      journalctl -u rke2-server –f
  1. Replace canal cni with calico
    Step 1: Edit this file /etc/rancher/rke2/config.yaml and add cni: calico
    Step 2: restart daemonsets and rke2 service
    sudo systemctl daemon-reload
    sudo systemctl restart rke2-service
    WSL layer enables Intel GPU detection with WSL (Windows Subsystem for Linux) Kubernetes clusters.
    kubectl apply -k 'https://github.com/intel/intel-device-plugins-for-kubernetes/deployments/gpu_plugin/overlays/wsl?ref=v0.32.0'

Expected behavior
A clear and concise description of what you expected to happen.
Intel-gpu-plugin should detect levelzero indices and have gpu pod running under rke2 cluster

Image

Screenshots
If applicable, add screenshots to help explain your problem.

Image

Image

System (please complete the following information):

  • OS version: WSL2 : Ubuntu 24.04
  • Kernel version: microsoft Linux 5.15
  • Device plugins version: v0.32.0
  • Hardware info: Core Ultra 7 288V
@tkatila tkatila added bug Something isn't working gpu GPU device plugin related issue labels Apr 11, 2025
@eero-t
Copy link
Contributor

eero-t commented Apr 11, 2025

Potentially related: intel/compute-runtime#721 (driver segfault fix merged, but not yet in any release)

@tkatila
Copy link
Contributor

tkatila commented Apr 11, 2025

@eero-t it's not. LunarLake seems to require the igc packages that we have dropped by force in the containers. I have debugged the issue with @shailesh837 today.

@shailesh837
Copy link
Author

@tkatila : : Please let me know when we can have new intel-gpu-levelzero container with this changes in intel docker hub, As its urgently required, we have tested it in UAT with customer the local version we created , But now we need to test on multiple customer machines.

@tkatila
Copy link
Contributor

tkatila commented Apr 22, 2025

@shailesh837 it's in progress. We ran into some issues with our containers which we need to fix first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working gpu GPU device plugin related issue
Projects
None yet
3 participants