
2025-05-01 version of cuda.bindings.path_finder #578


Merged: 58 commits from path_finder_review1 into path_finder_dev on May 4, 2025

Conversation

@rwgk (Collaborator) commented Apr 25, 2025

This work was merged into the path_finder_dev branch (see comment below). Follow-on work is under #604.


Description

Major milestone for the work tracked under #451

This PR introduces only two public APIs:

  • cuda.bindings.path_finder.SUPPORTED_LIBNAMES (currently ('nvJitLink', 'nvrtc', 'nvvm'))
  • cuda.bindings.path_finder.load_nvidia_dynamic_library(libname: str) -> LoadedDL

With:

from dataclasses import dataclass
from typing import Optional


@dataclass
class LoadedDL:
    handle: int
    abs_path: Optional[str]
    was_already_loaded_from_elsewhere: bool
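
For illustration, a minimal usage sketch of the two public APIs (assuming the corresponding libraries are installed; the exact output varies by platform and installation):

from cuda.bindings import path_finder

print(path_finder.SUPPORTED_LIBNAMES)  # e.g. ('nvJitLink', 'nvrtc', 'nvvm')

for libname in path_finder.SUPPORTED_LIBNAMES:
    loaded = path_finder.load_nvidia_dynamic_library(libname)
    # LoadedDL fields as defined above
    print(libname, hex(loaded.handle), loaded.abs_path, loaded.was_already_loaded_from_elsewhere)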

However, the implementations were actually thoroughly tested (under #558) for all

SUPPORTED_LIBNAMES + PARTIALLY_SUPPORTED_LIBNAMES

enumerated under cuda.bindings._path_finder.supported_libs (note that this module is private).

To make this PR easier to review, the changes to the nvJitLink, nvrtc, and nvvm bindings are NOT included in this PR. Those changes were also already tested under #558. They will be merged into main with two follow-on PRs (one for the nvrtc bindings, one for nvJitLink and nvvm).

Thorough testing of all SUPPORTED_LIBNAMES + PARTIALLY_SUPPORTED_LIBNAMES requires changes to the GitHub Actions configs, to set up suitable CTK installations. This will also be handled separately in follow-on PRs.


Suggested order for reviewing files:

  • cuda/bindings/_path_finder/supported_libs.py
  • cuda/bindings/_path_finder/load_nvidia_dynamic_library.py
  • cuda/bindings/_path_finder/load_dl_common.py
  • cuda/bindings/_path_finder/load_dl_linux.py
  • cuda/bindings/_path_finder/load_dl_windows.py
  • tests/test_path_finder.py
  • cuda/bindings/_path_finder/find_nvidia_dynamic_library.py
  • everything else

Discussion points:

  • Copyright notice for cuda/bindings/_path_finder/cuda_paths.py (the original file under numba-cuda does not have one)
  • Documentation for the new public APIs
  • Documentation for maintaining SUPPORTED_LIBNAMES + PARTIALLY_SUPPORTED_LIBNAMES as new CTK versions are released

copy-pr-bot (bot) commented Apr 25, 2025

Auto-sync is disabled for ready for review pull requests in this repository. Workflows must be run manually.


@rwgk (Collaborator, Author) commented Apr 25, 2025

/ok to test 17478da

@rwgk (Collaborator, Author) commented Apr 25, 2025

/ok to test 7da74bd

@leofang assigned leofang and rwgk and unassigned leofang on Apr 25, 2025
@leofang self-requested a review on April 25, 2025 19:34
@leofang added the cuda.bindings (Everything related to the cuda.bindings module), P0 (High priority - Must do!), and feature (New feature or request) labels on Apr 25, 2025
@leofang added this to the cuda-python parking lot milestone on Apr 25, 2025
@rwgk (Collaborator, Author) commented Apr 25, 2025

/ok to test a649e7d

@rwgk (Collaborator, Author) commented Apr 25, 2025

For completeness:

I used these commands while working on commit a649e7d:

0db6015-lcedt.nvidia.com:~/ctk_downloads/extracted $ sos=`find . -type f -name 'libnvJitLink.so*' | sort`
0db6015-lcedt.nvidia.com:~/ctk_downloads/extracted $ for so in $sos; do echo $so; nm --defined-only -D $so | grep nvJitLinkVersion; done
./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
000000000025ed30 T nvJitLinkVersion@@libnvJitLink.so.12
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
00000000000017da T nvJitLinkVersion@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
0000000000265220 T nvJitLinkVersion@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001bd7 T nvJitLinkVersion@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
00000000002923b0 T nvJitLinkVersion@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000002064 T nvJitLinkVersion@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
0000000000264200 T nvJitLinkVersion@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
00000000000023f0 T nvJitLinkVersion@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.93
00000000004ba630 T nvJitLinkVersion@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000002c1a T nvJitLinkVersion@@libnvJitLink.so.12
0db6015-lcedt.nvidia.com:~/ctk_downloads/extracted $ for so in $sos; do echo $so; nm --defined-only -D $so | grep __nvJitLinkCreate_12_0; done
./12.0.1_525.85.12/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.0.140
0000000000226010 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.1.1_530.30.02/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.1.105
00000000002438d0 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.2.140
0000000000256fa0 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.2.2_535.104.05/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000000d39 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.3.101
000000000025ed60 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.3.2_545.23.08/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001389 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.4.127
0000000000265250 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.4.1_550.54.15/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
00000000000016a9 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.5.82
00000000002923e0 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.5.1_555.42.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001a59 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.6.77
0000000000264230 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.6.2_560.35.03/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000001d08 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/libnvJitLink.so.12.8.93
00000000004ba660 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12
./12.8.1_570.124.06/libnvjitlink/targets/x86_64-linux/lib/stubs/libnvJitLink.so
0000000000002378 T __nvJitLinkCreate_12_0@@libnvJitLink.so.12

@rwgk (Collaborator, Author) commented Apr 25, 2025

Test results for commit a649e7d:

$ python path_finder_abs_path_from_test_info.py Test*.txt
Test__linux-64__Python_3.10__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.10__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.11__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.11__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.12__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.12__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.13__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.13__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.9__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-64__Python_3.9__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.10__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.10__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.11__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.11__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.12__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.12__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.13__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.13__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-aarch64__Python_3.9__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__linux-aarch64__Python_3.9__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__win-64__Python_3.12__CUDA_11.8.0__Runner_default__CTK_wheels____test.txt
Test__win-64__Python_3.12__CUDA_11.8.0__Runner_default__local_CTK____test.txt
Test__linux-64__Python_3.10__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.10__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.10.17/x64/lib/python3.10/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.10__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.11__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.11__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.11.12/x64/lib/python3.11/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.11__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.12__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.12__CUDA_12.8.0__Runner_H100__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.12__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.12.10/x64/lib/python3.12/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.12__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.13__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.13__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.13__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.9__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-64__Python_3.9__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.9.22/x64/lib/python3.9/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.9.22/x64/lib/python3.9/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.9.22/x64/lib/python3.9/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-64__Python_3.9__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.10__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.10__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.10.17/arm64/lib/python3.10/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.10.17/arm64/lib/python3.10/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.10.17/arm64/lib/python3.10/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.10__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.11__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.11__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.11.12/arm64/lib/python3.11/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.11.12/arm64/lib/python3.11/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.11.12/arm64/lib/python3.11/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.11__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.12__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.12__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.12.10/arm64/lib/python3.12/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.12.10/arm64/lib/python3.12/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.12.10/arm64/lib/python3.12/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.12__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.13__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.13__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.13.3/arm64/lib/python3.13/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.13.3/arm64/lib/python3.13/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.13.3/arm64/lib/python3.13/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.13__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.9__CUDA_12.0.1__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__linux-aarch64__Python_3.9__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    /opt/hostedtoolcache/Python/3.9.22/arm64/lib/python3.9/site-packages/nvidia/nvjitlink/lib/libnvJitLink.so.12
    /opt/hostedtoolcache/Python/3.9.22/arm64/lib/python3.9/site-packages/nvidia/cuda_nvrtc/lib/libnvrtc.so.12
    /opt/hostedtoolcache/Python/3.9.22/arm64/lib/python3.9/site-packages/nvidia/cuda_nvcc/nvvm/lib64/libnvvm.so
Test__linux-aarch64__Python_3.9__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvJitLink.so
    /__w/cuda-python/cuda-python/cuda_toolkit/lib/libnvrtc.so
    /__w/cuda-python/cuda-python/cuda_toolkit/nvvm/lib64/libnvvm.so.4.0.0
Test__win-64__Python_3.12__CUDA_12.8.0__Runner_default__CTK_wheels____test.txt
    C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\nvjitlink\bin\nvJitLink_120_0.dll
    C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_nvrtc\bin\nvrtc64_120_0.dll
    C:\a\_tool\Python\3.12.10\x64\Lib\site-packages\nvidia\cuda_nvcc\nvvm\bin\nvvm64_40_0.dll
Test__win-64__Python_3.12__CUDA_12.8.0__Runner_default__local_CTK____test.txt
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvJitLink_120_0.dll
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\bin\nvrtc64_120_0.dll
    C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.8\nvvm\bin\nvvm64_40_0.dll

$ cat path_finder_abs_path_from_test_info.py
import sys


def get_info_abs_path(filename):
    print_buffer = [filename]
    done = set()
    for line in open(filename).read().splitlines():
        if "Z INFO " in line:
            flds = line.split(": abs_path=", 1)
            assert len(flds) == 2
            abs_path = eval(flds[1])  # eval undoes repr double backslashes
            if abs_path not in done:
                print_buffer.append(f"    {abs_path}")
                done.add(abs_path)
    return print_buffer


def run(args):
    no_info = []
    has_info = []
    for filename in sorted(args):
        print_buffer = get_info_abs_path(filename)
        assert print_buffer
        if len(print_buffer) == 1:
            no_info.append(print_buffer)
        else:
            has_info.append(print_buffer)
    for print_buffer in no_info + has_info:
        print("\n".join(print_buffer))


if __name__ == "__main__":
    run(args=sys.argv[1:])

@rwgk force-pushed the path_finder_review1 branch from 2452fdb to b5cef1b on April 25, 2025 23:45
@rwgk (Collaborator, Author) commented Apr 25, 2025

/ok to test b5cef1b

@rwgk (Collaborator, Author) commented Apr 26, 2025

/ok to test 001a6a2

rwgk added 3 commits April 25, 2025 22:21
…ll tests passed), followed by ruff auto-fixes"

This reverts commit 001a6a2.

There were many GitHub Actions jobs that failed (all tests with 12.x):

https://github.com/NVIDIA/cuda-python/actions/runs/14677553387

This is not worth spending time debugging.
Especially because

* Cursor has been unresponsive for at least half an hour:
    We're having trouble connecting to the model provider. This might be temporary - please try again in a moment.

* The refactored code does not seem easier to read.
@rwgk (Collaborator, Author) commented Apr 26, 2025

/ok to test 0cd20d8

@rwgk requested a review from kkraus14 on April 26, 2025 06:21
@rwgk (Collaborator, Author) commented May 1, 2025

> I'm not sure if this behavior is limited to just the system search, but certain package managers, e.g. conda, split packages into runtime and dev packages. I.e. conda has libcublas and libcublas-dev packages, where libcublas contains things needed at runtime, which means it only has the versioned .so files and doesn't have a libcublas.so. libcublas-dev has a libcublas.so, which is expected to only be needed at build time for the linker.

The versioned names will be found, too, with the code as-is. See `# First look for an exact match` and `# Look for a versioned library` in the code I pointed out before.

> We should probably just make sure we're universally using the SONAME of the library unless there's a packaging bug that we need to work around?

We have SUPPORTED_LINUX_SONAMES already. There are two possible approaches to making more use of it:

  • Without using the major version of the driver: I could look for the versioned .so filenames, newest first. This is basically what I have already, except that currently I do not target specific version numbers.

  • Or I need to add code to determine the major version of the driver, which we currently don't need. (That's why I backed out the Windows version when you commented on it before.)

The only advantage would be that we'd pick, e.g., libcublas.so.12 even in an environment where libcublas.so.11 is found first with the current implementation. That seems like a pretty broken installation(?), so I had doubts about adding complexity, and long-term maintenance overhead, to accommodate such installations. Do you think I should invest time into this?
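
For illustration, a rough sketch of the first approach above ("versioned .so filenames, newest first"); the soname table and helper name are illustrative only, not the actual SUPPORTED_LINUX_SONAMES data or the PR's implementation:

import glob
import os

# Illustrative only; not the actual SUPPORTED_LINUX_SONAMES structure.
_ILLUSTRATIVE_SONAMES_NEWEST_FIRST = {
    "nvJitLink": ("libnvJitLink.so.12",),
    "nvrtc": ("libnvrtc.so.12", "libnvrtc.so.11.2"),
}


def _find_versioned_so(lib_dir, libname):
    # Prefer known versioned sonames, newest first, then fall back to any libname.so*.
    for soname in _ILLUSTRATIVE_SONAMES_NEWEST_FIRST.get(libname, ()):
        candidate = os.path.join(lib_dir, soname)
        if os.path.isfile(candidate):
            return candidate
    matches = sorted(glob.glob(os.path.join(lib_dir, f"lib{libname}.so*")), reverse=True)
    return matches[0] if matches else None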

@kkraus14 (Collaborator) commented May 1, 2025

> The versioned names will be found, too, with the code as-is. See `# First look for an exact match` and `# Look for a versioned library` in the code I pointed out before.

👍 I didn't follow this properly. https://github.com/rwgk/cuda-python/blob/aeaf4f02278b62befb0e380e9f6f97a50b848fb3/cuda_bindings/cuda/bindings/_path_finder/find_nvidia_dynamic_library.py#L33-L37 is potentially problematic: if there's no libcublas.so but both libcublas.so.11 and libcublas.so.12 are present, it would return libcublas.so.11.

I think the only way this could happen today is by installing wheels: there are both cu11 and cu12 wheels, and the CUDA major version is part of the package name, which means you can technically install them side by side. If you do a pip install nvidia-cublas-cu11 nvidia-cublas-cu12 you get .../site-packages/nvidia/cublas/lib/libcublas.so.11 and .../site-packages/nvidia/cublas/lib/libcublas.so.12 without a libcublas.so.

Would it maybe make more sense to just look for the specific version we'd expect based on CUDA 11 vs CUDA 12?

@rwgk (Collaborator, Author) commented May 1, 2025

> Would it maybe make more sense to just look for the specific version we'd expect based on CUDA 11 vs CUDA 12?

It's definitely more predictable, but I'll need to add code to determine the CUDA driver version. I'll work on that.

To explain why I was hesitating before: maximizing portability.

Currently this code would work as pure Python code, even without all the rest of cuda.bindings. I believe that determining the CUDA driver version will either introduce a dependency on cuda.bindings.driver or, if portability is (or at some point becomes) important, require reimplementing a minimalistic version.
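
For reference, a minimal sketch of what a standalone (pure-Python, ctypes-based) driver-version query could look like; this assumes the CUDA driver library is loadable and is not necessarily the approach this PR will take:

import ctypes
import sys


def cuda_driver_version_major():
    # cuDriverGetVersion is a documented CUDA driver API entry point; it does not
    # require cuInit. The version is encoded as 1000 * major + 10 * minor.
    libname = "nvcuda.dll" if sys.platform == "win32" else "libcuda.so.1"
    lib = ctypes.CDLL(libname)
    version = ctypes.c_int(0)
    status = lib.cuDriverGetVersion(ctypes.byref(version))
    if status != 0:  # CUDA_SUCCESS == 0
        raise RuntimeError(f"cuDriverGetVersion failed with status {status}")
    return version.value // 1000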

@kkraus14 (Collaborator) commented May 1, 2025

> It's definitely more predictable, but I'll need to add code to determine the CUDA driver version. I'll work on that.

We shouldn't need to determine the CUDA driver version; we just need the major version, which we're already capturing in the cuda.bindings version. You can run cuda.bindings 11.x against the 12.x CUDA driver without issue, for example.

We just need a map of CUDA major version -> soname per library, I think?

@rwgk (Collaborator, Author) commented May 1, 2025

> You can run cuda.bindings 11.x against the 12.x CUDA driver without issue, for example.

Oh! Thanks, I need to get my head around that.

@leofang (Member) commented May 1, 2025

> We shouldn't need to determine the CUDA driver version; we just need the major version, which we're already capturing in the cuda.bindings version. You can run cuda.bindings 11.x against the 12.x CUDA driver without issue, for example.
>
> We just need a map of CUDA major version -> soname per library, I think?

There is a catch here, which is what the current cuda.bindings and nvmath.bindings are based on: the CUDA 12 driver can run CUDA 11 libraries, but not the other way around. So a mapping from the driver version to the supported CUDA major versions (and then to the sonames) is still needed. For example:

  • driver 12.x -> supports CTK 11 & 12 -> 11 & 12 sonames
  • driver 11.x -> supports CTK 11 -> 11 sonames only
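
For illustration, a schematic of that mapping (the data below is illustrative only, not the actual tables in cuda.bindings._path_finder.supported_libs):

# Driver major -> CTK majors it can run, newest first.
DRIVER_MAJOR_TO_CTK_MAJORS = {
    12: (12, 11),  # CUDA 12 driver can also run CUDA 11 libraries
    11: (11,),
}

# (libname, CTK major) -> soname; illustrative entries only.
CTK_MAJOR_TO_SONAME = {
    ("nvrtc", 12): "libnvrtc.so.12",
    ("nvrtc", 11): "libnvrtc.so.11.2",
    ("nvJitLink", 12): "libnvJitLink.so.12",  # nvJitLink first shipped with CUDA 12
}


def candidate_sonames(libname, driver_major):
    # Yield sonames to try for libname, newest supported CTK major first.
    for ctk_major in DRIVER_MAJOR_TO_CTK_MAJORS.get(driver_major, ()):
        soname = CTK_MAJOR_TO_SONAME.get((libname, ctk_major))
        if soname is not None:
            yield soname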

…ual inspection of cuda_paths.py. Minor additional edits.
@rwgk (Collaborator, Author) commented May 1, 2025

> There is a catch here, which is what the current cuda.bindings and nvmath.bindings are based on: the CUDA 12 driver can run CUDA 11 libraries, but not the other way around. So a mapping from the driver version to the supported CUDA major versions (and then to the sonames) is still needed. For example:
>
>   • driver 12.x -> supports CTK 11 & 12 -> 11 & 12 sonames
>   • driver 11.x -> supports CTK 11 -> 11 sonames only

Question, scoped to the _find_dll_using_nvidia_bin_dirs and _find_so_using_nvidia_lib_dirs implementations, which are effectively searching for wheels (note that these pre-empt cuda_paths.py searches):

Currently I'm looking through sys.path (which includes site-packages) in order, and I stop as soon as I find libname.so (first) or libname.so* (as a fallback).

If we're now targeting, e.g., "just 11" or "12 or 11", do we want to fully traverse sys.path to enumerate all possible matches, then rank them and return the "best" match?

@leofang (Member) commented May 1, 2025

(btw I merged #593)

@rwgk force-pushed the path_finder_review1 branch from e2d6682 to a7b0633 on May 1, 2025 17:50
@rwgk force-pushed the path_finder_review1 branch from a7b0633 to fc22b1d on May 1, 2025 17:51
@rwgk (Collaborator, Author) commented May 1, 2025

/ok to test aeaf4f0

@kkraus14 (Collaborator) commented May 1, 2025

> If we're now targeting, e.g., "just 11" or "12 or 11", do we want to fully traverse sys.path to enumerate all possible matches, then rank them and return the "best" match?

We want to find the library as if the extension module had been built and linked against it normally, which means it would only search for the SONAME. So I would think we should return the first match of "just 11" in the case of having been built for CUDA 11.
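
For illustration, a sketch of that behavior (search sys.path in order for the exact SONAME and return the first hit); the nvidia/<pkg>/lib layout mirrors the wheel paths in the test results above, and the helper itself is hypothetical, not the PR's implementation:

import os
import sys


def find_soname_in_sys_path(soname):
    # Walk the nvidia/ wheel directories under each sys.path entry; first match wins.
    for base in sys.path:
        root = os.path.join(base, "nvidia")
        if not os.path.isdir(root):
            continue
        for dirpath, _dirnames, filenames in os.walk(root):
            if soname in filenames:
                return os.path.join(dirpath, soname)
    return None


# Example: find_soname_in_sys_path("libnvrtc.so.12")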

@rwgk (Collaborator, Author) commented May 4, 2025

There are two changes I still need to work on:

  • Change the Library Search Priority: I already worked this into the README.md in commit bf9734c, but backed it out again here.

  • Determine driver version -> determine SONAME / DLL name -> search for the exact name

Doing that work under this PR would be unwieldy. This PR already has so many commits and comments that I have to click "Load more ..." several times to find what I'm looking for. I'll merge this PR into the path_finder_dev branch, then pick up work on the Search Priority.

@rwgk changed the title from "First version of cuda.bindings.path_finder" to "2025-05-01 version of cuda.bindings.path_finder" on May 4, 2025
@rwgk changed the base branch from main to path_finder_dev on May 4, 2025 05:48
@rwgk merged commit 2a6452d into NVIDIA:path_finder_dev on May 4, 2025
1 check passed