2025-05-01 version of cuda.bindings.path_finder #578

Merged 58 commits on May 4, 2025

Commits (58)
17478da
Undo changes to the nvJitLink, nvrtc, nvvm bindings
rwgk Apr 25, 2025
7da74bd
Undo changes under .github, specific to nvvm, manipulating LD_LIBRARY…
rwgk Apr 25, 2025
211164d
PARTIALLY_SUPPORTED_LIBNAMES_LINUX, PARTIALLY_SUPPORTED_LIBNAMES_WINDOWS
rwgk Apr 25, 2025
a649e7d
Update EXPECTED_LIB_SYMBOLS for nvJitLink to cleanly support CTK vers…
rwgk Apr 25, 2025
b5cef1b
Save result of factoring out load_dl_common.py, load_dl_linux.py, loa…
rwgk Apr 25, 2025
bc0137a
Fix an auto-generated docstring
rwgk Apr 25, 2025
001a6a2
first round of Cursor refactoring (about 4 iterations until all tests…
rwgk Apr 26, 2025
9721079
Revert "first round of Cursor refactoring (about 4 iterations until a…
rwgk Apr 26, 2025
c409346
A couple trivial tweaks
rwgk Apr 26, 2025
0cd20d8
Merge branch 'path_finder_dev' into path_finder_review1
rwgk Apr 26, 2025
b3a3b16
Prefix the public API (just two items) with underscores for now.
rwgk Apr 28, 2025
42cb9b6
Merge branch 'main' into path_finder_review1
rwgk Apr 29, 2025
180eefd
Add SPDX-License-Identifier to all files under toolshed/ that don't h…
rwgk Apr 29, 2025
bfc4b69
Add SPDX-License-Identifier under cuda_bindings/tests/
rwgk Apr 29, 2025
a7001e1
Respond to "Do these need to be run as subprocesses?" review question…
rwgk Apr 29, 2025
4d95eb4
Respond to "dead code?" review questions (e.g. https://github.com/NVI…
rwgk Apr 29, 2025
72c339a
Respond to "Do we need to implement a cache separately ..." review qu…
rwgk Apr 29, 2025
4ce94be
Remove cuDriverGetVersion() function for now.
rwgk Apr 29, 2025
26eb4b5
Move add_dll_directory() from load_dl_common.py to load_dl_windows.py…
rwgk Apr 29, 2025
72d2567
Add SPDX-License-Identifier and # Forked from: URL in cuda_paths.py
rwgk Apr 29, 2025
e14391d
Add Add SPDX-License-Identifier and Original LICENSE in findlib.py
rwgk Apr 29, 2025
9154995
Very first draft of README.md
rwgk Apr 29, 2025
bdfc6a7
Update README.md, mostly as revised by perplexity, with various manua…
rwgk Apr 29, 2025
2279bda
Merge branch 'main' into path_finder_review1
rwgk Apr 30, 2025
2ad4b79
Refork cuda_paths.py AS-IS: https://github.com/NVIDIA/numba-cuda/blob…
rwgk Apr 30, 2025
7dcaa50
ruff format cuda_paths.py (NO manual changes)
rwgk Apr 30, 2025
714b88c
Add back _get_numba_CUDA_INCLUDE_PATH from 2279bda65640b73a9a5632df87…
rwgk Apr 30, 2025
166837d
Remove cuda_paths.py dependency on numba.cuda.cudadrv.runtime
rwgk Apr 30, 2025
ad1e85e
Add Forked from URLs, two SPDX-License-Identifier, Original Numba LIC…
rwgk Apr 30, 2025
47ad79f
Temporarily restore debug changes under .github/workflows, for expand…
rwgk Apr 30, 2025
1b88ec2
Restore cuda_path.py AS-IT-WAS at commit 2279bda65640b73a9a5632df878f…
rwgk Apr 30, 2025
db79ec3
Revert "Restore cuda_path.py AS-IT-WAS at commit 2279bda65640b73a9a56…
rwgk Apr 30, 2025
2bc7ef6
Force compute-sanitizer off unconditionally
rwgk Apr 30, 2025
7650b2e
Revert "Force compute-sanitizer off unconditionally"
rwgk Apr 30, 2025
b79e85b
Add timeout=10 seconds to test_path_finder.py subprocess.run() invoca…
rwgk Apr 30, 2025
f9a9e9f
Increase test_path_finder.py subprocess.run() timeout to 30 seconds:
rwgk Apr 30, 2025
7f76683
Revert "Temporarily restore debug changes under .github/workflows, fo…
rwgk Apr 30, 2025
aeaf4f0
Force compute-sanitizer off unconditionally
rwgk Apr 30, 2025
6a60161
Add: Note that the search is done on a per-library basis.
rwgk Apr 30, 2025
3277ac5
Add Note for CUDA_HOME / CUDA_PATH
rwgk Apr 30, 2025
1d4420b
Add 0. **Check if a library was loaded into the process already by so…
rwgk Apr 30, 2025
4437fcc
_find_dll_using_nvidia_bin_dirs(): reuse lib_searched_for in place of…
rwgk Apr 30, 2025
fd20253
Systematically replace all relative imports with absolute imports.
rwgk Apr 30, 2025
703988c
handle: int → ctypes.CDLL fix
rwgk Apr 30, 2025
28349a7
Make load_dl_windows.py abs_path_for_dynamic_library() implementation…
rwgk Apr 30, 2025
c55104c
Change argument name → libname for self-consistency
rwgk Apr 30, 2025
b32ed13
Systematically replace previously overlooked relative imports with ab…
rwgk Apr 30, 2025
92e7b42
Simplify code (also for self-consistency)
rwgk Apr 30, 2025
5a835d7
Expand the 3. **System Installations** section with information produ…
rwgk May 1, 2025
b910a6b
Pull out `**Environment variables**` into an added section, after man…
rwgk May 1, 2025
0fa2c83
Merge branch 'main' into path_finder_review1
rwgk May 1, 2025
fc22b1d
Revert "Force compute-sanitizer off unconditionally"
rwgk May 1, 2025
ad0d2f3
Merge branch 'main' into path_finder_review1
rwgk May 3, 2025
5d970f2
Move _path_finder/sys_path_find_sub_dirs.py → find_sub_dirs.py, use f…
rwgk May 3, 2025
bf9734c
WIP (search priority updated in README.md but not in code)
rwgk May 4, 2025
b8ab986
Merge branch 'main' into path_finder_review1
rwgk May 4, 2025
6aa1f13
Revert "WIP (search priority updated in README.md but not in code)"
rwgk May 4, 2025
0a21021
Merge branch 'path_finder_dev' into path_finder_review1
rwgk May 4, 2025
64 changes: 55 additions & 9 deletions cuda_bindings/cuda/bindings/_bindings/cynvrtc.pyx.in
@@ -9,12 +9,13 @@
# This code was automatically generated with version 12.8.0. Do not modify it directly.
{{if 'Windows' == platform.system()}}
import os
import site
import struct
import win32api
from pywintypes import error
{{else}}
cimport cuda.bindings._lib.dlfcn as dlfcn
from libc.stdint cimport uintptr_t
{{endif}}
from cuda.bindings import path_finder

cdef bint __cuPythonInit = False
{{if 'nvrtcGetErrorString' in found_functions}}cdef void *__nvrtcGetErrorString = NULL{{endif}}
@@ -45,18 +46,65 @@ cdef bint __cuPythonInit = False
{{if 'nvrtcSetFlowCallback' in found_functions}}cdef void *__nvrtcSetFlowCallback = NULL{{endif}}

cdef int cuPythonInit() except -1 nogil:
{{if 'Windows' != platform.system()}}
cdef void* handle = NULL
{{endif}}

global __cuPythonInit
if __cuPythonInit:
return 0
__cuPythonInit = True

# Load library
{{if 'Windows' == platform.system()}}
with gil:
# First check if the DLL has been loaded by 3rd parties
try:
handle = win32api.GetModuleHandle("nvrtc64_120_0.dll")
except:
handle = None

# Check if DLLs can be found within pip installations
if not handle:
LOAD_LIBRARY_SEARCH_DEFAULT_DIRS = 0x00001000
LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR = 0x00000100
site_packages = [site.getusersitepackages()] + site.getsitepackages()
for sp in site_packages:
mod_path = os.path.join(sp, "nvidia", "cuda_nvrtc", "bin")
if os.path.isdir(mod_path):
os.add_dll_directory(mod_path)
try:
handle = win32api.LoadLibraryEx(
# Note: LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR needs an abs path...
os.path.join(mod_path, "nvrtc64_120_0.dll"),
0, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS | LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR)

# Note: nvrtc64_120_0.dll calls into nvrtc-builtins64_*.dll which is
# located in the same mod_path.
# Update PATH environ so that the two dlls can find each other
os.environ["PATH"] = os.pathsep.join((os.environ.get("PATH", ""), mod_path))
except:
pass
else:
break
else:
# Else try default search
# Only reached if DLL wasn't found in any site-package path
LOAD_LIBRARY_SAFE_CURRENT_DIRS = 0x00002000
try:
handle = win32api.LoadLibraryEx("nvrtc64_120_0.dll", 0, LOAD_LIBRARY_SAFE_CURRENT_DIRS)
except:
pass

if not handle:
raise RuntimeError('Failed to LoadLibraryEx nvrtc64_120_0.dll')
{{else}}
handle = dlfcn.dlopen('libnvrtc.so.12', dlfcn.RTLD_NOW)
if handle == NULL:
with gil:
raise RuntimeError('Failed to dlopen libnvrtc.so.12')
{{endif}}


# Load function
{{if 'Windows' == platform.system()}}
with gil:
handle = path_finder.load_nvidia_dynamic_library("nvrtc").handle
{{if 'nvrtcGetErrorString' in found_functions}}
try:
global __nvrtcGetErrorString
@@ -241,8 +289,6 @@ cdef int cuPythonInit() except -1 nogil:
{{endif}}

{{else}}
with gil:
handle = <void*><uintptr_t>path_finder.load_nvidia_dynamic_library("nvrtc").handle
{{if 'nvrtcGetErrorString' in found_functions}}
global __nvrtcGetErrorString
__nvrtcGetErrorString = dlfcn.dlsym(handle, 'nvrtcGetErrorString')
20 changes: 14 additions & 6 deletions cuda_bindings/cuda/bindings/_internal/nvjitlink_linux.pyx
@@ -4,11 +4,11 @@
#
# This code was automatically generated across versions from 12.0.1 to 12.8.0. Do not modify it directly.

from libc.stdint cimport intptr_t, uintptr_t
from libc.stdint cimport intptr_t

from .utils import FunctionNotFoundError, NotSupportedError
from .utils cimport get_nvjitlink_dso_version_suffix

from cuda.bindings import path_finder
from .utils import FunctionNotFoundError, NotSupportedError

###############################################################################
# Extern
@@ -52,9 +52,17 @@ cdef void* __nvJitLinkGetInfoLog = NULL
cdef void* __nvJitLinkVersion = NULL


cdef void* load_library(int driver_ver) except* with gil:
cdef uintptr_t handle = path_finder.load_nvidia_dynamic_library("nvJitLink").handle
return <void*>handle
cdef void* load_library(const int driver_ver) except* with gil:
cdef void* handle
for suffix in get_nvjitlink_dso_version_suffix(driver_ver):
so_name = "libnvJitLink.so" + (f".{suffix}" if suffix else suffix)
handle = dlopen(so_name.encode(), RTLD_NOW | RTLD_GLOBAL)
if handle != NULL:
break
else:
err_msg = dlerror()
raise RuntimeError(f'Failed to dlopen libnvJitLink ({err_msg.decode()})')
return handle
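
The fallback loop above (try each versioned soname in order and stop at the first one that opens) can be sketched in plain Python with `ctypes`. This is only an illustration under stated assumptions, not code from this PR: `load_first_available` and its `opener` parameter are hypothetical names introduced here, and unlike the Cython version, which reports only the last `dlerror()` message, this sketch collects one message per attempt.

```python
import ctypes


def load_first_available(sonames, opener=None):
    """Return the first library from `sonames` that can be opened.

    `opener` is a hypothetical hook (not part of the PR) that makes the
    search order testable without the real libraries being installed.
    """
    if opener is None:
        # RTLD_GLOBAL mirrors the flag used by the dlopen() call in the diff.
        def opener(name):
            return ctypes.CDLL(name, mode=ctypes.RTLD_GLOBAL)

    errors = []
    for name in sonames:
        try:
            return opener(name)
        except OSError as exc:
            # Remember why this candidate failed, then try the next one.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("Failed to dlopen any candidate (" + "; ".join(errors) + ")")
```

Called as `load_first_available(["libnvJitLink.so.12", "libnvJitLink.so"])`, it would try the versioned name first and the unversioned name second, matching the `('12', '')` suffix order.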


cdef int _check_or_init_nvjitlink() except -1 nogil:
53 changes: 45 additions & 8 deletions cuda_bindings/cuda/bindings/_internal/nvjitlink_windows.pyx
@@ -6,9 +6,12 @@

from libc.stdint cimport intptr_t

from .utils cimport get_nvjitlink_dso_version_suffix

from .utils import FunctionNotFoundError, NotSupportedError

from cuda.bindings import path_finder
import os
import site

import win32api

@@ -39,9 +42,44 @@ cdef void* __nvJitLinkGetInfoLog = NULL
cdef void* __nvJitLinkVersion = NULL


cdef void* load_library(int driver_ver) except* with gil:
cdef intptr_t handle = path_finder.load_nvidia_dynamic_library("nvJitLink").handle
return <void*>handle
cdef inline list get_site_packages():
return [site.getusersitepackages()] + site.getsitepackages()


cdef load_library(const int driver_ver):
handle = 0

for suffix in get_nvjitlink_dso_version_suffix(driver_ver):
if len(suffix) == 0:
continue
dll_name = f"nvJitLink_{suffix}0_0.dll"

# First check if the DLL has been loaded by 3rd parties
try:
return win32api.GetModuleHandle(dll_name)
except:
pass

# Next, check if DLLs are installed via pip
for sp in get_site_packages():
mod_path = os.path.join(sp, "nvidia", "nvJitLink", "bin")
if os.path.isdir(mod_path):
os.add_dll_directory(mod_path)
try:
return win32api.LoadLibraryEx(
# Note: LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR needs an abs path...
os.path.join(mod_path, dll_name),
0, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS | LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR)
except:
pass
# Finally, try default search
# Only reached if DLL wasn't found in any site-package path
try:
return win32api.LoadLibrary(dll_name)
except:
pass

raise RuntimeError('Failed to load nvJitLink')


cdef int _check_or_init_nvjitlink() except -1 nogil:
@@ -50,24 +88,23 @@ cdef int _check_or_init_nvjitlink() except -1 nogil:
return 0

cdef int err, driver_ver
cdef intptr_t handle
with gil:
# Load driver to check version
try:
nvcuda_handle = win32api.LoadLibraryEx("nvcuda.dll", 0, LOAD_LIBRARY_SEARCH_SYSTEM32)
handle = win32api.LoadLibraryEx("nvcuda.dll", 0, LOAD_LIBRARY_SEARCH_SYSTEM32)
except Exception as e:
raise NotSupportedError(f'CUDA driver is not found ({e})')
global __cuDriverGetVersion
if __cuDriverGetVersion == NULL:
__cuDriverGetVersion = <void*><intptr_t>win32api.GetProcAddress(nvcuda_handle, 'cuDriverGetVersion')
__cuDriverGetVersion = <void*><intptr_t>win32api.GetProcAddress(handle, 'cuDriverGetVersion')
if __cuDriverGetVersion == NULL:
raise RuntimeError('something went wrong')
err = (<int (*)(int*) noexcept nogil>__cuDriverGetVersion)(&driver_ver)
if err != 0:
raise RuntimeError('something went wrong')

# Load library
handle = <intptr_t>load_library(driver_ver)
handle = load_library(driver_ver)

# Load function
global __nvJitLinkCreate
18 changes: 13 additions & 5 deletions cuda_bindings/cuda/bindings/_internal/nvvm_linux.pyx
@@ -4,11 +4,11 @@
#
# This code was automatically generated across versions from 11.0.3 to 12.8.0. Do not modify it directly.

from libc.stdint cimport intptr_t, uintptr_t
from libc.stdint cimport intptr_t

from .utils import FunctionNotFoundError, NotSupportedError
from .utils cimport get_nvvm_dso_version_suffix

from cuda.bindings import path_finder
from .utils import FunctionNotFoundError, NotSupportedError

###############################################################################
# Extern
@@ -51,8 +51,16 @@ cdef void* __nvvmGetProgramLog = NULL


cdef void* load_library(const int driver_ver) except* with gil:
cdef uintptr_t handle = path_finder.load_nvidia_dynamic_library("nvvm").handle
return <void*>handle
cdef void* handle
for suffix in get_nvvm_dso_version_suffix(driver_ver):
so_name = "libnvvm.so" + (f".{suffix}" if suffix else suffix)
handle = dlopen(so_name.encode(), RTLD_NOW | RTLD_GLOBAL)
if handle != NULL:
break
else:
err_msg = dlerror()
raise RuntimeError(f'Failed to dlopen libnvvm ({err_msg.decode()})')
return handle


cdef int _check_or_init_nvvm() except -1 nogil:
61 changes: 53 additions & 8 deletions cuda_bindings/cuda/bindings/_internal/nvvm_windows.pyx
@@ -6,9 +6,12 @@

from libc.stdint cimport intptr_t

from .utils cimport get_nvvm_dso_version_suffix

from .utils import FunctionNotFoundError, NotSupportedError

from cuda.bindings import path_finder
import os
import site

import win32api

@@ -37,9 +40,52 @@ cdef void* __nvvmGetProgramLogSize = NULL
cdef void* __nvvmGetProgramLog = NULL


cdef void* load_library(int driver_ver) except* with gil:
cdef intptr_t handle = path_finder.load_nvidia_dynamic_library("nvvm").handle
return <void*>handle
cdef inline list get_site_packages():
return [site.getusersitepackages()] + site.getsitepackages() + ["conda"]


cdef load_library(const int driver_ver):
handle = 0

for suffix in get_nvvm_dso_version_suffix(driver_ver):
if len(suffix) == 0:
continue
dll_name = "nvvm64_40_0.dll"

# First check if the DLL has been loaded by 3rd parties
try:
return win32api.GetModuleHandle(dll_name)
except:
pass

# Next, check if DLLs are installed via pip or conda
for sp in get_site_packages():
if sp == "conda":
# nvvm is not under $CONDA_PREFIX/lib, so it's not in the default search path
conda_prefix = os.environ.get("CONDA_PREFIX")
if conda_prefix is None:
continue
mod_path = os.path.join(conda_prefix, "Library", "nvvm", "bin")
else:
mod_path = os.path.join(sp, "nvidia", "cuda_nvcc", "nvvm", "bin")
if os.path.isdir(mod_path):
os.add_dll_directory(mod_path)
try:
return win32api.LoadLibraryEx(
# Note: LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR needs an abs path...
os.path.join(mod_path, dll_name),
0, LOAD_LIBRARY_SEARCH_DEFAULT_DIRS | LOAD_LIBRARY_SEARCH_DLL_LOAD_DIR)
except:
pass

# Finally, try default search
# Only reached if DLL wasn't found in any site-package path
try:
return win32api.LoadLibrary(dll_name)
except:
pass

raise RuntimeError('Failed to load nvvm')
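
The directory search in `load_library` above (pip-style `site-packages` locations first, then the conda location) can be sketched as a pure-Python helper that only computes candidate directories. This is a hedged illustration, not the PR's implementation: `candidate_nvvm_dll_dirs` and its two parameters are hypothetical names, and the sketch skips the `os.path.isdir()` check, the `GetModuleHandle` probe, and the actual `LoadLibraryEx` call.

```python
import os
import site


def candidate_nvvm_dll_dirs(site_packages=None, environ=None):
    """List directories that may contain the nvvm DLL, in search order.

    Both parameters are hypothetical test hooks; by default the helper
    reads the same sources as the diff (site module and CONDA_PREFIX).
    """
    if site_packages is None:
        site_packages = [site.getusersitepackages()] + site.getsitepackages()
    if environ is None:
        environ = os.environ

    # pip wheels: <site-packages>/nvidia/cuda_nvcc/nvvm/bin
    dirs = [
        os.path.join(sp, "nvidia", "cuda_nvcc", "nvvm", "bin")
        for sp in site_packages
    ]

    # conda: nvvm is not on the default search path, so add it explicitly.
    conda_prefix = environ.get("CONDA_PREFIX")
    if conda_prefix is not None:
        dirs.append(os.path.join(conda_prefix, "Library", "nvvm", "bin"))
    return dirs
```

Separating "where to look" from "how to load" keeps the search order easy to test on any platform, while the Windows-specific loading stays in one place.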


cdef int _check_or_init_nvvm() except -1 nogil:
@@ -48,24 +94,23 @@ cdef int _check_or_init_nvvm() except -1 nogil:
return 0

cdef int err, driver_ver
cdef intptr_t handle
with gil:
# Load driver to check version
try:
nvcuda_handle = win32api.LoadLibraryEx("nvcuda.dll", 0, LOAD_LIBRARY_SEARCH_SYSTEM32)
handle = win32api.LoadLibraryEx("nvcuda.dll", 0, LOAD_LIBRARY_SEARCH_SYSTEM32)
except Exception as e:
raise NotSupportedError(f'CUDA driver is not found ({e})')
global __cuDriverGetVersion
if __cuDriverGetVersion == NULL:
__cuDriverGetVersion = <void*><intptr_t>win32api.GetProcAddress(nvcuda_handle, 'cuDriverGetVersion')
__cuDriverGetVersion = <void*><intptr_t>win32api.GetProcAddress(handle, 'cuDriverGetVersion')
if __cuDriverGetVersion == NULL:
raise RuntimeError('something went wrong')
err = (<int (*)(int*) noexcept nogil>__cuDriverGetVersion)(&driver_ver)
if err != 0:
raise RuntimeError('something went wrong')

# Load library
handle = <intptr_t>load_library(driver_ver)
handle = load_library(driver_ver)

# Load function
global __nvvmVersion
3 changes: 3 additions & 0 deletions cuda_bindings/cuda/bindings/_internal/utils.pxd
@@ -165,3 +165,6 @@ cdef int get_nested_resource_ptr(nested_resource[ResT] &in_out_ptr, object obj,

cdef bint is_nested_sequence(data)
cdef void* get_buffer_pointer(buf, Py_ssize_t size, readonly=*) except*

cdef tuple get_nvjitlink_dso_version_suffix(int driver_ver)
cdef tuple get_nvvm_dso_version_suffix(int driver_ver)
14 changes: 14 additions & 0 deletions cuda_bindings/cuda/bindings/_internal/utils.pyx
@@ -127,3 +127,17 @@ cdef int get_nested_resource_ptr(nested_resource[ResT] &in_out_ptr, object obj,
class FunctionNotFoundError(RuntimeError): pass

class NotSupportedError(RuntimeError): pass


cdef tuple get_nvjitlink_dso_version_suffix(int driver_ver):
if 12000 <= driver_ver < 13000:
return ('12', '')
raise NotSupportedError(f'CUDA driver version {driver_ver} is not supported')


cdef tuple get_nvvm_dso_version_suffix(int driver_ver):
if 11000 <= driver_ver < 11020:
return ('3', '')
if 11020 <= driver_ver < 13000:
return ('4', '')
raise NotSupportedError(f'CUDA driver version {driver_ver} is not supported')
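
To make the suffix tables above concrete, the following plain-Python sketch mirrors `get_nvvm_dso_version_suffix` and shows how the returned suffixes turn into candidate Linux sonames. The driver version is the integer reported by `cuDriverGetVersion` (1000 × major + 10 × minor, so 12080 means 12.8). The names `nvvm_dso_version_suffixes` and `candidate_sonames` are hypothetical and exist only for this illustration.

```python
class NotSupportedError(RuntimeError):
    """Raised when the driver version falls outside the supported range."""


def nvvm_dso_version_suffixes(driver_ver):
    # Same ranges as the Cython helper in the diff.
    if 11000 <= driver_ver < 11020:
        return ("3", "")
    if 11020 <= driver_ver < 13000:
        return ("4", "")
    raise NotSupportedError(f"CUDA driver version {driver_ver} is not supported")


def candidate_sonames(base, suffixes):
    # An empty suffix means "try the unversioned name", e.g. libnvvm.so.
    return [f"{base}.{s}" if s else base for s in suffixes]


print(candidate_sonames("libnvvm.so", nvvm_dso_version_suffixes(12080)))
# ['libnvvm.so.4', 'libnvvm.so']
```

The empty string at the end of each tuple is what lets the loaders fall back to the unversioned library name after the versioned one fails.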