Implement venv/site-packages based binaries #2156

Open
0 of 1 issue completed
Feature
@groodt

Description

Context

This is a tracking issue to recognise that the lack of a site-packages layout causes friction when making use of third-party distribution packages (wheels and sdists) from indexes such as PyPI.

Outside Bazel and rules_python, it is common for distribution packages to assume that they will be installed into a single site-packages folder, either in a "virtual environment" or directly into a Python user or global site installation.

Notable examples are the libraries in the AI / ML ecosystem that make use of the NVIDIA CUDA shared libraries. These shared libraries contain relative rpath entries in the ELF/Mach-O/DLL that fail to resolve when the libraries are not installed as siblings in a site-packages layout.

The lack of a single site-packages folder also complicates the rules themselves. Namespace packages in rules_python are all converted into pkgutil-style namespace packages. This seems to work, but it wouldn't be necessary if site-packages were used.
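To illustrate what a pkgutil-style namespace package involves, here is a minimal sketch. It is not the code rules_python generates; the package name `nspkg`, the directory names, and the helper functions are hypothetical. Each root gets an `__init__.py` that calls `pkgutil.extend_path`, which lets one package span several `sys.path` entries.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# The standard pkgutil-style namespace __init__.py body.
PKGUTIL_INIT = "__path__ = __import__('pkgutil').extend_path(__path__, __name__)\n"


def make_split_namespace(root: Path) -> list[str]:
    """Create two roots that each contribute one module to 'nspkg' (hypothetical name)."""
    roots = []
    for dist, module in (("dist_a", "alpha"), ("dist_b", "beta")):
        pkg_dir = root / dist / "nspkg"
        pkg_dir.mkdir(parents=True)
        (pkg_dir / "__init__.py").write_text(PKGUTIL_INIT)
        (pkg_dir / f"{module}.py").write_text(f"NAME = {module!r}\n")
        roots.append(str(root / dist))
    return roots


def import_both(roots: list[str]) -> tuple[str, str]:
    """Import a module from each root through the shared 'nspkg' package."""
    sys.path[:0] = roots
    try:
        importlib.invalidate_caches()
        alpha = importlib.import_module("nspkg.alpha")
        beta = importlib.import_module("nspkg.beta")
        return alpha.NAME, beta.NAME
    finally:
        # Restore sys.path so the sketch has no lasting side effects on it.
        for entry in roots:
            sys.path.remove(entry)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(import_both(make_split_namespace(Path(tmp))))
```

With a single site-packages directory, both modules would simply sit in the same `nspkg` folder and no `__init__.py` rewriting would be needed.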

Another, rarer issue is the failure to load *.pth files. Python provides site-specific configuration hooks that can customize sys.path at startup. rules_python could perhaps work around this issue, but if a site-packages layout were used and discovered by the interpreter at startup, no workarounds would be necessary.
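As a small illustration of the *.pth mechanism, the sketch below uses the standard library's `site.addsitedir`, which processes `.pth` files in a directory much as the interpreter does for site-packages at startup. The file name `example.pth`, the directory names, and the helper function are hypothetical.

```python
import site
import sys
import tempfile
from pathlib import Path


def pth_adds_path(tmp: Path) -> bool:
    """Return True if a .pth file in a site dir causes its listed path to join sys.path."""
    tmp = tmp.resolve()
    site_dir = tmp / "site-packages"
    extra = tmp / "extra_src"
    site_dir.mkdir()
    extra.mkdir()
    # Each non-comment line of a .pth file naming an existing directory is added to sys.path.
    (site_dir / "example.pth").write_text(f"{extra}\n")

    before = list(sys.path)
    try:
        site.addsitedir(str(site_dir))
        return str(extra) in sys.path
    finally:
        # Undo the sys.path changes made by the sketch.
        sys.path[:] = before


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(pth_adds_path(Path(tmp)))
```

A directory that is only placed on `sys.path` (for example via `PYTHONPATH`) is not scanned for `.pth` files, which is why distributions that rely on them can misbehave outside a site-packages layout.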

Distribution packages on PyPI known to have issues:

  • torch
  • onnxruntime-gpu
  • rerun-sdk

Known workarounds

  1. Patch the third-party dependencies using rules_python patching support
  2. Use an alternative set of rules such as rules_py
  3. Patch the third-party dependencies outside rules_python and push the patched dependencies to a private index

Related

Proposed design to solve

The basic proposed solution is to create a per-binary virtual env whose site-packages contains symlinks to other locations in runfiles, e.g. `$runfiles/mybin.venv/site-packages/foo` would be a symlink to `$runfiles/_pypi_foo/site-packages/foo`.
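The layout can be sketched outside Bazel as follows. This is only an illustration of the symlink idea, not the rules_python implementation: the function names are hypothetical, and the use of relative symlinks is an assumption made here so that the tree stays relocatable.

```python
import os
import tempfile
from pathlib import Path


def link_site_packages(venv_site_packages: Path, entries: dict[str, Path]) -> None:
    """Symlink each site-packages relative path to its real location in runfiles."""
    venv_site_packages.mkdir(parents=True, exist_ok=True)
    for rel_path, target in entries.items():
        link = venv_site_packages / rel_path
        link.parent.mkdir(parents=True, exist_ok=True)
        # Assumption: relative links, so moving the whole runfiles tree keeps them valid.
        link.symlink_to(
            os.path.relpath(target, link.parent),
            target_is_directory=target.is_dir(),
        )


def build_example(runfiles: Path) -> Path:
    """Recreate the example from the text: mybin.venv/site-packages/foo -> _pypi_foo/site-packages/foo."""
    real_pkg = runfiles / "_pypi_foo" / "site-packages" / "foo"
    real_pkg.mkdir(parents=True)
    (real_pkg / "__init__.py").write_text("")
    venv_sp = runfiles / "mybin.venv" / "site-packages"
    link_site_packages(venv_sp, {"foo": real_pkg})
    return venv_sp / "foo"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        link = build_example(Path(tmp))
        print(link, "->", os.readlink(link))
```

Because every distribution's top-level entries end up as siblings under one directory, relative lookups between packages resolve as they would in an ordinary virtual environment, provided the loader computes them from the symlinked location.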

TODO list

  • Add PyInfo.site_packages_symlinks: a depset of pairs, each a site-packages relative path and the runfiles path to symlink it to.
  • Make pypi-generated targets use this site-packages solution by default
    • Disable pkgutil-style __init__.py generation in pypi repo phase
    • Maybe refactor the pypi generation to use a custom rule instead of plain py_library.
  • Add a flag to allow experimentation and testing
  • Edge cases
    • Two distributions installing into the same directory and/or having overlapping files
    • Handling pkgutil-style packages
    • Interaction of bootstrap=script vs bootstrap=system with this new layout
    • Handle platforms/cases where symlinks can't be created at build time (Windows, using rules_pkg)
    • Handling if multiple versions of a distribution are in the deps and ensuring only one is used, while still respecting merge/conflict logic.
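One way to reason about the overlap edge cases is a check over the flattened (site-packages path, runfiles path) pairs. The sketch below is hypothetical and does not reflect the merge/conflict logic in rules_python; the function names and the example paths are made up for illustration.

```python
from collections import defaultdict


def find_conflicts(entries: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Return site-packages paths that are claimed by more than one runfiles path."""
    targets: dict[str, set[str]] = defaultdict(set)
    for site_packages_path, runfiles_path in entries:
        targets[site_packages_path].add(runfiles_path)
    return {path: sorted(found) for path, found in targets.items() if len(found) > 1}


def find_nested(entries: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return (outer, inner) pairs where one linked path lies inside another linked path."""
    paths = sorted({site_packages_path for site_packages_path, _ in entries})
    nested = []
    for outer in paths:
        for inner in paths:
            if inner.startswith(outer + "/"):
                nested.append((outer, inner))
    return nested


if __name__ == "__main__":
    example = [
        ("foo", "_pypi_foo_v1/site-packages/foo"),
        ("foo", "_pypi_foo_v2/site-packages/foo"),
        ("bar", "_pypi_bar/site-packages/bar"),
        ("bar/plugins", "_pypi_bar_plugins/site-packages/bar/plugins"),
    ]
    print(find_conflicts(example))
    print(find_nested(example))
```

The first check corresponds to multiple versions or distributions claiming the same entry; the second flags a distribution that installs into a directory another distribution already links as a whole, which a single directory symlink cannot represent.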

Labels

core-rules: Issues concerning core bin/test/lib rules
