
ci: improve test execution times (brainstorming) #3047

Open
@terriko

Description

We've more than doubled the number of checkers in the past couple of releases, and unsurprisingly that has nearly doubled the time it takes to run the tests.

  • Linux short tests on Python 3.7, 3.8, 3.9, and 3.11 each take around 20-25 minutes to run (3.7 is going away at the end of June)
  • Linux long tests (Python 3.10) take about the same, at around 25 minutes
  • Windows short tests (Python 3.10) take ~45-60 minutes (and sometimes fail because they hit the job time limit)
  • Windows long tests (Python 3.9) take about the same or slightly longer, at ~60-70 minutes
  • We also have a bunch of cache and network-reliant jobs that take < 5 min each, so I don't think there's much cause to reduce them
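
To break those job totals down per test, one option is to have pytest write a JUnit report (`pytest --junitxml=report.xml`) and rank the test cases by duration. This is only a rough sketch: the `slowest_tests` helper and the `report.xml` file name are made up for illustration and are not part of cve-bin-tool.

```python
import xml.etree.ElementTree as ET


def slowest_tests(junit_xml, top=10):
    """Return the `top` slowest test cases from a JUnit XML string.

    Each result is a (seconds, "classname::name") tuple, slowest first.
    """
    root = ET.fromstring(junit_xml)
    timings = []
    # pytest's JUnit output stores one <testcase> element per test,
    # with the duration in seconds in the "time" attribute.
    for case in root.iter("testcase"):
        test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
        timings.append((float(case.get("time", 0.0)), test_id))
    timings.sort(reverse=True)
    return timings[:top]
```

Calling `slowest_tests(open("report.xml").read())` on a report from each platform would show whether a handful of tests dominate the run. For a quick look without any script, `pytest --durations=20` prints a similar summary at the end of a run.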

Right now, our long tests are only slightly longer than our short tests. It would be nice to get the short test jobs down (say, to < 10 minutes) so contributors get faster feedback from CI on pull requests, and so the default test run is shorter when run locally.

Potential ideas:

  • Run fewer checker tests in short tests (e.g. only run the first test in the series, save the others for the long run)
  • Reduce the size of the test/language_data files. Many of them look up a lot of components because we started with real language data files for real components with full dependency lists. We could probably test the scanner's ability to handle these files with much smaller lists, as long as we keep coverage of the different ways dependencies can be described (e.g. in requirements.txt we'd want to keep at least one unpinned and one pinned dependency to test that we don't break parsing for one or the other)
  • Look at which tests currently take the longest and consider if any of them could reasonably be moved to run once per platform
  • Look at whether there are any other performance improvements we could implement in cve-bin-tool itself (e.g. does our database need more indexes to improve lookup? Would switching to a "real" database help?)
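
The first idea (only run the first test in each series during short runs) could look roughly like the sketch below. It assumes long runs are switched on with an environment variable, called `LONG_TESTS` here as an assumption, and `select_cases` is a hypothetical helper, not an existing function in the test suite.

```python
import os


def long_tests_enabled(env=None):
    """True when the (assumed) LONG_TESTS environment variable is set to "1"."""
    env = os.environ if env is None else env
    return env.get("LONG_TESTS", "0") == "1"


def select_cases(cases, env=None):
    """Return every case for long runs, but only the first case for short runs."""
    if long_tests_enabled(env):
        return list(cases)
    return list(cases[:1])


# Possible use with pytest (file names below are placeholders):
#
#   @pytest.mark.parametrize("filename", select_cases(["a.rpm", "b.deb", "c.tgz"]))
#   def test_package(filename): ...
```

Short runs would still exercise every checker once, while the remaining cases in each series would only run when the long tests are enabled.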

Honestly, I don't know how much of a difference this will make in practice: GitHub Actions itself is a pretty big bottleneck, because tests can wind up sitting in the queue for hours. But it's worth periodically thinking about whether we can do better, and how, even if this might not have as strong an impact as I'd like.
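
On the database index question from the ideas list, a cheap way to check whether a lookup would benefit from an index is to compare SQLite's query plan before and after creating one. This is a sketch only: the `vuln_range` table, its columns, and the index name are illustrative and not taken from the real cve-bin-tool schema.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The detail text is the last column of each plan row.
    return [row[-1] for row in rows]


def compare_plans():
    """Show the plan for a vendor/product lookup without and with an index."""
    conn = sqlite3.connect(":memory:")
    # Illustrative table, not the real schema.
    conn.execute(
        "CREATE TABLE vuln_range (cve_number TEXT, vendor TEXT, product TEXT, version TEXT)"
    )
    sql = "SELECT cve_number FROM vuln_range WHERE vendor = ? AND product = ?"
    params = ("example_vendor", "example_product")

    before = query_plan(conn, sql, params)  # expected to be a full table scan
    conn.execute("CREATE INDEX idx_vendor_product ON vuln_range (vendor, product)")
    after = query_plan(conn, sql, params)  # expected to search via the index
    conn.close()
    return before, after
```

If the plan for a frequent lookup reports a scan of the whole table, that query is a candidate for an index. Running the same check against a copy of the real database would tell us more than this toy table can.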

Labels: CI (related to our continuous integration service, GitHub Actions), discussion (discussion thread or meeting minutes that may not have any trivially fixable code issues associated)
