Skip to content

Performance degrades the more wildcards that are contained in TSConfig include or exclude patternsΒ #61103

Open
@MichaelMitchell-at

Description

@MichaelMitchell-at

πŸ”Ž Search Terms

wildcard pattern include exclude tsconfig regex regexp regular expression glob performance slow node

πŸ•— Version & Regression Information

  • This is the behavior in every version I tried

⏯ Playground Link

No response

πŸ’» Code

Create a TS project with any empty TS file and the following TSConfig.

{
  "compilerOptions": {
    "noCheck": true,
    "noEmit": true,
    "noResolve": true,
  },
  "exclude": [
    "alpha/beta/gamma/delta/epsilon/zeta/eta/theta/iota/kappa/lambda/**/*.ts",
    "alpha/beta/gamma/delta/epsilon/zeta/eta/theta/iota/kappa/lambda/**/*.ts",
    "alpha/beta/gamma/delta/epsilon/zeta/eta/theta/iota/kappa/lambda/**/*.ts",
    ...repeat a couple hundred or more times
  ]
}

I've created a repo to more easily reproduce the behavior: https://github.com/MichaelMitchell-at/typescript_slow_wildcards_repro

πŸ™ Actual behavior

tsc takes several seconds to run on the project, even though there is no work to do and there are no files to compare the exclude patterns against. In fact, running tsc with node --jitless is much faster than without.

πŸ™‚ Expected behavior

tsc should complete quickly.

Additional information about the issue

I've determined through profiling that the issue is that a giant RegExp combining all the patterns gets created which is very slow to evaluate.

export function getRegularExpressionForWildcard(specs: readonly string[] | undefined, basePath: string, usage: "files" | "directories" | "exclude"): string | undefined {
const patterns = getRegularExpressionsForWildcards(specs, basePath, usage);
if (!patterns || !patterns.length) {
return undefined;
}
const pattern = patterns.map(pattern => `(${pattern})`).join("|");
// If excluding, match "foo/bar/baz...", but if including, only allow "foo".
const terminator = usage === "exclude" ? "($|/)" : "$";
return `^(${pattern})${terminator}`;
}

From my testing, it's much faster to build a separate RegExp for each pattern and test them one by one for large number of patterns and for a small number of patterns there is no perceivable difference in speed. I've created a PR implementing this change which greatly improves the performance: #61104

Metadata

Metadata

Assignees

No one assigned

    Labels

    Experimentation NeededSomeone needs to try this out to see what happensPossible ImprovementThe current behavior isn't wrong, but it's possible to see that it might be better in some cases

    Type

    No type

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions