Skip to content

fix: allow concurrent prettier setups #2462

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

simschla
Copy link
Contributor

@simschla simschla commented Apr 7, 2025

One of my colleagues had strange issues in his project when using prettier - issues, I'd never seen in our project setups. The npm based steps behaved erratically. Sometimes, they did not work, sometimes they did. The behaviour was different if he was using configuration cache then when he did not. It also seemed to make a difference which spotlessXYCheck task he started - or if he started the root task spotlessCheck.
We did a lot of debugging and finally found the culprit: They were making heavy use of prettier for their formatting. They used the same prettier setup for several formatters which turned out to be a problem. A setup like the following triggers the issue:

spotless {
    format 'mytypescript', {
        target 'test.ts'
        prettier().config(prettierConfig1)
    }
    format 'json', {
        target 'test.json'
        prettier().config(prettierConfig2)
    }
    javascript {
        target 'test.js'
        prettier().config(prettierConfig3)
    }
}

All these prettier()-setups are resolving internally to the same package.json which in turn resolved to the same path for the node_modules directory to use. So each of these steps tried to install its node_modules into the same directory and start a server from there. Sometimes concurrently, sometimes not (depending on the files involved, the scheduling done by gradle, race conditions, ...).

This PR fixes this by making sure each formatter includes a unique aspect (e.g. the formatter name) in the path used to house the node_modules directory. That way, we prevent all the before mentioned issues alltogether.

simschla added a commit to simschla/spotless that referenced this pull request Apr 7, 2025
simschla added 3 commits April 7, 2025 22:33
without this change, using the exact same npm-based formatter (e.g.
prettier) in multiple formatters, we could end up using the same
directory to store their node_modules inside. This could lead -
especially when using parallel builds, to a lot of issues:
- overwriting each others node_modules mid-flight
- overwriting package.json mid-flight
- starting multiple npm-based servers on the same directory (overwriting
the port-file thus leading to cross-access between formatter steps and
their corresponding node server).

By applying this fix, each formatter will have its own separate
node_modules directory.
@simschla simschla force-pushed the bugfix/allow-concurrent-prettier-setups branch from 416748b to 215793d Compare April 7, 2025 20:36
CHANGES.md Outdated
@@ -13,6 +13,7 @@ We adhere to the [keepachangelog](https://keepachangelog.com/en/1.0.0/) format (
### Changed
* Use palantir-java-format 2.57.0 on Java 21. ([#2447](https://github.com/diffplug/spotless/pull/2447))
* Re-try `npm install` with `--prefer-online` after `ERESOLVE` error. ([#2448](https://github.com/diffplug/spotless/pull/2448))
* **BREAKING** Assert that each npm-based step uses its own `node_modules` directory, even when configured exactly the same, to prevent race conditions when running in parallel. For this, the API of npm-based formatter steps has been changed. ([#2462](https://github.com/diffplug/spotless/pull/2462))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is technically a breaking change to the api, so added the **BREAKING** here.

Copy link
Member

@nedtwigg nedtwigg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really don't like the plumbing changes that this requires. A FormatterStep should not need to know about the Formatter it came from.

If multiple steps all resolve to the same configuration, that should be a blessing, right? We only need to resolve the dependencies once now, instead of three indepdendent times, if we can manage the concurrency.

I would try again with either a static mutex or a lockfile.

openFormatters.put(formatterFactory, formatter);
}

System.out.println("Created formatters: " + openFormatters.keySet().stream().map(f -> f.includes()).collect(Collectors.toList()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rogue println

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🙈 sorry about that

FormatterFactory formatterFactory = entry.getKey();
Supplier<Iterable<File>> files = entry.getValue();
Formatter formatter = formatterFactory.newFormatter(files, config);
Formatter formatter = formatterFactory.newFormatter(files, config, formatterIndex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like that creating a new formatter now requires an index

this.encoding = encoding;
this.licenseHeaderDelimiter = licenseHeaderDelimiter;
this.ratchetFrom = ratchetFrom;
this.provisioner = provisioner;
this.fileLocator = fileLocator;
this.spotlessSetLicenseHeaderYearsFromGitHistory = spotlessSetLicenseHeaderYearsFromGitHistory;
this.name = name;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smell

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you expand on that?

@simschla
Copy link
Contributor Author

simschla commented Apr 8, 2025

I really don't like the plumbing changes that this requires. A FormatterStep should not need to know about the Formatter it came from.

If multiple steps all resolve to the same configuration, that should be a blessing, right? We only need to resolve the dependencies once now, instead of three indepdendent times, if we can manage the concurrency.

I would try again with either a static mutex or a lockfile.

Never thought of it that way — but if we can make it work using the same on-disk resources, that would be totally fine too 😄

There are three requirements we'd need to satisfy to go down that route:

  1. Avoid multiple installations
    We need to ensure the npm dependencies are installed only once.

  2. Distinguish between server instances
    If multiple servers are launched from the same node_modules directory, we need a way to identify which server was started for which Java formatter instance.

  3. Support concurrent usage
    The npm modules must allow concurrent access.

Regarding point 3, I believe we're safe — since we don’t mutate any shared state on disk once the npm servers are installed and running.

For 1, we could use Java semaphores to guard the install step. Lock files might work too — I’ll need to investigate further.

2 is currently handled by the npm server writing a file with its assigned port when it starts. To make this work concurrently, we need a reliable way to map Java formatter instances to the correct npm server ports. Two options come to mind:

  • Pass a unique identifier from the formatter to the npm process on launch, so we can name the port file accordingly. This would allow each formatter to read its own port file reliably. (My preference.)
  • Use stdout/stderr to pass the port from the npm process back to the Java side. (Not ideal — I'd rather avoid parsing output streams if we can help it.)

Alternatively — and this just occurred to me — what if we don’t run the npm module as a long-running server at all? Instead, we could call it as a CLI tool per format call. This would simplify things in terms of lifecycle and coordination, though of course it could have performance tradeoffs. On the flip side, it would increase robustness and reduce the coupling between formatter and process.

Curious to hear your thoughts on all of the above — especially the CLI-per-call idea!

simschla added 7 commits April 8, 2025 16:09
to be able to launch multiple instances of npm in the same node_modules
dir.

Refs: diffplug#2462
before: make sure every formatter uses its own node_modules dir.
now: actively support using same node_modules dir for multiple
formatters

Refs: diffplug#2462
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants