Skip to content

fix: realpathSafe to safely resolve symlinks for seatbelt #378

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

jonchurch
Copy link
Contributor

@jonchurch jonchurch commented Apr 19, 2025

This is a followup to #370 and #275

It fixes #330

And also integrates feedback from @bolinfest (ty for the very good feedback!)

I have concerns with this PR:

Without testing, it is not 100% clear to me that this fixes your specific problem without creating problems for a number of other people. For example, if I have a symlink like ~/myrepo, which is a symlink to one of multiple clones of my repo (~/src/myrepo1, ~/src/myrepo2, etc.) and I start Codex in ~/myrepo, will canonicalizing the path screw things up? Because the agent is likely to spawn commands with paths relative to ~/myrepo.

I tested w/ a symlinked cwd, and in both versions (w/ and w/o realpath) they succeeded for reading/writing in the cwd. Which surprised me, tbh. I expected that it would fail w/o the realpath resolution.

Seatbelt is apparently poorly documented, or rather the implementation and policy language is considered apple internal. Im reading reverse engineer documents and bug reports to wayfind here 🤣 The bug we were fixing for originally of /var needed to be resolved in the policy to /private/var is an edgecase that I have read about (yes Im using AI to accelerate this understanding, buyer beware) and seems to be known.

Edit to add: Didnt make this clear, we are not touching userland paths. We are only altering the paths we add to the policy, for comparison against what the kernel hands to seatbelt for checking.

This uses realpathSync(). In general, we try to avoid blocking I/O calls in our codebase.
You call this in execWithSeatbelt(), which means the canonicalization is done every time Seatbelt is used as opposed to just doing it one time at the start of the agent loop.

addressed the sync. I did sync originally bc I missed that the function was returning Promise<ExecResult>, saw the lack of async keyword on the func and thought I was making the least invasive change.

As for caching it, yeah I can do that if desired. This isn't what I'd call a hotpath, but if the additional complexity is worth trimming the waste here in your mind I can do that. We have minute long network calls making up most of the latency in this application. Strong opinion held weakly on my end.

A workaround would be to use the new -w flag I introduced in #263 with -w $GOTMPDIR or -w $(realpath $GOTMPDIR) or however things work in Go. (Sorry, I don't write Go.)

The issue here is that writing to macos tmpdir is currently broken without the realpath change. The agent cannot write to it, nor the software it invokes. Im not a go dev either, but the particulars are generic to tmp dir writeability.

Try it out on a mac:

echo $TMPDIR # /var/folders/....
codex --full-auto -w $TMPDIR "Append some data to a test file in my macos tmpdir, we are testing your ability to write to it"`

A targetted fix would be to realpath just the tmp dir and play whackamole as new reports come in w/ other edgecaes (which is fully valid). I can make that change if desired.

@tibo-openai
Copy link
Collaborator

Ty for persisting and fixing this for others. CI looks like it's failing, but will merge once that is fixed!

@jonchurch

This comment was marked as outdated.

@jonchurch

This comment was marked as outdated.

@jonchurch
Copy link
Contributor Author

@tibo-openai I do think that some extra opinions on this change is warranted, if you have any macos greybeards you can ping to look at this seatbelt policy, that would go pretty far.

The specifics here are not something Im confident in beyond what I have been able to test

@tibo-openai
Copy link
Collaborator

Deferring to @bolinfest for the review here

@bolinfest
Copy link
Collaborator

@jonchurch thanks for your detailed investigation on this! Here's what I think we should do:

Since #263 already elevated additionalWritableRoots to cli.tsx and threads it all the way through to exec.ts, we should do the merging of additionalWritableRoots and the default writable roots in cli.tsx and then just thread writableRoots: ReadonlyArray<string> all the way through instead of additionalWritableRoots: ReadonlyArray<string>. That way, we only construct the array once up front.

As for the set of default writable roots, based on your investigation, I would only do it for macOS and I would NOT try/catch because if realpath() fails, something is wrong and we should know about it:

const writableRoots = [
  process.cwd(),
  // Comment here explaining why...should probably link to this PR?
  process.platform === 'darwin' ? await realpath(os.tmpdir()) : os.tmpdir(),
  ...additionalWritableRoots,
];

process.cwd() always returns the canonical path (well, at least on UNIX? I haven't checked on Windows), so there's no need for realpath() there. As for additionalWritableRoots, they were supplied by the user, so I think we should preserve them as-is and then it's up to a user to decide whether to use realpath before passing them in.

Though I am curious if, for some reason, we need both os.tmpdir() and realpath(os.tmpdir()) on macOS. You see, we have been using Codex to develop Codex, and so I am pretty sure that it can run Codex's own npm run test in full-auto mode, yet we are writing to Node.js's mkdtempSync() here:

projectDir = mkdtempSync(join(tmpdir(), "codex-proj-"));
mkdirSync(join(projectDir, ".git")); // mark as project root

I'm pretty sure full-auto mode has also worked with Rust tests that write to $TMPDIR via https://crates.io/crates/tempfile.

So perhaps the issue is "specific to Go" insofar as Go does realpath($TMPDIR) internally for its tmpdir needs? (Which is reasonable, just different than what the standard equivalents in Node.js and Rust do.)

I just want to be sure we don't fix something for Go and break the other languages.

@bolinfest
Copy link
Collaborator

@jonchurch just checking in: were you planning on following up on this PR? Since you did a lot of the investigation, I think it would be great to make the suggested changes and get this in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error: ENOENT: no such file or directory, lstat '/Users/{$profile}/.pyenv'
3 participants