Send raw headers, without validation/modification #4144

Open
@pimterry

This would solve...

For some use cases you want to fully specify all headers yourself, without automatic behaviour for headers like content-length, connection, etc.

This is particularly relevant for proxy servers, which often forward requests with a complete set of headers that they want to proxy onwards accurately, with the bare minimum of validation & changes. It's also helpful for any other tool that needs low-level control of HTTP content and wants to manage that itself.

Right now that's tricky, as trying to set many of these headers manually throws an error. This is a problem for anybody who doesn't want the default behaviours. For example:

$ undici.request('https://example.com', { headers: { 'Transfer-Encoding': 'gzip' } });
> Uncaught InvalidArgumentError: invalid transfer-encoding header

This is not an invalid header; it's a valid example listed on MDN. By default, though, there's a small set of headers, including transfer-encoding, that Undici effectively owns and won't allow users to configure.

I think this mostly makes sense as a default, but exposing options here would be helpful for parity with Node's http module as we move towards exposing Undici.request in core. Node's http has supported removeHeader(x) to disable the default behaviour for various headers since 0.4, and recently gained setDefaultHeaders: false to toggle this more generally. If Undici is eventually going to make http legacy, it would be good to let users similarly take full control of these headers in advanced cases.

This would also offer a nice performance benefit, since it means we can skip a bunch of preprocessing completely!

(Yes, transfer-encoding and many other examples here are per-hop headers, so technically a proxy doesn't need to forward them, but there are often good reasons to make the upstream connection behaviour match downstream behaviour where possible.)

The implementation should look like...

Personally I think making raw headers always behave like this would have been quite reasonable (i.e. if you pass a flat header array, you're taking responsibility for all header values, and Undici won't validate or modify them). That would look like this:

$ undici.request('https://example.com', { headers: [ 'Transfer-Encoding', 'gzip' ] });

This would be fine if we had a blank slate, but in reality it would be a breaking change, since that API approach is already supported with automatic header management. I don't know how many people use the raw header format, but disabling automatic behaviours would effectively break everything for all of them. If any significant number of people use this API, that would be quite bad.

Alternatively, there are options like:

  • Use a rawHeaders option, to opt into this
  • Create a RawHeaders class you instantiate elsewhere and pass to headers
  • A disableDefaultHeaders boolean option
  • ...?
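
As a rough illustration of the RawHeaders class option, here is a toy model of how an opt-in wrapper could sit alongside the existing flat-array behaviour without breaking it. Everything here is hypothetical: RawHeaders and prepareHeaders are made-up names, nothing like them exists in Undici today, and the MANAGED set is a stand-in rather than Undici's real list of managed headers.

```javascript
// Hypothetical sketch: none of these names exist in Undici.

// Stand-in for the headers a client manages itself (NOT Undici's real list).
const MANAGED = new Set(['transfer-encoding', 'content-length', 'connection']);

// Marker wrapper: passing one of these signals "I own every header value".
class RawHeaders {
  constructor(flat) {
    if (!Array.isArray(flat) || flat.length % 2 !== 0) {
      throw new TypeError('expected a flat [name, value, ...] array');
    }
    this.flat = flat;
  }
}

// Toy version of header preprocessing for flat [name, value, ...] arrays.
function prepareHeaders(headers) {
  if (headers instanceof RawHeaders) {
    // Opted in: skip all validation and return the array untouched.
    return headers.flat;
  }
  // Plain flat arrays keep today's behaviour: managed headers are rejected.
  for (let i = 0; i < headers.length; i += 2) {
    const name = String(headers[i]).toLowerCase();
    if (MANAGED.has(name)) {
      throw new Error(`invalid ${name} header`);
    }
  }
  return headers;
}
```

With this shape, prepareHeaders(['Transfer-Encoding', 'gzip']) still throws as it does now, while prepareHeaders(new RawHeaders(['Transfer-Encoding', 'gzip'])) passes the array through unchanged, so existing users of the flat format are unaffected.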

Would love to hear opinions.

I have also considered...

This relates somewhat to #3998, which is looking to skip header validation for different reasons (to cache & reuse pre-validated/generated headers). It might be possible to align both approaches, although presumably in that other use case you would still want automatic headers most of the time in the initial preprocessing, so it's unlikely to be a solution out of the box.

Additional context

In both this issue and #3998, there are some tricky edge cases for HTTP/2 to watch out for, because headers need translation between HTTP/1 and HTTP/2. If you don't yet know which protocol the server uses, you can't know which headers you need to send, and even if you're reusing headers after the first request, you can't know that a server will always use HTTP/2 for subsequent connections (likely, but not guaranteed). The automatic behaviour here is minimal, so it's not too bad (add :method etc., lowercase everything, host -> :authority, just drop connection et al.), but the approach needs some thought to provide control while preserving the super-useful transparent HTTP/2 + HTTP/1 support.
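
To make the translation steps concrete, here is a minimal sketch of the HTTP/1 to HTTP/2 mapping listed above, applied to a flat raw header array. This is not Undici's implementation: toHttp2Headers is a hypothetical helper, and it is simplified (for example, it does not also drop headers named inside the Connection header's value).

```javascript
// Hypothetical helper, not part of Undici: a sketch of the steps described above.

// Connection-specific headers that must not be sent over HTTP/2
// (see RFC 9113, section 8.2.2).
const CONNECTION_SPECIFIC = new Set([
  'connection',
  'keep-alive',
  'proxy-connection',
  'transfer-encoding',
  'upgrade'
]);

// rawHeaders is a flat [name, value, name, value, ...] array.
// Returns a list of [name, value] pairs with pseudo-headers first.
function toHttp2Headers(rawHeaders, { method, path, scheme = 'https' }) {
  const pseudo = [[':method', method], [':path', path], [':scheme', scheme]];
  const regular = [];
  for (let i = 0; i < rawHeaders.length; i += 2) {
    const name = String(rawHeaders[i]).toLowerCase(); // lowercase everything
    const value = String(rawHeaders[i + 1]);
    if (name === 'host') {
      pseudo.push([':authority', value]); // host -> :authority
    } else if (!CONNECTION_SPECIFIC.has(name)) {
      regular.push([name, value]); // everything else passes through
    }
    // connection-specific headers are silently dropped
  }
  return [...pseudo, ...regular];
}
```

Even this minimal mapping changes and removes headers, which is the tension here: a fully raw mode either has to skip this step (and only work for HTTP/1) or define which of these changes still apply when the connection turns out to be HTTP/2.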

I'm happy to work on implementing this; I'm opening this issue to discuss approaches & enthusiasm for the concept overall.
