Description
Affects: 6.1.2
The class org.springframework.http.ContentDisposition should allow setting different encodings for the filename and filename* parts. According to MDN, the filename part exists for compatibility with user agents that don't support "complex" encodings.
We have observed that when setting the encoding to UTF-8, the filenames are parsed correctly by browsers (likely from filename*), but other clients, for example curl -OJ, don't decode the UTF-8 encoded filename part and instead write a file with a name like =?UTF-8?Q?myFile.txt?=, which is not a nice filename.
Example:
@Test
public void testContentDispositionUTF8() {
    var disposition = ContentDisposition.builder("attachment")
            .filename("myFile.txt", StandardCharsets.UTF_8)
            .build();
    assertEquals("attachment; filename=\"=?UTF-8?Q?myFile.txt?=\"; filename*=UTF-8''myFile.txt",
            disposition.toString());
}
It would be great if it were possible to specify that filename* should be encoded as UTF-8 and filename should be encoded in an ASCII-safe way (for example, by discarding all characters > 255).
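To make the "ASCII-safe" idea concrete, here is a minimal sketch using only the JDK. The class and method names (AsciiFilenameFallback, toAsciiFallback) are hypothetical and are not part of Spring. Dropping accents and replacing everything else with an underscore is just one possible strategy, assumed here for illustration.

```java
import java.text.Normalizer;

public final class AsciiFilenameFallback {

    private AsciiFilenameFallback() {
    }

    /**
     * Hypothetical helper (not a Spring API): reduces a filename to printable
     * ASCII so it can be used for the legacy filename parameter.
     */
    public static String toAsciiFallback(String filename) {
        // Decompose accented letters (e.g. "é" becomes "e" + combining accent)
        // so that the base letter survives the filtering below.
        String decomposed = Normalizer.normalize(filename, Normalizer.Form.NFD);
        StringBuilder sb = new StringBuilder(decomposed.length());
        decomposed.codePoints().forEach(cp -> {
            if (Character.getType(cp) == Character.NON_SPACING_MARK) {
                return; // drop combining accents entirely
            }
            boolean printableAscii = cp >= 0x20 && cp <= 0x7E;
            // Quotes and backslashes would need escaping inside a quoted-string,
            // so this sketch simply replaces them as well.
            boolean needsQuoting = cp == '"' || cp == '\\';
            sb.append(printableAscii && !needsQuoting ? (char) cp : '_');
        });
        return sb.toString();
    }
}
```

With this sketch, "。.txt" becomes "_.txt" and "résumé.txt" becomes "resume.txt", while a plain ASCII name such as "myFile.txt" is left unchanged.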
When choosing the ASCII charset and using ContentDisposition with a filename that contains characters outside the ASCII range, Tomcat will error.
For example:
response.setHeader(HttpHeaders.CONTENT_DISPOSITION, ContentDisposition.builder("attachment").filename("。.txt").build().toString());
Then Tomcat would eventually tell us
java.lang.IllegalArgumentException: The Unicode character [。] at code point [12,290] cannot be encoded as it is outside the permitted range of 0 to 255
at org.apache.tomcat.util.buf.MessageBytes.toBytesSimple(MessageBytes.java:310) ~[tomcat-embed-core-10.1.16.jar:10.1.16]
at org.apache.tomcat.util.buf.MessageBytes.toBytes(MessageBytes.java:283) ~[tomcat-embed-core-10.1.16.jar:10.1.16]
at org.apache.coyote.http11.Http11OutputBuffer.write(Http11OutputBuffer.java:389) ~[tomcat-embed-core-10.1.16.jar:10.1.16]
at org.apache.coyote.http11.Http11OutputBuffer.sendHeader(Http11OutputBuffer.java:368) ~[tomcat-embed-core-10.1.16.jar:10.1.16]
Due to this issue, it is not sufficient to allow filename to be encoded as ASCII while filename* is encoded as UTF-8. Instead, we need a way to strip non-safe characters from filename (or encode it in a better supported format?).
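Until something like this exists, one possible workaround is to assemble the header value by hand. The sketch below uses only the JDK; ContentDispositionSketch, encodeRfc5987 and attachment are hypothetical names, not Spring APIs. It assumes the caller already has an ASCII-only fallback name and wants filename* percent-encoded as UTF-8 in the RFC 5987 style.

```java
import java.nio.charset.StandardCharsets;

public final class ContentDispositionSketch {

    // Non-alphanumeric characters that RFC 5987 allows unescaped (attr-char).
    private static final String ATTR_CHARS = "!#$&+-.^_`|~";

    private ContentDispositionSketch() {
    }

    /** Percent-encodes the value as UTF-8 and prefixes it with UTF-8''. */
    static String encodeRfc5987(String value) {
        StringBuilder sb = new StringBuilder("UTF-8''");
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || (c < 0x80 && ATTR_CHARS.indexOf(c) >= 0)) {
                sb.append((char) c);
            } else {
                sb.append('%').append(String.format("%02X", c));
            }
        }
        return sb.toString();
    }

    /**
     * Builds an attachment header value with a plain ASCII filename and a
     * UTF-8 encoded filename*. The fallback must be printable ASCII already.
     */
    public static String attachment(String asciiFallback, String filename) {
        if (!asciiFallback.chars().allMatch(c -> c >= 0x20 && c <= 0x7E)) {
            throw new IllegalArgumentException("fallback must be printable ASCII");
        }
        // Escape backslashes and quotes so the quoted-string stays well-formed.
        String quoted = asciiFallback.replace("\\", "\\\\").replace("\"", "\\\"");
        return "attachment; filename=\"" + quoted + "\"; filename*="
                + encodeRfc5987(filename);
    }
}
```

For example, attachment("_.txt", "。.txt") yields attachment; filename="_.txt"; filename*=UTF-8''%E3%80%82.txt, which contains only ASCII characters and can be passed to response.setHeader(HttpHeaders.CONTENT_DISPOSITION, ...) directly. How well individual clients handle the two parameters would still need to be checked.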
Activity
Title changed from "ContentDisposition should allow specifying different encoding for `filename` and `filename*`" to "ContentDisposition should allow encoding filename and filename* with different strategies and encodings"
deg-hrisser commented on Nov 27, 2024
Hey, I have also stumbled upon the problem - not all User Agents support the currently used encoding, which caused a great deal of confusion.
In our case, it was a version of the Postman HTTP-Client.
It also explicitly states in Section 5 of RFC 2047:
In conjunction with RFC 6266, Appendix C.1, which reiterates:
...
It seems to me that the current strategy does not honor these standards,
although I fully admit I'm not very knowledgeable about these in particular.
I see this is still generally planned?
Is there any workaround short of copying / modifying the Content-Disposition code into our own project?
Thanks for your work :)