Open
Bug report
Bug description:
Python decodes the byte sequence 0x8FA2B7 as ~ (TILDE, U+007E) in EUC-JP.
assert b'\x8f\xa2\xb7'.decode('euc_jp') == '~'
This reference document is ambiguous in that it shows a simple ~ (TILDE), but most other software (iconv, Vim, Firefox, Rust's encoding_rs) interprets this sequence as ～ (FULLWIDTH TILDE, U+FF5E). Note that EUC-JP already includes US-ASCII, so the plain tilde has its own single-byte encoding:
assert '~'.encode('euc-jp') == b'~'
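To make the mismatch easier to see, here is a minimal sketch that prints the code point the decoder actually produces instead of asserting one. The helper name `describe` is made up for illustration, and the result of the decode depends on the CPython version in use.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for ch in text
    ]


# Three-byte sequence from this report: 0x8F followed by 0xA2 0xB7.
original = b"\x8f\xa2\xb7"
decoded = original.decode("euc_jp")
print(describe(decoded))

# The two candidate interpretations, for comparison.
print(describe("~"))       # plain tilde
print(describe("\uff5e"))  # fullwidth tilde

# EUC-JP contains US-ASCII, so a plain tilde always encodes to one byte.
# If the decoder returned U+007E, re-encoding cannot reproduce the
# original three bytes, i.e. the round trip is lossy.
ascii_bytes = "~".encode("euc_jp")
lossy = decoded == "~" and ascii_bytes != original
print("round trip lossy:", lossy)
```

On the versions listed below, the first line is expected to show TILDE, which is the behaviour being reported.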
CPython versions tested on:
3.11, CPython main branch
Operating systems tested on:
Linux