xml.etree.ElementTree: file source must be binary for non-UTF-8 encodings

# Documentation

The [documentation for xml.etree.ElementTree.parse](https://docs.python.org/3/library/xml.etree.elementtree.html) describes the first argument `source` as follows: "... source is a filename or file object containing XML data. ..."

But for file objects, this does not work in all cases one would expect. A hint like the following is missing:
```
For file objects containing XML data in an encoding other than ASCII or UTF-8 (e.g. ISO 8859-1), the file must have been opened in binary mode.
```
Otherwise (if the file is opened in text mode, regardless of the `encoding` passed to `open`), non-ASCII characters are not read correctly. See this [question on Stack Overflow](https://stackoverflow.com/questions/47883390) and the [attached files in test_parseXml.zip](https://github.com/python/cpython/files/9930967/test_parseXml.zip) to reproduce the problem.


Here is an excerpt of the attached test code:

```python
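import xml.etree.ElementTree  # a bare "import xml" does not load the submodule

# ok: binary mode, the parser picks the encoding from the XML declaration
with open('test_ISO-8859-1.xml', 'rb') as fileInBinary:
    root = xml.etree.ElementTree.parse(fileInBinary).getroot()
print(root.attrib['attributeWithUmlauts'])

# garbage: text mode, even though the matching encoding is passed to open()
with open('test_ISO-8859-1.xml', 'r', encoding='ISO-8859-1') as fileInAscii:
    root = xml.etree.ElementTree.parse(fileInAscii).getroot()
print(root.attrib['attributeWithUmlauts'])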
```

giving the following output:

```
äöü
Ã¤Ã¶Ã¼
```
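For anyone without the attachment, here is a minimal self-contained sketch of the working variants. The sample document `XML_TEXT` and the temporary path are made up for illustration and are not the attached test file. It parses the same ISO 8859-1 data once through a binary file object and once by filename. The last line shows that the garbled output above is consistent with the text having been re-encoded as UTF-8 and then decoded as ISO 8859-1; that is an inference from the output, not a statement about the parser internals.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical sample document (assumption, not the attached test file).
XML_TEXT = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
    '<root attributeWithUmlauts="äöü"/>\n'
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test_ISO-8859-1.xml')

    # Write the document as ISO 8859-1 bytes, matching its XML declaration.
    with open(path, 'wb') as f:
        f.write(XML_TEXT.encode('iso-8859-1'))

    # Variant 1: file object opened in binary mode.
    with open(path, 'rb') as file_in_binary:
        root = ET.parse(file_in_binary).getroot()
    value_from_binary = root.attrib['attributeWithUmlauts']

    # Variant 2: pass the filename and let parse() open the file itself.
    value_from_path = ET.parse(path).getroot().attrib['attributeWithUmlauts']

print(value_from_binary)  # äöü
print(value_from_path)    # äöü

# UTF-8 bytes misread as ISO 8859-1 reproduce the garbled string.
mojibake = 'äöü'.encode('utf-8').decode('iso-8859-1')
print(mojibake)  # Ã¤Ã¶Ã¼
```

Both variants hand raw bytes to the parser, so the encoding named in the XML declaration is the only one applied.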


### Linked PRs
* gh-123887
