Commit 6883a9b

Add more details and add description of existing messaging APIs
1 parent 72797e2 commit 6883a9b

2 files changed

+284
-39
lines changed

.RFC-0000-template.md.swp

-16 KB
Binary file not shown.

RFC-0026-logging-system.md

+284-39
@@ -1,57 +1,47 @@
1-
# PyTorch Logging System
1+
# New PyTorch Logging System
22

33
## **Summary**
44
Create a message logging system for PyTorch with the following requirements:
55

6+
### Consistency
7+
8+
* The C++ and Python APIs should match each other as closely as possible.
9+
610
* All errors, warnings, and other messages generated by PyTorch should be
7-
emitted using the the logging system API
11+
emitted using the logging system API.
12+
813

9-
* The APIs for emitting messages and changing settings should all be consistent
10-
between C++ and Python
14+
### Severity and message classes
1115

1216
* Offer different message severity levels, including at least the following:
1317

1418
- **Info**: Emits a message without creating a warning or error. By default,
15-
this gets printed to stdout
19+
this gets printed to stdout.
1620

17-
- **Warning**: Emits a message as a warning. By default, this will turn into
18-
a Python warning
21+
- **Warning**: Emits a message as a warning. If a warning is never caught,
22+
the warning may get printed to stdout.
1923

20-
- **Error**: Emits a message as an error. By default, this will turn into
21-
a Python error
24+
- **Error**: Emits a message as an error. If an error is never caught, the
25+
application will quit.
2226

2327
- TODO: Should we also have a **Fatal** severity for integration with
2428
Meta's internal logging system? A fatal message terminates the program
2529

26-
* Offer different classes of messages, including at least the following:
27-
28-
- **Default**: A catch-all message class
29-
30-
- **Nondeterministic**: Emitted when `torch.use_deterministic_algorithms(True)`
31-
is set and a nondeterministic operation is called
32-
33-
- **Deprecated**: Emitted when a deprecated function is called
30+
* Offer different message classes under each severity level.
3431

35-
- **Beta**: Emitted when a beta feature is called. See
36-
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/)
32+
- Every message is emitted as an instance of a message class.
3733

38-
- **Prototype**: Emitted when a prototype feature is called. See
39-
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/)
34+
- Each message class has both a C++ class and a Python class, and when a
35+
C++ message is propagated to Python, it is converted to its corresponding
36+
Python class.
4037

41-
- TODO: Should all the classic Python errors and warnings (`TypeError`,
42-
`ValueError`, `NotImplementedError`, `DeprecationWarning`, etc) have their
43-
own message class? Or are those separate from our concept of a message
44-
class, and any message class is allowed to raise any Python exception or
45-
warning type?
38+
- Whenever it makes sense, the Python class should be one of the builtin
39+
Python error/warning classes. For instance, currently in PyTorch, the C++
40+
error class `c10::Error` gets converted to the Python `RuntimeError` class.
4641

47-
* Continue using warning/error APIs that currently exist in PyTorch wherever
48-
possible. For instance, `TORCH_CHECK`, `TORCH_WARN`, and `TORCH_WARN_ONCE`
49-
should continue to be used in C++
42+
* Adding new message classes and severity levels should be easy
5043

51-
- NOTE: These existing APIs don't currently have a concept of message classes,
52-
so that will need to be added
53-
54-
* Creating new message classes and severity levels should be easy
44+
### User-facing configurability
5545

5646
* Ability to turn warnings into errors. This is already possible with the
5747
Python `warnings` module filter, but the PyTorch docs should mention it and
@@ -66,7 +56,7 @@ Create a message logging system for PyTorch with the following requirements:
6656
to a warning, but we wouldn't want to downgrade an error from invalid
6757
arguments given to an operation.
6858

69-
- Disabling warnings in Python should already be possible with the `warnings`
59+
- Disabling warnings in Python is already possible with the `warnings`
7060
module filter. See [documentation](https://docs.python.org/3/library/warnings.html#the-warnings-filter).
7161
There is no similar system in C++ at the moment, and building one is probably
7262
low priority.
@@ -75,7 +65,7 @@ Create a message logging system for PyTorch with the following requirements:
7565
excessive printouts can degrade the user experience. Related to
7666
issue [#68768](https://github.com/pytorch/pytorch/issues/68768)
7767

78-
* Settings to avoid emitting duplicate messages generated by multiple
68+
* Settings to enable/disable emitting duplicate messages generated by multiple
7969
`torch.distributed` ranks. Related to issue
8070
[#68768](https://github.com/pytorch/pytorch/issues/68768)
8171

@@ -85,15 +75,20 @@ Create a message logging system for PyTorch with the following requirements:
8575
- NOTE: Currently `TORCH_WARN_ONCE` does this in C++, but there is no Python
8676
equivalent
8777

88-
- TODO: Should there be a setting to turn a warn-once into a warn-always for
89-
a given message class and vice versa?
78+
- NOTE: `torch.set_warn_always()` currently controls some warnings (maybe
79+
only the ones from C++? I need to find out for sure.)
80+
81+
- TODO: Should there be a setting to turn a warn-once into a warn-always and
82+
vice versa for an entire message class?
9083

9184
* Settings can be changed from Python, C++, or environment variables
9285

9386
- Filtering warnings with Python command line arguments should
9487
remain possible. For instance, the following turns a `DeprecationWarning`
9588
into an error: `python -W error::DeprecationWarning your_script.py`
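
The `warnings`-based settings in this list can be tried out with the standard library alone. The following is a minimal sketch, not PyTorch code: `noisy_op` is a hypothetical stand-in for any call that emits a warning.

```python
import warnings


def noisy_op():
    # Hypothetical stand-in for a library call that emits a warning.
    warnings.warn("this op is deprecated", DeprecationWarning)
    return 42


# Turn DeprecationWarning into an error. This is the in-code counterpart of
# `python -W error::DeprecationWarning your_script.py`.
with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    try:
        noisy_op()
        raised = False
    except DeprecationWarning:
        raised = True

# Silence the same warning class entirely.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("ignore", DeprecationWarning)
    result = noisy_op()
```

Both filters are scoped by `warnings.catch_warnings()`, so the process-wide filter state is restored when each `with` block exits.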
9689

90+
### Compatibility
91+
9792
* Should integrate with Meta's internal logging system, which is
9893
[glog](https://github.com/google/glog)
9994

@@ -102,12 +97,19 @@ Create a message logging system for PyTorch with the following requirements:
10297
* Must be OSS-friendly, so it shouldn't require libraries (like glog) which may
10398
cause incompatibility issues for projects that use PyTorch
10499

100+
### Other requirements
101+
102+
* Continue using warning/error APIs and message classes that currently exist in
103+
PyTorch wherever possible. For instance, `TORCH_CHECK`, `TORCH_WARN`, and
104+
`TORCH_WARN_ONCE` should continue to be used in C++
105+
105106
* TODO: Determine the requirements for the following concepts:
106107

107108
- Log files (default behavior and any settings)
108109

109110

110111
## **Motivation**
112+
111113
Original issue: [link](https://github.com/pytorch/pytorch/issues/72948)
112114

113115
Currently, it is challenging for PyTorch developers to provide messages that
@@ -116,5 +118,248 @@ act consistently between Python and C++.
116118
It is also challenging for PyTorch users to manage the messages that PyTorch
117119
emits. For instance, if a PyTorch user happens to be calling PyTorch functions
118120
that emit lots of messages, it can be difficult for them to filter out those
119-
messages so that their project's users don't get bombarded with warnings that
120-
they don't need to see.
121+
messages so that their project's users don't get bombarded with warnings and
122+
printouts that they don't need to see.
123+
124+
125+
## **Proposed Implementation**
126+
127+
### Message classes
128+
129+
At least the following message classes should be available. The name of the
130+
C++ class appears first in all the listed entries below, with the Python class
131+
to the right of it.
132+
133+
Each severity level has a default class. All other classes within a given
134+
severity level inherit from the corresponding default class.
135+
136+
NOTE: Most of the error classes below already exist in PyTorch. However,
137+
info classes do not currently exist. Also, only one type of warning currently
138+
exists in C++, and it is not implemented as a C++ class that can be inherited
139+
(as far as I understand).
140+
141+
#### Error message classes:
142+
143+
* **`c10::Error`** - `RuntimeError`
144+
- Default error class. Other error classes inherit from it.
145+
146+
* **`c10::IndexError`** - `IndexError`
147+
- Emitted when attempting to access an element that is not present in
148+
a list-like object.
149+
150+
* **`c10::ValueError`** - `ValueError`
151+
- Emitted when a function receives an argument with correct type but
152+
incorrect value.
153+
154+
* **`c10::TypeError`** - `TypeError`
155+
- Emitted when a function receives an argument with incorrect type.
156+
157+
* **`c10::NotImplementedError`** - `NotImplementedError`
158+
- Emitted when a feature that is not implemented is called.
159+
160+
* **`c10::LinAlgError`** - `torch.linalg.LinAlgError`
161+
- Emitted from the `torch.linalg` module when there is a numerical error.
162+
163+
* **`c10::NondeterministicError`** - `torch.NondeterministicError`
164+
- Emitted when `torch.use_deterministic_algorithms(True)` and
165+
`torch.set_deterministic_debug_mode('error')` are set, and a
166+
nondeterministic operation is called.
167+
168+
169+
#### Warning message classes:
170+
171+
* **`c10::UserWarning`** - `UserWarning`
172+
- Default warning class. Other warning classes inherit from it.
173+
174+
* **`c10::BetaWarning`** - `torch.BetaWarning`
175+
- Emitted when a beta feature is called. See
176+
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/).
177+
178+
* **`c10::PrototypeWarning`** - `torch.PrototypeWarning`
179+
- Emitted when a prototype feature is called. See
180+
[PyTorch feature classifications](https://pytorch.org/blog/pytorch-feature-classification-changes/).
181+
182+
* **`c10::NondeterministicWarning`** - `torch.NondeterministicWarning`
183+
- Emitted when `torch.use_deterministic_algorithms(True)` and
184+
`torch.set_deterministic_debug_mode('warn')` are set, and a
185+
nondeterministic operation is called.
186+
187+
* **`c10::DeprecationWarning`** - `DeprecationWarning`
188+
- Emitted when a deprecated function is called.
189+
- TODO: `DeprecationWarning`s are ignored by default in Python, so we may
190+
actually want to use a different Python class for this.
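
One consequence of having every warning class inherit from the default class is that the existing Python filter can target a single class or the whole family. The sketch below assumes plain Python stand-ins for the proposed classes; `BetaWarning` and `PrototypeWarning` here are hypothetical local definitions, not the eventual `torch` classes.

```python
import warnings


class BetaWarning(UserWarning):
    """Hypothetical stand-in for the proposed `torch.BetaWarning`."""


class PrototypeWarning(UserWarning):
    """Hypothetical stand-in for the proposed `torch.PrototypeWarning`."""


def call_beta_feature():
    warnings.warn("this feature is in beta", BetaWarning)


def call_prototype_feature():
    warnings.warn("this feature is a prototype", PrototypeWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")               # start from a known state
    warnings.simplefilter("ignore", BetaWarning)  # silence one class only
    call_beta_feature()
    call_prototype_feature()

# Only the prototype warning was recorded.
categories = [w.category for w in caught]
```

Filtering on `UserWarning` instead would affect both classes, since the filter matches subclasses as well.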
191+
192+
193+
#### Info message classes:
194+
195+
* **`c10::Info`** - `torch.Info`
196+
- Default info class. Other info classes inherit from it.
197+
198+
199+
### Error APIs
200+
201+
The APIs for raising errors all share a similar form. They check a boolean
202+
condition, the `cond` argument in the following signatures, and throw an error
203+
if that condition is false.
204+
205+
The error APIs also each have a variable length argument list, `...` in C++ and
206+
`*args` in Python. When an error is raised, these arguments are concatenated
207+
into a string, and the string becomes the body of the error message. If
208+
possible, a developer who writes these error messages should try to include
209+
enough information so that a user could understand why the error happened and
210+
what to do about it. If that goal is not possible, the message should at least
211+
contain some useful information to lead the user in the right direction.
212+
213+
The error APIs are listed below, with the C++ signature on the left and the
214+
corresponding Python signature on the right.
215+
216+
**`TORCH_CHECK(cond, ...)`** - `torch.check(cond, *args)`
217+
- C++ error: `c10::Error`
218+
- Python error: `RuntimeError`
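
To make the proposed Python signature concrete, here is a minimal sketch of what `check(cond, *args)` could look like. It is an illustration of the behavior described above, not an existing PyTorch function; `safe_divide` is a hypothetical caller.

```python
def check(cond, *args):
    """Raise RuntimeError if `cond` is false (hypothetical sketch)."""
    if not cond:
        # Build the message only on failure, so the passing path stays cheap.
        raise RuntimeError("".join(str(a) for a in args))


def safe_divide(a, b):
    # The message says what went wrong and includes the offending value.
    check(b != 0, "Expected a nonzero divisor, but got b=", b)
    return a / b
```

Concatenating the arguments with `str` mirrors how the C++ macros stream their arguments into the message.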
219+
220+
TODO: Add the rest of these and also add sections for warnings and info.
221+
222+
### Other details
223+
224+
* At the moment in PyTorch, the Python `warnings` module is being publicly
225+
included in `torch` as `torch.warnings`. This should probably be removed or
226+
renamed to `_warnings` to avoid confusion.
227+
228+
229+
# PyTorch's current messaging API
230+
231+
The rest of this document contains details about the current messaging API in
232+
PyTorch. This is included to give better context about what will change and
233+
what will stay the same in the new messaging system.
234+
235+
At the moment, PyTorch has some APIs in place to make a lot of aspects of
236+
message logging easy, from the perspective of a developer working on PyTorch.
237+
Messages can be either printouts, warnings, or errors.
238+
239+
Errors are created with the standard `raise` statement in Python
240+
([documentation](https://docs.python.org/3/tutorial/errors.html#raising-exceptions)).
241+
In C++, PyTorch offers macros for creating errors (which are listed later in
242+
this document). When errors raised in a C++ function propagate to Python, they
243+
get converted to Python errors.
244+
245+
Warnings are created with `warnings.warn` in Python
246+
([documentation](https://docs.python.org/3/library/warnings.html)). In C++,
247+
PyTorch offers macros for creating warnings (which are listed later in this
248+
document). When warnings generated in a C++ function propagate to Python, they
249+
get converted to Python warnings.
250+
251+
Printouts (called "Info" severity messages in the new system) are
252+
created with just `print` in Python and `std::cout` in C++.
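
The three kinds of Python-side messages described above can be seen side by side in a small sketch. `parse_positive` is a hypothetical function used only for illustration.

```python
import warnings


def parse_positive(text):
    value = int(text)
    if value <= 0:
        # Error: created with a plain `raise`.
        raise ValueError(f"expected a positive integer, got {value}")
    if value > 1000:
        # Warning: created with `warnings.warn`.
        warnings.warn(f"{value} is unusually large", UserWarning)
    # Printout ("Info" in the new system): created with `print`.
    print(f"parsed {value}")
    return value
```

Only the warning goes through a mechanism that users can filter by class. The printout has no such control, which is part of what the Info severity is meant to address.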
253+
254+
PyTorch's C++ warning/error macros are declared in
255+
[`c10/util/Exception.h`](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h).
256+
257+
## PyTorch C++ Errors
258+
259+
In C++, there are several different types of errors that can be used, but
260+
PyTorch developers typically don't deal with these error classes directly.
261+
Instead, they use macros that offer a concise interface for raising different
262+
error classes.
263+
264+
### C++ error macros
265+
266+
Each of the error macros evaluates a boolean conditional expression, `cond`. If
267+
the condition is false, the error is raised, and whatever extra arguments are
268+
in `...` get concatenated into the error message with `operator<<`.
269+
270+
| Macro | C++ Error class |
271+
| ---------------------------------------- | ------------------------------ |
272+
| `TORCH_CHECK(cond, ...)` | `c10::Error` |
273+
| `TORCH_CHECK_WITH(error_t, cond, ...)` | caller specifies `error_t` arg |
274+
| `TORCH_CHECK_LINALG(cond, ...)` | `c10::LinAlgError` |
275+
| `TORCH_CHECK_INDEX(cond, ...)` | `c10::IndexError` |
276+
| `TORCH_CHECK_VALUE(cond, ...)` | `c10::ValueError` |
277+
| `TORCH_CHECK_TYPE(cond, ...)` | `c10::TypeError` |
278+
| `TORCH_CHECK_NOT_IMPLEMENTED(cond, ...)` | `c10::NotImplementedError` |
279+
280+
There is some documentation on error macros [here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L344-L362).
281+
282+
The reason why C++ preprocessor macros are used, rather than function calls, is
283+
to ensure that the compiler can optimize for the `cond == true` branch. In
284+
other words, if an error does not get raised, overhead is minimized.
285+
286+
### C++ error classes
287+
288+
The primary error class in C++ is `c10::Error`. Documentation and declaration
289+
are
290+
[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L21-L28).
291+
`c10::Error` is a subclass of `std::exception`.
292+
293+
There are other error classes which are child classes of `c10::Error`, defined
294+
[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L195-L236).
295+
296+
When these errors propagate to Python, they are each converted to a different
297+
Python error class:
298+
299+
| C++ error class | Python error class |
300+
| ------------------------------- | -------------------------- |
301+
| `c10::Error` | `RuntimeError` |
302+
| `c10::IndexError` | `IndexError` |
303+
| `c10::ValueError` | `ValueError` |
304+
| `c10::TypeError` | `TypeError` |
305+
| `c10::NotImplementedError` | `NotImplementedError` |
306+
| `c10::EnforceFiniteError` | `ExitException` |
307+
| `c10::OnnxfiBackendSystemError` | `ExitException` |
308+
| `c10::LinAlgError` | `torch.linalg.LinAlgError` |
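
The builtin rows of this table can be modeled as a simple lookup. The real translation happens in C++, so the following is only a rough Python model of it; `translate_error` is a hypothetical helper, and the fallback to `RuntimeError` for unlisted classes is an assumption.

```python
# C++ class name -> Python exception type, for the builtin rows of the table.
CPP_TO_PYTHON_ERROR = {
    "c10::Error": RuntimeError,
    "c10::IndexError": IndexError,
    "c10::ValueError": ValueError,
    "c10::TypeError": TypeError,
    "c10::NotImplementedError": NotImplementedError,
}


def translate_error(cpp_class_name, message):
    # Assumption: unlisted classes fall back to the default, RuntimeError.
    py_type = CPP_TO_PYTHON_ERROR.get(cpp_class_name, RuntimeError)
    return py_type(message)
```

A caller would then `raise translate_error(name, msg)`, and user code could catch the builtin exception type as usual.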
309+
310+
311+
## PyTorch C++ Warnings
312+
313+
When warnings propagate from C++ to Python, they are converted to a Python
314+
`UserWarning`. Whatever is in `...` will get concatenated into the warning
315+
message using `operator<<`.
316+
317+
* `TORCH_WARN(...)`
318+
- [Definition](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L515-L530)
319+
320+
* `TORCH_WARN_ONCE(...)`
321+
- [Definition](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/c10/util/Exception.h#L557-L562)
322+
- This macro only generates a warning the first time it is encountered during
323+
run time.
324+
325+
326+
## Implementation details
327+
328+
### C++ to Python Error Translation
329+
330+
`c10::Error` and its subclasses are translated into their corresponding Python
331+
errors [in `CATCH_CORE_ERRORS`](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/torch/csrc/Exceptions.h#L54-L100).
332+
333+
However, not all of the `c10::Error` subclasses in the table above appear here.
334+
I'm not sure yet what's up with that.
335+
336+
`CATCH_CORE_ERRORS` is included within the `END_HANDLE_TH_ERRORS` macro that
337+
every Python-bound C++ function uses for handling errors. For instance,
338+
`THPVariable__is_view` uses the error handling macro
339+
[here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/tools/autograd/templates/python_variable_methods.cpp#L76).
340+
341+
342+
#### `torch::PyTorchError`
343+
344+
There's also an extra error class in `CATCH_CORE_ERRORS`,
345+
`torch::PyTorchError`. I'm not sure yet why it exists and how it differs from
346+
`c10::Error`. `torch::PyTorchError` has several subclasses:
347+
348+
* `torch::IndexError`
349+
* `torch::TypeError`
350+
* `torch::ValueError`
351+
* `torch::NotImplementedError`
352+
* `torch::AttributeError`
353+
* `torch::LinAlgError`
354+
355+
356+
### C++ to Python Warning Translation
357+
358+
The conversion of warnings from C++ to Python is described [here](https://github.com/pytorch/pytorch/blob/72e4aab74b927c1ba5c3963cb17b4c0dce6e56bf/torch/csrc/Exceptions.h#L25-L48).
359+
360+
361+
## Misc Notes
362+
363+
[PyTorch Developer Podcast - Python exceptions](https://pytorch-dev-podcast.simplecast.com/episodes/python-exceptions)
364+
explains how C++ errors/warnings are converted to Python. TODO: listen to it
365+
again and take notes.
