-
Notifications
You must be signed in to change notification settings - Fork 8
refactor: Prepare nullable column by default #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #31 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 36 39 +3
Lines 1788 1869 +81
=========================================
+ Hits 1788 1869 +81 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
I think most of the tests would be fine with nullable=False, so adding the new value there now will mean removal again soon (and it's a lot of lines). I would propose to add import os
import warnings
from collections.abc import Callable
from functools import wraps
TRUTHY_VALUES = ["1", "true"]
def skip_if(env: str) -> Callable:
"""Decorator to skip warnings based on environment variable.
If the environment variable is equivalent to any of TRUTHY_VALUES, the wrapped
function is skipped.
"""
def decorator(fun: Callable) -> Callable:
@wraps(fun)
def wrapper() -> None:
should_skip = os.getenv(env, "").lower() in TRUTHY_VALUES
if should_skip:
return
fun()
return wrapper
return decorator
@skip_if(env="DATAFRAMELY_IGNORE_NULLABLE_DEFAULT")
def warn_nullable_default_change() -> None:
# the actual warning goes here and then we can import
WDYT? |
I don't know, I have no strong feeling, but more i think about it more I think that we should go for the simplest solution: Ignoring the warning in the test from the |
I'm in favor of @AndreasAlbertQC's suggestion as I think it's a nice pattern for future deprecations and it's easy to implement and control. |
pyproject.toml
Outdated
@@ -83,6 +83,7 @@ module = ["pyarrow.*"] | |||
addopts = "--import-mode=importlib" | |||
filterwarnings = [ | |||
"ignore:datetime.datetime.utcfromtimestamp\\(\\) is deprecated.*:DeprecationWarning", | |||
"ignore:The 'nullable' argument was not explicitly set.*:FutureWarning", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't want our tests to generate warnings. I know this is a bit of work and please let us know if we can support this effort but we should adapt all of our tests 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, just to make sure we are align before starting the work. You want to add explicitly nullable=False everywhere in the test suite ? And remove them (or not) in the future breaking release as it will be the default ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked to @AndreasAlbertQC about this again yesterday evening and we converged on: don't bother, let's ignore the warning for now. In almost all test cases, it does not matter whether a column is nullable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey @gab23r ! It took @delsner @borchero and me a little while to coordinate and agree on how we want to proceed. Sorry for the delay, it's out first time navigating this particular question in dataframely
:)
Here's the plan we agreed on:
- We commit to making the actual breaking change only in a major release. We documented this commitment in docs: Document our approach to breaking changes #35. The exact timing of the major release is not set yet, but I think it's realistic to expect this in a few months.
- For the future warning in this PR, please implement the suggestion I made above for a warning function + environment-based skip. In docs: Document our approach to breaking changes #35, we also documented a new
DATAFRAMELY_NO_FUTURE_WARNINGS
environment variable that we'd like to use for this. The reasoning is that we want users to be able to disable the warning also at deployment time without having to write new code. - For the
dataframely
tests, please just disable the warning in this PR. There is no need to migrate the tests. Our reasoning is that while we generally never want to disable warnings, future warnings coming out of our own code are a special case because a) future warnings never mean something is broken right now, only might be in the future and b) our CI will naturally catch actual breakage once the breaking change comes in.
Let me know if you're still up to implement these changes, otherwise I'm of course happy to help. Thanks for your patience.
This sounds really good to me! I will implement it! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gab23r , almost there, I only have one more suggestion on the test.
def test_integer_constructor_warns_about_nullable() -> None: | ||
original_value = os.environ.get("DATAFRAMELY_NO_FUTURE_WARNINGS") | ||
os.environ["DATAFRAMELY_NO_FUTURE_WARNINGS"] = "0" | ||
try: | ||
with pytest.warns( | ||
FutureWarning, match="The 'nullable' argument was not explicitly set" | ||
): | ||
dy.Integer() | ||
finally: | ||
if original_value is not None: | ||
os.environ["DATAFRAMELY_NO_FUTURE_WARNINGS"] = original_value | ||
elif "DATAFRAMELY_NO_FUTURE_WARNINGS" in os.environ: | ||
del os.environ["DATAFRAMELY_NO_FUTURE_WARNINGS"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def test_integer_constructor_warns_about_nullable() -> None: | |
original_value = os.environ.get("DATAFRAMELY_NO_FUTURE_WARNINGS") | |
os.environ["DATAFRAMELY_NO_FUTURE_WARNINGS"] = "0" | |
try: | |
with pytest.warns( | |
FutureWarning, match="The 'nullable' argument was not explicitly set" | |
): | |
dy.Integer() | |
finally: | |
if original_value is not None: | |
os.environ["DATAFRAMELY_NO_FUTURE_WARNINGS"] = original_value | |
elif "DATAFRAMELY_NO_FUTURE_WARNINGS" in os.environ: | |
del os.environ["DATAFRAMELY_NO_FUTURE_WARNINGS"] | |
def test_column_constructor_warns_about_nullable(monkeypatch: pytest.MonkeyPatch) -> None: | |
monkeypatch.setenv("DATAFRAMELY_NO_FUTURE_WARNINGS", "") | |
with pytest.warns( | |
FutureWarning, match="The 'nullable' argument was not explicitly set" | |
): | |
dy.Integer() | |
@pytest.mark.parametrize("env_var", ["1", "True", "true"] | |
def test_future_warning_skip(monkeypatch: pytest.MonkeyPatch, env_var: str) -> None: | |
monkeypatch.setenv("DATAFRAMELY_NO_FUTURE_WARNINGS", env_var) | |
# Elevates FutureWarning to an exception | |
with warnings.catch_warnings(): | |
warnings.simplefilter("error", FutureWarning) | |
dy.Integer() |
How about splitting this test into two?
Motivation
towards #20
Changes
Changing to
nullable=False
by default would be a breaking change. For now, this setnullable
to None and warn if nullable is not explicitly set.Question, how to handle warning in tests, should I ignore them ? should I explicitly set nullable=True, nullable=False everywhere ?