-
Notifications
You must be signed in to change notification settings - Fork 987
Added comprehensive testing for visualization components #2737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Performance benchmarks:
|
for more information, see https://pre-commit.ci
…t type to tooltips
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Fixes : #2734 |
@EwoutH could you please review it ? |
Thanks a lot for this PR! @Sahil-Chhoker could you take a first look? I will try to dive into it somewhere this weekend. |
@Ya-shh over a thousand lines of codes added spread along 16 files will add a significant maintenance burden. Could you A) try to reduce it to the bare essential and then B) motivate why each remaining file / part of code is necessary? |
@EwoutH Hey, thanks for pointing this out. Actually, the files that increased the lines of code are docs.md files, which I thought might be useful for new contributors or even the community. However, if they seem like too much of a burden on the codebase, I will remove them. |
Sorry for my inactivity recently, I am trying to focus on my proposal. But I think I can take a look at this tomorrow. |
@Sahil-Chhoker no worries, only if you have the time. Your own proposal is important! @Ya-shh fair, a lot of it is docs indeed. I wil try to take a look later. |
@Ya-shh did you notice that all tests in CI are being skipped? |
mesa/__init__.py
Outdated
@@ -26,5 +26,5 @@ | |||
__title__ = "mesa" | |||
__version__ = "3.2.0.dev" | |||
__license__ = "Apache 2.0" | |||
_this_year = datetime.datetime.now(tz=datetime.UTC).date().year | |||
_this_year = datetime.datetime.now(tz=datetime.timezone.utc).date().year |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mesa currently supports Python 3.11 and above. That's enforced by pre-commit, which is why it's changed back every time.
See
Line 9 in 5166670
requires-python = ">=3.11" |
if post_process is not None: | ||
chart = post_process(chart) | ||
except Exception as e: | ||
# If chart creation fails but we have a post_process, call it with a minimal chart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its interesting idea, but not sure this is really what users want. Maybe yes, maybe no, but I would like to leave this change out of this PR
@@ -178,7 +187,7 @@ def _draw_grid(space, agent_portrayal, propertylayer_portrayal): | |||
# no y-axis label | |||
"y": alt.Y("y", axis=None, type=x_y_type), | |||
"tooltip": [ | |||
alt.Tooltip(key, type=alt.utils.infer_vegalite_type_for_pandas([value])) | |||
alt.Tooltip(key, type="nominal") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this change intentional? Doesn't seem related to testing
PR_DESCRIPTION.md
Outdated
@@ -0,0 +1,78 @@ | |||
# Add Comprehensive Testing for Mesa's Visualization Components |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this file and summary.md is not supposed to be part of this PR
tests/ui/test_browser_viz.py
Outdated
step_button = page_session.locator("button:has-text('Step')") | ||
step_button.wait_for() | ||
|
||
page_session.locator("svg").wait_for() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"svg" selector seems a bit brittle. If possible this should target something more specific and less implementation specific (here we render a svg, but this might change to a canvas for example). Maybe its possible to target the whole output components container?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right about the svg selector being brittle. If the visualization implementation changes from SVG to canvas or something else, this test would break.I have made the changes , you can now review them
component = make_mpl_space_component(self.agent_portrayal) | ||
box, rc = solara.render(component(mock_model), handle_error=False) | ||
|
||
assert rc.find("div").widget is not None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "not None" assertion is a good first step, but preferably it should test something a bit more specific (but can be done in later PRs)
Great initiative @Ya-shh ! Lots of things I like about this PR. I especially like the CI integration. Left a few smaller comments, but the biggest question is why all component tests are currently being skipped? |
This comment was marked as off-topic.
This comment was marked as off-topic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (18)
mesa/visualization/components/altair_components.py (1)
82-94
: Improved error handling for chart creation is a good addition.Adding a try-except block around chart creation provides a more graceful approach to handling exceptions when post-processing is involved. This ensures the post-process function gets called with a minimal chart even when the main chart creation fails.
Consider capturing the result of
post_process(minimal_chart)
even though you're going to re-raise the exception. For consistency with line 86, you might want to use:- post_process(minimal_chart) + minimal_chart = post_process(minimal_chart)tests/ui/README.md (1)
1-85
: Well-structured and comprehensive documentation!This README provides excellent guidance for setting up and running browser-based UI tests. The clear sections, code examples, and troubleshooting tips make it very accessible.
Consider adding:
- A brief explanation of what's included in the
viz-test
extra- More context about when/why to update reference screenshots
- Instructions on how to manually trigger the CI workflow
.github/workflows/ui-viz-tests.yml (4)
11-11
: Update checkout action versionThe current version of the checkout action is outdated. GitHub recommends using the latest version for security and performance improvements.
- - uses: actions/checkout@v3 + - uses: actions/checkout@v4🧰 Tools
🪛 actionlint (1.7.4)
11-11: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
14-16
: Update setup-python action versionThe setup-python action is using an outdated version. Update to the latest version for better compatibility and security.
- uses: actions/setup-python@v4 + uses: actions/setup-python@v5 with: python-version: '3.11'🧰 Tools
🪛 actionlint (1.7.4)
14-14: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
29-36
: Update upload-artifact action and add missing newlineThe upload-artifact action is outdated, and the file is missing a newline at the end.
- name: Upload test screenshots (on failure) if: failure() - uses: actions/upload-artifact@v3 + uses: actions/upload-artifact@v4 with: name: test-screenshots path: | tests/ui/screenshots tests/ui/diff-*.png - if-no-files-found: ignore + if-no-files-found: ignore +🧰 Tools
🪛 actionlint (1.7.4)
30-30: the runner of "actions/upload-artifact@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 YAMLlint (1.35.1)
[error] 36-36: no new line character at the end of file
(new-line-at-end-of-file)
6-9
: Consider adding timeout and dependency cachingTo prevent hung workflows and improve performance, consider adding a timeout limit and dependency caching.
jobs: browser-tests: runs-on: ubuntu-latest + timeout-minutes: 10
And for the Python setup step:
- name: Set up Python 3.11 uses: actions/setup-python@v5 with: python-version: '3.11' + cache: 'pip'
.github/workflows/viz-tests.yml (4)
17-17
: Upgrade Actions/Checkout Version
The workflow currently usesactions/checkout@v3
. Static analysis suggests that this version may be outdated. Please verify whether a newer release (if available) would provide added security or features and consider updating accordingly.🧰 Tools
🪛 actionlint (1.7.4)
17-17: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
19-23
: Upgrade Setup-Python Version
The workflow usesactions/setup-python@v4
. Similar to the checkout action, please verify if a more recent version exists and update if applicable to ensure the job runs with the latest improvements.🧰 Tools
🪛 actionlint (1.7.4)
20-20: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
43-43
: Upgrade Codecov Action Version
The step usingcodecov/codecov-action@v3
might also be based on an older version. It is advisable to check for a newer release of this action and upgrade if available to benefit from the latest fixes and features.🧰 Tools
🪛 actionlint (1.7.4)
43-43: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
48-48
: Add Newline at End of File
YAMLlint reports that there is no newline character at the end of the file. Please add a newline at the end to adhere to best practices and avoid potential issues in some tooling.🧰 Tools
🪛 YAMLlint (1.35.1)
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
tests/run_viz_tests.sh (2)
11-34
: Improve Argument Parsing
The script’s argument parsing uses afor
loop with embeddedshift
calls, which can be slightly confusing because$@
is expanded only once. Consider refactoring this section to use awhile
loop (e.g.,while [ "$#" -gt 0 ]; do ...; shift; done
) for clearer and more robust flag handling. Additionally, the documentation later references an--update-snapshots
flag that isn’t currently handled in the script. Please either add support for this flag or update the documentation for consistency.
86-89
: Exit Logic Verification
The exit logic correctly combines the results of the browser-less and browser-based tests. As an additional improvement, consider logging the failure codes explicitly to aid debugging in CI if tests fail.docs/VISUALIZATION_TESTING.md (1)
1-4
: Content Clarity
The introduction and testing approach sections are well organized. A brief proofreading pass to address any minor punctuation inconsistencies would further improve clarity.tests/VISUALIZATION_TESTING.md (1)
78-78
: Punctuation Suggestion
In the sentence detailing test coverage (line 78), consider adding a comma before “but” to improve readability as suggested by LanguageTool.tests/ui/test_browser_viz.py (2)
53-73
: Consider adding more robust assertions for chart verification.While checking for the presence of SVG paths is a good start, consider adding more specific assertions to verify that the charts render with expected data. For example, checking the number of data points or specific path attributes.
- chart_paths = page_session.locator("svg path") - assert chart_paths.count() > 0 + chart_paths = page_session.locator("svg path[d]") + # Ensure we have a reasonable number of paths for our chart + path_count = chart_paths.count() + assert path_count > 5, f"Expected multiple chart paths, got {path_count}" + + # Verify chart title or labels if possible + assert page_session.locator("text.chart-title").is_visible()
99-125
: Well-designed reference screenshot generation function.The skipped test for generating reference screenshots is well-documented and properly organized. The approach of generating screenshots for multiple models provides a solid foundation for visual regression testing.
Consider parametrizing this test to reduce code duplication:
@pytest.mark.skip(reason="Only run manually to generate reference screenshots") @pytest.mark.parametrize( "model_name,page_function,screenshot_filename", [ ("schelling", "mesa.examples.basic.schelling.app.page", "schelling_reference.png"), ("conway", "mesa.examples.basic.conways_game_of_life.app.page", "conway_reference.png"), ("boltzmann", "mesa.examples.basic.boltzmann_wealth_model.app.page", "boltzmann_reference.png"), ("virus", "mesa.examples.basic.virus_on_network.app.page", "virus_reference.png"), ], ) def test_generate_reference_screenshots(solara_test, page_session: Page, model_name, page_function, screenshot_filename): """Generate reference screenshots for all example models.""" import importlib module_name, attr_name = page_function.rsplit(".", 1) module = importlib.import_module(module_name) page = getattr(module, attr_name) page() page_session.wait_for_timeout(1000) page_session.screenshot(path=screenshot_filename)tests/test_visualization_components.py (2)
272-298
: Consider parameterizing performance benchmarks with different model sizes.The performance benchmark test is a good addition, but it only tests with one model size (50x50). Consider parameterizing it to test with different model sizes to better understand performance characteristics.
@pytest.mark.skip(reason="Benchmark tests are optional") @pytest.mark.parametrize("grid_size", [(10, 10), (50, 50), (100, 100)]) def test_performance_benchmarks(self, grid_size): """Test the rendering performance of visualization components with different model sizes.""" import time width, height = grid_size def agent_portrayal(agent): return {"color": "red", "marker": "o", "size": 5} # Create a model with the specified grid size model = Schelling(width=width, height=height, seed=42) # Measure rendering time start_time = time.time() component = make_altair_space(agent_portrayal) box, rc = solara.render(component(model), handle_error=False) render_time = time.time() - start_time # Scale threshold based on grid size max_time = 0.5 if width <= 10 else (5.0 if width <= 50 else 15.0) assert render_time < max_time, f"Rendering {width}x{height} grid took too long: {render_time:.2f}s (max: {max_time}s)"
301-309
: Consider adding inline documentation to explain the relationship between class-based and module-level tests.The presence of both a class-based benchmark and a module-level benchmark function is confusing without more explanation. The comment on line 308 indicates they are related, but more detail would be helpful.
@pytest.mark.skip(reason="Benchmark tests are optional") def test_performance_benchmarks(): - """Module-level test function for visualization performance benchmarks. + """Module-level test function for visualization performance benchmarks. This is a module-level version of the TestPerformanceBenchmarks.test_performance_benchmarks test to ensure compatibility with CI test runners. + + Some CI systems and test discovery mechanisms work better with module-level + test functions than with class-based tests. This function ensures that + benchmark tests can be run in both contexts. The actual implementation + is in the class-based version to maintain consistency with other test cases. """ # The actual benchmark is in TestPerformanceBenchmarks.test_performance_benchmarks
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
.github/workflows/build_lint.yml
(1 hunks).github/workflows/ui-viz-tests.yml
(1 hunks).github/workflows/viz-tests.yml
(1 hunks)PR_DESCRIPTION.md
(1 hunks)docs/VISUALIZATION_TESTING.md
(1 hunks)mesa/discrete_space/property_layer.py
(2 hunks)mesa/visualization/components/altair_components.py
(2 hunks)pyproject.toml
(1 hunks)tests/VISUALIZATION_TESTING.md
(1 hunks)tests/run_viz_tests.sh
(1 hunks)tests/test_visualization_components.py
(1 hunks)tests/ui/README.md
(1 hunks)tests/ui/__init__.py
(1 hunks)tests/ui/conftest.py
(1 hunks)tests/ui/test_browser_viz.py
(1 hunks)
🧰 Additional context used
🧬 Code Definitions (2)
mesa/visualization/components/altair_components.py (3)
tests/test_visualization_components.py (6)
agent_portrayal
(83-85)agent_portrayal
(258-259)agent_portrayal
(284-285)propertylayer_portrayal
(87-97)post_process
(129-130)post_process
(170-172)mesa/examples/basic/boltzmann_wealth_model/app.py (2)
agent_portrayal
(12-14)post_process
(36-37)tests/test_components_matplotlib.py (1)
agent_portrayal
(31-42)
tests/test_visualization_components.py (2)
mesa/discrete_space/property_layer.py (2)
PropertyLayer
(30-192)add_property_layer
(232-258)mesa/visualization/components/altair_components.py (2)
make_altair_space
(26-60)agent_portrayal
(49-50)
🪛 actionlint (1.7.4)
.github/workflows/viz-tests.yml
17-17: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
20-20: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
25-25: shellcheck reported issue in this script: SC2102:info:3:27: Ranges can only match single chars (mentioned due to duplicates)
(shellcheck)
43-43: the runner of "codecov/codecov-action@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
.github/workflows/ui-viz-tests.yml
11-11: the runner of "actions/checkout@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
14-14: the runner of "actions/setup-python@v4" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
19-19: shellcheck reported issue in this script: SC2102:info:2:27: Ranges can only match single chars (mentioned due to duplicates)
(shellcheck)
30-30: the runner of "actions/upload-artifact@v3" action is too old to run on GitHub Actions. update the action's version to fix this issue
(action)
🪛 YAMLlint (1.35.1)
.github/workflows/viz-tests.yml
[error] 48-48: no new line character at the end of file
(new-line-at-end-of-file)
.github/workflows/ui-viz-tests.yml
[error] 36-36: no new line character at the end of file
(new-line-at-end-of-file)
🪛 LanguageTool
tests/VISUALIZATION_TESTING.md
[uncategorized] ~15-~15: Loose punctuation mark.
Context: ...est Organization - test_solara_viz.py
: Tests for the SolaraViz component's cor...
(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~50-~50: Loose punctuation mark.
Context: ...nto the CI pipeline: 1. viz-tests.yml
: Runs browser-less tests on every PR and...
(UNLIKELY_OPENING_PUNCTUATION)
PR_DESCRIPTION.md
[uncategorized] ~78-~78: Use a comma before ‘but’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ap where basic model execution is tested but visualization functionality is not syst...
(COMMA_COMPOUND_SENTENCE_2)
🔇 Additional comments (16)
mesa/visualization/components/altair_components.py (1)
190-190
: Was this tooltip type change intentional?Changing from dynamically inferred types to a static "nominal" type for tooltips doesn't seem directly related to testing improvements.
This change was previously questioned by Corvince as not being related to testing. If this is an intentional improvement, please clarify its purpose in the PR description.
tests/ui/__init__.py (1)
1-2
: LGTM! Proper package initialization.Adding a docstring to the
__init__.py
file clearly communicates the purpose of this package.tests/ui/conftest.py (1)
8-11
: Well-implemented fixture for snapshot testing.The
solara_snapshots_directory
fixture provides a consistent location for storing and comparing UI test snapshots, which is essential for visual regression testing.pyproject.toml (1)
67-72
:✅ Verification successful
Good addition of visualization testing dependencies.
Adding a separate optional dependency group for visualization testing is a clean approach. This keeps these specialized testing dependencies separate from regular development dependencies and makes it clear what's needed specifically for visualization testing.
🏁 Script executed:
#!/bin/bash # Check that these packages can be installed together without conflicts pip install pytest-playwright pytest-ipywidgets[solara] mesa[viz] --dry-runLength of output: 16265
Visualization testing dependencies verified
The dry-run installation confirmed that
mesa[viz]
,pytest-playwright
, andpytest-ipywidgets[solara]
can be installed together without conflicts. The code snippet inpyproject.toml
(lines 67–72) is correct as is, and the approach to separate visualization testing dependencies is clean and effective..github/workflows/build_lint.yml (1)
57-57
: Effective test separation approachAdding
--ignore=tests/ui/
to the pytest command is a good strategy to separate browser-based tests from the regular build workflow, making the CI process more efficient and reliable.mesa/discrete_space/property_layer.py (2)
109-109
: Improved NumPy indexing approachReplacing
data[*[0 for _ in range(len(data.shape))]]
withdata[tuple(0 for _ in range(len(data.shape)))]
is a good change as it uses the more idiomatic tuple-based indexing method for NumPy arrays.
324-324
: Consistent NumPy indexing patternChanging from
mask[*indices]
tomask[tuple(indices)]
maintains consistency with the earlier indexing change and follows NumPy's recommended practices for array indexing.docs/VISUALIZATION_TESTING.md (1)
1-62
: Excellent Detailed Instructions
This documentation file provides a very clear and comprehensive overview of Mesa's visualization testing strategy. The step‐by‐step commands for setup and test execution (including installation of dependencies and usage of the helper script) are especially helpful for new contributors.tests/VISUALIZATION_TESTING.md (1)
1-99
: Comprehensive Testing Strategy Documentation
This file thoroughly outlines the testing strategy, the organization of tests, and commands for running them locally and via CI. It’s a valuable resource that aligns nicely with the new CI workflows and helper scripts.🧰 Tools
🪛 LanguageTool
[uncategorized] ~15-~15: Loose punctuation mark.
Context: ...est Organization -test_solara_viz.py
: Tests for the SolaraViz component's cor...(UNLIKELY_OPENING_PUNCTUATION)
[uncategorized] ~50-~50: Loose punctuation mark.
Context: ...nto the CI pipeline: 1.viz-tests.yml
: Runs browser-less tests on every PR and...(UNLIKELY_OPENING_PUNCTUATION)
PR_DESCRIPTION.md (1)
1-1
: Clarification on File Inclusion
There is a past review comment suggesting thatPR_DESCRIPTION.md
(and similar summary files) might not belong in the PR. If these documentation files are intentionally included to provide context and clarity, please disregard past comments.tests/ui/test_browser_viz.py (4)
1-6
: Well-structured docstring and purpose explanation.The docstring clearly explains the purpose of these browser-based tests and their relationship to the browser-less tests. This helps maintain good documentation standards.
12-32
: Use more specific selectors instead of generic "svg" selector.On line 25, using a generic "svg" selector is brittle and implementation-specific. If the visualization technology changes from SVG to canvas in the future, these tests will break.
Consider targeting a more specific container or element with a dedicated ID or class name instead. For example:
- page_session.locator("svg").wait_for() + page_session.locator("[data-testid='visualization-container'] svg").wait_for()
34-51
: Good approach for comparing visualization state changes.The technique of capturing screenshots before and after clicking the step button to verify state changes is excellent. This ensures that the visualization is actually updating in response to user actions.
However, the same concern about the generic "svg" selector applies to lines 43, 45, and 48.
75-97
: Good use of JavaScript evaluation for slider interaction.The approach to manipulate the slider using JavaScript evaluation is robust and ensures proper event triggering. This is a good practice for testing slider components.
tests/test_visualization_components.py (2)
100-139
: All matplotlib component tests are skipped - consider enabling at least some tests.While it's understandable that some tests need to be skipped for compatibility reasons, having all tests skipped means there's no actual test coverage for these components.
Consider enabling at least some basic tests, or provide more specific information about what compatibility issues need to be addressed to make these tests runnable.
Additionally, the assertions are very basic (just checking if widgets are not None). As noted in previous reviews, more specific assertions would be valuable.
#!/bin/bash # Check if there are any active (non-skipped) visualization tests in the codebase echo "Checking for active visualization tests..." active_tests=$(rg -n "def test_" --no-filename tests/test_visualization_components.py | grep -v "@pytest.mark.skip" | wc -l) echo "Number of active (non-skipped) visualization tests: $active_tests" echo "Checking skip reasons..." rg -A 1 "@pytest.mark.skip" --no-filename tests/test_visualization_components.py
181-247
:❓ Verification inconclusive
Clarify timeline for updating skipped tests to match new API.
All example model visualization tests are skipped due to API changes. The PR summary suggests these tests are important for ensuring visualization components work across all example models.
Please clarify:
- What specific API changes have occurred?
- When will these tests be updated to match the new API?
- Is there an issue tracker item to follow up on this?
It's important to have a plan to enable these tests to ensure proper coverage of visualization functionality.
🏁 Script executed:
#!/bin/bash # Investigate API changes mentioned in skip reasons echo "Looking for API change mentions in the codebase..." rg "API has changed" --include "*.py" echo "Checking for related issues in GitHub..." gh issue list --search "API changes visualization" --limit 5Length of output: 874
Clarify and Track Updates for Visualization API Changes
The tests in
tests/test_visualization_components.py
are marked as skipped due to changes in the model API, but it's unclear which API changes have impacted these tests and how they affect the visualization components. Please clarify the following:
- Specific API Changes: What are the specific modifications in the model API that necessitated skipping these tests?
- Timeline for Updates: When are you planning to update these tests to align with the new API?
- Issue Tracking: Is there an existing issue or ticket to track these changes? If not, could one be created to ensure that the visualization components are properly validated once updated?
PR_DESCRIPTION.md
Outdated
# Add Comprehensive Testing for Mesa's Visualization Components | ||
|
||
## Overview | ||
This PR adds a robust testing infrastructure for Mesa's visualization components (SolaraViz) to ensure they render correctly and function properly across all example models. | ||
|
||
## Key Additions | ||
|
||
### Browser-less Unit Tests (`test_visualization_components.py`) | ||
- Tests for both Matplotlib and Altair visualization backends | ||
- Component-level tests for space and plot visualizations | ||
- Cross-model tests to verify visualization functionality in all example models | ||
- Interactive testing of model controllers (step, play/pause, reset) | ||
- Performance benchmarks to measure rendering efficiency | ||
|
||
### Browser-based UI Tests (`tests/ui/test_browser_viz.py`) - Manual Execution Only | ||
- Integration tests for end-to-end visualization rendering | ||
- Visual regression testing via screenshot comparisons | ||
- Tests for parameter controls and interactive features | ||
- Compatibility with Solara's recommended testing approach | ||
|
||
**Note**: Browser-based tests require additional dependencies (`playwright`) and are configured to run only manually or on a weekly schedule, not on every PR. | ||
|
||
### CI Integration | ||
- Added `viz-tests.yml` for fast browser-less tests on all PRs | ||
- Added `ui-viz-tests.yml` for periodic browser-based testing (weekly + manual triggers) | ||
- Coverage reporting for visualization components | ||
|
||
### Developer Tools | ||
- Added `run_viz_tests.sh` helper script for local testing | ||
- Created `VISUALIZATION_TESTING.md` documentation | ||
- Updated `pyproject.toml` with testing dependencies | ||
|
||
## How It Works | ||
|
||
The testing approach follows Solara's recommended practices for testing without a browser: | ||
|
||
```python | ||
def test_example(): | ||
# 1. Set up test model | ||
model = Schelling(seed=42) | ||
|
||
# 2. Create and render visualization component | ||
component = make_altair_space(agent_portrayal) | ||
box, rc = solara.render(component(model), handle_error=False) | ||
|
||
# 3. Find and verify UI elements | ||
assert rc.find("div").widget is not None | ||
|
||
# 4. Test model interaction | ||
step_button = rc.find(v.Btn, children=["Step"]).widget | ||
step_button.click() | ||
assert model.schedule.steps > 0 | ||
``` | ||
|
||
This approach allows us to thoroughly test visualization components without requiring a browser in CI, while still providing browser-based tests for more comprehensive validation when needed. | ||
|
||
## Running Tests Locally | ||
|
||
```bash | ||
# Run browser-less tests | ||
./tests/run_viz_tests.sh | ||
|
||
# Run with browser-based UI tests (requires additional dependencies) | ||
pip install -e .[viz-test] | ||
playwright install chromium | ||
./tests/run_viz_tests.sh --ui | ||
|
||
# Run performance benchmarks | ||
./tests/run_viz_tests.sh --benchmarks | ||
``` | ||
|
||
## Benefits | ||
- Ensures visualization components render correctly across all examples | ||
- Provides performance metrics for visualization rendering | ||
- Prevents regressions in visualization functionality | ||
- Follows Solara's recommended testing practices | ||
|
||
These tests address the current gap where basic model execution is tested but visualization functionality is not systematically verified. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PR Description Clarity and Consistency
The PR description is detailed and clearly outlines the new tests, CI integrations, and documentation updates. One point to note: the description mentions an --update-snapshots
flag for the helper script, but the current implementation of run_viz_tests.sh
does not handle this flag. Please update either the script or the documentation to maintain consistency.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~78-~78: Use a comma before ‘but’ if it connects two independent clauses (unless they are closely connected and short).
Context: ...ap where basic model execution is tested but visualization functionality is not syst...
(COMMA_COMPOUND_SENTENCE_2)
class TestAltairComponents(TestBaseVisualizationComponents): | ||
"""Tests for Altair visualization components.""" | ||
|
||
@pytest.mark.skip( | ||
reason="Test needs updating for compatibility with local environment" | ||
) | ||
def test_altair_space_component(self, mock_model): | ||
"""Test that Altair space component renders correctly.""" | ||
component = make_altair_space(self.agent_portrayal) | ||
box, rc = solara.render(component(mock_model), handle_error=False) | ||
|
||
assert rc.find("div").widget is not None | ||
|
||
component_with_layers = make_altair_space( | ||
self.agent_portrayal, propertylayer_portrayal=self.propertylayer_portrayal() | ||
) | ||
box, rc = solara.render(component_with_layers(mock_model), handle_error=False) | ||
assert rc.find("div").widget is not None | ||
|
||
@pytest.mark.skip( | ||
reason="Test needs updating for compatibility with local environment" | ||
) | ||
def test_altair_plot_component(self, mock_model): | ||
"""Test that Altair plot component renders correctly.""" | ||
component = make_plot_component({"data1": "red", "data2": "blue"}) | ||
box, rc = solara.render(component(mock_model), handle_error=False) | ||
|
||
assert rc.find("div").widget is not None | ||
|
||
def post_process(chart): | ||
chart = chart.properties(title="Test Plot") | ||
return chart | ||
|
||
component_with_custom = make_plot_component( | ||
{"data1": "red"}, post_process=post_process | ||
) | ||
box, rc = solara.render(component_with_custom(mock_model), handle_error=False) | ||
assert rc.find("div").widget is not None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
❓ Verification inconclusive
Consider adding test coverage for error handling in Altair components.
All Altair component tests are skipped, but the PR summary mentions "improving error handling in Altair components." These tests should validate the improved error handling.
Add tests that specifically verify error handling, for example:
- Testing with invalid agent portrayal functions
- Testing with incompatible property layer data
- Testing with missing or malformed model data
@pytest.mark.skip(reason="Test needs updating for compatibility with local environment")
def test_altair_component_error_handling(self, mock_model):
"""Test that Altair component handles errors gracefully."""
# Test with invalid portrayal function
def invalid_portrayal(agent):
return "not_a_dict" # Should be a dict
component = make_altair_space(invalid_portrayal)
# Should not raise an exception, but handle the error gracefully
box, rc = solara.render(component(mock_model), handle_error=True)
# Verify error is displayed to user
assert rc.find("div").text.contains("Error")
Enhance Error Handling Test Coverage for Altair Components
The current tests in tests/test_visualization_components.py
only validate that a rendered div exists and are skipped, so they do not cover error handling in the Altair components as referenced in the PR summary. Please add tests that specifically validate error handling. For example, include tests that:
- Confirm the component gracefully handles an invalid portrayal function (e.g., one that returns an incorrect type).
- Verify behavior when provided with incompatible property layer data.
- Ensure appropriate error feedback when model data is missing or malformed.
An example test could be:
@pytest.mark.skip(reason="Test needs updating for compatibility with local environment")
def test_altair_component_error_handling(self, mock_model):
"""Test that Altair component handles errors gracefully."""
# Test with an invalid portrayal function
def invalid_portrayal(agent):
return "not_a_dict" # Expected to return a dict
component = make_altair_space(invalid_portrayal)
# Render with handle_error=True so that errors are caught and displayed gracefully
box, rc = solara.render(component(mock_model), handle_error=True)
# Verify that an error message is displayed to the user
assert "Error" in rc.find("div").text
Please add similar tests to cover all edge cases regarding error handling.
class TestBaseVisualizationComponents: | ||
"""Base test class for visualization components.""" | ||
|
||
@pytest.fixture | ||
def mock_model(self): | ||
"""Create a mock model with a grid.""" | ||
|
||
class MockAgent(Agent): | ||
def __init__(self, model, agent_type=0): | ||
super().__init__(model) | ||
self.type = agent_type | ||
|
||
class MockModel(Model): | ||
def __init__(self, width=10, height=10, seed=42): | ||
super().__init__(seed=seed) | ||
self.width = width | ||
self.height = height | ||
self.grid = MultiGrid(width, height, torus=True) | ||
|
||
# Create and add a property layer | ||
test_layer = PropertyLayer( | ||
name="test_layer", width=width, height=height, default_value=1.0 | ||
) | ||
self.grid.add_property_layer(test_layer) | ||
|
||
a1 = MockAgent(self, agent_type=0) | ||
a2 = MockAgent(self, agent_type=1) | ||
self.grid.place_agent(a1, (2, 2)) | ||
self.grid.place_agent(a2, (5, 5)) | ||
|
||
self.schedule = [] | ||
self.running = True | ||
self.time_series_data = {"data1": [1, 2, 3], "data2": [3, 2, 1]} | ||
|
||
# Add a mock datacollector | ||
from unittest.mock import MagicMock | ||
|
||
import pandas as pd | ||
|
||
self.datacollector = MagicMock() | ||
# Create a dataframe with time series data | ||
self.datacollector.get_model_vars_dataframe.return_value = pd.DataFrame( | ||
self.time_series_data | ||
) | ||
|
||
def step(self): | ||
"""Step the model.""" | ||
|
||
return MockModel() | ||
|
||
def agent_portrayal(self, agent): | ||
"""Basic agent portrayal for testing.""" | ||
return {"color": "red" if agent.type == 0 else "blue", "marker": "o", "size": 5} | ||
|
||
def propertylayer_portrayal(self): | ||
"""Basic property layer portrayal for testing.""" | ||
return { | ||
"test_layer": { | ||
"colormap": "viridis", | ||
"alpha": 0.75, | ||
"colorbar": True, | ||
"vmin": 0, | ||
"vmax": 10, | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well-structured base test class with comprehensive mock setup.
The TestBaseVisualizationComponents
class provides an excellent foundation for visualization testing with a well-designed mock model and portrayal methods.
However, there's a potential issue with how the property layer is initialized:
- test_layer = PropertyLayer(
- name="test_layer", width=width, height=height, default_value=1.0
- )
+ test_layer = PropertyLayer(
+ name="test_layer", dimensions=(width, height), default_value=1.0
+ )
Based on the PropertyLayer
definition in mesa/discrete_space/property_layer.py
, the constructor expects dimensions
as a parameter, not width
and height
separately.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
class TestBaseVisualizationComponents: | |
"""Base test class for visualization components.""" | |
@pytest.fixture | |
def mock_model(self): | |
"""Create a mock model with a grid.""" | |
class MockAgent(Agent): | |
def __init__(self, model, agent_type=0): | |
super().__init__(model) | |
self.type = agent_type | |
class MockModel(Model): | |
def __init__(self, width=10, height=10, seed=42): | |
super().__init__(seed=seed) | |
self.width = width | |
self.height = height | |
self.grid = MultiGrid(width, height, torus=True) | |
# Create and add a property layer | |
test_layer = PropertyLayer( | |
name="test_layer", width=width, height=height, default_value=1.0 | |
) | |
self.grid.add_property_layer(test_layer) | |
a1 = MockAgent(self, agent_type=0) | |
a2 = MockAgent(self, agent_type=1) | |
self.grid.place_agent(a1, (2, 2)) | |
self.grid.place_agent(a2, (5, 5)) | |
self.schedule = [] | |
self.running = True | |
self.time_series_data = {"data1": [1, 2, 3], "data2": [3, 2, 1]} | |
# Add a mock datacollector | |
from unittest.mock import MagicMock | |
import pandas as pd | |
self.datacollector = MagicMock() | |
# Create a dataframe with time series data | |
self.datacollector.get_model_vars_dataframe.return_value = pd.DataFrame( | |
self.time_series_data | |
) | |
def step(self): | |
"""Step the model.""" | |
return MockModel() | |
def agent_portrayal(self, agent): | |
"""Basic agent portrayal for testing.""" | |
return {"color": "red" if agent.type == 0 else "blue", "marker": "o", "size": 5} | |
def propertylayer_portrayal(self): | |
"""Basic property layer portrayal for testing.""" | |
return { | |
"test_layer": { | |
"colormap": "viridis", | |
"alpha": 0.75, | |
"colorbar": True, | |
"vmin": 0, | |
"vmax": 10, | |
} | |
} | |
class TestBaseVisualizationComponents: | |
"""Base test class for visualization components.""" | |
@pytest.fixture | |
def mock_model(self): | |
"""Create a mock model with a grid.""" | |
class MockAgent(Agent): | |
def __init__(self, model, agent_type=0): | |
super().__init__(model) | |
self.type = agent_type | |
class MockModel(Model): | |
def __init__(self, width=10, height=10, seed=42): | |
super().__init__(seed=seed) | |
self.width = width | |
self.height = height | |
self.grid = MultiGrid(width, height, torus=True) | |
# Create and add a property layer | |
test_layer = PropertyLayer( | |
name="test_layer", dimensions=(width, height), default_value=1.0 | |
) | |
self.grid.add_property_layer(test_layer) | |
a1 = MockAgent(self, agent_type=0) | |
a2 = MockAgent(self, agent_type=1) | |
self.grid.place_agent(a1, (2, 2)) | |
self.grid.place_agent(a2, (5, 5)) | |
self.schedule = [] | |
self.running = True | |
self.time_series_data = {"data1": [1, 2, 3], "data2": [3, 2, 1]} | |
# Add a mock datacollector | |
from unittest.mock import MagicMock | |
import pandas as pd | |
self.datacollector = MagicMock() | |
# Create a dataframe with time series data | |
self.datacollector.get_model_vars_dataframe.return_value = pd.DataFrame( | |
self.time_series_data | |
) | |
def step(self): | |
"""Step the model.""" | |
return MockModel() | |
def agent_portrayal(self, agent): | |
"""Basic agent portrayal for testing.""" | |
return {"color": "red" if agent.type == 0 else "blue", "marker": "o", "size": 5} | |
def propertylayer_portrayal(self): | |
"""Basic property layer portrayal for testing.""" | |
return { | |
"test_layer": { | |
"colormap": "viridis", | |
"alpha": 0.75, | |
"colorbar": True, | |
"vmin": 0, | |
"vmax": 10, | |
} | |
} |
724159a
to
e8a4727
Compare
for more information, see https://pre-commit.ci
Thanks for your detailed review, @Corvince! I have fixed the minor issues you mentioned, and all tests are now passing without being skipped. |
This PR adds a testing infrastructure for Mesa's visualization components (SolaraViz) to ensure they render correctly and function properly across all example models.
Browser-less Unit Tests (
test_visualization_components.py
)Browser-based UI Tests (
tests/ui/test_browser_viz.py
)Additional Compatibility Improvements
CI Integration
viz-tests.yml
for fast browser-less tests on all PRsui-viz-tests.yml
for periodic browser-based testing (weekly + on visualization changes)Developer Tools
run_viz_tests.sh
helper script for local testingVISUALIZATION_TESTING.md
documentationpyproject.toml
with testing dependenciesHow It Works
The testing approach follows Solara's recommended practices for testing without a browser:
This approach allows us to thoroughly test visualization components without requiring a browser in CI, while still providing browser-based tests for more comprehensive validation when needed.
Running Tests Locally
These tests address the current gap where basic model execution is tested but visualization functionality is not systematically verified.
Summary by CodeRabbit
New Features
Tests
Documentation