
Distributed regridding v2 - source data on distributed space #1175



Open · wants to merge 1 commit into main

Conversation

juliasloan25 (Member)

This is the second PR of ClimaCoupler SDI #188, the first being #1107.

Major goals to accomplish in this PR

  • Eliminate the use of serial spaces in distributed remapping functions.
  • Use MPI to share information between processes, rather than constructing objects on serial spaces and broadcasting them as before.

Specific changes to be implemented in this PR

  • Generate source data on the distributed source space only. Send all source data to all processes using DSS.
    • This is not the ideal implementation, since we send more information than necessary. It will be improved in the future, but it is a good next step at this point.
  • Add a source_global_elem_lidx field to LinearMap. Use it in remap! to perform the matrix multiplication only over local indices of the source data (see the LinearMap sketch after this list). Since at this point all processes have all the source data, this approach includes redundant multiplication; however, it will be useful once we use the super-halo exchange to send only the necessary source data.
  • Create two methods for the generate_map function: one for the serial case and one for the distributed case (see the dispatch sketch after this list).
  • In the distributed case of generate_map, use only the distributed source and target spaces (i.e. no serial spaces).
    • Extend Spaces.unique_nodes to correctly return the number of unique nodes in a distributed space.
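
As a concrete illustration of the source_global_elem_lidx idea, here is a minimal sketch of how a global-to-local index lookup could drive the weight application in remap!. The field layout, loop structure, and element types below are assumptions for illustration only; they do not reflect ClimaCore's actual LinearMap definition.

```julia
# Sketch only: hypothetical LinearMap layout with a global-to-local index map.
struct LinearMap{W <: AbstractVector}
    weights::W                  # remapping weight for each (target, source) pair
    target_idxs::Vector{Int}    # target node index for each weight
    source_idxs::Vector{Int}    # global source node index for each weight
    source_global_elem_lidx::Dict{Int, Int}  # global source index -> local index
end

function remap!(target::AbstractVector, R::LinearMap, source_local::AbstractVector)
    fill!(target, zero(eltype(target)))
    for k in eachindex(R.weights)
        # Only apply weights whose source node is present locally; once the
        # super-halo exchange sends only the necessary data, this lookup
        # naturally skips everything the process does not hold.
        lidx = get(R.source_global_elem_lidx, R.source_idxs[k], nothing)
        lidx === nothing && continue
        target[R.target_idxs[k]] += R.weights[k] * source_local[lidx]
    end
    return target
end
```

With the current approach (all source data on every process), every global index appears in source_global_elem_lidx, so the multiplication is redundant across processes but still correct.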
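
The split into serial and distributed generate_map methods could be expressed with multiple dispatch on the communications context, roughly as sketched below. The ClimaComms context types are real, but the accessor used to obtain the context from a space and the method bodies are placeholders, not the planned implementation.

```julia
import ClimaComms

# Serial case: both spaces are global, so weights can be built directly.
function generate_map(::ClimaComms.SingletonCommsContext, target_space, source_space)
    # ... existing serial weight generation ...
end

# Distributed case: use only the distributed spaces; any cross-process
# information is obtained via MPI rather than via serial spaces.
function generate_map(ctx::ClimaComms.MPICommsContext, target_space, source_space)
    # ... build weights from local elements plus MPI-shared metadata ...
end

# Entry point dispatches on the context of the source space (accessor assumed).
generate_map(target_space, source_space) =
    generate_map(ClimaComms.context(source_space), target_space, source_space)
```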

The most involved component of this PR will be extending Spaces.unique_nodes to handle a distributed space as input. This requires developing an algorithm to count the unique nodes across processes and then implementing it. It may be best done as a separate PR to keep the scope of each PR small.
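
One possible counting strategy, sketched below under the assumption that each shared node has a single owning process (e.g. as determined by DSS ownership information): each process counts the unique nodes it owns, and the per-process counts are summed with an all-reduce. The helper and its arguments are hypothetical; this is not ClimaCore's Spaces.unique_nodes.

```julia
import MPI

# Count unique nodes of a distributed space by summing, over all processes,
# the number of unique nodes each process owns. `local_node_ids` are unique
# identifiers (e.g. global node IDs) for the local nodes, and `is_owned[i]`
# marks whether this process owns node i, so nodes shared on inter-process
# boundaries are counted exactly once.
function distributed_unique_node_count(comm::MPI.Comm,
                                       local_node_ids::AbstractVector,
                                       is_owned::AbstractVector{Bool})
    owned_unique = Set(id for (id, own) in zip(local_node_ids, is_owned) if own)
    return MPI.Allreduce(length(owned_unique), +, comm)
end
```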

The other non-trivial component of this PR is sending the source data to all processes. We should be able to reuse the existing DSS code, but we may need to add some data structures and functions to our code to use it. We should develop this part with the super-halo implementation in mind, so that any relevant infrastructure can easily be extended to that case.
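
For orientation, the data movement in this step amounts to every process ending up with the full source array. A generic MPI all-gather showing that movement is sketched below; since the PR intends to reuse ClimaCore's DSS machinery for the exchange, this is only an illustration of the communication pattern, not the planned implementation.

```julia
import MPI

# Gather each process's local chunk of source data onto every process.
# This mimics the "all source data on all processes" state described above;
# the real exchange is expected to go through the existing DSS code instead.
function gather_source_everywhere(comm::MPI.Comm, local_source::Vector{Float64})
    counts = MPI.Allgather(Int32[length(local_source)], comm)  # chunk sizes per rank
    global_source = Vector{Float64}(undef, sum(counts))
    MPI.Allgatherv!(local_source, MPI.VBuffer(global_source, counts), comm)
    return global_source
end
```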

QA

  • Code follows the style guidelines OR N/A.
  • Unit tests are included OR N/A.
  • Code is exercised in an integration test OR N/A.
  • Documentation has been added/updated OR N/A.
