Skip to content

[ENG-7759] Add new task to update node metadata with verified_links #11078

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: feature/verified-resource-linking
Choose a base branch
from

Conversation

Vlad0n20
Copy link
Contributor

@Vlad0n20 Vlad0n20 commented Apr 7, 2025

Purpose

Add new task to update node metadata with verified_links

Changes

QA Notes

Please make verification statements inspired by your code and what your code touches.

  • Verify
  • Verify

What are the areas of risk?

Any concerns/considerations/questions that development raised?

Documentation

Side Effects

Ticket

https://openscience.atlassian.net/browse/ENG-7759?atlOrigin=eyJpIjoiMDQ3ZTEzYjU3ODU4NDVlZThiYmRiNWViMmQ3MzIzOWYiLCJwIjoiaiJ9

Copy link
Collaborator

@cslzchen cslzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work overall. Glad we revealed some details that I hadn't thought of before. In addition to my comments, let's see if we can add unit tests to make sure the metadata is built correctly as we expected.

Copy link
Collaborator

@cslzchen cslzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Partial review, don't respond yet:

  • I picked the model update part and created a new [ENG-7759] Part 1: Add addon_verified_resource_links to node #11099
  • I don't think we need the links in the node/registration serializer. Let me know if you ran into anything I am not aware of which requires us to update serializer.
  • Task should receive links/types and save the data to the node
  • If possible, the save should also trigger automatic node identifier update
  • Finally a nitpicking suggestion on condition and loop statements

@@ -387,6 +388,8 @@ class AbstractNode(DirtyFieldsMixin, TypedModel, AddonModelMixin, IdentifierMixi

schema_responses = GenericRelation('osf.SchemaResponse', related_query_name='nodes')

verified_resource_links = DateTimeAwareJSONField(null=True, blank=True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed and updated options in my PR

addon_verified_resource_links = DateTimeAwareJSONField(default=dict, blank=True)

@@ -17,3 +17,29 @@ def task__update_doi_metadata_on_change(self, target_guid):
@celery_app.task(ignore_results=True)
def update_doi_metadata_on_change(target_guid):
task__update_doi_metadata_on_change(target_guid)

@celery_app.task(bind=True, max_retries=5, acks_late=True)
def task__update_doi_metadata_with_verified_links(self, target_guid):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The task misses one important data: which is the links and types sent from GV. The task expects them to come in as arguments and the task should update the node. Otherwise, identifier update and Datacite tree-walk will see an empty dict.

In addition, I am wondering if we can make identifier update automatic when this field is saved/updated on node.

verified_links = getattr(osf_item, 'verified_resource_links', None)
if verified_links:
for link, resource_type in verified_links.items():
if link and isinstance(link, str) and smells_like_iri(link):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this check is here to make sure link looks like a link, which is good.

  • Is smells_like_iri() precise enough? (i.e. not too loose or too strict)
  • If so, it already checks the str type so we can skip it.

'resourceTypeGeneral': resource_type
})
else:
logger.warning('skipping non-URL verified link "%s"', link)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turn this into a sentry message. In addition, let's make it Pythonic:

if not verified_links:
    return
for ...:
    if <false condition>:
        # log sentry
        continue
    self.visit(...)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants