-
Notifications
You must be signed in to change notification settings - Fork 344
[ENG-7839] [ENG-7766] Add celery task to update Datacite metadata + Use new GV API endpoint to retrieve node's verified links #11117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🌟
website/identifiers/tasks.py
Outdated
sentry.log_message('Updating DOI with verified links for guid', | ||
extra_data={'guid': target_guid}, | ||
level=logging.INFO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, we don't need to log to sentry about the update. We can simply do info
or debug
logging.
website/identifiers/tasks.py
Outdated
sentry.log_message('DOI metadata with verified links updated for guid', | ||
extra_data={'guid': target_guid}, | ||
level=logging.INFO) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, use logging instead of sentry.
website/identifiers/tasks.py
Outdated
except Exception as exc: | ||
sentry.log_message('Failed to update DOI metadata with verified links', | ||
extra_data={'guid': target_guid, 'error': str(exc)}, | ||
level=logging.ERROR) | ||
raise self.retry(exc=exc) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We now have GVException
with we can raise and except, for example, see: https://github.com/CenterForOpenScience/osf.io/pull/11113/files#diff-d288f5a6729b0d85cc2d5c82086b82324edbb94b369fac5f719e9fb01dbfc726R103-R105
if isinstance(metadata_record_xml, bytes): | ||
metadata_record_xml = metadata_record_xml.decode('utf-8') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, what is the issue that has led you to do this extra check? Maybe our changes caused the build_metadata
to behave differently?
And if we have to do this, I think we have some helpers called ensure_str
and ensure_bytes
that we can use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added this for local testing and forgot to remove it.
def get_links(node_guid, requesting_user=None): | ||
return iterate_gv_results( | ||
ADDONS_ENDPOINT.format(addon_type=AddonType.LINK) + f'/{node_guid}/verified-links', | ||
requesting_user=requesting_user, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we already have this implemented (which was newly merged yesterday morning, see https://github.com/CenterForOpenScience/osf.io/pull/11113/files#diff-8d04c47c31d3869aca32c85256dc4dc08e69fd836971299d95b8386503348668), please take a look and I think the new get_verified_links()
should provide what you need.
from osf.models import AbstractNode | ||
|
||
if isinstance(osf_item, AbstractNode): | ||
gv_verified_link_list = osf_item.get_links() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, use get_verified_links()
for item in gv_verified_link_list: | ||
verified_link, resource_type = item.get('target_url', None), item.get('resource_type', None) | ||
if verified_link and resource_type: | ||
if verified_link and isinstance(verified_link, str) and smells_like_iri(verified_link): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
smells_like_iri()
already checks for str
type, so if verified_link and smells_like_iri(verified_link)
should be good.
'resourceTypeGeneral': resource_type.title() | ||
}) | ||
else: | ||
logger.warning('skipping non-URL verified link "%s"', verified_link) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be more useful if logged to sentry. And add which node it is.
It is probably nicer if we keep a list of skipped ones and send only one sentry message after the for loop.
osf/models/node.py
Outdated
@@ -2351,6 +2355,9 @@ def is_fork_of(self, other): | |||
def is_registration_of(self, other): | |||
return self.is_derived_from(other, 'registered_from') | |||
|
|||
def get_links(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, we should already have this after recent merge.
from website.identifiers.tasks import update_doi_metadata_with_verified_links | ||
if self.get_identifier('doi') and bool(self.IDENTIFIER_UPDATE_FIELDS.intersection(set(saved_fields))): | ||
update_doi_metadata_with_verified_links(self._id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wonder if we have to do this? Is it because the identifier update is never automatically?
I think self.update_or_enqueue_on_node_updated(user_id, first_save, saved_fields)
should have done this? Please double check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added it because, as I remembered, you asked me to do it in the previous PR for metadata updating
…nScience#11098) ## Purpose When user has a deleted node where he is the only admin, osf admin can't gdpr delete this user because of this error ## Changes Ignore this message for deleted nodes. Added a test ## Ticket https://openscience.atlassian.net/browse/ENG-7557
…Admin App (CenterForOpenScience#11105) ## Purpose Implement Force Archive for admin UI based on `osf/management/commands/force_archive.py` ## Changes BE: 1. remove globals from force archive management command to make it thread-safe (usable in views) 2. Implement CheckArchiveStatusRegistrationsView and ForceArchiveRegistrationsView 3. RestartStuckRegistrationsView is replaced with ForceArchiveRegistrationsView FE: 1. Added Check archive status button 2. Added Force Archive button 3. Updated Restart Stuck Registration button (now it'll open the same modal as Force Archive to choose parameters but dry-mode is not available) ## QA Notes - Check archive status is displayed if node is registration - Force Archive button is displayed if registration is not archived and not stuck - Restart/Remove Stuck Registration are displayed if registration is not archived and stuck ## Ticket https://openscience.atlassian.net/browse/ENG-7270
## Purpose `Preprint` inherits from `VersionedGuidMixin` that inherits from `GuidMixin` thus `coerce_guid` tries to fetch guid from `guids` attribute instead of `versioned_guids` ## Changes Added if statement ## Ticket https://openscience.atlassian.net/browse/ENG-7798
## Purpose Fix bug with citations for a new preprint version. The problem was in the process of creating a version of the preprint, so also added the manage command to fix already created versions ## Changes Fix bug with citations for a new preprint version ## Ticket https://openscience.atlassian.net/browse/ENG-7263
double-quote-string-fixer
## Purpose Fix preprint updating ## Changes Add a manage command to fix why_no_data and has_data_links for broken preprints due to an unknown bug on FE (Futa described it in the comments to the ticket) ## Ticket https://openscience.atlassian.net/browse/ENG-7503
clean up code we don't need anymore (and avoid error messages when CenterForOpenScience/SHARE#859 ) contains CenterForOpenScience#11012
…h date uploaded) via the admin app (CenterForOpenScience#11097) ## Purpose Add "Create new version 1" button on admin UI ## Changes - Add button to admin UI - Extend existing preprint versioning functionality - Move permission validation logic from serializer to model level to keep everything in one place - add required_permission decorator (with ignore_permission functionality) that checks if user has enough permission to call method ## QA Notes Expected behavior: 1. Admin goes the preprint admin page 2. Clicks "Create new version 1", a modal asking to choose date will be displayed 3. If date is valid and file versions exists: - a new version 1 will be created - admin user will be redirected to v1 - optionally, admin user can make new preprint published by clicking "Make published" button (!) Preprint guid should point to the latest version, not the newly created one. (!) Admin should be able to create new versions of a preprint even if they are not a contributor. ## Ticket https://openscience.atlassian.net/browse/ENG-7716
<!-- Before submit your Pull Request, make sure you picked the right branch: - For hotfixes, select "master" as the target branch - For new features, select "develop" as the target branch - For release feature fixes, select the relevant release branch (release/X.Y.Z) as the target branch --> ## Purpose fix for non-contributor admin ## Changes - add ignore_permission to set_license - prevent creating new preprint version when serializer raises error https://openscience.atlassian.net/browse/ENG-7716
## Purpose Fix the errors in the management command ## Changes Remove useless parameter ## Ticket https://openscience.atlassian.net/browse/ENG-7263
…h date uploaded) via the admin app (CenterForOpenScience#11118) ## Purpose improve exception handling and minor fixes ## Changes - improve exception handling - supplemental node permission error - set subjects from relationship to keep valid hierarchy - ignore unfinished/unpublished versions ## Ticket https://openscience.atlassian.net/browse/ENG-7716
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this PR now has develop
code which hasn't been merged into feature/verified-resource-linking
. It's very likely you rebased from or merged the latest develop
. Please fix the commit.
I suggest you create a new PR from our feature branch and cherry pick your own commits. Otherwise, we may run into a case where your PR will reverse develop
stuff you merged when the feature PR is merged into develop
.
e.g. Here is current status
- current feature branch: F->D1
- current develop: D2->D1
- your current feature branch: A->Modified D2->F->D1
The following two will have the same diff in GitHub:
- What we need: expected feature branch: A->F->D1
- What we should avoid: A->Modified D2 Removed->Modified D2->F->D1
Purpose
Changes
See diff
QA Notes
N/A
Documentation
N/A
Side Effects
N/A
Ticket
https://openscience.atlassian.net/browse/ENG-7839
https://openscience.atlassian.net/browse/ENG-7766