173 - duplicates
Created by: ri-pandey
Description
Allow duplicate datasets to be ingested into Bioloop, and allow authorized users to accept or reject them.
Related Issue(s)
Closes #173
Changes Made
List the main changes made in this PR. Be as specific as possible.
- Feature added
- Bug fixed
- Code refactored
- Documentation updated
Checklist
Before submitting this PR, please make sure that:
- Your code passes linting and coding style checks.
- Documentation has been updated to reflect the changes.
- You have reviewed your own code and resolved any merge conflicts.
- You have requested a review from at least one team member.
- Any relevant issue(s) have been linked to this PR.
Additional Information
Documentation of the process: https://github.com/IUSCA/bioloop/blob/173-duplicates-new/docs/dataset_duplication.md
Summary of features:
- Bioloop can now register duplicate datasets for a given dataset.
- Multiple duplicates of a dataset can coexist in the system, with version numbers assigned to concurrent duplicates (this feature is currently disabled).
- Alerts are shown about the state of a dataset when it has a duplicate or is itself a duplicate. These alerts appear in:
  - the project dataset modal
  - the project dataset table
  - the dataset page
  - the file browser
- Operators and admins see notifications that make them aware of the duplication, along with accept/reject buttons inside the alerts. Regular users only see the alerts.
- Both the API and the worker layers are involved in accepting or rejecting a duplicate. For a detailed explanation of the steps, see the linked documentation.
- Checks have been added at the API layer so that certain operations are forbidden for duplicate datasets (such as adding them to a project); a rough sketch of such a guard is included at the end of this summary. The API/UI code has also been updated to omit duplicate datasets from the existing UI controls that list datasets (such as the Project-dataset search).
- After acceptance, the duplicate dataset replaces the original dataset at both the database level and the filesystem level. The original dataset is soft-deleted in the database.
- After rejection, the duplicate dataset is soft-deleted in the database, and its filesystem resources are purged.
- Acceptance and rejection are irreversible operations. Users see a confirmation modal before accepting or rejecting a dataset. A sketch of both flows is included at the end of this summary.
- After a file download is initiated, the dataset is re-fetched (without a page refresh), so the user can be made aware that a duplicate of the dataset they are downloading is currently being integrated by the system (a small re-fetch sketch is included at the end of this summary).
- Duplicate datasets can be seen in the `/duplicateDatasets` view.
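
As a rough illustration of the API-layer guard described above, the sketch below shows how a request that would associate a duplicate dataset with a project could be rejected. This is a minimal sketch only, written as an Express-style TypeScript route; the route path, the `is_duplicate` flag, and the in-memory dataset store are assumptions for illustration and do not reflect the actual Bioloop code.

```typescript
// Minimal illustrative sketch, not the actual Bioloop implementation.
// Assumes an Express-style API; the route path, the `is_duplicate` flag,
// and the in-memory dataset store below are hypothetical.
import express, { NextFunction, Request, Response } from "express";

const app = express();

// In-memory stand-in for the datasets table, for illustration only.
const datasets = new Map<number, { id: number; is_duplicate: boolean }>([
  [1, { id: 1, is_duplicate: false }],
  [2, { id: 2, is_duplicate: true }],
]);

// Guard: forbid certain operations (e.g. adding a dataset to a project)
// when the target dataset is a duplicate.
function rejectDuplicateDatasets(req: Request, res: Response, next: NextFunction) {
  const dataset = datasets.get(Number(req.params.datasetId));
  if (!dataset) {
    res.status(404).json({ error: "dataset not found" });
    return;
  }
  if (dataset.is_duplicate) {
    res.status(400).json({ error: "operation not allowed on a duplicate dataset" });
    return;
  }
  next();
}

// Example: associating a dataset with a project passes through the guard first.
app.post("/projects/:projectId/datasets/:datasetId", rejectDuplicateDatasets, (_req, res) => {
  res.json({ associated: true });
});

app.listen(3000);
```

A similar check could back the UI-side behaviour of omitting duplicate datasets from dataset listings and search results.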
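
The accept/reject semantics listed above (replace-and-soft-delete vs. soft-delete-and-purge) are summarized in the sketch below. All names here (`acceptDuplicate`, `rejectDuplicate`, and the side-effect helpers) are hypothetical stand-ins; the real flow spans both the API and the worker layers, as described in the linked documentation.

```typescript
// Hypothetical sketch of the two irreversible outcomes; names and shapes are
// illustrative stand-ins, not Bioloop's actual API or worker code.
interface DatasetRecord {
  id: number;
  path: string;        // filesystem location of the dataset
  isDeleted: boolean;  // soft-delete flag in the database
}

// Stand-ins for the database and filesystem side effects performed by the workers.
const softDelete = (d: DatasetRecord): void => { d.isDeleted = true; };
const replaceOnFilesystem = (original: DatasetRecord, duplicate: DatasetRecord): void => {
  console.log(`replacing files at ${original.path} with files from ${duplicate.path}`);
};
const purgeFilesystemResources = (d: DatasetRecord): void => {
  console.log(`purging filesystem resources at ${d.path}`);
};

// Acceptance: the duplicate replaces the original; the original is soft-deleted.
function acceptDuplicate(original: DatasetRecord, duplicate: DatasetRecord): void {
  replaceOnFilesystem(original, duplicate);
  softDelete(original);
}

// Rejection: the duplicate is soft-deleted and its filesystem resources are purged.
function rejectDuplicate(duplicate: DatasetRecord): void {
  softDelete(duplicate);
  purgeFilesystemResources(duplicate);
}
```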
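
Finally, the re-fetch-after-download behaviour could look roughly like the snippet below. The endpoint and the field used to detect an in-progress duplication are assumptions for illustration, not the actual Bioloop API.

```typescript
// Hypothetical sketch: re-fetch the dataset after a download is initiated so the
// UI can show an alert if a duplicate is currently being integrated.
// The endpoint and the `has_active_duplicate` field are assumptions.
async function onDownloadInitiated(datasetId: number): Promise<void> {
  const response = await fetch(`/api/datasets/${datasetId}`);
  const dataset = await response.json();
  if (dataset.has_active_duplicate) {
    // The real UI would render an alert component instead of logging.
    console.warn("A duplicate of this dataset is currently being integrated.");
  }
}
```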