Possible Duplicates

Possible duplicates are pairs of study records where the Medusa algorithm detected meaningful similarity but could not classify the match with high confidence. These pairs sit in a middle ground - they may be duplicates, but the similarity is not strong enough to act on automatically. Each pair requires a human decision.
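The middle ground described above can be pictured as a score band between two cut-offs. The sketch below is illustrative only: the threshold values, the `classify` function, and the `MatchBand` names are assumptions made for this example and are not taken from the Medusa algorithm itself.

```python
from enum import Enum


class MatchBand(Enum):
    TRUE_DUPLICATE = "true duplicate"
    POSSIBLE_DUPLICATE = "possible duplicate"
    NOT_DUPLICATE = "not a duplicate"


# Hypothetical cut-offs chosen for illustration; the real values are not documented here.
HIGH_CONFIDENCE = 0.90
MEANINGFUL_SIMILARITY = 0.60


def classify(score: float) -> MatchBand:
    """Place a similarity score (0.0 to 1.0) into one of three bands."""
    if score >= HIGH_CONFIDENCE:
        return MatchBand.TRUE_DUPLICATE
    if score >= MEANINGFUL_SIMILARITY:
        # Similar enough to flag, not similar enough to act on automatically.
        return MatchBand.POSSIBLE_DUPLICATE
    return MatchBand.NOT_DUPLICATE
```

Under these assumed cut-offs, a pair scoring 0.75 would land in the possible duplicates queue and wait for a human decision.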

Why possible duplicates need review

Possible duplicates arise in situations such as:

Two studies with the same first author and a similar topic but different years - possibly a re-analysis or follow-up, not a duplicate
Studies from the same trial reported in different journals with partial title overlap
Records where one entry is missing the abstract, reducing the similarity signal
Similar-sounding titles in a narrow subject area that are actually distinct publications

Because the system cannot distinguish these cases reliably, you need to inspect each pair and make the call.

Opening the possible duplicates list

From the Deduplication section, click the Possible Duplicates tab. Pairs are displayed in order of decreasing similarity score - the most likely duplicates appear first.

Screenshot needed

Possible duplicates list showing pairs ordered by similarity score with scores displayed next to each pair

Reviewing a pair

Click a pair to expand it. You see both records side by side with their full metadata. Pay particular attention to:

DOI or PubMed ID - if both records have the same DOI or PMID, they are almost certainly the same publication
Year and volume/issue - matching publication details strongly suggest a duplicate
Abstract text - read both abstracts; if they describe the same study population and methods, the records are most likely duplicates

Screenshot needed

Expanded possible duplicate pair showing both records with DOI, year, and abstract highlighted for comparison
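If you export records and want to pre-check the identifier comparison described above, it can be sketched as follows. This is a minimal example under stated assumptions: the `Record` fields and the `shared_identifier` and `normalize_doi` helpers are hypothetical names for illustration, not part of the application. DOIs are compared case-insensitively and with common URL prefixes removed, since the same DOI is often written in several forms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Record:
    # Hypothetical record shape for illustration only.
    title: str
    year: Optional[int] = None
    doi: Optional[str] = None
    pmid: Optional[str] = None


DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi: Optional[str]) -> Optional[str]:
    """Lowercase a DOI and strip common prefixes so equivalent forms compare equal."""
    if not doi:
        return None
    cleaned = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if cleaned.startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    return cleaned.strip() or None


def shared_identifier(a: Record, b: Record) -> Optional[str]:
    """Return "doi" or "pmid" if both records carry the same identifier, else None."""
    doi_a, doi_b = normalize_doi(a.doi), normalize_doi(b.doi)
    if doi_a and doi_a == doi_b:
        return "doi"
    if a.pmid and b.pmid and a.pmid.strip() == b.pmid.strip():
        return "pmid"
    # No shared identifier: fall back to comparing year, volume/issue, and abstracts by eye.
    return None
```

A `None` result does not mean the records differ; it only means the identifiers cannot settle the question, for example when one record has no DOI at all.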

Marking as a true duplicate

If you determine the two records are the same study, click Mark as Duplicate. The pair is then handled the same way as a true duplicate: one record is deactivated and excluded from screening.

Discarding the pair

If you determine the two records are different studies, click Not a Duplicate. The pair is moved to the Discarded tab and both records remain in your active project set for screening.

Tip

When in doubt, search for the DOI or title in a separate browser tab to check whether the records are genuinely the same publication. It takes only a moment and helps prevent a false removal from your project.

Skipping a pair

If you are uncertain and want to return to a pair later, use Skip to move it to the bottom of the list without making a decision. Skipped pairs remain in the possible duplicates queue.
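The three actions (Mark as Duplicate, Not a Duplicate, Skip) can be modelled as operations on an ordered queue. The sketch below is a simplified mental model, not the application's implementation; the `ReviewQueue` class and its method names are hypothetical. Pairs are assumed to be `(pair_id, score)` tuples.

```python
from collections import deque


class ReviewQueue:
    """Toy model of the possible duplicates list (hypothetical, for illustration)."""

    def __init__(self, pairs):
        # Highest similarity score first, matching the order shown in the list.
        self.pending = deque(sorted(pairs, key=lambda p: p[1], reverse=True))
        self.confirmed = []   # pairs marked as duplicates
        self.discarded = []   # pairs marked as not duplicates

    def current(self):
        """Return the pair at the top of the list, or None if the queue is empty."""
        return self.pending[0] if self.pending else None

    def mark_duplicate(self):
        self.confirmed.append(self.pending.popleft())

    def not_duplicate(self):
        self.discarded.append(self.pending.popleft())

    def skip(self):
        # Move the top pair to the bottom; it stays in the queue undecided.
        self.pending.rotate(-1)

    def remaining(self):
        """Number of pairs still awaiting a decision."""
        return len(self.pending)
```

In this model, skipping never changes the remaining count; only a Mark as Duplicate or Not a Duplicate decision removes a pair from the queue.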

Tracking progress

The Possible Duplicates tab shows a count of pairs still awaiting a decision; the queue is clear when the count reaches zero. You do not need to resolve every possible duplicate before starting screening, because deduplication can continue in parallel with early screening work, but clearing the queue first is recommended for a cleaner workflow.

Info

Any records still marked as possible duplicates when screening begins will appear in screening as normal active records. They will not be removed unless you confirm them as duplicates.