Ticket #220 (closed defect: fixed)

Opened 2 years ago

Last modified 1 year ago

Search is returning more than one title per article

Reported by: ebrown
Assigned to: ronald
Priority: high
Milestone:
Component: topaz
Version: 0.5-SNAPSHOT
Keywords: search
Cc:
Blocking:
Blocked By:

Description (Last modified by ebrown)

The actual article causing the problem is pone.0000008.zip. These are the results Viru got from fedoragsearch:

<?xml version="1.0" encoding="UTF-8"?>
<lucenesearch ...>
  <hit no="1" score="0.036502987">
    <field name="PID">doi:10.1371%2Fjournal.pone.0000008</field>
    <field name="property.type">FedoraObject</field>
    <field name="property.state">Active</field>
    <field name="property.createdDate">2006-11-22T23:06:02.921Z</field>
    <field name="property.lastModifiedDate">2006-11-22T23:06:10.406Z</field>
    <field name="property.contentModel">PlosArticle</field>
    <field name="dc.title">Molecular Adaptation during Adaptive Radiation in  ... </field>
    <field name="dc.title">Background</field>
    <field name="dc.title">Results</field>
    <field name="dc.title">Significance</field>
    <field name="dc.creator">Maxim V. Kapralov</field>
...

Change History

11/27/06 12:07:17 changed by ebrown

  • owner changed from ebrown to ronald.
  • description changed.

This appears to be an ingest problem.

This is what appears in the FOXML file:

<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Molecular Adaptation during Adaptive Radiation in the Hawaiian Endemic Genus Schiedea</dc:title>
  <dc:title>Background</dc:title>
  <dc:title>Results</dc:title>
  <dc:title>Significance</dc:title>
  <dc:creator>Maxim V. Kapralov</dc:creator>
...

See also #225
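
A quick way to confirm the symptom is to count the dc:title elements in the DC datastream. The following is a minimal sketch in Python, not part of the topaz code base: the helper name dc_titles is hypothetical, and SAMPLE contains only the lines quoted above, with the record closed by hand where the ticket elides the rest.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Only the lines quoted in this ticket; the closing tag is added here
# because the original record is truncated ("...") in the report.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Molecular Adaptation during Adaptive Radiation in the Hawaiian Endemic Genus Schiedea</dc:title>
  <dc:title>Background</dc:title>
  <dc:title>Results</dc:title>
  <dc:title>Significance</dc:title>
  <dc:creator>Maxim V. Kapralov</dc:creator>
</oai_dc:dc>"""


def dc_titles(dc_xml):
    """Return the text of every dc:title element in a DC record (hypothetical helper)."""
    root = ET.fromstring(dc_xml)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


if __name__ == "__main__":
    titles = dc_titles(SAMPLE)
    # An article is expected to carry a single title; more than one
    # reproduces the problem described in this ticket.
    print(len(titles), titles)
```

Any record for which this returns more than one entry would show the same duplicate titles in the search results.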

11/27/06 12:24:49 changed by ronald

  • status changed from new to assigned.

11/27/06 20:57:15 changed by ronald

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [1401]) Closes #220: fixed the content of <dc:description>, <dc:title>, and other elements that may contain markup: the markup is now XML-serialized instead of being included directly. This fixes Fedora's misparsing of dc elements, which was creating spurious new nodes.
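
The difference between including markup directly and XML-serializing it can be sketched as follows. This is an illustration in Python under stated assumptions, not the code from [1401]: the fragment, the helper names dc_description and descendant_tags, and the use of xml.sax.saxutils.escape are all hypothetical stand-ins for whatever the ingest code actually does.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical article fragment containing nested markup.
FRAGMENT = "<sec><title>Background</title><p>Text</p></sec>"


def dc_description(fragment, serialize):
    """Build a dc:description element (hypothetical helper).

    With serialize=True the markup is escaped so it travels as plain
    text; with serialize=False it is embedded as child elements.
    """
    body = escape(fragment) if serialize else fragment
    return '<dc:description xmlns:dc="%s">%s</dc:description>' % (DC_NS, body)


def descendant_tags(xml_text):
    """List the tags of all elements nested below the root element."""
    root = ET.fromstring(xml_text)
    return [el.tag for el in root.iter()][1:]  # skip the root itself


if __name__ == "__main__":
    # Included directly: the nested <title> is a real element that a
    # downstream parser can mistake for a title of its own.
    print(descendant_tags(dc_description(FRAGMENT, serialize=False)))
    # XML-serialized: no nested elements, the markup is only text.
    print(descendant_tags(dc_description(FRAGMENT, serialize=True)))
```

In the serialized form the element has no children, and a parser reading it recovers the original markup as a text string.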

10/29/07 21:12:56 changed by

  • milestone deleted.

Milestone November27 deleted