Ticket #109 (closed enhancement: fixed)

Opened 2 years ago

Last modified 1 year ago

Cache PMC DTDs to avoid delay during ingest

Reported by: ronald Assigned to: ronald
Priority: critical Milestone:
Component: topaz Version: 0.5-SNAPSHOT
Keywords: article ingest Cc:
Blocking: Blocked By:

Description

Currently, each time an article is ingested, the XSLT engine looks up and downloads the PMC DTD(s), about 100 KB of data. This adds a delay of roughly 20 seconds (depending on network speed) to each ingest.

Instead, we should store the DTDs locally and point the XML parser to the local copies.
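One way to point the parser at local copies is a SAX `EntityResolver` that maps a DTD's public ID to a file on disk. The following is only a minimal sketch of that idea, not the code that was eventually committed: the class name `LocalDtdParsing`, its two helper methods, and the public-ID-to-path map are all assumptions made for illustration.

```java
import java.io.StringReader;
import java.nio.file.Path;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical helper; the names here are illustrative and do not come from the ticket.
final class LocalDtdParsing {

    /** Returns a resolver that serves known public IDs from local files. */
    static EntityResolver localResolver(Map<String, Path> dtdsByPublicId) {
        return (publicId, systemId) -> {
            Path local = (publicId == null) ? null : dtdsByPublicId.get(publicId);
            if (local == null) {
                // Unknown DTD: returning null lets the parser use its default lookup.
                return null;
            }
            // Hand the parser a file: URI instead of the remote system ID.
            InputSource source = new InputSource(local.toUri().toString());
            source.setPublicId(publicId);
            return source;
        };
    }

    /** Parses the given XML text, resolving external entities through the resolver. */
    static Document parse(String xml, EntityResolver resolver) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        builder.setEntityResolver(resolver);
        return builder.parse(new InputSource(new StringReader(xml)));
    }
}
```

With a resolver like this installed, a DOCTYPE whose system ID points at a remote server is satisfied from disk, so no network round trip is needed for the DTDs that have been registered.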

Change History

08/11/06 16:49:59 changed by amit

  • priority changed from unassigned to critical.

08/11/06 16:50:25 changed by ronald

  • status changed from new to assigned.

08/11/06 17:08:51 changed by ronald

  • milestone changed from TBD to august25.

08/22/06 11:24:54 changed by ronald

(In [485]) Store the PMC as a 'managed' datastream, not an 'xml' datastream, because Fedora thinks it has the right to mess with 'xml' datastreams (such as removing doctype declarations).

Also added a test to make sure what we get back from Fedora is really what we put in.

This also addresses #109, since it keeps Fedora from trying to download all the DTDs.

08/22/06 11:33:40 changed by ronald

  • status changed from assigned to closed.
  • resolution set to fixed.

(In [486]) Fix #109: cache PMC DTDs. On the one hand there is an in-memory cache to minimize network retrievals across multiple ingests. On the other hand we include a local copy of the PMC DTDs for versions 1.1, 2.0, 2.1, and 2.2 in the jar.
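The in-memory half of this can be pictured as a resolver that remembers the bytes behind each system ID, so that a given DTD is fetched at most once per JVM in the common case. The code below is a rough sketch under assumptions, not the contents of [486]: the class `CachingDtdResolver`, its nested `Loader` interface, and the `download` helper are hypothetical names, and the lookup of the copies bundled in the jar is left out.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical class; the names are illustrative and not taken from the changeset.
final class CachingDtdResolver implements EntityResolver {

    /** Fetches the raw bytes behind a system ID (assumed seam so the source can be swapped). */
    interface Loader {
        byte[] load(String systemId) throws IOException;
    }

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();
    private final Loader loader;

    CachingDtdResolver(Loader loader) {
        this.loader = loader;
    }

    /** A possible default loader: a plain download from the system ID's URL. */
    static byte[] download(String systemId) throws IOException {
        try (InputStream in = URI.create(systemId).toURL().openStream()) {
            return in.readAllBytes();
        }
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) throws IOException {
        if (systemId == null) {
            return null; // nothing to key the cache on; defer to the parser's default
        }
        byte[] dtd = cache.get(systemId);
        if (dtd == null) {
            // First request for this DTD: load it once and remember the bytes.
            // Two threads racing here may both load; the result is the same either way.
            dtd = loader.load(systemId);
            cache.put(systemId, dtd);
        }
        InputSource source = new InputSource(new ByteArrayInputStream(dtd));
        source.setPublicId(publicId);
        source.setSystemId(systemId); // keeps relative references inside the DTD resolvable
        return source;
    }
}
```

Constructing it as `new CachingDtdResolver(CachingDtdResolver::download)` would give the network-backed behaviour; a loader that reads the bundled copies first could be passed in the same way.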

11/12/06 14:26:16 changed by ebrown

(In [1031]) Pull entity-resolver into separate library.

Addresses #172 and #109

10/29/07 21:12:47 changed by

  • milestone deleted.

Milestone august25 deleted