Ticket #289 (closed defect: fixed)

Opened 2 years ago

Last modified 1 year ago

fix one or more of these things - plosone cache, home page browse box, restart required for backups

Reported by: russ
Assigned to: somebody
Priority: critical
Milestone:
Component: topaz
Version:
Keywords:
Cc:
Blocking:
Blocked By:

Description

the relationship between our backup system and the plosone cache compounds any flakiness on the part of the topaz stack.

  • since we have to restart daily
  • since the plosone cache is blown out on restart
  • since the home.action page has to throw a topaz query for each article in order to build the browse block (~4 sec/query -> 15 minutes to build the home page after restart)

there's a lot of load placed on things after restart, and a long period of time when the site is unavailable to users on a daily basis.

can we do one or more of:

  • don't blow out the plosone cache on restart
  • work out a backup system that doesn't require restart
  • fix the homepage browse box

as soon as possible?

Change History

02/21/07 12:31:56 changed by ronald

After talking with Pradeep, here's a suggestion: restart the plosone servers one after the other. I.e.

  1. restart plosone01
  2. reload the home-page on plosone01
  3. restart plosone02
  4. reload the home-page on plosone02

The idea is that as long as one plosone is always up the other can load its cache from it.
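
The ordering above can be sketched in Java as follows. This is only an illustration of the sequencing, not a script that exists in the repo: the class name RollingRestart and the two callbacks are hypothetical, and the real restart and home-page reload commands would have to be plugged in where the sketch just prints.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: restarts hosts strictly one at a time and warms each
// one before touching the next, so at most one host has a cold cache.
final class RollingRestart {

    // restart and warm are placeholders for the real actions; both are
    // expected to block until the action has finished on the given host.
    static void run(List<String> hosts, Consumer<String> restart, Consumer<String> warm) {
        for (String host : hosts) {
            restart.accept(host); // e.g. restart the plosone service on this host
            warm.accept(host);    // e.g. reload the home page so the cache is rebuilt
        }
    }

    public static void main(String[] args) {
        // Stand-in actions that only print what would be done.
        run(List.of("plosone01", "plosone02"),
            host -> System.out.println("restart " + host),
            host -> System.out.println("reload home page on " + host));
    }
}
```

The one thing that matters is that the warm-up step for a host completes before the next host is restarted; otherwise both caches can be empty at the same time and nothing is gained.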

02/21/07 13:54:01 changed by ronald

During a discussion with Russ he pointed out that we probably don't really need to restart plosone and topaz - the point of the restart is to get a snapshot of the Fedora and Mulgara databases. So, quiescing the stack by, say, shutting down Apache or by somehow disabling the ajp port on plosone, and then just restarting Fedora and Mulgara ought to be sufficient.

02/22/07 10:01:02 changed by russ

  • status changed from new to closed.
  • resolution set to fixed.

last night's backup, leaving topaz and plosone services up, went swimmingly.

during testing we found that, with fedora and mulgara disabled, the site happily serves pages out of its cache to anonymous users, and returns site errors to logged in users.

with mulgara up and fedora down, logged in users can browse too. we're not sure what happens if a logged in user attempts to annotate when mulgara is up and fedora is down. i suppose i should test that before closing this ticket...

...goes off to test...

perfect. we get an error and the annotation is not logged.

ideally we would disable the webheads during the short service restart (~5 minutes last night) by disabling the ajp connector on the plosone servers. if anyone knows of a good way to do that, i'll do it. but it seems like not a big deal, compared to how nasty things were before.

02/22/07 16:33:37 changed by ronald

I don't have the time to try this out right now, but I think if you enabled JMX on plosone (adding -Dcom.sun.management.jmxremote to params in /etc/sysconfig/plosone) you should be able to fire up jconsole, go to the MBeans tab, select Catalina -> Connector -> (whatever), go to the Operations tab, and there pause and resume the connector. This could also be put into a small utility.
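
A rough sketch of what such a small utility might look like, using the standard javax.management remote API. It is untested against plosone and rests on several assumptions: the connector MBean is named Catalina:type=Connector,port=<ajp port> (8009 is Tomcat's default AJP port, the real one may differ), the connector exposes pause and resume operations as described above, and JMX is reachable remotely, which needs com.sun.management.jmxremote.port to be set in addition to the flag mentioned above. The class name ConnectorToggle is made up.

```java
import java.net.MalformedURLException;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical utility: pauses or resumes a Tomcat connector over remote JMX.
final class ConnectorToggle {

    // Assumed MBean name of the connector listening on the given port.
    static ObjectName connectorName(int port) {
        try {
            return new ObjectName("Catalina:type=Connector,port=" + port);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Standard RMI-based JMX service URL for a JVM started with
    // -Dcom.sun.management.jmxremote.port=<jmxPort>.
    static JMXServiceURL serviceUrl(String host, int jmxPort) {
        try {
            return new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://" + host + ":" + jmxPort + "/jmxrmi");
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Invokes a no-argument operation ("pause" or "resume") on the connector.
    static void invoke(MBeanServerConnection conn, int ajpPort, String operation)
            throws Exception {
        conn.invoke(connectorName(ajpPort), operation, new Object[0], new String[0]);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 4
                || !(args[3].equals("pause") || args[3].equals("resume"))) {
            System.err.println("usage: ConnectorToggle <host> <jmxPort> <ajpPort> pause|resume");
            System.exit(2);
        }
        String host = args[0];
        int jmxPort = Integer.parseInt(args[1]);
        int ajpPort = Integer.parseInt(args[2]);
        // try-with-resources closes the JMX connection even if invoke fails.
        try (JMXConnector connector = JMXConnectorFactory.connect(serviceUrl(host, jmxPort))) {
            invoke(connector.getMBeanServerConnection(), ajpPort, args[3]);
        }
    }
}
```

The backup job could then call it with pause before restarting Fedora and Mulgara and with resume afterwards, e.g. java ConnectorToggle.java plosone01 9004 8009 pause, where 9004 is a placeholder JMX port. If JMX authentication or SSL is enabled, the connect call would also need credentials in an environment map.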

08/07/07 16:25:51 changed by

  • milestone deleted.

Milestone Bugs deleted