Ticket #64 (closed defect: wontfix)

Opened 6 years ago

Last modified 5 years ago

Topaz performance optimization

Reported by: ronald Assigned to: ebrown
Priority: high Milestone:
Component: topaz Version:
Keywords: Cc:
Blocking: Blocked By:

Description

We'll need to do some serious performance testing and optimization at some point. The TopazPerformance page is used to collect some of the performance issues and optimization ideas as we encounter them.

Change History

08/10/06 13:32:09 changed by amit

  • milestone changed from topaz_galois to TBD.

Moving to TBD as this process slowly gets cleaned up.

10/20/06 17:16:13 changed by ebrown

  • milestone changed from TBD to november6.

11/10/06 09:05:19 changed by pradeep

(In [1019]) Encapsulate implied permissions and propagate permissions inside permissions service.

(Prep-work for caching permissions. re #64.)

11/11/06 22:05:03 changed by pradeep

(In [1030]) Defined 2 new XACML condition functions:

  • urn:topazproject:names:tc:xacml:1.0:function:is-revoked
  • urn:topazproject:names:tc:xacml:1.0:function:is-granted

These replace the ITQL queries in the deny-revokes and permit-grants policies.

They are implemented by making direct API calls to permissions-impl isRevoked() and isGranted() functions.

(Prep-work for performance optimization re #64)
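As a rough illustration of the idea in this commit, the sketch below shows condition functions that answer "is this granted?" and "is this revoked?" through direct in-process calls rather than a query. Everything here is hypothetical: the real permissions-impl is Java and its signatures are not shown in this ticket, so the class, the method names, and the way the deny-revokes and permit-grants checks are combined are assumptions made for the example.

```python
# Hypothetical stand-in for permissions-impl; the real API is not shown in this ticket.
class PermissionsStore:
    def __init__(self):
        self._grants = set()
        self._revokes = set()

    def grant(self, resource, permission, principal):
        self._grants.add((resource, permission, principal))

    def revoke(self, resource, permission, principal):
        self._revokes.add((resource, permission, principal))

    def is_granted(self, resource, permission, principal):
        # Direct set lookup, standing in for the direct API call.
        return (resource, permission, principal) in self._grants

    def is_revoked(self, resource, permission, principal):
        return (resource, permission, principal) in self._revokes


def evaluate(store, resource, permission, principal):
    """Assumed combination: a revoke denies, else a grant permits."""
    if store.is_revoked(resource, permission, principal):
        return "Deny"
    if store.is_granted(resource, permission, principal):
        return "Permit"
    return "NotApplicable"
```

The point of the sketch is only that each check is a cheap in-memory call that a cache can later sit in front of.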

11/13/06 19:25:24 changed by pradeep

(In [1045]) Ehcache first-cut implementation. Not enabled yet; needs more testing (re #64)

11/17/06 12:36:26 changed by amit

  • milestone changed from november6 to november17.

Ongoing ticket. Transfer to next release.

11/17/06 17:40:02 changed by pradeep

(In [1161]) Couple of things to speed up XACML checks:

Cache is still disabled, since there is a memory leak during integration-tests. (Addresses #64)

11/20/06 16:05:27 changed by amit

  • milestone changed from november17 to November27.

Moved to November 27.

11/26/06 11:28:02 changed by amit

The cache has been enabled [1204], and the cache is now invalidated via the Mulgara filter resolver [1340].
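A minimal sketch of the cache-plus-invalidation pattern described here, assuming a slow backend lookup and some component that reports writes. It does not use Ehcache or Mulgara; the class, the `lookup` callable, and the `invalidate` hook are hypothetical names chosen for illustration.

```python
class CachedPermissions:
    """Caches backend lookups and drops entries when a write is reported."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable(resource, principal) -> iterable of permissions
        self._cache = {}
        self.misses = 0        # counts how often the backend was actually hit

    def permissions(self, resource, principal):
        key = (resource, principal)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = frozenset(self._lookup(resource, principal))
        return self._cache[key]

    def invalidate(self, resource):
        # Assumed to be called by whatever observes writes to the store
        # (the filter resolver plays that role in this ticket).
        for key in [k for k in self._cache if k[0] == resource]:
            del self._cache[key]
```

Without the `invalidate` call, a reader would keep seeing the old permissions after the backend changes, which is why enabling the cache and wiring up invalidation go together.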

11/26/06 11:29:29 changed by amit

Ronald is going to take a look at why user creation is so slow in Topaz. Given that we might have to ingest 40K users from the existing AP website, it would be nice to make this faster than it currently is.

11/27/06 21:03:46 changed by amit

  • milestone changed from November27 to Dec1.

Changing milestone.

12/07/06 10:15:47 changed by amit

  • milestone changed from Dec1 to TBD.

Ronald, we need to provide our updates here. Moving it back to TBD after our discussion: given Ronald's snoop numbers, the performance doesn't look that bad.

03/26/07 14:47:06 changed by amit

  • owner changed from somebody to ebrown.

Now that our staging servers are up, can we compare 32-bit vs. 64-bit?

03/28/07 10:33:08 changed by ebrown

I timed some Java builds when I first set the systems up and the times were about the same. I'm not sure what the breakdown is between CPU and I/O in a typical build, but it is probably mostly I/O. And all this was going over NFS. But the repeated builds on both 32-bit and 64-bit systems were within a second of each other, so no conclusion there.

I then re-did user migration of 349 users to the 32-bit systems. Running just topaz, fedora and mulgara on riddle and making the service calls across a fairly fast DSL connection, it took 2 seconds per user. I did not have caching on AFAIK, so that could have slowed things down (or sped them up). I then tried to reproduce this on the 64-bit cluster but ran into ssh problems; there is something wrong with the staging cluster.

According to my old emails, it was taking 4.5 seconds per user on production in December. While I doubt the problem is a 64-bit issue, it will be interesting to see once the staging cluster is healthy again. If it isn't 64-bit, we may want to try simulating 2 stacks with caching, etc.

04/04/07 16:05:03 changed by ebrown

I did migrations again on 64-bit with topaz and mulgara running on separate machines. Performance was 1 user/second, the fastest we've seen. Paul Gearon did indicate mulgara should actually run quite a bit faster on 64-bit, and I suspect that is true now. I doubt running topaz on a different system has much to do with the performance increase (though it is easy to test).

So the data we have is:

  • 4.5-seconds/user - production 64-bit/centos (2-stacks, caching on, xacml off)
  • 2-seconds/user - 32-bit fc6 running on 64-bit AMD. Topaz and mulgara on same system.
  • 1-second/user - 64-bit fc6/AMD. Topaz and mulgara on different systems.
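To put these three rates in perspective, here is a small back-of-the-envelope calculation. It assumes the per-user rate stays constant and uses the 40K-user figure mentioned earlier in this ticket; the labels and the helper name are only for illustration.

```python
# Seconds per user, taken from the three measurements listed above.
RATES = {
    "production 64-bit/centos": 4.5,
    "32-bit fc6, same system": 2.0,
    "64-bit fc6, separate systems": 1.0,
}
USERS = 40_000  # the "40K users" figure from earlier in the ticket


def migration_hours(seconds_per_user, users=USERS):
    """Estimated wall-clock hours, assuming a constant per-user rate."""
    return seconds_per_user * users / 3600.0


fastest = min(RATES.values())
for name, rate in RATES.items():
    print(f"{name}: {migration_hours(rate):.1f} h ({rate / fastest:.1f}x the fastest)")
```

At these rates a 40K-user migration would take roughly 50 hours on production, about 22 hours on the 32-bit setup, and about 11 hours on the 64-bit setup.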

I'm not even sure the caching gets in the way if XACML is turned off. Perhaps there are some updates (broadcasts) going on. Still, a slowdown of greater than 4x on production suggests any of these issues:

  • centos
  • networking
  • physical config of mulgara box (RAID and whatnot config is much different than anything we've ever tested on)
  • caching slowing things down somehow

Also note that user migration is probably not the best metric: xacml is off here, and we know it carries a high overhead by itself. But we also know that mulgara is critical to performance, and user migration certainly stresses that.

06/19/07 15:57:44 changed by amit

  • status changed from new to closed.
  • resolution set to wontfix.

It is a little difficult to figure out how to transfer this ticket to the new Topaz architecture. We probably need to start opening tickets on specific performance issues from now on.

08/07/07 16:25:51 changed by

  • milestone deleted.

Milestone Bugs deleted