Tweak MetricsUpdateTask performance #1481

Merged
I was profiling the `MetricsUpdateTask` class because some folks reported performance concerns, and I managed to identify a few screws we could tighten to make the task more resource efficient.

## Profiling

### Setup

- `postgres:14.2-alpine` image
- `MetricsUpdateTask` running in a `JerseyTest` without any resources deployed; the task runs pretty much in isolation
- No `INSERT`s, just `UPDATE`s

### Results
The current implementation performs a lot of allocations in the `updateComponentMetrics` method. Profiling showed that the biggest chunk is allocated in `AbstractAlpineQueryManager.persist`, more specifically in DataNucleus' `preCommit` method, where DN maintains its L1 cache:

This is also where a lot of CPU time is spent:

The DN docs say the following with regard to the L1 cache:
Because DT currently uses a single `PersistenceManager` for the entire portfolio metrics update run, the L1 cache keeps growing, and DN has to do more and more work to maintain it. Performance deteriorates with every project that is processed. So far, this matches what @hostalp reported in #1210 (comment).

Another factor here is what `AbstractAlpineQueryManager.persist` does with objects passed to it:
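The method body was shown as a screenshot in the original description. As a hedged sketch, reconstructed from the surrounding discussion (which confirms the `makePersistent` and `refresh` calls) rather than copied from Alpine's source, it does roughly this per call:

```java
// Hedged sketch of what AbstractAlpineQueryManager.persist roughly does.
// "pm" is the javax.jdo.PersistenceManager held by the QueryManager.
public <T> T persist(final T object) {
    pm.currentTransaction().begin();
    pm.makePersistent(object);  // enlists the object in the L1 cache
    pm.currentTransaction().commit();
    pm.refresh(object);         // re-reads the object's state from the DB
    return object;
}
```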
Especially for larger objects, that's quite a bit of work being performed. In cases like metrics updates, we only want to `INSERT` or `UPDATE` something, but we don't care about the object after that; we don't need it to be persistent.

There are cases in the current logic where `persist` is called even though it isn't necessary, causing additional overhead. One of these cases is updating the `lastInheritedRiskScore` field of `Component` when the metrics themselves didn't change (see here). Semantically, this is (almost?) always a no-op, but in the background the persistence layer is still producing garbage.

When working on objects that are already persistent, we could achieve the desired effect using something like this as well:
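The original snippet didn't survive here; below is a minimal sketch of the idea, assuming `component` is already persistent and attached to a JDO `PersistenceManager` named `pm` (the setter name mirrors the field discussed above):

```java
// Minimal sketch: mutate the already-persistent object inside a JDO
// transaction and let DataNucleus flush the change on commit.
final javax.jdo.Transaction trx = pm.currentTransaction();
try {
    trx.begin();
    component.setLastInheritedRiskScore(score);  // plain setter, no persist()
    trx.commit();
} finally {
    if (trx.isActive()) {
        trx.rollback();  // roll back if the commit never happened
    }
}
```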
That saves us the unnecessary calls to `makePersistent` and `refresh`.

## Tweaks
I didn't want to change too much here, so I only introduced two changes (a rough sketch follows the list):

1. The `updateProjectMetrics` method gets its own `QueryManager`. This ensures that resources like the L1 cache are cleared regularly. We do want to take advantage of DN's caching to some degree, so using a separate `QueryManager` for each `updateComponentMetrics` invocation is not reasonable. According to DN's docs, creating `PersistenceManager` instances is a cheap operation though.
2. The `lastInheritedRiskScore` fields of projects and components are only updated when they actually changed. In the majority of cases they won't have changed, so we save quite a bit of computing power.
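As a hedged illustration of both changes (the `QueryManager` constructor, the `updateProjectMetrics` signature, and the metrics getter below are assumptions, not the actual diff):

```java
// Tweak 1: one QueryManager (and thus one PersistenceManager with its own
// L1 cache) per project, closed and discarded when the project is done.
for (final Project project : projects) {
    try (final QueryManager qm = new QueryManager()) {
        updateProjectMetrics(qm, project);  // hypothetical signature
    }
}

// Tweak 2, inside the update: skip the write when the value is unchanged.
// Assumes lastInheritedRiskScore is a primitive double.
final double newScore = metrics.getInheritedRiskScore();
if (Double.compare(project.getLastInheritedRiskScore(), newScore) != 0) {
    project.setLastInheritedRiskScore(newScore);
}
```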
## Results

Allocations are reduced by more than 50%:
CPU time spent is drastically reduced as well:
With the original code, the task ran for 3m30s. With the changes outlined above, it finished in just under 2m30s!
There are probably a few more places throughout the codebase where similar minor changes could achieve similar results.