"Move page" is 85% slower in Confluence 5.10+ than in 5.8.18, making moving large number of pages very inefficient

Description

We are developers of the Archiving Plugin for Confluence.

Our add-on batch-archives pages by moving them between spaces. Our customers frequently archive tens of thousands of pages in one go, sometimes even more. "Page move" does not need to be hyper-fast, but it is expected to complete in tens of minutes, or at most a few hours (overnight), even for this scale of data.

"Page move" has been a fairly fast operation in Confluence 5.7, and it became significantly slower some time between 5.7 and 5.10. (As far as I know the "Move page" feature has been rewritten in some Confluence version released around that time.)

Our profiling results:

| Operation | Confluence 5.7.6 (1.014M pages) | Confluence 5.8.18 | Confluence 5.10.1 (380K pages) |
|---|---|---|---|
| Page archiving with MOVE | 40 minutes (for 1555 pages) | N/A | 1.1 hours (for 400 pages; stable, but slower!) |

1555/40 ≈ 39 pages moved per minute in Confluence 5.7
400/66 ≈ 6 pages moved per minute in Confluence 5.10

6/39 ≈ 0.15, i.e. it is roughly 85% slower than it used to be!

Is there any chance to accelerate page moves in Confluence core?
Or offer an API to batch-move multiple pages? (Currently, the API limits us to moving one page at a time.)
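To illustrate the limitation, here is a rough sketch of what the per-page approach looks like from an add-on today. This is simplified and hedged: the class name and space keys are illustrative, and the real archiving job also deals with page hierarchy, permissions and transaction batching.

```java
import java.util.List;

import com.atlassian.confluence.pages.Page;
import com.atlassian.confluence.pages.PageManager;
import com.atlassian.confluence.spaces.Space;
import com.atlassian.confluence.spaces.SpaceManager;

// Illustrative sketch only: moves every current page of a source space to the
// top level of an archive space, one page at a time.
public class ArchiveMover {
    private final PageManager pageManager;
    private final SpaceManager spaceManager;

    public ArchiveMover(PageManager pageManager, SpaceManager spaceManager) {
        this.pageManager = pageManager;
        this.spaceManager = spaceManager;
    }

    public void archiveSpace(String sourceSpaceKey, String archiveSpaceKey) {
        Space source = spaceManager.getSpace(sourceSpaceKey);
        Space archive = spaceManager.getSpace(archiveSpaceKey);

        List<Page> pages = pageManager.getPages(source, true); // current pages only
        for (Page page : pages) {
            // Each call pays the full single-page move cost
            // (ancestor updates, link refactoring, index updates, ...).
            pageManager.movePageToTopLevel(page, archive);
        }
    }
}
```

A batch-move API that accepted a collection of pages could presumably amortize the link refactoring and index updates over the whole batch, instead of paying that cost once per page.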

*Update June 2, 2017*

Joint investigation (see the comment thread below) showed that the regression happened somewhere between 5.8.18 and 6.0.1:

|  | 5.8.0 | 5.8.18 | 6.0.1 |
|---|---|---|---|
| Execution time | 23 m 10 s | 12 m 36 s | 23 m 53 s |

The regression is present in both memory consumption and CPU usage. As Aron showed, increasing memory brings the page move time down by about 30%, but it is still not on par with what it was.

A possible problem area is either the `DefaultRelatedContentRefactorer` class or its clients.

Environment

None

Activity

Matthew Jensen
September 7, 2017, 1:29 AM

To deliver a better level of support for our Server products, we have created a new service desk for all server plugin development questions.

If this issue is still a problem for you, please create an issue with our developer relations team here:
https://ecosystem.atlassian.net/servicedesk/customer/portal/14

If you have questions or issues related to developing an add-on for Confluence Cloud, please continue to use this issue tracker.

Aron Gombas [Midori]
June 2, 2017, 9:17 AM

Thanks, Petro. I really appreciate the help you and Minh provided during this week!

I hope this comes to the top of your backlog, as it is impacting our users who are, of course, running on large user tier licenses of Confluence.
(Note: our profiling used 800 page batches, but they sometimes archive 60-80K pages in one go!)

Petro Semeniuk
June 2, 2017, 3:38 AM

Hi, many thanks for going the extra mile and investigating the impact of free heap and cache sizes on page move speed. Looking at the numbers, I think that besides the page move being more memory hungry, there is a genuine regression in how many CPU cycles a page move eats as well.

Unfortunately, there is not much more I can help with.

I'll update the ticket description shortly with our findings, so it will be up to the team to prioritise and possibly find an innocent soul to fix the regression.

Aron Gombas [Midori]
May 31, 2017, 2:03 PM

Hi Petro,

Here is the result of our profiling work.

1. Giving more heap improved this. For instance, moving from 1024M (the default) to 1536M reduced the execution time from 29m to 20m (about 30% less) for ~800 pages. Giving even more heap didn't improve it further, but my guess is that it would if the data set were bigger.
This still isn't blazing fast, but at least it's something.

2. The cache capacity tuning did not bring any improvement. We increased the capacity of the "under-sized" caches by 3x in each step and re-ran the job, without any improvement. We continued this until the utilization of every cache was under 80%.

(This is just the summary. I can share the details, if you are interested.)

Aron Gombas [Midori]
May 30, 2017, 1:58 PM

Thanks for the hints, Petro!

We set up a list of experiments with heap size tuning and cache capacity tuning, and we are starting to execute them right now...
I should be back soon with our results.

Tracked Elsewhere

Assignee

Unassigned

Reporter

Aron Gombas [Midori]

Add-on Type

Server

Original Estimate

None

Components

Priority

Major