Report archives have tripled in size since update to 2.10 #7181
Hi @skyhawk669, thanks for the report. Do you notice any improvement in 2.11.0? |
As Matt asked, I'm reposting here, and adding something I left out of my previous report in case it is a clue: another thing changed on my server in February. The CLI command changed from a numeric IP in "--url=" to the real domain... console core:archive --url=http://1.2.3.4/ >> piwik-console-cron.log And this was my initial post: Piwik 2.11.b3 Single Piwik instance with 6 websites, about 2.5 million pageviews a month. |
I'm seeing the same thing here after we updated from 2.8.3 to 2.10.0. That being said, I can't say for sure that I noticed how large the tables were before the update. I noticed this after investigating some other issues we seem to be having. We have been on 2.11.1 for 24 hours now and the tables are still quite large. I'll be more than happy to run any queries for further diagnosis. |
@CanuckNick and @gaumondp, can you both try running this query: If you look at my screenshot above, I ran this query for 2014_12, 2015_01, and 2015_02, then sorted by the "Count" column. My results showed that for 2014_12 and earlier, the tables had Counts of at most 4, while 2015_01 and 2015_02 have Counts as high as 76 so far. According to capedfuzz on the forum (http://forum.piwik.org/read.php?2,123852,page=1#msg-123931), this indicates that "If the result isn't an empty set and there are a lot of rows, it means there are lots of duplicate archives that aren't being deleted for some reason." |
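The query referred to in the comment above did not survive in this copy of the thread, so the following is only an illustrative sketch of what "duplicate archives" means, not the query that was actually posted. It assumes an archive table with `idsite`, `date1`, `date2`, `period` and `name` columns and uses an in-memory SQLite table with made-up rows; the table name `archive_numeric_demo` and both helper functions are hypothetical.

```python
import sqlite3


def count_duplicate_archives(conn, table):
    """Return (idsite, date1, date2, period, name, count) for each duplicated group.

    The table name is interpolated because SQL placeholders cannot bind
    identifiers, so only pass trusted table names.
    """
    sql = (
        f"SELECT idsite, date1, date2, period, name, COUNT(*) AS n "
        f"FROM {table} "
        f"GROUP BY idsite, date1, date2, period, name "
        f"HAVING COUNT(*) > 1 "
        f"ORDER BY n DESC"
    )
    return conn.execute(sql).fetchall()


def build_demo():
    """Create a toy table (hypothetical schema and data) with one duplicated group."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archive_numeric_demo ("
        "idarchive INTEGER, name TEXT, idsite INTEGER, "
        "date1 TEXT, date2 TEXT, period INTEGER, value REAL)"
    )
    rows = [
        # Three archives for the same site, period and name: duplicates.
        (1, "done", 1, "2015-02-01", "2015-02-01", 1, 1),
        (2, "done", 1, "2015-02-01", "2015-02-01", 1, 1),
        (3, "done", 1, "2015-02-01", "2015-02-01", 1, 1),
        # Unique archives: must not be reported.
        (4, "done", 1, "2015-02-02", "2015-02-02", 1, 1),
        (5, "done", 2, "2015-02-01", "2015-02-01", 1, 1),
    ]
    conn.executemany(
        "INSERT INTO archive_numeric_demo VALUES (?, ?, ?, ?, ?, ?, ?)", rows
    )
    return conn


if __name__ == "__main__":
    for row in count_duplicate_archives(build_demo(), "archive_numeric_demo"):
        print(row)
```

An empty result would mean no group appears more than once; many rows with high counts would match the symptom described in this thread.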
Thanks everyone for the reports, that's really helpful. It looks like the issue is that duplicated archives are not deleted as suggested by @diosmosis - we will fix it in the next major release. Please write a comment if you also experience this issue and haven't commented yet! |
I've noticed the same thing after upgrading. Table sizes have been massive over the last two months.
Please excuse the huge pastes.
December 2014:
|
…tatic methods, move Rules::shouldPurgeOutdatedArchives since it is only used by ArchivePurger and move comment in said function.
…her through tracking or through CronArchive), so remove 6 hour forced time between purges in ArchivePurger::shouldPurgeOutdatedArchives.
Could someone experiencing this problem run the following query on a bloated numeric table:
? Thanks! If the results are too large to copy-paste, feel free to email them to hello@piwik.org. |
Also, if anyone wants to see a better output message than "Purging temporary archives: skipped", you can replace the Please post the message you see here, it may help us fix the issue. |
@diosmosis - I just sent an email with a couple result exports from my tables. Hopefully that will help. I will see if I can update and run the change to |
Just ran |
@diosmosis - here's the results of the query you requested: |
@diosmosis, I have run the job with the modified Rules file and I get the same messages as @CanuckNick for each skipped line: |
@diosmosis I also got "Purging temporary archives: skipped (no authorization)" for all archives including the newest 2015 Jan, Feb, Mar archives |
If you run the |
I believe the cron is running on a regular basis in our testing environment still so I will check the output tomorrow morning (EST) and get back to you. |
…PurgeOutdatedArchives() so archive purging can be forced in contexts other than scheduled task running.
Just checked the logs, this is what shows up this morning: |
This seems to be mostly solved for past months but not for the current one where the blob table remains abnormally large. |
It may not be fully fixed so re-opening. This was reported in http://forum.piwik.org/read.php?2,128325 - (I've also removed the issue from the 2.14.1 changelog where it was announced fixed.) |
Indeed. archive_blob_2015_08 6.4 G 295 M 5,161,333 I'll have the following run to free up some space: Please note this is not documented in http://piwik.org/docs/setup-auto-archiving/#help-for-corearchive-command. |
Here are my numbers after a “purge all”: archive_blob_2015_08 628 M 35.7 M 943,370 archive_numeric_2015_08 28.6 M 45.3 M 275,553 There is clearly an asymmetry between July-August and the rest of the year but I can't tell whether the old ones didn't run completely when they had been re-archived or if July and August still contain too many rows. To be continued... |
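To put the figures above in proportion, here is a small arithmetic sketch. It assumes that the three numbers after each table name are data size, index size and row count, that the "before" values are the ones quoted for archive_blob_2015_08 in the previous comment, and that G and M are binary multiples; none of this is confirmed by the thread.

```python
# Assumption: K/M/G are binary multiples; decimal multiples give nearly the same ratio.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '6.4 G' into a byte count."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def reduction(before, after):
    """Fraction by which a quantity shrank (0.9 means 90% smaller)."""
    return 1 - after / before


# Figures quoted in the thread for archive_blob_2015_08, before and after the purge.
blob_data = reduction(parse_size("6.4 G"), parse_size("628 M"))
blob_rows = reduction(5_161_333, 943_370)

print(f"archive_blob_2015_08 data size shrank by {blob_data:.0%}")
print(f"archive_blob_2015_08 row count shrank by {blob_rows:.0%}")
```

Under these assumptions the data size drops by a larger fraction than the row count, which would be consistent with the purged rows being larger than average, though the thread does not establish that.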
Hi everyone, we believe we may have finally found the issue causing this bug of archive tables becoming too big. I'm re-opening while we wait for confirmation from you that it has been fixed. It would be great if you could test whether the bug is also fixed for you. We've released 2.15.0-b2, which you can install easily (see instructions: http://piwik.org/faq/how-to-update/faq_159/). We're waiting for your feedback 👍 |
I just want to leave a comment here. I updated my installation to the latest beta, then ran archiving and an optimize on all tables. My installation shrank from 80 GB down to 10 GB, of which 6 GB are from
|
I had a 650 GB archive_blob_2015_01 table on 2.12.0. I will let you know how it looks with 2.14.3, and then with the beta, once archiving finishes. |
Remember that January is also the table having the yearly reports so it's "normal" it's way bigger (8X ?) than other months. AFAIK. |
Another piwik server: so it's way bigger (10-20 times) |
In my case it's solved and even got better with the new CLI command |
Hi everyone, could you confirm whether this bug is fixed for you after upgrading to 2.15.0? If it is not fixed, please let us know: we need to make sure this issue is really solved. Thanks! |
It seems mostly solved. I wonder to what point it is still normal to have invalidated archives still present for the month of October, now on the 18th of November. Please look at the attached results of
where the one of September has been run on the 5th of October (invalidated archives present) and Today (none present). analyze-archive-table_2015_10.txt The table sizes compare as follows: Did we make sure a table optimisation is run when needed? |
Hi everyone, we haven't heard that the issue is still active, so we are closing it. Please leave a comment if you find anything interesting or still experience problems. Thanks! |
I updated to 2.15.0 a few weeks ago and the problem seems solved. piwik_archive_blob_2015_month tables are now around 400 MB instead of 6 GB (except for January, but that's normal, I know)! Thanks. |
I've updated to 2.16.0 and, although it shrank the tables considerably, they still appear to be bloated with duplicate entries. Before 2.16.0 console core:archive and optimize: After 2.16.0 core:archive and optimize: Following the comments in this thread, I ran the following to see the number of duplicate entries in the Mar 2016 blob table.
This resulted in 727180 rows. I ran the following diagnostic as well:
Here's the summary: Total # Archives: 40728. Please let me know if you need any further information. |
Since upgrading to 2.10, the archive blob and archive numeric tables have tripled in size (blob tables were usually between 30-40MB; now they are 140-500MB).
The archiving process is set in cron to run every hour and nothing else has changed in the system beyond upgrading from 2.9 to 2.10 (no drastic change in amount of visitors, or to the structure of the site).
The tables are reduced in size a bit by running core:run-scheduled-tasks --force, but they're still quite a bit bigger than they used to be.
More background info in the following thread: http://forum.piwik.org/read.php?2,123852
Running Piwik on:
Red Hat 5
Apache 2.4
PHP 5.4.35
MySQL 5.0.45