When deleting visits via the GDPR Tools, invalidate the reports for each day of these deleted visits · Issue #14195 · matomo-org/matomo · GitHub
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When deleting visits via the GDPR Tools, invalidate the reports for each day of these deleted visits #14195

Closed
mattab opened this issue Mar 13, 2019 · 9 comments · Fixed by #14399
Assignees
Labels
c: Usability For issues that let users achieve a defined goal more effectively or efficiently. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.
Milestone

Comments

@mattab
Copy link
Member

mattab commented Mar 13, 2019

This is a frequent bug report. When deleting visits via the GDPR Tools then the deleted visits still show up in the reports.

The goal of this issue is that, whenever some visits are deleted then we also should call the Invalidate Reports code. As a result, next time core:archive will run all of the reports for the invalidated days (and the week, month, year) will be reprocessed, and the deleted data will be reflected in the reports (ie. not visible anymore).

(this will also benefit Cloud customers since invalidating old reports is disabled on Cloud yet sometimes we really need some data to be deleted)

Also in the UI we can remove the text In this case you may want to consider to invalidate reports after deleting these visits. since it will now be done automatically.

@mattab mattab added Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change. c: Usability For issues that let users achieve a defined goal more effectively or efficiently. labels Mar 13, 2019
@mattab mattab added this to the 3.11.0 milestone Mar 13, 2019
@tsteur
Copy link
Member

tsteur commented Apr 22, 2019

@mattab do we need to take log deletion setting into consideration? If we invalidate a report, will the invalidated report eventually disappear or do we keep the report if there is no final report? Otherwise we risk some reports cannot be regenerated?

@tsteur
Copy link
Member

tsteur commented Apr 28, 2019

ping @mattab

@katebutler katebutler self-assigned this Apr 28, 2019
@tsteur
Copy link
Member

tsteur commented Apr 28, 2019

To invalidate, we want to use the method ArchiveInvalidator::rememberToInvalidateArchivedReportsLater($idSite, $date)

@tsteur
Copy link
Member

tsteur commented Apr 28, 2019

We delete the dates based on log_visit.visit_last_action_time on the visit.

When we pass the $date to the method, need to make sure to apply the site's timezone. Otherwise we might invalidate the wrong date.

@mattab
Copy link
Member Author

mattab commented Apr 29, 2019

do we need to take log deletion setting into consideration? If we invalidate a report, will the invalidated report eventually disappear or do we keep the report if there is no final report? Otherwise we risk some reports cannot be regenerated?

AFAIK the API which invalidates reports checks when the first logs are available and if the logs are deleted for this date, the reports won't be invalidated. So we shouldn't have to worry about it.

@tsteur
Copy link
Member

tsteur commented May 2, 2019

AFAIK the API which invalidates reports checks when the first logs are available and if the logs are deleted for this date, the reports won't be invalidated. So we shouldn't have to worry about it.

so basically, if someone deletes a visit, and a user expects this data to be deleted from reports, it won't be actually deleted @mattab ? I don't think it's what users expect, at the same time deleting those reports is also not what people expect :)

@mattab
Copy link
Member Author

mattab commented May 2, 2019

My comment was unclear... What i meant is: the API which invalidates reports checks the date-time of the oldest logs and if the date to invalidate reports for is older than this date-time then the reports are not invalidated (because we know there is no logs to re-archive reports). Since when we delete a visit it means there are still logs for this day for sure, it should always result in an invalidation down the line.

@tsteur
Copy link
Member

tsteur commented May 2, 2019

@mattab that's clear. The problem is when you delete a visit of 12 months ago, and log deletion is set to 6 months... then the archive won't be invalidated/deleted but a user would expect the data to be removed from that archive.

@tsteur
Copy link
Member

tsteur commented May 2, 2019

It's an edge case though and don't really need to think about it. Be only an issue when deleting visits and for those visits log purging hasn't happened just yet.

Guess it's more like about deleting older reports that contain personal data but that's a different issue.

@mattab mattab modified the milestones: 3.12.0, 3.10.0 May 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: Usability For issues that let users achieve a defined goal more effectively or efficiently. Task Indicates an issue is neither a feature nor a bug and it's purely a "technical" change.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants