For the EU's GDPR compliance (user privacy), we need to redact personally identifiable information form the versions of our records. I've come up with something that seems to work, but figure I should ask if there's an established way to do this.
class User < ActiveRecord::Base
has_paper_trail
end
user = User.create! name: 'Josh'
user.update_attributes name: 'Josh2'
user.update_attributes name: 'Josh3'
user.destroy!
def self.get_data
PaperTrail::Version.order(:id).where(item_id: 1).map { |ver| [ver.event, ver.object, ver.object_changes] }
end
# ===== BEFORE =====
get_data
# => [["create", nil, {"id"=>[nil, 1], "name"=>[nil, "Josh"]}],
# ["update", {"id"=>1, "name"=>"Josh"}, {"name"=>["Josh", "Josh2"]}],
# ["update", {"id"=>1, "name"=>"Josh2"}, {"name"=>["Josh2", "Josh3"]}],
# ["destroy", {"id"=>1, "name"=>"Josh3"}, nil]]
PaperTrail::Version.where_object_changes(name: 'Josh').each do |ver|
ver.object['name'] = 'REDACTED' if ver.object && ver.object['name'] == 'Josh'
if oc = ver.object_changes
oc['name'] = oc['name'].map { |name| name == 'Josh' ? 'REDACTED' : name }
ver.object_changes = oc
end
ver.save!
end
# ===== AFTER =====
get_data
# => [["create", nil, {"id"=>[nil, 1], "name"=>[nil, "REDACTED"]}],
# ["update",
# {"id"=>1, "name"=>"REDACTED"},
# {"name"=>["REDACTED", "Josh2"]}],
# ["update", {"id"=>1, "name"=>"Josh2"}, {"name"=>["Josh2", "Josh3"]}],
# ["destroy", {"id"=>1, "name"=>"Josh3"}, nil]]
UPDATE: Actually, I'm going to need to scope the record by an association, as well, so my example isn't sufficient.
For the EU's GDPR compliance (user privacy), we need to redact personally identifiable information form the versions of our records. I've come up with something that seems to work, but figure I should ask if there's an established way to do this.
No, as of today, 2018-05-30, there is no built-in feature or documented solution for GDPR redaction.
PaperTrail provides many ways to iterate over, and to query records in the versions
table. where_object_changes
is one such feature, but it generates some pretty complicated SQL.
where_object_changes(name: 'Joan')
SELECT "versions".*
FROM "versions"
WHERE .. ("versions"."object_changes" LIKE '%
name:
- Joan
%' OR "versions"."object_changes" LIKE '%
name:
-%
- Joan
%')
You may, justifiably, have concerns about the correctness of this query. In fact, as of PT 9.0.0, using where_object_changes
to read YAML from a text column raises an error to that effect. Reading JSON from text or from a json/b
column is still allowed.
Anyway, if I've succeeded in making you wary of such complicated SQL then you should choose a simpler approach, perhaps iterating over all of the version records for that user (user.versions.find_each
)