Add a CLI script for scrubbing email addresses etc from export files #69
Open
tellyworth wants to merge 1 commit into trunk
Conversation
Contributor
Author
@iandunn mentioned the idea of taking an allowlist approach to scrubbing, rather than scrubbing selected elements as I've done here. That's a good idea for the future, especially if we were to reuse this for other sites. That was my other reason for using DOMDocument: we could traverse every node and scrub everything that's not explicitly allowed. If you think this is OK as a starting point, let's merge it and add some handbook imports.
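For reference, the allowlist idea could look roughly like the sketch below. It is not part of this PR: the function name `scrub_unlisted()`, the `[scrubbed]` placeholder, and the sample element names are all hypothetical. It walks every node and replaces any non-empty text whose parent element is not explicitly allowed.

```php
<?php
// Hypothetical allowlist scrubber; a sketch, not the script in this PR.
function scrub_unlisted( DOMNode $node, array $allowed ): void {
    // Snapshot the children first, because childNodes is a live list.
    foreach ( iterator_to_array( $node->childNodes ) as $child ) {
        if ( $child instanceof DOMElement ) {
            scrub_unlisted( $child, $allowed );
        } elseif ( $child instanceof DOMCharacterData
            && '' !== trim( $child->data )
            && ! in_array( $node->nodeName, $allowed, true ) ) {
            // Text, CDATA, or comment content under an element that is not allowed.
            $child->data = '[scrubbed]';
        }
    }
}

$doc = new DOMDocument();
$doc->loadXML( '<item><title>Hello</title><author_email>a@b.org</author_email></item>' );

// Only <title> is allowed to keep its text in this example.
scrub_unlisted( $doc->documentElement, array( 'title' ) );
$result = $doc->saveXML( $doc->documentElement );
```

A real allowlist would probably need to match on namespaced names and cover attributes as well. This sketch only handles character data.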
iandunn
reviewed
Jun 3, 2022
fwrite( $fp_out, $doc->saveXML() );

fclose( $fp_out );
Member
Suggested change
- fclose( $fp_out );
+ fclose( $fp_out );
+ readline( "\nMake sure you manually review the diff before committing this to the repository! This script might have missed some PII. \n" );
Member
Here's an example of a safelist approach, for future reference. It works on a SQL dump rather than WXR, but has safelists of various data that might be useful.
Contributor
Author
This is probably not needed now but worth keeping for future use.
We could periodically use this on the server side to generate sample export files.
I used DOMDocument rather than regexes to better deal with large files and unexpected whitespace etc.
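As a rough illustration of the DOMDocument approach (not the script in this PR), the sketch below replaces the contents of selected email elements. The function name `scrub_author_emails()`, the placeholder address, and the sample document are assumptions made for the example. Matching on `local-name()` is one way to avoid depending on the exact namespace URI of the export.

```php
<?php
// Hypothetical helper for illustration; the real script may differ.
function scrub_author_emails( string $xml ): string {
    $doc = new DOMDocument();
    $doc->loadXML( $xml );

    $xpath = new DOMXPath( $doc );
    $query = '//*[local-name()="author_email" or local-name()="comment_author_email"]';

    foreach ( $xpath->query( $query ) as $node ) {
        // Replaces the whole element content, including stray whitespace.
        $node->nodeValue = 'user@example.org';
    }

    return $doc->saveXML();
}

// Small made-up sample with awkward whitespace inside the element.
$sample = <<<XML
<?xml version="1.0"?>
<rss xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <wp:author>
      <wp:author_email>   real.person@wordpress.org
      </wp:author_email>
    </wp:author>
  </channel>
</rss>
XML;

$scrubbed = scrub_author_emails( $sample );
```

Because the parser works on elements rather than raw text, the line break and padding inside `wp:author_email` need no special handling, which is the kind of case a regex would have to anticipate.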