Is there a "ready-to-use" method to anonymize data while keeping the relations between keys? For example, I have:
- Table #1
| user code | zip code |
|---|---|
| ztxp15 | 45789 |
And:
- Table #2
| user code | order date |
|---|---|
| ztxp15 | 2021-06-27 06:22pm |
I want it anonymized as:
- Table #1
| user code | zip code |
|---|---|
| xvdf65 | 32165 |
And:
- Table #2
| user code | order date |
|---|---|
| xvdf65 | 2021-06-27 06:22pm |
This would need a bijective function that transforms a value while keeping its format ([a-z]{4}[0-9]{2}) and always generates the same output for the same input, based on a passphrase for example. That way uniqueness is kept, the format too, etc. But maybe I am missing something. I think this problem is very common, so I am looking for previous work on it.
It is common practice to use a user identifier which by itself has no meaning to a viewer. I assume in your case this is the `user code`. You should only anonymise PII (Personally Identifiable Information). You can encrypt it if the anonymisation must be reversible, or hash it for one-way anonymisation. Hashing is usually done when exporting data to analytics dashboards.
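To illustrate the hashing option on the two tables from the question, here is a minimal sketch using a keyed hash (HMAC-SHA256) from the Python standard library. The `pseudonymize` function, the `SECRET_KEY` value, and the 16-character truncation are assumptions made for the example, not an established tool. The same input always gives the same token, so rows in different tables can still be joined. Note that this is one-way and does not keep the original format; a reversible, format-preserving mapping would need a format-preserving encryption scheme instead.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a secret store,
# not from source code.
SECRET_KEY = b"replace-with-a-long-random-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY, length: int = 16) -> str:
    """Return a deterministic, keyed, one-way token for `value`."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncation is an assumption for readability; a shorter token
    # means a higher (still small) chance of collisions.
    return digest[:length]


# The two tables from the question, as lists of rows.
users = [{"user_code": "ztxp15", "zip_code": "45789"}]
orders = [{"user_code": "ztxp15", "order_date": "2021-06-27 06:22pm"}]

# Apply the same keyed hash to the same column in both tables.
anon_users = [{**row, "user_code": pseudonymize(row["user_code"])} for row in users]
anon_orders = [{**row, "user_code": pseudonymize(row["user_code"])} for row in orders]

# The same input yields the same token in both tables, so joins still work.
print(anon_users[0]["user_code"] == anon_orders[0]["user_code"])  # prints True
```

Using a keyed hash rather than a plain hash matters here: the space of possible values is small enough that an unkeyed hash could be reversed by hashing every candidate value.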
It is not common practice to anonymise the `user code`. If all PII is anonymised, then the `user code` is effectively anonymised.