The central government took down the draft data anonymisation guidelines this week, just days after posting them for public comment, which was to remain open until September 21.
The draft report was commissioned by the Ministry of Electronics and Information Technology (MeitY) and compiled by the STQC Directorate and the Centre for Development of Advanced Computing (C-DAC).
According to the PDF copy of the draft, it was uploaded in July this year. The guidelines recommended a variety of techniques and Standard Operating Procedures that e-governance projects can use to anonymise the data they collect (and then harness it for other projects).
They also aimed to aid in the implementation of data anonymisation provisions in government policies and laws.
THE DRAFT
As per the draft, “In the current environment, we are confronting many big data breaches that necessitate governments, organisations, and companies to reconsider privacy. In contrast to that, almost all breakthroughs in Machine Learning come from learning techniques that require a large amount of training data.”
“Besides, research institutions often use and share data containing sensitive or confidential information about individuals. Improper disclosure of such data can have adverse consequences for a data subject’s private information, or even lead to civil liability or bodily harm,” it added.
The draft therefore noted that what data to anonymise, when to anonymise it, and how to do so depend on an organisation’s objectives as well as emerging regulatory regimes and standards.
Additionally, the report clarified that in accordance with the principle of ‘data minimisation’, anonymisation should ideally occur as early in the data collection lifecycle as possible.
The draft also laid out 15 steps for the owner organisation or team to follow to make sure that the data is adequately anonymised.
The draft said: “The nodal officer is responsible for taking care of any data (including de-identified data) moving out of the organisation.”
Here are the 15 steps:
Step 1: Determine the dataset that needs a de-identification process.
Step 2: Decide the release model/ Policy. Determine whether the dataset will be made public or shared with restricted groups.
Step 3: Identify roles and responsibilities for overseeing the de-identification process—responsibilities under Chief investigator, Co-director, trial data manager, trial statistician, trial manager, tester, validator and auditor.
Step 4: Determine which data directly identifies an individual (such as phone numbers and, interestingly, Aadhaar) and which data does so indirectly (quasi-identifiers like sexual orientation or religious belief). This will help determine which data should be anonymised and how to do so.
Step 5: Mask (transform) direct identifiers. Mask or anonymise direct identifiers first, as this eliminates the risk of re-identification in the dataset.
Step 6: Perform threat modelling of quasi-identifiers—perform threat modelling on quasi-identifiers to determine what information may be revealed as a result of them.
Step 7: Determine the re-identification risk threshold.
Step 8: Determine the transformation process to be used to manipulate the quasi-identifiers.
Step 9: Import (sample) data from the source database.
Step 10: Review the results of the trial de-identification—perform trial anonymisation based on steps 6-9 and assess whether the results meet risk-limitation expectations. Examine and correct any errors, and ensure that the risk is less than the re-identification threshold.
Step 11: Transform all of the quasi-identifiers in the dataset. Then, using step 9, apply code and algorithms to entire datasets.
Step 12: For every dataset produced in step 10, ask and evaluate “can this information be used to identify someone?”
Step 13: Compare the actual re-identification risk with the threshold specified by the policymakers.
Step 14: Determine access controls for the data even while sharing it.
Step 15: Document the anonymisation process in detail. This will help reviewers and auditors identify problems in the process.
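To make steps 4, 5 and 8 more concrete, here is a minimal Python sketch of how a single record might be de-identified. It is not taken from the draft: the field names, the salt, and the choice of salted hashing, age bands and truncated postal codes are all assumptions made for illustration. Note also that salted hashing is usually regarded as pseudonymisation rather than full anonymisation.

```python
import hashlib

# Hypothetical classification from step 4; these field names are illustrative only.
DIRECT_IDENTIFIERS = {"name", "phone", "aadhaar"}
# Assumption: a secret value held by the data owner and never released.
SALT = b"replace-with-a-secret-salt"


def mask_direct(value: str) -> str:
    """Step 5: replace a direct identifier with a salted SHA-256 token."""
    return hashlib.sha256(SALT + value.encode("utf-8")).hexdigest()[:12]


def generalise_age(age: int, width: int = 10) -> str:
    """Step 8: coarsen an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def generalise_pincode(pincode: str, keep: int = 3) -> str:
    """Step 8: keep only the leading digits of a postal code."""
    return pincode[:keep] + "*" * (len(pincode) - keep)


def deidentify(record: dict) -> dict:
    """Mask direct identifiers and generalise the quasi-identifiers of one record."""
    out = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            out[key] = mask_direct(str(value))
        elif key == "age":
            out[key] = generalise_age(int(value))
        elif key == "pincode":
            out[key] = generalise_pincode(str(value))
        else:
            out[key] = value  # fields treated as non-identifying pass through
    return out


sample = {"name": "A. Kumar", "phone": "9000000000", "age": 34,
          "pincode": "110001", "scheme": "pension"}
print(deidentify(sample))
```

In this sketch the name and phone number become opaque tokens, while the age and postal code are coarsened so that more people share the same values.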
The draft report also suggested conducting a risk assessment after the anonymised data is released and recommended putting systems in place to notify concerned stakeholders of data privacy incidents within specified timeframes.
To reduce these occurrences, the draft also emphasised training e-governance officials in data anonymisation techniques throughout the data processing life cycle (collection, processing/usage, archival, deletion/destruction).
According to the draft, the privacy of anonymised data can be measured through approaches like k-anonymity, which help check that the risk threshold for an anonymised dataset is not exceeded.
Such measures can assist data processors in determining how resilient their data anonymisation techniques are to re-identification attacks.
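A dataset is k-anonymous if every record shares its combination of quasi-identifier values with at least k-1 other records. The Python sketch below shows one way such a check could be run against a risk threshold, as in steps 7 and 13. It is illustrative only: the sample records, the function names, and the use of 1/k as the worst-case re-identification risk are assumptions, not a method specified in the draft.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return k, the size of the smallest group of records that share
    the same values for all the given quasi-identifiers."""
    if not records:
        raise ValueError("cannot measure k-anonymity of an empty dataset")
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def within_threshold(records, quasi_identifiers, max_risk):
    """Compare the worst-case risk (assumed here to be 1/k) with a threshold."""
    k = k_anonymity(records, quasi_identifiers)
    risk = 1 / k
    return risk <= max_risk, k, risk


# Hypothetical, already generalised records.
records = [
    {"age": "30-39", "pincode": "110***"},
    {"age": "30-39", "pincode": "110***"},
    {"age": "30-39", "pincode": "110***"},
    {"age": "40-49", "pincode": "560***"},
    {"age": "40-49", "pincode": "560***"},
]

ok, k, risk = within_threshold(records, ["age", "pincode"], max_risk=0.2)
print(f"k={k}, risk={risk:.2f}, within threshold: {ok}")
```

Here the smallest group has two records, so k is 2 and the assumed worst-case risk is 0.5, which exceeds the example threshold of 0.2. Under the draft's process, the quasi-identifiers would then need further transformation before release.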
However, as the document stated: “This document is informative and advisory in nature and aims to provide guidelines to all entities involved in the processing of personal information (and subtypes) in e-governance projects. The document can also be used by private sector organisations processing personal information.”
Additionally, it said: “MeitY and/or its associated/attached offices and organisations retain the right to make changes to this document at any time, without notice.”
“Further, MeitY and/or its associated/attached offices and organisations makes no warranty for the use of this document and assumes no responsibility for any errors which may appear in the document, nor does it make a commitment to update the information contained herein,” the draft stated.
As per recent reports, ministry officials allegedly stated that the draft was released without adequate consultation with experts and that the ministry intends to consult a broader group of experts before releasing a new draft for public comment “in a few days”.