Data Strategy
04 Jun 2019
Understanding Data Risk
Opinion Piece by Paul Gillingwater, MBA, CISM, CISSP

The more I think about the risks associated with processing business-sensitive, personal or special category data, the more I realise that businesses and public-sector bodies need data clarity and transparency in order to manage these risks appropriately.
That clarity should begin with a deep understanding of the total IT estate, with well-documented descriptions of how, where and why the data is being captured, stored and processed.
The GDPR-mandated Article 30 Records of Processing Activities (ROPA) are a good start, and provide some key descriptors and attributes that must be documented, but in my view they don’t go far enough to truly manage your corporate data risk.
When I’m investigating a suspected data breach, there are certain questions that would help me to understand the potential harm caused by the attack.
Those questions include:
1. Data Architecture
What is the unique name of the data repository, and how does it fit into the overall data architecture of the enterprise? (I’m thinking about TOGAF here.)
2. Record Counts
How many records are stored in this repository? If it’s relational, you can denormalise for counting purposes. Bonus points if you can track (and graph) growth over time from inception. Most importantly, how many natural persons are represented by the data?
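As a sketch of the counting idea above — assuming a hypothetical `orders` table keyed by a `person_id` column — total records, distinct natural persons and month-on-month growth can all be derived from simple aggregate queries:

```python
# A minimal sketch using an in-memory SQLite database; the table and column
# names are illustrative assumptions, not a specific product's schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, person_id INTEGER, created DATE)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 101, "2019-01-05"), (2, 101, "2019-02-10"), (3, 102, "2019-02-11")],
)

total_records = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
natural_persons = conn.execute("SELECT COUNT(DISTINCT person_id) FROM orders").fetchone()[0]
# Growth over time: records per month, suitable for graphing from inception.
growth = conn.execute(
    "SELECT strftime('%Y-%m', created) AS month, COUNT(*) FROM orders "
    "GROUP BY month ORDER BY month"
).fetchall()

print(total_records, natural_persons)   # 3 records, 2 natural persons
print(growth)                           # [('2019-01', 1), ('2019-02', 2)]
```

The `COUNT(DISTINCT person_id)` is the key point: record counts alone overstate exposure when one person appears in many rows.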
3. Monetary Value
What’s the monetary value of this data repository? One estimate might be based on the cost of re-acquiring the data if it were lost. Also, what’s its Net Present Value, in terms of potential future revenue that might be derived from processing the data, and can this valuation be expressed as a monetary value per individual person?
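The valuation idea above can be sketched as a discounted-cash-flow calculation; every figure here (revenue, discount rate, headcount) is an illustrative assumption, not a benchmark.

```python
# A hedged sketch: discount an assumed stream of future annual revenue from
# processing the data, then express the Net Present Value per natural person.

def npv(annual_revenue, discount_rate, years):
    """Net Present Value of a flat annual revenue stream."""
    return sum(annual_revenue / (1 + discount_rate) ** t for t in range(1, years + 1))

annual_revenue = 120_000     # assumed revenue derived from processing, per year
discount_rate = 0.08         # assumed cost of capital
natural_persons = 50_000     # assumed number of people represented in the data

repository_npv = npv(annual_revenue, discount_rate, years=5)
value_per_person = repository_npv / natural_persons

print(round(repository_npv, 2))
print(round(value_per_person, 4))
```

Comparing this NPV against the re-acquisition cost gives two independent anchors for the same asset, which is useful when the numbers disagree.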
4. Access Control
Can you produce on demand a list of people who presently have access to this data repository? Do they have elevated privileges, e.g. mass deletion? What is the RBAC model used for this repository? Can you produce a list of people who had access in the past, but no longer do? Is that list valid for the lifetime of the data set? Is all data properly labelled for security purposes? Can these labels influence the tools used to access or manipulate the data set?
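The current-versus-historical access question reduces to keeping a grant log rather than only a live permission table. A minimal sketch, assuming a hypothetical log of role grants and revocations:

```python
# Illustrative grant log: user names, roles and dates are all assumptions.
from datetime import date

grant_log = [
    {"user": "alice", "role": "analyst", "granted": date(2017, 1, 3),  "revoked": None},
    {"user": "bob",   "role": "admin",   "granted": date(2016, 6, 1),  "revoked": date(2018, 9, 30)},
    {"user": "carol", "role": "admin",   "granted": date(2018, 10, 1), "revoked": None},
]

ELEVATED = {"admin"}  # roles permitting e.g. mass deletion

current = {g["user"] for g in grant_log if g["revoked"] is None}
former = {g["user"] for g in grant_log if g["revoked"] is not None} - current
elevated = {g["user"] for g in grant_log if g["revoked"] is None and g["role"] in ELEVATED}

print(sorted(current))   # who can access the repository today
print(sorted(former))    # who had access in the past but no longer does
print(sorted(elevated))  # current users with elevated privileges
```

If only a live permission table exists, the "who had access in the past" question is unanswerable, which is exactly the gap this kind of log closes.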
5. Baseline Activity
Do you have a baseline measurement of typical transactions associated with the data repository, e.g. monthly reports, service-desk changes based on a ticket, regular transactions? Can you recognise and alert on non-typical data access, e.g. a particular user or role suddenly starting to download hundreds of megabytes from the data repository? Can a different system be used to bypass RBAC, e.g. SQL queries issued through a Business Intelligence system with its own RBAC?
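The baseline-and-alert idea can be sketched with a simple statistical threshold; the volumes and the three-sigma cut-off are illustrative assumptions, not a product recommendation.

```python
# Flag any day's download volume that exceeds the baseline mean by more
# than three standard deviations.
from statistics import mean, stdev

# Historical daily download volumes (MB) per user for the baseline period.
baseline_mb = [12, 8, 15, 10, 9, 11, 14, 13, 10, 12]

mu = mean(baseline_mb)
sigma = stdev(baseline_mb)
threshold = mu + 3 * sigma

def is_anomalous(download_mb):
    """True when a day's volume falls outside the baseline."""
    return download_mb > threshold

print(is_anomalous(13))    # typical activity
print(is_anomalous(800))   # hundreds of MB: raise an alert
```

Real monitoring would baseline per user or per role, but even this crude version catches the "suddenly downloading hundreds of megabytes" case.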
6. Third-Party Transfers
Do you have records relating to business partners, suppliers or even regulators who have received copies of “significant” subsets of the repository? What parts of the data did they receive? When did they receive it? What was the lawful basis, or business justification, for the transfer? Is the partner a Controller in Common, Joint Controller, or Processor of personal data? Is the transfer still occurring as changes are made?
7. Data Subject Access Requests
Do you have comprehensive records around Data Subject Access Requests that were applied to records in this repository? Have records been deleted? When, on whose behalf, and how much data was deleted? Do you know which data has been transferred in machine-readable format? To whom was it transferred?
8. Legal Holds
Are there any legal holds for some or all of the data in this repository? Has all of the data been properly tagged with retention periods? When was data last purged from this repository due to exceeding its retention period? How many records were purged (or anonymized), and when did this take place? When will the next purge occur, and how many records will be included?
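The retention-tagging questions above can be sketched as a small purge calculation; the record set, field names and retention periods are illustrative assumptions.

```python
# Given records labelled with a retention period, identify which are due for
# purging (or anonymisation) and when the next purge will fall.
from datetime import date, timedelta

records = [
    {"id": 1, "created": date(2012, 3, 1),  "retention_years": 6},
    {"id": 2, "created": date(2017, 5, 20), "retention_years": 6},
    {"id": 3, "created": date(2018, 1, 15), "retention_years": 2},
]

def expiry(record):
    # Approximate a year as 365 days for this sketch.
    return record["created"] + timedelta(days=365 * record["retention_years"])

today = date(2019, 6, 4)  # the date of this article, for a reproducible example
overdue = [r["id"] for r in records if expiry(r) <= today]
upcoming = min((expiry(r) for r in records if expiry(r) > today), default=None)

print(overdue)    # records already past their retention period
print(upcoming)   # date of the next scheduled purge
```

Any record with a legal hold would be excluded from `overdue` before purging; the hold flag simply becomes one more condition in the filter.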
9. Storage and Resilience
Where is the data stored? What technology is being used? How do the technical and organisational measures reduce the risk of a data breach? When was a Disaster Recovery test last performed on this repository, and what were the results? How well would this repository withstand an insider attack, an APT or a ransomware infestation? If you’re using cloud, what are your data residency arrangements?
10. Governance and Strategy
Can you clearly articulate the three lines of defence for this repository? Do you have effective governance and reporting (KPIs, KRIs) around the utilisation and management of this repository? How does this repository fit in with your corporate strategy? Are there plans for cloud migration, or will it remain on-prem? What’s the expected lifetime of this data? If you were asked to value this as an asset during due diligence for an M&A or takeover process, how would you respond?
Most organisations I am familiar with lack the institutional maturity to answer these questions effectively.
In my view, the role of a Chief Data Officer is to build the capabilities to be able to answer these questions quickly and easily. Only then will enterprises be properly empowered to manage the risk of the data for which they are the custodians.
Paul Gillingwater is an Associate Partner at Chaucer Group, responsible for privacy and data protection.
Chaucer offers advisory services on GDPR, as well as DPO and GDPR Representative services. If you think we can help you to implement your project or Privacy Operations Centre strategy, please contact us on DataPrivacy@Chaucer.com or 0203 934 1099.
Paul Gillingwater, MBA, CISSP, CISM, RHCE (GDPR, ISO27001, PCI/DSS, GRC, DPA18)
Paul is a Managing Principal Consultant and registered DPO at Chaucer who has worked for more than 30 years as a cyber security and risk specialist, advising businesses, governments and non-profits on their governance, regulatory and compliance requirements. Over the past five years he has focused on UK & EU data protection, and he is a passionate advocate of online privacy rights education.