The responsible use of data for and about children: treading carefully and ethically

In this Q&A series, we speak with Stefaan G. Verhulst and Andrew Young from GovLab (an action-oriented do-tank located at NYU), who are working in collaboration with UNICEF on the Responsible Data for Children initiative (RD4C). Its focus is on data – the risks it poses to children, as well as the opportunities it offers.

You have been working with UNICEF on the Responsible Data for Children initiative (RD4C). What is this and why do we need to be talking more about ‘responsible data’?

To date, the relationship between the datafication of everyday life and child welfare has been under-explored, both by researchers in data ethics and those who work to advance the rights of children. This neglect is a lost opportunity, and also poses a risk to children.

Today’s children are the first generation to grow up amid the rapid datafication of virtually every aspect of social, cultural, political and economic life. This alone calls for greater scrutiny of the role played by data. An entire generation is being datafied, often starting before birth. With each passing year, the average child will have more data collected about them over their lifetime than a similar child born in any prior year. Ironically, humanitarian and development organizations working with children are themselves among the key actors contributing to the increased collection of data. These organizations rely on a wide range of technologies, including biometrics, digital identity systems, remote-sensing technologies, mobile and social media messaging apps, and administrative data systems. The data generated by these tools and platforms inevitably includes potentially sensitive personally identifiable information (PII) and demographically identifiable information (DII). All of this calls for much closer scrutiny, and a more systematic framework to guide how child-related data is collected, stored, and used.

Towards this aim, we have also been working with the Data for Children Collaborative, based in Edinburgh, to establish innovative and ethical practices around the use of data to improve the lives of children worldwide.

What are the ethical considerations around the collection of children’s data? Do children have any say at all in this ‘datafication’ of their lives?

Among the key ethical issues are consent, agency and privacy. Even adults face challenges in exerting individual or collective agency in some areas of the data ecosystem, for example where datastreams drawn from many people are used to create profiles or marketing segments. As Martin Tisné puts it, “we are prisoners of other people’s consent.” This is even more so for children, who are often subject to other people’s consent and decision-making. For example, unlike adults, children often do not have full agency to make decisions about their participation in programs or services that may generate and record personal data. Even when children are offered a choice to opt in to or out of a service, they may not be provided with adequate support to assess the associated risks and benefits. In fact, privacy terms and conditions are often barely understood by educated adults, let alone children. There is a clear need for privacy terms that are more intelligible (for both adults and children), and for a higher bar when it comes to justifying and explaining data collection in relation to children.

Does anonymous data address some of these issues?

Aggregated, anonymized data is often held up as a solution to potential privacy violations and a way to balance the possibilities and risks offered by data. However, a variety of studies have shown that anonymized data is rarely a panacea and can often continue to pose risks even for adults, for example through re-identification (aka the “Mosaic theory”, which refers to aggregating different pieces of personal data so as to form a mosaic of a person’s identity). When it comes to children, data that is aggregated at the group or demographic level (e.g., to contain anonymized information for all children below a certain age) poses unique challenges. This is because children as a group are often uniquely vulnerable (and visible), due in part to the lack of attention to their rights to privacy and freedom of expression. Services or products targeted specifically at children (e.g., those based in a certain location or receiving services for a particular issue) can pose disproportionate risks to the entire targeted population. Aggregated “group data” about children therefore warrants additional ethical consideration. As it stands, current data protection policies and guidance are primarily geared toward individual-level privacy risks, and often fail to address considerations related to group data, particularly group data about children.

We are seeing more and more examples of misuse of data for political and other purposes. What are some of the implications of this, especially for children’s lives?

First and foremost, privacy and data responsibility are essential to children’s psychosocial growth. Having the freedom and autonomy to experiment with different identities, without prying eyes or chilling dataveillance, is important for children’s identity formation. Children’s capacities are evolving and lack of protection from persistent, invasive data generation and use could impact the way they see themselves and their futures. A sense of privacy can empower children, especially older children, to engage and build relationships with their peers more comfortably and confidently by giving them the option to decide which personal details to disclose and under what conditions. The misuse of data may thus also impact civic and political engagement among young people.

Further, when data is mishandled or when data violations occur, people typically lose trust in organizations or institutions (or in the broader information ecology). This, in turn, can lead to “privacy protective behavior”, in which individuals avoid seeking essential services out of fear of unauthorized data uses, and which more generally limits the potential benefits of technology. For children, as for many adults, loss of trust may be a formative experience and can have considerable impact. Distrust and privacy protective behaviors related to data misuse can have far-reaching consequences, such as refusal of health care, education, child protection and other public services.

As new technologies are implemented and the volumes of data increase exponentially, there is a very real risk that the rights of children, and the associated obligations of adults with respect to these rights, may be overlooked. Sometimes, this can happen because the requirements in place to protect children are difficult to monitor and maintain in a new technology ecology (though of course, this is no justification). Often, too, it is simply because the rights and interests of children are not prioritized or adequately considered when organizations implement data processes or systems. For example, data analysis may be undertaken by people who do not have expertise in research involving children. Similarly, service providers collecting children’s data are not always trained in how to ethically approach and manage this work.

How are algorithmic decision-making efforts and the rise of AI processes like machine learning impacting the children’s data responsibility space?

Potential issues persist when we look at the impact of algorithmic bias and of AI that has not been developed specifically with children in mind. Much attention has been drawn in recent years to the promise and pitfalls of algorithmic decision-making. While the use of AI in decisions can in some cases expedite processes, it can also contain hard-to-detect but nonetheless tangible biases that result in real adverse effects (e.g. on those seeking medical care, business loans, parole, or jobs). These risks are only heightened when it comes to children. Once again, children may have less agency or understanding when it comes to how AI and algorithms work; they may not even know that certain processes are the result of algorithmic assessment, modelling, or prediction. As described in UNICEF’s draft Policy Guidance on AI for Children, decision-making in international development, social service provision, and education systems is especially likely to be impacted by AI-driven mediation and filtering, often without the direct engagement or knowledge of children or their caregivers. In addition, if children are impacted by algorithmic bias, they may lack the resources or knowledge to respond or seek recourse. Finally, any decision-making that targets children and is based on AI that leverages population training data may fail to take into account children’s physiological and psychological differences from adults, with potential negative implications for their physical and mental health outcomes. It is therefore imperative that any Responsible Data Use for Children framework include a component dedicated to the role of AI and algorithms.

So, how do we each play a part in ensuring responsible and ethical collection and use of data in this era of escalating datafication?

More critical engagement around the lifecycle of child-related data is of paramount importance. Attention needs to be given to the collection, preparation, analysis, usage, sharing and storing of children’s data. To begin with, the public sector, businesses, and civil society organizations delivering data-related services for children need to better understand the associated risks – as well as opportunities – in an environment characterized by growing quantification and datafication. We should also be asking ourselves whether and how children and young people themselves – sometimes called ‘digital natives’ – are being involved in helping to address these issues.

The ethical challenges surrounding the datafication of children’s lives are ongoing and continually evolving. We welcome you to join a discussion about Responsible Data for Children by visiting RD4C.org.

Stefaan G. Verhulst is Co-Founder and Chief Research and Development Officer of the Governance Laboratory @NYU (GovLab), where he is building an action-research foundation on how to transform governance using advances in science, data and technology. Verhulst’s latest scholarship centers on how technology can improve people’s lives and on the creation of more effective and collaborative forms of governance. Specifically, he is interested in the perils and promise of collaborative technologies and how to harness the unprecedented volume of data to advance the public good.

Andrew Young is the Knowledge Director at The GovLab, where he leads research efforts focusing on the impact of technology on public institutions. Among the grant-funded projects he has directed are a global assessment of the impact of open government data; comparative benchmarking of government innovation efforts against those of other countries; a methodology for leveraging corporate data to benefit the public good; and crafting the experimental design for testing the adoption of technology innovations in federal agencies.

Thanks to Lara Mikocki and Gabrielle Berman for their input and review, and to Jaimee Dellipoali and Michelle Winowatan for their assistance.