What is Data Ethics for us?

Please see Hajek, K.M., Trauttmansdorff, P., Leonelli, S., Guttinger, S., Milano, S. (2025) How to Foster Responsible and Resilient Data. Computer [IEEE Computer Society] https://doi.ieeecomputersociety.org/10.1109/MC.2024.3522696.

The Ethical Data Initiative (EDI) aims to go beyond devising a fixed set of moral principles or a code of conduct for producing or working with data, such as rules on privacy and ownership. Principles alone are not enough because they are often too abstract; they acquire concrete meaning when they are interpreted in practice. We thus define data ethics as a living ethos: a responsible way of approaching the creation, transmission, storage, and (re-)use of data, which provides resources to anticipate, identify, and address challenges.

This ethos is necessarily dynamic: it needs to be continuously informed by local settings and framed by context-dependent methods and practices. It entails paying attention to the processes of decision-making and the particularities that make up every step along a data journey, that is, the path data travel from the initial moment of creation in a lab, field, or digital encounter, through inscription in, for instance, an open-source repository, to interpretation and re-use by a third-party actor. When assessed in terms of specific contextual steps in that journey, ethical principles and guiding values become meaningful and actionable.

Ethical attention to data means bearing in mind the context in which data are created: the choices made by any humans involved in the process, and the ways data are configured by material objects, environments, and apparatus. The EDI fosters precisely this attention to choices and their implications in data work. We aim to make data practices more responsive to social settings, more responsible regarding their consequences, and more resilient against possible misuse or misappropriation.

Without attention to choices, context, and consequences, the data and knowledge we produce “threatens to blindly privilege specific ways of knowing” (Leonelli 2022), and particularly to privilege the interests of powerful groups in society, often to the detriment of the most vulnerable. The contexts in which data are handled are hugely diverse, and the implications of data practices should be assessed in light of that diversity, with recognition that each domain has its specific forms of expertise. Recognising diversity does not mean that anything goes. Rather, it means ensuring that the criteria used to evaluate data work are relevant and appropriate to the research in question, in light of its goals, situation, and methods.

Attending carefully to the consequences and contingencies of each step on a data journey requires targeted ways of integrating data ethics within a given situation, and particularly of dealing with the conflicts that often arise between different values. Examples of such conflicts include: the need to preserve privacy while still fostering openness in research on medical data; the wish to make data widely accessible while also tracking their provenance and uses, thereby making data users accountable; and efforts to share data fairly among researchers while also recognising the ownership claims of those who may have invested most in creating and disseminating the data. In trying to address such diverging expectations in concrete cases of data work, it is crucial to be alert to what choices are made and why, who benefits from those choices, and how they may affect relevant publics. Such analysis helps to unearth data-related discrimination and bias, and makes data practices more accountable and more open to scrutiny and critique.

Approaching data ethically is about prioritising human judgement over what may be technologically feasible or best adapted to computational analysis. As the use of data evolves constantly, so must our awareness of how they are created and processed, what constraints that entails, and who benefits or is disadvantaged. We advocate making data work a thoughtful process, which engages with relevant stakeholders and environmental concerns, rather than rushing towards technological convenience or the appearance of innovation.

For the EDI, responsible data work ultimately aims to improve the living conditions of all creatures on Earth, which involves being attuned to the plurality of human experience and the preservation of our environment. This means valuing the diversity of the humans who take part at every stage of data work, and building mechanisms to address persistent injustices and inequities in their visibility and ability to access resources. Similarly, harms affecting non-human organisms and the environment as a whole will only be effectively reduced and prevented when they are given attention throughout data practices.