The CARE Principles for Indigenous Data Governance

Concerns about secondary use of data and limited opportunities for benefit-sharing have focused attention on the tension that Indigenous communities feel between (1) protecting Indigenous rights and interests in Indigenous data (including traditional knowledges) and (2) supporting open data, machine learning, broad data sharing, and big data initiatives. The International Indigenous Data Sovereignty Interest Group (within the Research Data Alliance) is a network of nation-state based Indigenous data sovereignty networks and individuals that developed the ‘CARE Principles for Indigenous Data Governance’ (Collective Benefit, Authority to Control, Responsibility, and Ethics) in consultation with Indigenous Peoples, scholars, non-profit organizations, and governments. The CARE Principles are people- and purpose-oriented, reflecting the crucial role of data in advancing innovation, governance, and self-determination among Indigenous Peoples. The Principles complement the existing data-centric approach represented in the ‘FAIR Guiding Principles for scientific data management and stewardship’ (Findable, Accessible, Interoperable, Reusable). The CARE Principles build upon earlier work by the Te Mana Raraunga Māori Data Sovereignty Network, US Indigenous Data Sovereignty Network, Maiam nayri Wingara Aboriginal and Torres Strait Islander Data Sovereignty Collective, and numerous Indigenous Peoples, nations, and communities. The goal is that stewards and other users of Indigenous data will ‘Be FAIR and CARE.’ In this first formal publication of the CARE Principles, we articulate their rationale, describe their relation to the FAIR Principles, and present examples of their application.

Keywords: Indigenous; data sovereignty; data governance; data principles; FAIR principles

Introduction
As the world engages with open data, big data, data reuse, and open science, data have increasingly become a global resource used for wielding power, making decisions, spurring innovation and discovery, and pursuing commercialization (Independent Expert Advisory Group 2014; The Economist 2017). The effects of this shift are amplified by the adoption of artificial intelligence technologies and the increasing convergence of biological and digital worlds facilitated through data sharing platforms. Historically plagued by data inequities and data exploitation, Indigenous Peoples have raised concerns about the need to integrate Indigenous knowledges and approaches into data practices and policies as both the volume and opportunities for secondary use of data increase (Jackson et al. 2019; Kukutai & Cormack 2019; Carroll et al. 2019; Garrison et al. 2019; Rainie et al. 2019; Kukutai & Taylor 2016b). The articulation of Indigenous Peoples’ rights and interests in data about their peoples, communities, cultures, and territories is part of reclaiming control of data, data ecosystems, data science, and data narratives in the context of open data and open science. Control, coupled with a focus on collective benefit and equity, repositions Indigenous Peoples, nations, and communities from being subjects of data that perpetuate unequal power distributions to self-determining users of data for development and wellbeing. Together, harnessing data for governance and acting in the governance of data shift Indigenous Peoples from invisibility within data ecosystems to vibrant contributors to data policies, practices, ethics, and innovation.

Background
Global Indigenous presence spans over 90 countries and includes over 370 million Indigenous persons living out more than 5,000 distinct cultures (UN 2009). Indigenous Peoples have continuity with their pre-colonial societies and seek to use their own social, political, and economic systems to preserve, develop, and transmit to future generations their cultures, knowledges, and relationships with their territories and resources (Martinez Cobo 1982). While often limited by settler-colonial oppression and definition, Indigenous Peoples retain their rights to self-determine their forms of collective governance and rules of membership.

In the contemporary world, Indigenous Peoples and nations, people, and communities have become actors across global societies. Indigenous Peoples and their nations are the rightsholders as affirmed by the United Nations and nation-state legal recognition of Indigenous sovereignty (Echo-Hawk 2013; UN 2007; Anaya 2004). Indigenous Peoples as sovereign political collectives often participate in supra- and intertribal organizations such as the Oceti Sakowin Oyate and the Iwi Chairs’ Forum, and have local or regional sub-systems such as Tohono O’Odham districts in the US and the Federation of Victorian Traditional Owner Corporations in Australia. Indigenous people often live as citizens of nations (Indigenous) within nations (the nation state, e.g., the United States). Indigenous people also participate in other Indigenous stakeholder groups that do not hold inherent sovereignty, including daily life within Indigenous communities (groups of individuals) within their nations or mainstream society and interaction with Indigenous organizations such as health care entities, non-profits, and businesses.

Ongoing processes of colonization of Indigenous Peoples and globalization of Western ideas, values, and lifestyles have resulted in epistemicide, the suppression and co-optation of Indigenous knowledges and data systems (Carroll, Rodriguez-Lonebear & Martinez 2019; Kukutai & Taylor 2016a; Kukutai & Walter 2015; Smith 2012). These processes have limited the ability of Indigenous Peoples to recover, develop, and sustain their knowledges, an ability that is central to Indigenous Peoples’ capacity to realize their human rights and fulfill their responsibilities (Wilson, 2004; Ratima 2001).

Alongside Indigenous aspirations for self-determination, social justice movements have promoted the importance of equity within society and become more deliberate about addressing the negative effects of unconscious bias and structural racism. The pervasiveness of bias across data ecosystems (digital infrastructures, analytics, and applications), within sub-systems (regions/platforms), and within institutions and communities indicates the importance of addressing inequities at multiple levels. Indigenous people recognize and already incorporate this multi-scale complexity in their knowledge systems and activities; however, this may not be the case for other communities of practice or knowledge systems.

The Emergence of Indigenous Data Sovereignty
Since the 1970s, there has been a resurgence in discourse around Indigenous knowledges, identities, and rights, culminating in the 2007 United Nations Declaration on the Rights of Indigenous Peoples [UNDRIP] (UN 2007). UNDRIP reaffirms Indigenous Peoples’ rights to self-determination as political entities and honors the principle of Indigenous control over Indigenous data (Tsosie 2019; Davis 2016). The rights articulated in UNDRIP (especially Article 31) also reflect discourse around Indigenous Cultural and Intellectual Property Rights (ICIP) and Indigenous research ethics (Mātaatua Declaration 1993; Julayinbul Statement 1993; Janke 1998, 2004; Drugge 2016; Tsosie 1997, 2019; Anderson 2009, 2015). UNDRIP reflects a broad approach to Indigenous data that is not restricted by mainstream conceptions of knowledge and intellectual property (Posey & Dutfield, 1996).

Indigenous Peoples’ data include data generated by Indigenous Peoples, as well as by governments and other institutions, on and about Indigenous Peoples and territories, along with information about Indigenous communities and the individuals, Indigenous and non-Indigenous, who live within them (Carroll, Rodriguez-Lonebear & Martinez 2019; Rainie et al. 2019; MnW 2018; Nickerson 2017; TMR 2016). Indigenous Peoples’ data comprise (1) information and knowledge about the environment, lands, skies, resources, and non-humans with which they have relations; (2) information about Indigenous persons, such as administrative, census, health, social, commercial, and corporate data; and (3) information and knowledge about Indigenous Peoples as collectives, including traditional and cultural information, oral histories, ancestral and clan knowledge, cultural sites, stories, and belongings.

The advancement of Indigenous self-determination and the reclamation of Indigenous identity and knowledges over the past 50 years has led to the emergence of Indigenous Data Sovereignty, an assertion of the rights and interests of Indigenous Peoples in relation to data about them, their territories, and their ways of life (Carroll, Rodriguez-Lonebear & Martinez 2019; Rainie et al. 2017; Kukutai & Taylor 2016a; Snipp 2016). Indigenous Data Sovereignty has become an increasingly relevant topic as big data, open data, open science, and data reuse become an integral part of research and institutional practices (Carroll, Rodriguez-Lonebear & Martinez 2019; Lovett et al. 2019; Rainie et al. 2019; Kukutai & Taylor 2016a; Taylor & Kukutai 2015). Given that most Indigenous data are held by non-Indigenous governments, institutions, and agencies, increasing Indigenous Peoples’ participation in data governance activities is central to realizing Indigenous Data Sovereignty (Carroll, Rodriguez-Lonebear & Martinez 2019; Rainie et al. 2019; Kukutai & Taylor 2016b; Rodriguez-Lonebear 2016; Smith 2016; Snipp 2016). Indigenous data governance includes both the stewardship and the processes necessary to implement Indigenous control over Indigenous data (collection, storage, analysis, use, reuse) (Carroll, Rodriguez-Lonebear & Martinez 2019; Rainie et al. 2019; Walter 2018; Rodriguez-Lonebear 2016; Smith 2016). The development of conceptual frameworks to inform processes for stewardship and control of Indigenous data has been ongoing in areas such as science, research methods, and research governance (Cajete, 2000; Wilson, 2008; Kovach, 2009; McGregor, Restoule & Johnston, 2018). There is a need to apply insights generated from these frameworks within data ecosystems to help address the challenge of producing data that reflect Indigenous Peoples’ interests and governance needs.

Rationale
Mainstream values related to research and data are often inconsistent with Indigenous cultures and collective rights. For instance, data historically belonged to the researcher in mainstream settings, a notion that has shifted under open data and open science movements where publicly funded research requires data sharing. Meanwhile, Indigenous worldviews have typically centered ‘people’ and ‘purpose’ through governance processes that emphasize collective ownership and control of data (Carroll, Rodriguez-Lonebear & Martinez 2019; Kukutai & Taylor 2016b; Smith 2016; Snipp 2016). Indigenous Peoples have been working on ways to authenticate Indigenous forms of knowledge and to fortify their rights to govern research and resulting data (Garrison et al. 2019; Hudson et al. 2016; Smith 2012).

Drawing on the First Nations principles of OCAP® that created data standards for Ownership, Control, Access, and Possession in Canada in the 1990s (FNIGC 2019), Indigenous Data Sovereignty networks in Aotearoa New Zealand, Australia, and the United States, as well as Indigenous scholars, leaders, and allies, determined that there was an urgent need to develop global principles for the governance of Indigenous data. Underlying this need were the primary goals of (1) fostering Indigenous self-determination by enhancing Indigenous use of data for Indigenous pursuits and (2) honoring the ‘FAIR Guiding Principles for scientific data management and stewardship’ (Findable, Accessible, Interoperable, Reusable) (Wilkinson et al. 2016), while ensuring data sharing on Indigenous terms.

The ‘CARE Principles for Indigenous Data Governance’ (Collective Benefit, Authority to Control, Responsibility, and Ethics) (RDA IG 2019) empower Indigenous Peoples by shifting the focus from regulated consultation to value-based relationships that position data approaches within Indigenous cultures and knowledge systems to the benefit of Indigenous Peoples (Castellano 2004; Anderson et al. 2003). This shift ultimately promotes equitable participation in processes of data reuse, which will result in more equitable outcomes.

Development
Academics and practitioners in the International Indigenous Data Sovereignty Interest Group within the Research Data Alliance (RDA 2017) acknowledged the tension between protecting Indigenous rights and interests in data and supporting open data. With this challenge in mind, a workshop was held in Gaborone, Botswana, on Indigenous Data Sovereignty as part of International Data Week in 2018. This Indigenous-led workshop convened Indigenous and allied academics and practitioners to draft principles for Indigenous data governance. Workshop organizers prepared an analysis of principles from frameworks issued by Indigenous Data Sovereignty and governance networks, as well as by mainstream sources, to inform dialogue (Table 1).

Table 1: Key Documents Contributing to the Development of the CARE Principles.

Participants concluded that existing frameworks were oriented toward data, people, and purpose (see Figure 1). While both Indigenous and mainstream principles identified data-centric principles (such as those named in the FAIR Principles), the Indigenous frameworks emphasized people- and purpose-oriented principles.

Workshop participants felt that the mainstream frameworks adequately covered data-centric principles and focused their attention on the ‘people’ and ‘purpose’ foci of the Indigenous principles to create the CARE Principles. Once the four primary CARE Principles of Collective benefit, Authority to control, Responsibility, and Ethics were identified, participants broke into small groups to outline and draft the language for both primary and sub-principles. The resulting draft was shared among nation-state based network members, Indigenous leaders, Indigenous scholars, and allies for input (edits, revisions, comments, feedback) via electronic media, presentations, and engagements.

The CARE Principles are designed to complement the FAIR Principles and guide the inclusion of Indigenous Peoples in data processes that strengthen Indigenous control for improved discovery, access, use, reuse, and attribution in contemporary data landscapes.

Exposition
The CARE Principles address important considerations for modern data ecosystems and across data lifecycles that support both innovation and Indigenous self-determination (Figure 2). The Principles describe high-level actions applicable within research, government, and institutional data settings. The goal is for data stewards and other users of Indigenous data to implement CARE and FAIR Principles in tandem.

While each is conceptually distinct, the CARE Principles are related to each other. The Principles define rights, interests, and concepts to be employed in facilitating Indigenous control in data governance and reuse. An enacted mechanism can address a single Principle or multiple simultaneously. However, data governance must address the full set of Principles to demonstrate a CARE Full process.
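The coverage requirement above can be illustrated with a minimal sketch in Python. It assumes that each enacted mechanism has already been mapped to the Principle or Principles it addresses; the mechanism names, the `missing_principles` and `is_care_full` functions, and the mapping itself are hypothetical and are not part of the CARE Principles. The sketch is a bookkeeping aid only: whether a mechanism genuinely addresses a Principle is a judgment for the Indigenous Peoples concerned, not something code can determine.

```python
from enum import Enum


class Principle(Enum):
    """The four primary CARE Principles."""
    COLLECTIVE_BENEFIT = "Collective benefit"
    AUTHORITY_TO_CONTROL = "Authority to control"
    RESPONSIBILITY = "Responsibility"
    ETHICS = "Ethics"


def missing_principles(mechanisms):
    """Return the Principles that no mechanism addresses.

    `mechanisms` maps a mechanism name to the set of Principles it has
    been judged to address (hypothetical structure for illustration).
    """
    covered = set()
    for principles in mechanisms.values():
        covered |= set(principles)
    return set(Principle) - covered


def is_care_full(mechanisms):
    """True only when the full set of Principles is addressed."""
    return not missing_principles(mechanisms)


# Hypothetical mechanisms; one mechanism may address several Principles.
example = {
    "community data-sharing agreement": {
        Principle.AUTHORITY_TO_CONTROL,
        Principle.ETHICS,
    },
    "data capability training program": {Principle.RESPONSIBILITY},
}
```

In this hypothetical example, `missing_principles(example)` returns only `Principle.COLLECTIVE_BENEFIT`, so the process would not yet count as CARE Full until a mechanism addressing that Principle is added.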

The CARE Principles and Their Supporting Concepts
Indigenous data must facilitate collective benefit for Indigenous Peoples to achieve inclusive development and innovation, improve governance and citizen engagement, and realize equitable outcomes. Benefits accrue when data ecosystems are designed and function to support (1) Indigenous nation and community use and reuse of data; (2) use of data for policy decisions and evaluation of services; and (3) creation and use of data that reflect community values.

UNDRIP affirms Indigenous Peoples’ rights and interests in their data. Recognition of these rights bolsters Indigenous Peoples’ authority to control and govern such data, further affirming the need for ‘data for governance.’ Indigenous Peoples must have access to data that support Indigenous governance and self-determination. Indigenous Peoples must be the ones to determine data governance protocols, while being actively involved in stewardship decisions for Indigenous data that are held by other entities.

When working with Indigenous data, there is a responsibility to nurture respectful relationships with Indigenous Peoples from whom the data originate. Aspects of the relationship include investing in capacity development, increasing community data capabilities, and embedding data within Indigenous languages and cultures. Pursuing these goals fulfills the ultimate responsibility of supporting Indigenous data that advance Indigenous Peoples’ self-determination and collective benefit.

Indigenous Peoples’ rights and wellbeing should be the focus across data ecosystems and throughout data lifecycles in order to minimize harm, maximize benefits, promote justice, and allow for future use. Paramount to ethics in data practices are the representation and participation of Indigenous Peoples, who must be the ones to assess benefits, harms, and potential future uses based on community values and ethics.

The CARE Principles in Action
The usefulness of high-level Indigenous data governance principles relies on both Indigenous communities and research/data communities to understand operative concepts and to apply them preemptively across data ecosystems and lifecycles. Creating opportunities for non-Indigenous understanding and for Indigenous leadership in data processes will contribute to the development of data and data systems that can lead to Indigenous innovation and development.

Early Adopters of the CARE Principles
Defining principles is an important step toward exercising Indigenous Data Sovereignty and addressing longstanding concerns about data governance. Some organizations are adopting the CARE Principles through relationships, policies, and mechanisms. Examples include the Research Data Alliance, the Smithsonian Institution, and the Open Data Charter.

Research Data Alliance (www.rd-alliance.org), in research environments
In 2016, the Research Data Alliance (RDA) hosted a workshop on Indigenous Data Sovereignty. Recognizing the importance of enhancing Indigenous rights and interests in data and potential implications for mainstream data policy, the RDA supported the creation of the International Indigenous Data Sovereignty Interest Group (IG). The IG convenes existing Indigenous Data Sovereignty networks with other Indigenous and allied researchers, which has resulted in broader global connections, a deeper focus on research data, and the CARE Principles (Lovett et al. 2019; Rainie et al. 2019; RDA IG 2019; RDA 2017). In further support of the IG and the CARE Principles, the RDA formally affiliated with the Global Indigenous Data Alliance (GIDA), which hosts the Principles. The affiliation affirms the importance of both CARE and GIDA, and represents an effort to promote implementation of the Principles throughout the RDA community (RDA 2019).

The Smithsonian Institution (www.si.edu), in collections environments
In 2012, the National Museum of Natural History at the Smithsonian Institution in Washington, DC released an Access and Benefit Sharing Policy on Genetic Resources in response to the Nagoya Protocol (2010) addressing Indigenous expectations around benefit sharing, prior informed consent, and mutually agreed terms for collections that it stewards (UN 2011; SI 2012). Subsequently, the US government released the new Open Access to Research Policy in 2013 as an important step in providing unprecedented access to research and data (Stebbins 2013). However, this policy (and others like it) treats all data in the same way: without attention to the nature, the content, or the conditions under which the data were created in the first place (Creative Commons 2019; Wikimedia 2019).

In December 2019, Smithsonian Directive 609 was released (SI 2019). With specific focus on digital asset access and use, it sets forth Smithsonian policy for access and use of Smithsonian digital assets ahead of the expected release of the institutional Open Access Statement in February 2020. SD 609 is important because it helps identify specific categories of material that might need more attention and care in their management. One of these is sensitive content, which is:

"defined in different ways by members of individual communities, nations, tribes, ethnic groups, and religious denominations, but may include materials that relate to traditional knowledge and practices. Such materials may: a) be considered the private domain of specific individuals, clans, cults or societies; b) require an appropriate level of knowledge to view and understand; c) threaten the privacy and well-being of a community when exposed or disclosed to outsiders; and/or d) give offense if inappropriately used or displayed, or when appropriated or exploited for commercial purposes. SD 609 Restriction Code: B1. (SI 2019)"

This identification offers an important opportunity to further explore the incorporation of the CARE Principles into the Smithsonian’s imminent Open Access policy. Recognizing complexity within Smithsonian digital assets and accompanying metadata, the CARE Principles would add an important layer of relational and temporal governance for Indigenous collections within an open access logic. Additionally, there are further opportunities for the CARE Principles to be directly extended into collection management practices. For example, work is about to begin to develop a unique set of ‘Collection CARE Notices’ for Indigenous collections at the Smithsonian and then into other US institutional contexts. Extending the current Cultural Institution Notices developed by Local Contexts (LC 2018), these new Notices will reflect inherent relationships of care, responsibility and governance in collection stewardship within the institution. As they are developed, these Notices will integrate the CARE Principles directly, embedding these into the Smithsonian’s cataloguing and digital infrastructure for Indigenous collections. By providing item- and collection-specific information, the Collection CARE Notices will function as a direct mechanism to assist in management and decision-making consistent with CARE Principles.

Open Data Charter (opendatacharter.net), in government environments
After release of the CARE Principles, the Open Data Charter (ODC) Implementation Working Group hosted a conversation on the Principles (Stone and Calderon 2019). Discussion focused on implementation mechanisms as well as resonance among ODC, FAIR, and CARE Principles (Figure 3). While ODC Principles reflect all four FAIR Principles, they directly address only the CARE Principle of ‘Collective benefit.’ The ODC Principle of ‘Open by default’ does not directly map to the FAIR or CARE Principles. During a review of ODC Principles in 2018, open data community members hotly debated ‘Open by default,’ showing a broad range of interpretations ranging from construing ‘Open by default’ as a key tenet of open data to considering the principle as tough to define and operationalize (Stone and Calderon 2019). The CARE Principles can help refine and specify what ‘Open by default’ means by (1) ensuring roles for Indigenous Peoples and other marginalized communities in open data decision-making and (2) using mechanisms to protect access to data as indicated by these communities to minimize harms and maximize benefits.

While implementation has commenced in these examples and elsewhere, further research is needed (1) to identify mechanisms that support implementation of the CARE Principles; (2) to create tools, policies, and practices that enact the Principles; (3) to explore the application of the Principles in different contexts, such as research repositories, knowledge co-production, institutional policies and practices, and, retrospectively, in already existing databases and systems; and (4) to operationalize the Principles in tandem with the FAIR Principles.

Conclusion
Being CARE Full is a prerequisite for equitable data and data practices. The CARE Principles draw from, integrate, and build on the work of mainstream stakeholders focused on data for reuse (e.g., FAIR Principles) and the efforts of Indigenous-led networks and coalitions focused on Indigenous data governance and research control. While centering data in the FAIR Principles complements other efforts to inform responsibilities for data producers and repositories, the CARE Principles extend that work to actions that align with the ‘people’ and ‘purpose’ for which data exist and are used. As the mainstream community of data stakeholders advances standards and practices to facilitate data reuse and research reproducibility, the CARE Principles enhance that work by reversing historical power imbalances. The CARE Principles address historical inequities by creating value from Indigenous data in ways that are grounded in Indigenous worldviews and by realizing opportunities for Indigenous Peoples within the knowledge economy.

In addition, increasingly robust management and stewardship frameworks for Indigenous digital resources to the benefit of Indigenous Peoples have broader implications for other marginalized populations, highlighting the importance of local and group control for the quality and reproducibility of research and data (Rainie et al. 2019). Furthermore, the CARE Principles address issues of relevance for many populations (e.g., privacy, future use, reuse, stewardship), to the extent that the Principles provide a normative foundation for setting standards, crafting policy, and developing research agendas in relation to data acquired about populations as collectives. Those populations might be minority groups, distinctive communities, or other collectives wanting to maintain high levels of trust and accountability in the use of data about their communities.

The implementation of the CARE Principles in tandem with the FAIR Principles will result in data that reflect the realities of Indigenous Peoples, are useful for Indigenous purposes, and remain under Indigenous control, while promoting knowledge discovery and innovation. The CARE Principles are a guide for data producers, stewards, and publishers to affirm Indigenous rights to self-determination through CARE Full data practices that will ultimately address complex issues related to privacy, future use, and collective interests, and increase the value of data for reuse. Just as the FAIR Principles are a precursor to good data management and stewardship practices, the CARE Principles encourage data users to be FAIR and CARE.

Acknowledgements
The original workshop “Indigenous Data Sovereignty Principles for the Governance of Indigenous Data,” was organized by Stephanie Russo Carroll and Maui Hudson, in collaboration with the Research Data Alliance (RDA) International Indigenous Data Sovereignty Interest Group at the International Data Week held Thursday, 8 November 2018, in Gaborone, Botswana. The Principles described in this manuscript represent voluntary contributions and participation of the authors at, and/or subsequent to, this workshop and from the wider US Indigenous Data Sovereignty Network, Te Mana Raraunga Māori Data Sovereignty Network, and Maiam nayri Wingara Aboriginal and Torres Strait Islander Data Sovereignty Collective communities. We acknowledge the Indigenous Peoples of Botswana, on whose land the Principles emerged, as well as Indigenous Peoples worldwide. We are grateful to the RDA for the workshop location and to the RDA US for travel support for some attendees. This paper was supported by the RDA Europe 4.0 project that has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 777388. Thank you to the following individuals for their comments, edits, and suggestions: Randy Akee, Leah Ballantyne, Donna Cormack, Dominique David-Chavez, Bhiamie Eckford-Williamson, Nanibaa’ Garrison, Sharon Hausam, Lydia Jennings, Tahu Kukutai, Kelsey Leonard, Christina Ore, Qunmigu (Kacey Hopson), Andrew Sporle, Michele Suina, Maggie Walter, the Alaska Native Policy Center at the First Alaskans Institute, and the attendees at the “The International Law, United Nations Declaration of the Rights of Indigenous Peoples and Indigenous Data Sovereignty” Workshop at the Oñati International Institute of the Sociology of Law. We also acknowledge Andrew Martinez’s contributions to designing the tables and figures.

Competing Interests
M.P. is Editor-in-Chief of the Data Science Journal, but played no role in the review and editing of this paper. The authors have no other competing interests to declare.

Author Contributions
S.R.C. was the primary author of the manuscript, and coordinated and participated extensively in the drafting and editing of the CARE Principles. S.R.C., I.G., and D.R.L. received partial support from the Morris K. Udall and Stewart L. Udall Foundation. As senior author, M.H. was significantly involved in the drafting of the CARE Principles and this manuscript text. I.G. was significantly involved in the drafting of the manuscript. J.A. contributed a case that illustrates the emerging implementation of the CARE Principles. All other authors are listed alphabetically, and contributed to the manuscript by their participation in the initial workshop and/or by editing or commenting on the manuscript text.