Data Governance Models and the Environmental Context

This blog was originally posted as a three-part series on the Open Environmental Data Project blog.

Part I

Part one of this series briefly introduces why collaborative governance is an interesting and attractive approach in the environmental context, and then defines the different types of governance models being discussed today.


Environmental problems are considered “wicked,” consisting of different complex systems interacting with each other and necessitating a diversity of approaches for solving. They categorically call for decision-support tools like environmental data and information, alongside principles on how best to use the knowledge products. The complexity of environmental problems manifests across the requirement to monitor different environmental parameters (such as air and water quality, and biodiversity), but also to adapt those data collection efforts to the context of existing environmental legal structures, regional politics and economic industry. Though there may be a clearly articulated goal, such as “we want to drink clean water,” getting to that goal takes people on a journey through these wickedly complex systems. Currently, the path to navigating this maze must account for other environmental factors that influence clean drinking water — soil and air quality, an evaluation of economic alternatives, and the consideration of laws that are not related to clean water, but perhaps migratory birds, and finding the correct institutional allies to accomplish clean drinking water goals.

Acknowledging that the Open Environmental Data Project’s work rests inside this wicked problem space, we are currently exploring how alternative models of data ownership, management and data quality assurance could provide fertile ground for solving some of the problems of the environmental data maze. Our starting point is to look at a number of (old and new) frameworks for how data is collectively governed and identify when and where they’ve been applied or could be applied to the environmental problem space.

Types of data governance models

As parts of the world enter the information age, where the subunit is data, there are many attempts to apply a variety of common resource-sharing structures to data. These generally fall across the following social and economic instruments: guilds, commons, collaboratives and trusts. Each approach has a slightly different purpose, but they all serve to enhance data sharing, management and stewardship practices, sometimes with a rights-based approach and careful attention to who benefits from the data. The following is a brief high-level discussion of the different types of models encountered today.

Data Guilds

A guild was a common social and economic structure in medieval Europe that allowed for collective governance over common trade industries like carpentry, masonry and steelwork. The model allowed communities to control the inputs (tools, raw materials, education/apprenticeship) and the outputs (houses, buildings, ships, etc.) of their trades, while receiving a fair wage for their work. There were restrictions on the tools that could be used to produce the goods, so that the work would meet and qualify for a standard. In the context of data, guilds tend to be considered ways to build the labor force required for a functioning data ecosystem. Many examples lie in the venture capital market, where firms are investing in products that meet certain structural components, or in the financial industry, where banks are building education and training programs into the construction of data analytics teams for financial product management.

Data Commons

In general, a commons can be defined as the combination of a resource, a community that gathers around that resource, and a set of rules to care for the resource and its resultant community. Most often it is invoked in the context of natural resource commons, such as water, air and land, but in the information age this has evolved into “peer-to-peer” sharing networks. These networks rely on building the enabling capacities of the internet for contributive actions. In other words, they allow a community to decide how to optimize a resource like water, air or land, and to define the rules for who might access and use that resource, and how. Many examples of this commons approach in the environmental data field attach commons nomenclature to a scientific data repository with established data sharing rules and agreements among different scientific and policy institutions.

Data Collaboratives

Data collaboratives are designed to tackle a specific problem that can only be solved, or could be better solved, with multiple data sources. The key to their functionality is carefully articulating the problem that needs a collaborative data solution. Their functionality and design rest on the desire to understand trends, and on the public reuse of the resultant data and of the projects that might arise from the data. The Netherlands Center for Big Data Statistics is experimenting with combining all government-collected data on community-level issues, visualizing it and sharing it back with the public to create possibilities for future collaborations. The Ag Data Commons by the USDA, meanwhile, is an attempt to combine data from a variety of scientific disciplines, like genomics, hydrology, soils, and economic statistics, and package it to support more informed science and policy decisions. For further examples of what qualifies as a data collaborative, see the GovLab's Data Collaboratives Explorer.

Data Trusts

A data trust can be defined as legal infrastructure that provides stewardship over data by a third-party entity. The core concept behind a data trust is a governance structure composed of a council of people who have a “fiduciary” role to the data entrusted to them — that role is then defined by the beneficiaries of the data and describes their rights to that data. It has mostly been applied within the context of healthcare and social media with attention to maintaining privacy and autonomy in how data can be used for commercial gain. However, the instrument could have applicability in other realms of data ecosystems where the ownership over data has implications for regulatory or stewardship purposes.

Part II

Part two of this series examines environmental use cases and how data governance models could be employed to address transparency, the deliberative process, benefit sharing or local level data control. We briefly researched a handful of trending environmental problems and suggested where a new style of data governance could enhance the problem-solving process.

Natural Resource Commons: Fisheries management across geo-political boundaries

The Columbia Basin Partnership Task Force is a partnership between the National Marine Fisheries Service (NMFS) and the National Oceanic and Atmospheric Administration (NOAA) to manage the salmon and steelhead in the Columbia River basin, which spans four states in the Pacific Northwest of the U.S. Their mandate is to create a list of recommendations on how to meet the long-term goals of species recovery, which influences economic and social activity along the Columbia River basin. One of the visions for the partnership is to restore the salmon and steelhead habitat, which has been declining since the 1800s despite 50 years of restoration effort. The largest challenge set out in this shared fisheries management initiative is not a lack of data, but a lack of shared common goals across all the organizations. However, the task force, as required by U.S. federal law, invites public input on its recommendations via existing public commentary systems like the Federal Register. An environmental data governance model, such as a collaborative, could explore how the feedback cycle of public comment into regional collective management decisions could be strengthened with pooled data about various management issues (catch size limits, market mechanisms, restoration efforts, species richness, water quality, etc.).

The U.S. unincorporated territory of Guam, located in the western Pacific, has been under U.S. jurisdiction for the past 50 years. During this time, the Western Pacific Regional Fishery Management Council, based out of Hawaii, has provided oversight for Guam’s natural resources and their revenue (mainly fisheries). However, we are currently in a historic moment where the oversight is being transferred back to the local government. There are many areas where legacy scientific institutions with years of environmental trend data will need to be integrated and transferred into a local context. An environmental data governance model, such as a data trust, could establish and prioritize the new rights and needs of the local Guam government while onboarding legacy systems whose prior federal data sharing and stewardship practices are perhaps not well suited to a local context.

Natural Resource Commons: Water and air management

There are multiple existing forms of public land trusts, which follow either a river system or a network of public and private lands. The Delaware River Basin, a project by The Trust for Public Land, contains over 300 miles of free-flowing rivers which cross three different states. The goal of the river basin trust is oversight and direction for any development projects along the river basin. It is possible to imagine a new environmental data governance structure that ensures disparate entities using the river basin have both access to accurate data about environmental quality and further avenues for providing input on proposed development along the river basin.

Due to the COVID crisis, cities across the world are now experiencing what life could be like without (or with decreased) air pollution. In Delhi, India, where the Air Quality Index consistently maxes out at the top of the scale (300 is considered unhealthy, 999 is where the monitors max out), residents are now seeing daily readings of 100 due to lockdown. Considering that fourteen of the top twenty most polluted cities in the world are located in India, this shift has many researchers considering ways in which they can leverage this change to advance cleaner air policies for managing traffic and congestion. Prior to COVID it was difficult to make the case about where the majority of air pollution in Delhi came from, but after the lockdown it is easier to identify that over 70% of air pollution in Delhi is locally generated, as opposed to coming from surrounding industry. This new knowledge calls for a stronger block-level air quality monitoring system which could feed into a city-level understanding of where the problem areas are. Because a block-level air quality monitoring program has the potential to identify local businesses and individuals as main sources of local pollution, privacy and stewardship concerns would need to be considered. A data collaborative, with rules for engaging with the data, could alleviate some of these concerns.

Extractive Industries: Long-term monitoring and land reclamation

The San Juan Generating Station, a coal-fired power plant with an adjacent coal mine, will close its coal burning plant by 2022. Though the San Juan Generating Station is central to the economic landscape of this area of New Mexico, the impacts to health outweigh the economic benefit. There is a proposal currently in review to transform the facility into a carbon capture and sequestration facility jointly owned by Enchant Energy and the City of Farmington, New Mexico, which would operate into 2035. Carbon capture and sequestration is a controversial and still unproven method. Though it would provide regional economic relief, a relatively new model of energy capture (predicted to be the largest facility carrying out this activity in the world) will need substantial monitoring at many levels. Along with state and federal guidance on environmental assessment, leveraging a new data governance model, such as a data collaborative, towards the shared goal of decade-plus monitoring of the retrofitted facility could be a way to serve a multivalent set of interests in concert: scientists who want to understand more about carbon capture and sequestration, residents concerned about public health, and policy makers who need to learn more about the economic risks and benefits of this model.

If the facility is not repurposed into a carbon capture and sequestration facility, in 2022 it would enter a period of post-closure in which decontamination and reclamation would be argued for (rather than simply shutting down operations and fencing off the old facility). A data governance model, such as a trust, with the ultimate goal of land remediation could provide for the long-term, multi-stakeholder governance of decisions made during the reclamation period.

Extractive industries: Environmental Impact Assessments and commentary

Near Iliamna, Alaska, a site called “Pebble Mine” boasts upwards of a billion dollars in valuable metals. The development of this mine was blocked during the Obama Administration, but recently received approval from the Trump Administration after a (contested) Environmental Impact Statement (EIS) was completed, which indicated that “under normal circumstances” there would be no significant environmental detriment or effect on fishing, the other significant economic activity of the region. However, the mine site development will include holding ponds, which have the potential to leach, and two pipelines, which have the potential to leak: one to carry concentrate and one to carry the natural gas that will power the mining operations. Though some community members and groups see the economic benefit of the mine, public commentary on the EIS demonstrated that there is a significant level of community concern about the potential for waterway contamination and thus danger to the local (Bristol Bay) fishery, a major source of regional economic livelihood.

If the mining project continues to move forward, there are two opportunities for an environmental data governance model, such as a collaborative, to form. The first is to meet the rapid environmental monitoring requirements laid out in an Environmental Impact Assessment. An environmental data collaborative could be a community-formed response and place of input in the monitoring process, with a governance structure that ensures fair representation of community-level datasets around multiple environmental parameters. The second opportunity is the potential for ongoing assessment to trigger public commentary periods, in which case a data collective could be used as a consolidating force to ensure distributed and representative public witness to impacts that might arise from mining activities.

Climate change: Economic management

New Orleans is increasingly experiencing local urban flooding, not just during tropical storms and hurricanes, but because of the increasing daily effects of climate change. New Orleans is the perfect storm: a city originally built barely above sea level, with poor drainage and pumping systems, sinking marshy land, and rapidly rising sea levels. On an ordinary, yet increasingly intense, stormy afternoon in New Orleans, it is now normal to receive a City of New Orleans “Nola Ready” alert telling residents that they can park their cars on the higher neutral ground (boulevards) to avoid vehicle flooding. In 2016, the Federal Emergency Management Agency (FEMA) rezoned half the population of the urban area out of “high-risk” flood zones, effectively loosening requirements on who was and was not required to have flood insurance. Though it may at face value seem an economic benefit to residents to not be required to have flood insurance, it creates an incredibly vulnerable situation in which residents, especially those in high-risk flood zones (which did not actually change in 2016, just the maps did), are not insured for anything from daily storm events to severe hurricanes.

A data governance model, such as a trust, in which residents pool information about block-level flooding that they themselves control, manage and govern could support residents in becoming engaged stakeholders in rezoning, in advocating for infrastructural improvements, and in demonstrating the additional protections that city residents need, including advocating for resource distribution under the National Flood Insurance Program. Controlling access to information in this case would be necessary, as this type of data could also lead to property devaluation, increased insurance rates and economic disincentives if mishandled.

Climate change: Adaptation

Water law in the United States is generally divided between “eastern” (east of Texas, minus Mississippi) and “western” states. Eastern states manage water under riparian law, under which you are able to use a body of water if your property abuts it. In western states, however, water law follows prior appropriation. Of note, tribal rights to water have additional layers of governance and recognition defined by different determinations, such as those under Winters v. United States and Arizona v. California. Though prior appropriation is applied differently state by state, its premise of “beneficial use” prompts people with priority water rights to use amounts of water that don’t always reflect their actual need, or risk losing the water that is apportioned to them. In Colorado, water laws that don’t reflect watershed and ecosystem health (especially in neighboring states), coupled with the encroaching effects of climate change and drought, have led to the increasing water shortage in the West.

Though many agree that the prior appropriation approach to water management is antiquated, for rights holders to maintain those rights, they have to buy into the “use it or lose it” system that dominates the Colorado River basin. In joint regional recognition of water insecurity and deprivation, a data governance structure like a collective could support the creation of a rights holder, municipal, state and regional tracking system. In this system, pooled data on water use could be used to advocate for revisions to water law in favor of collective action and management of water resources over individual rights.

Climate change: Mitigation and raw materials for renewable energy infrastructure

One of the key strategies for mitigating the effects of climate change is the effort to decouple fossil fuels from energy expenditure. Decoupling our world economy from fossil fuel use requires huge investment and growth in alternative sources of energy and the infrastructure to provide that energy. In the transportation industry, one route to decoupling is the rise of electric vehicles. The batteries that run electric cars require lithium, which has kicked off a lithium mining boom in places like Australia, Argentina, Bolivia and Chile. Lithium mining is a freshwater-intensive activity, and the majority of lithium deposits are found in dry and arid landscapes. Who will profit from this extraction is a highly geopoliticized debate. There is a novel opportunity to establish a data trust that ensures benefits are tracked back to original lithium sources, particularly for lithium mined on or adjacent to indigenous land.

Part III

Part three of this series briefly explores some of the important design considerations for applying a collective data governance model to an environmental issue area.

Criteria for an Environmental Trend Data System

A trend analysis draws on a collection of data points over time, which, depending on the time block and the object being monitored, could span seconds, minutes, hours, days, weeks, months or years. The value of trend analysis is to spot patterns and provide insight on how change is occurring over time. To arrive at a trend analysis we first have to imagine a data system that could provide these baseline data points. Thus a “good enough” environmental trend system could be composed of the following qualitative and quantitative existing information streams supplied by nonprofits, community science groups, government, academia and private industry: biodiversity and species richness data, soil health, air quality, river and lake water quality, recreational use and access, agriculture, public testimony/comment and oral history.
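To make the idea of a trend over time concrete, here is a minimal sketch of one possible calculation: a least-squares slope fitted to dated readings of a single parameter. The function name, the units and the sample readings are hypothetical and chosen only for illustration; a real system would need to handle gaps, seasonality and measurement quality.

```python
from datetime import date


def trend_slope(observations):
    """Return the least-squares slope (value units per day) of (date, value) pairs."""
    if len(observations) < 2:
        raise ValueError("need at least two observations")
    start = min(d for d, _ in observations)
    # Convert each date to the number of days since the first observation.
    xs = [(d - start).days for d, _ in observations]
    ys = [v for _, v in observations]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("observations must span more than one date")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical readings of one water quality parameter, taken every ten days.
readings = [
    (date(2020, 1, 1), 10.0),
    (date(2020, 1, 11), 12.0),
    (date(2020, 1, 21), 14.0),
    (date(2020, 1, 31), 16.0),
]
slope = trend_slope(readings)
print(f"change per day: {slope:.2f}")
```

A positive slope suggests the parameter is rising over the period and a negative one that it is falling. The same function could be run per parameter and per site to compare streams supplied by different contributors.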

These areas could inform a variety of community deliberative processes on: proposed new development, a judicial or legislative process that impacts how land is zoned, permit levels for a nearby industry, regional insurance rates, or even determining public access points for land and river recreation. However, these trend points would need to be complemented by contextual narrative about local economics, politics and other issues taking shape. These contextual inputs could be provided in the format of a formal or informal public commentary process as articulated by law, or through congressional representative meetings on environmental issue areas.

Scope and Intent

An environmental trend data system should provide and amplify a way to identify emerging patterns (air quality, water quality, biodiversity richness) and then alert interested parties to these trends. While the independent verification of these trends is one intent of the system, what could make this useful to communities and elected officials is a contextual component that qualifies ideas and actions to mitigate the emerging environmental problems. For example, if pollution of a waterway is a growing problem as identified in the environmental trend data system, then the system could offer a way for individuals within the community to exchange ideas on how they could mitigate this problem. This allows the trend system to diagnose and prescribe simultaneously.
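As a rough illustration of the "identify, then alert" step, the sketch below flags the points at which the rolling average of a monitored parameter rises above a chosen threshold. The function name, the window size, the threshold and the turbidity readings are all assumptions made for the example, not values taken from any monitoring program.

```python
from collections import deque


def rolling_alerts(values, window, threshold):
    """Return the indices at which the mean of the last `window` values exceeds `threshold`."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)  # keeps only the most recent `window` readings
    alerts = []
    for index, value in enumerate(values):
        recent.append(value)
        # Only evaluate once a full window of readings is available.
        if len(recent) == window and sum(recent) / window > threshold:
            alerts.append(index)
    return alerts


# Hypothetical daily turbidity readings for one stretch of waterway.
turbidity = [2.0, 2.5, 3.0, 6.0, 7.0, 8.0]
alerts = rolling_alerts(turbidity, window=3, threshold=5.0)
print(alerts)  # indices of the readings that would trigger a notice
```

Averaging over a window rather than reacting to single readings is one way to avoid alerting on a one-off spike. Who receives the alert, and what context accompanies it, would be decided by the governance structure rather than by the code.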

As discussed in the “understanding the environmental problem space” blog series, though there have been significant attempts at creating systems for data harmonization, storage and management, there are few long-term and successful examples, outside of use for educational purposes, that direct efforts towards impactfulness and usefulness of data.

Collaborative models of data governance are based early on in 1) the intent for data use (for instance, is the data to be used to inform an Environmental Impact Assessment?) and 2) solving for an explicit problem (community input on water health for this Environmental Impact Assessment). However, these collaborative models will need an additional layer of sociotechnical construction that supports the contributors/users in implementing the data for different scenarios. Though this sits outside of the construction of the governance models we’ve worked through, it is perhaps a unique need in spaces like environmental protection and management.


Here we offer principles that could guide the design of a new environmental trend data system within a data governance structure.

  • Transparency: To guarantee and nurture trust during the decision-making process and grow an informed civic body, transparency must be built into every part of an environmental trend data system.
  • Privacy and control: Because environmental problems are usually tied to a specific geographic region and identify the impacts of these problems down to individual households and/or identify where rare species can be found, there must be significant consideration for how privacy and control of data dissemination is handled within the environmental trend system. Conducting both an early and an ongoing assessment of the potential benefits and risks involved can help to solve for issues that might arise.
  • Independence: To ensure fair representation of all invested members, the structure should be independent/disinterested (not belonging to one entity) as designated by the governing structure.
  • Governance structure: Depending on the style of governance, an active member from each interested group should have a contractual duty to steward the independent entity. If there is no association with a formal organization or entity, governance should weigh how individual input is managed to ensure equitable representation. Governance should also define, early on, guidelines for group management, decision making, conflict management, and the roles and responsibilities of members, and outline how members might be governed under different rules and laws (such as tribal water rights in Colorado).
  • Problem definition: Collaborative models of data governance are based on binding the group in solving an existing problem that has been articulated prior to the development of pooled resources. However, sometimes the environmental problem, especially at the start of an emerging issue, can be difficult to define thoroughly enough.
  • Impact: Any collaborative solution to improving upon existing environmental data models should be focused on how that data can be impactful and usable when environmental decisions are being made. Otherwise the ability for data to serve as a mediator in facilitating decisions that work for and on behalf of people providing the data, will be limited. Additionally, involvement outside of the core governance structure should happen early on with people who will be responsible for implementing data at a further point.
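The privacy and control principle above can be made a little more tangible with a sketch. One possible approach, assumed here purely for illustration, is to release only grid-cell averages and to suppress any cell with too few contributing reports, so that a single household or sighting cannot be singled out. The cell size, the minimum report count, the function name and the sample reports are all hypothetical.

```python
import math
from collections import defaultdict


def aggregate_reports(reports, cell_deg=0.01, min_reports=3):
    """Average (lat, lon, value) reports per grid cell, dropping sparsely populated cells."""
    cells = defaultdict(list)
    for lat, lon, value in reports:
        # Snap each report to a coarse grid cell instead of keeping its exact location.
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        cells[key].append(value)
    # Release a cell only when enough independent reports fall inside it.
    return {
        key: sum(values) / len(values)
        for key, values in cells.items()
        if len(values) >= min_reports
    }


# Hypothetical resident flood-depth reports (latitude, longitude, depth in inches).
reports = [
    (29.9511, -90.0715, 4.0),
    (29.9523, -90.0731, 6.0),
    (29.9547, -90.0789, 8.0),
    (29.9702, -90.0512, 20.0),  # the only report in its cell, so it is withheld
]
cells = aggregate_reports(reports)
print(cells)
```

The right cell size and suppression threshold are governance decisions as much as technical ones, and would sit with the members who steward the data.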


Currently, technical platforms that emphasize community or local-level environmental input and consideration are usually funded in part by private foundations with some local community support. These funding streams are inadequate, and often these platforms die with the organization or initiative funding them. A core component to address is how this environmental data model could integrate with, amplify or augment civic-funded initiatives like a public commentary system. Alternatively, approaches could be made to identify how market-driven mechanisms for cooperative data models, like those in agriculture or renewables, could dedicate a portion of their revenue stream to public environmental trend data models.

Risks and Challenges

In several of the case studies, there were clear risk components identified. For instance, though flood protection in climate-sensitive municipalities such as New Orleans is important, the benefits of collaborative action have to be weighed against the potential for markets, and thus individuals, to be impacted in terms of both the economic valuation of homes and the cost of insurance. In the case of water rights in Colorado, though it is generally agreed that prior appropriation does not address the current realities of the climate crisis in the Colorado River basin, redistributing water rights that have been long held would have to be done with intentionality about how this might impact the many people in the region. In the case of urban air quality monitoring, managing potential economic conflict based on air quality readings would have to be at the center of data governance design.

As we’ve previously written in our “understanding the environmental problem space” blog series, there is also a social element that makes integrating collective governance structures a challenge. As identified in the case studies, most industries, regulators, assessors, etc. operate within their own definable communities of practice, in which approaches tend not to transcend or incorporate other sectors. To create collaborative structures in the environmental space, there will need to be a carefully curated epistemic cultural approach that can build on a diversity of methodologies and pathways to knowledge production in pursuit of creating shared practices within the collaborative structure. Creating an environmental data governance model that encourages collaborative agreement requires bridging different cultures of how problems are approached. These can include: government officials and administrative bureaucracy, community-implemented organizing, private technology companies and the profit bottom line, private landowners and individual rights, public lands and their history in military management.

Sometimes the environmental challenges identified as candidates for alternative data governance approaches require social and cultural adjustment more than the creation of a technical system. Because of this, a risk in creating a data collaborative structure for an existing environmental problem, say cleaning up a historically polluted waterway, is that the new data collaborative structure simply creates a technical band-aid. As mentioned in our problems blog post, sometimes the most difficult troubleshooting must occur at the sociocultural level within public agencies that are required by law to protect the polluted waterways.




Building environmental hardware interoperability while changing the way data is shared, verified & used. Learn more at
