Licensing Tutorial
Last updated on 2024-10-25
Overview
Questions
- What licenses are required for datasets or data products that are used?
- What licenses should we apply to created datasets?
Objectives
- Understand how to classify source code and data (input, intermediate assessment, final assessment)
- Understand recommended licenses for each data type and code
Introduction
The content of this lesson is drawn from the recommendations of the IPCC Task Group on Data Support for Climate Change Assessments (TG-Data; Huard et al., 2022).
Licensing of IPCC material, with clear and consistent meaning in all legal jurisdictions, is essential to facilitate its appropriate use to address pressing climate change challenges, while protecting the rights of data providers.
Callout
The IPCC reports and data are licensed separately!
IPCC reports are published under a copyright license that prohibits commercial use and the creation of derivative products unless permission is first obtained from the IPCC Secretariat. This license protects IPCC reports from distortion, since they are accepted by member governments (or approved, in the case of the Summary for Policymakers, and adopted, in the case of the Synthesis Report). If the same license were applied to data products, it would severely limit their usefulness and value. A different IPCC data license is therefore required to allow the creation of derivatives for research and the re-use of IPCC data-based products for national assessments and for adaptation and mitigation policies.
Classifying Data Types
TG-Data distinguishes three categories of data: input data, intermediate assessment data, and final assessment data.
Input data denotes the source data that underpins information in the assessment reports. It is typically authored by credible, authoritative, trusted sources, who decide under which license it is published.
Intermediate assessment data is the outcome of data processing and analysis performed during the assessment as an intermediate step towards final assessment data. Data counts as intermediate only if it has undergone enough non-trivial processing to be considered an original product, distinct from the input data.
Final assessment data refers to data that is directly presented in data tables or displayed graphically (e.g. as a line graph or a spatial map) in the report.
Source code refers to scripts, online code repositories, and software libraries written to create intermediate and final assessed data, as well as the figures included in the reports.
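The three data categories above can be sketched as a small decision rule. This is only an illustration, not part of the TG-Data recommendations: the `Dataset` fields, the `classify` function and the example datasets are hypothetical names chosen for this sketch. It assumes that being shown directly in the report takes precedence over the other criteria.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    INPUT = "input data"
    INTERMEDIATE = "intermediate assessment data"
    FINAL = "final assessment data"


@dataclass
class Dataset:
    # All fields are hypothetical, chosen for this illustration only.
    name: str
    # True if the assessment applied non-trivial processing to the source data.
    nontrivially_processed: bool
    # True if the data is shown directly in a report table or figure.
    presented_in_report: bool


def classify(ds: Dataset) -> Category:
    """Assign one of the three data categories to a dataset."""
    # Assumption: data presented directly in the report is always "final".
    if ds.presented_in_report:
        return Category.FINAL
    # Non-trivial processing makes the data an original, intermediate product.
    if ds.nontrivially_processed:
        return Category.INTERMEDIATE
    # Otherwise the data is still the provider's source data.
    return Category.INPUT


# Hypothetical example datasets.
examples = [
    Dataset("station observations from a data provider", False, False),
    Dataset("regional mean anomalies computed for the assessment", True, False),
    Dataset("time series plotted in a report figure", True, True),
]
for ds in examples:
    print(f"{ds.name}: {classify(ds).value}")
```

In practice, deciding whether processing is "non-trivial" is a judgment call that a boolean flag cannot capture; the flag simply records the outcome of that judgment.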
Licenses For Different Data Types
Input data shall be licensed under the same license terms and conditions imposed by the data providers. Input data copyright holders are encouraged to adopt well-known licenses that enable broad usage, including commercial use, and to avoid “ShareAlike” licenses.
Intermediate and final assessment data should be licensed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license, where this does not infringe the interests of relevant license holders. The Creative Commons family of licenses is designed to provide legal interoperability across virtually all jurisdictions.
When input datasets are published under restrictive licenses, waivers or exemptions can be sought for the IPCC assessment reports. These waivers should be negotiated with copyright holders by Working Group co-chairs, with guidance from TG-Data representatives.
These waivers would ensure that derivative products can be licensed by the IPCC under CC BY 4.0, and that the version used by the assessment report is curated in a long-term archive, either by the IPCC Data Distribution Centre (DDC) or by another trusted data repository. If exemptions cannot be obtained from the copyright owners, the applicable licenses of the input data will apply.
As with data, source code should be published under permissive (non-copyleft) open source licenses that do not restrict commercial use, to maximize its reusability.
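The licensing rules in this section can be summarized as a minimal sketch. The function name `recommended_license`, its parameters and the category strings are assumptions made for illustration; they are not defined by TG-Data. The license strings are SPDX identifiers, and `CC-BY-NC-4.0` stands in as a hypothetical example of a restrictive input license. The lesson does not name a specific code license, so the sketch returns a description for source code rather than a particular license.

```python
from typing import Optional

# SPDX identifier for Creative Commons Attribution 4.0 International.
CC_BY_4 = "CC-BY-4.0"


def recommended_license(
    category: str,
    input_license: Optional[str] = None,
    input_is_restrictive: bool = False,
    waiver_granted: bool = False,
) -> str:
    """Return the license suggested by the rules in this section.

    category is one of "input", "intermediate", "final" or "code"
    (hypothetical labels used only in this sketch).
    """
    if category == "input":
        # Input data keeps the terms imposed by the data provider.
        if input_license is None:
            raise ValueError("input data needs the provider's license")
        return input_license
    if category in ("intermediate", "final"):
        # Without a waiver, a restrictive input license still applies.
        if input_is_restrictive and not waiver_granted:
            if input_license is None:
                raise ValueError("restrictive input license must be given")
            return input_license
        # Otherwise assessment data is published under CC BY 4.0.
        return CC_BY_4
    if category == "code":
        # No specific license is named in the lesson, only its properties.
        return "permissive open source license (non-copyleft, commercial use allowed)"
    raise ValueError(f"unknown category: {category!r}")


print(recommended_license("final"))
print(recommended_license("intermediate", "CC-BY-NC-4.0", input_is_restrictive=True))
print(recommended_license("intermediate", "CC-BY-NC-4.0", input_is_restrictive=True, waiver_granted=True))
```

The sketch reduces "where this does not infringe the interests of relevant license holders" to a single flag; real cases need to be assessed against the actual license text.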
Challenge 1: Can you classify the following data types?
- A map used in the report
- Output from a CMIP6 model
- Model agreement on changes in temperature in a warming scenario
Solution
- A map used in the report: Final
- Output from a CMIP6 model: Input
- Model agreement on changes in temperature in a warming scenario: Intermediate
Key Points
- Data can be classified into input, intermediate assessment, or final assessment data.
- Input data shall be licensed under the same license terms and conditions imposed by the data providers.
- Data produced as part of the IPCC assessment, be it intermediate or final assessment data, shall be published, wherever possible, under the CC BY 4.0 license.