Licensing guide

Bamboo Filing Cabinet defaults

  • Code: Apache-2.0.
  • Data we publish (original or derived, where the source license permits): CC-BY.
  • Documentation: CC-BY (unless stated otherwise).

License templates

  • Use these templates when adding LICENSE files to new repos.
  • Apache-2.0 text: licenses/apache-2.0.txt (canonical: https://www.apache.org/licenses/LICENSE-2.0)
  • CC-BY 4.0 legal code: licenses/cc-by-4.0.txt (canonical: https://creativecommons.org/licenses/by/4.0/legalcode)
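As a minimal sketch of how a template might be copied into a new repo: the two template paths are the ones listed above, while the function name add_license, the templates_root argument, and the default destination name LICENSE are assumptions made for illustration, not an existing BFC tool.

```python
from pathlib import Path
import shutil

# Template paths as listed in this guide, relative to a templates root.
TEMPLATES = {
    "apache-2.0": Path("licenses/apache-2.0.txt"),
    "cc-by-4.0": Path("licenses/cc-by-4.0.txt"),
}


def add_license(templates_root, repo_dir, license_id, dest_name="LICENSE"):
    """Copy a license template into repo_dir (hypothetical helper).

    Refuses to overwrite an existing file so a repo's current license
    is never replaced silently.
    """
    if license_id not in TEMPLATES:
        raise ValueError(f"no template for {license_id!r}")
    src = Path(templates_root) / TEMPLATES[license_id]
    dest = Path(repo_dir) / dest_name
    if dest.exists():
        raise FileExistsError(f"{dest} already exists; not overwriting")
    shutil.copyfile(src, dest)
    return dest
```

A repo that ships both code and data could call this twice with different dest_name values (for example LICENSE and LICENSE-DATA; the second name is also an assumption).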

General rules

  • Always include a license for code (Apache-2.0 is the default).
  • Do NOT claim a license the source does not grant.
  • If source licensing is unclear, mark the data license as UNKNOWN and record full provenance (source URLs, retrieval dates, and any licensing statements found).
  • Prefer permissive licenses for data (CC-BY is the default for BFC-produced data).
  • Avoid restrictive licenses (e.g., CC-BY-NC, CC-BY-ND) that limit reuse.
  • When combining datasets, confirm that the source licenses are compatible with each other and with the license applied to the combined output.
  • Document licensing clearly in dataset metadata (e.g., dataset card or README).
  • When in doubt, seek legal advice or consult licensing experts.
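Some of the rules above can be partly automated. The sketch below is one possible pre-publication check, not a legal compatibility test: it falls back to UNKNOWN when the source states no license, and it flags UNKNOWN, NC, and ND entries in a set of datasets to be combined. The function names and the assumption that licenses are written as SPDX-style identifiers (e.g., CC-BY-NC-4.0) are illustrative.

```python
def data_license_or_unknown(source_statement):
    """Return the license the source states, or "UNKNOWN" if it states none.

    Never guesses: an empty or missing statement is recorded as UNKNOWN.
    """
    if source_statement is None or not source_statement.strip():
        return "UNKNOWN"
    return source_statement.strip()


def review_flags(license_ids):
    """List entries that need human review before datasets are combined.

    This only flags obvious problems (UNKNOWN, NC, ND). It does not
    decide whether two licenses are legally compatible.
    """
    flags = []
    for license_id in license_ids:
        parts = license_id.upper().split("-")
        if license_id.upper() == "UNKNOWN":
            flags.append(f"{license_id}: source license unclear")
        elif "NC" in parts or "ND" in parts:
            flags.append(f"{license_id}: restrictive terms limit reuse")
    return flags


if __name__ == "__main__":
    sources = ["CC-BY-4.0", "CC-BY-NC-4.0", data_license_or_unknown("")]
    for flag in review_flags(sources):
        print(flag)
```

An empty result means only that nothing obvious was flagged; combinations still need the compatibility review described above.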

What to document per dataset

  • Source URLs and retrieval dates.
  • Source licensing statements or links (if any).
  • What we changed (transforms, normalization, schema).
  • Code license and data license.
  • Attribution format for CC-BY reuse.
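The checklist above could be captured as a small record stored alongside each dataset. The following is a sketch under stated assumptions: the class name, field names, and the attribution layout (title, author, source, license) are illustrative choices, not a format this guide prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetLicensingRecord:
    """Hypothetical per-dataset record covering the checklist items."""
    source_urls: list
    retrieval_dates: list          # ISO dates, one per source URL
    changes: str                   # transforms, normalization, schema
    code_license: str = "Apache-2.0"
    data_license: str = "UNKNOWN"  # never claim more than the source grants
    source_license_statements: list = field(default_factory=list)
    attribution: str = ""

    def missing_fields(self):
        """Return the names of checklist items that are still empty."""
        missing = []
        if not self.source_urls:
            missing.append("source_urls")
        if len(self.retrieval_dates) != len(self.source_urls):
            missing.append("retrieval_dates")
        if not self.changes:
            missing.append("changes")
        # Attribution text is only required when the data is CC-BY.
        if self.data_license.startswith("CC-BY") and not self.attribution:
            missing.append("attribution")
        return missing


def format_attribution(title, author, source_url, license_name):
    """Build a one-line attribution string (assumed layout)."""
    return f"{title} by {author}, {source_url}, licensed under {license_name}"
```

For example, a record with data_license set to "CC-BY-4.0" and no attribution text reports "attribution" as missing, which makes the gap visible before publication.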