Licensing guide
Bamboo Filing Cabinet defaults
- Code: Apache-2.0.
- Data we publish (original/derived and permitted): CC-BY.
- Documentation: CC-BY (unless stated otherwise).
License templates
- Use these templates when adding LICENSE files to new repos.
- Apache-2.0 text: licenses/apache-2.0.txt (canonical: https://www.apache.org/licenses/LICENSE-2.0)
- CC-BY 4.0 legal code: licenses/cc-by-4.0.txt (canonical: https://creativecommons.org/licenses/by/4.0/legalcode)
General rules
- Always include a license for code (Apache-2.0 is the default).
- Do NOT claim a license the source does not grant.
- If source licensing is unclear, mark the data license as UNKNOWN and keep provenance strong.
- Prefer permissive licenses for data (CC-BY is the default for BFC-produced data).
- Avoid restrictive licenses (e.g., CC-BY-NC, CC-BY-ND) that limit reuse.
- When combining datasets, ensure license compatibility.
- Document licensing clearly in dataset metadata (e.g., dataset card or README).
- When in doubt, seek legal advice or consult licensing experts.
What to document per dataset
- Source URLs + retrieval dates.
- Source licensing statements or links (if any).
- What we changed (transforms, normalization, schema).
- Code license and data license.
- Attribution format for CC-BY reuse.