Primary Data and Secondary Data

Primary data comes directly from emission sources, while secondary data relies on industry averages and emission factors from external references.


In carbon accounting and life cycle assessment (LCA), data quality plays a crucial role in ensuring reliable results. Two main types of data are used for carbon footprint calculations: primary data and secondary data.

What is primary data?

Primary data is collected directly from the emission sources themselves, for example from suppliers, production facilities, or logistics partners.
Because it reflects the actual processes of a specific company, it is generally considered the most accurate, highest-quality data available.

However, collecting supplier-specific primary data can be complex and time-consuming, due to:

  • Limited transparency in global supply chains
  • Data confidentiality concerns
  • Inconsistent reporting standards

These challenges make it difficult to achieve full visibility and completeness across all tiers of the value chain.

What is secondary data?

When primary data is not available, secondary data (also called reference data) fills the remaining gaps.
Secondary data is derived from verified external sources, such as:

  • Published scientific research and academic studies
  • Public or official national statistics
  • Corporate sustainability reports and company websites
  • Existing Life Cycle Inventory (LCI) databases

Secondary data helps ensure that CO₂e calculations can still be completed even when supplier-specific information is missing.
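
The gap-filling logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the Activity class, the footprint function, and all quantities and emission factors are hypothetical and not taken from any real dataset or from a specific tool. For each activity, the sketch uses the supplier-specific (primary) factor when one exists and falls back to the reference (secondary) factor otherwise. It also reports what share of the total is backed by primary data.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class Activity:
    """One line item of a footprint calculation (hypothetical structure)."""
    name: str
    quantity: float                            # amount of activity, e.g. kg of material
    primary_factor: Optional[float] = None     # kg CO2e per unit, supplier-specific
    secondary_factor: Optional[float] = None   # kg CO2e per unit, reference data


def footprint(activities: Iterable[Activity]) -> Tuple[float, float]:
    """Return (total kg CO2e, share of the total based on primary data)."""
    total = 0.0
    primary_total = 0.0
    for a in activities:
        if a.primary_factor is not None:
            # Prefer primary data whenever the supplier has provided it.
            emissions = a.quantity * a.primary_factor
            primary_total += emissions
        elif a.secondary_factor is not None:
            # Fall back to secondary (reference) data to close the gap.
            emissions = a.quantity * a.secondary_factor
        else:
            raise ValueError(f"No emission factor available for {a.name!r}")
        total += emissions
    share = primary_total / total if total else 0.0
    return total, share


# Illustrative values only, not real emission factors.
items = [
    Activity("steel", quantity=100.0, primary_factor=2.0, secondary_factor=2.5),
    Activity("plastic", quantity=50.0, secondary_factor=3.0),
]
total, primary_share = footprint(items)
print(f"{total:.1f} kg CO2e, {primary_share:.0%} based on primary data")
```

Tracking the primary-data share alongside the total is one simple way to make the data quality of a result visible. Other quality indicators are equally possible.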

The challenge with secondary data

The main limitation of secondary data is that it can become outdated or incomplete, particularly for specific materials, processes, or regional energy mixes. Energy systems and production methods evolve rapidly with technological, economic, and geopolitical changes, such as the recent energy crisis and the rise of renewables, so up-to-date data is essential for maintaining accuracy and comparability.
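
One simple way to act on this limitation is to flag reference datasets whose reference year exceeds an age threshold. The sketch below is a minimal illustration. The function name outdated_datasets, the dataset names, the reference years, and the three-year threshold are all assumptions chosen for the example, not a recommended standard.

```python
from typing import Iterable, List, Tuple


def outdated_datasets(
    datasets: Iterable[Tuple[str, int]],
    current_year: int,
    max_age_years: int = 3,
) -> List[str]:
    """Return names of datasets whose reference year is older than the threshold.

    Each dataset is a (name, reference_year) pair. The threshold is an
    assumption for illustration; a suitable value depends on how fast the
    underlying process or energy mix changes.
    """
    return [
        name
        for name, reference_year in datasets
        if current_year - reference_year > max_age_years
    ]


# Hypothetical secondary datasets with their reference years.
reference_data = [
    ("regional electricity mix", 2018),
    ("primary aluminium production", 2023),
]
stale = outdated_datasets(reference_data, current_year=2025)
print(stale)
```

A fast-changing input such as a regional electricity mix would usually justify a tighter threshold than a slowly changing industrial process.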

Sustamize continuously updates and verifies secondary datasets to ensure high-quality, transparent, and up-to-date CO₂e information for all footprint analyses.

Sources: Ciroth et al., 2019