
COG-5: Asset Metadata

Status: Draft (Work in Progress)
Version: 0.1
Created: 2025-01-23
Authors: Mike Anderson
Work in Progress

This specification is under active development. Structure and details may change significantly based on implementation experience and community feedback.

This standard specifies the metadata format for assets on the Covia Grid, enabling interoperability, discoverability, and verification of resources across the distributed network.

Purpose

Assets are the fundamental resources of the Covia Grid, representing data, operations, and other computational resources that can be shared and utilised across venues. A well-defined metadata format is essential for:

  • Discovery: Enabling clients to find relevant assets based on descriptive information
  • Verification: Ensuring content integrity through cryptographic hashes
  • Interoperability: Allowing assets to be understood and processed by any Grid participant
  • Provenance: Tracking the origin, authorship, and licensing of resources

This specification defines the JSON-based metadata format that describes assets on the Grid.

Terminology

See COG-1: Architecture for definitions of Grid terminology including Asset, Artifact, Operation, and Job.

Specification

Metadata Format

Asset metadata MUST be a valid JSON object.

Asset metadata MUST be serialised as a UTF-8 encoded string for the purpose of computing the Asset ID.

Implementations SHOULD use canonical JSON formatting (sorted keys, no insignificant whitespace) when computing Asset IDs, so that equivalent metadata always yields the same ID. This is not strictly required, provided the exact metadata string is preserved.
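
As a minimal sketch of the canonical formatting described above, the following Python snippet sorts keys and removes insignificant whitespace using only the standard library. The function name canonical_json is illustrative and not part of this specification, and keeping non-ASCII characters unescaped (ensure_ascii=False) is an assumption rather than a requirement stated here.

```python
import json


def canonical_json(metadata: dict) -> str:
    """Serialise metadata with sorted keys and no insignificant whitespace.

    Assumption: non-ASCII characters are emitted as-is rather than escaped.
    """
    return json.dumps(
        metadata,
        sort_keys=True,          # keys sorted at every nesting level
        separators=(",", ":"),   # no spaces after commas or colons
        ensure_ascii=False,
    )


metadata = {"name": "Example Asset", "description": "A test asset"}
canonical = canonical_json(metadata)
print(canonical)  # {"description":"A test asset","name":"Example Asset"}
```

This is not a full canonicalisation scheme (number formatting, for example, is left to the serialiser), so the resulting string should still be stored exactly as produced.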

Asset ID Computation

The Asset ID is computed as follows:

  1. Serialise the metadata as a UTF-8 JSON string
  2. Compute the SHA-256 hash of the UTF-8 bytes
  3. Encode the hash as a lowercase hexadecimal string (64 characters)

Example:

Metadata: {"name":"Example Asset","description":"A test asset"}
Asset ID (illustrative placeholder, not the computed hash): 7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b

Implementations MUST preserve the exact metadata string used to compute the Asset ID.
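
The three computation steps can be sketched in Python as follows. The function name compute_asset_id and the shape of the record dictionary are hypothetical; only the UTF-8 encoding, SHA-256 hashing, and lowercase hexadecimal encoding come from this specification.

```python
import hashlib


def compute_asset_id(metadata_string: str) -> str:
    """Return the Asset ID for an exact metadata string."""
    utf8_bytes = metadata_string.encode("utf-8")     # step 1: UTF-8 bytes
    digest = hashlib.sha256(utf8_bytes)              # step 2: SHA-256 hash
    return digest.hexdigest()                        # step 3: lowercase hex, 64 chars


metadata_string = '{"name":"Example Asset","description":"A test asset"}'
asset_id = compute_asset_id(metadata_string)

# Store the exact string next to its ID. Re-serialising the parsed JSON could
# change whitespace or key order and therefore change the ID.
record = {"id": asset_id, "metadata": metadata_string}
print(len(asset_id))  # 64
```

Because the hash covers the raw bytes, two strings that parse to the same JSON value but differ in whitespace or key order produce different Asset IDs.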

Common Fields

The following fields are RECOMMENDED for all assets:

name (string)

A human-readable name for the asset.

{
  "name": "Iris Dataset"
}

description (string)

A detailed description of the asset, its purpose, and how it can be used.

{
  "description": "The famous Iris flower dataset for machine learning classification tasks"
}

creator (string)

The creator or author of the asset.

{
  "creator": "UCI Machine Learning Repository"
}

dateCreated (string)

The creation date in ISO 8601 format.

{
  "dateCreated": "2025-06-05T06:53:59Z"
}

dateModified (string)

The last modification date in ISO 8601 format.

{
  "dateModified": "2025-06-05T07:22:59Z"
}
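
A short sketch of producing and reading timestamps in the form used by the dateCreated and dateModified examples (UTC with a trailing Z). The helper names are illustrative. ISO 8601 permits other representations, so treating this one layout as the expected form is an assumption based on the examples, not a rule stated in this specification.

```python
from datetime import datetime, timezone

TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def iso8601_utc(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC string ending in 'Z'."""
    return moment.astimezone(timezone.utc).strftime(TIMESTAMP_FORMAT)


def parse_iso8601_utc(text: str) -> datetime:
    """Parse a timestamp in the same layout back into an aware UTC datetime."""
    return datetime.strptime(text, TIMESTAMP_FORMAT).replace(tzinfo=timezone.utc)


created = datetime(2025, 6, 5, 6, 53, 59, tzinfo=timezone.utc)
print(iso8601_utc(created))  # 2025-06-05T06:53:59Z

modified = parse_iso8601_utc("2025-06-05T07:22:59Z")
print(modified >= created)  # True
```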

keywords (array of strings)

Keywords for discovery and categorisation.

{
  "keywords": ["machine learning", "dataset", "classification"]
}

license (object)

Licensing information for the asset.

{
  "license": {
    "name": "CC BY 4.0",
    "url": "https://creativecommons.org/licenses/by/4.0/"
  }
}
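
To illustrate how a client might inspect the recommended fields, here is a hedged Python sketch that reports which common fields are absent and which have an unexpected JSON type. The function check_common_fields and its return shape are hypothetical. Since the fields are RECOMMENDED rather than required, a missing field is reported, not treated as an error.

```python
# Expected Python types for the common fields, following the types listed above.
EXPECTED_TYPES = {
    "name": str,
    "description": str,
    "creator": str,
    "dateCreated": str,
    "dateModified": str,
    "keywords": list,
    "license": dict,
}


def check_common_fields(metadata: dict) -> tuple[list[str], list[str]]:
    """Return (missing_fields, wrongly_typed_fields) for parsed metadata."""
    missing = [field for field in EXPECTED_TYPES if field not in metadata]
    wrong_type = []
    for field, expected in EXPECTED_TYPES.items():
        if field not in metadata:
            continue
        value = metadata[field]
        if not isinstance(value, expected):
            wrong_type.append(field)
        elif field == "keywords" and not all(isinstance(k, str) for k in value):
            # keywords is specified as an array of strings
            wrong_type.append(field)
    return missing, wrong_type


metadata = {
    "name": "Iris Dataset",
    "keywords": ["machine learning", 42],
    "license": {"name": "CC BY 4.0"},
}
missing, wrong_type = check_common_fields(metadata)
print(missing)     # ['description', 'creator', 'dateCreated', 'dateModified']
print(wrong_type)  # ['keywords']
```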

Asset Type Fields

Assets are categorised by the presence of specific top-level objects:

Artifacts

Assets with a content object represent Artifacts - immutable data assets.

See COG-6: Artifacts for the complete specification including:

  • Content hash verification
  • Replication and federation
  • Content storage requirements

Operations

Assets with an operation object represent Operations - executable assets.

See COG-7: Operations for the complete specification including:

  • Adapter configuration
  • Input/output schemas
  • Orchestration workflows
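
The categorisation rule above can be sketched as a small Python function that looks only at the presence of the content and operation top-level objects. The function name and the returned labels are illustrative. How to treat metadata containing both objects is not stated in this document, so the "ambiguous" result is an assumption; the inner structure of both objects is defined in COG-6 and COG-7 and is left empty here.

```python
def asset_type(metadata: dict) -> str:
    """Classify parsed metadata by its top-level objects."""
    has_content = isinstance(metadata.get("content"), dict)
    has_operation = isinstance(metadata.get("operation"), dict)

    if has_content and has_operation:
        return "ambiguous"   # assumption: not covered by this specification
    if has_content:
        return "artifact"    # immutable data asset
    if has_operation:
        return "operation"   # executable asset
    return "other"           # neither object present


print(asset_type({"name": "Iris Dataset", "content": {}}))  # artifact
print(asset_type({"name": "Train Model", "operation": {}}))  # operation
print(asset_type({"name": "Just a description"}))            # other
```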

Additional Information

The additionalInformation object MAY contain any implementation-specific or domain-specific metadata:

{
  "additionalInformation": {
    "notes": ["Uploaded for testing purposes"],
    "sourceUrl": "https://example.com/original"
  }
}

Validation

Implementations SHOULD validate metadata against this specification before accepting assets.

Implementations MUST reject metadata that:

  • Is not valid JSON
  • Contains content hashes that do not match the actual content (for artifacts)
  • References non-existent adapters (for operations)
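
A minimal sketch of the three rejection rules follows. The field names content.sha256 and operation.adapter are hypothetical placeholders, because the actual field layouts are defined in COG-6 and COG-7 rather than here. The exception class, the function signature, and the known_adapters set are likewise assumptions made for illustration.

```python
import hashlib
import json


class MetadataRejected(ValueError):
    """Raised when metadata fails one of the MUST-reject rules."""


def validate_metadata(metadata_string, content_bytes=None, known_adapters=frozenset()):
    """Parse and check metadata, returning the parsed object if accepted."""
    try:
        metadata = json.loads(metadata_string)
    except json.JSONDecodeError as exc:
        raise MetadataRejected(f"not valid JSON: {exc}") from exc
    if not isinstance(metadata, dict):
        raise MetadataRejected("metadata must be a JSON object")

    # Artifacts: compare the declared hash with the actual content, if supplied.
    content = metadata.get("content")
    if isinstance(content, dict) and content_bytes is not None:
        declared = content.get("sha256")  # hypothetical field name
        actual = hashlib.sha256(content_bytes).hexdigest()
        if declared is not None and declared != actual:
            raise MetadataRejected("content hash does not match content")

    # Operations: the referenced adapter must be known to this implementation.
    operation = metadata.get("operation")
    if isinstance(operation, dict):
        adapter = operation.get("adapter")  # hypothetical field name
        if adapter is not None and adapter not in known_adapters:
            raise MetadataRejected(f"unknown adapter: {adapter}")

    return metadata


data = b"sepal_length,sepal_width\n"
good = json.dumps({"name": "Iris", "content": {"sha256": hashlib.sha256(data).hexdigest()}})
print(validate_metadata(good, content_bytes=data)["name"])  # Iris
```

A caller would typically run this check before computing the Asset ID and storing the metadata string.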

Security Considerations

Metadata Immutability

Once an asset is created, its metadata string and Asset ID are immutable. Any modification to metadata results in a new Asset ID. Implementations MUST NOT allow modification of existing metadata.
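
One way to honour this rule is a content-addressed, append-only store, sketched below in Python. The AssetStore class and its method names are hypothetical. The point of the design is that the store offers no update operation: adding changed metadata creates a new entry under a new Asset ID and leaves the existing entry untouched.

```python
import hashlib


class AssetStore:
    """In-memory store keyed by Asset ID, with no way to modify an entry."""

    def __init__(self):
        self._metadata_by_id = {}

    def add(self, metadata_string: str) -> str:
        """Store the exact metadata string and return its Asset ID."""
        asset_id = hashlib.sha256(metadata_string.encode("utf-8")).hexdigest()
        # setdefault never overwrites: re-adding the same string is a no-op.
        self._metadata_by_id.setdefault(asset_id, metadata_string)
        return asset_id

    def get(self, asset_id: str) -> str:
        """Return the exact metadata string originally stored for this ID."""
        return self._metadata_by_id[asset_id]


store = AssetStore()
original_id = store.add('{"name":"Iris Dataset"}')
revised_id = store.add('{"name":"Iris Dataset v2"}')

print(original_id != revised_id)  # True
print(store.get(original_id))     # {"name":"Iris Dataset"}
```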

See COG-6: Artifacts and COG-7: Operations for asset-type-specific security considerations.