COG-5: Asset Metadata
Status: Draft (Work in Progress)
Version: 0.1
Created: 2025-01-23
Authors: Mike Anderson
This specification is under active development. Structure and details may change significantly based on implementation experience and community feedback.
This standard specifies the metadata format for assets on the Covia Grid, enabling interoperability, discoverability, and verification of resources across the distributed network.
Purpose
Assets are the fundamental resources of the Covia Grid, representing data, operations, and other computational resources that can be shared and utilised across venues. A well-defined metadata format is essential for:
- Discovery: Enabling clients to find relevant assets based on descriptive information
- Verification: Ensuring content integrity through cryptographic hashes
- Interoperability: Allowing assets to be understood and processed by any Grid participant
- Provenance: Tracking the origin, authorship, and licensing of resources
This specification defines the JSON-based metadata format that describes assets on the Grid.
Terminology
See COG-1: Architecture for definitions of Grid terminology including Asset, Artifact, Operation, and Job.
Specification
Metadata Format
Asset metadata MUST be a valid JSON object.
Asset metadata MUST be serialised as a UTF-8 encoded string for the purpose of computing the Asset ID.
Implementations SHOULD serialise metadata in a canonical JSON form (sorted keys, no insignificant whitespace) when computing Asset IDs, so that equivalent metadata yields the same Asset ID. This is not strictly required, provided the exact metadata string is preserved.
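As an illustration of the canonical formatting described above, the following minimal Python sketch sorts keys and drops insignificant whitespace using the standard `json` module. The function name `canonical_json` is hypothetical, and this is only one possible reading of "canonical JSON"; it does not claim to match any particular canonicalisation standard.

```python
import json


def canonical_json(metadata: dict) -> str:
    """Serialise metadata with sorted keys and no insignificant whitespace.

    Hypothetical helper: the specification does not mandate this exact form.
    """
    return json.dumps(
        metadata,
        sort_keys=True,          # deterministic key order
        separators=(",", ":"),   # no spaces after commas or colons
        ensure_ascii=False,      # keep non-ASCII characters as-is (UTF-8 on encode)
    )


meta = {"name": "Example Asset", "description": "A test asset"}
print(canonical_json(meta))
# {"description":"A test asset","name":"Example Asset"}
```

Two clients that build the same metadata in a different key order would produce the same string with this approach, and therefore the same Asset ID.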
Asset ID Computation
The Asset ID is computed as follows:
- Serialise the metadata as a UTF-8 JSON string
- Compute the SHA-256 hash of the UTF-8 bytes
- Encode the hash as a lowercase hexadecimal string (64 characters)
Example:
Metadata: {"name":"Example Asset","description":"A test asset"}
Asset ID (illustrative placeholder, not the actual SHA-256 of the string above): 7a8b9c0d1e2f3a4b5c6d7e8f9a0b1c2d3e4f5a6b7c8d9e0f1a2b3c4d5e6f7a8b
Implementations MUST preserve the exact metadata string used to compute the Asset ID.
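The computation steps above can be sketched in Python with the standard `hashlib` module. The function name `compute_asset_id` is an assumption made for illustration. The sketch also shows why the exact metadata string must be preserved: a string that differs only in whitespace hashes to a different Asset ID.

```python
import hashlib


def compute_asset_id(metadata_string: str) -> str:
    """Return the lowercase hex SHA-256 of the UTF-8 bytes of the metadata string."""
    utf8_bytes = metadata_string.encode("utf-8")
    return hashlib.sha256(utf8_bytes).hexdigest()  # hexdigest() is lowercase, 64 chars


metadata_string = '{"name":"Example Asset","description":"A test asset"}'
respaced = '{"name": "Example Asset", "description": "A test asset"}'

asset_id = compute_asset_id(metadata_string)
print(asset_id)
print(len(asset_id))                            # 64
print(asset_id == compute_asset_id(respaced))   # False: same JSON value, different string
```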
Common Fields
The following fields are RECOMMENDED for all assets:
name (string)
A human-readable name for the asset.
{
"name": "Iris Dataset"
}
description (string)
A detailed description of the asset, its purpose, and how it can be used.
{
"description": "The famous Iris flower dataset for machine learning classification tasks"
}
creator (string)
The creator or author of the asset.
{
"creator": "UCI Machine Learning Repository"
}
dateCreated (string)
The creation date in ISO 8601 format.
{
"dateCreated": "2025-06-05T06:53:59Z"
}
dateModified (string)
The last modification date in ISO 8601 format.
{
"dateModified": "2025-06-05T07:22:59Z"
}
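The two date examples above use ISO 8601 in UTC with a `Z` suffix and second precision. A minimal Python sketch for producing strings of that shape follows; the helper name `iso8601_utc` is hypothetical, and the specification itself only requires ISO 8601, not this particular profile.

```python
from datetime import datetime, timezone


def iso8601_utc(dt: datetime) -> str:
    """Format a timezone-aware datetime as ISO 8601 UTC, e.g. 2025-06-05T06:53:59Z."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


created = datetime(2025, 6, 5, 6, 53, 59, tzinfo=timezone.utc)
print(iso8601_utc(created))  # 2025-06-05T06:53:59Z
```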
keywords (array of strings)
Keywords for discovery and categorisation.
{
"keywords": ["machine learning", "dataset", "classification"]
}
license (object)
Licensing information for the asset.
{
"license": {
"name": "CC BY 4.0",
"url": "https://creativecommons.org/licenses/by/4.0/"
}
}
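Because the common fields are RECOMMENDED rather than required, a client might report gaps as warnings instead of rejecting the asset. The following Python sketch checks for the presence and JSON type of the fields listed above. The names `RECOMMENDED_FIELDS` and `check_common_fields` are hypothetical, and the check is an assumption about how an implementation could use this list, not part of the specification.

```python
# Recommended field name -> expected Python type after json.loads()
RECOMMENDED_FIELDS = {
    "name": str,
    "description": str,
    "creator": str,
    "dateCreated": str,
    "dateModified": str,
    "keywords": list,
    "license": dict,
}


def check_common_fields(metadata: dict) -> tuple[list[str], list[str]]:
    """Return (missing fields, fields whose value has an unexpected type)."""
    missing = [f for f in RECOMMENDED_FIELDS if f not in metadata]
    wrong_type = [
        f for f, expected in RECOMMENDED_FIELDS.items()
        if f in metadata and not isinstance(metadata[f], expected)
    ]
    # keywords is an array of strings, so check the element types as well
    keywords = metadata.get("keywords")
    if isinstance(keywords, list) and not all(isinstance(k, str) for k in keywords):
        wrong_type.append("keywords")
    return missing, wrong_type


missing, wrong_type = check_common_fields({
    "name": "Iris Dataset",
    "keywords": ["machine learning", "dataset", "classification"],
})
print(missing)     # ['description', 'creator', 'dateCreated', 'dateModified', 'license']
print(wrong_type)  # []
```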
Asset Type Fields
Assets are categorised by the presence of specific top-level objects:
Artifacts
Assets with a content object represent Artifacts - immutable data assets.
See COG-6: Artifacts for the complete specification including:
- Content hash verification
- Replication and federation
- Content storage requirements
Operations
Assets with an operation object represent Operations - executable assets.
See COG-7: Operations for the complete specification including:
- Adapter configuration
- Input/output schemas
- Orchestration workflows
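Categorisation by top-level object can be sketched as a simple key check. In the Python sketch below, the function name `asset_types` is hypothetical, and returning a list is an assumption: this specification does not state whether an asset may carry both a content and an operation object, so the sketch reports whatever is present rather than choosing one.

```python
def asset_types(metadata: dict) -> list[str]:
    """List the asset types implied by the top-level objects in the metadata."""
    types = []
    if isinstance(metadata.get("content"), dict):
        types.append("artifact")    # has a content object
    if isinstance(metadata.get("operation"), dict):
        types.append("operation")   # has an operation object
    return types


print(asset_types({"name": "Iris Dataset", "content": {}}))  # ['artifact']
print(asset_types({"name": "Some Operation", "operation": {}}))  # ['operation']
print(asset_types({"name": "Plain Asset"}))  # []
```

The inner structure of the content and operation objects is defined in COG-6 and COG-7, so the example leaves them empty.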
Additional Information
The additionalInformation object MAY contain any implementation-specific or domain-specific metadata:
{
"additionalInformation": {
"notes": ["Uploaded for testing purposes"],
"sourceUrl": "https://example.com/original"
}
}
Validation
Implementations SHOULD validate metadata against this specification before accepting assets.
Implementations MUST reject metadata that:
- Is not a valid JSON object
- Contains content hashes that do not match the actual content (for artifacts)
- References non-existent adapters (for operations)
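The first rejection rule can be sketched with Python's standard `json` module. The names `MetadataRejected` and `parse_metadata` are hypothetical. The sketch also rejects non-object values, following the earlier requirement that metadata be a JSON object, and rejects `NaN` and `Infinity`, which Python's parser accepts by default although they are not valid JSON. The content hash and adapter checks depend on COG-6 and COG-7 and are deliberately left out.

```python
import json


class MetadataRejected(ValueError):
    """Raised when a metadata string fails validation (hypothetical error type)."""


def _reject_constant(name: str):
    # json.loads calls this for NaN, Infinity and -Infinity, which are not valid JSON
    raise ValueError(f"invalid JSON constant: {name}")


def parse_metadata(metadata_string: str) -> dict:
    """Parse a metadata string, rejecting anything that is not a JSON object."""
    try:
        value = json.loads(metadata_string, parse_constant=_reject_constant)
    except ValueError as exc:  # json.JSONDecodeError is a subclass of ValueError
        raise MetadataRejected(f"not valid JSON: {exc}") from exc
    if not isinstance(value, dict):
        raise MetadataRejected("metadata must be a JSON object")
    return value


print(parse_metadata('{"name":"Example Asset"}'))  # {'name': 'Example Asset'}
```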
Security Considerations
Metadata Immutability
Once an asset is created, its metadata string and Asset ID are immutable. Any modification to metadata results in a new Asset ID. Implementations MUST NOT allow modification of existing metadata.
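One way to picture this immutability rule is a content-addressed store: entries are keyed by Asset ID, there is no update operation, and "changing" metadata simply adds a new entry under a new ID. The Python sketch below is an in-memory illustration only; the class `AssetStore` and its methods are hypothetical and do not describe how venues actually store assets.

```python
import hashlib


class AssetStore:
    """In-memory, add-only store keyed by Asset ID (illustrative only)."""

    def __init__(self) -> None:
        self._assets: dict[str, str] = {}

    def add(self, metadata_string: str) -> str:
        """Store the exact metadata string and return its Asset ID."""
        asset_id = hashlib.sha256(metadata_string.encode("utf-8")).hexdigest()
        # setdefault never overwrites: an existing entry is left untouched
        self._assets.setdefault(asset_id, metadata_string)
        return asset_id

    def get(self, asset_id: str) -> str:
        """Return the exact metadata string originally stored under this ID."""
        return self._assets[asset_id]


store = AssetStore()
original = '{"name":"Iris Dataset"}'
revised = '{"name":"Iris Dataset","creator":"UCI Machine Learning Repository"}'

id_original = store.add(original)
id_revised = store.add(revised)

print(id_original == id_revised)           # False: revised metadata is a new asset
print(store.get(id_original) == original)  # True: the first asset is unchanged
```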
See COG-6: Artifacts and COG-7: Operations for asset-type-specific security considerations.
Related Specifications
- COG-1: Architecture - Overall Grid architecture
- COG-2: Decentralised ID - Asset identification
- COG-4: Grid Lattice - Asset storage in the lattice
- COG-6: Artifacts - Immutable data assets
- COG-7: Operations - Executable assets