RO-Crate profiles
While RO-Crates can be considered general-purpose containers of arbitrary data and open-ended metadata, in practical use within a particular domain, application or framework, it will be beneficial to further constrain RO-Crate to a specific profile: a set of conventions, types and properties that one minimally can require and expect to be present in that subset of RO-Crates.
Defining and conforming to such a profile enables reliable programmatic consumption of an RO-Crate’s content, as well as consistent creation: for example, a form in a user interface can firmly suggest the required types and properties. Likewise, a rendering of an RO-Crate can more easily build rich UI components if it can reliably assume, for instance, that the Person always has an affiliation to an Organization which has a url (a restriction that may not be appropriate for all types of RO-Crates).
As such, RO-Crate Profiles can be considered a duck typing mechanism for RO-Crates, but also a classifier that indicates the crate’s purpose, expectations and focus.
Publishing an RO-Crate profile
An RO-Crate profile is identified with a Profile URI.
Recommendations:
- The profile URI MUST resolve to a human-readable profile description (e.g. an HTML web page)
- The profile URI MAY have a corresponding machine-readable Profile Crate
- The profile URI SHOULD be a permalink (persistent identifier)
  - e.g. starting with https://w3id.org/, http://purl.org/ or https://www.doi.org/
- The profile URI SHOULD be versioned with MAJOR.MINOR, e.g. http://example.com/image-profile-2.4
- The profile description SHOULD use key words MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL as described in [RFC2119].
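The two SHOULD-level recommendations on the shape of the profile URI can be checked mechanically. The following is a minimal sketch, not part of the specification: the function name `profile_uri_warnings` and the warning texts are hypothetical, and the version check simply looks for a MAJOR.MINOR number anywhere in the URI path.

```python
import re
from urllib.parse import urlparse

# Permalink prefixes named in the recommendations above.
PERMALINK_PREFIXES = (
    "https://w3id.org/",
    "http://purl.org/",
    "https://www.doi.org/",
)

# Assumption: a MAJOR.MINOR version is any "digits.digits" run in the path.
VERSION_PATTERN = re.compile(r"\d+\.\d+")


def profile_uri_warnings(uri: str) -> list[str]:
    """Return warnings for SHOULD-level recommendations the URI does not follow."""
    warnings = []
    if not uri.startswith(PERMALINK_PREFIXES):
        warnings.append("does not start with a known permalink prefix")
    if not VERSION_PATTERN.search(urlparse(uri).path):
        warnings.append("no MAJOR.MINOR version found in the path")
    return warnings


print(profile_uri_warnings("https://w3id.org/ro/wfrun/process/0.1"))
print(profile_uri_warnings("http://example.com/image-profile-2.4"))
```

An empty list only means that these two syntactic recommendations are followed; whether the URI actually resolves to a human-readable description still has to be checked by retrieving it.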
Suggestions:
- The profile MAY require/suggest which @type of data entities and/or contextual entities are expected.
- The profile MAY require/suggest properties expected per type of entity (e.g. “Each CreativeWork must declare a license”)
- The profile MAY require/suggest a particular version of RO-Crate.
- The profile MAY recommend RO-Crate extensions with domain-specific terms and vocabularies.
- The profile MAY require/suggest a particular JSON-LD context.
- The profile MAY require/suggest a particular RO-Crate publishing method or packaging like .zip or BagIt.
Declaring conformance of an RO-Crate profile
An RO-Crate can describe a profile by adding it as a contextual entity:
{
"@id": "https://w3id.org/ro/wfrun/process/0.1",
"@type": ["CreativeWork", "Profile"],
"name": "Process Run crate profile",
"version": "0.1.0"
}
The contextual entity for a profile:
- The @type SHOULD be an array. The @type MUST include Profile.
- The @type SHOULD include CreativeWork (indicating a Web Page) or Dataset (indicating a Profile Crate).
- SHOULD have an absolute URI as @id
- SHOULD have a descriptive name
- MAY declare version, preferably according to Semantic Versioning
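The requirements listed above for a profile's contextual entity could be checked along the following lines. This is only an illustrative sketch: the function name `profile_entity_issues` and the issue messages are hypothetical, and "absolute URI" is approximated by testing whether the @id has a URI scheme.

```python
from urllib.parse import urlparse


def profile_entity_issues(entity: dict) -> list[str]:
    """List the requirements above that a profile contextual entity does not meet."""
    issues = []
    types = entity.get("@type", [])
    if not isinstance(types, list):
        issues.append("SHOULD: @type is not an array")
        types = [types]
    if "Profile" not in types:
        issues.append("MUST: @type does not include Profile")
    if not {"CreativeWork", "Dataset"} & set(types):
        issues.append("SHOULD: @type includes neither CreativeWork nor Dataset")
    # Approximation: an absolute URI is one that carries a scheme such as https.
    if not urlparse(str(entity.get("@id", ""))).scheme:
        issues.append("SHOULD: @id is not an absolute URI")
    if not entity.get("name"):
        issues.append("SHOULD: no descriptive name")
    return issues


process_run = {
    "@id": "https://w3id.org/ro/wfrun/process/0.1",
    "@type": ["CreativeWork", "Profile"],
    "name": "Process Run crate profile",
    "version": "0.1.0",
}
print(profile_entity_issues(process_run))
```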
RO-Crates that are conforming to (or intending to conform to) such a profile declare this using conformsTo on the root data entity:
{
"@id": "./",
"@type": "Dataset",
"conformsTo":
{"@id": "https://w3id.org/ro/wfrun/process/0.1"}
}
It is valid for a crate to conform to multiple profiles, in which case conformsTo is an unordered array.
Note that as profile conformance is declared on the RO-Crate Root (./ in this example), the profile applies to the whole RO-Crate, and may cover aspects beyond the crate’s metadata file (e.g. identifiers, packaging, purpose).
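Because conformsTo may be either a single reference or an array, a consumer has to normalize it before use. The sketch below shows one way to do that over a crate's @graph. The function name `declared_profiles` is hypothetical, and it assumes the root data entity is found by following `about` from a metadata descriptor with @id `ro-crate-metadata.json`, falling back to `./` when no descriptor is present.

```python
def declared_profiles(graph: list[dict]) -> list[str]:
    """Return the profile URIs declared with conformsTo on the root data entity."""
    by_id = {entity["@id"]: entity for entity in graph if "@id" in entity}
    # Assumption: the metadata descriptor points at the root via "about".
    descriptor = by_id.get("ro-crate-metadata.json", {})
    root_id = descriptor.get("about", {}).get("@id", "./")
    conforms = by_id.get(root_id, {}).get("conformsTo", [])
    if isinstance(conforms, dict):
        # A single profile is written as one reference object, not an array.
        conforms = [conforms]
    return [ref["@id"] for ref in conforms if isinstance(ref, dict) and "@id" in ref]


graph = [
    {"@id": "ro-crate-metadata.json", "@type": "CreativeWork", "about": {"@id": "./"}},
    {
        "@id": "./",
        "@type": "Dataset",
        "conformsTo": {"@id": "https://w3id.org/ro/wfrun/process/0.1"},
    },
]
print(declared_profiles(graph))
```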
Profile Crate
While the Profile URI @id must resolve to a human-readable profile description, it can additionally be made to resolve to a Profile Crate.
A Profile Crate is a type of RO-Crate that gathers resources which further define the profile. This allows formalizing alternative profile descriptions for machine-readability, for instance for validation, but also including additional resources like examples.
The Root Data entity of a Profile Crate MUST declare Profile as an additional @type:
{
"@id": "https://w3id.org/ro/wfrun/process/0.1",
"@type": ["Dataset", "Profile"],
"name": "Process Run crate profile",
"version": "0.1.0",
"hasPart": [ ],
"…": ""
}
The rest of the requirements for being referenced as a contextual entity also apply here:
- SHOULD have an absolute URI as @id
- SHOULD have a descriptive name
- MAY declare version, preferably according to Semantic Versioning (e.g. 0.4.0)
- SHOULD list related data entities using hasPart (see below)
How to retrieve a Profile Crate
To resolve a Profile URI to a machine-readable Profile Crate, two approaches are recommended to retrieve its RO-Crate metadata file:
- HTTP content negotiation for the RO-Crate media type, for example:
  Requesting https://w3id.org/ro/wfrun/process/0.1 with HTTP header Accept: application/ld+json;profile=https://w3id.org/ro/crate redirects to the RO-Crate Metadata file https://www.researchobject.org/workflow-run-crate/profiles/0.1/process_run_crate/ro-crate-metadata.json
- The above approach may fail (or return an HTML page), e.g. for content-delivery networks that do not support content negotiation. The fallback is to try resolving the path ./ro-crate-metadata.json from the resolved URI (after permalink redirects). For example:
  If permalink https://w3id.org/workflowhub/workflow-ro-crate/1.0 redirects to https://about.workflowhub.eu/Workflow-RO-Crate/1.0/index.html (an HTML page), then try retrieving https://about.workflowhub.eu/Workflow-RO-Crate/1.0/ro-crate-metadata.json
- If neither of these approaches works, then this profile probably does not have a corresponding Profile Crate. For humans, display a hyperlink to its @id described by its name.
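The two retrieval approaches above can be prepared with the Python standard library. This sketch only builds the content-negotiation request and computes the fallback URL; it performs no network access, and the helper names `negotiation_request` and `fallback_metadata_url` are hypothetical.

```python
from urllib.parse import urljoin
from urllib.request import Request

# Accept header value for the RO-Crate media type, as given in the example above.
RO_CRATE_ACCEPT = "application/ld+json;profile=https://w3id.org/ro/crate"


def negotiation_request(profile_uri: str) -> Request:
    """First approach: request the profile URI, asking for RO-Crate JSON-LD."""
    return Request(profile_uri, headers={"Accept": RO_CRATE_ACCEPT})


def fallback_metadata_url(resolved_url: str) -> str:
    """Second approach: resolve ./ro-crate-metadata.json against the final URL."""
    return urljoin(resolved_url, "./ro-crate-metadata.json")


print(fallback_metadata_url(
    "https://about.workflowhub.eu/Workflow-RO-Crate/1.0/index.html"
))
```

A caller would typically pass the request to `urllib.request.urlopen`, check whether the response is JSON-LD rather than HTML, and otherwise apply `fallback_metadata_url` to the response's final URL after redirects. Note that relative resolution replaces the last path segment, so a resolved URL without a trailing slash resolves against its parent path.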
What is included in the Profile Crate?
Below are the suggested data entities to include in a Profile Crate using hasPart.
Declaring the role within the crate
For programmatic use of the Profile Crate to consume particular subresources, e.g. for validation, the role of each entity SHOULD be declared by linking from the Profile Crate, using hasResource, to a ResourceDescriptor contextual entity that references the subresource using hasArtifact, as defined by the Profiles Vocabulary:
{
"@id": "http://example.com/my-crate-profile/0.1/",
"@type": ["Dataset", "Profile"],
"name": "My Crate Profile",
"version": "0.1.0",
"hasPart": [
{"@id": "http://example.com/my-crate-profile/0.1/shape.shex"}
],
"hasResource": [
{"@id": "#hasShape"}
]
},
{
"@id": "#hasShape",
"@type": "ResourceDescriptor",
"hasRole": { "@id": "http://www.w3.org/ns/dx/prof/role/constraints" },
"hasArtifact": {"@id": "http://example.com/my-crate-profile/0.1/shape.shex"}
}
The ResourceDescriptor entity MAY also declare dct:format or dct:conformsTo; however, the data entity referenced with hasArtifact SHOULD declare encodingFormat (with OPTIONAL conformsTo) to specify its encoding format, e.g.:
{
"@id": "http://example.com/my-crate-profile/0.1/shape.shex",
"@type": "File",
"encodingFormat": [
"text/shex",
{"@id": "http://shex.io/shex-semantics/" }
]
}
The referenced role SHOULD be declared as a DefinedTerm contextual entity. The recommended predefined roles from the Profiles Vocabulary are:
{
"@id": "http://www.w3.org/ns/dx/prof/role/constraints",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Constraints",
"description": "Descriptions of obligations, limitations or extensions that the profile defines"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/example",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Example",
"description": "Sample instance data conforming to the profile"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/guidance",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Guidance",
"description": "Documents, in human-readable form, how to use the profile"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/mapping",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Mapping",
"description": "Describes conversions between two specifications"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/schema",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Schema",
"description": "Machine-readable structural descriptions of data defined by the profile"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/specification",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Specification",
"description": "Defining the profile in human-readable form"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/validation",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Validation",
"description": "Supplies instructions about how to verify conformance of data to the profile"
},
{
"@id": "http://www.w3.org/ns/dx/prof/role/vocabulary",
"@type": ["DefinedTerm", "ResourceRole"],
"name": "Vocabulary",
"description": "Defines terms used in the profile specification"
}
The examples in the rest of this document list the data entities with a corresponding ResourceDescriptor entity but, for brevity, do not repeat the required hasPart, hasResource and DefinedTerm declarations.
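To illustrate how a consumer might use these role declarations, the sketch below follows hasResource from the Profile Crate's root entity to each ResourceDescriptor and collects the artifacts that have a given role. The function name `artifacts_with_role` is hypothetical, and the sketch assumes hasRole and hasArtifact each hold a single reference.

```python
def artifacts_with_role(graph: list[dict], root_id: str, role_uri: str) -> list[str]:
    """Return @id values of artifacts whose ResourceDescriptor has the given role."""
    by_id = {entity["@id"]: entity for entity in graph if "@id" in entity}
    descriptors = by_id.get(root_id, {}).get("hasResource", [])
    if isinstance(descriptors, dict):
        descriptors = [descriptors]
    found = []
    for ref in descriptors:
        descriptor = by_id.get(ref.get("@id"), {})
        # Assumption: hasRole and hasArtifact are single {"@id": ...} references.
        if descriptor.get("hasRole", {}).get("@id") != role_uri:
            continue
        artifact = descriptor.get("hasArtifact", {}).get("@id")
        if artifact:
            found.append(artifact)
    return found


ROOT = "http://example.com/my-crate-profile/0.1/"
CONSTRAINTS = "http://www.w3.org/ns/dx/prof/role/constraints"
graph = [
    {
        "@id": ROOT,
        "@type": ["Dataset", "Profile"],
        "hasPart": [{"@id": ROOT + "shape.shex"}],
        "hasResource": [{"@id": "#hasShape"}],
    },
    {
        "@id": "#hasShape",
        "@type": "ResourceDescriptor",
        "hasRole": {"@id": CONSTRAINTS},
        "hasArtifact": {"@id": ROOT + "shape.shex"},
    },
]
print(artifacts_with_role(graph, ROOT, CONSTRAINTS))
```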
Profile description entity
A Profile Crate MUST declare a human-readable profile description, which is about this Profile Crate and SHOULD have encodingFormat as text/html. The corresponding ResourceDescriptor SHOULD have the role http://www.w3.org/ns/dx/prof/role/specification or http://www.w3.org/ns/dx/prof/role/guidance, for example:
{
"@id": "index.html",
"@type": "File",
"name": "Workflow RO-Crate profile description",
"encodingFormat": "text/html",
"about": {"@id": "./"}
},
{
"@id": "#hasSpecification",
"@type": "ResourceDescriptor",
"hasRole": { "@id": "http://www.w3.org/ns/dx/prof/role/specification" },
"hasArtifact": {"@id": "index.html"}
}
The profile description MAY be equivalent to the RO-Crate Website entity ro-crate-preview.html (becoming a data entity by listing it under hasPart):
{
"@id": "ro-crate-preview.html",
"@type": "CreativeWork",
"name": "RO-Crate preview of the Process Run Crate",
"encodingFormat": "text/html",
"about": {"@id": "./"}
}
Profile Schema entity
An optional machine-readable schema of the profile, for instance a Describo JSON profile:
{
"@id": "https://raw.githubusercontent.com/UTS-eResearch/describo/v0.13.0/src/components/profiles/paradisec.describo.profile.json",
"@type": "File",
"name": "PARADISEC profile for Describo",
"encodingFormat": "application/json",
"conformsTo": {"@id": "https://github.com/UTS-eResearch/describo/wiki/dsp-index#profile-structure"}
},
{
"@id": "#hasSchema",
"@type": "ResourceDescriptor",
"hasRole": { "@id": "http://www.w3.org/ns/dx/prof/role/schema" },
"hasArtifact": {"@id": "https://raw.githubusercontent.com/UTS-eResearch/describo/v0.13.0/src/components/profiles/paradisec.describo.profile.json"}
},
{
"@id": "https://github.com/UTS-eResearch/describo/wiki/dsp-index#profile-structure",
"@type": "Profile",
"name": "Describo JSON profile"
}
A schema may formalize restrictions on the RO-Crate metadata file at the graph level (e.g. which types/properties) as well as at the serialization level (e.g. use of JSON arrays).
This interpretation of schema assumes the resource describes the data structure to some extent, e.g. the expected types and attributes of the RO-Crate’s JSON-LD. Alternatively, use the role http://www.w3.org/ns/dx/prof/role/validation if the schema is primarily a set of constraints for validation purposes, or http://www.w3.org/ns/dx/prof/role/vocabulary for ontologies and term listings.
Below are known schema types in their recommended media type, with suggested identifiers for the contextual entities of encodingFormat with type Standard and conformsTo with type Profile:
Some of the above schema languages are based on general data structure syntaxes like application/json and text/turtle, and therefore have a generic encodingFormat but a specialized conformsTo URI, which itself is declared as a Profile.
Software that works with the profile
Software that may consume/validate/generate RO-Crates following this profile (potentially using the schema):
{
"@id": "https://arkisto-platform.github.io/describo/",
"@type": "SoftwareApplication",
"name": "Describo",
"version": "0.13.0",
"url": "https://arkisto-platform.github.io/describo/"
}
Repositories that expect the profile
A repository or collection within a repository that may accept/contain RO-Crates following this profile:
{
"@id": "https://mod.paradisec.org.au/",
"@type": "RepositoryCollection",
"name": "Modern PARADISEC demonstrator",
"description": "PARADISEC curates digital material about small or endangered languages",
"publisher": {"@id": "https://paradisec.org.au/"}
}
BagIt packaging
If conforming RO-Crates should be packaged according to a BagIt profile (e.g. must be serialized as application/zip):
{
"@id": "https://w3id.org/ro/bagit/profile/0.3",
"@type": "WebPage",
"name": "BagIt profile for RO-Crate in ZIP",
"encodingFormat": [
"application/json",
{"@id": "https://bagit-profiles.github.io/bagit-profiles-specification/"}
]
}
Extension vocabularies
A profile that extends RO-Crate SHOULD indicate which vocabulary/ontology it uses as a DefinedTermSet:
{
"@id": "https://w3id.org/ro/terms/test#",
"@type": "DefinedTermSet",
"name": "Namespace for workflow testing metadata",
"url": "https://github.com/ResearchObject/ro-terms/tree/master/test"
}
The @id of the vocabulary SHOULD be the namespace, while url SHOULD go to a human-readable description of the vocabulary.
A profile that defines many extension terms MAY define its own DefinedTermSet and relate the terms using hasDefinedTerm:
{
"@id": "https://w3id.org/cpm/ro-crate",
"@type": "Dataset",
"identifier": "https://w3id.org/cpm/ro-crate",
"name": "Common Provenance Model RO-Crate profiles and vocabulary",
"hasPart": [
{ "@id": "https://w3id.org/cpm/ro-crate#" }
]
},
{
"@id": "https://w3id.org/cpm/ro-crate#",
"@type": "DefinedTermSet",
"name": "Namespace for Common Provenance Model RO-Crate model",
"hasDefinedTerm": [
{ "@id": "https://w3id.org/cpm/ro-crate#CPMProvenanceFile" },
{ "@id": "https://w3id.org/cpm/ro-crate#CPMMetaProvenanceFile" }
]
},
{
"@id": "https://w3id.org/cpm/ro-crate#CPMProvenanceFile",
"@type": "DefinedTerm",
"…" : ""
}
Extension terms
A profile that extends RO-Crate MAY indicate particular terms directly as DefinedTerm, Class and/or Property instances:
{
"@id": "https://w3id.org/ro/terms/test#runsOn",
"@type": "DefinedTerm",
"termCode": "runsOn",
"name": "Runs on",
"description": "Service where the test instance is executed",
"url": "https://lifemonitor.eu/workflow_testing_ro_crate#test-instance"
}
The termCode SHOULD be valid as a key in the JSON-LD @context of conforming RO-Crates. The term SHOULD be mapped to the term’s @id in the @context of this Profile Crate.
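The mapping from termCode to @id described above can be derived directly from the DefinedTerm entities of a Profile Crate. The following sketch builds such a context mapping; the function name `context_from_terms` is hypothetical, and it only considers entities typed DefinedTerm that carry a termCode.

```python
def context_from_terms(graph: list[dict]) -> dict[str, str]:
    """Map each DefinedTerm's termCode to its @id, for use as @context entries."""
    context = {}
    for entity in graph:
        types = entity.get("@type", [])
        if isinstance(types, str):
            types = [types]
        # Assumption: only DefinedTerm entities with a termCode are mapped.
        if "DefinedTerm" in types and "termCode" in entity and "@id" in entity:
            context[entity["termCode"]] = entity["@id"]
    return context


graph = [
    {
        "@id": "https://w3id.org/ro/terms/test#runsOn",
        "@type": "DefinedTerm",
        "termCode": "runsOn",
        "name": "Runs on",
    },
    {"@id": "https://w3id.org/ro/terms/test#", "@type": "DefinedTermSet"},
]
print(context_from_terms(graph))
```

The resulting dictionary can be merged into the @context of the Profile Crate or of conforming RO-Crates, so that the short key `runsOn` expands to the full term URI.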
JSON-LD Context
A profile that has a corresponding JSON-LD @context (e.g. to map its extension terms, or to suggest a version of RO-Crate’s official context) SHOULD indicate the context in the Profile Crate:
{
"@id": "https://w3id.org/ro/crate/1.2-DRAFT/context",
"@type": "CreativeWork",
"name": "RO-Crate JSON-LD Context",
"encodingFormat": "application/ld+json",
"conformsTo": {"@id": "http://www.w3.org/ns/json-ld#Context"},
"version": "1.1.1"
},
{
"@id": "http://www.w3.org/ns/json-ld#Context",
"@type": "DefinedTerm",
"name": "JSON-LD Context",
"url": "https://www.w3.org/TR/json-ld/"
}
The JSON-LD Context entity:
- MUST have an encodingFormat of application/ld+json
- MUST have an absolute URI as @id, which MUST be retrievable as JSON-LD directly or with content-negotiation and/or HTTP redirects.
- SHOULD have a permalink (persistent identifier) as @id
  - e.g. starting with https://w3id.org/ or http://purl.org/
  - MAY embed major.minor version in the PID, e.g. https://w3id.org/ro/crate/1.2/context
- SHOULD use https rather than http, with a certificate commonly accepted by browsers
- SHOULD have an @id URI that is versioned with MAJOR.MINOR, e.g. https://example.com/image-profile-2.4
- SHOULD have a descriptive name
- SHOULD have a conformsTo to the contextual entity http://www.w3.org/ns/json-ld#Context
- MAY declare version according to Semantic Versioning
  - Updates MAY add new terms or patch fixes (with corresponding version change)
  - Updates SHOULD NOT remove terms already published and potentially used by consumers of the profile
  - Updates SHOULD NOT replace URIs terms map to, except for typos.
Note that the referenced context URI does not have to match the @context of the Profile Crate itself.
The @context MAY be the Profile Crate’s Metadata JSON-LD file if it is resolvable as media type application/ld+json over HTTP. Make sure the crate includes the defined terms both within its @context and ideally as entities in its @graph.
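The guidance on context updates (terms should not be removed, and the URIs they map to should not be replaced) lends itself to a simple comparison between two published versions. The sketch below assumes each context has been reduced to a flat mapping from term to URI string; the function name `context_update_issues` and the example terms are hypothetical.

```python
def context_update_issues(old: dict[str, str], new: dict[str, str]) -> list[str]:
    """Report terms that an updated context removed or mapped to a different URI."""
    issues = []
    for term, uri in old.items():
        if term not in new:
            issues.append(f"removed term: {term}")
        elif new[term] != uri:
            issues.append(f"changed mapping: {term}")
    return issues


# Hypothetical flat contexts for two versions of a profile.
old_context = {"runsOn": "https://w3id.org/ro/terms/test#runsOn"}
new_context = {
    "runsOn": "https://w3id.org/ro/terms/test#runsOn",
    "instance": "https://w3id.org/ro/terms/test#instance",
}
print(context_update_issues(old_context, new_context))
```

Adding terms produces no issues, in line with the rule that updates MAY add new terms. A reported changed mapping still needs human review, since correcting a typo in a URI is explicitly allowed.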