Rioxx: The Research Outputs Metadata Schema

Rioxx: The Research Outputs Metadata Schema Version 3.0 Release Candidate 1

Note that this is a draft version of the application profile. Please check the Rioxx website for the current version.

Terminology

the resource refers to the resource in the repository which is identified by the dc:identifier property in the Rioxx metadata record.

version of record refers to a version of the resource described in the Rioxx metadata record, which has been made available, electronically, by a publisher.

The terms MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD, SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL used in the table below should be interpreted as described in RFC 2119.

Properties in this profile

ali:license_ref | dc:coverage | dc:description | dc:format | dc:identifier | dc:language | dc:publisher | dc:relation | dc:source | dc:subject | dc:title | dcterms:date_accepted | rioxxterms:author | rioxxterms:contributor | rioxxterms:grant | rioxxterms:project | rioxxterms:publication_date | rioxxterms:record_public_release_date | rioxxterms:type | rioxxterms:version | rioxxterms:version_of_record

Property details and examples

Element Cardinality Description
rioxxterms:contributor Zero or more

This field is designed to describe an entity – for example the name of a person, organisation or service – responsible for making contributions to the content of the resource. As many rioxxterms:contributor properties may be entered as required. This property SHOULD take an optional attribute called uri, which MUST contain a URI which uniquely identifies the contributor. The ideal use of this property is to include both an HTTP(S) URI in the uri attribute, and a text string in the body of the property, thus:

<rioxxterms:contributor uri="https-uri-for-this-contributor-entity">
    name-of-this-contributor-entity
</rioxxterms:contributor>

Where the contributor is a person, the RECOMMENDED format is to add text in the form Last Name, First Name(s), and to include a recognised persistent identifier scheme, such as an ORCID ID, if known, in its HTTPS URI form, e.g.

<rioxxterms:contributor uri="https://orcid.org/0000-0002-1919-4138">
    Milgrom, Paul
</rioxxterms:contributor>
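As an illustration of checking the uri attribute before it is written into a record, the following minimal sketch tests whether a string is an ORCID iD in HTTPS URI form. It is not part of the Rioxx profile: the function name is hypothetical, and the check character calculation is the ISO 7064 MOD 11-2 scheme that ORCID documents for its identifiers.

```python
import re

# ORCID iD in HTTPS URI form: four groups of four characters, last may be "X".
ORCID_URI = re.compile(r"^https://orcid\.org/(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def is_valid_orcid_uri(uri: str) -> bool:
    """Return True if uri is an HTTPS ORCID URI with a correct check character."""
    match = ORCID_URI.match(uri)
    if match is None:
        return False
    chars = match.group(1).replace("-", "")
    # ISO 7064 MOD 11-2 over the first 15 digits.
    total = 0
    for digit in chars[:15]:
        total = (total + int(digit)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return chars[15] == expected
```

A repository might run such a check at deposit time and fall back to a name-only property when the identifier fails it.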

Where the contributor is an organisation, the RECOMMENDED format is to add the official name of the organisation, and to include an identifier from a recognised persistent identifier scheme in its HTTP(S) URI form. Suitable schemes include ISNI, the Research Organization Registry (ROR), the Global Research Identifier Database (GRID), VIAF or a Wikidata concept URI, e.g.

<rioxxterms:contributor uri="https://isni.org/isni/0000000419368956">
    Stanford University
</rioxxterms:contributor>

rioxxterms:author One or more

The author of the resource may be a person, organisation or service, but is most commonly a person. This property SHOULD take an optional attribute called uri, which MUST contain a URI which uniquely identifies the author. Where there is more than one author, a separate rioxxterms:author property MUST be used for each. As many authors may be entered as required.

The ideal use of this property is to include both an HTTP(S) URI in the uri attribute and a text string in the body of the property, thus:

<rioxxterms:author uri="https-uri-for-this-author-entity">
    name-of-this-author-entity
</rioxxterms:author>

Where the author is a person, the RECOMMENDED format is to add text in the form Last Name, First Name(s), and to include a recognised persistent identifier scheme such as an ORCID ID, if known, in its HTTPS URI form, e.g.

<rioxxterms:author uri="https://orcid.org/0000-0001-5305-9450">
    Riccardi, Annalisa
</rioxxterms:author>

Where the author is an organisation, the RECOMMENDED format is to add the official name of the organisation, and to include an identifier from a recognised persistent identifier scheme in its HTTP(S) URI form. Suitable schemes include ISNI, the Research Organization Registry (ROR), the Global Research Identifier Database (GRID), VIAF or a Wikidata concept URI, e.g.

<rioxxterms:author uri="https://isni.org/isni/0000000419368139">
    University of Strathclyde
</rioxxterms:author>

Where the rioxxterms:author property appears multiple times for one record, it can be assumed that the order is significant, in that the first property describes the 'first named author' of the resource. To make this explicit, an extra attribute, first-named-author, SHOULD be used to indicate which of the rioxxterms:author properties describes the first named author of the resource, thus:

<rioxxterms:author uri="https://orcid.org/0000-0001-5305-9450" first-named-author="true">
    Riccardi, Annalisa
</rioxxterms:author>
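A consumer of Rioxx records could combine the two signals described above (the explicit attribute and document order) roughly as follows. This is a minimal sketch: the dictionary layout and the function name are assumptions made for illustration, not something the profile defines.

```python
from typing import Optional


def first_named_author(authors: list[dict]) -> Optional[dict]:
    """Pick the first named author from rioxxterms:author entries.

    `authors` is assumed to be in document order, each entry holding the
    property's attributes plus a hypothetical "name" key for its body.
    """
    # An explicit first-named-author="true" attribute wins.
    for author in authors:
        if author.get("first-named-author") == "true":
            return author
    # Otherwise fall back to document order.
    return authors[0] if authors else None
```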
dc:relation Zero or more

Although this property is not strictly mandated in the Rioxx application profile, it SHOULD be included because this is the property which harvesting software will inspect to find the URLs for resource file content - for example to locate the "full text" associated with a repository record.

The resource described by a Rioxx record is commonly a web page containing metadata and links to other resources, such as (in the case of a publication) a PDF file. The dc:relation property identifies these other, related resources. Each dc:relation property MUST contain an HTTP(S) URI, and SHOULD include the following attributes:

  • type
  • deposit_date
  • resource_exposed_date

The type attribute (if present) MUST contain a value which is an identifier from the schema.org vocabulary. For example, for the common case of the related resource being a PDF of a journal article, then the RECOMMENDED value would be https://schema.org/ScholarlyArticle

The deposit_date attribute (if present) takes the date on which this related resource was first deposited, irrespective of any relevant embargoes or dark archiving, and irrespective of any subsequent file replacement(s). It is anticipated that in some circumstances the deposit_date will be captured and exposed in repository metadata when the resource described is under temporary embargo or temporary dark archiving. If included, this attribute's value MUST be encoded according to the W3CDTF (a profile of ISO 8601), which typically takes the form YYYY-MM-DD.

The resource_exposed_date attribute (if present) takes the date on which this related resource was made publicly available, irrespective of any subsequent file replacement(s). If included, this attribute's value MUST be encoded according to the W3CDTF (a profile of ISO 8601), which typically takes the form YYYY-MM-DD. Repositories will typically populate resource_exposed_date when the related resource is made publicly visible immediately upon deposit, or when an applicable embargo ends.

Each related resource MUST appear as a separate instance of the property.

Example:

<dc:relation type="https://schema.org/ScholarlyArticle" deposit_date="2021-07-06" resource_exposed_date="2021-07-20">
    https://www.repository.org/article_123456_preprint.pdf
</dc:relation>

<dc:relation type="https://schema.org/ScholarlyArticle" deposit_date="2021-07-28" resource_exposed_date="2021-07-28">
    https://www.repository.org/article_1234567.pdf
</dc:relation>

<dc:relation type="https://schema.org/ScholarlyArticle" deposit_date="2022-03-14" resource_exposed_date="2022-03-14">
    https://www.repository.org/article_1234567_JATS.xml
</dc:relation>
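As a rough illustration of how harvesting software might use these attributes, the sketch below lists the related resources of a given type that were publicly exposed on or before a chosen day. The dictionary layout, the "url" key and the function name are assumptions for illustration; only dates in the YYYY-MM-DD form are handled.

```python
from datetime import date

# Relations copied from the example above; "url" is a hypothetical key
# standing in for the body of each dc:relation property.
RELATIONS = [
    {"type": "https://schema.org/ScholarlyArticle", "deposit_date": "2021-07-06",
     "resource_exposed_date": "2021-07-20",
     "url": "https://www.repository.org/article_123456_preprint.pdf"},
    {"type": "https://schema.org/ScholarlyArticle", "deposit_date": "2021-07-28",
     "resource_exposed_date": "2021-07-28",
     "url": "https://www.repository.org/article_1234567.pdf"},
    {"type": "https://schema.org/ScholarlyArticle", "deposit_date": "2022-03-14",
     "resource_exposed_date": "2022-03-14",
     "url": "https://www.repository.org/article_1234567_JATS.xml"},
]


def exposed_urls(relations, wanted_type, on_day):
    """Return URLs of relations of wanted_type exposed on or before on_day."""
    urls = []
    for rel in relations:
        exposed = rel.get("resource_exposed_date")
        # Skip other types and relations with no exposure date recorded.
        if rel.get("type") != wanted_type or exposed is None:
            continue
        # Assumes the YYYY-MM-DD form; other W3CDTF forms need extra parsing.
        if date.fromisoformat(exposed) <= on_day:
            urls.append(rel["url"])
    return urls
```

For the records above, a harvest run on 2021-07-25 would see only the preprint PDF, since the other two files were exposed later.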

The schema.org vocabulary accommodates a diverse range of creative work types. dc:relation can therefore also be used to communicate the existence of related data or software, for example by using the types Dataset or SoftwareSourceCode. These examples are suggested in the interest of contributing to the scholarly data graph in general.

Examples:

<dc:relation type="https://schema.org/Dataset" deposit_date="2022-01-13" resource_exposed_date="2022-01-20">
    https://doi.org/10.5281/zenodo.3538919
</dc:relation>

<dc:relation type="https://schema.org/SoftwareSourceCode" deposit_date="2022-03-23" resource_exposed_date="2022-04-18">
    https://github.com/covid19datahub/R
</dc:relation>

ali:license_ref One or more

This property is defined by the NISO Open Access Metadata and Indicators. It MUST take an HTTP(S) URI for its value. This URI MUST point to a resource which expresses the license terms specifying how the resource may be used.

This property MUST include the attribute:

  • start_date

This attribute indicates the date upon which this license takes effect. Its value MUST be encoded according to the W3CDTF (a profile of ISO 8601), which typically takes the form YYYY-MM-DD. Multiple ali:license_ref properties may be included. Where several such properties are included, the one with the start_date attribute indicating the most recent date takes precedence.

Example:

<ali:license_ref start_date="2020-11-17">
    https://creativecommons.org/licenses/by/4.0
</ali:license_ref>

This approach allows the expression of 'embargoes', where a particular license takes effect at a future date.
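One way to read the precedence rule together with the embargo case is that, on any given day, the applicable license is the one with the most recent start_date that is not later than that day. The sketch below implements that reading; it is an interpretation rather than a normative algorithm, the function name is hypothetical, and only start dates in the YYYY-MM-DD form are handled.

```python
from datetime import date
from typing import Optional


def license_in_effect(licenses: list[tuple[str, str]], on_day: date) -> Optional[str]:
    """Return the license URI in effect on on_day, or None if none has started.

    `licenses` holds (start_date, uri) pairs, one per ali:license_ref property.
    """
    # Keep only licenses whose start_date is on or before the chosen day.
    started = []
    for start, uri in licenses:
        start_day = date.fromisoformat(start)
        if start_day <= on_day:
            started.append((start_day, uri))
    if not started:
        return None
    # Among those, the most recent start_date takes precedence.
    return max(started, key=lambda item: item[0])[1]
```

For example, a record carrying a restrictive license from the deposit date and an open license with a later start_date would report the restrictive one until the later date arrives.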

In the absence of any other license, the copyright holder reserves all rights automatically. As a convenience, Rioxx provides two URLs which may be used to explicitly convey this state:

dc:publisher Zero or more

This property contains the name of the entity, typically a 'publisher', responsible for making the version of record of the resource available. This could be a person, organisation or service.

Where available and possible, the RECOMMENDED format is to add the official name of the publisher, and to include a recognised persistent identifier in its HTTP(S) URI form as an attribute, e.g.

<dc:publisher uri="https://isni.org/isni/000000040482455X">
    Public Library of Science
</dc:publisher>

<dc:publisher uri="https://ror.org/00qpqrv96">
    Ubiquity Press (United Kingdom)
</dc:publisher>

Typical persistent identifier schemes likely to be relevant here include ISNI, ROR, GRID, VIAF. Where a recognised persistent identifier is unavailable or cannot be provided, the name of the publisher entered here SHOULD instead be from a controlled list.

rioxxterms:grant One or more

The purpose of rioxxterms:grant is to collect grant ID(s), issued by the relevant funder(s), that relate to the resource being described, together with the name and/or global identifier for the funder(s).

The property MUST contain one grant ID. A grant ID can take the form of any identifier provided by the funder, preferably represented as an HTTP(S) URI. In cases where the resource has been funded internally, an appropriate internal code might be used.

The property takes two attributes: funder_name and funder_id. One or both of funder_name and funder_id MUST be supplied.

funder_name

The canonical name of the entity responsible for funding the resource SHOULD be recorded here as text.

funder_id

A globally unique identifier for the funder of the resource SHOULD be recorded here. An HTTP(S) URI MUST be used for this. It is RECOMMENDED that one of the following identifier schemes is used:

Examples

<rioxxterms:grant
    funder_name="Wellcome Trust"
    funder_id="https://isni.org/isni/0000000404277672">
    https://doi.org/10.35802/218671
</rioxxterms:grant>

or

<rioxxterms:grant
    funder_name="Arts and Humanities Research Council"
    funder_id="https://ror.org/0505m1554">
    AH/W007622/1
</rioxxterms:grant>

Where the resource has been funded by more than one funder a separate rioxxterms:grant property MUST be added for each. Similarly, where several grant IDs provided by the same funder have been attached to the resource, a separate rioxxterms:grant property MUST be added for each.

This means that it is permissible for a given funder_name or funder_id to appear in multiple instances of the rioxxterms:grant property in a single Rioxx metadata record.
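The attribute rules for a single rioxxterms:grant property can be checked mechanically. The following is a minimal sketch under the assumption that the property body and attributes have already been extracted as strings; the function name and the wording of the messages are illustrative only.

```python
from typing import Optional


def grant_problems(grant_id: str,
                   funder_name: Optional[str] = None,
                   funder_id: Optional[str] = None) -> list[str]:
    """Return a list of rule violations for one rioxxterms:grant property."""
    problems = []
    # The property body MUST contain one grant ID.
    if not grant_id.strip():
        problems.append("grant ID is missing")
    # One or both of funder_name and funder_id MUST be supplied.
    if not funder_name and not funder_id:
        problems.append("one or both of funder_name and funder_id MUST be supplied")
    # When present, funder_id MUST be an HTTP(S) URI.
    if funder_id and not funder_id.startswith(("http://", "https://")):
        problems.append("funder_id MUST be an HTTP(S) URI")
    return problems
```

An empty list means the property passes these checks; a record with several grants would run the check once per rioxxterms:grant property.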

rioxxterms:project One or more

The purpose of rioxxterms:project is to collect project ID(s) that relate to the resource.

The rioxxterms:project property MUST contain one project ID, a globally unique persistent identifier that identifies a project, such as a local identifier rendered as a persistent identifier or a RAiD handle.

Example

<rioxxterms:project>
    https://handle.net/10378.1/1590366
</rioxxterms:project>

Where the resource is associated with more than one project ID, a rioxxterms:project property MUST be added for each. This means that it is permissible for multiple instances of the rioxxterms:project property to appear in a single Rioxx metadata record.

rioxxterms:version Exactly one

This property indicates which 'version' of the resource is being described. The value of this property MUST be one of the following:

  • AO
  • SMUR
  • AM
  • P
  • VoR
  • CVoR
  • EVoR
  • NA

These terms are adopted from the NISO RP-8-2008 Journal Article Versions (JAV) standard and have the following meanings:

  • AO = Author's Original
  • SMUR = Submitted Manuscript Under Review
  • AM = Accepted Manuscript
  • P = Proof
  • VoR = Version of Record
  • CVoR = Corrected Version of Record
  • EVoR = Enhanced Version of Record
  • NA = Not Applicable (or Unknown)
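The permitted values and their meanings can be held in a simple lookup table. The sketch below assumes that values are matched exactly as written above (treating them as case-sensitive is an assumption); the names are illustrative.

```python
# Permitted rioxxterms:version values and their meanings, as listed above.
VERSION_LABELS = {
    "AO": "Author's Original",
    "SMUR": "Submitted Manuscript Under Review",
    "AM": "Accepted Manuscript",
    "P": "Proof",
    "VoR": "Version of Record",
    "CVoR": "Corrected Version of Record",
    "EVoR": "Enhanced Version of Record",
    "NA": "Not Applicable (or Unknown)",
}


def version_label(code: str) -> str:
    """Return the meaning of a rioxxterms:version value, or raise ValueError."""
    try:
        return VERSION_LABELS[code]
    except KeyError:
        raise ValueError(f"not a permitted rioxxterms:version value: {code!r}") from None
```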
dc:format Zero or one

This refers to the format of the resource being described. The MIME type of the object pointed to by this Rioxx record’s dc:identifier property MUST be entered here (see the official IANA list of MIME types). Note that this property should not be confused with rioxxterms:type.

Examples might include:

  • application/pdf
  • text/html
  • application/msword

dc:identifier Exactly one

dc:identifier MUST contain an HTTP(S) URI which is a persistent identifier for the resource. In repositories, this is typically a webpage which includes links to other related resources. It is RECOMMENDED that a DOI, Handle, URN, or other persistent identification scheme be used. In the common case of a "splash-page" linking to related files (potentially in different formats), one or more instances of the dc:relation property may be included in the Rioxx record to convey this and thereby direct harvesting software agents.

Note that dc:identifier should not be confused with rioxxterms:version_of_record.

dc:coverage Zero or more

Coverage (dc:coverage) will typically include a temporal period (a period label, date, or date range) or jurisdiction (such as a named administrative entity).

In line with the OpenAIRE Guidelines, which recommend the inclusion of this property, dc:coverage is also considered a RECOMMENDED property in Rioxx.

rioxxterms:publication_date Zero or one

This property takes the publication date of the resource in the form in which it would be cited. This allows a Rioxx record to function as a reasonable bibliographic record for the resource.

Where possible, the property's value SHOULD be encoded according to the W3CDTF (a profile of ISO 8601), which typically takes the form YYYY-MM-DD.

Example:

<rioxxterms:publication_date>
    2011-02-23
</rioxxterms:publication_date>

As Rioxx can be used to help establish compliance with funders' mandates and licensing of open access publications, the critical dates for the assertion of compliance are those held in the start_date attributes of the ali:license_ref properties.

It is acknowledged that the publication date conventions of certain publishers vary, making the identification of precise publication dates problematic, especially where a publisher assigns a resource to a seasonal issue date, e.g. Spring 2020, Winter 2019, etc. To maintain adherence to the encoding conventions noted above, publication dates assigned to seasonal issues should be expressed in the form YYYY-MM, with each season mapped to a month as follows:

  • 01 = winter (beginning of year)
  • 04 = spring
  • 07 = summer
  • 10 = autumn
  • 12 = winter (end of year)

Examples:

Spring 2020

<rioxxterms:publication_date>
    2020-04
</rioxxterms:publication_date>

Winter 2019 (end of year)

<rioxxterms:publication_date>
    2019-12
</rioxxterms:publication_date>
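The seasonal convention above amounts to a small lookup, with winter needing an extra hint because it maps to two different months. A minimal sketch, in which the function name and the end_of_year parameter are assumptions for illustration:

```python
# Month numbers for the seasons that map to exactly one month.
SEASON_MONTHS = {"spring": "04", "summer": "07", "autumn": "10"}


def seasonal_publication_date(season: str, year: int, end_of_year: bool = False) -> str:
    """Encode a seasonal issue date as YYYY-MM using the convention above."""
    key = season.strip().lower()
    if key == "winter":
        # Winter maps to 01 (beginning of year) or 12 (end of year).
        month = "12" if end_of_year else "01"
    elif key in SEASON_MONTHS:
        month = SEASON_MONTHS[key]
    else:
        raise ValueError(f"unrecognised season: {season!r}")
    return f"{year:04d}-{month}"
```

Whether a winter issue belongs to the beginning or the end of the year cannot be inferred from the label alone, so the caller has to supply it.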
dc:description Zero or more

This field may be indexed and its contents presented to people conducting searches. The goal is to describe the content of the resource using free text. It is RECOMMENDED that an English language abstract be used where available. HTML or other markup tags SHOULD NOT be included in this field.

dc:language One or more

This refers to the primary language in which the content of the resource is presented. The property MAY be repeated if the resource contains multiple languages. Values used for this property MUST conform to ISO 639-3. This offers two and three letter tags e.g. "en" or "eng" for English and "en-GB" for English used in the UK.

dc:source Zero or one

The source label describes a resource from which the resource is derived (in whole or in part). It is RECOMMENDED that the source is referenced using a unique identifier from a recognised system, e.g. the unique 8-digit International Standard Serial Number (ISSN) assigned to electronic periodicals, or the 13-digit International Standard Book Number (ISBN13) assigned to books. In the latter case, the ISBN13 for the electronic version of the book SHOULD be used if available.

Use of this property is applicable where the resource is to be published as part of a larger resource. Examples include a conference paper belonging to proceedings or a chapter of a book, but not a complete book.

dc:title Exactly one

This refers to the title, and any sub-titles, of the resource. The title should be represented using the original spelling and wording. The RECOMMENDED format for expressing subtitles is:

Title: Subtitle

Note that where the resource is a chapter in a book, the chapter title MUST be entered here, with the ISBN13 of the book being recorded in the dc:source property.

rioxxterms:version_of_record Zero or one

This property MUST contain an HTTP(S) URI which is a persistent identifier for the published 'version of record' of the resource. If a DOI has been issued by the publisher then this MUST be used (and such a DOI MUST be represented in its HTTP(S) form). For example:

<rioxxterms:version_of_record>
    https://doi.org/10.1103/PhysRevD.102.043015
</rioxxterms:version_of_record>
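Repositories often hold DOIs in bare or prefixed forms, so a small normalisation step may be needed before the value is written to this property. The sketch below is one possible approach; the function name and the pattern used to recognise a DOI are assumptions, and the pattern is deliberately loose.

```python
import re

# Accepts a bare DOI, a "doi:" prefix, or an existing doi.org / dx.doi.org URL.
DOI_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:)?(10\.\d{4,9}/\S+)$",
    re.IGNORECASE,
)


def doi_to_https(value: str) -> str:
    """Return the HTTPS URI form of a DOI, or raise ValueError."""
    match = DOI_PATTERN.match(value.strip())
    if match is None:
        raise ValueError(f"not recognised as a DOI: {value!r}")
    return "https://doi.org/" + match.group(1)
```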
dcterms:date_accepted Exactly one

The date on which the resource was accepted for publication. Property content MUST be encoded according to the W3CDTF (a profile of ISO 8601), which typically takes the form YYYY-MM-DD.

dc:subject Zero or more

The OpenAIRE Guidelines recommend the inclusion of this property.

rioxxterms:record_public_release_date Zero or one

This property takes the date upon which metadata about the resource being described was first made publicly visible. Property content MUST be encoded according to the W3CDTF (a profile of ISO 8601), which typically takes the form YYYY-MM-DD.

Examples:

<rioxxterms:record_public_release_date>
    2020-10-02
</rioxxterms:record_public_release_date>

or

<rioxxterms:record_public_release_date>
    2020-09-29T19:20+01:00
</rioxxterms:record_public_release_date>

It is anticipated that in many circumstances rioxxterms:record_public_release_date will be captured and exposed in repository metadata prior to availability of related resources such as the "full text" for a publication; for example as the result of a delay in depositing the full text, or where it is under temporary embargo or temporary dark archiving.
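Both example values above are valid W3CDTF, which permits several levels of granularity from a bare year to a full timestamp with a time zone designator. The following sketch checks a string against those forms; the function name is hypothetical and the check is a simplified reading of the W3C note rather than a reference implementation.

```python
import re
from datetime import date

# W3CDTF granularities: YYYY, YYYY-MM, YYYY-MM-DD, and date plus time
# (hh:mm, optional :ss and fraction) with a mandatory time zone designator.
W3CDTF = re.compile(
    r"^(\d{4})"
    r"(?:-(\d{2})"
    r"(?:-(\d{2})"
    r"(?:T(\d{2}):(\d{2})(?::(\d{2})(?:\.\d+)?)?"
    r"(?:Z|[+-]\d{2}:\d{2})"
    r")?"
    r")?"
    r")?$"
)


def is_w3cdtf(value: str) -> bool:
    """Return True if value matches one of the W3CDTF forms with valid ranges."""
    match = W3CDTF.match(value)
    if match is None:
        return False
    year, month, day, hour, minute, second = (
        None if group is None else int(group) for group in match.groups()
    )
    try:
        # Validates month and day ranges, including month lengths.
        date(year, 1 if month is None else month, 1 if day is None else day)
    except ValueError:
        return False
    if hour is not None and (hour > 23 or minute > 59):
        return False
    if second is not None and second > 59:
        return False
    return True
```

The same check could be applied to the other date-valued properties and attributes in this profile, such as start_date, deposit_date and dcterms:date_accepted.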

rioxxterms:type One or more

Type refers to the nature or genre of the content of the resource. This property should not be confused with dc:format.

Values recorded at rioxxterms:type MUST be taken from the COAR Controlled Vocabulary for Resource Type Genres (Version 3.0), which provides a hierarchical model of resource type genres supported by language independent HTTP(S) URIs.

Example:

<rioxxterms:type uri="https://purl.org/coar/resource_type/c_5794">
    conference paper
</rioxxterms:type>

The COAR Controlled Vocabulary for Resource Type Genres is detailed in its treatment of type genres. It is anticipated that only the largest repositories would accommodate all vocabulary values, with most others implementing a subset in line with the resource types managed by the repository.