Persistent identifiers

Preserving digital content for the long term is challenging: technology becomes obsolete, hardware and software fail, data gets corrupted, and web links are short-lived and easily changed. To tackle this last problem, the National Library has implemented a persistent identifier system based on the Archival Resource Key (ARK) format, which ensures that links to Luxembourg’s digital resources remain reliable and accessible. Any organisation located in Luxembourg and managing digital collections can benefit from this service.

Main benefits

The ARK format is based on the following key principles.

Wide scope of application

ARKs can be assigned to anything digital, physical, or abstract: digitised or born-digital documents and objects, (scientific) publications and datasets, genealogical records, museum specimens, educational resources, authors, scholars, and more. Since 2001, over 1,000 organisations across the world have registered to assign ARKs, including national and university libraries and archives, art museums, natural history museums, publishers, data centres, government agencies, vendors and research labs.

Persistence

ARKs guarantee continuous access to digital information, keeping links stable despite technological changes (such as access URL modifications due to system upgrades or format migrations) or organisational shifts (such as an organisation changing its name). They also make resources easier to cite: users can reference and identify a specific digital resource with a hyperlink in publications, scientific articles, web pages, or bookmarks, and that hyperlink keeps working over time.

Non-reassignment

Once an ARK has been assigned to an object, it is never reassigned to another one, so it remains unique indefinitely. Even if the resource is deleted, a published ARK must persist and direct users either to the resource’s metadata or to an explanation of why it is no longer accessible. Users may also be offered access to a substitute resource.
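The non-reassignment rule can be illustrated with a toy in-memory registry. This is only a sketch under stated assumptions: the class `ArkRegistry` and its methods are hypothetical names chosen for illustration and do not describe how persist.lu is actually implemented.

```python
from typing import Dict, Optional


class ArkRegistry:
    """Toy registry (hypothetical) showing that an ARK is never reused."""

    def __init__(self) -> None:
        # ARK -> current URL, or None once the resource has been withdrawn
        self._targets: Dict[str, Optional[str]] = {}
        # ARK -> explanation shown when the resource is no longer accessible
        self._notes: Dict[str, str] = {}

    def assign(self, ark: str, url: str) -> None:
        # Refuse any ARK that was assigned before, even if it was withdrawn
        if ark in self._targets:
            raise ValueError(f"{ark} is already assigned and cannot be reused")
        self._targets[ark] = url

    def withdraw(self, ark: str, reason: str) -> None:
        # The entry is kept; only its target is replaced by an explanation
        if ark not in self._targets:
            raise KeyError(ark)
        self._targets[ark] = None
        self._notes[ark] = reason

    def resolve(self, ark: str) -> str:
        target = self._targets[ark]
        if target is None:
            return f"unavailable: {self._notes[ark]}"
        return target
```

In this sketch, withdrawing a resource never frees its identifier: `resolve` keeps answering with an explanation, and a later `assign` with the same ARK is rejected.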

Opacity

The strings of characters used in persistent identifiers are typically “opaque” (meaningless), deliberately revealing little about what they’re assigned to. Because an opaque identifier encodes neither the location nor the description of a resource, it does not need to change when the location or metadata changes, so access stays consistent. By not revealing any information about the underlying resource, opaque identifiers also enhance privacy and security.

Metadata

Metadata (“data about data”, i.e., information about a resource) makes it much easier to understand and work with opaque identifiers, which give no hint of what they identify. Without metadata, the only way of knowing what object an ARK refers to would be to access it directly. Whether an ARK redirects to a webpage or a file, metadata provides crucial information about the object, such as its creation date, its origin, or whether newer versions are available. To access a resource’s metadata, simply append “?info” to its ARK (example: https://persist.lu/ark:70795/tm9z0j?info). While creating metadata isn’t required, it is strongly recommended because it makes ARKs easier to use and manage.
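As a small illustration of the “?info” convention, the following sketch builds the metadata URL for a given ARK URL. The function name `metadata_url` is hypothetical, and the code only manipulates strings; it makes no request to the resolver.

```python
def metadata_url(ark_url: str) -> str:
    """Return the URL that asks for an ARK's metadata instead of the object."""
    # Drop any fragment or query string already present, then append "?info"
    base = ark_url.split("#", 1)[0].split("?", 1)[0]
    return base + "?info"


print(metadata_url("https://persist.lu/ark:70795/tm9z0j"))
# prints: https://persist.lu/ark:70795/tm9z0j?info
```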

In the case of the BnL’s persistent identifier service, metadata includes details of the organisation that created the ARK, the current access URL for the resource, and when the ARK was created and last updated. For ease of understanding, this metadata is structured as if to answer the questions “who?”, “what?” and “when?”.

The “Policy / persistence statement” outlines the organisation’s commitment to how the resource will be disseminated, whether the content will change over time or for how long it will be available.

How do persistent identifiers work?

An ARK consists of three main parts: the resolver, the unique core identifier (also called base object name) and an optional suffix.

Resolver

A persistent identifier remains functional even if the linked resource moves to a different website. This is made possible by a resolver, a specialised website that redirects each identifier to its most suitable current location. The persist.lu website serves as the resolver for ARKs assigned to organisations in Luxembourg. To ensure effective redirection, institutions authorised to assign ARKs must keep their URLs up to date and notify the resolver of any changes.

A resolver’s address combines a protocol (e.g. “https” or “http”) with a hostname. This hostname is known as the Name Mapping Authority because it is a service that accepts a name as input and returns (“maps” it to) such things as object content, object metadata or object policies. While multiple hosting arrangements are possible, only some are intended to be stable over the long term. For instance, both https://persist.lu/ark:70795/tm9z0j and https://viewer.eluxemburgensia.lu/ark:70795/tm9z0j lead to the same resource (the 3 May 2002 edition of d’Lëtzebuerger Land), but only the persist.lu URL is guaranteed to keep working in the same way for decades to come.
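Because the same ARK can be reached through several hostnames, a citation tool might prefer to store the long-term form. The sketch below rewrites any ARK URL so that it points at persist.lu; the function name `to_stable_url` is hypothetical, and the code assumes the ARK starts right after the hostname, as in the two example URLs.

```python
from urllib.parse import urlsplit, urlunsplit

# Hostname described in the text as the long-term resolver
STABLE_RESOLVER = "persist.lu"


def to_stable_url(ark_url: str) -> str:
    """Swap the hostname of an ARK URL for the long-term resolver."""
    parts = urlsplit(ark_url)
    # Assumption: the path begins with the "ark:" label
    if not parts.path.startswith("/ark:"):
        raise ValueError(f"not an ARK URL: {ark_url!r}")
    # Keep the path and query unchanged; replace protocol and hostname only
    return urlunsplit(("https", STABLE_RESOLVER, parts.path, parts.query, ""))


print(to_stable_url("https://viewer.eluxemburgensia.lu/ark:70795/tm9z0j"))
# prints: https://persist.lu/ark:70795/tm9z0j
```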

Core identifier

The core identifier (or base object name) consists of three parts: the “ark:” label, the Name Assigning Authority Number (NAAN) and the assigned name of the object. The NAAN is a number or string of characters identifying an organisation that creates or assigns identifiers. In the above example, 70795 represents the NAAN assigned to the National Library, while “tm9z0j” identifies the specific object (in this case, a digitised newspaper edition).
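The structure described above can be made concrete by splitting an ARK URL into its resolver, NAAN and assigned name. This is a minimal sketch: `ArkParts` and `parse_ark` are hypothetical names, and the pattern only covers the layout shown in the example URL.

```python
import re
from typing import NamedTuple


class ArkParts(NamedTuple):
    resolver: str  # protocol + hostname, e.g. "https://persist.lu"
    naan: str      # Name Assigning Authority Number
    name: str      # assigned name of the object


# Layout taken from the example: <resolver>/ark:<NAAN>/<assigned name>
ARK_URL = re.compile(
    r"^(?P<resolver>https?://[^/]+)/"
    r"ark:(?P<naan>[^/]+)/"
    r"(?P<name>[^/?#]+)"
)


def parse_ark(url: str) -> ArkParts:
    """Split an ARK URL into resolver, NAAN and assigned name."""
    match = ARK_URL.match(url)
    if match is None:
        raise ValueError(f"not an ARK URL: {url!r}")
    return ArkParts(match["resolver"], match["naan"], match["name"])


parts = parse_ark("https://persist.lu/ark:70795/tm9z0j")
print(parts.naan, parts.name)  # prints: 70795 tm9z0j
```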

Optional suffixes

In ARK persistent identifiers, suffixes are optional elements that give extra detail about the identified resource. They let users address different aspects or variations of a resource, such as different versions, formats, or parts of a larger whole. While the core identifier consistently directs to the same resource or its metadata, the resolution of suffixes is not always assured, as it often depends on the capabilities of the specific access platform.
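Since only the core identifier is certain to resolve, a client may want to separate it from any suffix and keep the core URL as a fallback. The sketch below does this under an assumption not stated in the text: that a suffix begins at the first “/” or “.” after the assigned name, following the general ARK convention. The function name `split_suffix` and the suffix “/page/3” are hypothetical.

```python
import re
from typing import Tuple

# Assumption: the suffix starts at the first "/" or "." after the object name
_CORE_AND_SUFFIX = re.compile(
    r"^(https?://[^/]+/ark:[^/]+/[^/.?#]+)"  # resolver + core identifier
    r"([^?#]*)"                              # optional suffix
)


def split_suffix(ark_url: str) -> Tuple[str, str]:
    """Return (core URL, suffix); the suffix is empty when there is none."""
    match = _CORE_AND_SUFFIX.match(ark_url)
    if match is None:
        raise ValueError(f"not an ARK URL: {ark_url!r}")
    return match.group(1), match.group(2)


# "/page/3" is an invented suffix used purely for illustration
core, suffix = split_suffix("https://persist.lu/ark:70795/tm9z0j/page/3")
print(core)    # prints: https://persist.lu/ark:70795/tm9z0j
print(suffix)  # prints: /page/3
```

If a platform cannot resolve the full URL, the core URL returned here still identifies the resource as a whole.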

For Luxembourgish organisations

Any organisation located in Luxembourg and managing digital collections can take advantage of the persistent identifier service. In return, the organisation must agree to:

  • Update ARK-referenced URLs if access links change.
  • Ensure long-term digital preservation of objects referenced by ARKs.
  • Establish a system for explaining why an object is inaccessible if it’s deleted.
  • Guarantee the accessibility of ARK-referenced URLs on the internet, either granting access to the digital object or providing access conditions.

The National Archives (NAAN 76610) and the Musée national d’archéologie, d’histoire et d’art (NAAN 72849) are the first institutions to use the persistent identifier service.

Usage at the BnL

Having adopted ARK as its persistent identifier format, the BnL began assigning and resolving ARKs. Initially, the initiative focused on digitised documents available on eluxemburgensia.lu, where over 100,000 ARKs were assigned to digitised daily and weekly newspapers, manuscripts, books and posters.

Over the past few years, the assignment of ARK persistent identifiers has become a routine practice at the National Library. Nowadays, various materials such as born-digital publications, bibliographic records, authority records, and certain web pages or policy documents are eligible to receive ARKs.

At the same time, the National Library has undertaken the substantial task of replacing previously used references with ARKs in persist.lu URL format, across platforms like the a-z.lu portal and the BnL website.
