ARCHE provides several interfaces to search and retrieve its data in programmatic ways.
ARCHE APIs are built as microservices: the ARCHE core service provides only basic functionality for creating, reading, updating, and deleting resources (CRUD), while all other features (e.g. OAI-PMH) are handled by separate small services (called dissemination services) built on top of the core API.
ARCHE Core API #
The main ARCHE API provides the core functionality for searching, reading, creating, updating, and deleting. It allows you to access a resource’s binary content as well as its metadata in various RDF serialisations (turtle, RDF/XML, JSON-LD, n-triples). For example, to get the RDF metadata of all resources belonging to the collection with the identifier https://hdl.handle.net/21.11115/0000-000C-D89B-2 in turtle format, you can use:
https://arche.acdh.oeaw.ac.at/api/search?property[]=https%3A%2F%2Fvocabs.acdh.oeaw.ac.at%2Fschema%23hasIdentifier&value[]=https%3A%2F%2Fhdl.handle.net%2F21.11115%2F0000-000C-D89B-2&readMode=relativesOnly&format=text%2Fturtle
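The query string of such a search URL is easy to get wrong by hand because the property and the identifier are themselves URLs that must be percent-encoded. The following Python sketch rebuilds the example URL using only the standard library. The function name `build_search_url` and its default arguments are hypothetical; the endpoint, the parameter names and their values are copied from the example above.

```python
from urllib.parse import urlencode

# Endpoint and property URI are copied from the example URL in this section.
SEARCH_ENDPOINT = "https://arche.acdh.oeaw.ac.at/api/search"
HAS_IDENTIFIER = "https://vocabs.acdh.oeaw.ac.at/schema#hasIdentifier"


def build_search_url(identifier, read_mode="relativesOnly", rdf_format="text/turtle"):
    """Hypothetical helper: search URL matching resources by their identifier."""
    params = [
        ("property[]", HAS_IDENTIFIER),
        ("value[]", identifier),
        ("readMode", read_mode),
        ("format", rdf_format),
    ]
    # safe="[]" leaves the brackets in "property[]" and "value[]" unencoded;
    # ":", "/" and "#" inside the values are percent-encoded.
    return SEARCH_ENDPOINT + "?" + urlencode(params, safe="[]")


url = build_search_url("https://hdl.handle.net/21.11115/0000-000C-D89B-2")
print(url)
```

Passing `rdf_format="application/n-triples"` should yield the n-triples variant of the same query, assuming the `format` parameter accepts every media type listed above.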
Detailed documentation of the core API (following the OpenAPI 3 standard) can be found on SwaggerHub.
Practical tips:
- If you process the output programmatically, it is strongly advised to use the n-triples format to ensure best performance, as detailed in the Metadata API performance documentation.
- The best way of dealing with RDF output provided by our metadata and search endpoints is to use a dedicated RDF parsing library (such libraries exist for all major programming languages). Parsing RDF with regular expressions or assuming a particular structure of an XML or JSON serialisation tree is error-prone because the same RDF data can be serialised into turtle, XML or JSON in various ways.
- If you are using PHP, there is a ready-to-use library on GitHub providing bindings to this API: arche-lib.
Automatic Format Negotiation #
All ARCHE resource URIs beginning with either https://hdl.handle.net/ or https://id.acdh.oeaw.ac.at/ (so-called PIDs) allow you to specify a desired serialisation of the requested resource.
If there is a dissemination service capable of providing a requested serialisation, you will be automatically redirected to it. If not, you will be redirected to the “raw” resource content, i.e. a binary file or metadata for resources without binary content (e.g. information about a person).
The requested format can be specified either by providing a media type (MIME type) or the name of the desired dissemination service in the format request query parameter, e.g.:
- To get a thumbnail image of a given resource, you can request it in the image/png media type or explicitly use the thumbnail dissemination service, e.g.: https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=image%2Fpng, https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=thumbnail, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=image%2Fpng, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=thumbnail.
- To get a resource’s metadata in the turtle format, you can request it in the text/turtle media type or use the metadata dissemination service, e.g.: https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=text%2Fturtle, https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=metadata, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=text%2Fturtle, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=metadata.
- To get a bibliographic entry of a resource in BibLaTeX format, you can request it with the application/x-bibtex media type or use the biblatex dissemination service, e.g.: https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=application%2Fx-bibtex, https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=biblatex, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=application%2Fx-bibtex, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=biblatex.
Practical tips:
- To get a list of all dissemination services available for a given resource you can use the special value of the format parameter __list__: https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=__list__, https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=__list__.
- Some dissemination services can only process selected ARCHE resources, e.g. the dissemination service Custom TEI to HTML transformation can only be used for ARCHE resources storing TEI/XML.
- A more detailed description of the resolution process is provided in the ARCHE Suite Dissemination Services documentation.
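The examples in this section follow one pattern: the format is appended with ? for identifiers under https://id.acdh.oeaw.ac.at/ and with @ for handles under https://hdl.handle.net/. A minimal Python sketch of that pattern is given below. The helper name `dissemination_url` is hypothetical, and the separator rule is inferred from the example URLs rather than from a formal specification.

```python
from urllib.parse import quote

HANDLE_PREFIX = "https://hdl.handle.net/"
ID_PREFIX = "https://id.acdh.oeaw.ac.at/"


def dissemination_url(pid, fmt):
    """Hypothetical helper: append a format request to an ARCHE PID.

    fmt is either a media type (e.g. "image/png") or the name of a
    dissemination service (e.g. "thumbnail", "metadata", "__list__").
    """
    if pid.startswith(HANDLE_PREFIX):
        separator = "@"  # handles use "@" in the examples above
    elif pid.startswith(ID_PREFIX):
        separator = "?"  # id.acdh.oeaw.ac.at identifiers use "?"
    else:
        raise ValueError(f"not an ARCHE PID: {pid}")
    # safe="" makes sure the "/" in media types becomes %2F
    return f"{pid}{separator}format={quote(fmt, safe='')}"


print(dissemination_url("https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler", "image/png"))
print(dissemination_url("https://hdl.handle.net/21.11115/0000-000E-C8A6-5", "__list__"))
```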
RDF #
An RDF serialisation of any ARCHE resource's metadata can be obtained using the URL patterns {resource id starting with https://id.acdh.oeaw.ac.at/}?format=metadata and {resource PID starting with https://hdl.handle.net/}@format=metadata (the @ instead of the ? is not a typo), e.g. https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=metadata or https://hdl.handle.net/21.11115/0000-000E-C8A6-5@format=metadata.
You can specify the desired RDF serialisation format by providing its media type (MIME type) in the HTTP Accept header (text/turtle, application/n-triples, application/rdf+xml, application/ld+json and text/html are supported), e.g.
curl -L -H 'Accept: application/rdf+xml' 'https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler/indices/listplace.xml?format=metadata'
Practical tips:
- If you try it out in a browser, please be aware that browsers always include text/html in the HTTP Accept header of the requests they make, so you will always get an HTML serialisation of the metadata. Use a dedicated tool like curl or Postman if you want better control over the HTTP requests you send by hand.
- The best way of dealing with RDF output provided by our metadata and search endpoints is to use a dedicated RDF parsing library (such libraries exist for all major programming languages). Parsing RDF with regular expressions or assuming a particular structure of an XML or JSON serialisation tree is error-prone because the same RDF data can be serialised into turtle, XML or JSON in various ways.
- If you are processing the metadata programmatically, use the application/n-triples format. It is the fastest on both the server and the client side and reduces the server load, as documented in the Metadata API performance documentation.
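The same request can be made from a script. The Python sketch below sets the Accept header with the standard library only and defaults to application/n-triples, as recommended in the tips. The function names are hypothetical; `fetch_metadata` needs network access and is therefore defined but not called here.

```python
from urllib.request import Request, urlopen


def build_metadata_request(resource_url, media_type="application/n-triples"):
    """Hypothetical helper: request object asking for one RDF serialisation."""
    return Request(resource_url, headers={"Accept": media_type})


def fetch_metadata(resource_url, media_type="application/n-triples"):
    """Send the request and return the response body as text.

    urlopen follows HTTP redirects by default, which corresponds to
    the -L flag in the curl example.
    """
    request = build_metadata_request(resource_url, media_type)
    with urlopen(request, timeout=30) as response:
        return response.read().decode("utf-8")


request = build_metadata_request(
    "https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=metadata"
)
print(request.full_url)
print(request.get_header("Accept"))
```

The returned text is best handed to an RDF parsing library rather than processed with string operations.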
BibLaTeX #
A bibliographic entry in the BibLaTeX format can be obtained for any ARCHE resource by requesting its dissemination either in biblatex or application/x-bibtex format (see the Automatic Format Negotiation section above for details), e.g. https://id.acdh.oeaw.ac.at/schnitzler/bahrschnitzler?format=biblatex.
The service takes two optional request query parameters:
- lang - preferred language (e.g. en or de). While there is no guarantee of values (like title) being available in a requested language, they will be used if present, e.g. compare https://id.acdh.oeaw.ac.at/dial?format=biblatex&lang=en and https://id.acdh.oeaw.ac.at/dial?format=biblatex&lang=de.
- override - a bibliographic entry in BibLaTeX format that overrides and/or extends field values mapped from the repository metadata. The returned entry is a merge of the data coming from the repository and the data provided with this parameter; where both define a field, the value provided with this parameter takes precedence. More information is included in the README of arche-bibtex on GitHub. A basic example providing an alternative title for the resource: https://id.acdh.oeaw.ac.at/dial?format=biblatex&override=title%20%3D%20%7BThe%20Digital%20%7BIlse%20Aichinger%7D%20List%20of%20Literature%7D.
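The override value contains spaces, braces and an equals sign, so it has to be percent-encoded before it goes into the URL. Below is a small Python sketch that does this with the standard library. The helper name `biblatex_url` is hypothetical, and it assumes an identifier under https://id.acdh.oeaw.ac.at/ (which uses ? as the separator).

```python
from urllib.parse import quote, urlencode


def biblatex_url(resource_id, lang=None, override=None):
    """Hypothetical helper: BibLaTeX dissemination URL with optional parameters."""
    params = {"format": "biblatex"}
    if lang is not None:
        params["lang"] = lang
    if override is not None:
        params["override"] = override
    # quote_via=quote encodes spaces as %20 (urlencode's default would use "+")
    return resource_id + "?" + urlencode(params, quote_via=quote)


print(biblatex_url("https://id.acdh.oeaw.ac.at/dial", lang="de"))
print(biblatex_url(
    "https://id.acdh.oeaw.ac.at/dial",
    override="title = {The Digital {Ilse Aichinger} List of Literature}",
))
```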
Thumbnails #
A thumbnail of any ARCHE resource can be obtained by requesting its dissemination in the thumbnail format (see the Automatic Format Negotiation section above for details), e.g. https://id.acdh.oeaw.ac.at/digitarium/facs/17030808-000.png?format=thumbnail or https://hdl.handle.net/21.11115/0000-0010-3310-2@format=thumbnail.
The service takes two optional request query parameters, width and height (each defaulting to 100), which set the thumbnail image size in pixels, e.g. https://id.acdh.oeaw.ac.at/digitarium/facs/17030808-000.png?format=thumbnail&width=700&height=600.
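As a rough illustration, the thumbnail URL can be assembled as follows in Python. The helper name `thumbnail_url` is hypothetical, the positive-integer check is an added assumption about sensible input, and the sketch assumes an identifier under https://id.acdh.oeaw.ac.at/.

```python
from urllib.parse import urlencode


def thumbnail_url(resource_id, width=None, height=None):
    """Hypothetical helper: thumbnail URL; omitted sizes use the service default."""
    params = {"format": "thumbnail"}
    for name, value in (("width", width), ("height", height)):
        if value is None:
            continue  # leave the parameter out so the service default applies
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer")
        params[name] = value
    return resource_id + "?" + urlencode(params)


print(thumbnail_url(
    "https://id.acdh.oeaw.ac.at/digitarium/facs/17030808-000.png",
    width=700,
    height=600,
))
```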
Reconciliation Service API (OpenRefine) #
ARCHE provides a Reconciliation Service API endpoint, which can be used, for example, in OpenRefine.
The endpoint URL that can be used to connect ARCHE to OpenRefine is https://arche.acdh.oeaw.ac.at/openrefine/reconcile.
IIIF #
Images (TIF, PNG) stored in ARCHE can be accessed through the IIIF protocol. The IIIF endpoint is implemented with the Loris IIIF Image Server. To access an image’s IIIF endpoint you should request its dissemination in the iiif format (see the Automatic Format Negotiation section above for details), e.g.: https://id.acdh.oeaw.ac.at/ODeeg/Collections/AT-Vienna-KHM/KHM-ANSA-IV1/Photos/KHM-ANSA-IV1_im01.tif?format=iiif.
You can use the param query parameter to pass additional IIIF Image API parameters, e.g. https://id.acdh.oeaw.ac.at/ODeeg/Collections/AT-Vienna-KHM/KHM-ANSA-IV1/Photos/KHM-ANSA-IV1_im01.tif?format=iiif&param=full/full/90/default.png to get the image rotated by 90 degrees and converted to PNG or https://id.acdh.oeaw.ac.at/pez-nachlass/425627.tif?format=iiif&param=info.json to retrieve the image's info.json. Please consult the IIIF Image API documentation for the list of available IIIF Image API parameters.
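The value passed in the param query parameter follows the IIIF Image API request pattern {region}/{size}/{rotation}/{quality}.{format}. The Python sketch below assembles such a value and the resulting ARCHE URL. Both function names are hypothetical, and the defaults (full region, full size, default quality, PNG) simply mirror the example above.

```python
from urllib.parse import urlencode


def iiif_param(region="full", size="full", rotation=0, quality="default", image_format="png"):
    """Hypothetical helper: IIIF Image API request as region/size/rotation/quality.format."""
    return f"{region}/{size}/{rotation}/{quality}.{image_format}"


def iiif_url(resource_id, param=None):
    """Hypothetical helper: IIIF dissemination URL, optionally with a param value."""
    params = {"format": "iiif"}
    if param is not None:
        params["param"] = param
    # safe="/" keeps the slashes of the IIIF request unencoded, as in the examples
    return resource_id + "?" + urlencode(params, safe="/")


IMAGE = (
    "https://id.acdh.oeaw.ac.at/ODeeg/Collections/AT-Vienna-KHM/"
    "KHM-ANSA-IV1/Photos/KHM-ANSA-IV1_im01.tif"
)
print(iiif_url(IMAGE, iiif_param(rotation=90)))  # image rotated by 90 degrees, as PNG
print(iiif_url("https://id.acdh.oeaw.ac.at/pez-nachlass/425627.tif", "info.json"))
```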
OAI-PMH #
ARCHE implements the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). The corresponding endpoint can be found at https://arche.acdh.oeaw.ac.at/oaipmh/. Please consult the OAI-PMH specification for further information on how to interact with this endpoint.
If you want to get a list of all records in ARCHE you can try this (takes some time!): https://arche.acdh.oeaw.ac.at/oai?verb=ListIdentifiers&metadataPrefix=oai_dc.
If you want to see information for a specific record you can use this: https://arche.acdh.oeaw.ac.at/oai?verb=GetRecord&metadataPrefix=oai_dc&identifier=https://hdl.handle.net/21.11115/0000-000B-C715-D.
ARCHE provides metadata in the following representations via OAI-PMH:
- Dublin Core (example in ARCHE; available for all ARCHE resources)
- RDF/XML serialisation of the ARCHE metadata (example record in ARCHE; available for all ARCHE resources; schema description)
- CMDI (only for resources in the clarin-vlo OAI-PMH set)
- Kulturpool (only for resources in the kulturpool OAI-PMH set)
You can get a list of all supported metadata formats with this: https://arche.acdh.oeaw.ac.at/oai?verb=ListMetadataFormats.
Since resources in ARCHE cover a wide range of humanities data, OAI-PMH Sets were introduced to allow for selective harvesting. You can get a list of all sets with this: https://arche.acdh.oeaw.ac.at/oai?verb=ListSets.
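The requests above differ only in the verb and its arguments, so a harvesting script can build them with one small function and read the XML responses with a standard parser. The Python sketch below is a minimal illustration, not a full harvester. The function names are hypothetical, the base URL is the one used in the example requests of this section, and SAMPLE_RESPONSE is a hand-written fragment shaped like an OAI-PMH ListSets response (only the two set names come from this page), not actual ARCHE output.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Base URL as used in the example requests of this section.
OAI_ENDPOINT = "https://arche.acdh.oeaw.ac.at/oai"
# XML namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def oai_url(verb, **arguments):
    """Hypothetical helper: OAI-PMH request URL for a verb and its arguments."""
    return OAI_ENDPOINT + "?" + urlencode({"verb": verb, **arguments})


def parse_set_specs(xml_text):
    """Return the setSpec values found in a ListSets response."""
    root = ET.fromstring(xml_text)
    return [
        element.text
        for element in root.findall("./oai:ListSets/oai:set/oai:setSpec", OAI_NS)
    ]


# Hand-written sample for illustration only, not an actual response.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListSets>
    <set><setSpec>clarin-vlo</setSpec><setName>clarin-vlo</setName></set>
    <set><setSpec>kulturpool</setSpec><setName>kulturpool</setName></set>
  </ListSets>
</OAI-PMH>
"""

print(oai_url("ListSets"))
print(oai_url("ListIdentifiers", metadataPrefix="oai_dc", set="clarin-vlo"))
print(parse_set_specs(SAMPLE_RESPONSE))
```

The `set` argument is the standard OAI-PMH way of restricting ListIdentifiers or ListRecords to one set. Large result lists are usually split into pages linked by a resumptionToken, which this sketch does not handle.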
External Aggregators #
The metadata for the resources stored in ARCHE is already being harvested via OAI-PMH by external aggregators.
- CLARIN Virtual Language Observatory (VLO): collects language related resources. Click here to get an overview of ARCHE’s resources in VLO and click here to get to a more technical overview of the harvested records. ARCHE’s OAI-PMH set harvested by the VLO can be found here.
- Kulturpool: collects digital Austrian cultural heritage assets. It also acts as a central data provider for Europeana. The harvested records are displayed here and can be found on Europeana as well. The set provided by ARCHE can be found here.