Filenames
For file and directory names, use only alphanumeric characters, underscores (_) and hyphens (-). Avoid special characters such as quotes, punctuation marks, characters with diacritics, spaces and slashes. For further guidance, see the recommendations from IANUS.
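The naming rule above can be checked automatically before deposit. The following Python sketch is only an illustration: the function names `is_safe_name` and `suggest_name` are hypothetical, and it assumes that a dot separating the base name from the file extension is acceptable, which the rule does not state explicitly.

```python
import re
import unicodedata

# Assumption: a name consists of one or more dot-separated parts (for example
# a base name plus an extension), each made only of ASCII letters, digits,
# underscores and hyphens.
SAFE_NAME = re.compile(r"[A-Za-z0-9_-]+(\.[A-Za-z0-9_-]+)*")


def is_safe_name(name: str) -> bool:
    """Return True if `name` contains only the allowed characters."""
    return SAFE_NAME.fullmatch(name) is not None


def suggest_name(name: str) -> str:
    """Propose a cleaned-up name; the result should still be reviewed by hand."""
    # Decompose accented characters and drop the combining marks (ä -> a).
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace every run of disallowed characters with a single underscore.
    return re.sub(r"[^A-Za-z0-9._-]+", "_", ascii_only).strip("_")


if __name__ == "__main__":
    for candidate in ["site_plan-01.tif", "Grabungsfoto März 2017.tif"]:
        print(candidate, is_safe_name(candidate), suggest_name(candidate))
```

Dropping diacritics changes the spelling of a name, so a suggestion of this kind is best treated as a starting point rather than a final answer.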
Formats
You are strongly encouraged to provide the resources in standard formats acknowledged by the respective research communities. We will support you in converting the data if this is necessary and feasible.
Suitable formats should be widely used and, if possible, comply with open, non-proprietary standards. Files should not be password-protected, encrypted or compressed in a lossy way. If files depend on other files, fonts or other external data, these objects should be deposited as well, or at least described, e.g. in a plain text README file. Whenever you can choose the encoding, choose UTF-8 without the byte order mark (BOM) (see FAQ).
If file conversions become necessary, potential loss of information should be minimised. If lossless conversion into an open or recommended format cannot be achieved, the original files will be kept together with the converted versions.
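As a rough illustration of the encoding advice, the Python sketch below detects a UTF-8 byte order mark and re-encodes text as UTF-8 without one. The function names are hypothetical and not part of any ARCHE tooling, and the source encoding has to be known in advance, because it cannot be guessed reliably from the bytes alone.

```python
import codecs


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)


def to_utf8_without_bom(data: bytes, source_encoding: str = "utf-8-sig") -> bytes:
    """Decode `data` from `source_encoding` and re-encode it as plain UTF-8.

    The default codec "utf-8-sig" reads UTF-8 and silently drops a leading
    BOM if one is present.
    """
    text = data.decode(source_encoding)
    return text.encode("utf-8")


if __name__ == "__main__":
    with_bom = codecs.BOM_UTF8 + "Märchen".encode("utf-8")
    print(has_utf8_bom(with_bom))
    print(has_utf8_bom(to_utf8_without_bom(with_bom)))
```

A legacy file would be converted by passing its actual encoding, for example `to_utf8_without_bom(raw_bytes, "latin-1")`.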
The preferred format for annotated textual data in our repository is TEI/XML (Text Encoding Initiative) with metadata in the teiHeader. Additionally, all language resources have to be described in CMDI (Component Metadata Infrastructure). We will gladly support you in creating this metadata. For an overview of recommended standard formats, see the CLARIN standards recommendations.
For formats not covered by the CLARIN standards, and for general text and media formats, refer to our table of preferred and accepted formats. The table is based on the formats listed at IANUS and at the Archaeology Data Service.
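To make the idea of metadata in a TEI header concrete, the following Python sketch reads the title from the header of a small TEI document using only the standard library. The sample document and the helper name `tei_title` are made up for illustration; real deposits will carry far richer headers.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical, deliberately tiny TEI document used only for this example.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Unpublished sample</p></publicationStmt>
      <sourceDesc><p>Born digital</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text><body><p>Hello.</p></body></text>
</TEI>"""


def tei_title(xml_text: str):
    """Return the first title in the TEI header, or None if there is none."""
    root = ET.fromstring(xml_text)
    title = root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title", TEI_NS)
    return title.text if title is not None else None


if __name__ == "__main__":
    print(tei_title(SAMPLE))
```

A check of this kind only confirms that a header with a title is present; it does not validate the document against a TEI schema.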
Preferred and accepted formats in ARCHE (August 2017). Preferred formats are suitable for long-term preservation. Accepted formats require conversion.
EXTENSION | FORMAT NAME & VERSION | PREFERENCE | |
---|---|---|---|
PDF DOCUMENTS | |||
pdf | PDF/A-1 | preferred | |
pdf | PDF/A-2 | preferred | |
pdf | PDF/A-3 | accepted | |
pdf | other PDF variants | accepted | |
TEXT DOCUMENTS | |||
odt | Open Document Format | preferred | |
docx | Office Open XML Document (Microsoft) | preferred | |
doc | Microsoft Word | accepted | |
rtf | Rich Text Format | accepted | |
sxw | OpenOffice XML | accepted | |
txt | plain text | preferred | |
xml | eXtensible Markup Language | preferred | |
sgml | Markup text | preferred | |
html, htm | HyperText Markup Language | preferred | |
dtd | document type definition | preferred | |
xsd | xml schema definition | preferred | |
IMAGES (RASTER) | |||
tiff, tif | Baseline TIFF v. 6, uncompressed | preferred | |
dng | Adobe Digital Negative | preferred | |
png | Portable Network Graphics | accepted | |
jpeg, jpg | Joint Photographic Experts Group | accepted | |
gif | Graphics Interchange Format | accepted | |
bmp | Bit-Mapped Graphics Format (Microsoft) | accepted | |
psd | Photoshop (Adobe) | accepted | |
cpt | Corel Photo-Paint | accepted | |
jp2, jpx | JPEG2000 | accepted | |
IMAGES (VECTOR) | |||
svg | Scalable Vector Graphics 1.1, uncompressed | preferred | |
cgm | Computer Graphics Metafile, WebCGM | accepted | |
dxf | Drawing Interchange Format (Autodesk) | accepted | |
dwg | Drawing (Autodesk) | accepted | |
ps, eps | PostScript, Encapsulated PostScript | accepted | |
ai, indd | Adobe Illustrator, Adobe InDesign | accepted | |
dwf | Design Web Format (Autodesk) | accepted | |
TABLES & SPREADSHEETS | |||
csv | Comma Separated Values | preferred | |
tsv | Tab Separated Values | preferred | |
ods | Open Document Format | preferred | |
xlsx | Office Open XML Workbook (Microsoft) | preferred | |
sxc | OpenOffice XML | accepted | |
xls | Microsoft Excel | accepted | |
DATABASES | |||
siard | Software Independent Archiving of Relational Databases | preferred | |
sql | Structured Query Language | preferred | |
json | JavaScript Object Notation | accepted | |
mdb, accdb | Microsoft Access Databases | accepted | |
fp5, fp7, fmp12 | FileMaker Databases | accepted | |
dbf | dBase Databases | accepted | |
bak, db, dmp | binary export formats for databases | accepted | |
odb | Open Document Databases | accepted | |
VIDEO | |||
mkv | Matroska | preferred | |
mj2 | Motion JPEG 2000 | accepted | |
mp4 | MPEG-4 | accepted | |
mxf | Material eXchange Format | accepted | |
mpeg | MPEG-2 | accepted | |
avi | Audio Video Interleave | accepted | |
mov | QuickTime File Format | accepted | |
asf, wmv | Advanced Systems Format (ASF/WMV) | accepted | |
ogg, ogv, ogx, ogm, spx | Ogg | accepted | |
flv, f4v | Flash | accepted | |
AUDIO | |||
flac | Free Lossless Audio Codec | preferred | |
wav | Waveform Audio File Format | preferred | |
bwf | Broadcast Wave Format | preferred | |
wav | RF64/MBWF | accepted | |
aac, mp4 | Advanced Audio Coding/MP4 | accepted | |
mp3 | MP3 | accepted | |
aiff | Audio Interchange File Format | accepted | |
wma | Windows Media Audio | accepted | |
3D & VIRTUAL REALITY | |||
x3d | eXtensible 3D Graphics | preferred | |
dae | COLLADA | preferred | |
obj | Wavefront .obj file | preferred | |
ply | Polygon File Format | preferred | |
vrml | Virtual Reality Modeling Language | accepted | |
u3d | Universal 3D Format | accepted | |
stl | Standard Tessellation Language | accepted | |
WEBSITES | |||
xhtml, xht | Extensible HyperText Markup Language | preferred | |
mht, mhtml | MIME Encapsulation of Aggregate HTML Documents | preferred | |
warc | Web ARChive | preferred | |
maff | Mozilla Archive Format | accepted | |
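A table like this lends itself to a simple lookup when preparing a deposit. The Python sketch below covers only a small subset of the rows and is an illustration, not an ARCHE tool; the names `FORMAT_PREFERENCE` and `preference_for` are hypothetical. Extensions whose status depends on the format version or on the content, such as pdf or wav, are left out on purpose, because the extension alone does not decide them.

```python
from pathlib import Path

# Subset of the table above: file extension -> preference.
FORMAT_PREFERENCE = {
    "odt": "preferred", "docx": "preferred", "doc": "accepted",
    "txt": "preferred", "xml": "preferred",
    "tif": "preferred", "tiff": "preferred",
    "png": "accepted", "jpg": "accepted", "jpeg": "accepted",
    "csv": "preferred", "xls": "accepted",
    "mkv": "preferred", "flac": "preferred", "mp3": "accepted",
}


def preference_for(filename: str) -> str:
    """Return 'preferred', 'accepted' or 'not listed' for a file name."""
    extension = Path(filename).suffix.lstrip(".").lower()
    return FORMAT_PREFERENCE.get(extension, "not listed")


if __name__ == "__main__":
    for name in ["report.odt", "photo.JPG", "README"]:
        print(name, preference_for(name))
```

Files reported as "accepted" are the ones that would need conversion, and files reported as "not listed" need a manual look at the full table.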
Metadata
Metadata should answer basic questions about your data, allowing others to understand, discover and share it. Good metadata provides information about how the data was produced, who was involved in producing it and what the data is about. Using metadata is an essential part of complying with the FAIR Data Principles, which aim to make data Findable, Accessible, Interoperable, and Reusable (see FAQ).
Metadata can cover different levels, such as collection level, file level and even data-unit level. Ideally, metadata is recorded accurately and as completely as possible, using a standard format. The Archaeology Data Service and IANUS provide format-agnostic collection-level metadata, which can be applied to all kinds of domains. In addition, the respective sections of IANUS’ IT-Empfehlungen present file-level metadata, which is generally more technical and depends heavily on the data type and the methods used.
The metadata required when depositing in ARCHE is detailed in the table of metadata requirements. In addition to collection-level and file-level metadata, ARCHE uses project-level metadata. Mandatory fields are marked as such, but filling in the recommended fields is essential for making data easier to find, understand and cite. The ARCHE metadata schema is also available in OWL format, annotated and extensively documented; a tabular representation of it exists as well.
Properties are listed for projects, collections and resources.
m = mandatory, r = recommended, o = optional, and * = property can be used multiple times.
PROPERTY | MACHINE NAME | PROJECT | COLLECTION | RESOURCE |
---|---|---|---|---|
Title | hasTitle | m | m | m |
Alternative Title | hasAlternativeTitle | o* | o* | o* |
Identifier(s) | hasIdentifier | m* | m* | m* |
Identifier(s) without Link | hasNonLinkedIdentifier | r* | r* | o* |
URL | hasUrl | r* | o* | o* |
Description | hasDescription | m* | r* | r* |
Language | hasLanguage | r* | r* | r* |
Life Cycle Status | hasLifeCycleStatus | r | r | - |
Completeness statement | hasCompleteness | - | r | - |
Version | hasVersion | - | o | o |
Table of Contents | hasTableOfContents | - | o | o |
Extent | hasExtent | - | r | r |
Category | hasCategory | - | - | m |
Applied Method | hasAppliedMethod | o* | o* | o* |
Description of Applied Method | hasAppliedMethodDescription | o* | o* | o* |
Technical Information | hasTechnicalInfo | o* | o* | o* |
Hardware | hasUsedHardware | - | o* | o* |
Software | hasUsedSoftware | - | o* | o* |
Character Encoding | hasCharacterEncoding | - | - | o |
Schema | hasSchema | - | - | o* |
Description of Arrangement | hasArrangement | - | o | - |
Description of naming scheme used | hasNamingScheme | - | o | - |
Editorial practice used | hasEditorialPractice | o | o | o |
Note | hasNote | o | o | o |
Custom citation | hasCustomCitation | o | o | o |
OAI-PMH set | hasOaiSet | o* | o* | o* |
Handle for CMDI record | hasMetadataPid | o | o | o |
Last Update Date | hasUpdatedDate | r | r | r |
ACTORS INVOLVED | ||||
Principal Investigator(s) | hasPrincipalInvestigator | r* | r* | r* |
Contact(s) | hasContact | m* | o* | o* |
Creator(s) | hasCreator | - | o* | o* |
Author(s) | hasAuthor | - | o* | o* |
Editor(s) | hasEditor | - | o* | o* |
Contributor(s) | hasContributor | o* | o* | o* |
Digitising Agent(s) | hasDigitisingAgent | - | o* | o* |
Funder(s) | hasFunder | o* | o* | o* |
Metadata Creator(s) | hasMetadataCreator | m* | m* | m* |
Licensor(s) | hasLicensor | - | m* | m* |
COVERAGE | ||||
Research Discipline | hasRelatedDiscipline | m* | r* | o* |
Coverage | hasCoverage | o* | o* | o* |
Has Actor | hasActor | r* | r* | o* |
Spatial Coverage | hasSpatialCoverage | r* | r* | o* |
Subject | hasSubject | m* | r* | o* |
Temporal Coverage | hasTemporalCoverage | r* | r* | o* |
Temporal ID | hasTemporalCoverageIdentifier | r* | r* | o* |
Coverage start date | hasCoverageStartDate | r* | r* | o* |
Coverage end date | hasCoverageEndDate | r* | r* | o* |
RIGHTS & ACCESS | ||||
Owner(s) | hasOwner | - | m* | m* |
Rights Holder(s) | hasRightsHolder | - | m* | m* |
License | hasLicense | - | o | m |
License Summary | hasLicenseSummary | - | m | - |
Access Restriction | hasAccessRestriction | - | - | o |
Access Restriction Summary | hasAccessRestrictionSummary | - | o* | - |
Access rights for | hasRestrictionRole | - | o* | o* |
DATES | ||||
Start Date | hasStartDate | m | - | - |
End Date | hasEndDate | r | - | - |
Creation Date Start | hasCreatedStartDate | - | o* | o* |
Creation Date End | hasCreatedEndDate | - | o* | o* |
Creation Date Original Start | hasCreatedStartDateOriginal | - | - | o* |
Creation Date Original End | hasCreatedEndDateOriginal | - | - | o* |
Start of data collection | hasCollectedStartDate | o* | o* | - |
End of data collection | hasCollectedEndDate | o* | o* | - |
Date | hasDate | o* | o* | o* |
RELATIONS TO OTHER PROJECTS, COLLECTIONS OR RESOURCES | ||||
See also | relation | o* | o* | o* |
Related Project(s) | hasRelatedProject | - | o* | - |
Related Collection(s) | hasRelatedCollection | m* | - | - |
Is title image of | isTitleImageOf | - | - | o |
Continues | continues | o* | o* | o* |
Documents | documents | - | o* | o* |
Is source of | isSourceOf | - | o* | o* |
New Version of | isNewVersionOf | - | o* | o* |
Description of version changes | hasVersionInfo | - | o* | o* |
Part of | isPartOf | - | o* | m* |
CURATION, AUTOMATIC | ||||
Depositor(s) | hasDepositor | - | m* | m* |
Available Since | hasAvailableDate | m | m | m |
Handle | hasPid | o* | o* | o* |
Number of Items | hasNumberOfItems | - | o | - |
Binary Size | hasBinarySize | - | o | o |
Format | hasFormat | - | - | r |
File Path | hasLocationPath | - | o | o |
Data Curator(s) | hasCurator | - | r* | r* |
Hosted by | hasHosting | - | m* | m* |
Date of Submission | hasSubmissionDate | - | r | r |
Date of Acceptance | hasAcceptedDate | - | r | r |
Date of Transfer | hasTransferDate | - | r* | r |
Transfer Method | hasTransferMethod | - | o* | o |
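The codes in the table can be read mechanically. The Python sketch below encodes a handful of rows and checks a metadata record against them. It is a simplified illustration under stated assumptions: the record is a plain dictionary of property names to lists of values, the function name `check` is hypothetical, and "-" is interpreted as "property not used for this class", which the legend does not define explicitly.

```python
# Subset of the table above: machine name -> (project, collection, resource).
REQUIREMENTS = {
    "hasTitle": ("m", "m", "m"),
    "hasIdentifier": ("m*", "m*", "m*"),
    "hasDescription": ("m*", "r*", "r*"),
    "hasCategory": ("-", "-", "m"),
    "hasLicense": ("-", "o", "m"),
    "hasStartDate": ("m", "-", "-"),
    "isPartOf": ("-", "o*", "m*"),
}
COLUMN = {"project": 0, "collection": 1, "resource": 2}


def check(record: dict, kind: str) -> list:
    """Return a list of problems found in `record` for the given class."""
    problems = []
    column = COLUMN[kind]
    for prop, codes in REQUIREMENTS.items():
        code = codes[column]
        values = record.get(prop, [])
        if code == "-":
            # Assumption: "-" means the property does not apply to this class.
            if values:
                problems.append(f"{prop}: not used for {kind}")
            continue
        if code.startswith("m") and not values:
            problems.append(f"{prop}: mandatory but missing")
        if not code.endswith("*") and len(values) > 1:
            problems.append(f"{prop}: only one value allowed")
    return problems


if __name__ == "__main__":
    # Hypothetical resource record with made-up values.
    resource = {
        "hasTitle": ["Excavation photo 12"],
        "hasIdentifier": ["https://example.org/id/12"],
        "hasCategory": ["image"],
        "hasStartDate": ["2017-08-01"],
    }
    for problem in check(resource, "resource"):
        print(problem)
```

Recommended and optional properties are not reported here; a fuller check could list missing recommended fields as warnings rather than errors.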