Property | Value(s) |
acdh:hasUpdatedDate |
acdh:hasAccessRestrictionSummary |
public: 5781
acdh:hasCollectedEndDate |
acdh:aclWrite |
acdh:hasCustomCitation |
author = {Resch, Claudia and Kampkaspar, Dario}
acdh:hasRightsHolder | |
acdh:hasDigitisingAgent | |
acdh:hasUpdatedRole |
acdh:hasAppliedMethodDescription |
The methodical approach used for this collection follows the same sequence of steps for every issue. Once the digital pages provided by ANNO - Austrian Newspaper Online had been obtained, they were subjected to preprocessing and automatic deskewing. The recognition phase, comprising layout analysis and text recognition, relied on the Transkribus software. Because recognition results for broken scripts such as German blackletter are usually far from satisfactory, the project started out with a few issues that were transcribed completely by hand and used to train an initial model. Subsequent issues were first recognized by the software and then corrected to an accuracy of around 99.7%. Each corrected set then served as a training and test data set for a new model that was applied to the next batch of issues. The current model, trained specifically for the Diarium, generates text with an error rate of less than 1 per 100 characters within a standard paragraph from a good-quality image.
In the next step, the transcribed full text was exported as a single TEI XML file per issue. Some post-processing was then required to prepare the text for publishing, e.g. the application of a basic whitespace tokenization in order to be able to address every “word” in the text with a unique identifier. Additionally, the pixel coordinates of the text regions were used to find their relative position within a page. This was done by applying a series of XSLT transformations to the files exported from Transkribus. As soon as all automated processes were completed, the results were checked visually and uploaded to the project’s web application DIGITARIUM. |
acdh:aclRead |
acdh:hasAvailableDate |
acdh:hasIdentifier |,,
acdh:hasLicensor | |
acdh:hasOwner | |
acdh:hasContributor | |
acdh:hasLanguage | |
acdh:hasCoverageStartDate |
acdh:hasDepositor | |
acdh:hasRelatedDiscipline | |
acdh:hasDescription |
Wienerisches DIGITARIUM is a digital collection of more than 300 transcribed full-text issues of the “Wien[n]erisches Diarium”, a historical newspaper that was founded in Vienna in 1703 and is still published under the title “Wiener Zeitung”. The issues, provided as facsimiles and in XML/TEI P5 with different layers of annotation, are evenly distributed over the 18th century and offer a reliable basis for a wide range of research interests.
The collection was created within the project “Das Wien[n]erische Diarium: Digitaler Datenschatz für die geisteswissenschaftlichen Disziplinen” (PI: Claudia Resch), which was funded by the “go!digital 2.0” program of the Austrian Academy of Sciences and carried out at the Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) from 1 March 2017 to 29 February 2020 (Project-Nr. GD 2016/16, ÖAW 0704). |
acdh:hasLicenseSummary |
CC0 1.0: 5447 / CC BY 4.0: 336
acdh:hasCollectedStartDate |
acdh:hasCurator | |
acdh:hasCreator | |
rdf:type | |
acdh:hasEditorialPractice |
In order to create a scientifically sound basis for philological interpretation and to serve as many research interests as possible, only complete issues have been included. Normalising interventions were avoided wherever possible, and the historical language was reproduced as closely to the printed original as possible. Hence, the typography of the original was largely retained: the original use of "u" and "v" or "i" and "j" has been preserved, as have ligatures, small caps and the change between Fraktur and Antiqua printing. Only the differentiation of the two variants of "s" and "r" (so-called "long s" and "round r") was omitted. Consonantal ligatures, such as those found in "tz", "ct", "st" or "ff", are resolved in the transcribed text. Double hyphens (in compounds such as "Reichs=Raht") are represented by equal signs, since these come closest to the print image of the time. Unreadable passages as well as uncertain passages added by the editors have been marked in the transcription with angle brackets.
acdh:hasOaiSet | |
acdh:hasCreatedEndDate |
acdh:hasSubject |
historical newspapers
acdh:hasTitle |
acdh:hasCompleteness |
Project completed, no further changes.
acdh:hasContact | |
acdh:hasPid | |
acdh:hasHosting | |
acdh:hasCoverageEndDate |
acdh:hasBinarySize |
13.43 GB
acdh:hasNumberOfItems |
acdh:hasCreatedStartDate |
acdh:hasLifeCycleStatus | |
acdh:hasNonLinkedIdentifier |
Austrian Academy of Sciences programme "go!digital 2.0": GD 2016/16
acdh:hasMetadataCreator | |
acdh:hasAlternativeTitle |
Wienerisches DIGITARIUM
acdh:createdBy |
Available since 23 December 2022
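
The post-processing described under acdh:hasAppliedMethodDescription (whitespace tokenization with a unique identifier per “word”, and converting pixel coordinates of text regions into a relative position on the page) can be illustrated with a minimal sketch. The project itself used XSLT transformations on the Transkribus exports; the Python below is not that implementation. The identifier scheme (`issue1_p1_w1`), the function names, the sample text and the page dimensions are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    token_id: str  # hypothetical identifier scheme, not the project's own
    text: str


def tokenize(text: str, prefix: str) -> list[Token]:
    """Split text on whitespace and give every token a unique identifier."""
    return [
        Token(f"{prefix}_w{index}", word)
        for index, word in enumerate(text.split(), start=1)
    ]


def relative_position(
    x: float, y: float, page_width: float, page_height: float
) -> tuple[float, float]:
    """Express a pixel coordinate as a fraction of the page width and height."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page dimensions must be positive")
    return x / page_width, y / page_height


# Invented sample line; "Reichs=Raht" follows the equal-sign convention
# for double hyphens described in the editorial practice.
tokens = tokenize("Der Reichs=Raht zu Wien", "issue1_p1")
for token in tokens:
    print(token.token_id, token.text)

# Invented region origin (250, 500) on an assumed 1000 x 2000 pixel page.
print(relative_position(250, 500, 1000, 2000))
```

Whitespace splitting keeps punctuation attached to the neighbouring word, which matches the description of a "basic" tokenization; anything finer-grained would be a further assumption.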