Overview
Vienna-Oxford International Corpus of English (VOICE) XML
Type(s):
acdh:TopCollection
PID:
https://hdl.handle.net/21.11115/0000-000E-BCE0-1
Principal Investigator(s):
Marie-Luise Pitzl-Hagin, Barbara Seidlhofer
Contact(s):
Marie-Luise Pitzl-Hagin
Creator(s):
Angelika Breiteneder, Hans Christian Breuer, Nora Dorn, Theresa Klimpfinger, Stefan Majewski, Ruth Osimk-Teasdale, Hannes Pirker, Marie-Luise Pitzl-Hagin, Michael Radeka, Stefanie Riegler, Daniel Schopper, Barbara Seidlhofer, Omar Siam, Daniel Stoxreiter
Created Start Date:
1 Jun 2005, 1 Apr 2020
Created End Date:
31 Jan 2013, 30 Sep 2021
Available Date:
6 Apr 2022
Number of Items:
806
Binary Size:
0.3 GB
Licensor:
Austrian Centre for Digital Humanities and Cultural Heritage
Owner:
Austrian Centre for Digital Humanities and Cultural Heritage, Department of English and American Studies | University of Vienna
Url:
https://voice.acdh.oeaw.ac.at
Vienna-Oxford International Corpus of English (VOICE) XML
Property | Value(s) |
---|---|
acdh:aclRead | dstoxreiter |
acdh:aclWrite | dstoxreiter |
acdh:createdBy | dstoxreiter |
acdh:hasAccessRestrictionSummary | public: 775 |
acdh:hasAppliedMethod | VOICE, the Vienna-Oxford International Corpus of English, is a one-million-word corpus of naturally-occurring, non-scripted, face-to-face interactions carried out using English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds. The interactions recorded and transcribed are complete speech events from different domains (educational, leisure, professional) and represent different speech event types (conversation, interview, meeting, panel, press conference, question-answer session, seminar discussion, service encounter, working group discussion, workshop discussion). |
acdh:hasAppliedMethodDescription | VOICE is based on audio-recordings carried out between July 2001 and November 2007, usually using portable mini-disc recorders with external microphones. These audio-recordings capture 151 naturally-occurring, non-scripted, face-to-face interactions involving 753 identified individuals from 49 different first language backgrounds using English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds. Most of the audio-recordings are supplemented by detailed field notes including information about the nature of the speech event and the interaction taking place as well as about the participants engaging in these ELF interactions. The audio-recordings were transcribed, checked and proof-read by trained transcribers and researchers in accordance with the VOICE mark-up and spelling conventions. See sub-collection Documentation for more information on mark-up and spelling conventions. Details for each electronic text are given in the individual text headers. The principles and practices underlying the selection and design of the corpus are documented in the project and sampling description of the Corpus Header. |
acdh:hasArrangement | The Vienna-Oxford International Corpus of English (VOICE) is stored in a TEI-based XML format. Each sub-collection in ARCHE holds one version of VOICE (1.0, 1.1, 2.0, 2.0 POS and 3.0). Each version collection is divided into further sub-collections, one for each of the five domains represented in VOICE (ED: educational, LE: leisure, PB: professional business, PO: professional organizational, PR: professional research and science). Domains in VOICE denote socially defined situations or areas of activity. The domain collections contain resources (i.e. the individual corpus texts, which are transcripts of the speech events) in a TEI-based XML format. In addition to the versions of VOICE, the Top Collection VOICE in ARCHE also contains the sub-collection Documentation. |
acdh:hasAvailableDate | 2022-04-06 |
acdh:hasBinarySize | 0.3 GB |
acdh:hasContact | |
acdh:hasCreatedEndDate | 2013-01-31, 2021-09-30 |
acdh:hasCreatedStartDate | 2005-06-01, 2020-04-01 |
acdh:hasCreator | |
acdh:hasCurator | |
acdh:hasCustomCitation | year = {2021}, date = {2021-09-30T00:00:00.000000}, author = {Seidlhofer, Barbara and Pitzl, Marie-Luise and Schopper, Daniel and Breiteneder, Angelika and Breuer, Hans-Christian and Dorn, Nora and Klimpfinger, Theresa and Majewski, Stefan and Osimk-Teasdale, Ruth and Pirker, Hannes and Radeka, Michael and Riegler, Stefanie and Siam, Omar and Stoxreiter, Daniel}, |
acdh:hasDepositor | |
acdh:hasDescription | The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML in 2011. VOICE POS XML 2.0 was the first part-of-speech tagged version of VOICE and was based on the same data as VOICE 2.0 XML. Both VOICE 2.0 XML and VOICE 2.0 POS XML were released in 2013. Additional researchers centrally involved in the creation of VOICE 2.0 POS XML were Ruth Osimk-Teasdale, Michael Radeka and Nora Dorn. VOICE 2.0 XML and VOICE POS XML 2.0 included minor revisions with regard to previous versions. VOICE 3.0 XML and VOICE 3.0 Online are based on the same data as VOICE 1.0/2.0 and were created from spring 2020 to autumn 2021 in the VOICE CLARIAH project. VOICE 3.0 XML is a new, merged TEI-conform XML version of VOICE 2.0 XML and VOICE POS XML 2.0, which contains spoken mark-up as well as part-of-speech and lemma information in TEI-XML format. The members of the VOICE CLARIAH team who created VOICE 3.0 were: Marie-Luise Pitzl (PI), Daniel Schopper, Barbara Seidlhofer, Hans Christian Breuer, Ruth Osimk-Teasdale, Hannes Pirker, Stefanie Riegler, Omar Siam. |
acdh:hasHosting | |
acdh:hasLanguage | |
acdh:hasLicenseSummary | CC BY-NC-SA 3.0 AT: 638 / CC BY 4.0: 167 |
acdh:hasLicensor | |
acdh:hasMetadataCreator | |
acdh:hasNumberOfItems | 806 |
acdh:hasOaiSet | |
acdh:hasOwner | |
acdh:hasPid | |
acdh:hasPrincipalInvestigator | |
acdh:hasRelatedDiscipline | |
acdh:hasRightsHolder | |
acdh:hasSubject | conversation, educational, English as a lingua franca, interaction, interculturality, interview, leisure, meeting, multilingualism, panel, press conference, professional business, professional organizational, professional research and science, question-answer session, seminar discussion, service encounter, working group discussion, workshop discussion |
acdh:hasTitle | Vienna-Oxford International Corpus of English (VOICE) XML |
acdh:hasUpdatedDate | 2022-04-06T11:01:54.566152 |
acdh:hasUpdatedRole | dstoxreiter |
acdh:hasUrl | |
rdf:type | |
acdh:hasIdentifier | |
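
The arrangement given under acdh:hasArrangement (one sub-collection per domain, identified by a two-letter code) can be illustrated with a minimal Python sketch. Only the five domain codes come from the record; the helper names `domain_of` and `group_by_domain`, the assumption that a resource name begins with its domain code, and the sample file names are hypothetical and not taken from the corpus.

```python
# Domain codes as listed in the acdh:hasArrangement description.
DOMAINS = {
    "ED": "educational",
    "LE": "leisure",
    "PB": "professional business",
    "PO": "professional organizational",
    "PR": "professional research and science",
}


def domain_of(name: str) -> str:
    """Return the domain label for a name that starts with a two-letter domain code.

    Assumption: the code is the first two characters of the name.
    """
    code = name[:2].upper()
    if code not in DOMAINS:
        raise ValueError(f"unknown domain code {code!r} in {name!r}")
    return DOMAINS[code]


def group_by_domain(names):
    """Group names by domain label, keeping input order within each group."""
    groups = {}
    for name in names:
        groups.setdefault(domain_of(name), []).append(name)
    return groups


if __name__ == "__main__":
    # Hypothetical file names for illustration only.
    sample = ["ED_example_1.xml", "PB_example_1.xml", "ED_example_2.xml"]
    for domain, files in group_by_domain(sample).items():
        print(domain, files)
```

A lookup table keeps the mapping in one place, so a name with an unexpected prefix fails loudly instead of being filed under the wrong domain.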
Summary
Related Discipline(s):
Applied linguistics, Corpus linguistics, Digital humanities, English studies, Linguistics and Literature, Sociolinguistics
Subject(s):
conversation, educational, English as a lingua franca, interaction, interculturality, interview, leisure, meeting, multilingualism, panel, press conference, professional business, professional organizational, professional research and science, question-answer session, seminar discussion, service encounter, working group discussion, workshop discussion
Description:
The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML in 2011. VOICE POS XML 2.0 was the first part-of-speech tagged version of VOICE and was based on the same data as VOICE 2.0 XML. Both VOICE 2.0 XML and VOICE 2.0 POS XML were released in 2013. Additional researchers centrally involved in the creation of VOICE 2.0 POS XML were Ruth Osimk-Teasdale, Michael Radeka and Nora Dorn. VOICE 2.0 XML and VOICE POS XML 2.0 included minor revisions with regard to previous versions. VOICE 3.0 XML and VOICE 3.0 Online are based on the same data as VOICE 1.0/2.0 and were created from spring 2020 to autumn 2021 in the VOICE CLARIAH project. VOICE 3.0 XML is a new, merged TEI-conform XML version of VOICE 2.0 XML and VOICE POS XML 2.0, which contains spoken mark-up as well as part-of-speech and lemma information in TEI-XML format. 
The members of the VOICE CLARIAH team who created VOICE 3.0 were: Marie-Luise Pitzl (PI), Daniel Schopper, Barbara Seidlhofer, Hans Christian Breuer, Ruth Osimk-Teasdale, Hannes Pirker, Stefanie Riegler, Omar Siam.
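
Since the description says VOICE 3.0 XML combines spoken mark-up with part-of-speech and lemma information in TEI-XML, the following is a minimal sketch of how such a file might be read with Python's standard library. It rests on assumptions: the inline fragment is invented for illustration and is not copied from VOICE, and it supposes utterances are encoded as TEI `<u>` elements with tokens as `<w>` elements carrying `lemma` and `pos` attributes. The actual element and attribute choices should be checked against the schema in the Documentation sub-collection.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The TEI namespace is standard; everything in SAMPLE below is hypothetical.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment for illustration only; not a complete or valid TEI document.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body>
    <u who="#S1"><w lemma="this" pos="DT">this</w> <w lemma="be" pos="VBZ">is</w> <w lemma="fine" pos="JJ">fine</w></u>
    <u who="#S2"><w lemma="yes" pos="UH">yes</w></u>
  </body></text>
</TEI>"""


def token_rows(xml_text):
    """Return (speaker, form, lemma, pos) tuples for every <w> inside a <u>."""
    root = ET.fromstring(xml_text)
    rows = []
    for u in root.iterfind(".//tei:u", TEI_NS):
        speaker = u.get("who", "")
        for w in u.iterfind(".//tei:w", TEI_NS):
            rows.append((speaker, w.text or "", w.get("lemma", ""), w.get("pos", "")))
    return rows


def pos_counts(rows):
    """Count how often each part-of-speech tag occurs in the token rows."""
    return Counter(pos for _, _, _, pos in rows)


if __name__ == "__main__":
    rows = token_rows(SAMPLE)
    for row in rows:
        print(row)
    print(pos_counts(rows))
```

Flattening tokens into plain tuples keeps later steps such as frequency counts or export independent of the XML layer.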
Type:
acdh:Collection
Available Date:
6 Apr 2022
The collection 'Documentation' contains the TEI/XML schema for Voice 3.0 and previous versions of VOICE and manuals explaining the search functions of the new VOICE 3.0 Online interface, the part-of-speech (POS) tagging in VOICE and the VOICE Transcription Conventions (mark-up and spelling conventions).
Type:
acdh:Collection
Available Date:
6 Apr 2022
Version:
1.0
The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML in 2011.
Type:
acdh:Collection
Available Date:
6 Apr 2022
Version:
1.1
The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. Minor revisions and corrections for VOICE 1.1 were made by Ruth Osimk. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML and VOICE 1.1 XML in 2011.
Type:
acdh:Collection
Available Date:
6 Apr 2022
Version:
POS 2.0
The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML in 2011. VOICE POS XML 2.0 was the first part-of-speech tagged version of VOICE and was based on the same data as VOICE 2.0 XML. VOICE 2.0 XML and VOICE POS XML 2.0 included minor revisions with regard to previous versions and were released in 2013. Additional researchers centrally involved in the creation of VOICE 2.0 POS XML were Ruth Osimk-Teasdale, Michael Radeka and Nora Dorn.
Type:
acdh:Collection
Available Date:
6 Apr 2022
Version:
2.0
The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML in 2011. VOICE 2.0 XML includes minor revisions with regard to previous versions, but is otherwise based on the same data as VOICE 1.0 XML. These minor revisions were gathered by Ruth Osimk-Teasdale and Michael Radeka and corrections were made by Ruth Osimk-Teasdale. VOICE 2.0 XML was released in 2013.
Type:
acdh:Collection
Available Date:
6 Apr 2022
Version:
3.0
The most wide-spread contemporary use of English throughout the world is that of English as a lingua franca (ELF), i.e. English used as a common means of communication among speakers from different first-language backgrounds (Seidlhofer 2011). Nevertheless, linguistic descriptions before the mid-2000s focused almost entirely on English as spoken and written by its native speakers. Starting in 2005, the VOICE project sought to redress the balance by compiling the first general corpus capturing spoken ELF interactions as they happen naturally in various contexts. VOICE was designed and compiled to make possible linguistic descriptions of this most common contemporary use of English by providing a corpus of spoken ELF interactions which has been freely accessible to linguistic researchers all over the world since 2009. The Vienna-Oxford International Corpus of English (VOICE) was initially created by Barbara Seidlhofer (founding director) and Angelika Breiteneder, Theresa Klimpfinger, Stefan Majewski, Marie-Luise Pitzl (project researchers) from 2005 to 2011 at the English Department at the University of Vienna. VOICE 1.0 Online was released in 2009, VOICE 1.0 XML in 2011. VOICE POS XML 2.0 was the first part-of-speech tagged version of VOICE and was based on the same data as VOICE 2.0 XML. Both VOICE 2.0 XML and VOICE 2.0 POS XML were released in 2013. Additional researchers centrally involved in the creation of VOICE 2.0 POS XML were Ruth Osimk-Teasdale, Michael Radeka and Nora Dorn. VOICE 2.0 XML and VOICE POS XML 2.0 included minor revisions with regard to previous versions. VOICE 3.0 XML and VOICE 3.0 Online are based on the same data as VOICE 1.0/2.0 and were created from spring 2020 to autumn 2021 in the VOICE CLARIAH project. VOICE 3.0 XML is a new, merged TEI-conform XML version of VOICE 2.0 XML and VOICE POS XML 2.0, which contains spoken mark-up as well as part-of-speech and lemma information in TEI-XML format. 
The members of the VOICE CLARIAH team who created VOICE 3.0 were: Marie-Luise Pitzl (PI), Daniel Schopper, Barbara Seidlhofer, Hans Christian Breuer, Ruth Osimk-Teasdale, Hannes Pirker, Stefanie Riegler, Omar Siam.