A TEI Schema for Corpora of Parliamentary Proceedings
v0.3
2022-05-03

1. Introduction

Parliamentary proceedings corpora (PPCs) are a quintessential resource for a wide range of research questions from a number of SSH disciplines, such as history and sociolinguistics. Their most distinguishing characteristic is that they are (typically edited) transcriptions of spoken language produced in controlled and regulated circumstances. They are also rich in invaluable (sociodemographic) metadata as well as easily available under the Freedom of Information Acts set in place to enable informed participation by the public and to improve effective functioning of democratic systems, making the datasets even more valuable.

Given these reasons and the fact that parliamentary proceedings are often available on-line, many researchers have already compiled corpora of parliamentary proceedings. However, these corpora are encoded in a variety of different annotation schemes, limiting their interchange and re-use.

In order to overcome this problem, the presented recommendations, called Parla-CLARIN, propose a schema that can be used for encoding of parliamentary proceedings corpora, primarily for the purposes of scholarly investigations, and that could serve as a storage and interchange format for such corpora. These recommendations attempt to take into account the following aspects of PPC:

The recommendations are implemented as a parameterisation of the TEI Guidelines, which are XML-based recommendations for encoding texts for scholarly purposes. As opposed to most other such recommendations, the TEI Guidelines have the ambition to be applicable to texts in any natural language, of any date, in any literary genre or text type, without restriction on form or content. The TEI parameterisation proposed for Parla-CLARIN also allows a wide range of PPC to be encoded, while making explicit recommendations on the manner of encoding various phenomena. The recommendations are written as a TEI ODD document, on the basis of which it is possible to derive an XML schema expressed either as a RelaxNG schema, a DTD or a W3C schema.

When using these recommendations, the following points should be taken into consideration:

The rest of these recommendations are structured as follows:

1.1. Scope and purpose

These recommendations consist of guidelines, a formal TEI schema, and derived XML schemas in various schema languages. They are intended for the encoding of corpora of parliamentary proceedings, regardless of the language or country of origin, for the purposes of scholarly investigations, be they from the field of linguistics, political science, history or other humanities and social sciences disciplines. The recommendations are, in principle, not meant as the primary storage format of parliamentary proceedings, such as kept by governmental offices, for which Akoma Ntoso might be preferred.

In developing a schema for structuring data, two approaches can be adopted: a descriptive one, where as much as possible of the original data distinctions are kept in the target encoding; or a prescriptive one, where the target encoding is severely constrained, to enable seamless data interchange and esp. interoperability with software tools. The Parla-CLARIN recommendations adopt the descriptive approach, as the source data, time and effort devoted to converting it, the intended applications, as well as country-specific rules of parliamentary proceedings will differ considerably, and it is likely that any prescriptive schema would soon turn out to be too restrictive. Nevertheless, the recommendations do try to limit the plethora of encoding options otherwise available in TEI to those that could be sensibly applied to corpora of parliamentary proceedings, and where more than one option is available in TEI to encode a given phenomenon, the schema and especially the text guidelines attempt to recommend only one option.

Those preferring a prescriptive approach should consult the ParlaMint corpora and their schema; these corpora are interoperable and are accompanied by a number of scripts to validate and convert them. However, while the ParlaMint corpora do contain rich (meta)data, they support a more restricted view of encoding parliamentary proceedings than Parla-CLARIN does.

1.2. Background

1.2.1. CLARIN and parliamentary proceedings

The Parla-CLARIN proposal is developed as a project of the CLARIN European Research Infrastructure for Language Resources and Technology. Before the beginning of this development, CLARIN had already organised a number of initiatives and events that deal with parliamentary corpora:

At these events, it became clear that existing parliamentary corpora are encoded in many different ways, presenting a barrier to their interchange. Therefore the CLARIN Interoperability Committee organised a focused CLARIN ParlaFormat workshop (May 23-24, 2019, Amersfoort) with selected participants, at which the idea of the Parla-CLARIN recommendations was introduced, and the participants presented their own experiences with encoding parliamentary corpora and gave their comments on the draft proposal. The details are given in the slides of the workshop, with the introduction and response to comments being:

After the initial publication of Parla-CLARIN, the ParlaMint project developed corpora of 17 European parliaments, encoded to a specialisation of the Parla-CLARIN recommendations. As the ParlaMint corpora follow a much stricter encoding, they are also interoperable, and samples of the corpora, the schema used, and various scripts to process the corpora are available from the ParlaMint GitHub repository, while the complete corpora can be found on the CLARIN.SI repository. Certain decisions taken in the encoding of the ParlaMint corpora have also had an impact on the current Parla-CLARIN recommendations.

1.2.2. Akoma Ntoso

While there are a large number of recommendations and standards that could in principle be used to encode parliamentary proceedings, the Akoma Ntoso standard stands out in that it was explicitly developed as an XML format for encoding legislative and judiciary documents, including parliamentary proceedings. Akoma Ntoso is an OASIS standard and has already been used to encode various legal documents in a number of countries. It defines an XML schema for modelling legal documents (called AKN), uses FRBR concepts and has a built-in relation to ontologies.

It is thus a reasonable question why CLARIN did not simply adopt AKN, rather than developing a conceptually very different encoding schema. The main reasons for this are the following:

  • Unfamiliarity of corpus compilers and users with AKN, and relatively good familiarity with TEI: for example, of the 16 talks presenting PPCs at the CLARIN ParlaFormat workshop, 10 used a TEI encoding, or an encoding ‘inspired by TEI’. It should also be noted that, for most corpus compilers, PPCs will be only one of the types of corpora they compile, so it is somewhat unrealistic to expect them to learn AKN and develop conversion scripts from it for just one type of corpus; TEI, on the other hand, can be used for practically any type of corpus.
  • AKN makes no provisions for storing speaker meta-data, which is rather accessed from external data sources and using a specialised referencing system; on the other hand, TEI has a number of elements for recording details about persons. For reasons of completeness, uniform and easier processing, and experimental replicability it is better to include these data directly in the corpus.
  • AKN has no built-in support for linguistic annotation (apart from named entities). And while it would be possible to add elements for such annotation via a different namespace to AKN, AKN has no provisions for extending its schema, while TEI already has such elements available. As the main purpose of this proposal is to cater for linguistic analyses of parliamentary proceedings, TEI seems a better choice.

Nevertheless, AKN is an important schema for modelling parliamentary proceedings, and some solutions of AKN were used in developing the Parla-CLARIN proposal, in particular the typology of divisions of a document. A (partial) conversion from AKN to Parla-CLARIN was also developed; it covers some example documents and is further discussed in the Section on the Conversion from Akoma Ntoso.

1.2.3. RDF

The Resource Description Framework (RDF) is a W3C specification and a standard model for computer-processable data interchange on the Web. It is also the base format for modelling information in the context of Linked Open Data (LOD), an influential model for linking data on the Web. And while LOD data is typically not concerned with language data, there is also the ‘Linguistic Linked Open Data Cloud’ (LLOD) initiative which is explicitly targeted towards language resources. RDF/LOD has also been quite popular for modelling parliamentary debates (e.g. the already mentioned Talk of Europe LOD dataset), so, again, the question arises why not use RDF to encode such data, rather than developing a TEI-based solution.

In this case, the answer has, at least partially, to do with the community addressed by the Parla-CLARIN recommendations and the preferences of this community: while LOD is targeted at computer scientists and the concept of a machine-processable and linked World-Wide Web, TEI much more addresses researchers from the (digital) humanities and the concept of internally complete resources. In practical terms, this is seen in the difference between RDF-encoded data, which is machine processable but hardly human readable or editable and is highly connected to external data sources, and TEI documents, which are relatively self-explanatory, esp. after some exposure to the TEI Guidelines, can be edited in any XML editor, and are mostly self-contained. And while there exists the LLOD initiative that addresses linguistic annotation, the initiative seems to have lost momentum, and also has concerns other than the detailed encoding of a particular type of language resource; in fact, the same can be said for most RDF attempts at encoding parliamentary data, which encode only rather shallow aspects of such data.

But while this specification does not see RDF data as the ideal framework to encode parliamentary proceedings corpora for storage and interchange, it does take RDF as a useful down-stream model for exploiting such corpora, i.e. developing a TEI to RDF conversion is a concern, which is taken further in the Section on the Conversion to RDF.

2. General requirements

A Parla-CLARIN corpus should, in general, capture as much of the text and markup from the source as possible, while the presence of graphical items or other elements that could not be, or were not, transcribed should be indicated by markup, in particular with the use of <gap>, as further explained in the Section on Gaps.

2.1. Characters

The corpus should be encoded in Unicode, using the UTF-8 character encoding, at least for European languages. In cases where the original contains characters from the Unicode Private Use Area, these should, if possible, be given their closest Unicode equivalents, or substituted by the Unicode replacement character U+FFFD.1

End-of-line hyphens can be removed, and the split words joined in order to enhance searching the corpus and to simplify linguistic processing. It is recommended that this practice is documented in the TEI header of the corpus, in the <hyphenation> element as explained in the Section on Documenting the encoding process.

The following characters, esp. prevalent when the source documents were in Word or HTML, deserve special mention:

  • TAB (U+0009) character helps the alignment of strings on successive lines. As Parla-CLARIN is not interested in preserving the details of the layout, it is recommended that tab characters are substituted by the space character (U+0020).
  • NO-BREAK SPACE (U+00A0) prevents, with some applications, an automatic line break at its position, as well as the collapsing of consecutive such space characters into a single space. As the use of this character complicates (or breaks) further processing, esp. linguistic annotation, it is recommended that these characters are substituted by the normal space character (U+0020). The same holds for other variants of spaces (U+2000 - U+200A), which are, however, used much less frequently.
  • NON-BREAKING HYPHEN (U+2011), similarly to NO-BREAK SPACE, prevents a line break, in this case following its position. By the same reasoning as above, and because it additionally complicates searching, it is recommended that this character is substituted by the normal hyphen character ('-', U+002D).
  • SOFT HYPHEN (U+00AD) indicates that a word can be hyphenated at that point. Occurrences of this character should be removed from the corpus.

While not required, it is sensible to also normalise sequences of whitespace characters into a single space or end-of-line character. Again, this simplifies further (esp. linguistic) processing.
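
The substitutions recommended in this section can be sketched as a small Python function. This is only an illustration under stated assumptions: the function name `normalise_characters` is hypothetical and not part of Parla-CLARIN, and the final whitespace-collapsing step is the optional one mentioned above (a run of whitespace containing a line break becomes a single end-of-line character, any other run a single space).

```python
import re

# TAB, NO-BREAK SPACE and the space variants U+2000 - U+200A become a plain space.
_SPACE_LIKE = "\u0009\u00A0" + "".join(chr(c) for c in range(0x2000, 0x200B))
_TABLE = {ord(ch): " " for ch in _SPACE_LIKE}
_TABLE[0x2011] = "-"   # NON-BREAKING HYPHEN -> HYPHEN-MINUS (U+002D)
_TABLE[0x00AD] = None  # SOFT HYPHEN -> removed


def _collapse(match):
    # Keep one end-of-line character if the run contained a line break.
    return "\n" if "\n" in match.group(0) else " "


def normalise_characters(text):
    """Apply the character substitutions recommended in this section."""
    text = text.translate(_TABLE)
    # Optional step: collapse sequences of whitespace.
    return re.sub(r"\s+", _collapse, text)


print(normalise_characters("a\tb\u00a0\u00a0c\u2011d so\u00adft"))
```

If such a step is applied, it should be documented in the <normalization> element of the TEI header, as described in the Section on Documenting the encoding process.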

2.2. Documenting the encoding process

Difficult encoding situations that are not covered by the TEI Guidelines should be documented in the <editorialDecl> of the corpus TEI header. In particular, if the source text has been changed (e.g. by omitting or normalising figures, text, EOL hyphens, quotes, special characters, etc., as discussed above), this practice should be documented in the <correction>, <normalization>, <quotation>, and <hyphenation> elements of the editorial declaration. Two further elements, namely <segmentation> and <interpretation>, can also be used to document these aspects of the encoding process. The example below illustrates the use of these elements:
<editorialDecl>
 <correction>
  <p>Found typos in the source have been silently corrected.</p>
 </correction>
 <normalization>
  <p>Tables have been omitted from the corpus. Spacing has been normalised
   to single space. Soft hyphens have been removed.</p>
 </normalization>
 <hyphenation>
  <p>End-of-line hyphens have been silently removed.</p>
 </hyphenation>
 <quotation>
  <p>Quotation marks have been left in the text and are not explicitly
   marked up.</p>
 </quotation>
 <segmentation>
  <p>The texts are segmented into utterances, segments (corresponding to
   paragraphs in the source transcription), sentences, words and
   punctuation.</p>
 </segmentation>
 <interpretation>
  <p>Word-level linguistic annotation comprises the lemma of a word and its
   morphosyntactic description, which follow the
   <ref target="http://nl.ijs.si/ME/V6/msd/">MULTEXT-East morphosyntactic
    specification Version 6</ref> for Slovene.</p>
 </interpretation>
</editorialDecl>
When automatic procedures have been used to encode the texts (most prominently, to add linguistic markup, as discussed in the Section on Linguistic annotation) this should be documented in the <appInfo> element of the <encodingDesc>, as shown in the example below:
<appInfo>
 <application version="1.0" ident="reldi-tagger">
  <label>ReLDI morphosyntactic tagger and lemmatiser</label>
  <desc>Part-of-speech tagging and lemmatisation performed with ReLDI Tagger
   trained for Slovene and available from
   <ref target="https://github.com/clarinsi/reldi-tagger">GitHub</ref>.</desc>
 </application>
</appInfo>

2.3. Languages

The language of an element's text content is, in TEI as in XML, signalled by the value of its xml:lang attribute. The Parla-CLARIN recommendations require that each element that contains text either carries this attribute or has an ancestor that does; in particular, the root element of the corpus should always have an xml:lang attribute. For multilingual documents (excluding cases where only a minor part of the text is in another language), the language code of the root element should be ‘mul’ for ‘multiple languages’. Note that if, going up the ancestor axis, the values of two xml:lang attributes are in conflict, the one closer to the context node is the relevant one.

The values of xml:lang should follow BCP 47, cf. also xml:lang in XML document schemas.
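
The inheritance rule for xml:lang (the closest value on the ancestor axis applies) can be illustrated with a short Python sketch using the standard library. The function name `effective_languages` and the sample fragment are assumptions made for this illustration; the sample omits the TEI namespace for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_languages(root):
    """Map every element to the xml:lang value that applies to it."""
    result = {}

    def walk(elem, inherited):
        # An element's own xml:lang overrides the value inherited from ancestors.
        lang = elem.get(XML_LANG, inherited)
        result[elem] = lang
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result


# Hypothetical multilingual fragment (TEI namespace omitted for brevity).
SAMPLE = """<text xml:lang="mul">
  <div xml:lang="nl"><p>Goedemorgen.</p><p xml:lang="fr">Bonjour.</p></div>
  <div><p>...</p></div>
</text>"""

root = ET.fromstring(SAMPLE)
langs = effective_languages(root)
for p in root.iter("p"):
    print(p.text, langs[p])
```

An element for which this mapping yields no value has no language specified on itself or on any ancestor, which these recommendations do not allow for elements containing text.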

It is good practice to document the languages used in the <langUsage> element of the TEI header. The language names can be given in more than one language, and, when more languages are used in the transcriptions, the percentage of their use can also be indicated in the usage attribute of the <language> elements, as illustrated in the example below.
<langUsage>
 <language ident="en" xml:lang="en">English</language>
 <language ident="en" xml:lang="nl">Engels</language>
 <language usage="45" ident="nl" xml:lang="en">Dutch</language>
 <language usage="45" ident="nl" xml:lang="nl">Nederlands</language>
 <language usage="55" ident="fr" xml:lang="en">French</language>
 <language usage="55" ident="fr" xml:lang="nl">Frans</language>
</langUsage>

Apart from the above considerations, a related question is where to draw the line between the object and meta languages, i.e. the language of the corpus and the language of the markup. The TEI defines the names of the elements and attributes in English, and the language of the corpus will, of course, depend on the country of the parliament. It is less straightforward to decide in which language the attribute values (such as the values of the type attribute) should be. Parla-CLARIN recommends that these should also be in English.

2.4. Identifiers and referencing

In order to simply refer to elements of a TEI document (i.e. a Parla-CLARIN corpus), elements can be marked with an ID, i.e. given the xml:id attribute with a unique value, obeying certain format requirements as defined by W3C.

Parla-CLARIN requires an xml:id attribute on the root element of each corpus file, which should, furthermore, be identical to the filename (modulo the file extension). Parla-CLARIN also recommends that the divisions of the document (element <div>) be given identifiers. While any element can be given an xml:id, doing so indiscriminately is, in general, not a good idea; rather, only those elements that will or could be referenced should be marked with this attribute.

TEI offers a number of attributes that contain (URI) pointers. Where the reference is to an element inside the document, the value of the xml:id being referred to should be preceded by a hash (#), as mandated by the XML standard. If the ID pointed to is from another document, then the full URI needs to be used.

However, as such URIs can be very long, TEI also offers another way of pointing, which can be used to shorten such long URIs, and this is defined by the <prefixDef> element in the TEI header, as illustrated below:
<prefixDef ident="mte" matchPattern="(.+)"
 replacementPattern="http://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#$1">
 <p xml:lang="en">Private URIs with this prefix point to feature-structure elements defining the Slovenian MULTEXT-East Version 6 MSDs.</p>
</prefixDef>
With such a definition, we can use much shorter pointers in the markup of words, such as mte:Pd-nsg, which are then, via a regular expression mapping in the prefix definition, converted to the full URI http://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#Pd-nsg.
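
The expansion described above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `expand_pointer` and the dictionary `PREFIX_DEFS` are hypothetical, and the dictionary is assumed to have been filled from the ident, matchPattern and replacementPattern attributes of the <prefixDef> elements in the TEI header.

```python
import re

# Assumed to be read from <prefixDef> elements:
# ident -> (matchPattern, replacementPattern)
PREFIX_DEFS = {
    "mte": ("(.+)", "http://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#$1"),
}


def expand_pointer(pointer, prefix_defs=PREFIX_DEFS):
    """Expand a prefixed pointer such as 'mte:Pd-nsg' into a full URI."""
    prefix, sep, local = pointer.partition(":")
    if not sep or prefix not in prefix_defs:
        return pointer  # local (#id) or ordinary URI: leave unchanged
    match_pattern, replacement = prefix_defs[prefix]
    match = re.fullmatch(match_pattern, local)
    if match is None:
        return pointer
    # TEI writes group references as $1, $2, ...; substitute them explicitly.
    return re.sub(r"\$(\d)", lambda g: match.group(int(g.group(1))), replacement)


print(expand_pointer("mte:Pd-nsg"))
```

Pointers without a known prefix, such as local references starting with a hash, are returned unchanged by this sketch.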

2.5. Temporal information

Parliamentary corpora can contain significant time-related information, e.g. the date and time of a sitting, the start and end of an MP's affiliation to a particular party, the dates of the beginning and end of a political party etc. In general, such information in TEI is stored in the attributes of the pertinent element, which take as their values a date and possibly time, according to the ISO 8601 Date and Time Formats, and specified in the XML Schema Part 2: Datatypes Second Edition. TEI offers a very rich set of attributes and ancillary elements to specify time-related information, which are discussed in the Section on Dates and Times of the TEI Guidelines.

Parla-CLARIN users are free to use any of the TEI temporal attributes and elements; however, for most purposes, the following five attributes will suffice:

  • when: when a certain event happened;
  • from, to: the start and end of an event or state;
  • notBefore, notAfter: the earliest and latest known time that an event or state took place, used in cases where the exact time is not known.
To illustrate, we give below two elements that are marked with temporal attributes.
<birth when="1965-06-06">
 <placeName>Oudenaarde</placeName>
</birth>
...
<event to="2007-05-02" from="2003-06-05" xml:id="period_51">
 <label>Legislative period 51</label>
</event>

2.6. Files

While these recommendations make the assumption that a complete PPC is one TEI XML document, this does not mean that it also has to be stored in one file, as the file structure is distinct from the concept of XML documents. To enable one XML document to be composed of many files, the XInclude mechanism should be used. Typically, a corpus will then be composed of a file containing the root XML element and the corpus header, while individual text files will be included in the corpus using the <include> element from the XInclude namespace, as illustrated by the following example:

<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="Sk-11/SI-1990-05-07-01.xml"/>

As mentioned, we recommend that the file has the same name as the value of the xml:id attribute of the root element of the file. This e.g. guarantees that each file of the corpus has a unique name.
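
The naming convention can be checked mechanically. The following Python sketch compares the xml:id of the root element with the file name stripped of its extension; the function name `root_id_matches_filename` is hypothetical, and the sample document is a reduced stand-in for the file referenced in the XInclude example above.

```python
from pathlib import PurePosixPath
import xml.etree.ElementTree as ET

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def root_id_matches_filename(xml_text, filename):
    """True if the root element's xml:id equals the file name without extension."""
    root = ET.fromstring(xml_text)
    return root.get(XML_ID) == PurePosixPath(filename).stem


# Reduced, hypothetical content of a corpus component file.
SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:id="SI-1990-05-07-01">'
    "<teiHeader/></TEI>"
)

print(root_id_matches_filename(SAMPLE, "Sk-11/SI-1990-05-07-01.xml"))
```

In practice the text would be read from the file itself; the string is used here only to keep the example self-contained.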

3. Overall document structure

3.1. Corpus structure

A Parla-CLARIN corpus has the <teiCorpus> element as the root element of the corpus XML document. It contains the <teiHeader> of the corpus containing the metadata for the corpus as a whole, followed by a series of <TEI> elements that each contain one corpus component.

<teiCorpus xml:lang="xx" xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
  <!-- Common corpus metadata -->
 </teiHeader>
 <TEI xml:id="id.1">
  <teiHeader>
   <!-- Document metadata -->
  </teiHeader>
  <text>
   <body>
    <!-- Document text -->
   </body>
  </text>
 </TEI>
 <!-- More TEI elements here -->
</teiCorpus>

We do not specify what an individual component should contain, as the size and granularity of parliamentary proceedings corpora, not to mention the national rules of structuring the workings of the parliament, differ substantially. Typically, however, an individual <TEI> element contains one sitting or session or one day, in any case such data that has metadata (encoded in its <teiHeader>) that distinguishes it from the other components of the corpus.

As illustrated below, the <text> element can, apart from the obligatory <body>, also contain front matter in <front> and back matter in <back>. While the <body> will contain the transcription proper (i.e. the speeches), the former contains preamble text, and the latter various appendices or texts that are related to the speeches.

<TEI xml:id="id_1" xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
  <!-- Document metadata -->
 </teiHeader>
 <text>
  <front>
   <!-- Front matter -->
  </front>
  <body>
   <!-- Transcription text -->
  </body>
  <back>
   <!-- Back matter -->
  </back>
 </text>
</TEI>

3.2. Text divisions

In the text <body>, as well as in the <front> and <back> matter, the element that further organises the content into sections is the division, <div>, and each of the three text-bearing elements (if present) should contain at least one division.

The divisions can be nested, as shown in the example below:
<body>
 <div> ...
  <div>
   <head>Representation of members of the Federal Government</head>
   ...
  </div>
  <div>
   <head>Hour of topical interest</head>
   ...
  </div>
  <div>
   <head>Announcement of an urgent request</head>
   ...
  </div>
 </div>
</body>
In general, it is a complicated question what should constitute a division of a transcription, with the pragmatic solution being that a division will be whatever has a heading (that can be recognised by the up-conversion software) in the source transcription.
The divisions can be further characterised by their type and, possibly, subtype attributes. The use of these attributes makes sense in cases where the source either explicitly (e.g. via its structure, as in Akoma Ntoso) or implicitly (e.g. via pattern matching the content of the headings) indicates what kind of a division it is. For example:
<body>
 <div> ...
  <div type="representation">
   <head>Representation of members of the Federal Government</head>
   ...
  </div>
  <div type="topical">
   <head>Hour of topical interest</head>
   ...
  </div>
  <div type="request">
   <head>Announcement of an urgent request</head>
   ...
  </div>
 </div>
</body>
If used, the values of the type and subtype attributes will depend on the parliamentary rules of the particular country, on the need to distinguish the types of divisions, as well as on the ability to automatically recognise them or the available effort to manually add them. The Parla-CLARIN specification does therefore not enforce the use of these attributes nor does it restrict their values. However, the definition of <div> does give a set of sample values for type, which correspond to the names of the structure elements defined by Akoma Ntoso. Below we give an example of a relatively complex structure made on the basis of an Akoma Ntoso document, where the subtype attribute encodes the values of the corresponding name attribute of AKN:
<body>
 <div type="prayers">
  <head>Prayers</head>
  ...
 </div>
 <div type="oralStatements">
  <head>Speaker’s Statement</head>
  ...
 </div>
 <div type="questions">
  <head>Oral Answers to Questions</head>
  <div type="debateSection" subtype="topic">
   <head>Health</head>
   <div type="debateSection" subtype="askedPerson">
    <head>The Secretary of State was asked—</head>
    <div type="debateSection" subtype="questionAnswer">
     <head>Ambulance Waiting Times</head>
     ...
    </div>
   </div>
  </div>
 </div>
 <div type="pointOfOrder">
  <head>Points of Order</head>
  ...
 </div>
</body>
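
The requirement stated at the start of this section, that each of <front>, <body> and <back> (if present) should contain at least one division, can be checked with a few lines of Python. This is only a sketch: the function name `parts_without_div` and the sample fragment are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

TEI_NS = "{http://www.tei-c.org/ns/1.0}"


def parts_without_div(text_elem):
    """Return the names of front/body/back children lacking a direct <div>."""
    missing = []
    for name in ("front", "body", "back"):
        part = text_elem.find(TEI_NS + name)
        # Absent parts are fine; present parts need at least one <div> child.
        if part is not None and part.find(TEI_NS + "div") is None:
            missing.append(name)
    return missing


# Hypothetical <text> element whose back matter lacks a division.
SAMPLE = """<text xmlns="http://www.tei-c.org/ns/1.0">
  <front><div><head>Opening</head></div></front>
  <body><div><div type="questions"><head>Questions</head></div></div></body>
  <back><p>Appendix without a division</p></back>
</text>"""

print(parts_without_div(ET.fromstring(SAMPLE)))
```

An empty list means that every text-bearing element that is present has at least one division.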

3.3. Document variants

Parliamentary proceedings often exist in two versions: the original, ‘raw’ transcription, and the edited, ‘redacted’ transcription, which is then published. If both versions are available, it is useful for some research questions to have both transcriptions available for study within a PPC.

TEI offers a number of options on how to encode variant texts, most of them discussed in the Chapter on Linking, Segmentation, and Alignment. We here present the simplest option, where it is assumed that each transcription exists in a separate TEI document and that the segments that should be aligned between the raw and redacted transcriptions are explicitly marked up in the text. As shown in the example below, which gives one segment from the file trans-raw.xml and one from the file trans-red.xml it is in this case enough to specify the xml:id on both elements and use the corresp attribute to point to the aligned segment:
<!-- From trans-raw.xml: -->
<seg xml:id="raw.1" corresp="trans-red.xml#red.1">What did, uh, who say that?</seg>
<!-- From trans-red.xml: -->
<seg xml:id="red.1" corresp="trans-raw.xml#raw.1">Who said that?</seg>
It should be noted that the relation between the aligned elements does not need to be 1-1: if the relation is 0-1 or 1-0, then the non-aligned element is simply not given a corresp attribute; if the relation is n-1 or 1-n, then several pointers are given as the value of the corresp attribute, separated by whitespace, e.g. corresp="trans-raw.xml#raw.3 trans-raw.xml#raw.4".
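
When processing such alignments, the value of corresp has to be split into its individual pointers, and each pointer into the document part and the ID. A minimal Python sketch, with the hypothetical function name `parse_corresp`, could look like this:

```python
def parse_corresp(value):
    """Split a corresp value into (document, id) pairs.

    An empty document part means that the pointer is local to the
    current document.
    """
    targets = []
    for pointer in value.split():  # pointers are separated by whitespace
        document, _, fragment = pointer.partition("#")
        targets.append((document, fragment))
    return targets


print(parse_corresp("trans-raw.xml#raw.3 trans-raw.xml#raw.4"))
```

The list has one entry for a 1-1 alignment and several entries for the n-1 and 1-n cases.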

4. Corpus metadata

TEI allows significant metadata to be added to a document. The metadata is contained in the <teiHeader> element, which in corpora can appear at two levels:

It is recommended that the metadata that is common to the whole corpus is stored in the corpus TEI header, whereas the text-specific metadata is in the corpus text TEI header.

It is outside the scope of this specification to give all the details of a <teiHeader> element, for this, the user is referred to the Section on the TEI header of the TEI Guidelines, and, of course, to the example corpora that are part of the Parla-CLARIN Git repository. Here we do, however, give some examples and concentrate on the metadata that is esp. relevant to parliamentary proceedings corpora.

4.1. Typologies

Parla-CLARIN corpora can make many distinctions, such as types of governance or sessions, that can have controlled vocabularies. TEI supports the encoding and referencing of formalised ‘ontologies’, further explained in the Section on Classification Declaration of the TEI Guidelines, by using the <taxonomy> element. To illustrate, we give below the start of a speaker-type taxonomy:
<category xml:id="chair">
 <catDesc xml:lang="is"><term>Þingforseti</term>: forseti alþingis</catDesc>
 <catDesc xml:lang="en"><term>Chairperson</term>: chairman of a meeting</catDesc>
</category>
<category xml:id="regular">
 <catDesc xml:lang="is"><term>Venjulegur</term>: venjulegur ræðumaður á fundi</catDesc>
 <catDesc xml:lang="en"><term>Regular</term>: a regular speaker at a meeting</catDesc>
</category>

4.2. Speaker metadata

Unlike some other proposals, in particular Akoma Ntoso, the Parla-CLARIN recommendations assume that speaker meta-data is also included in the corpus,2 as this allows the corpus to be stand-alone with all the relevant data necessary for analysis already included in it.

Information on speakers is given in the corpus TEI header, in particular in the <listPerson> element, itself a part of the participant description, i.e. the <particDesc> element.

A <listPerson> typically contains <person> elements, which give information on an individual person, as the example below illustrates.
<person xml:id="KučanMilan">
 <persName>
  <forename>Milan</forename>
  <surname>Kučan</surname>
 </persName>
 <sex value="M"/>
 <birth when="1941-01-14">
  <placeName ref="http://www.geonames.org/3197229">Križevci</placeName>
 </birth>
</person>
Each <person> must have an xml:id attribute, so that it can be referred to from the transcription. Apart from that, the only required element is <persName>, giving the name of the person. The name can be contained directly in the element, or, as the preferred option, further decomposed into the person's surname(s) and forename(s) or even other elements, as explained in the Section on Personal Names of the TEI Guidelines.

As illustrated above, further person metadata can contain the sex of the person and their birth date and place. Other potentially useful elements are the person's <death> date and place, as well as (possibly time stamped) <education>, <occupation>, and <affiliation>.

In the context of PPCs, the <affiliation> element is especially important, as it can denote the person's membership in a political party or in the parliament. Different types of affiliations are distinguished by the different values of the role attribute, such as ‘member’. As affiliations are not necessarily fixed, the <affiliation> element can be marked for its temporal duration, as explained in the Section on Temporal information. The example below illustrates the encoding of various affiliations.
<person xml:id="AnderličAnton">
 <persName>
  <surname>Anderlič</surname>
  <forename>Anton</forename>
 </persName>
 <affiliation role="member" ref="#party.ZSMS" from="1990-05-08" to="1990-11-09"/>
 <affiliation role="member" ref="#party.LDS.1" from="1990-11-10" to="1994-03-11"/>
 <affiliation role="member" ref="#DruzPolZb" from="1990-05-08" to="1992-12-22" ana="#SK.11"/>
 <affiliation role="member" ref="#DZ" from="1992-12-23" to="1996-11-27" ana="#DZ.1"/>
</person>

The example above uses, via the ref attribute, stand-off annotation to refer to the names of the parties, which should then be defined in a <taxonomy> element in the <teiHeader>, as explained in the Section on Party metadata. The same can be done for the affiliation to the national assembly (i.e. that the person is an MP), where, additionally, a reference, via the ana attribute, is given to the legislative period in which the person was an MP.
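
Time-stamped affiliations make it possible to determine a speaker's memberships on a given day, e.g. the day of a sitting. The Python sketch below illustrates this under several assumptions: the function name `affiliations_on` is hypothetical, the from and to values are taken to be full dates (YYYY-MM-DD) with both end points inclusive, and the sample omits the TEI namespace for brevity.

```python
from datetime import date
import xml.etree.ElementTree as ET


def affiliations_on(person, day):
    """Return the ref values of the <affiliation> elements that hold on `day`."""
    refs = []
    for aff in person.iter("affiliation"):
        start, end = aff.get("from"), aff.get("to")
        # A missing from/to is treated as an open-ended period (assumption).
        if start and day < date.fromisoformat(start):
            continue
        if end and day > date.fromisoformat(end):
            continue
        refs.append(aff.get("ref"))
    return refs


# The affiliations from the example above, without the TEI namespace.
PERSON = """<person>
 <affiliation role="member" ref="#party.ZSMS" from="1990-05-08" to="1990-11-09"/>
 <affiliation role="member" ref="#party.LDS.1" from="1990-11-10" to="1994-03-11"/>
 <affiliation role="member" ref="#DruzPolZb" from="1990-05-08" to="1992-12-22" ana="#SK.11"/>
 <affiliation role="member" ref="#DZ" from="1992-12-23" to="1996-11-27" ana="#DZ.1"/>
</person>"""

person = ET.fromstring(PERSON)
print(affiliations_on(person, date(1993, 1, 15)))
```

The returned pointers can then be resolved against the definitions in the corpus TEI header to obtain, for instance, the names of the party and the assembly.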

Alternatively (or additionally), the name of the party or national assembly can also be given directly in the <affiliation> element using the <orgName> element, as illustrated below. The example also illustrates the additional information, encoded as a subordinate <affiliation> with the appropriate role, that the person was elected as an MP to represent a certain constituency, giving its name and location.
<affiliation role="member" ref="#DruzPolZb"
 from="1990-05-08" to="1996-11-27">
 <orgName xml:lang="sl">Državni zbor Republike Slovenije</orgName>
 <orgName xml:lang="en">National Assembly of the Republic of Slovenia</orgName>
 <affiliation role="constituency">
  <placeName ref="https://www.geonames.org/3194351">Novo mesto</placeName>
 </affiliation>
</affiliation>

Persons can have further attributes, and TEI offers various elements (typically typed) to express them; they are introduced in the Section on Personal Characteristics of the TEI Guidelines. The two more general ones are <state>, which contains the description of some status or quality attributed to a person (or organisation), often at some specific time or for a specific date range, and <trait>, which differs from <state> in that it is independent of the volition or action of the holder and usually does not apply only at some specific time or for a specific date range. The former could, for example, be used to encode the fact that an MP was jailed for a given period of time, while the latter could be used for the information that an MP has a disability.
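
As a minimal sketch of how these two elements might be used, the fragment below attaches a <state> and a <trait> to a person. The identifier, the name, the type values, the dates and the descriptions are all hypothetical and serve only as illustration; they are not values prescribed by the Parla-CLARIN schema.

```xml
<!-- Hypothetical person; all values below are illustrative only -->
<person xml:id="ExampleMaria">
 <persName>
  <surname>Example</surname>
  <forename>Maria</forename>
 </persName>
 <!-- a status that holds for a specific date range -->
 <state type="imprisonment" from="2001-03-01" to="2002-09-15">
  <desc xml:lang="en">Served a prison sentence</desc>
 </state>
 <!-- a characteristic that is not tied to a specific period -->
 <trait type="disability">
  <desc xml:lang="en">Wheelchair user</desc>
 </trait>
</person>
```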

It is often advantageous to refer to external knowledge sources about a person, such as Wikipedia or VIAF. This is encoded using the <idno> element, whose content is typically a URI, while the type attribute denotes the kind of knowledge source referred to.
<person xml:id="KučanMilan">
 <persName>
  <surname>Kučan</surname>
  <forename>Milan</forename>
 </persName>
 <idno type="wikimedia" xml:lang="sl">https://sl.wikipedia.org/wiki/Milan_Ku%C4%8Dan</idno>
 <idno type="wikimedia" xml:lang="en">https://en.wikipedia.org/wiki/Milan_Ku%C4%8Dan</idno>
 <idno type="viaf">https://viaf.org/viaf/68121580/</idno>
</person>

4.3. Party metadata

Information on political parties, as well as other groupings of people, such as ministries, is contained in the <listOrg> element, which is, just like the <listPerson> element, contained in the <particDesc> element of the corpus TEI header. The Section on Organizational Data of the TEI Guidelines gives the particulars on how to encode data on organisations. An example of two <org> elements with a complex name structure and interdependencies is given below:
<org xml:id="pp.SDZ" role="politicalParty"
 xml:lang="sl">
 <event from="1989-01-11" to="1991-10-13">
  <label xml:lang="en">existence</label>
 </event>
 <orgName full="yes">Slovenska demokratična zveza</orgName>
 <orgName full="yes" xml:lang="en">Slovenian Democratic Union</orgName>
 <orgName full="init">SDZ</orgName>
 <idno type="wikimedia">https://en.wikipedia.org/wiki/Slovenian_Democratic_Union</idno>
</org>
<org xml:id="pp.DS" role="politicalParty">
 <orgName full="yes" from="1989-02-16"
  to="2003-09">Socialdemokratska stranka Slovenije</orgName>
 <orgName full="yes" xml:lang="en"
  from="1989-02-16" to="2003-09">Social Democratic Party of Slovenia</orgName>
 <orgName full="init" from="1989-02-16"
  to="2003-09">SDSS</orgName>
 <orgName full="yes" from="2003-09">Slovenska demokratska stranka</orgName>
 <orgName full="yes" xml:lang="en"
  from="2003-09">Slovenian Democratic Party</orgName>
 <orgName full="init" from="2003-09">SDS</orgName>
 <idno type="wikimedia">https://en.wikipedia.org/wiki/Slovenian_Democratic_Party</idno>
</org>
As with persons, each organisation must contain an xml:id attribute, so that <person> elements can refer to it. The fact that the organisation is a political party is encoded in the role attribute, where the suggested values are defined in the Parla-CLARIN schema. The name(s) of the party are given in the <orgName> element, which also uses the full attribute to distinguish between the full name of the party and its initials. Parties are created and dissolved, and can also change their name. The former is indicated by the <event> element, which, in the example above, is labelled "existence" and where the dates of the party's existence are given in the from and to attributes. The same attributes on the <orgName> elements indicate the temporal duration of the party's names.3

4.4. Relationships between people and parties

It is also possible to encode relations between people and parties, e.g. kinship between MPs, or that one party is the successor of another one. For this purpose, TEI defines the <listRelation> element explained in detail in the Section on Personal Relationships of the TEI Guidelines. This element can be contained by the <listPerson> or <listOrg> elements and, in turn, contains the <relation> elements, each of which defines one relation. The example below shows a relation between people.
<relation name="parent"
 passive="#sbi243926" active="#sbi243926-0 #sbi243926-1"/>
The relation defines the relationship "parent" between the active persons of this relationship (i.e. those defined in the <person> element with the xml:id values of sbi243926-0 and sbi243926-1) and the passive person of this relationship (i.e. the <person> element with the xml:id value of sbi243926).
Relationships can also be mutual, e.g. in the case of spouses, as illustrated below:
<relation name="spouse"
 mutual="#sbi243926 #sbi243929"/>
Relationships can also exist between parties (i.e. organisations), for example the fact that one party is a successor of another. This can be expressed as in the example below.
<relation name="successor"
 passive="#pp.SDZ" active="#pp.DS"/>
Another example is the set of parties that make up the coalition or opposition in a given time interval. As shown below, a coalition is also encoded as a <relation>, where the references to the parties comprising it are encoded in the mutual attribute. In contrast, the opposition <relation> distinguishes the active parties which comprise it, and the passive pointer to the definition of the government to which the parties are opposed:
<listRelation>
 <relation name="coalition"
  mutual="#party.S #party.F" from="2013-05-23" to="2017-01-10"/>
 <relation name="opposition"
  active="#party.Vg #party.Bf #party.V #party.P #party.Sf" passive="#GOV_LV" from="2013-05-23"
  to="2017-01-10"/>
</listRelation>
If a coalition or opposition needs to have more information associated with it, e.g. its name, then a more complex encoding needs to be used, as illustrated below. Here all the information about the coalition (i.e. an <org> with the role coalition) is encoded in a <listOrg> element, which first defines the name of the coalition, followed by its <listRelation>, as before.
<listOrg>
 <org xml:id="cl.DEMOS" role="coalition">
  <orgName full="init">DEMOS</orgName>
 </org>
 <listRelation>
  <relation type="coalition"
   passive="#cl.DEMOS"
   active="#pp.SDZ #pp.SDSS #pp.SKD #pp.SKZ #pp.SOS #pp.ZS"/>
 </listRelation>
</listOrg>

5. Transcriptions

The transcriptions of the parliamentary debates are the central part of these recommendations and this section explains how to encode the transcriptions of speeches proper, and how to treat the commentary inserted by the transcribers, which often marks various verbal and non-verbal incidents, such as applause, interruption of speeches, voting (with results) etc. Most of these elements are explained in the Chapter on Transcription of Speech of the TEI Guidelines, and we illustrate the essential ones below.

5.1. Utterances

A speech is marked up using the <u> (utterance) element, as illustrated below:
<u who="#DavidPrior" ana="#regular">
 <seg>I ask that the draft Regulations laid before the House on 5 December be approved.</seg>
 <seg>The relevant document is the 20th Report from the Legislation Committee.</seg>
</u>
The main attribute of an utterance is who, which gives the pointer to the <person> element containing the metadata of the speaker. The <u> element can also have the ana attribute giving one or more pointers to a typology of types of speakers. In our case, it would point to a category that specifies that the speaker is a regular speaker (rather than, e.g., the chair) of the session.
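
The typology referred to by the ana attribute could, for instance, be defined as a <taxonomy> in the <classDecl> of the corpus TEI header. The following is only a sketch: the identifier of the taxonomy, the chair category and the wording of the descriptions are assumptions made for illustration.

```xml
<classDecl>
 <!-- hypothetical taxonomy of speaker types -->
 <taxonomy xml:id="speaker_types">
  <desc xml:lang="en">Types of speakers</desc>
  <category xml:id="regular">
   <catDesc xml:lang="en">A regular speaker of the session</catDesc>
  </category>
  <category xml:id="chair">
   <catDesc xml:lang="en">The chair of the session</catDesc>
  </category>
 </taxonomy>
</classDecl>
```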

The utterances can (but are not required to) be segmented using the generic TEI element for segments, <seg>, which encodes the paragraphs of the source transcription.4

5.2. Transcriber comments

Transcriber comments encode information such as interruptions, notes on what is happening in the chamber, results of voting, etc. Note that while section headings can also be taken as a kind of transcriber comment, these serve to structure a document, and are therefore treated in the Section on Text divisions.

In general, transcriber comments are encoded using <note>, although some comments can be encoded using more precise TEI elements, as explained below. Whether <note> or these more specific elements will be used, and, indeed, whether comments are preserved in the encoding at all, will depend on the needs and resources of each particular project.

The example below lists some typical transcriber comments, which are here further qualified by their type attribute:
<note type="speaker">The president, Dr. Milan Brglez:</note>
<note type="time">The session began at 10 o'clock.</note>
<note type="vote-ayes">84 voted for the adoption of the measure.</note>
<note type="vote-noes">2 voted against the adoption of the measure.</note>

The first note simply gives the speaker of the utterance that would follow it, the second one gives the time when the session started, while the third and fourth are notes on the voting results. Note that in all of these cases, instead of, or in addition to, retaining such notes, the information contained in them can also be further explicated in dedicated markup. So, the information given in the first note will typically be encoded in the who attribute, the second could be encoded in the u/@when attribute, while how to encode the last two is explained in the Section on Voting results. It is therefore a matter of editorial policy whether to faithfully reproduce the transcription, including comments, or whether to remove the comments when they have already been taken into account when processing the transcriptions.
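
A minimal sketch of the combined approach, in which the speaker note is retained and its information is also made explicit on the utterance that follows. The identifier BrglezMilan and the category chair are hypothetical (they assume a corresponding <person> element and a category in a speaker taxonomy), and the text of the speech is invented.

```xml
<!-- the original transcriber comment is kept ... -->
<note type="speaker">The president, Dr. Milan Brglez:</note>
<!-- ... and the speaker is also identified in the who attribute -->
<u who="#BrglezMilan" ana="#chair">
 <seg>I open the session.</seg>
</u>
```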

Some types of transcriber comments can also be encoded using more specific TEI elements than <note>. In particular, the TEI module for Transcription of Speech (Elements unique to spoken texts) defines the following ‘incident’ elements that correspond to various types of transcriber's comments:
  • <vocal> marks any vocalized but not necessarily lexical phenomenon, e.g. laughter, sounds of (dis)agreement from the benches etc.
  • <kinesic> marks any communicative phenomenon, not necessarily vocalized, for example clapping, a gesture or frown, etc.
  • <incident> marks any phenomenon or occurrence, not necessarily vocalized or communicative, for example incidental noises or other events affecting communication.
The example below illustrates the use of these three elements:
<u xml:lang="sl" who="#SDZ5.AhačičMonika">With this I dissolve the parliament!</u>
<vocal who="#opposition">
 <desc xml:lang="en">shouting</desc>
</vocal>
<kinesic who="#SDZ5.AhačičMonika">
 <desc xml:lang="en">banging the gavel</desc>
</kinesic>
<incident>
 <desc xml:lang="en">army storms the parliament</desc>
</incident>
<kinesic who="#government">
 <desc xml:lang="en">clapping</desc>
</kinesic>
As the example shows, the content of the transcriber comment is retained in the <desc> element, while the who attribute can be used to specify who is responsible for the incident. All three elements can also be further qualified using their type and subtype attributes.

It should be noted that <note> and the three ‘incident’ elements can be placed either on the same level as the <u> elements, or directly inside them (so, on the same level as <seg> elements), or even inside <seg> elements. It is recommended to place them as far up the hierarchy as possible, and especially to avoid having them inside elements which otherwise contain text only (i.e. utterances or segments), as placing elements there leads to mixed content, which is difficult to process further, in particular if the text is to be linguistically annotated. However, if a transcriber comment is placed in the middle of the text, then it needs to be encoded inside the utterance, except if the utterance is split, as is further explained in the Section on Interrupted utterances.
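
To illustrate this recommendation, the sketch below contrasts two placements of the same comment. The speaker identifier and the text of the speech are invented for the example.

```xml
<!-- Recommended: the comment is a sibling of the segments,
     so each <seg> contains text only -->
<u who="#speaker1">
 <seg>I thank the minister for the answer.</seg>
 <kinesic>
  <desc xml:lang="en">applause</desc>
 </kinesic>
 <seg>Let me now turn to the budget.</seg>
</u>

<!-- Only if the comment occurs in the middle of a paragraph:
     the <seg> now has mixed content -->
<u who="#speaker1">
 <seg>I thank the minister <kinesic>
   <desc xml:lang="en">applause</desc>
  </kinesic> for the answer.</seg>
</u>
```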

5.3. Gaps

The transcribers can also comment that a part of the speech was not transcribed, e.g. because the recording was not understood, sometimes also noting the reason why, such as that the microphone was not turned on, that there was noise in the chamber, or that the speaker was speaking too quietly. These notes can be encoded as the <gap> element, which is then also marked with reason="inaudible". The original transcriber note can be left in the <desc> element, as illustrated below.
... I would further state that <gap reason="inaudible">
 <desc>Microphone off</desc>
</gap> and furthermore ....
The other reason for omitting a part of the transcription can be an editorial decision of the corpus compilers. The transcript can, for example, contain material that they do not want to include in the corpus, such as tables, or parts of the transcription that cannot be converted to text. In these cases, the reason given should be "editorial", while the <desc> should contain what has been omitted, as illustrated below.
<gap reason="editorial">
 <desc xml:lang="en">Table omitted</desc>
</gap>

5.4. Interrupted utterances

A special case occurs when a transcription note states that somebody interrupted the speaker and gives the transcript of the interruption, with the main speaker then continuing with their speech, as in the following snippet:
Boris Johnson: I propose a no-deal Brexit. /Jeremy Corbyn: Traitor!/ Because England does not want any dealings with the European Union.
While the interruption might be simply encoded as a <note> or <vocal> element, as explained above, it is more precisely encoded as a separate utterance, which brings with it the problem that nested utterances are not allowed, so the main utterance then needs to be split into two (or more) pieces. The example below illustrates how this is encoded:
<u who="#BorisJohnson" xml:id="GB001.8.3"
 next="#GB001.8.5">I propose a no-deal Brexit.</u>
<u who="#JeremyCorbyn" xml:id="GB001.8.4">Traitor!</u>
<u who="#BorisJohnson" xml:id="GB001.8.5"
 prev="#GB001.8.3">Because England does not want any dealings with the European Union.</u>
As can be seen, the split is indicated by the use of the next attribute on the first part of the split utterance and by the prev attribute on the following part. The values of the attributes are pointers to the identifier of the next or previous part of the split utterance.5
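
When a speech is interrupted more than once, the same mechanism can be chained. The following sketch uses hypothetical identifiers and invented text; the middle part of the split utterance carries both the prev and the next attribute.

```xml
<!-- all identifiers and speech text are illustrative only -->
<u who="#speakerA" xml:id="ex.u1"
 next="#ex.u3">First part of the speech.</u>
<u who="#speakerB" xml:id="ex.u2">First interruption!</u>
<u who="#speakerA" xml:id="ex.u3"
 prev="#ex.u1" next="#ex.u5">Second part of the speech.</u>
<u who="#speakerB" xml:id="ex.u4">Second interruption!</u>
<u who="#speakerA" xml:id="ex.u5"
 prev="#ex.u3">Final part of the speech.</u>
```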

5.5. Addressees, questions and answers

Sometimes particular speeches are in fact questions (or statements) addressed to a person, or answers to particular questions, again, directed at the person that posed the question, and it might be advantageous to encode this fact. The encoding of a question and answer, and to whom it is directed is illustrated below.
<u xml:id="q_1" who="#kappa" toWhom="#eta"
 ana="#question">
 <seg>I would like to ask Mr. Eta about ...</seg>
</u>
<u xml:id="a_1" who="#eta" toWhom="#kappa"
 ana="#answer">
 <seg>Mr. Kappa, BNAT was the only umbrella professional body for ...</seg>
</u>
The person at whom a speech is directed is encoded in the toWhom attribute on the <u> element, which is, like who, a reference which must point to a defined person. For the fact that the first speech is a question, and the second an answer, we here presuppose a taxonomy in the <teiHeader> of the corpus, which defines the types of utterances we wish to distinguish, and where question and answer are the IDs of the appropriate categories.

5.6. Voting results

One further aspect of the transcripts, which can be of particular interest for some researchers, needs to be mentioned, namely voting results. Voting results are typically mentioned in the transcripts as a note, and we follow Akoma Ntoso in its treatment. Assuming a <taxonomy> in the TEI header that defines "ayes" and "noes", the note can be marked up using the <measure> element, as the following example shows:
<note type="summary">(Question carried by <measure xml:id="quantity_1"
  corresp="#ayes" quantity="72">72</measure> to <measure xml:id="quantity_2"
  corresp="#noes" quantity="56">56</measure> votes)</note>
In addition to marking up the voting results in-line, they can also be more formally encoded at the start of the <body>, using the <listEvent> element to contain the list of individual votes in the <event> elements. Below we give a more complicated example, where the voting was followed by a recount, and this fact is encoded in the <relation> element, again assuming the appropriate taxonomies in the <teiHeader>:
<listEvent>
 <event type="voting" xml:id="vot1"
  ana="#approved" corresp="true">
  <desc>
   <measure type="quorum"
    xml:id="vot1-quo1" ana="#majority" quantity="80"/>
   <measure type="count" xml:id="vot1-cnt2"
    ana="#ayes" corresp="#quantity_1" quantity="72"/>
   <measure type="count" xml:id="vot1-cnt3"
    ana="#noes" corresp="#quantity_2" quantity="34"/>
  </desc>
 </event>
 <event type="recount" xml:id="rct1"
  ana="#approved" corresp="true">
  <desc>
   <measure type="count" xml:id="vot-cnt1"
    ana="#ayes" corresp="#quantity_3" quantity="76"/>
  </desc>
 </event>
 <listRelation>
  <relation name="recount" active="#rct1"
   passive="#vot1"/>
 </listRelation>
</listEvent>

6. Linguistic annotation

This section introduces basic types of linguistic annotation that can be added to PPCs; the examples should be sufficient for users to be able to add further types of linguistic annotation to their corpora.

It is recommended that linguistically annotated PPCs are stored in two versions, one with the linguistic annotations, and the other without them. The reason for this is that many users prefer to use the ‘plain-text’ version, e.g. because they want to perform their own linguistic annotation, or this kind of annotation is simply not relevant for their research questions.

The TEI Guidelines discuss basic linguistic annotation in their Chapter on Simple Analytic Mechanisms and we follow one particular option given there. In particular, it is recommended that (where possible) the annotation is in-line (as opposed to stand-off), i.e. that the linguistic annotation is given in the main document, and therefore mixed with the other annotations, rather than in a separate document with pointers into the base text.

6.1. Basic word-level annotation

Basic linguistic annotation comprises sentence segmentation, tokenisation, part-of-speech tagging and lemmatisation. The Parla-CLARIN recommendations specialise the recommendations given in the Section on Lightweight Linguistic Annotation of the TEI Guidelines. The following example shows the basic principles of the annotation:
<s>
 <w msd="UPosTag=DET|Case=Gen|Gender=Neut|Number=Sing|PronType=Dem"
  lemma="ta">Tega</w>
 <w msd="UPosTag=PRON|PronType=Prs|Reflex=Yes|Variant=Short"
  lemma="se">se</w>
 <w msd="UPosTag=PART" lemma="sploh">sploh</w>
 <w msd="UPosTag=AUX|Mood=Ind|Number=Sing|Person=1|Polarity=Neg|Tense=Pres|VerbForm=Fin"
  lemma="biti">nisem</w>
 <w msd="UPosTag=VERB|Aspect=Perf|Gender=Masc|Number=Sing|VerbForm=Part"
  lemma="zavesti" join="right">zavedel</w>
 <pc msd="UPosTag=PUNCT">.</pc>
</s>
Sentences are marked up using the <s> element, words with the <w> element and punctuation symbols with the <pc> element. To retain the linguistically significant whitespace, the join attribute is used, with the possible values being no (assumed to be the default), right (no whitespace to the right of the token), left (no whitespace to the left of the token) and both (no whitespace to either side of the token). While, in the preceding example, it would be more intuitive to have the value left marked on the full stop, we recommend that only the value right is used on the preceding token, as this simplifies processing.

The base form of a word is given in the lemma attribute,6 while the situation with the part-of-speech tags is somewhat more complicated. For analytic tagsets, where a "part-of-speech tag" is actually a set of attribute-value pairs, as in the example above, the msd attribute should be used. For synthetic tagsets, such as the Penn Treebank tagset, which have atomic tags that cannot always be decomposed into attribute-value pairs (e.g. "TO" for the word "to"), a better alternative is to use the pos attribute.

There is also a third option, for tags that look like atomic strings but are meant as a shorthand for a feature-structure representation, as is the case with the MULTEXT-East tagset. For these, it is best to use the generic ana attribute, whose value is a pointer, as shown in the following example:
<s>
 <w ana="#Pd-nsg" lemma="ta">Tega</w>
 <w ana="#Px------y" lemma="se">se</w>
 <w ana="#Q" lemma="sploh">sploh</w>
 <w ana="#Va-r1s-y" lemma="biti">nisem</w>
 <w ana="#Vmep-sm" lemma="zavesti"
  join="right">zavedel</w>
 <pc ana="#Z">.</pc>
</s>
Here, the tags are pointers to identifiers, where the elements bearing these identifiers define the appropriate feature-structures, i.e. pairs of attribute-values, as in the example below:
<fs xml:id="Pd-nsg" xml:lang="en">
 <f name="CATEGORY">
  <symbol value="Pronoun"/>
 </f>
 <f name="Type">
  <symbol value="demonstrative"/>
 </f>
 <f name="Gender">
  <symbol value="neuter"/>
 </f>
 <f name="Number">
  <symbol value="singular"/>
 </f>
 <f name="Case">
  <symbol value="genitive"/>
 </f>
</fs>
Such feature structures are grouped together in the feature-value library (<fvLib>) element, which can be contained in its own <TEI> element of the corpus. As ana is a pointer, it can also contain complete URLs (e.g. http://nl.ijs.si/ME/V6/msd/tables/msd-fslib2-sl.xml#Pd-nsg), which enables the feature-structure definitions to be stored externally to the corpus. However, prefixing the PoS tag of each token with the complete URL would lead to very large files. This is why the TEI offers a mechanism to shorten references to URLs. This mechanism is explained in the Section Using Abbreviated Pointers of the TEI Guidelines, and we give an example below:
<s>
 <w ana="mte:Pd-nsg" lemma="ta">Tega</w>
 <w ana="mte:Px------y" lemma="se">se</w>
 <w ana="mte:Q" lemma="sploh">sploh</w>
 <w ana="mte:Va-r1s-y" lemma="biti">nisem</w>
 <w ana="mte:Vmep-sm" lemma="zavesti"
  join="right">zavedel</w>
 <pc ana="mte:Z">.</pc>
</s>
As can be seen, the only difference to the preceding example is that the values (IDs) of the tags are preceded by mte: rather than #. This prefix should then be expanded by the processing software to whatever the <prefixDef> element, defined in the TEI header, specifies, as shown in the example below:
<prefixDef ident="mte" matchPattern="(.+)"
 replacementPattern="http://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#$1">
 <p xml:lang="en">Private MSD URIs with the prefix "mte" point to fs elements
   defining the Slovene MULTEXT-East Version 6 MSDs, cf. <ref target="http://nl.ijs.si/ME/V6/">http://nl.ijs.si/ME/V6/</ref> and <ref target="https://github.com/clarinsi/mte-msd">https://github.com/clarinsi/mte-msd</ref>.</p>
</prefixDef>

6.2. Normalised and syntactic words

In certain contexts a word (or, in general, a token) in the transcription needs to be normalised in a certain way. In the context of PPCs, this can happen with historical transcripts, which contain archaic wordforms that we wish to annotate with their modernised forms, or when the transcript is linguistically annotated and the annotation framework distinguishes original words from syntactic words (i.e. has the concept of ‘multiwords’), as is the case in the Universal Dependencies framework.

For simple normalisation, where a one-word token is normalised into another word token, the norm attribute on word or punctuation tokens should be used, as explained at the end of the Section Lightweight Linguistic Annotation of the TEI Guidelines.
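
A minimal sketch of such simple normalisation, assuming an older English transcript in which "today" is spelled with a hyphen; the sentence itself is invented for illustration.

```xml
<s>
 <w>We</w>
 <w>meet</w>
 <!-- the original spelling is the element content,
      the modernised form is given in norm -->
 <w norm="today" lemma="today" join="right">to-day</w>
 <pc>.</pc>
</s>
```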

More challenging is the case where one original word token must be represented as several normalised words, either in the context of historical corpora or, as mentioned above, in the context of multiword units. For this, we use embedded empty words with associated norm attributes, and possibly other attributes with linguistic annotation. For example, Czech has the word ‘abyste’ which is decomposed into two syntactic words, ‘aby’ and ‘byste’. This should be encoded as in the following example:7
<w>abyste <w norm="aby" lemma="aby"/>
 <w norm="byste" lemma="být"/>
</w>
There are also cases where two (or more) original words correspond to one normalised word. Here, it is the outer word that carries the norm and possibly other linguistic attributes, while the inner words are the original ones. For example, Slovene used to form the superlative form of adjectives with the word ‘naj’ written separately (and often as ‘nar’), while in contemporary Slovene the ‘naj’ is a prefix of the adjective. This case should be encoded as follows:
<w norm="najlepši" lemma="lep">
 <w>nar</w>
 <w>lepši</w>
</w>

6.3. Segmental annotation

A common annotation type, used e.g. for marking named entities or terms, is segmental annotation, where a stretch of text or tokens is simply enclosed in XML tags, as the following example illustrates:
<s>
 <name type="person">
  <w>John</w>
  <w>Malkovič</w>
 </name>
 <w>went</w>
 <w>to</w>
 <name type="location">
  <w>New</w>
  <w>York</w>
 </name>
 <pc>.</pc>
</s>
TEI offers a number of elements that can be used for such annotations, e.g.:
  • <term> for marking up terms, discussed in the Section on Terms, Glosses, Equivalents, and Descriptions of the TEI Guidelines;
  • <name> for various types of names, or the more general <rs> for ‘referring string’, e.g. <rs type="person">her husband</rs>, discussed in the Section on Referring Strings of the TEI Guidelines;
  • <num> for numbers and <measure>, usually comprising a number, a unit, and a commodity name (e.g. <measure type="weight" quantity="5000" unit="ton" commodity="coal">five thousand tons of coal</measure>), discussed in the Section on Numbers and Measures of the TEI Guidelines;
  • <date> and <time>, discussed in the Section on Dates and Times of the TEI Guidelines;
  • <seg> for cases where TEI does not have a specific element for some type of segmental markup, e.g. <seg type="swearword" subtype="religious">Damn</seg>; this element is discussed in the Section on Blocks, Segments, and Anchors of the TEI Guidelines.
It should be noted that for cases of discontinuity of the segment, the prev and next attributes can be used to link its parts together. Furthermore, the part attribute can be used to specify the type of the fragments, as shown in the following example:
<term xml:id="t1" part="I" next="#t3">di-</term> and <term xml:id="t2" part="I" next="#t3">poli</term> <term xml:id="t3" part="F">methyl</term>

6.4. Linking annotation

For analyses that establish relations between tokens or segments, such as syntactic dependency analysis or semantic role labelling, the <linkGrp> element, explained in the Chapter on Linking, Segmentation, and Alignment of the TEI Guidelines, is used. It is composed of <link> elements, which give two or more references to IDs, as illustrated in the following example:
<s xml:id="ssj1.1.5">
 <w xml:id="ssj1.1.5.t1">Tega</w>
 <w xml:id="ssj1.1.5.t2">se</w>
 <w xml:id="ssj1.1.5.t3">sploh</w>
 <w xml:id="ssj1.1.5.t4">nisem</w>
 <w join="right" xml:id="ssj1.1.5.t5">zavedel</w>
 <pc xml:id="ssj1.1.5.t6">.</pc>
 <linkGrp type="UD-SYN"
  targFunc="head argument">
  <link ana="ud-syn:obj"
   target="#ssj1.1.5.t5 #ssj1.1.5.t1"/>
  <link ana="ud-syn:expl"
   target="#ssj1.1.5.t5 #ssj1.1.5.t2"/>
  <link ana="ud-syn:advmod"
   target="#ssj1.1.5.t5 #ssj1.1.5.t3"/>
  <link ana="ud-syn:aux"
   target="#ssj1.1.5.t5 #ssj1.1.5.t4"/>
  <link ana="ud-syn:root"
   target="#ssj1.1.5 #ssj1.1.5.t5"/>
  <link ana="ud-syn:punct"
   target="#ssj1.1.5.t5 #ssj1.1.5.t6"/>
 </linkGrp>
 <linkGrp type="SRL"
  targFunc="head argument" corresp="#ssj1.1.5">
  <link ana="srl:PAT"
   target="#ssj1.1.5.t5 #ssj1.1.5.t1"/>
 </linkGrp>
</s>
In the example, each token, as well as the sentence element, is given an ID, and the first link group specifies the Universal Dependencies syntactic analysis of the sentence, while the second one gives its semantic role labels. They are distinguished by their type attribute8, while the targFunc attribute explains the functions of the references given in the target attributes of the contained <link> elements.

The contained links then give the references to the head and argument tokens of the relation, while the ana attribute specifies what kind of a relation this is. It should be noted that the value of the ana attribute is a pointer, and, in the example, we use the TEI prefix mechanism, which is then expanded via the <prefixDef> element in the TEI header to resolve into a URI pointer (as explained in the Section on Identifiers and referencing), most likely pointing to <taxonomy> categories that give the definitions of the relations. A further point to notice is that the sentence serves as the Root element of the sentence, i.e. the fifth link of the UD analysis ties together the sentence with the top-most token of the sentence.
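
The expansion of the ud-syn prefix could be set up along the following lines. This is a sketch under assumptions: the replacement pattern (which here resolves to a local identifier), the identifier of the taxonomy, and the category identifiers and descriptions are hypothetical, and a real corpus might instead point to an external URI.

```xml
<!-- both elements would be placed in the <encodingDesc> of the TEI header -->
<listPrefixDef>
 <!-- with this definition, ud-syn:obj expands to #obj -->
 <prefixDef ident="ud-syn" matchPattern="(.+)"
  replacementPattern="#$1">
  <p xml:lang="en">Private URIs with the prefix "ud-syn" point to
   categories defining the syntactic relations.</p>
 </prefixDef>
</listPrefixDef>
<classDecl>
 <taxonomy xml:id="UD-SYN">
  <category xml:id="obj">
   <catDesc xml:lang="en">object</catDesc>
  </category>
  <category xml:id="advmod">
   <catDesc xml:lang="en">adverbial modifier</catDesc>
  </category>
 </taxonomy>
</classDecl>
```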

7. Multimedia

Parliamentary corpora can also have data from other modalities associated with the transcripts, in particular audio or video recordings, and the facsimile of the original transcripts, particularly relevant for older parliamentary proceedings. This section explains how to encode such data in the TEI encoded documents, where it is assumed that the actual speech, video and images are stored in separate files, and the TEI document makes reference to them.

7.1. Speech and video

The transcription can refer to and align with external audio and video data using the <timeline> element, explained in the Section on Placing Synchronous Events in Time of the TEI Guidelines, and further elaborated in ISO 24624:2016 Language resource management -- Transcription of spoken language. While the ISO standard is better elaborated, it also changes and adds element definitions, so we are using the standard TEI variant of speech encoding as far as the schema is concerned, while taking into account, as much as possible, specific encoding choices as proposed by the ISO standard.

First, TEI offers the <recordingStmt> element (a part of <fileDesc> of the TEI header) which contains the information about the recording(s) of the transcription. This information can be unstructured (i.e. a series of <ab> elements) or structured (contained in the <recording> element); Parla-CLARIN recommends the structured version. As shown in the example below, the element contains information on the type of recording (audio / video), its duration9 and a pointer to the file, possibly a responsibility statement (<respStmt>) of the person or agency that made the recording, the date when the recording file was made and the equipment used:
<recordingStmt>
 <recording type="audio" dur="PT43M45S">
  <media mimeType="audio/wav"
   url="WAV/Session_2018-12-01a.wav"/>
  <respStmt>
   <resp>Audio capture</resp>
   <name>John Dury</name>
  </respStmt>
  <time>2016-04-15</time>
  <equipment>
   <ab>Video downloaded from U.K. parliament site.</ab>
   <ab>Audio extracted from video with Audacity 1.4</ab>
  </equipment>
 </recording>
</recordingStmt>

The mapping of time intervals of the recording to IDs in the TEI document is encoded in the <timeline> element, in particular in the contained <when> elements. As explained below, these IDs are then used to link elements in the transcription to the timeline, therefore each <when> element must have the xml:id attribute. The <when> elements must also be in the same order as the time-points they encode.

As the example below shows, the timeline gives the unit in which the intervals are specified (typically second, s) and the time origin of the timeline, here referring to the first <when> element, at the very start of the recording, so specified with the absolute time. Further <when> elements give the interval between this origin point and their end:
<timeline unit="s" origin="#T0">
 <when xml:id="T0" absolute="00:00:00.0"/>
 <when xml:id="T1" interval="1.13" since="#T0"/>
 <when xml:id="T2" interval="3.84" since="#T0"/>
 <when xml:id="T3" interval="5.33" since="#T0"/>
 <when xml:id="T4" interval="9.35" since="#T0"/>
 <when xml:id="T5" interval="12.62" since="#T0"/>
</timeline>
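The relation between <when> elements and positions in the recording can be sketched in a few lines of Python. This is only an illustration, not part of the Parla-CLARIN tooling: the function name timeline_offsets is hypothetical, and the sketch assumes that every <when> other than the origin carries both interval and since, and that since always points to an earlier <when>.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Shortened copy of the timeline from the example above.
TIMELINE = """
<timeline xmlns="http://www.tei-c.org/ns/1.0" unit="s" origin="#T0">
 <when xml:id="T0" absolute="00:00:00.0"/>
 <when xml:id="T1" interval="1.13" since="#T0"/>
 <when xml:id="T2" interval="3.84" since="#T0"/>
</timeline>
"""


def timeline_offsets(timeline):
    """Return {when ID: offset from the origin, in the timeline's unit}."""
    origin = timeline.get("origin")[1:]  # strip the leading '#'
    offsets = {}
    for when in timeline.findall(TEI + "when"):
        when_id = when.get(XML_ID)
        if when_id == origin:
            offsets[when_id] = 0.0
        else:
            # Assumption: @interval and @since are both present and
            # @since refers to a <when> that occurs earlier in the timeline.
            reference = when.get("since")[1:]
            offsets[when_id] = offsets[reference] + float(when.get("interval"))
    return offsets


offsets = timeline_offsets(ET.fromstring(TIMELINE))
print(offsets)
```

Because the offsets are accumulated in document order, the sketch relies on the requirement stated above that the <when> elements appear in the same order as the time-points they encode.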
The IDs of the timeline synchronisation are then used by the <u> elements in the transcription via their start and end attributes. In the examples below we give three cases of such linking: the first utterance gives a straightforward temporal structure on the <u>; the second uses the empty <anchor> element to give additional temporal structure for cases where the synchronised parts of the utterance are not further marked up (or the synchronisation is required for elements that don't have the start and end attributes); while the third and fourth utterances demonstrate the case where two utterances partially overlap10:
<u who="#SPK0" start="#T0" end="#T1" xml:id="u2">Good morning!</u>
<u who="#SPK1" start="#T1" end="#T3">Good morning, <anchor synch="#T2"/>Mr. president!</u>
<u who="#SPK0" start="#T4" end="#T7">You do not have the <anchor synch="#T5"/>floor!</u>
<u who="#SPK1" start="#T5" end="#T6">Sorry, <anchor synch="#T2"/>mate!</u>
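To illustrate how the start and end pointers can be used, the following Python sketch resolves them against a timeline and checks whether two utterances overlap in time. It is a hedged illustration only: the <div> wrapper, the IDs u3 and u4, and the intervals given for T6 and T7 are invented for the sketch (the timeline shown earlier stops at T5), and the helper names utterance_spans and overlaps are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented for this sketch: the <div> wrapper, u3/u4, and the T6/T7 intervals.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0">
 <timeline unit="s" origin="#T0">
  <when xml:id="T0" absolute="00:00:00.0"/>
  <when xml:id="T4" interval="9.35" since="#T0"/>
  <when xml:id="T5" interval="12.62" since="#T0"/>
  <when xml:id="T6" interval="13.40" since="#T0"/>
  <when xml:id="T7" interval="14.05" since="#T0"/>
 </timeline>
 <u xml:id="u3" who="#SPK0" start="#T4" end="#T7">You do not have the <anchor synch="#T5"/>floor!</u>
 <u xml:id="u4" who="#SPK1" start="#T5" end="#T6">Sorry!</u>
</div>
"""


def utterance_spans(root):
    """Return {utterance ID: (start, end)} in the unit of the timeline."""
    offsets = {}
    for when in root.iter(TEI + "when"):
        # Simplification: every interval is counted from the origin,
        # and the origin itself (no @interval) is at offset 0.
        offsets[when.get(XML_ID)] = float(when.get("interval", "0"))
    spans = {}
    for u in root.iter(TEI + "u"):
        start = offsets[u.get("start")[1:]]  # strip the leading '#'
        end = offsets[u.get("end")[1:]]
        spans[u.get(XML_ID)] = (start, end)
    return spans


def overlaps(a, b):
    """True if two (start, end) spans share more than a single time-point."""
    return a[0] < b[1] and b[0] < a[1]


spans = utterance_spans(ET.fromstring(SAMPLE))
print(spans, overlaps(spans["u3"], spans["u4"]))
```

Two utterances that merely touch (one ends at the time-point where the next starts, as with the first two utterances in the example above) are not reported as overlapping by this test.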

7.2. Facsimile

In cases where the facsimile of the original transcription is available (especially valuable for older parliamentary proceedings, where the exact appearance of the original proceedings is of interest), it is advantageous to enable viewing the original together with the encoded transcription. How to achieve this, in general, is explained in the Chapter on Representation of Primary Sources of the TEI Guidelines.

The simplest but also the most limiting way to achieve this is to have per-page facsimile files and use the page break i.e. <pb> element to mark page boundaries in the transcript and then directly specify the image file of the page with the facs attribute, as illustrated in the example below:
<body>
 <pb facs="PNG/page1.png"/>
 <!-- text contained on page 1 encoded here -->
 <pb facs="PNG/page2.png"/>
 <!-- text contained on page 2 encoded here -->
</body>
By convention, this encoding indicates that the image given in the facs attribute represents the whole of the text following the <pb> element, up to the next <pb> element. The page break element can also contain a reference to the source HTML of web-harvested proceedings using the source attribute, and per-page links to audio or video (cf. the Section on Speech and video), as illustrated in the following example:
<pb n="1"
 source="https://www.psp.cz/eknih/2013ps/stenprot/044schuz/s044051.htm"
 xml:id="ParlaMint-CZ_2016-04-13-ps2013-044-02-006-162.pb1"
 corresp="#ps2013-044-02-006-162.audio1"/>
A more complicated solution to referring to facsimiles, where it is possible to have several images per page (e.g. in different resolutions) or to specify areas of a page is enabled by the <facsimile> element, which should appear immediately before the <text> element of a TEI document. The example below refers to the first and third pages directly with the <graphic> element, whereas the second page is encoded as a <surface> which then contains the page image in two resolutions:
<facsimile>
 <graphic xml:id="page1" url="PNG/page1.png"/>
 <surface xml:id="page2">
  <graphic type="600dpi" url="PNG/page2-highRes.png"/>
  <graphic type="300dpi" url="PNG/page2-lowRes.png"/>
 </surface>
 <graphic xml:id="page3" url="PNG/page3.png"/>
</facsimile>
Such definitions are then referred to via local references in the facs attribute of <pb>, as discussed previously.
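The two ways of using facs (a direct image URL, or a local reference into <facsimile>) can be handled uniformly by a consumer of the corpus. The Python sketch below shows one possible approach under stated assumptions: the sample document is trimmed (it has no <teiHeader>), the function name page_images is hypothetical, and a <surface> is simply expanded to all the <graphic> elements it contains.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Trimmed sample for illustration only (a valid document needs a <teiHeader>).
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
 <facsimile>
  <graphic xml:id="page1" url="PNG/page1.png"/>
  <surface xml:id="page2">
   <graphic type="600dpi" url="PNG/page2-highRes.png"/>
   <graphic type="300dpi" url="PNG/page2-lowRes.png"/>
  </surface>
 </facsimile>
 <text>
  <body>
   <pb facs="#page1"/>
   <pb facs="#page2"/>
   <pb facs="PNG/page3.png"/>
  </body>
 </text>
</TEI>
"""


def page_images(root):
    """Return, for each <pb> in document order, the list of its image URLs."""
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    pages = []
    for pb in root.iter(TEI + "pb"):
        urls = []
        for pointer in pb.get("facs", "").split():
            if not pointer.startswith("#"):
                urls.append(pointer)  # direct reference to an image file
                continue
            target = by_id[pointer[1:]]
            if target.tag == TEI + "graphic":
                urls.append(target.get("url"))
            else:
                # e.g. a <surface>: collect every image it contains
                urls.extend(g.get("url") for g in target.iter(TEI + "graphic"))
        pages.append(urls)
    return pages


pages = page_images(ET.fromstring(SAMPLE))
print(pages)
```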

More complicated cases, such as delimiting portions of a page, are also supported by the TEI Guidelines, and for these the reader is referred directly to the Section on Digital Facsimiles.

8. Conversions

A TEI encoded document is, in general, not meant to be used directly by software programs, rather it serves as an interchange and storage format. Furthermore, most TEI documents are not "born TEI", but rather converted into TEI from some source format. In this section we discuss some up- and down-conversion scripts that have already been developed for transforming source formats into Parla-CLARIN and from Parla-CLARIN into formats immediately usable by software and are available in the Git repository of Parla-CLARIN.

8.1. Conversion from Akoma Ntoso

As mentioned in the Section on Introducing Akoma Ntoso, this standard is used to encode parliamentary proceedings of several countries, and this section introduces the conversion of AKN documents to Parla-CLARIN that has been developed. The example documents and conversion script can be found in the Examples/AkomaNtoso folder of the Parla-CLARIN Git repository. For a detailed treatment of the conversion, the XSLT script akn2tei.xsl should be consulted; here we mention only some of the more important aspects of the conversion.

First, and on a minor point, the conversion to TEI attempts to preserve the IDs of the source AKN document. However, Akoma Ntoso distinguishes three ID-bearing attributes, namely eId (expression identifier), wId (work identifier) and GUID (globally unique identifier). The first is simply mapped to the xml:id attribute of the TEI document, while the latter two are given as <idno> elements in the <publicationStmt> of the TEI header, as the following example illustrates:
<idno type="wId" corresp="#section_2_1">section_2_2</idno>
<idno type="GUID" corresp="#section_2_2">X13242</idno>

8.1.1. FRBR data

Akoma Ntoso makes use of FRBR, in particular to distinguish a ‘work’ from its ‘expression’, and the latter from its ‘manifestation’. In AKN, this information is encoded in the <identification> element, which appears inside the <meta> element, as shown in the example below:
<identification source="#palmirani">
 <FRBRWork>
  <FRBRthis value="/akn/ke/debaterecord/2011-06-10/!main"/>
  <FRBRuri value="/akn/ke/debaterecord/2011-06-10"/>
  <FRBRdate date="2011-06-10" name="generation"/>
  <FRBRauthor href="#parliament" as="#author"/>
  <FRBRcountry value="ak"/>
 </FRBRWork>
 <FRBRExpression>
  <FRBRthis value="/akn/ke/minutes/2011-06-10/eng@/!main"/>
  <FRBRuri value="/akn/ke/minutes/2011-06-10/eng@"/>
  <FRBRdate date="2011-06-25" name="markup"/>
  <FRBRauthor href="#palmirani" as="#editor"/>
  <FRBRlanguage language="eng"/>
 </FRBRExpression>
 <FRBRManifestation>
  <FRBRthis value="/akn/ke/minutes/2011-06-10/eng@/!main.xml"/>
  <FRBRuri value="/akn/ke/minutes/2011-06-10/eng@.akn"/>
  <FRBRdate date="2011-06-25" name="publication"/>
  <FRBRauthor href="#palmirani" as="#editor"/>
 </FRBRManifestation>
</identification>
Some of these elements are mapped to specific TEI elements or attributes, e.g. the language of the text specified in the <FRBRExpression> maps to the xml:lang of the <TEI> element, but this does not hold for all the FRBR information, which we also wanted to retain in the TEI document.
To convert all FRBR information to TEI, we used the recommendations of Best Practices for TEI in Libraries, in particular as given in the Section on The TEI Header and FRBR,11 where it is recommended that FRBR information is encoded in the <sourceDesc> element of the TEI header as a <listRelation>. As opposed to the original AKN <identification>, the <listRelation> contains a simple list of <relation> elements, so these must also specify the relation between the particular piece of data and the fact that it belongs to the FRBR ‘work’, ‘expression’, or ‘manifestation’. These and other formalised relations are taken from the formal vocabularies of W3C (for RDF and OWL) and (via purl.org) of Dublin Core and vocab.org. The example below shows the conversion of the preceding AKN example into TEI:
<listRelation type="FRBR" resp="#palmirani">
 <relation ref="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
  active="/akn/ke/debaterecord/2011-06-10/!main"
  passive="http://purl.org/vocab/frbr/core#Work"/>
 <relation ref="http://www.w3.org/2002/07/owl#sameAs"
  active="http://purl.org/vocab/frbr/core#Work"
  passive="https://w3id.org/akn/ontology/allot/FRBRWork"/>
 <relation ref="http://purl.org/dc/terms/isPartOf"
  active="/akn/ke/debaterecord/2011-06-10/!main"
  passive="/akn/ke/debaterecord/2011-06-10"/>
 <relation name="generation"
  ref="http://purl.org/dc/elements/1.1/date"
  active="/akn/ke/debaterecord/2011-06-10/!main"
  passive="2011-06-10"/>
 <relation ref="http://purl.org/dc/elements/1.1/creator"
  active="/akn/ontology/organization/akn/parliament"
  passive="/akn/ke/debaterecord/2011-06-10/!main"/>
 <relation ref="/akn/ontology/role/akn/author"
  active="/akn/ontology/organization/akn/parliament"
  passive="/akn/ke/debaterecord/2011-06-10/!main"/>
 <relation ref="http://purl.org/vocab/frbr/core#Place"
  active="/akn/ke/debaterecord/2011-06-10/!main"
  passive="http://eulersharp.sourceforge.net/2003/03swap/countries#ak"/>
 <relation active="/akn/ke/debaterecord/2011-06-10/!main"
  ref="http://purl.org/vocab/frbr/core#realization"
  passive="/akn/ke/minutes/2011-06-10/eng@/!main"/>
 <relation ref="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
  active="/akn/ke/minutes/2011-06-10/eng@/!main"
  passive="http://purl.org/vocab/frbr/core#Expression"/>
 <relation ref="http://www.w3.org/2002/07/owl#sameAs"
  active="http://purl.org/vocab/frbr/core#Expression"
  passive="https://w3id.org/akn/ontology/allot/FRBRExpression"/>
 <relation ref="http://purl.org/dc/terms/isPartOf"
  active="/akn/ke/minutes/2011-06-10/eng@/!main"
  passive="/akn/ke/minutes/2011-06-10/eng@"/>
 <relation name="markup"
  ref="http://purl.org/dc/elements/1.1/date"
  active="/akn/ke/minutes/2011-06-10/eng@/!main"
  passive="2011-06-25"/>
 <relation ref="http://purl.org/dc/elements/1.1/creator"
  active="/akn/ontology/person/ita/editors/palmirani"
  passive="/akn/ke/minutes/2011-06-10/eng@/!main"/>
 <relation ref="/akn/ontology/role/akn/editor"
  active="/akn/ontology/person/ita/editors/palmirani"
  passive="/akn/ke/minutes/2011-06-10/eng@/!main"/>
 <relation ref="http://purl.org/dc/elements/1.1/language"
  active="/akn/ke/minutes/2011-06-10/eng@/!main"
  passive="eng"/>
 <relation active="/akn/ke/minutes/2011-06-10/eng@/!main"
  ref="http://purl.org/vocab/frbr/core#embodiment"
  passive="/akn/ke/minutes/2011-06-10/eng@/!main.xml"/>
 <relation ref="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
  active="/akn/ke/minutes/2011-06-10/eng@/!main.xml"
  passive="http://purl.org/vocab/frbr/core.html#term-Manifestation"/>
 <relation ref="http://www.w3.org/2002/07/owl#sameAs"
  active="http://purl.org/vocab/frbr/core.html#term-Manifestation"
  passive="https://w3id.org/akn/ontology/allot/FRBRManifestation"/>
 <relation ref="http://purl.org/dc/terms/isPartOf"
  active="/akn/ke/minutes/2011-06-10/eng@/!main.xml"
  passive="/akn/ke/minutes/2011-06-10/eng@.akn"/>
 <relation name="publication"
  ref="http://purl.org/dc/elements/1.1/date"
  active="/akn/ke/minutes/2011-06-10/eng@/!main.xml"
  passive="2011-06-25"/>
 <relation ref="http://purl.org/dc/elements/1.1/creator"
  active="/akn/ontology/person/ita/editors/palmirani"
  passive="/akn/ke/minutes/2011-06-10/eng@/!main.xml"/>
 <relation ref="/akn/ontology/role/akn/editor"
  active="/akn/ontology/person/ita/editors/palmirani"
  passive="/akn/ke/minutes/2011-06-10/eng@/!main.xml"/>
</listRelation>
There are a few points to notice in this conversion:
  • Where AKN qualifies its FRBR statements with the additional name attribute, which does not have a controlled vocabulary, its value is retained in the name attribute of the TEI <relation>.
  • The conversion also specifies the formal equivalence (OWL sameAs) of the official FRBR definition of ‘work’, ‘expression’, and ‘manifestation’ (so, http://purl.org/vocab/frbr/) with the AKN recommended ontology of these terms, namely https://w3id.org/akn/ontology/allot/.
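Since every <relation> in such a list carries ref, active and passive, the list can be read as a series of three-part statements. The Python sketch below shows one possible reading, using two relations copied from the example above. It is an assumption of the sketch, not a statement of the recommendations, that active is taken as the first member and passive as the last member of each statement; the function name relation_triples is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Two relations copied from the FRBR example above.
SAMPLE = """
<listRelation xmlns="http://www.tei-c.org/ns/1.0" type="FRBR" resp="#palmirani">
 <relation ref="http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
  active="/akn/ke/debaterecord/2011-06-10/!main"
  passive="http://purl.org/vocab/frbr/core#Work"/>
 <relation name="generation"
  ref="http://purl.org/dc/elements/1.1/date"
  active="/akn/ke/debaterecord/2011-06-10/!main"
  passive="2011-06-10"/>
</listRelation>
"""


def relation_triples(list_relation):
    """Read each <relation> as an (active, ref, passive) statement.

    Assumption of this sketch: @active is the first and @passive the last
    member of the statement; the optional @name qualifier is ignored.
    """
    return [
        (rel.get("active"), rel.get("ref"), rel.get("passive"))
        for rel in list_relation.findall(TEI + "relation")
    ]


triples = relation_triples(ET.fromstring(SAMPLE))
for triple in triples:
    print(triple)
```

Note that some values, such as the date 2011-06-10, are plain literals and others are relative paths, so a real conversion to RDF would still need to decide how to turn each value into an IRI or a typed literal.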

8.1.2. Converting addressee, role, questions and answers

Akoma Ntoso can directly specify not only the speaker of an utterance (as the value of the by attribute) but also the role of the speaker for a particular utterance (the as attribute) and to whom the speech is addressed (the to attribute), as exemplified below:
<speech by="#khalwale" to="#speaker" as="#pm">
 <p>Mr. Speaker, Sir, I beg to give notice of the following Motion:-</p>
 ...
</speech>
TEI has the attribute toWhom, which directly corresponds to the to attribute. TEI, however, does not have an attribute that would correspond to the AKN as, so this is encoded as the value of the general ana attribute, which should point to the appropriate <category> of the pertinent taxonomy for defining the roles of the speakers. The Parla-CLARIN encoding is thus:
<u who="#khalwale" ana="#pm" toWhom="#speaker">
 <seg>Mr. Speaker, Sir, I beg to give notice of the following Motion:-</seg>
 ...
</u>
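The attribute mapping just described (by to who, as to ana, to to toWhom, and <p> to <seg>) can be sketched in Python as follows. This is not the akn2tei.xsl script, only a minimal illustration of the mapping: the function name speech_to_u is hypothetical, namespaces are omitted on both sides for brevity, and any inline markup inside <p> is ignored.

```python
import xml.etree.ElementTree as ET

# The AKN example from above, without the elided content.
AKN_SPEECH = """
<speech by="#khalwale" to="#speaker" as="#pm">
 <p>Mr. Speaker, Sir, I beg to give notice of the following Motion:-</p>
</speech>
"""

# AKN attribute -> TEI attribute, as described in the text.
ATTRIBUTE_MAP = (("by", "who"), ("as", "ana"), ("to", "toWhom"))


def speech_to_u(speech):
    """Build a TEI-style <u> from an AKN <speech> (namespaces omitted)."""
    u = ET.Element("u")
    for akn_name, tei_name in ATTRIBUTE_MAP:
        value = speech.get(akn_name)
        if value is not None:
            u.set(tei_name, value)
    for p in speech.findall("p"):
        seg = ET.SubElement(u, "seg")
        seg.text = p.text  # simplification: inline markup in <p> is dropped
    return u


u = speech_to_u(ET.fromstring(AKN_SPEECH))
print(ET.tostring(u, encoding="unicode"))
```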
A similar problem, and its solution, concerns the distinction made in Akoma Ntoso between questions and answers, each of which has in AKN a dedicated element, as shown in the following example:
<question eId="question_1" by="#kappa" to="#ministerEducation">
 <p>I would like to ask the Minister for Education about ...</p>
</question>
<answer eId="answer_1" by="#eta" as="#ministerEducation">
 <p>Mr. Speaker, BNAT was the only umbrella professional body for...</p>
</answer>
In the conversion to TEI we note that something is a question or answer (or, in fact, any other type of utterance) by assuming a taxonomy specifying the categories corresponding to the types of speeches that we wish to distinguish, where question and answer are the IDs of the appropriate categories. We can then refer to these categories in the ana attribute of the utterance. Furthermore, in case we wish to directly link the question and answer, we can use the <relation> element with its name attribute set to questionAnswer. The enclosing <listRelation> can be placed in an arbitrary portion of the document, as it contains links to IDs, but is, by convention, best placed inside the answer. We exemplify the Parla-CLARIN encoding of a question and answer below:
<u xml:id="question_1" who="#kappa" toWhom="#ministerEducation" ana="#question">
 <seg>I would like to ask the Minister for Education about ...</seg>
</u>
<u xml:id="answer_1" who="#eta" ana="#ministerEducation #answer" toWhom="#kappa">
 <seg>Mr. Speaker, BNAT was the only umbrella professional body for ...</seg>
 <listRelation>
  <relation name="questionAnswer" active="#question_1" passive="#answer_1"/>
 </listRelation>
</u>
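A consumer of such an encoding can recover the question and answer pairs by following the pointers of the questionAnswer relations. The Python sketch below illustrates this under stated assumptions: the <div> wrapper is added only to make the sample a single well-formed fragment, and the function name question_answer_pairs is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The example from above, wrapped in a <div> for the purposes of the sketch.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0">
 <u xml:id="question_1" who="#kappa" toWhom="#ministerEducation" ana="#question">
  <seg>I would like to ask the Minister for Education about ...</seg>
 </u>
 <u xml:id="answer_1" who="#eta" ana="#ministerEducation #answer" toWhom="#kappa">
  <seg>Mr. Speaker, BNAT was the only umbrella professional body for ...</seg>
  <listRelation>
   <relation name="questionAnswer" active="#question_1" passive="#answer_1"/>
  </listRelation>
 </u>
</div>
"""


def question_answer_pairs(root):
    """Return (question ID, asker, answer ID, answerer) for each linked pair."""
    utterances = {u.get(XML_ID): u for u in root.iter(TEI + "u")}
    pairs = []
    for rel in root.iter(TEI + "relation"):
        if rel.get("name") != "questionAnswer":
            continue
        question_id = rel.get("active")[1:]  # strip the leading '#'
        answer_id = rel.get("passive")[1:]
        # A KeyError here means a pointer does not resolve to any <u>.
        question = utterances[question_id]
        answer = utterances[answer_id]
        pairs.append((question_id, question.get("who"),
                      answer_id, answer.get("who")))
    return pairs


pairs = question_answer_pairs(ET.fromstring(SAMPLE))
print(pairs)
```

Because the lookup goes through the xml:id values, the result does not depend on where in the document the <listRelation> is placed.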

8.2. Conversion to RDF

As explained in the Section on Introducing RDF, this data model would be a useful ‘down-conversion’ of Parla-CLARIN corpora.

On the tei-l mailing list there have already been discussions on how to link TEI with RDF, which are summarised, with further links, in Issue #1860 of the TEI GitHub project. The best way seems to be either to turn TEI markup directly into RDF triples or, where this is not possible, to use RDFa attributes on TEI elements.

An implementation might be best approached from the opposite direction, i.e. developing an ‘up-conversion’ to TEI of an existing RDF-encoded PPC.

9. Acknowledgements

The authors would like to thank all the participants of the CLARIN ParlaFormat workshop (May 23-24, 2019, Amersfoort) for their very useful comments and suggestions.

This proposal was inspired by a number of related projects, in particular: Best Practices for TEI in Libraries, the DARIAH and ELEXIS funded initiative TEI Lex0 to develop an interchange encoding for machine readable dictionaries, and the ELTeC corpus initiative by the COST Action CA16204 ‘Distant Reading for European Literary History’.

The work on these recommendations was funded by the CLARIN Research Infrastructure for Language Resources and Tools.

Appendix A Example document

This section gives a complete example document that validates according to Parla-CLARIN and aims to illustrate the encoding of various aspects of parliamentary proceedings corpora.

<teiCorpus xml:id="Parla-CLARIN-Exemplar" xml:lang="en"
 xml:base="../Examples/Parla-CLARIN-Exemplar.xml"
 xmlns="http://www.tei-c.org/ns/1.0"
 xmlns:tei="http://www.tei-c.org/ns/1.0">
 <teiHeader>
  <fileDesc>
   <titleStmt>
    <title>Exemplar to illustrate Parla-CLARIN encoding</title>
    <!-- Persons responsible for creating a corpus, for example: -->
    <!-- author of the corpus -->
    <author ref="http://viaf.org/viaf/305936424">
     <forename>Andrej</forename>
     <surname>Pančur</surname>
    </author>
    <!-- editor of the corpus -->
    <editor ref="https://orcid.org/0000-0002-1560-4099 http://viaf.org/viaf/15145066459666591823">
     <forename>Tomaž</forename>
     <surname>Erjavec</surname>
    </editor>
    <!-- other responsibilities in building the corpus -->
    <respStmt>
     <resp>TEI corpus encoding</resp>
     <persName ref="http://viaf.org/viaf/305936424">Andrej Pančur</persName>
     <persName ref="https://orcid.org/0000-0002-1560-4099 http://viaf.org/viaf/15145066459666591823">Tomaž Erjavec</persName>
    </respStmt>
    <funder>CLARIN ERIC</funder>
   </titleStmt>
   <editionStmt>
    <edition>0.3</edition>
   </editionStmt>
   <extent>
    <measure unit="texts" quantity="1">1 text</measure>
    <measure unit="utterances" quantity="6">6 utterances</measure>
   </extent>
   <publicationStmt>
    <authority>CLARIN ERIC</authority>
    <availability>
     <licence target="http://creativecommons.org/licenses/by/4.0/"/>
     <p>This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref>.</p>
    </availability>
    <distributor>CLARIN Git repository</distributor>
    <!-- date of corpus construction or publication -->
    <date when="2019-09-04">September 4th, 2019</date>
   </publicationStmt>
   <!-- Source description for the whole corpus: -->
   <sourceDesc>
    <!-- Use a more or less structured bibliographic record in accordance
         with the guidelines in
         https://tei-c.org/release/doc/tei-p5-doc/en/html/HD.html#HD3
         Minimum requirements:
         - Use the <bibl> element with the following child elements:
           - mandatory <title> and
           - optional <idno> and <date> -->
    <bibl>
     <title type="main">Website of the National Assembly</title>
     <title type="sub">Hansard</title>
     <idno type="URI">https://www.dz-rs.si/wps/portal/Home/deloDZ/seje/sejeDrzavnegaZbora/PoDatumuSeje/!ut/p/z1/04_Sj9CPykssy0xPLMnMz0vMAfIjo8zivT39gy2dDB0N3INMjAw8Db0tQ3x8fQwNvM30wwkpiAJKG-AAjgb6BbmhigCWEc4T/dz/d5/L2dBISEvZ0FBIS9nQSEh/</idno>
     <date from="1990-05-08" to="2018-06-22"/>
    </bibl>
   </sourceDesc>
  </fileDesc>
  <encodingDesc>
   <editorialDecl>
    <correction>
     <p>No correction of source texts was performed.</p>
    </correction>
    <normalization>
     <p>Only parts relevant to the example document were retained.</p>
    </normalization>
    <hyphenation>
     <p>No end-of-line hyphens were present in the source.</p>
    </hyphenation>
    <quotation>
     <p>Quotation marks have been left in the text and are not explicitly marked up.</p>
    </quotation>
    <segmentation>
     <p>The texts are segmented into utterances (speeches) and segments (corresponding to paragraphs in the source transcription).</p>
    </segmentation>
   </editorialDecl>
   <appInfo>
    <application version="1.0" ident="web-scraper">
     <label>WebScraper WWW spider</label>
     <desc>Tool used to download source documents for this corpus.</desc>
    </application>
   </appInfo>
   <classDecl>
    <!-- One or more optional taxonomies, with which we further classify
         the content and structure of parliamentary debates -->
    <taxonomy>
     <desc>Types of speakers</desc>
     <category xml:id="chair">
      <catDesc>
       <term>Chairperson</term>: chairman of a meeting. See also <term>The Speaker</term>: an MP who has been elected by other MPs to act as Chair during debates in the House of Commons.</catDesc>
     </category>
    </taxonomy>
    <!-- Project-specific classification of the structure of parliamentary periods: -->
    <taxonomy>
     <category xml:id="parla.term">
      <catDesc>
       <term>Legislative period</term>: term of the parliament between general elections.</catDesc>
      <category xml:id="parla.session">
       <catDesc>
        <term>Legislative session</term>: the period of time in which a legislature is convened for purpose of lawmaking, usually being one of two or more smaller divisions of the entire time between two elections. A session is a meeting or series of connected meetings devoted to a single order of business, program, agenda, or announced purpose.</catDesc>
       <category xml:id="parla.meeting">
        <catDesc>
         <term>Meeting</term>: Each meeting may be a separate session or part of a group of meetings constituting a session. The session/meeting may take one or more days.</catDesc>
        <category xml:id="parla.sitting">
         <catDesc>
          <term>Sitting</term>: sitting day</catDesc>
        </category>
       </category>
      </category>
     </category>
    </taxonomy>
   </classDecl>
  </encodingDesc>
  <profileDesc>
   <settingDesc>
    <setting>
     <!-- Location (possible values: city, street, address) of parliamentary sessions -->
     <name type="city">Ljubljana</name>
     <!-- In which country the parliament is located: in attribute @key use ISO 3166 country code -->
     <name type="country" key="SI">Slovenia</name>
     <!-- Time range of the whole corpus of parliamentary debates -->
     <date from="1991-05-05" to="2018-06-22"/>
    </setting>
   </settingDesc>
   <particDesc>
    <!-- List of speakers with their metadata -->
    <listPerson>
     <person xml:id="KučanMilan">
      <persName>
       <surname>Kučan</surname>
       <forename>Milan</forename>
      </persName>
      <sex value="M">male</sex>
      <birth when="1941-01-14">
       <placeName ref="http://www.geonames.org/3197229">Križevci</placeName>
      </birth>
      <idno type="wikimedia" xml:lang="sl">https://sl.wikipedia.org/wiki/Milan_Ku%C4%8Dan</idno>
      <idno type="wikimedia" xml:lang="en">https://en.wikipedia.org/wiki/Milan_Ku%C4%8Dan</idno>
      <idno type="viaf">https://viaf.org/viaf/68121580/</idno>
     </person>
     <person xml:id="JohnDoe1960">
      <persName>
       <surname>John</surname>
       <forename>Doe</forename>
      </persName>
      <affiliation ref="#party.SDZ" from="1990-05-16" to="1991-05-08"/>
      <affiliation ref="#party.DZ" from="1991-05-08"/>
     </person>
     <person xml:id="JohnsonBoris1964">
      <persName>
       <surname>Johnson</surname>
       <forename>Boris</forename>
      </persName>
      <sex value="M">male</sex>
      <birth when="1964-06-19">
       <placeName>New York City, U.S.</placeName>
      </birth>
      <idno type="wikimedia">https://en.wikipedia.org/wiki/Boris_Johnson</idno>
     </person>
     <person xml:id="CorbynJeremy1949">
      <persName>
       <surname>Corbyn</surname>
       <forename>Jeremy</forename>
       <forename>Bernard</forename>
      </persName>
      <sex value="M">male</sex>
      <birth when="1949-05-26">
       <placeName>Chippenham, England, United Kingdom</placeName>
      </birth>
      <idno type="wikimedia">https://en.wikipedia.org/wiki/Jeremy_Corbyn</idno>
     </person>
    </listPerson>
    <!-- List of "organisations", i.e. political parties and other formally established groupings -->
    <listOrg>
     <org xml:id="DZ">
      <orgName xml:lang="sl">Državni zbor Republike Slovenije</orgName>
      <orgName>National Assembly of the Republic of Slovenia</orgName>
      <event from="1991-11-11">
       <label>existence</label>
      </event>
     </org>
     <org xml:id="party.SDZ" role="politicalParty" xml:lang="sl">
      <event from="1989-01-11" to="1991-10-13">
       <label>existence</label>
      </event>
      <orgName full="yes" xml:lang="sl">Slovenska demokratična zveza</orgName>
      <orgName full="yes">Slovenian Democratic Union</orgName>
      <orgName full="init" xml:lang="sl">SDZ</orgName>
      <idno type="wikimedia">https://en.wikipedia.org/wiki/Slovenian_Democratic_Union</idno>
     </org>
     <org xml:id="party.DS" role="politicalParty">
      <orgName full="yes" from="1989-02-16" to="2003-09" xml:lang="sl">Socialdemokratska stranka Slovenije</orgName>
      <orgName full="yes" from="1989-02-16" to="2003-09">Social Democratic Party of Slovenia</orgName>
      <orgName full="init" from="1989-02-16" to="2003-09" xml:lang="sl">SDSS</orgName>
      <orgName full="yes" from="2003-09" xml:lang="sl">Slovenska demokratska stranka</orgName>
      <orgName full="yes" from="2003-09">Slovenian Democratic Party</orgName>
      <orgName full="init" from="2003-09" xml:lang="sl">SDS</orgName>
      <idno type="wikimedia">https://en.wikipedia.org/wiki/Slovenian_Democratic_Party</idno>
     </org>
     <listRelation>
      <relation name="successor" passive="#party.SDZ" active="#party.DS"/>
      <relation xml:id="opposition.1" name="opposition" active="#party.SDZ" passive="#DZ"/>
     </listRelation>
    </listOrg>
   </particDesc>
   <langUsage>
    <language ident="en">English</language>
    <language ident="sl">Slovenian</language>
   </langUsage>
  </profileDesc>
 </teiHeader>
 <TEI xml:id="document.id" xml:lang="en">
  <teiHeader>
   <fileDesc>
    <titleStmt>
     <!-- There are no rules on how these titles should be written -->
     <title>The parliament of the Republic of Slovenia</title>
     <title>Continuation of the second session</title>
     <title>30th January 2011</title>
     <!-- All relevant information about the type of session/meeting/sitting
          (in accordance with the project specific taxonomy) is given in the
          meeting element:
          - @n: ordinal number of the session/meeting/sitting
          - @corresp: a link to an organization holding a meeting
          - @ana: one or more links to taxonomy on the different types of
            parliamentary sessions -->
     <meeting n="2" corresp="#DZ" ana="#parla.meeting"/>
    </titleStmt>
    <!-- Publication statement same as in teiCorpus/teiHeader -->
    <publicationStmt>
     <authority>CLARIN ERIC</authority>
     <availability>
      <licence target="http://creativecommons.org/licenses/by/4.0/"/>
      <p>This work is licensed under the <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0 International License</ref>.</p>
     </availability>
     <distributor>CLARIN Git repository</distributor>
     <date when="2019-07-24">24. 7. 2019</date>
    </publicationStmt>
    <sourceDesc>
     <bibl>
      <title>Continuation of the second session</title>
      <idno type="URI">https://www.dz-rs.si/wps/portal/Home/deloDZ/seje/evidenca?mandat=III&amp;type=sz&amp;uid=6A9C9127BB26C19AC12569E600561164</idno>
      <date when="2001-01-30">30. 1. 2001</date>
     </bibl>
    </sourceDesc>
   </fileDesc>
   <profileDesc>
    <settingDesc>
     <setting>
      <!-- Location (possible values: city, street, address) of parliamentary sessions -->
      <name type="city">Ljubljana</name>
      <!-- In which country the parliament is located: in attribute @key use ISO 3166 country code -->
      <name type="country" key="SI">Slovenia</name>
      <!-- Date of the parliamentary debate. An ana attribute may contain additional classifications -->
      <date when="2011-06-10" ana="#parla.sitting">10th June 2011</date>
     </setting>
    </settingDesc>
   </profileDesc>
  </teiHeader>
  <text>
   <front>
    <div type="preface">
     <!-- text before speeches started -->
     <head>THE PARLIAMENT OF THE REPUBLIC OF SLOVENIA</head>
     <head>Continuation of the second session</head>
     <docDate when="2011-01-30">30th January 2011</docDate>
    </div>
   </front>
   <body>
    <!-- Metadata about voting -->
    <listEvent>
     <event xml:id="vote.1" type="voting"
      corresp="#agenda.1 #quorum.1 #vote.1.ayes #vote.1.noes">
      <desc>
       <measure type="quorum" quantity="76"/>
       <measure type="ayes" quantity="47"/>
       <measure type="noes" quantity="19"/>
       <time when="2011-01-30T15:49:50"/>
      </desc>
     </event>
     <event xml:id="recount.1" type="recount"
      corresp="#agenda.1 #recount.1.ayes">
      <desc>
       <measure type="ayes" quantity="48"/>
       <time when="2011-01-30T15:52:35"/>
      </desc>
     </event>
     <listRelation>
      <relation name="recount" active="#recount.1" passive="#vote.1"/>
     </listRelation>
    </listEvent>
    <div>
     <!-- An example of starting a new meeting and recorded time -->
     <note type="time">The meeting opened at <time from="2011-01-30T10:03:00">10.03</time>.</note>
     <note type="speaker">MILAN KUČAN:</note>
     <u who="#KučanMilan" ana="#chair">
      <seg>Dear Members, Colleagues, ladies and gentlemen!</seg>
      <seg>I begin with the continuation of the second session of the National Assembly.</seg>
      <seg>How many members of the parliament are present?</seg>
     </u>
     <!-- example of quorum -->
     <note type="quorum">Present <measure xml:id="quorum.1" quantity="76">76</measure>.</note>
     <!-- example of new text division: discussing the agenda item -->
     <div xml:id="agenda.1">
      <head>THIRD AMENDMENT</head>
      <!-- example of voting -->
      <note type="speaker">MILAN KUČAN:</note>
      <u who="#KučanMilan" ana="#chair">
       <seg>We will now vote on the Third Amendment.</seg>
       <seg>How are you going to vote?</seg>
      </u>
      <note type="speaker">John Doe:</note>
      <u who="#JohnDoe1960">
       <seg>Of course, I will vote in favor.</seg>
      </u>
      <incident>
       <desc>Applause.</desc>
      </incident>
      <note type="vote">For <measure xml:id="vote.1.ayes" type="ayes" quantity="47">47</measure>. Against <measure xml:id="vote.1.noes" type="noes" quantity="19">19</measure>.</note>
      <note type="speaker">MILAN KUČAN:</note>
      <u who="#KučanMilan" ana="#chair">
       <seg>There was an error voting. I ask you to repeat the vote.</seg>
      </u>
      <!-- example of recount of votes -->
      <note type="recount">For <measure xml:id="recount.1.ayes" type="ayes" quantity="48">48</measure>.</note>
     </div>
     <!-- new agenda (British example) -->
     <div xml:id="agenda.2">
      <head>BREXIT</head>
      <!-- Interrupted utterances: Boris Johnson: I propose a no-deal Brexit.
           /Jeremy Corbyn: Traitor!/ Because England does not want any
           dealings with the European Union. -->
      <u who="#JohnsonBoris1964" xml:id="GB001.8.3" next="#GB001.8.5">
       <seg>I propose a no-deal Brexit.</seg>
      </u>
      <u who="#CorbynJeremy1949" xml:id="GB001.8.4">
       <seg>Traitor!</seg>
      </u>
      <u who="#JohnsonBoris1964" xml:id="GB001.8.5" prev="#GB001.8.3">
       <seg>Because England does not want any dealings with the European Union.</seg>
      </u>
      <!-- Incidents -->
      <vocal who="#opposition.1">
       <desc xml:lang="en">shouting</desc>
      </vocal>
      <kinesic who="#CorbynJeremy1949">
       <desc xml:lang="en">banging of the gavel</desc>
      </kinesic>
      <incident>
       <desc xml:lang="en">Army storms the parliament</desc>
      </incident>
     </div>
    </div>
   </body>
   <back>
    <!-- example of conclusions, annexes etc. -->
    <div type="conclusions">
     <note type="date">Date, <date when="2019-10-31">31st October, 2019</date>
     </note>
    </div>
   </back>
  </text>
 </TEI>
</teiCorpus>

Appendix B Formal specification

Appendix B.1 Elements

Appendix B.1.1 <TEI>

<TEI> (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple <TEI> elements may be combined within a <TEI> (or <teiCorpus>) element. [4. Default Text Structure 15.1. Varieties of Composite Text]
Module textstructure — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
version specifies the version number of the TEI Guidelines against which this document is valid.
Status Optional
Datatype teidata.version
Note

Major editions of the Guidelines have long been informally referred to by a name made up of the letter P (for Proposal) followed by a digit. The current release is one of the many releases of the fifth major edition of the Guidelines, known as P5. This attribute may be used to associate a TEI document with a specific release of the P5 Guidelines, in the absence of a more precise association provided by the source attribute on the associated <schemaSpec>.

Member of
Contained by
core: teiCorpus
textstructure: TEI
May contain
header: teiHeader
linking: standOff
textstructure: TEI text
Note

This element is required. It is customary to specify the TEI namespace http://www.tei-c.org/ns/1.0 on it, using the xmlns attribute.

Example
<TEI version="3.3.0" xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader>
  <fileDesc>
   <titleStmt>
    <title>The shortest TEI Document Imaginable</title>
   </titleStmt>
   <publicationStmt>
    <p>First published as part of TEI P2, this is the P5 version using a name space.</p>
   </publicationStmt>
   <sourceDesc>
    <p>No source: this is an original work.</p>
   </sourceDesc>
  </fileDesc>
 </teiHeader>
 <text>
  <body>
   <p>This is about the shortest TEI document imaginable.</p>
  </body>
 </text>
</TEI>
Example
<TEI version="2.9.1" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader>   <fileDesc>    <titleStmt>     <title>A TEI Document containing four page images </title>    </titleStmt>    <publicationStmt>     <p>Unpublished demonstration file.</p>    </publicationStmt>    <sourceDesc>     <p>No source: this is an original work.</p>    </sourceDesc>   </fileDesc>  </teiHeader>  <facsimile>   <graphic url="page1.png"/>   <graphic url="page2.png"/>   <graphic url="page3.png"/>   <graphic url="page4.png"/>  </facsimile> </TEI>
Schematron
<sch:ns prefix="tei"  uri="http://www.tei-c.org/ns/1.0"/> <sch:ns prefix="xs"  uri="http://www.w3.org/2001/XMLSchema"/>
Schematron
<sch:ns prefix="rng"  uri="http://relaxng.org/ns/structure/1.0"/>
Content model
<content>
 <sequence>
  <elementRef key="teiHeader"/>
  <alternate>
   <sequence>
    <classRef key="model.resource"
     minOccurs="1" maxOccurs="unbounded"/>
    <elementRef key="TEI" minOccurs="0"
     maxOccurs="unbounded"/>
   </sequence>
   <elementRef key="TEI" minOccurs="1"
    maxOccurs="unbounded"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element TEI
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   attribute version { text }?,
   ( tei_teiHeader, ( ( tei_model.resource+, tei_TEI* ) | tei_TEI+ ) )
}
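
The second branch of the content model above (a header followed only by nested <TEI> elements, with no member of model.resource at the outer level) can be sketched as follows. This is a minimal illustration, not a recommended corpus layout: the titles and the two-sitting arrangement are hypothetical.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Header describing the composite document as a whole -->
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Hypothetical collection of two sittings</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished illustration.</p>
      </publicationStmt>
      <sourceDesc>
        <p>No source: invented for this sketch.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- No text, facsimile or standOff here, so at least one nested TEI is required -->
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title>First sitting (hypothetical)</title>
        </titleStmt>
        <publicationStmt>
          <p>Unpublished illustration.</p>
        </publicationStmt>
        <sourceDesc>
          <p>No source: invented for this sketch.</p>
        </sourceDesc>
      </fileDesc>
    </teiHeader>
    <text>
      <body>
        <p>Transcript of the first sitting.</p>
      </body>
    </text>
  </TEI>
  <TEI>
    <teiHeader>
      <fileDesc>
        <titleStmt>
          <title>Second sitting (hypothetical)</title>
        </titleStmt>
        <publicationStmt>
          <p>Unpublished illustration.</p>
        </publicationStmt>
        <sourceDesc>
          <p>No source: invented for this sketch.</p>
        </sourceDesc>
      </fileDesc>
    </teiHeader>
    <text>
      <body>
        <p>Transcript of the second sitting.</p>
      </body>
    </text>
  </TEI>
</TEI>
```

Each nested <TEI> carries its own header, since the content model requires a <teiHeader> as the first child of every <TEI> element.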

Appendix B.1.2 <ab>

<ab> (anonymous block) contains any arbitrary component-level unit of text, acting as an anonymous container for phrase or inter level elements analogous to, but without the semantic baggage of, a paragraph. [16.3. Blocks, Segments, and Anchors]
Modulelinking — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.declaring (@decls) att.fragmentable (@part) att.written (@hand)
Member of
Contained by
May contain
Note

The <ab> element may be used at the encoder's discretion to mark any component-level elements in a text for which no other more specific appropriate markup is defined.

Example
<div type="book" n="Genesis">  <div type="chapter" n="1">   <ab>In the beginning God created the heaven and the earth.</ab>   <ab>And the earth was without form, and void; and      darkness was upon the face of the deep. And the      spirit of God moved upon the face of the waters.</ab>   <ab>And God said, Let there be light: and there was light.</ab> <!-- ...-->  </div> </div>
Schematron
<sch:report test=" (ancestor::tei:p or ancestor::tei:ab) and not( ancestor::tei:floatingText |parent::tei:exemplum |parent::tei:item |parent::tei:note |parent::tei:q |parent::tei:quote |parent::tei:remarks |parent::tei:said |parent::tei:sp |parent::tei:stage |parent::tei:cell |parent::tei:figure )"> Abstract model violation: ab may not occur inside paragraphs or other ab elements. </sch:report>
Schematron
<sch:report test=" (ancestor::tei:l or ancestor::tei:lg) and not( ancestor::tei:floatingText |parent::tei:figure |parent::tei:note )"> Abstract model violation: Lines may not contain higher-level divisions such as p or ab, unless ab is a child of figure or note, or is a descendant of floatingText. </sch:report>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element ab
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   tei_att.declaring.attributes,
   tei_att.fragmentable.attributes,
   tei_att.written.attributes,
   tei_macro.paraContent
}

Appendix B.1.3 <abbr>

<abbr> (abbreviation) contains an abbreviation of any sort. [3.6.5. Abbreviations and Their Expansions]
Modulecore — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (type, @subtype)
type (type) allows the encoder to classify the abbreviation according to some convenient typology.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Sample values include:
suspension
(suspension) the abbreviation provides the first letter(s) of the word or phrase, omitting the remainder.
contraction
(contraction) the abbreviation omits some letter(s) in the middle.
brevigraph
the abbreviation comprises a special symbol or mark.
superscription
(superscription) the abbreviation includes writing above the line.
acronym
(acronym) the abbreviation comprises the initial letters of the words of a phrase.
title
(title) the abbreviation is for a title of address (Dr, Ms, Mr, …)
organization
(organization) the abbreviation is for the name of an organization.
geographic
(geographic) the abbreviation is for a geographic name.
Note

The type attribute is provided for the sake of those who wish to classify abbreviations at their point of occurrence; this may be useful in some circumstances, though usually the same abbreviation will have the same type in all occurrences. As the sample values make clear, abbreviations may be classified by the method used to construct them, the method of writing them, or the referent of the term abbreviated; the typology used is up to the encoder and should be carefully planned to meet the needs of the expected use. For a typology of Middle English abbreviations, see 6.2.

Member of
Contained by
May contain
Note

If abbreviations are expanded silently, this practice should be documented in the <editorialDecl>, either with a <normalization> element or a <p>.

Example
<choice>  <expan>North Atlantic Treaty Organization</expan>  <abbr cert="low">NorATO</abbr>  <abbr cert="high">NATO</abbr>  <abbr cert="high" xml:lang="fr">OTAN</abbr> </choice>
Example
<choice>  <abbr>SPQR</abbr>  <expan>senatus populusque romanorum</expan> </choice>
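
Where abbreviations are expanded silently, the practice could be documented in the header along the following lines. This is only a sketch of the <normalization> option mentioned in the note above; the wording of the paragraph is hypothetical.

```xml
<encodingDesc>
  <editorialDecl>
    <!-- Records that abbreviations were expanded without abbr/expan markup -->
    <normalization>
      <p>Abbreviations found in the source transcripts have been expanded
         silently; the abbreviated forms are not retained in the encoding.</p>
    </normalization>
  </editorialDecl>
</encodingDesc>
```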
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element abbr
{
   tei_att.global.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { text }?,
   tei_macro.phraseSeq
}

Appendix B.1.4 <abstract>

<abstract> contains a summary or formal abstract prefixed to an existing source document by the encoder. [2.4.4. Abstracts]
Moduleheader — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: profileDesc
May contain
Note

This element is intended only for cases where no abstract is available in the original source. Any abstract already present in the source document should be encoded as a <div> within the <front>, as it should for a born-digital document.

Example
<profileDesc>  <abstract resp="#LB">   <p>Good database design involves the acquisition and deployment of      skills which have a wider relevance to the educational process. From      a set of more or less instinctive rules of thumb a formal discipline      or "methodology" of database design has evolved. Applying that      methodology can be of great benefit to a very wide range of academic      subjects: it requires fundamental skills of abstraction and      generalisation and it provides a simple mechanism whereby complex      ideas and information structures can be represented and manipulated,      even without the use of a computer. </p>  </abstract> </profileDesc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.pLike"/>
  <classRef key="model.listLike"/>
 </alternate>
</content>
    
Schema Declaration
element abstract
{
   tei_att.global.attributes,
   ( tei_model.pLike | tei_model.listLike )+
}

Appendix B.1.5 <activity>

<activity> (activity) contains a brief informal description of what a participant in a language interaction is doing other than speaking, if anything. [15.2.3. The Setting Description]
Modulecorpus — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
corpus: setting
May contain
Note

For more fine-grained description of participant activities during a spoken text, the <event> element should be used.

Example
<activity>driving</activity>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element activity { tei_att.global.attributes, tei_macro.phraseSeq.limited }

Appendix B.1.6 <add>

<add> (addition) contains letters, words, or phrases inserted in the source text by an author, scribe, or a previous annotator or corrector. [3.5.3. Additions, Deletions, and Omissions]
Modulecore — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.transcriptional (@status, @cause, @seq) (att.editLike (@evidence, @instant)) (att.written (@hand)) att.placement (@place) att.typed (@type, @subtype) att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max, @confidence))
Member of
Contained by
May contain
Note

In a diplomatic edition attempting to represent an original source, the <add> element should not be used for additions to the current TEI electronic edition made by editors or encoders. In these cases, either the <corr> or the <supplied> element is recommended.

In a TEI edition of a historical text with previous editorial emendations in which such additions or reconstructions are considered part of the source text, the use of <add> may be appropriate, dependent on the editorial philosophy of the project.

Example
The story I am going to relate is true as to its main facts, and as to the consequences <add place="above">of these facts</add> from which this tale takes its title.
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element add
{
   tei_att.global.attributes,
   tei_att.transcriptional.attributes,
   tei_att.placement.attributes,
   tei_att.typed.attributes,
   tei_att.dimensions.attributes,
   tei_macro.paraContent
}

Appendix B.1.7 <addName>

<addName> (additional name) contains an additional name component, such as a nickname, epithet, or alias, or any other descriptive phrase used within a personal name. [13.2.1. Personal Names]
Modulenamesdates — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.personal (@full, @sort) (att.naming (@role, @nymRef) (att.canonical (@key, @ref)) ) att.typed (@type, @subtype)
Member of
Contained by
May contain
Example
<persName>  <forename>Frederick</forename>  <addName type="epithet">the Great</addName>  <roleName>Emperor of Prussia</roleName> </persName>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element addName
{
   tei_att.global.attributes,
   tei_att.personal.attributes,
   tei_att.typed.attributes,
   tei_macro.phraseSeq
}

Appendix B.1.8 <addSpan>

<addSpan> (added span of text) marks the beginning of a longer sequence of text added by an author, scribe, annotator or corrector (see also <add>). [11.3.1.4. Additions and Deletions]
Moduletranscr — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.transcriptional (@status, @cause, @seq) (att.editLike (@evidence, @instant)) (att.written (@hand)) att.placement (@place) att.typed (@type, @subtype) att.spanning (@spanTo) att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max, @confidence))
Member of
Contained by
May containEmpty element
Note

Both the beginning and the end of the added material must be marked; the beginning by the <addSpan> element itself, the end by the spanTo attribute.

Example
<handNote xml:id="HEOL"  scribe="HelgiÓlafsson"/> <!-- ... --> <body>  <div> <!-- text here -->  </div>  <addSpan n="added_gathering" hand="#HEOL"   spanTo="#P025"/>  <div> <!-- text of first added poem here -->  </div>  <div> <!-- text of second added poem here -->  </div>  <div> <!-- text of third added poem here -->  </div>  <div> <!-- text of fourth added poem here -->  </div>  <anchor xml:id="P025"/>  <div> <!-- more text here -->  </div> </body>
Schematron
<sch:assert test="@spanTo">The @spanTo attribute of <sch:name/> is required.</sch:assert>
Schematron
<sch:assert test="@spanTo">L'attribut spanTo est requis.</sch:assert>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element addSpan
{
   tei_att.global.attributes,
   tei_att.transcriptional.attributes,
   tei_att.placement.attributes,
   tei_att.typed.attributes,
   tei_att.spanning.attributes,
   tei_att.dimensions.attributes,
   empty
}

Appendix B.1.9 <addrLine>

<addrLine> (address line) contains one line of a postal address. [3.6.2. Addresses 2.2.4. Publication, Distribution, Licensing, etc. 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Modulecore — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
core: address
May contain
Note

Addresses may be encoded either as a sequence of lines, or using any sequence of component elements from the model.addrPart class. Other non-postal forms of address, such as telephone numbers or email, should not be included within an <address> element directly but may be wrapped within an <addrLine> if they form part of the printed address in some source text.

Example
<address>  <addrLine>Computing Center, MC 135</addrLine>  <addrLine>P.O. Box 6998</addrLine>  <addrLine>Chicago, IL</addrLine>  <addrLine>60680 USA</addrLine> </address>
Example
<addrLine>  <ref target="tel:+1-201-555-0123">(201) 555 0123</ref> </addrLine>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element addrLine { tei_att.global.attributes, tei_macro.phraseSeq }

Appendix B.1.10 <address>

<address> (address) contains a postal address, for example of a publisher, an organization, or an individual. [3.6.2. Addresses 2.2.4. Publication, Distribution, Licensing, etc. 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Modulecore — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
May contain
Note

This element should be used for postal addresses only. Within it, the generic element <addrLine> may be used as an alternative to any of the more specialized elements available from the model.addrPart class, such as <street>, <postCode> etc.

Example
Using just the elements defined by the core module, an address could be represented as follows:
<address>  <street>via Marsala 24</street>  <postCode>40126</postCode>  <name>Bologna</name>  <name>Italy</name> </address>
Example
When a schema includes the names and dates module, more specific elements such as <country> or <settlement> would be preferable to the generic <name>:
<address>  <street>via Marsala 24</street>  <postCode>40126</postCode>  <settlement>Bologna</settlement>  <country>Italy</country> </address>
Example
<address>  <addrLine>Computing Center, MC 135</addrLine>  <addrLine>P.O. Box 6998</addrLine>  <addrLine>Chicago, IL 60680</addrLine>  <addrLine>USA</addrLine> </address>
Example
<address>  <country key="FR"/>  <settlement type="city">Lyon</settlement>  <postCode>69002</postCode>  <district type="arrondissement">IIème</district>  <district type="quartier">Perrache</district>  <street>   <num>30</num>, Cours de Verdun</street> </address>
Content model
<content>
 <sequence>
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="1"
   maxOccurs="unbounded">
   <classRef key="model.addrPart"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element address
{
   tei_att.global.attributes,
   ( tei_model.global*, ( tei_model.addrPart, tei_model.global* )+ )
}

Appendix B.1.11 <affiliation>

<affiliation> (affiliation) contains an informal description of a person's present or past affiliation with some organization, for example an employer or sponsor. [15.2.2. The Participant Description]
Modulenamesdates — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.editLike (@evidence, @instant) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.typed (type, @subtype)
type characterizes the element in some sense, using any convenient classification scheme or typology.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Sample values include:
sponsor
recommend
discredit
pledged
Member of
Contained by
May contain
Note

If included, the name of an organization may be tagged using either the <name> element as above, or the more specific <orgName> element.

Example
<affiliation>Junior project officer for the US <name type="org">National Endowment for    the Humanities</name> </affiliation>
Example
This example indicates that the person was affiliated with the Australian Journalists Association at some point between the dates listed.
<affiliation notAfter="1960-01-01"  notBefore="1957-02-28">Paid up member of the <orgName>Australian Journalists Association</orgName> </affiliation>
Example
This example indicates that the person was affiliated with Mount Holyoke College throughout the entire span of the date range listed.
<affiliation from="1902-01-01"  to="1906-01-01">Was an assistant professor at Mount Holyoke College.</affiliation>
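
For parliamentary corpora, the attributes listed above (role and ref from att.naming, from and to from att.datable) allow an affiliation to be expressed in a more machine-readable way. The following is a sketch under assumptions: the identifier #org.exampleParty, the role value and the dates are hypothetical and not prescribed by the text above.

```xml
<!-- ref points to a hypothetical organization record defined elsewhere -->
<affiliation role="member" ref="#org.exampleParty"
  from="2018-06-01" to="2022-05-01">Member of the
  <orgName>Example Party</orgName>
</affiliation>
```

Here the prose content remains human-readable, while the attribute values carry the information a query would typically need.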
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element affiliation
{
   tei_att.global.attributes,
   tei_att.editLike.attributes,
   tei_att.datable.attributes,
   tei_att.naming.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { text }?,
   tei_macro.phraseSeq
}

Appendix B.1.12 <age>

<age> (age) specifies the age of a person. [13.3.2.1. Personal Characteristics]
Modulenamesdates — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.editLike (@evidence, @instant) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max, @confidence)) att.typed (type, @subtype)
type characterizes the element in some sense, using any convenient classification scheme or typology.
Derived fromatt.typed
StatusOptional
Datatypeteidata.enumerated
Sample values include:
western
sui
subjective
objective
inWorld
(in world) age of a fictional character at the time the story takes place, rather than at the time the story is told
chronological
biological
psychological
functional
value supplies a numeric code representing the age or age group
StatusOptional
Datatypeteidata.count
Note

This attribute may be used to complement a more detailed discussion of a person's age in the content of the element.

Member of
Contained by
May contain
Note

As with other culturally-constructed traits such as sex, the way in which this concept is described in different cultural contexts may vary. The normalizing attributes are provided as a means of simplifying that variety to Western European norms and should not be used where that is inappropriate. The content of the element may be used to describe the intended concept in more detail, using plain text.

Example
<age value="2" notAfter="1986">under 20 in the early eighties</age>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element age
{
   tei_att.global.attributes,
   tei_att.editLike.attributes,
   tei_att.datable.attributes,
   tei_att.typed.attribute.subtype,
   tei_att.dimensions.attributes,
   attribute type { text }?,
   attribute value { text }?,
   tei_macro.phraseSeq.limited
}

Appendix B.1.13 <alt>

<alt> (alternation) identifies an alternation or a set of choices among elements or passages. [16.8. Alternation]
Modulelinking — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.pointing (target, @targetLang, @evaluate)
target specifies the destination of the reference by supplying one or more URI References
Derived fromatt.pointing
StatusOptional
Datatype2–∞ occurrences of teidata.pointer separated by whitespace
mode states whether the alternations gathered in this collection are exclusive or inclusive.
StatusRecommended
Datatypeteidata.enumerated
Legal values are:
excl
(exclusive) indicates that the alternation is exclusive, i.e. that at most one of the alternatives occurs.
incl
(inclusive) indicates that the alternation is not exclusive, i.e. that one or more of the alternatives occur.
weights If mode is excl, each weight states the probability that the corresponding alternative occurs. If mode is incl, each weight states the probability that the corresponding alternative occurs given that at least one of the other alternatives occurs.
StatusOptional
Datatype2–∞ occurrences of teidata.probability separated by whitespace
Note

If mode is excl, the sum of weights must be 1. If mode is incl, the sum of weights must be in the range from 0 to the number of alternants.

Member of
Contained by
May containEmpty element
Example
<alt mode="excl" target="#we.fun #we.sun"  weights="0.5 0.5"/>
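
The constraint on weights differs between the two modes. As a sketch, with hypothetical target identifiers: in an inclusive alternation the weights need not sum to 1, only to a value between 0 and the number of alternants (here 2).

```xml
<!-- mode="incl": 0.6 + 0.7 = 1.3, which lies within the permitted range 0 to 2 -->
<alt mode="incl" target="#reading.a #reading.b" weights="0.6 0.7"/>

<!-- mode="excl": the weights must sum to exactly 1 (0.8 + 0.2 = 1) -->
<alt mode="excl" target="#reading.a #reading.b" weights="0.8 0.2"/>
```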
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element alt
{
   tei_att.global.attributes,
   tei_att.pointing.attribute.targetLang,
   tei_att.pointing.attribute.evaluate,
   tei_att.typed.attributes,
   attribute target { list { * } }?,
   attribute mode { "excl" | "incl" }?,
   attribute weights { list { * } }?,
   empty
}

Appendix B.1.14 <altGrp>

<altGrp> (alternation group) groups a collection of <alt> elements and possibly pointers. [16.8. Alternation]
Modulelinking — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.pointing.group (@domains, @targFunc) (att.pointing (@targetLang, @target, @evaluate)) (att.typed (@type, @subtype))
mode states whether the alternations gathered in this collection are exclusive or inclusive.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
excl
(exclusive) indicates that the alternation is exclusive, i.e. that at most one of the alternatives occurs.[Default]
incl
(inclusive) indicates that the alternation is not exclusive, i.e. that one or more of the alternatives occur.
Member of
Contained by
May contain
core: desc ptr
linking: alt
Note

Any number of alternations, pointers or extended pointers.

Example
<altGrp mode="excl">  <alt target="#dm #lt #bb"   weights="0.5 0.25 0.25"/>  <alt target="#rl #db" weights="0.5 0.5"/> </altGrp>
Example
<altGrp mode="incl">  <alt target="#dm #rl" weights="0.90 0.90"/>  <alt target="#lt #rl" weights="0.5 0.5"/>  <alt target="#bb #rl" weights="0.5 0.5"/>  <alt target="#dm #db" weights="0.10 0.10"/>  <alt target="#lt #db" weights="0.45 0.90"/>  <alt target="#bb #db" weights="0.45 0.90"/> </altGrp>
Content model
<content>
 <sequence>
  <classRef key="model.descLike"
   minOccurs="0" maxOccurs="unbounded"/>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <elementRef key="alt"/>
   <elementRef key="ptr"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element altGrp
{
   tei_att.global.attributes,
   tei_att.pointing.group.attributes,
   attribute mode { "excl" | "incl" }?,
   ( tei_model.descLike*, ( tei_alt | tei_ptr )* )
}

Appendix B.1.15 <analytic>

<analytic> (analytic level) contains bibliographic elements describing an item (e.g. an article or poem) published within a monograph or journal and not as an independent publication. [3.12.2.1. Analytic, Monographic, and Series Levels]
Modulecore — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
May contain
Note

May contain titles and statements of responsibility (author, editor, or other), in any order.

The <analytic> element may only occur within a <biblStruct>, where its use is mandatory for the description of an analytic level bibliographic item.

Example
<biblStruct>  <analytic>   <author>Chesnutt, David</author>   <title>Historical Editions in the States</title>  </analytic>  <monogr>   <title level="j">Computers and the Humanities</title>   <imprint>    <date when="1991-12">(December, 1991):</date>   </imprint>   <biblScope>25.6</biblScope>   <biblScope>377–380</biblScope>  </monogr> </biblStruct>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="author"/>
  <elementRef key="editor"/>
  <elementRef key="respStmt"/>
  <elementRef key="title"/>
  <classRef key="model.ptrLike"/>
  <elementRef key="date"/>
  <elementRef key="textLang"/>
  <elementRef key="idno"/>
  <elementRef key="availability"/>
 </alternate>
</content>
    
Schema Declaration
element analytic
{
   tei_att.global.attributes,
   (
      tei_author
    | tei_editor
    | tei_respStmt
    | tei_title
    | tei_model.ptrLike
    | tei_date
    | tei_textLang
    | tei_idno
    | tei_availability
   )*
}

Appendix B.1.16 <anchor>

<anchor> (anchor point) attaches an identifier to a point within a text, whether or not it corresponds with a textual element. [8.4.2. Synchronization and Overlap 16.5. Correspondence and Alignment]
Modulelinking — Formal specification
Attributesatt.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
Member of
Contained by
May containEmpty element
Note

On this element, the global xml:id attribute must be supplied to specify an identifier for the point at which this element occurs within a document. The value used may be chosen freely provided that it is unique within the document and is a syntactically valid name. There is no requirement for values containing numbers to be in sequence.

Example
<s>The anchor is he<anchor xml:id="A234"/>re somewhere.</s> <s>Help me find it.<ptr target="#A234"/> </s>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element anchor { tei_att.global.attributes, tei_att.typed.attributes, empty }

Appendix B.1.17 <annotation>

<annotation> represents an annotation following the Web Annotation Data Model. [16.10. The standOff Container]
Modulelinking — Formal specification
Attributesatt.global (xml:id, @n, @xml:lang, @xml:base, @xml:space) att.global.rendition (@rend, @style, @rendition) att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select) att.global.analytic (@ana) att.global.facs (@facs) att.global.change (@change) att.global.responsibility (@cert, @resp) att.global.source (@source) att.pointing (target, @targetLang, @evaluate)
xml:id (identifier) provides a unique identifier for the element bearing the attribute.
Derived fromatt.global
StatusRequired
DatatypeID
target specifies the destination of the reference by supplying one or more URI References
Derived fromatt.pointing
StatusRequired
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
motivation
StatusOptional
Datatype1–∞ occurrences of teidata.enumerated separated by whitespace
Legal values are:
assessing
intent is to assess the target resource in some way, rather than simply make a comment about it
bookmarking
intent is to create a bookmark to the target or part thereof
classifying
intent is to classify the target in some way
commenting
intent is to comment about the target
describing
intent is to describe the target, rather than (for example) comment on it
editing
intent is to request an edit or a change to the target resource
highlighting
intent is to highlight the target resource or a segment thereof
identifying
intent is to assign an identity to the target
linking
intent is to link to a resource related to the target
moderating
intent is to assign some value or quality to the target
questioning
intent is to ask a question about the target
replying
intent is to reply to a previous statement, either an annotation or another resource
tagging
intent is to associate a tag with the target
Note

For further detailed explanation of the suggested values, see the Web Annotation Vocabulary (WAV). The motivations described here map to URIs defined by the WAV and when exported to RDF or JSON-LD must have the URI http://www.w3.org/ns/oa# prepended.

As an RDF vocabulary, WADM permits the definition of new motivations (see Appendix C of the WAV). In TEI, new motivations may be defined in a custom ODD (see section 23.3.1.3). New motivations must also map to URIs defined by an RDF ontology extending the WAV.
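
As a minimal sketch of the whitespace-separated datatypes, the following hypothetical annotation carries two motivations and two targets; the identifiers ann2, line1 and line2 and the note text are invented for illustration. Following the note above, an RDF or JSON-LD export would be expected to render these motivations as http://www.w3.org/ns/oa#commenting and http://www.w3.org/ns/oa#tagging.

```xml
<!-- Hypothetical example: ann2, line1 and line2 are illustrative identifiers -->
<annotation xml:id="ann2" motivation="commenting tagging" target="#line1 #line2">
 <note>Both lines are procedural remarks by the chair.</note>
</annotation>
```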

Member of
Contained by
May contain
Example
<annotation xml:id="ann1"  motivation="linking" target="#Gallia"> <!-- See https://www.w3.org/TR/annotation-model/#lifecycle-information and https://www.w3.org/TR/annotation-model/#agents -->  <respStmt xml:id="fred">   <resp>creator</resp>   <persName>Fred Editor</persName>  </respStmt>  <revisionDesc>   <change status="created"    when="2020-05-21T13:59:00Z" who="#fred"/>   <change status="modified"    when="2020-05-21T19:48:00Z" who="#fred"/>  </revisionDesc> <!-- See https://www.w3.org/TR/annotation-model/#rights-information -->  <licence target="http://creativecommons.org/licenses/by/3.0/"/> <!-- Multiple bodies --> <!-- Pointers to sections of text in the same document -->  <ptr target="#string-range(c1p1s1,0,6)"/>  <ptr target="#string-range(c1p1s6,19,7)"/> </annotation>
Example
<annotation xml:id="TheCorrectTitle"  motivation="commenting" target="#line1">  <note>The correct title of this specification, and the correct full name of XML, is    "Extensible Markup Language". "eXtensible Markup Language" is just a spelling error.    However, the abbreviation "XML" is not only correct but, appearing as it does in the title    of the specification, an official name of the Extensible Markup Language. </note> </annotation>
Content model
<content>
 <sequence>
  <elementRef key="respStmt" minOccurs="0"
   maxOccurs="unbounded"/>
  <elementRef key="revisionDesc"
   minOccurs="0" maxOccurs="unbounded"/>
  <elementRef key="licence" minOccurs="0"
   maxOccurs="unbounded"/>
  <classRef key="model.annotationPart.body"
   minOccurs="0" maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element annotation
{
   tei_att.global.attribute.n,
   tei_att.global.attribute.xmllang,
   tei_att.global.attribute.xmlbase,
   tei_att.global.attribute.xmlspace,
   tei_att.global.rendition.attribute.rend,
   tei_att.global.rendition.attribute.style,
   tei_att.global.rendition.attribute.rendition,
   tei_att.global.linking.attribute.corresp,
   tei_att.global.linking.attribute.synch,
   tei_att.global.linking.attribute.sameAs,
   tei_att.global.linking.attribute.copyOf,
   tei_att.global.linking.attribute.next,
   tei_att.global.linking.attribute.prev,
   tei_att.global.linking.attribute.exclude,
   tei_att.global.linking.attribute.select,
   tei_att.global.analytic.attribute.ana,
   tei_att.global.facs.attribute.facs,
   tei_att.global.change.attribute.change,
   tei_att.global.responsibility.attribute.cert,
   tei_att.global.responsibility.attribute.resp,
   tei_att.global.source.attribute.source,
   tei_att.pointing.attribute.targetLang,
   tei_att.pointing.attribute.evaluate,
   attribute xml:id { text },
   attribute target { list { + } },
   attribute motivation
   {
      list
      {
         (
            "assessing"
          | "bookmarking"
          | "classifying"
          | "commenting"
          | "describing"
          | "editing"
          | "highlighting"
          | "identifying"
          | "linking"
          | "moderating"
          | "questioning"
          | "replying"
          | "tagging"
         )+
      }
   }?,
   (
      tei_respStmt*,
      tei_revisionDesc*,
      tei_licence*,
      tei_model.annotationPart.body*
   )
}

Appendix B.1.18 <annotationBlock>

<annotationBlock> groups together various annotations, e.g. for parallel interpretations of a spoken segment. [8.4.6. Analytic Coding]
Module spoken — Formal specification
Attributes att.ascribed (@who) att.timed (@start, @end) (att.duration (att.duration.w3c (@dur)) (att.duration.iso (@dur-iso)) ) att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
figures: cell figure
header: change licence
namesdates: occupation
textstructure: body div
transcr: metamark
May contain
Example
<annotationBlock who="#SPK1" start="#T2"  end="#T3" xml:id="ag20">  <u xml:id="u20">   <seg xml:id="seg37" type="utterance"    subtype="modeless">    <w xml:id="w46">Yeah</w>   </seg>  </u> </annotationBlock> <annotationBlock who="#SPK1" start="#T5"  end="#T6" xml:id="ag21">  <u xml:id="u21">   <seg xml:id="seg38" type="utterance"    subtype="modeless">    <w xml:id="w47">Mhm</w>   </seg>  </u> </annotationBlock>
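The content model also admits <spanGrp> alongside <u>, which is one way of attaching an interpretation to the transcribed segment. The sketch below assumes that <span> is available inside <spanGrp> in the schema in use; the identifiers, the type value and the part-of-speech label are hypothetical.

```xml
<!-- Hypothetical identifiers and labels; assumes <span> is permitted inside <spanGrp> -->
<annotationBlock who="#SPK1" start="#T7" end="#T8" xml:id="ag22">
 <u xml:id="u22">
  <seg xml:id="seg39" type="utterance" subtype="modeless">
   <w xml:id="w48">Okay</w>
  </seg>
 </u>
 <spanGrp type="pos">
  <span from="#w48" to="#w48">ITJ</span>
 </spanGrp>
</annotationBlock>
```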
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="u"/>
  <elementRef key="spanGrp"/>
  <classRef key="model.global.spoken"/>
 </alternate>
</content>
    
Schema Declaration
element annotationBlock
{
   tei_att.ascribed.attributes,
   tei_att.timed.attributes,
   tei_att.global.attributes,
   ( tei_u | tei_spanGrp | tei_model.global.spoken )*
}

Appendix B.1.19 <appInfo>

<appInfo> (application information) records information about an application which has edited the TEI file. [2.3.11. The Application Information Element]
Module header — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: encodingDesc
May contain
header: application
Example
<appInfo>  <application version="1.24" ident="Xaira">   <label>XAIRA Indexer</label>   <ptr target="#P1"/>  </application> </appInfo>
Content model
<content>
 <classRef key="model.applicationLike"
  minOccurs="1" maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element appInfo { tei_att.global.attributes, tei_model.applicationLike+ }

Appendix B.1.20 <application>

<application> provides information about an application which has acted upon the document. [2.3.11. The Application Information Element]
Module header — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
ident supplies an identifier for the application, independent of its version number or display name.
Status Required
Datatype teidata.name
version supplies a version number for the application, independent of its identifier or display name.
Status Required
Datatype teidata.versionNumber
Member of
Contained by
header: appInfo
May contain
linking: ab
Example
<appInfo>  <application version="1.5"   ident="ImageMarkupTool1" notAfter="2006-06-01">   <label>Image Markup Tool</label>   <ptr target="#P1"/>   <ptr target="#P2"/>  </application> </appInfo>
This example shows an <appInfo> element documenting the fact that version 1.5 of the Image Markup Tool application (identified as ImageMarkupTool1) has an interest in two parts of a document which was last saved no later than 1 June 2006, as recorded by notAfter. The parts concerned are accessible at the URLs given as target for the two <ptr> elements.
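The content model lets paragraph-like content follow the label in place of pointers. The sketch below uses <ab>, which is the element named under "May contain" for this element; the application name, identifier, version and description are hypothetical.

```xml
<!-- Hypothetical application; ident and version values are illustrative only -->
<appInfo>
 <application ident="ExampleSegmenter" version="0.3">
  <label>Example Segmenter</label>
  <ab>Added sentence and word segmentation to the transcribed speeches.</ab>
 </application>
</appInfo>
```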
Content model
<content>
 <sequence>
  <classRef key="model.labelLike"
   minOccurs="1" maxOccurs="unbounded"/>
  <alternate>
   <classRef key="model.ptrLike"
    minOccurs="0" maxOccurs="unbounded"/>
   <classRef key="model.pLike"
    minOccurs="0" maxOccurs="unbounded"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element application
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   tei_att.datable.attributes,
   attribute ident { text },
   attribute version { text },
   ( tei_model.labelLike+, ( tei_model.ptrLike* | tei_model.pLike* ) )
}

Appendix B.1.21 <author>

<author> (author) in a bibliographic reference, contains the name(s) of an author, personal or corporate, of a work; for example in the same form as that provided by a recognized bibliographic name authority. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement]
Module core — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
Member of
Contained by
May contain
Note

Particularly where cataloguing is likely to be based on the content of the header, it is advisable to use a generally recognized name authority file to supply the content for this element. The attributes key or ref may also be used to reference canonical information about the author(s) intended from any appropriate authority, such as a library catalogue or online resource.

In the case of a broadcast, use this element for the name of the company or network responsible for making the broadcast.

Where an author is unknown or unspecified, this element may contain text such as Unknown or Anonymous. When the appropriate TEI modules are in use, it may also contain detailed tagging of the names used for people, organizations or places, in particular where multiple names are given.
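
A sketch of the ref mechanism described in the note above, pointing an author at an authority record. The URL under example.org is a placeholder, not a real authority entry, and the sketch assumes <persName> is available in the schema in use.

```xml
<!-- Placeholder authority URL; substitute a real authority file entry -->
<author ref="http://example.org/authority/person/0001">
 <persName>Editor, Fred</persName>
</author>
```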

Example
<author>British Broadcasting Corporation</author> <author>La Fayette, Marie Madeleine Pioche de la Vergne, comtesse de (1634–1693)</author> <author>Anonymous</author> <author>Bill and Melinda Gates Foundation</author> <author>  <persName>Beaumont, Francis</persName> and <persName>John Fletcher</persName> </author> <author>  <orgName key="BBC">British Broadcasting    Corporation</orgName>: Radio 3 Network </author>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element author
{
   tei_att.global.attributes,
   tei_att.naming.attributes,
   tei_att.datable.attributes,
   tei_macro.phraseSeq
}

Appendix B.1.22 <authority>

<authority> (release authority) supplies the name of a person or other agency responsible for making a work available, other than a publisher or distributor. [2.2.4. Publication, Distribution, Licensing, etc.]
Module header — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Member of
Contained by
core: monogr
May contain
Example
<authority>John Smith</authority>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element authority
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_macro.phraseSeq.limited
}

Appendix B.1.23 <availability>

<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]
Module header — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default)
status (status) supplies a code identifying the current availability of the text.
Status Optional
Datatype teidata.enumerated
Legal values are:
free
(free) the text is freely available.
unknown
(unknown) the status of the text is unknown.
restricted
(restricted) the text is not freely available.
Member of
Contained by
May contain
core: p
header: licence
linking: ab
Note

A consistent format should be adopted.

Example
<availability status="restricted">  <p>Available for academic research purposes only.</p> </availability> <availability status="free">  <p>In the public domain</p> </availability> <availability status="restricted">  <p>Available under licence from the publishers.</p> </availability>
Example
<availability>  <licence target="http://opensource.org/licenses/MIT">   <p>The MIT License      applies to this document.</p>   <p>Copyright (C) 2011 by The University of Victoria</p>   <p>Permission is hereby granted, free of charge, to any person obtaining a copy      of this software and associated documentation files (the "Software"), to deal      in the Software without restriction, including without limitation the rights      to use, copy, modify, merge, publish, distribute, sublicense, and/or sell      copies of the Software, and to permit persons to whom the Software is      furnished to do so, subject to the following conditions:</p>   <p>The above copyright notice and this permission notice shall be included in      all copies or substantial portions of the Software.</p>   <p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR      IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,      FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE      AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER      LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,      OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN      THE SOFTWARE.</p>  </licence> </availability>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.availabilityPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
    
Schema Declaration
element availability
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   attribute status { "free" | "unknown" | "restricted" }?,
   ( tei_model.availabilityPart | tei_model.pLike )+
}

Appendix B.1.24 <back>

<back> (back matter) contains any appendixes, etc. following the main part of a text. [4.7. Back Matter 4. Default Text Structure]
Module textstructure — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declaring (@decls)
Contained by
textstructure: text
transcr: facsimile
May contain
Note

Because cultural conventions differ as to which elements are grouped as back matter and which as front matter, the content models for the <back> and <front> elements are identical.

Example
<back>  <div type="appendix">   <head>The Golden Dream or, the Ingenuous Confession</head>   <p>TO shew the Depravity of human Nature, and how apt the Mind is to be misled by Trinkets      and false Appearances, Mrs. Two-Shoes does acknowledge, that after she became rich, she      had like to have been, too fond of Money <!-- .... -->   </p>  </div> <!-- ... -->  <div type="epistle">   <head>A letter from the Printer, which he desires may be inserted</head>   <salute>Sir.</salute>   <p>I have done with your Copy, so you may return it to the Vatican, if you please;    <!-- ... -->   </p>  </div>  <div type="advert">   <head>The Books usually read by the Scholars of Mrs Two-Shoes are these and are sold at Mr      Newbery's at the Bible and Sun in St Paul's Church-yard.</head>   <list>    <item n="1">The Christmas Box, Price 1d.</item>    <item n="2">The History of Giles Gingerbread, 1d.</item> <!-- ... -->    <item n="42">A Curious Collection of Travels, selected from the Writers of all Nations,        10 Vol, Pr. bound 1l.</item>   </list>  </div>  <div type="advert">   <head>By the KING's Royal Patent, Are sold by J. NEWBERY, at the Bible and Sun in St.      Paul's Church-Yard.</head>   <list>    <item n="1">Dr. James's Powders for Fevers, the Small-Pox, Measles, Colds, &amp;c. 2s.        6d</item>    <item n="2">Dr. Hooper's Female Pills, 1s.</item> <!-- ... -->   </list>  </div> </back>
Content model
<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.frontPart"/>
   <classRef key="model.pLike.front"/>
   <classRef key="model.pLike"/>
   <classRef key="model.listLike"/>
   <classRef key="model.global"/>
  </alternate>
  <alternate minOccurs="0">
   <sequence>
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.frontPart"/>
     <classRef key="model.div1Like"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
   <sequence>
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.frontPart"/>
     <classRef key="model.divLike"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0">
   <classRef key="model.divBottomPart"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.divBottomPart"/>
    <classRef key="model.global"/>
   </alternate>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element back
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   (
      (
         tei_model.frontPart
       | tei_model.pLike.front
       | tei_model.pLike
       | tei_model.listLike
       | tei_model.global
      )*,
      (
         (
            tei_model.div1Like,
            ( tei_model.frontPart | tei_model.div1Like | tei_model.global )*
         )
       | (
            tei_model.divLike,
            ( tei_model.frontPart | tei_model.divLike | tei_model.global )*
         )
      )?,
      (
         tei_model.divBottomPart,
         ( tei_model.divBottomPart | tei_model.global )*
      )?
   )
}

Appendix B.1.25 <bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]
Module core — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.typed (@type, @subtype) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
May contain
Note

Contains phrase-level elements, together with any combination of elements from the model.biblPart class

Example
<bibl>Blain, Clements and Grundy: Feminist Companion to Literature in English (Yale, 1990)</bibl>
Example
<bibl>  <title level="a">The Interesting story of the Children in the Wood</title>. In <author>Victor E Neuberg</author>, <title>The Penny Histories</title>. <publisher>OUP</publisher>  <date>1968</date>. </bibl>
Example
<bibl type="article" subtype="book_chapter"  xml:id="carlin_2003">  <author>   <name>    <surname>Carlin</surname>      (<forename>Claire</forename>)</name>  </author>, <title level="a">The Staging of Impotence : France’s last    congrès</title> dans <bibl type="monogr">   <title level="m">Theatrum mundi : studies in honor of Ronald W.      Tobin</title>, éd.  <editor>    <name>     <forename>Claire</forename>     <surname>Carlin</surname>    </name>   </editor> et  <editor>    <name>     <forename>Kathleen</forename>     <surname>Wine</surname>    </name>   </editor>,  <pubPlace>Charlottesville, Va.</pubPlace>,  <publisher>Rookwood Press</publisher>,  <date when="2003">2003</date>.  </bibl> </bibl>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.highlighted"/>
  <classRef key="model.pPart.data"/>
  <classRef key="model.pPart.edit"/>
  <classRef key="model.segLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.biblPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration
element bibl
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   tei_att.typed.attributes,
   tei_att.sortable.attributes,
   tei_att.docStatus.attributes,
   (
      text
    | tei_model.gLike
    | tei_model.highlighted
    | tei_model.pPart.data
    | tei_model.pPart.edit
    | tei_model.segLike
    | tei_model.ptrLike
    | tei_model.biblPart
    | tei_model.global
   )*
}

Appendix B.1.26 <biblFull>

<biblFull> (fully-structured bibliographic citation) contains a fully-structured bibliographic citation, in which all components of the TEI file description are present. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2. The File Description 2.2.7. The Source Description 15.3.2. Declarable Elements]
Module header — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
May contain
Example
<biblFull>  <titleStmt>   <title>The Feminist Companion to Literature in English: women writers from the middle ages      to the present</title>   <author>Blain, Virginia</author>   <author>Clements, Patricia</author>   <author>Grundy, Isobel</author>  </titleStmt>  <editionStmt>   <edition>UK edition</edition>  </editionStmt>  <extent>1231 pp</extent>  <publicationStmt>   <publisher>Yale University Press</publisher>   <pubPlace>New Haven and London</pubPlace>   <date>1990</date>  </publicationStmt>  <sourceDesc>   <p>No source: this is an original work</p>  </sourceDesc> </biblFull>
Content model
<content>
 <alternate>
  <sequence>
   <sequence>
    <elementRef key="titleStmt"/>
    <elementRef key="editionStmt"
     minOccurs="0"/>
    <elementRef key="extent" minOccurs="0"/>
    <elementRef key="publicationStmt"/>
    <elementRef key="seriesStmt"
     minOccurs="0" maxOccurs="unbounded"/>
    <elementRef key="notesStmt"
     minOccurs="0"/>
   </sequence>
   <elementRef key="sourceDesc"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
  <sequence>
   <elementRef key="fileDesc"/>
   <elementRef key="profileDesc"/>
  </sequence>
 </alternate>
</content>
    
Schema Declaration
element biblFull
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   tei_att.sortable.attributes,
   tei_att.docStatus.attributes,
   (
      (
         (
            tei_titleStmt,
            tei_editionStmt?,
            tei_extent?,
            tei_publicationStmt,
            tei_seriesStmt*,
            tei_notesStmt?
         ),
         tei_sourceDesc*
      )
    | ( tei_fileDesc, tei_profileDesc )
   )
}

Appendix B.1.27 <biblScope>

<biblScope> (scope of bibliographic reference) defines the scope of a bibliographic reference, for example as a list of page numbers, or a named subdivision of a larger work. [3.12.2.5. Scopes and Ranges in Bibliographic Citations]
Module core — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.citing (@unit, @from, @to)
Member of
Contained by
May contain
Note

When a single page is being cited, use the from and to attributes with an identical value. When no clear endpoint is provided, the from attribute may be used without to; for example a citation such as ‘p. 3ff’ might be encoded <biblScope from="3">p. 3ff</biblScope>.

It is now considered good practice to supply this element as a sibling (rather than a child) of <imprint>, since it supplies information which does not constitute part of the imprint.
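
The two recommendations above can be combined in one sketch: a single cited page carries identical from and to values, and <biblScope> sits beside <imprint> rather than inside it. This assumes <monogr> accepts <biblScope> after <imprint> in the schema in use; the title, date and page number are placeholders.

```xml
<!-- Placeholder title, date and page; biblScope is a sibling of imprint -->
<biblStruct>
 <monogr>
  <title>A Hypothetical Monograph</title>
  <imprint>
   <date>1990</date>
  </imprint>
  <biblScope unit="page" from="12" to="12">p. 12</biblScope>
 </monogr>
</biblStruct>
```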

Example
<biblScope>pp 12–34</biblScope> <biblScope unit="page" from="12" to="34"/> <biblScope unit="volume">II</biblScope> <biblScope unit="page">12</biblScope>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element biblScope
{
   tei_att.global.attributes,
   tei_att.citing.attributes,
   tei_macro.phraseSeq
}

Appendix B.1.28 <biblStruct>

<biblStruct> (structured bibliographic citation) contains a structured bibliographic citation, in which only bibliographic sub-elements appear and in a specified order. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 15.3.2. Declarable Elements]
Module core — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declarable (@default) att.typed (@type, @subtype) att.sortable (@sortKey) att.docStatus (@status)
Member of
Contained by
May contain
Example
<biblStruct>  <monogr>   <author>Blain, Virginia</author>   <author>Clements, Patricia</author>   <author>Grundy, Isobel</author>   <title>The Feminist Companion to Literature in English: women writers from the middle ages      to the present</title>   <edition>first edition</edition>   <imprint>    <publisher>Yale University Press</publisher>    <pubPlace>New Haven and London</pubPlace>    <date>1990</date>   </imprint>  </monogr> </biblStruct>
Content model
<content>
 <sequence>
  <elementRef key="analytic" minOccurs="0"
   maxOccurs="unbounded"/>
  <sequence minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="monogr"/>
   <elementRef key="series" minOccurs="0"
    maxOccurs="unbounded"/>
  </sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.noteLike"/>
   <classRef key="model.ptrLike"/>
   <elementRef key="relatedItem"/>
   <elementRef key="citedRange"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element biblStruct
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   tei_att.typed.attributes,
   tei_att.sortable.attributes,
   tei_att.docStatus.attributes,
   (
      tei_analytic*,
      ( tei_monogr, tei_series* )+,
      (
         tei_model.noteLike
       | tei_model.ptrLike
       | tei_relatedItem
       | tei_citedRange
      )*
   )
}

Appendix B.1.29 <birth>

<birth> (birth) contains information about a person's birth, such as its date and place. [15.2.2. The Participant Description]
Module namesdates — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.editLike (@evidence, @instant) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.dimensions (@unit, @quantity, @extent, @precision, @scope) (att.ranging (@atLeast, @atMost, @min, @max, @confidence)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.typed (type, @subtype)
type characterizes the element in some sense, using any convenient classification scheme or typology.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Sample values include:
caesarean
(caesarean section)
vaginal
(vaginal delivery)
exNihilo
(ex nihilo)
incorporated
founded
established
Member of
Contained by
May contain
Example
<birth>Before 1920, Midlands region.</birth>
Example
<birth when="1960-12-10">In a small cottage near <name type="place">Aix-la-Chapelle</name>, early in the morning of <date>10 Dec 1960</date> </birth>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element birth
{
   tei_att.global.attributes,
   tei_att.editLike.attributes,
   tei_att.datable.attributes,
   tei_att.dimensions.attributes,
   tei_att.naming.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { text }?,
   tei_macro.phraseSeq
}

Appendix B.1.30 <bloc>

<bloc> (bloc) contains the name of a geo-political unit consisting of two or more nation states or countries. [13.2.3. Place Names]
Module namesdates — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.naming (@role, @nymRef) (att.canonical (@key, @ref)) att.typed (@type, @subtype) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod))
Member of
Contained by
May contain
Example
<bloc type="union">the European Union</bloc> <bloc type="continent">Africa</bloc>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element bloc
{
   tei_att.global.attributes,
   tei_att.naming.attributes,
   tei_att.typed.attributes,
   tei_att.datable.attributes,
   tei_macro.phraseSeq
}

Appendix B.1.31 <body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]
Module textstructure - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.declaring (@decls)
Contained by
textstructure: text
May contain
Example
<body>  <l>Nu scylun hergan hefaenricaes uard</l>  <l>metudæs maecti end his modgidanc</l>  <l>uerc uuldurfadur sue he uundra gihuaes</l>  <l>eci dryctin or astelidæ</l>  <l>he aerist scop aelda barnum</l>  <l>heben til hrofe haleg scepen.</l>  <l>tha middungeard moncynnæs uard</l>  <l>eci dryctin æfter tiadæ</l>  <l>firum foldu frea allmectig</l>  <trailer>primo cantauit Cædmon istud carmen.</trailer> </body>
Content model
<content>
 <sequence>
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="0">
   <classRef key="model.divTop"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.global"/>
    <classRef key="model.divTop"/>
   </alternate>
  </sequence>
  <sequence minOccurs="0">
   <classRef key="model.divGenLike"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.global"/>
    <classRef key="model.divGenLike"/>
   </alternate>
  </sequence>
  <alternate>
   <sequence minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence>
    <sequence minOccurs="1"
     maxOccurs="unbounded">
     <alternate minOccurs="1" maxOccurs="1">
      <elementRef key="schemaSpec"/>
      <classRef key="model.common"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <alternate minOccurs="0">
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <classRef key="model.divLike"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">
       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <classRef key="model.div1Like"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">
       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.divBottom"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element body
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   (
      tei_model.global*,
      ( tei_model.divTop, ( tei_model.global | tei_model.divTop )* )?,
      ( tei_model.divGenLike, ( tei_model.global | tei_model.divGenLike )* )?,
      (
         ( tei_model.divLike, ( tei_model.global | tei_model.divGenLike )* )+
       | ( tei_model.div1Like, ( tei_model.global | tei_model.divGenLike )* )+
       | (
            ( ( schemaSpec | tei_model.common ), tei_model.global* )+,
            (
               (
                  tei_model.divLike,
                  ( tei_model.global | tei_model.divGenLike )*
               )+
             | (
                  tei_model.div1Like,
                  ( tei_model.global | tei_model.divGenLike )*
               )+
            )?
         )
      ),
      ( tei_model.divBottom, tei_model.global* )*
   )
}

Appendix B.1.32 <c>

<c> (character) represents a character. [17.1. Linguistic Segment Categories]
Module analysis - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.segLike (@function) (att.datcat (@datcat, @valueDatcat)) (att.fragmentable (@part)) att.typed (@type, @subtype) att.notated (@notation)
Member of
Contained by
May contain
gaiji: g
character data
Note

Contains a single character, a <g> element, or a sequence of graphemes to be treated as a single character. The type attribute indicates the function of this segmentation, taking values such as letter, punctuation, or digit.

Example
<phr>  <c>M</c>  <c>O</c>  <c>A</c>  <c>I</c>  <w>doth</w>  <w>sway</w>  <w>my</w>  <w>life</w> </phr>
Content model
<content>
 <macroRef key="macro.xtext"/>
</content>
    
Schema Declaration
element c
{
   tei_att.global.attributes,
   tei_att.segLike.attributes,
   tei_att.typed.attributes,
   tei_att.notated.attributes,
   tei_macro.xtext
}

Appendix B.1.33 <cRefPattern>

<cRefPattern> (canonical reference pattern) specifies an expression and replacement pattern for transforming a canonical reference into a URI. [2.3.6.3. Milestone Method 2.3.6. The Reference System Declaration 2.3.6.2. Search-and-Replace Method]
Module header - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.patternReplacement (@matchPattern, @replacementPattern)
Contained by
header: refsDecl
May contain
core: p
linking: ab
Note

The result of the substitution may be either an absolute or a relative URI reference. In the latter case it is combined with the value of xml:base in force at the place where the cRef attribute occurs to form an absolute URI in the usual manner as prescribed by XML Base.

Example
<cRefPattern matchPattern="([1-9A-Za-z]+)\s+([0-9]+):([0-9]+)"  replacementPattern="#xpath(//div[@type='book'][@n='$1']/div[@type='chap'][@n='$2']/div[@type='verse'][@n='$3'])"/>
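The search-and-replace step described in the note can be sketched in Python as follows. This is only an illustration under stated assumptions: Python's re module stands in for the regular expression dialect used by TEI processors, the whole canonical reference is required to match, and the function name expand_cref and the base URI http://example.org/bible.xml are hypothetical.

```python
import re
from urllib.parse import urljoin


def expand_cref(cref, match_pattern, replacement_pattern, base):
    """Turn a canonical reference into a URI using one cRefPattern rule.

    Returns None when the rule does not apply to the reference.
    """
    m = re.fullmatch(match_pattern, cref)
    if m is None:
        return None
    # Substitute $1, $2, ... with the corresponding captured groups.
    expanded = re.sub(
        r"\$(\d)", lambda g: m.group(int(g.group(1))), replacement_pattern
    )
    # A relative result is combined with the xml:base in force
    # (here passed in explicitly as `base`).
    return urljoin(base, expanded)


# Values copied from the <cRefPattern> example above.
MATCH = r"([1-9A-Za-z]+)\s+([0-9]+):([0-9]+)"
REPLACEMENT = (
    "#xpath(//div[@type='book'][@n='$1']"
    "/div[@type='chap'][@n='$2']/div[@type='verse'][@n='$3'])"
)

uri = expand_cref("Matt 5:7", MATCH, REPLACEMENT, "http://example.org/bible.xml")
print(uri)
```

Because the replacement here is a bare fragment identifier, the resolved URI is the base document followed by the expanded fragment.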
Content model
<content>
 <classRef key="model.pLike" minOccurs="0"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element cRefPattern
{
   tei_att.global.attributes,
   tei_att.patternReplacement.attributes,
   tei_model.pLike*
}

Appendix B.1.34 <calendar>

<calendar> (calendar) describes a calendar or dating system used in a dating formula in the text. [2.4.5. Calendar Description]
Module header - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.pointing (@targetLang, @target, @evaluate)
Contained by
header: calendarDesc
May contain
core: p
linking: ab
Example
<calendarDesc>  <calendar xml:id="julianEngland">   <p>Julian Calendar (including proleptic)</p>  </calendar> </calendarDesc>
Example
<calendarDesc>  <calendar xml:id="egyptian"   target="http://en.wikipedia.org/wiki/Egyptian_calendar">   <p>Egyptian calendar (as defined by Wikipedia)</p>  </calendar> </calendarDesc>
Content model
<content>
 <classRef key="model.pLike" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element calendar
{
   tei_att.global.attributes,
   tei_att.pointing.attributes,
   tei_model.pLike+
}

Appendix B.1.35 <calendarDesc>

<calendarDesc> (calendar description) contains a description of the calendar system used in any dating expression found in the text. [2.4. The Profile Description 2.4.5. Calendar Description]
Module header - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: profileDesc
May contain
header: calendar
Note

In the first example below, the calendars and the short codes used as xml:id values are taken from the W3C guidelines at http://www.w3.org/TR/xpath-functions-11/#lang-cal-country

Example
<calendarDesc>  <calendar xml:id="cal_AD">   <p>Anno Domini (Christian Era)</p>  </calendar>  <calendar xml:id="cal_AH">   <p>Anno Hegirae (Muhammedan Era)</p>  </calendar>  <calendar xml:id="cal_AME">   <p>Mauludi Era (solar years since Mohammed's birth)</p>  </calendar>  <calendar xml:id="cal_AM">   <p>Anno Mundi (Jewish Calendar)</p>  </calendar>  <calendar xml:id="cal_AP">   <p>Anno Persici</p>  </calendar>  <calendar xml:id="cal_AS">   <p>Aji Saka Era (Java)</p>  </calendar>  <calendar xml:id="cal_BE">   <p>Buddhist Era</p>  </calendar>  <calendar xml:id="cal_CB">   <p>Cooch Behar Era</p>  </calendar>  <calendar xml:id="cal_CE">   <p>Common Era</p>  </calendar>  <calendar xml:id="cal_CL">   <p>Chinese Lunar Era</p>  </calendar>  <calendar xml:id="cal_CS">   <p>Chula Sakarat Era</p>  </calendar>  <calendar xml:id="cal_EE">   <p>Ethiopian Era</p>  </calendar>  <calendar xml:id="cal_FE">   <p>Fasli Era</p>  </calendar>  <calendar xml:id="cal_ISO">   <p>ISO 8601 calendar</p>  </calendar>  <calendar xml:id="cal_JE">   <p>Japanese Calendar</p>  </calendar>  <calendar xml:id="cal_KE">   <p>Khalsa Era (Sikh calendar)</p>  </calendar>  <calendar xml:id="cal_KY">   <p>Kali Yuga</p>  </calendar>  <calendar xml:id="cal_ME">   <p>Malabar Era</p>  </calendar>  <calendar xml:id="cal_MS">   <p>Monarchic Solar Era</p>  </calendar>  <calendar xml:id="cal_NS">   <p>Nepal Samwat Era</p>  </calendar>  <calendar xml:id="cal_OS">   <p>Old Style (Julian Calendar)</p>  </calendar>  <calendar xml:id="cal_RS">   <p>Rattanakosin (Bangkok) Era</p>  </calendar>  <calendar xml:id="cal_SE">   <p>Saka Era</p>  </calendar>  <calendar xml:id="cal_SH">   <p>Mohammedan Solar Era (Iran)</p>  </calendar>  <calendar xml:id="cal_SS">   <p>Saka Samvat</p>  </calendar>  <calendar xml:id="cal_TE">   <p>Tripurabda Era</p>  </calendar>  <calendar xml:id="cal_VE">   <p>Vikrama Era</p>  </calendar>  <calendar xml:id="cal_VS">   <p>Vikrama Samvat Era</p>  </calendar> </calendarDesc>
Example
<calendarDesc>  <calendar xml:id="cal_Gregorian">   <p>Gregorian calendar</p>  </calendar>  <calendar xml:id="cal_Julian">   <p>Julian calendar</p>  </calendar>  <calendar xml:id="cal_Islamic">   <p>Islamic or Muslim (hijri) lunar calendar</p>  </calendar>  <calendar xml:id="cal_Hebrew">   <p>Hebrew or Jewish lunisolar calendar</p>  </calendar>  <calendar xml:id="cal_Revolutionary">   <p>French Revolutionary calendar</p>  </calendar>  <calendar xml:id="cal_Iranian">   <p>Iranian or Persian (Jalaali) solar calendar</p>  </calendar>  <calendar xml:id="cal_Coptic">   <p>Coptic or Alexandrian calendar</p>  </calendar>  <calendar xml:id="cal_Chinese">   <p>Chinese lunisolar calendar</p>  </calendar> </calendarDesc>
Example
<calendarDesc>  <calendar xml:id="cal_Egyptian"   target="http://en.wikipedia.org/wiki/Egyptian_calendar">   <p>Egyptian calendar (as defined by Wikipedia)</p>  </calendar> </calendarDesc>
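As a hedged illustration of how these declarations are meant to be used, the sketch below checks that every calendar attribute (from att.datable) points to a <calendar> declared in the document. The <wrapper> element, the sample dates and the function name undeclared_calendars are hypothetical, and the TEI namespace is omitted for brevity, as in the examples above.

```python
import xml.etree.ElementTree as ET

# Expanded name under which ElementTree stores the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, simplified document: only the parts needed for the check.
SAMPLE = """
<wrapper>
  <calendarDesc>
    <calendar xml:id="cal_Gregorian"><p>Gregorian calendar</p></calendar>
    <calendar xml:id="cal_Julian"><p>Julian calendar</p></calendar>
  </calendarDesc>
  <birth when="1960-12-10" calendar="#cal_Gregorian">10 Dec 1960</birth>
  <date calendar="#cal_Coptic">an undeclared calendar</date>
</wrapper>
"""


def undeclared_calendars(root):
    """List (element name, pointer) pairs whose calendar is not declared."""
    declared = {c.get(XML_ID) for c in root.iter("calendar")}
    missing = []
    for el in root.iter():
        # The attribute may hold one or more whitespace-separated pointers.
        for pointer in (el.get("calendar") or "").split():
            # Only local pointers of the form "#id" are checked here.
            if pointer.startswith("#") and pointer[1:] not in declared:
                missing.append((el.tag, pointer))
    return missing


missing = undeclared_calendars(ET.fromstring(SAMPLE))
print(missing)
```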
Content model
<content>
 <elementRef key="calendar" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element calendarDesc { tei_att.global.attributes, tei_calendar+ }

Appendix B.1.36 <catDesc>

<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration]
Module header - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.canonical (@key, @ref)
Contained by
header: category
May contain
Example
<catDesc>Prose reportage</catDesc>
Example
<catDesc>  <textDesc n="novel">   <channel mode="w">print; part issues</channel>   <constitution type="single"/>   <derivation type="original"/>   <domain type="art"/>   <factuality type="fiction"/>   <interaction type="none"/>   <preparedness type="prepared"/>   <purpose type="entertain" degree="high"/>   <purpose type="inform" degree="medium"/>  </textDesc> </catDesc>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.limitedPhrase"/>
  <classRef key="model.catDescPart"/>
 </alternate>
</content>
    
Schema Declaration
element catDesc
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   ( text | tei_model.limitedPhrase | tei_model.catDescPart )*
}

Appendix B.1.37 <catRef>

<catRef> (category reference) specifies one or more defined categories within some taxonomy or text typology. [2.4.3. The Text Classification]
Module header - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.pointing (@targetLang, @target, @evaluate)
scheme identifies the classification scheme within which the set of categories concerned is defined, for example by a <taxonomy> element, or by some other resource.
Status Optional
Datatype teidata.pointer
Contained by
core: imprint
header: textClass
May contain
Empty element
Note

The scheme attribute needs to be supplied only if more than one taxonomy has been declared.

Example
<catRef scheme="#myTopics"  target="#news #prov #sales2"/> <!-- elsewhere --> <taxonomy xml:id="myTopics">  <category xml:id="news">   <catDesc>Newspapers</catDesc>  </category>  <category xml:id="prov">   <catDesc>Provincial</catDesc>  </category>  <category xml:id="sales2">   <catDesc>Low to average annual sales</catDesc>  </category> </taxonomy>
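The way a <catRef> is resolved against a taxonomy can be sketched in Python as follows. This is a minimal sketch, not a reference implementation: the <wrapper> element and the function name resolve_cat_ref are hypothetical, only local pointers of the form #id are handled, and the TEI namespace is omitted as in the example above.

```python
import xml.etree.ElementTree as ET

# Expanded name under which ElementTree stores the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# The example above, wrapped in a hypothetical root element.
SAMPLE = """
<wrapper>
  <catRef scheme="#myTopics" target="#news #prov #sales2"/>
  <taxonomy xml:id="myTopics">
    <category xml:id="news"><catDesc>Newspapers</catDesc></category>
    <category xml:id="prov"><catDesc>Provincial</catDesc></category>
    <category xml:id="sales2">
      <catDesc>Low to average annual sales</catDesc>
    </category>
  </taxonomy>
</wrapper>
"""


def resolve_cat_ref(root, cat_ref):
    """Return (scheme id or None, list of category descriptions)."""
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    # scheme is optional; it matters only when several taxonomies exist.
    scheme_id = (cat_ref.get("scheme") or "").lstrip("#") or None
    labels = []
    for pointer in (cat_ref.get("target") or "").split():
        category = by_id.get(pointer.lstrip("#"))
        if category is None or category.tag != "category":
            continue  # dangling pointer or pointer to something else
        desc = category.find("catDesc")
        labels.append(desc.text if desc is not None else None)
    return scheme_id, labels


root = ET.fromstring(SAMPLE)
resolved = resolve_cat_ref(root, root.find("catRef"))
print(resolved)
```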
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element catRef
{
   tei_att.global.attributes,
   tei_att.pointing.attributes,
   attribute scheme { text }?,
   empty
}

Appendix B.1.38 <category>

<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration]
Module header - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
May contain
core: desc gloss
Example
<category xml:id="b1">  <catDesc>Prose reportage</catDesc> </category>
Example
<category xml:id="b2">  <catDesc>Prose </catDesc>  <category xml:id="b11">   <catDesc>journalism</catDesc>  </category>  <category xml:id="b12">   <catDesc>fiction</catDesc>  </category> </category>
Example
<category xml:id="LIT">  <catDesc xml:lang="pl">literatura piękna</catDesc>  <catDesc xml:lang="en">fiction</catDesc>  <category xml:id="LPROSE">   <catDesc xml:lang="pl">proza</catDesc>   <catDesc xml:lang="en">prose</catDesc>  </category>  <category xml:id="LPOETRY">   <catDesc xml:lang="pl">poezja</catDesc>   <catDesc xml:lang="en">poetry</catDesc>  </category>  <category xml:id="LDRAMA">   <catDesc xml:lang="pl">dramat</catDesc>   <catDesc xml:lang="en">drama</catDesc>  </category> </category>
Content model
<content>
 <sequence>
  <alternate>
   <elementRef key="catDesc" minOccurs="1"
    maxOccurs="unbounded"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.descLike"/>
    <elementRef key="equiv"/>
    <elementRef key="gloss"/>
   </alternate>
  </alternate>
  <elementRef key="category" minOccurs="0"
   maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element category
{
   tei_att.global.attributes,
   (
      ( tei_catDesc+ | ( tei_model.descLike | equiv | tei_gloss )* ),
      tei_category*
   )
}

Appendix B.1.39 <cb>

<cb> (column beginning) marks the beginning of a new column of a text on a multi-column page. [3.11.3. Milestone Elements]
Module core - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype) att.edition (@ed, @edRef) att.spanning (@spanTo) att.breaking (@break)
Member of
Contained by
May contain
Empty element
Note

On this element, the global n attribute indicates the number or other value associated with the column which follows the point of insertion of this <cb> element. Encoders should adopt a clear and consistent policy as to whether the numbers associated with column breaks relate to the physical sequence number of the column in the whole text, or whether columns are numbered within the page. The <cb> element is placed at the head of the column to which it refers.

Example
Markup of an early English dictionary printed in two columns:
<pb/> <cb n="1"/> <entryFree>  <form>Well</form>, <sense>a Pit to hold Spring-Water</sense>: <sense>In the Art of <hi rend="italic">War</hi>, a Depth the Miner    sinks into the Ground, to find out and disappoint the Enemies Mines,    or to prepare one</sense>. </entryFree> <entryFree>To <form>Welter</form>, <sense>to wallow</sense>, or <sense>lie groveling</sense>.</entryFree> <!-- remainder of column --> <cb n="2"/> <entryFree>  <form>Wey</form>, <sense>the greatest Measure for dry Things,    containing five Chaldron</sense>. </entryFree> <entryFree>  <form>Whale</form>, <sense>the greatest of    Sea-Fishes</sense>. </entryFree>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element cb
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   tei_att.edition.attributes,
   tei_att.spanning.attributes,
   tei_att.breaking.attributes,
   empty
}

Appendix B.1.40 <cell>

<cell> (cell) contains one cell of a table. [14.1.1. TEI Tables]
Module figures - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.tableDecoration (@role, @rows, @cols)
Contained by
figures: row
May contain
Example
<row>  <cell role="label">General conduct</cell>  <cell role="data">Not satisfactory, on account of his great unpunctuality    and inattention to duties</cell> </row>
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration
element cell
{
   tei_att.global.attributes,
   tei_att.tableDecoration.attributes,
   tei_macro.specialPara
}

Appendix B.1.41 <change>

<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 11.7. Identifying Changes and Revisions]
Module header - Formal specification
Attributes att.ascribed (@who) att.datable (@calendar, @period) (att.datable.w3c (@when, @notBefore, @notAfter, @from, @to)) (att.datable.iso (@when-iso, @notBefore-iso, @notAfter-iso, @from-iso, @to-iso)) (att.datable.custom (@when-custom, @notBefore-custom, @notAfter-custom, @from-custom, @to-custom, @datingPoint, @datingMethod)) att.docStatus (@status) att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
target (target) points to one or more elements that belong to this change.
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
May contain
Note

The who attribute may be used to point to any other element, but will typically specify a <respStmt> or <person> element elsewhere in the header, identifying the person responsible for the change and their role in making it.

It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented.

Example
<titleStmt>  <title> ... </title>  <editor xml:id="LDB">Lou Burnard</editor>  <respStmt xml:id="BZ">   <resp>copy editing</resp>   <name>Brett Zamir</name>  </respStmt> </titleStmt> <!-- ... --> <revisionDesc status="published">  <change who="#BZ" when="2008-02-02"   status="public">Finished chapter 23</change>  <change who="#BZ" when="2008-01-02"   status="draft">Finished chapter 2</change>  <change n="P2.2" when="1991-12-21"   who="#LDB">Added examples to section 3</change>  <change when="1991-11-11" who="#MSM">Deleted chapter 10</change> </revisionDesc>
Example
<profileDesc>  <creation>   <listChange>    <change xml:id="DRAFT1">First draft in pencil</change>    <change xml:id="DRAFT2"     notBefore="1880-12-09">First revision, mostly        using green ink</change>    <change xml:id="DRAFT3"     notBefore="1881-02-13">Final corrections as        supplied to printer.</change>   </listChange>  </creation> </profileDesc>
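The recommendation in the note that changes be recorded with the most recent first can be checked mechanically. The following Python sketch is one possible check, assuming that every when value is a full YYYY-MM-DD date, so that plain string comparison orders them correctly. The function name is_most_recent_first is hypothetical, and the sample reuses the <revisionDesc> from the first example.

```python
import xml.etree.ElementTree as ET

REVISION_DESC = """
<revisionDesc status="published">
  <change who="#BZ" when="2008-02-02" status="public">Finished chapter 23</change>
  <change who="#BZ" when="2008-01-02" status="draft">Finished chapter 2</change>
  <change n="P2.2" when="1991-12-21" who="#LDB">Added examples to section 3</change>
  <change when="1991-11-11" who="#MSM">Deleted chapter 10</change>
</revisionDesc>
"""


def is_most_recent_first(revision_desc):
    """True if the dated <change> children are in descending date order."""
    # Changes without a when attribute are ignored by this check.
    dates = [c.get("when") for c in revision_desc.findall("change") if c.get("when")]
    # Full ISO dates (YYYY-MM-DD) sort chronologically as strings.
    return all(later >= earlier for later, earlier in zip(dates, dates[1:]))


revision_desc = ET.fromstring(REVISION_DESC)
print(is_most_recent_first(revision_desc))
```

Reduced-precision values such as a bare year would need proper date parsing instead of string comparison.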
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration
element change
{
   tei_att.ascribed.attributes,
   tei_att.datable.attributes,
   tei_att.docStatus.attributes,
   tei_att.global.attributes,
   tei_att.typed.attributes,
   attribute target { list { teidata.pointer+ } }?,
   tei_macro.specialPara
}

Appendix B.1.42 <channel>

<channel> (primary channel) describes the medium or channel by which a text is delivered or experienced. For a written text, this might be print, manuscript, email, etc.; for a spoken one, radio, telephone, face-to-face, etc. [15.2.1. The Text Description]
Module corpus - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
mode specifies the mode of this channel with respect to speech and writing.
Status Optional
Datatype teidata.enumerated
Legal values are:
s (spoken)
w (written)
sw (spoken to be written) e.g. dictation
ws (written to be spoken) e.g. a script
m (mixed)
x (unknown or inapplicable) [Default]
Member of
Contained by
corpus: textDesc
May contain
Example
<channel mode="s">face-to-face conversation</channel>
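The closed value list and its default can be mirrored in a small lookup. The Python sketch below is only illustrative: the function name channel_mode and the second sample element are hypothetical, and the default "x" is applied by hand because a non-validating parser does not supply schema defaults.

```python
import xml.etree.ElementTree as ET

# Legal values of the mode attribute as listed above.
CHANNEL_MODES = {
    "s": "spoken",
    "w": "written",
    "sw": "spoken to be written",
    "ws": "written to be spoken",
    "m": "mixed",
    "x": "unknown or inapplicable",
}


def channel_mode(channel):
    """Return (code, gloss) for a <channel> element, applying the default."""
    code = channel.get("mode", "x")  # "x" is the declared default
    if code not in CHANNEL_MODES:
        raise ValueError(f"illegal channel mode: {code!r}")
    return code, CHANNEL_MODES[code]


spoken = ET.fromstring('<channel mode="s">face-to-face conversation</channel>')
unmarked = ET.fromstring("<channel>plenary session</channel>")  # hypothetical
print(channel_mode(spoken))
print(channel_mode(unmarked))
```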
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element channel
{
   tei_att.global.attributes,
   attribute mode { "s" | "w" | "sw" | "ws" | "m" | "x" }?,
   tei_macro.phraseSeq.limited
}

Appendix B.1.43 <char>

<char> (character) provides descriptive information about a character. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module gaiji - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Contained by
gaiji: charDecl
May contain
Example
<char xml:id="circledU4EBA">  <localProp name="Name"   value="CIRCLED IDEOGRAPH 4EBA"/>  <localProp name="daikanwa" value="36"/>  <unicodeProp name="Decomposition_Mapping"   value="circle"/>  <mapping type="standard"></mapping> </char>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="unicodeProp"/>
  <elementRef key="unihanProp"/>
  <elementRef key="localProp"/>
  <elementRef key="mapping"/>
  <elementRef key="figure"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.noteLike"/>
  <classRef key="model.descLike"/>
 </alternate>
</content>
    
Schema Declaration
element char
{
   tei_att.global.attributes,
   (
      tei_unicodeProp
    | tei_unihanProp
    | tei_localProp
    | tei_mapping
    | tei_figure
    | tei_model.graphicLike
    | tei_model.noteLike
    | tei_model.descLike
   )*
}

Appendix B.1.44 <charDecl>

<charDecl> (character declarations) provides information about nonstandard characters and glyphs. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module gaiji - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
header: encodingDesc
May contain
core: desc
gaiji: char glyph
Example
<charDecl>  <char xml:id="aENL">   <unicodeProp name="Name"    value="LATIN LETTER ENLARGED SMALL A"/>   <mapping type="standard">a</mapping>  </char> </charDecl>
Content model
<content>
 <sequence>
  <elementRef key="desc" minOccurs="0"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="char"/>
   <elementRef key="glyph"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element charDecl
{
   tei_att.global.attributes,
   ( tei_desc?, ( tei_char | tei_glyph )+ )
}

Appendix B.1.45 <choice>

<choice> (choice) groups a number of alternative encodings for the same point in a text. [3.5. Simple Editorial Changes]
Module core - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source))
Member of
Contained by
May contain
Note

Because the children of a <choice> element all represent alternative ways of encoding the same sequence, it is natural to think of them as mutually exclusive. However, there may be cases where a full representation of a text requires the alternative encodings to be considered as parallel.

Note also that <choice> elements may self-nest.

Where the purpose of an encoding is to record multiple witnesses of a single work, rather than to identify multiple possible encoding decisions at a given point, the <app> element and associated elements discussed in section 12.1. The Apparatus Entry, Readings, and Witnesses should be preferred.

Example
An American encoding of Gulliver's Travels which retains the British spelling but also provides a version regularized to American spelling might be encoded as follows.
<p>Lastly, That, upon his solemn oath to observe all the above articles, the said man-mountain shall have a daily allowance of meat and drink sufficient for the support of <choice>   <sic>1724</sic>   <corr>1728</corr>  </choice> of our subjects, with free access to our royal person, and other marks of our <choice>   <orig>favour</orig>   <reg>favor</reg>  </choice>.</p>
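A processor usually has to pick one alternative from each <choice> when producing a reading text. The Python sketch below shows one way to do this, under stated assumptions: the grouping of sic, orig and abbr as "original" and of corr, reg and expan as "edited" is a convention chosen for this illustration, the function name reading is hypothetical, and the sample is a shortened version of the example above without the TEI namespace.

```python
import xml.etree.ElementTree as ET

# Which children of <choice> to prefer for each kind of reading text.
PREFERRED = {
    "original": ("sic", "orig", "abbr"),
    "edited": ("corr", "reg", "expan"),
}

SAMPLE = (
    "<p>sufficient for the support of "
    "<choice><sic>1724</sic><corr>1728</corr></choice>"
    " of our subjects, and other marks of our "
    "<choice><orig>favour</orig><reg>favor</reg></choice>.</p>"
)


def reading(element, mode):
    """Return the text of element, keeping one alternative per <choice>."""
    if element.tag == "choice":
        alternatives = list(element)  # the schema requires at least two
        preferred = [alt for alt in alternatives if alt.tag in PREFERRED[mode]]
        # Fall back to the first alternative if none is preferred.
        chosen = (preferred or alternatives)[0]
        return reading(chosen, mode)  # also handles self-nested <choice>
    parts = [element.text or ""]
    for child in element:
        parts.append(reading(child, mode))
        parts.append(child.tail or "")
    return "".join(parts)


p = ET.fromstring(SAMPLE)
print(reading(p, "original"))
print(reading(p, "edited"))
```

Treating the alternatives as mutually exclusive is the simplest policy. As the note above points out, some texts require them to be read in parallel, which this sketch does not attempt.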
Content model
<content>
 <alternate minOccurs="2"
  maxOccurs="unbounded">
  <classRef key="model.choicePart"/>
  <elementRef key="choice"/>
 </alternate>
</content>
    
Schema Declaration
element choice
{
   tei_att.global.attributes,
   ( tei_model.choicePart | tei_choice )+
}

Appendix B.1.46 <cit>

<cit> (cited quotation) contains a quotation from some other document, together with a bibliographic reference to its source. In a dictionary it may contain an example text with at least one occurrence of the word form, used in the sense being described, or a translation of the headword, or an example. [3.3.3. Quotation 4.3.1. Grouped Texts 9.3.5.1. Examples]
Module core - Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.typed (@type, @subtype)
Member of
Contained by
May contain
Example
<cit>  <quote>and the breath of the whale is frequently attended with such an insupportable smell,    as to bring on disorder of the brain.</quote>  <bibl>Ulloa's South America</bibl> </cit>
Example
<entry>  <form>   <orth>horrifier</orth>  </form>  <cit type="translation" xml:lang="en">   <quote>to horrify</quote>  </cit>  <cit type="example">   <quote>elle était horrifiée par la dépense</quote>   <cit type="translation" xml:lang="en">    <quote>she was horrified at the expense.</quote>   </cit>  </cit> </entry>
Example
<cit type="example">  <quote xml:lang="mix">Ka'an yu tsa'a Pedro.</quote>  <media url="soundfiles-gen:S_speak_1s_on_behalf_of_Pedro_01_02_03_TS.wav"   mimeType="audio/wav"/>  <cit type="translation">   <quote xml:lang="en">I'm speaking on behalf of Pedro.</quote>  </cit>  <cit type="translation">   <quote xml:lang="es">Estoy hablando de parte de Pedro.</quote>  </cit> </cit>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.biblLike"/>
  <classRef key="model.egLike"/>
  <classRef key="model.entryPart"/>
  <classRef key="model.global"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.attributable"/>
  <elementRef key="pc"/>
  <elementRef key="q"/>
 </alternate>
</content>
    
Schema Declaration
element cit
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   (
      tei_model.biblLike
    | tei_model.egLike
    | tei_model.entryPart
    | tei_model.global
    | tei_model.graphicLike
    | tei_model.ptrLike
    | tei_model.attributable
    | tei_pc
    | tei_q
   )+
}

Appendix B.1.47 <citeData>

<citeData> (citation data) specifies how information may be extracted from citation structures. [3.11.4. Declaring Reference Systems 16.2.5.4. Citation Structures]
Moduleheader — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.citeStructurePart (@use)
property (property) a URI indicating a property definition.
Status Required
Datatype anyURI
Contained by
May contain Empty element
Example
<citeStructure unit="book"
 match="//body/div" use="@n">
 <citeData property="http://purl.org/dc/terms/title"
  use="head"/>
</citeStructure>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element citeData
{
   tei_att.global.attributes,
   tei_att.citeStructurePart.attributes,
   attribute property { text },
   empty
}

Appendix B.1.48 <citeStructure>

<citeStructure> (citation structure) declares a structure and method for citing the current document. [3.11.4. Declaring Reference Systems 16.2.5.4. Citation Structures]
Moduleheader — Formal specification
Attributes att.global (@xml:id, @n, @xml:lang, @xml:base, @xml:space) (att.global.rendition (@rend, @style, @rendition)) (att.global.linking (@corresp, @synch, @sameAs, @copyOf, @next, @prev, @exclude, @select)) (att.global.analytic (@ana)) (att.global.facs (@facs)) (att.global.change (@change)) (att.global.responsibility (@cert, @resp)) (att.global.source (@source)) att.citeStructurePart (@use)
delim (delimiter) supplies a delimiting string preceding the structural component.
Status Optional
Datatype string
Schematron
<s:rule context="tei:citeStructure[parent::tei:citeStructure]"> <s:assert test="@delim">A <s:name/> with a parent <s:name/> must have a @delim attribute.</s:assert> </s:rule>
Note

delim must contain at least one character.
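
The Schematron rule above can be illustrated with a minimal sketch of nested citation structures. The unit names (book, chapter, verse) and the assumption that the text is organised as nested <div> elements are illustrative only and are not prescribed by these recommendations. The outermost <citeStructure> has no <citeStructure> parent and therefore needs no delim, whereas each nested one must supply it:

```xml
<!-- Illustrative fragment; it would normally sit inside a <refsDecl> in the header. -->
<citeStructure unit="book" match="//body/div" use="@n">
 <!-- Nested: a delim of at least one character is required here. -->
 <citeStructure unit="chapter" match="div" use="position()" delim=" ">
  <!-- Nested one level further: delim is again required. -->
  <citeStructure unit="verse" match="div" use="position()" delim=":"/>
 </citeStructure>
</citeStructure>
```

Under these assumptions, each delimiter is placed before the component it belongs to, so a space precedes the chapter component and a colon precedes the verse component of a citation.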

match (match) supplies an XPath selection pattern, using the syntax defined in [XSLT3], which identifies a set of nodes which are citable structural components. The expression may be absolute (beginning with /) or relative. match on a <citeStructur