The structure and encoding of PressMint corpora
2025-09-16

1. Introduction

This document is meant to serve as a reference for the encoding of PressMint corpora of historical newspapers. In order for the PressMint corpora to be interoperable (i.e. so that the same scripts can be used to process them), their structure is fairly rigid, primarily in terms of file names and folder structure, and, partially, their TEI XML encoding. This is not to say that all the corpora have to contain exactly the same information because we distinguish obligatory information, which all the corpora should contain, from that which is optional, and present only in the corpora for which it has been possible to gather it from the corpus sources.

This document is a modification of the ParlaMint encoding guidelines, which are a customisation of the TEI Guidelines. But while ParlaMint specifies many requirements on the structure of the documents and on obligatory data and metadata, PressMint makes only minimal requirements for the purposes of interoperability, and leaves considerable space for optional extensions.

The rest of these recommendations are structured as follows:

2. Overall corpus structure

2.1. XML structure

The newspapers of one contributing country constitute one PressMint corpus, which is stored as one XML document, with <teiCorpus> as its top-level element. It is composed of a <teiHeader>, giving the metadata for the corpus as a whole (further detailed in the Section on Corpus metadata), followed by a series of <TEI> elements that each contain one corpus component, as illustrated below:
<!-- Corpus root -->
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>...</teiHeader>
  <TEI>...</TEI> <!-- Corpus component -->
  <TEI>...</TEI> <!-- Corpus component -->
  ...            <!-- More corpus components -->
</teiCorpus>
We do not specify what exactly a corpus component should contain, as this can differ substantially between corpora, e.g. it can be a newspaper edition corresponding to a particular day, or a collection of newspapers for a month or even a year. However, at least the year of the publication must be clear.
A corpus component is thus rooted in the <TEI> element, which contains its metadata in its own <teiHeader>, followed by the optional <facsimile> element, giving the links to the images, and then by the obligatory <text> element, which contains the text of the particular component, as illustrated below:
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>...</teiHeader>
  <facsimile>...</facsimile>
  <text>...</text>
</TEI>

The <teiHeader> of a corpus component (further detailed in the Section on Corpus metadata) contains the metadata specific to this component (along with some redundant metadata about its provenance). This metadata should be unique in the corpus, i.e. the corpus component metadata should distinguish it from all the other components of the corpus.

2.2. Use of XInclude

The fact that a corpus is one XML document does not mean that it is also stored in one file. In fact, PressMint requires that each corpus component is stored in a separate file, with the corpus root, i.e. the top-level <teiCorpus>, also stored as one file.

To enable one XML document to be composed of many files, we use the XInclude mechanism, and the corpus root uses this mechanism (i.e. the <include> elements in the XInclude namespace) to include its corpus component files, so a corpus root will be in fact encoded similarly to the following example:
<!-- Corpus root file -->
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>...</teiHeader>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
      href="1899/PressMint-SI_1899-04-16.xml"/>  <!-- Corpus component file -->
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude"
      href="1899/PressMint-SI_1899-04-16.xml"/>  <!-- Corpus component file -->
  ...                                            <!-- More corpus component files -->
</teiCorpus>
If <taxonomy> elements are used in PressMint, they are also stored as separate files, and hence also included in the corpus root using the same XInclude mechanism as explained above.
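The following is a minimal sketch, not part of the PressMint tooling, of how a corpus root file could be read as one XML document by expanding its XInclude directives with the Python standard library. The function name load_corpus and the custom loader are assumptions introduced for illustration; the loader resolves each href relative to the directory of the corpus root file.

```python
import xml.etree.ElementInclude as ElementInclude
import xml.etree.ElementTree as ET
from pathlib import Path

TEI_NS = "http://www.tei-c.org/ns/1.0"


def load_corpus(root_file):
    """Parse a corpus root file and expand its xi:include elements in place.

    Hypothetical helper: returns the <teiCorpus> element with every
    included component inserted where its xi:include element was.
    """
    root_path = Path(root_file)
    base_dir = root_path.parent

    def loader(href, parse, encoding=None):
        # Resolve each href relative to the directory of the corpus root file.
        target = base_dir / href
        if parse == "xml":
            return ET.parse(target).getroot()
        return target.read_text(encoding=encoding or "utf-8")

    tree = ET.parse(root_path)
    ElementInclude.include(tree.getroot(), loader=loader)
    return tree.getroot()
```

After the call, the returned element can be searched for <TEI> children in the TEI namespace just as if the corpus had been stored in a single file.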

2.3. File names and directory structure

PressMint has strict rules on how to name the various files that constitute a corpus, and how to collect them in directories.

The file names have the following structure:

  • The corpus root file name should start with the string PressMint-, followed by the ISO 3166 country code (cf. Section on Standard values) of the country whose team is contributing the corpus, e.g. PressMint-SI.xml.
  • A corpus component filename should start with the name of the root, followed by an underscore and the ISO 8601 formatted date of the publication of the newspaper, for example PressMint-SI_1899-04-16.xml. In case the exact date of the publication is unknown, only the month or year can be given, e.g. PressMint-SI_1899-04.xml or PressMint-SI_1899.xml.
  • The corpus component filename can be extended with an underscore and a string containing only ASCII letters and numbers and the hyphen character, e.g. PressMint-SI_1899_KRN-NUK.xml. This extra suffix can encode an abbreviation of the newspaper's name, and serve to distinguish two different newspapers that were published on the same day, or to distinguish the sources from which a newspaper was obtained.
  • Certain metadata elements from the corpus root <teiHeader> are stored in separate files, in particular, in case they are used, any PressMint taxonomies, stored in the <taxonomy> elements. The file names for such metadata files should start with the name of the corpus root, followed by a hyphen, and then the name of the element, e.g. PressMint-SI-listBibl.xml for the list of bibliographic items (newspapers) that represent the sources of the Slovenian corpus. In case there are several files with the same element name, as is the case for taxonomies, the filename should end with another hyphen, followed by a distinguishing suffix, e.g. PressMint-SI-taxonomy-topic.xml.
  • The file names of the corpus as a whole or of corpus components that have been automatically converted from the source XML into some other format should have the same name as the corpus root or components, respectively, but with the appropriate file extension, e.g. PressMint-SI_1899_KRN-NUK.txt; this is further explained in the Section on Conversions.
  • As discussed in the Chapter on Linguistic annotation, we distinguish the linguistically annotated version of the corpus from the ‘plain-text’ one, with the linguistically annotated version having the additional suffix .ana on the corpus root and components, e.g. PressMint-SI.ana.xml or PressMint-SI_1899_KRN-NUK.ana.xml.
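The naming rules for corpus component files can be approximated by a regular expression. The sketch below is illustrative only: the pattern name COMPONENT_NAME and the function parse_component_name are hypothetical, and the pattern assumes a two-letter upper-case country code, which may be narrower than what a given corpus actually uses.

```python
import re

# Assumption: the country code is a two-letter ISO 3166-1 alpha-2 code.
COMPONENT_NAME = re.compile(
    r"PressMint-(?P<country>[A-Z]{2})"         # name of the corpus root
    r"_(?P<date>\d{4}(?:-\d{2}(?:-\d{2})?)?)"  # year, year-month or full date
    r"(?:_(?P<suffix>[A-Za-z0-9-]+))?"         # optional distinguishing suffix
    r"(?P<ana>\.ana)?"                         # linguistically annotated version
    r"\.xml"
)


def parse_component_name(filename):
    """Return the parts of a component file name, or None if it does not match."""
    match = COMPONENT_NAME.fullmatch(filename)
    if match is None:
        return None
    parts = match.groupdict()
    parts["ana"] = parts["ana"] is not None  # turn the '.ana' marker into a flag
    return parts
```

For example, parse_component_name("PressMint-SI_1899_KRN-NUK.ana.xml") yields the country SI, the date 1899, the suffix KRN-NUK and the ana flag set, while the corpus root name PressMint-SI.xml is not accepted as a component name.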

For distribution, the complete XML corpus should be stored in a directory whose name starts with the name of the corpus root file (without the file extension) and is extended with the format (e.g. TEI). The directory then contains the corpus root file (and its metadata files, if such exist), while the corpus components should be in subdirectories, one per year, for example:

PressMint-SI.TEI/PressMint-SI.xml
PressMint-SI.TEI/PressMint-SI-taxonomy-topic.xml
...
PressMint-SI.TEI/1899/PressMint-SI_1899-01-02.xml
PressMint-SI.TEI/1899/PressMint-SI_1899-01-03.xml
PressMint-SI.TEI/1899/PressMint-SI_1899-01-04.xml
...
PressMint-SI.TEI/1900/PressMint-SI_1900-01-02.xml
PressMint-SI.TEI/1900/PressMint-SI_1900-01-03.xml
PressMint-SI.TEI/1900/PressMint-SI_1900-01-04.xml
...

The linguistically annotated version of the corpus is stored separately, with the main directory and, as mentioned, the corpus root and component filenames having the additional suffix .ana, e.g.

PressMint-SI.TEI.ana/PressMint-SI.ana.xml
PressMint-SI.TEI.ana/PressMint-SI-taxonomy-topic.xml
...
PressMint-SI.TEI.ana/1899/PressMint-SI_1899-01-02.ana.xml
PressMint-SI.TEI.ana/1899/PressMint-SI_1899-01-03.ana.xml
PressMint-SI.TEI.ana/1899/PressMint-SI_1899-01-04.ana.xml
...
PressMint-SI.TEI.ana/1900/PressMint-SI_1900-01-02.ana.xml
PressMint-SI.TEI.ana/1900/PressMint-SI_1900-01-03.ana.xml
PressMint-SI.TEI.ana/1900/PressMint-SI_1900-01-04.ana.xml
...
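The directory layout shown in the two listings can be derived mechanically from the parts of a component name. The following sketch assumes a hypothetical helper distribution_path; it is not taken from the PressMint scripts and only reproduces the layout described above.

```python
from pathlib import PurePosixPath


def distribution_path(country, date, suffix=None, ana=False):
    """Build the relative path of a component file in the TEI distribution.

    `date` is an ISO 8601 date, year-month or year; its first four
    characters (the year) name the subdirectory.
    """
    root = f"PressMint-{country}"
    name = f"{root}_{date}"
    if suffix:
        name += f"_{suffix}"
    if ana:
        name += ".ana"
    top = f"{root}.TEI" + (".ana" if ana else "")
    return PurePosixPath(top) / date[:4] / f"{name}.xml"
```

Calling distribution_path("SI", "1899-01-02") gives PressMint-SI.TEI/1899/PressMint-SI_1899-01-02.xml, and with ana=True the same component is placed under PressMint-SI.TEI.ana with the .ana suffix in its file name.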

3. General requirements

This section gives some general requirements a PressMint corpus has to meet, in particular those relating to the characters in a corpus, and the use of standards. It also details the structure of the file names of the PressMint root and component files, as well as the attributes expected on the <teiCorpus> and <TEI> tags.

3.1. Characters

The corpus should be encoded in Unicode, using the UTF-8 character encoding, at least for European languages. In cases where the original contains characters from the Unicode Private Use Area, these should, if possible, be given their closest Unicode equivalents or substituted by the Unicode replacement character U+FFFD. End-of-line hyphens, if present in the source files, should be removed, and the split words joined in order to enhance searching the corpus and to simplify linguistic processing.

The following characters, esp. prevalent when the source documents were in Word or HTML, deserve special mention:

  • The TAB (U+0009) character helps the alignment of strings on successive lines. As PressMint is not interested in preserving the layout, all TAB characters must be substituted by space characters (U+0020).
  • NO-BREAK SPACE (U+00A0) prevents, with some applications, an automatic line break at its position and also collapsing such consecutive characters into a single space. As the use of this character complicates (or breaks) further processing, esp. linguistic annotation, this character must be substituted by the normal space character (U+0020). The same holds for other variants of Unicode space characters (U+2000 - U+200A), which are, however, used much less frequently.
  • ZERO WIDTH NO-BREAK SPACE (U+FEFF), also used as the Byte Order Mark (BOM) in Windows files, should be removed.
  • NON-BREAKING HYPHEN (U+2011), similarly to NO-BREAK SPACE, prevents a line break, in this case following its position. With a similar reasoning as above, this character should be substituted by the normal hyphen character ('-', U+002D).
  • SOFT HYPHEN (U+00AD) indicates that a word can be hyphenated at that point. Occurrences of this character should be removed from the corpus.

Text-bearing elements should also not start or end with space characters, and sequences of whitespace characters should be changed into a single space.
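The character rules above can be sketched as a single normalisation function. This is a minimal illustration under stated assumptions, not the official PressMint procedure: the name normalize_text is hypothetical, and the function covers neither Private Use Area characters nor the joining of words split by end-of-line hyphens.

```python
import re

# Characters that are replaced by a normal space (U+0020):
# TAB, NO-BREAK SPACE and the Unicode space variants U+2000 - U+200A.
_TABLE = {cp: " " for cp in [0x0009, 0x00A0, *range(0x2000, 0x200B)]}
# Characters that are removed (mapped to None) or replaced by '-'.
_TABLE.update({
    0xFEFF: None,  # ZERO WIDTH NO-BREAK SPACE / BOM: removed
    0x00AD: None,  # SOFT HYPHEN: removed
    0x2011: "-",   # NON-BREAKING HYPHEN: normal hyphen
})


def normalize_text(text):
    """Apply the character substitutions and whitespace rules to a string."""
    text = text.translate(_TABLE)
    # Collapse whitespace sequences and trim leading/trailing whitespace.
    return re.sub(r"\s+", " ", text).strip()
```

The function is intended to be applied to the text content of text-bearing elements, so that they neither start nor end with a space and contain no sequences of whitespace.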

3.2. Standard values

Whenever possible, PressMint uses standards for information coding. In particular, the following information must be standardised:

  • As the identity of a PressMint corpus is determined by the country that is contributing the corpus, its code appears in many places. For specifying these codes, the ISO 3166 standard should be used, in particular ISO 3166-1 alpha-2 for the two letter codes of the countries.
  • The codes for the languages used in the corpora (i.e. the possible values of the xml:lang attribute) should follow BCP 47 (cf. also xml:lang in XML document schemas). Essentially, this means that the value for a language code should have two letters, following ISO 639-1 or, and only if a two-letter code does not exist for a language, the three-letter ISO 639-2/T code. PressMint corpora will use (except for Great Britain) at least two languages, i.e. the language that the newspapers are written in, which we will call the local language, and English, as the meta-language, which is (also) used in the metadata.
  • Temporal, i.e. time-related information is typically stored in the when, from and to attributes of various elements. To specify a date or time as the value of these attributes, formatting according to the ISO 8601 standard should be used, e.g. 1888-04-01 for the 1st of April 1888. More information on temporal attributes is given in the Section on Temporal attributes.

3.3. Attributes of top-level elements

The Chapter on Overall corpus structure introduced the top level elements of the corpus root file and of the component files (i.e. the <teiCorpus> and <TEI> elements), but did not elaborate on their attributes; these are presented in this section.

The corpus root has three required attributes, as shown below:
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0"
           xml:id="PressMint-SI"
           xml:lang="sl">
All three attributes can also be used on any other element, and are thus of special importance:
  • xmlns determines the namespace of the element, and this should always be the TEI namespace, i.e. http://www.tei-c.org/ns/1.0 (apart from the elements using the XInclude directive, cf. the Section on Use of XInclude). Note that lower level elements inherit the namespace of the superordinate element, unless explicitly overridden, so it is only necessary to specify the TEI namespace on the root element of a file.
  • xml:id is an attribute from the (implicitly assumed) XML namespace, and gives the identifier of the element bearing it. The value of an ID should be unique in the corpus as a whole and should obey format requirements as defined by W3C. For the corpus root, as well as for the components, it is required that this top level identifier is identical to the file name (without the file extension). The xml:id is a global attribute, so any element can have it. While this is not required, it is necessary for any element that is then referred to (via this same ID) by some other element, such as many elements in the <teiHeader>, as is explained in the Section on Corpus metadata. The subordinate elements in the text that have an ID (such as page breaks), are recommended to have the top level xml:id as a prefix and to indicate the element name in the ID. For example, if the top level ID is PressMint-SI_1899-01-02, the first page break would have the ID PressMint-SI_1899-01-02.pb1.
  • xml:lang is also a global attribute and gives the language code of the text content of the element; for the corpus root this means the content of its TEI header, while for corpus components this is the textual content of their TEI headers and <text> elements. The convention is that the language of the text content of an element is determined by the value of the first xml:lang attribute on its ancestor axis. In cases where the content is multilingual, the language code should be that of the majority language. When the proportion of the languages is about equal, the mul code for multiple languages can also be used.
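The inheritance convention for xml:lang can be illustrated with a short sketch. The function effective_lang is a hypothetical helper, written for the Python standard library parser, that checks the element itself and then walks up its ancestors until it finds an xml:lang attribute.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the XML namespace in Clark notation.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(root, element):
    """Return the language code that applies to `element`, or None.

    ElementTree elements have no parent pointer, so a child-to-parent
    map is built from `root` first.
    """
    parents = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)  # None once the root has been passed
    return None
```

With a component whose <TEI> element carries xml:lang="sl", a <title> without its own xml:lang resolves to sl, while a <title xml:lang="en"> resolves to en.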
A corpus component also has the same three required attributes, but can additionally also have the ana attribute, which associates the text with a category or categories defined in one or more taxonomies:
<TEI xmlns="http://www.tei-c.org/ns/1.0"
     xml:id="PressMint-SI_1899-01-02"
     xml:lang="sl"
     ana="#SI-frequency.daily">
The same as for the corpus root, the component also sets the TEI namespace, and gives the language of its textual content, while its xml:id, of course, identifies the particular component. The ana attribute is a pointing attribute, and we introduce these attributes in the next section.

3.4. Pointing attributes

The PressMint encoding can use pointing attributes for various purposes, e.g. for references to the IDs of the facsimile elements.

While a few elements have dedicated pointing attributes, there are some generally used ones. They share the characteristics that they can all be used by a number of different elements and that their value is a series of pointers, i.e. a white-space delimited sequence of references to the values of some xml:id attribute in the corpus or, in general, to a URI. The attributes are:

  • facs gives the pointer(s) to the elements of the <facsimile> elements (cf. Section on Newspaper facsimile).
  • ref provides an explicit reference to the full definition or identity for the entity being named. In PressMint it could be used e.g. for referring to the Wikimedia entry for a person. The value of this attribute is often, but not always, a URL, e.g. for associating a place name with its GeoNames URL.
To illustrate, the example below gives some elements that contain one or more of these attributes:
<p facs="#PressMint-SI_1899_KRN-NUK.page1 #PressMint-SI_1899_KRN-NUK.page2">
  <pb facs="#PressMint-SI_1899_KRN-NUK.page1"/>
  <name ref="https://www.geopedia.world/#T12_x1614772.8705537645_y5789479.6377019035_s12_b2345">Ljubljana</name>
</p>
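Since the value of a pointing attribute is a white-space delimited series of pointers, a consumer of the corpus typically splits it and treats local references (starting with #) differently from other URIs. The sketch below shows one possible way to do this; the function names are hypothetical, and pointers that use an abbreviated prefix are simply classified as URIs here.

```python
def split_pointers(value):
    """Split a pointing attribute value into (kind, target) pairs.

    References that start with '#' are local IDs; everything else,
    including prefixed (abbreviated) pointers, is reported as 'uri'.
    """
    pointers = []
    for token in value.split():
        if token.startswith("#"):
            pointers.append(("local", token[1:]))
        else:
            pointers.append(("uri", token))
    return pointers


def unresolved_local_pointers(value, known_ids):
    """Return the local IDs in `value` that are not in `known_ids`."""
    return [target for kind, target in split_pointers(value)
            if kind == "local" and target not in known_ids]
```

A check of this kind could be run over all facs attributes of a component, with known_ids being the set of xml:id values found in its <facsimile> element.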

3.5. Temporal attributes

PressMint makes use of temporal information, in particular to encode when a newspaper was published. As mentioned in the Section on Standard values, the ISO 8601 format should be used to specify the dates or times.

The following attributes are used to specify temporal information:

  • The when attribute is used when the temporal information refers to a point in time, typically a date, and is used e.g. to give the date when a corpus text was published, or when a change in the corpus was made.
  • The from and to attributes give the starting and ending date or time of an interval, e.g. the time period the corpus covers. If only one of the two attributes is present, then the assumption is that this interval extends at least to the start (if from is missing) or after the end (if to is missing) of the time period that the particular PressMint corpus covers. Similarly, if both attributes are missing, the assumption is that the interval covers the complete time period of a PressMint corpus.
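The defaulting rule for missing from and to attributes can be written down as follows. This is only a sketch: the function effective_interval and its parameter names are hypothetical, the corpus period has to be supplied by the caller, and all values are assumed to be full ISO 8601 dates (YYYY-MM-DD).

```python
from datetime import date


def effective_interval(corpus_from, corpus_to, from_=None, to=None):
    """Return the (start, end) dates of an interval.

    A missing `from_` defaults to the start of the period covered by
    the corpus, and a missing `to` to its end.
    """
    start = date.fromisoformat(from_ if from_ else corpus_from)
    end = date.fromisoformat(to if to else corpus_to)
    return start, end


def covers(when, interval):
    """Check whether the date `when` falls inside `interval` (inclusive)."""
    start, end = interval
    return start <= date.fromisoformat(when) <= end
```

For instance, with a corpus period given as 1899-01-01 to 1900-12-31 (values chosen purely for illustration), an element with only from="1899-06-01" is taken to extend to the end of 1900.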

4. Corpus metadata

As mentioned, <teiCorpus> and <TEI> elements contain the obligatory <teiHeader> element, which stores the metadata to the corpus root or component. In this section we explain and give examples of the required and optional metadata that is contained in the <teiHeader>, proceeding through its various elements, and there distinguishing which parts and what content is appropriate for the corpus root, and which for a corpus component.

As a general remark, most metadata contains free text, and it is a requirement of PressMint that this data is given in English, to help researchers from other countries understand it. It is also recommended to give it in the local language in which (the main portion of) the newspapers is written, so that local researchers can use it in their native tongue.

A PressMint <teiHeader> contains three obligatory elements: the file description, <fileDesc>, the encoding description, <encodingDesc>, and the profile description, <profileDesc>, and an optional revision description, <revisionDesc>:
<teiHeader>
  <fileDesc>...</fileDesc>
  <encodingDesc>...</encodingDesc>
  <profileDesc>...</profileDesc>
  <revisionDesc>...</revisionDesc>
</teiHeader>
Below we explain each of these elements in turn.

4.1. File description

The file description, <fileDesc> is composed of five obligatory elements, namely the title statement, <titleStmt>, the edition statement, <editionStmt>, the extent, <extent>, the publication statement, <publicationStmt>, and the source description, <sourceDesc>:
<fileDesc>
  <titleStmt>...</titleStmt>
  <editionStmt>...</editionStmt>
  <extent>...</extent>
  <publicationStmt>...</publicationStmt>
  <sourceDesc>...</sourceDesc>
</fileDesc>

4.1.1. Title statement

The title statement, <titleStmt>, gives the title of the corpus root or component, the persons responsible for compiling the corpus, and the funder(s) of the project.

This structure is exemplified by the following corpus root title statement:
<titleStmt>
  <title type="main">Korpus starejših slovenskih časopisov PressMint-SI [PressMint]</title>
  <title type="main" xml:lang="en">Slovenian historical newspaper corpus PressMint-SI [PressMint]</title>
  <respStmt>
    <persName ref="https://orcid.org/0000-0002-1560-4099">Tomaž Erjavec</persName>
    <resp>Kodiranje PressMint TEI XML</resp>
    <resp xml:lang="en">PressMint TEI XML corpus encoding</resp>
  </respStmt>
  <funder>
    <orgName>Raziskovalna infrastruktura CLARIN</orgName>
    <orgName xml:lang="en">The CLARIN research infrastructure</orgName>
  </funder>
  <funder>
    <orgName>Slovenska raziskovalna infrastruktura CLARIN.SI</orgName>
    <orgName xml:lang="en">The Slovenian research infrastructure CLARIN.SI</orgName>
  </funder>
</titleStmt>
The title statement starts with two titles (one in English and one in the local language), with the appropriate language code possibly inherited from a superordinate element.

The main title has a formulaic structure ‘<Country_name> historical newspaper corpus PressMint-<Country_code> [PressMint]’, with an equivalent structure for the local language. Note that the corpus ‘stamp’ in square brackets can also be ‘[PressMint.ana]’ for the linguistically annotated version of the corpus (as explained in the Chapter on Linguistic annotation) or ‘[PressMint SAMPLE]’ for corpus data samples, as available on the PressMint GitHub repository.

After the titles come one or more responsibility statements, <respStmt>, each one containing one or more person names, <persName>, with an optional ref attribute, giving the (typically ORCID) URL, where more information about the person can be found, and the responsibility element <resp>, which specifies what responsibility the statement is about.

In a similar manner, the <funder> elements give information on the organisations which have financially contributed to the compilation of the corpus, with the names of the organisations given in the <orgName> elements.

A corpus component has a very similar title statement to the corpus root, except that certain elements specify the metadata of the component, rather than that of the complete corpus. It can also contain redundant metadata, in particular, the responsibility statement and the funder, as illustrated in the example below:
<titleStmt>
  <title>Korpus starejših slovenskih časopisov PressMint-SI, Kmetijske in rokodelske novice, 16. 4. 1899 [PressMint]</title>
  <title xml:lang="en">Slovenian historical newspaper corpus PressMint-SI, "Agricultural and Artisan News", April 16th, 1899 [PressMint]</title>
  <respStmt>
    <persName ref="https://orcid.org/0000-0002-1560-4099">Tomaž Erjavec</persName>
    <resp>Kodiranje PressMint TEI XML</resp>
    <resp xml:lang="en">PressMint TEI XML corpus encoding</resp>
  </respStmt>
  <funder>
    <orgName>Raziskovalna infrastruktura CLARIN</orgName>
    <orgName xml:lang="en">The CLARIN research infrastructure</orgName>
  </funder>
  <funder>
    <orgName>Slovenska raziskovalna infrastruktura CLARIN.SI</orgName>
    <orgName xml:lang="en">The Slovenian research infrastructure CLARIN.SI</orgName>
  </funder>
</titleStmt>
In the example it can be seen that the title of a corpus component is simply an extension of the corpus root title, as it also gives the name and date of the newspaper that the component contains. Note that the component title must be unique in the complete corpus.

4.1.2. Edition statement

PressMint corpora have their edition statement, <editionStmt> both in the corpus root and components. As illustrated below, the only element it contains is <edition>:
<editionStmt>
  <edition>1.0</edition>
</editionStmt>
We use semantic versioning to specify the version of the corpus, i.e. giving the version number, where a new major version means substantial changes to the corpus, while the minor version is reserved for e.g. correcting errata or other minor changes. We do not use the patch number. It should be noted that - at least so far - all the PressMint corpora were released together, so that they are all of the same edition, i.e. have the same version number.

4.1.3. Extents

The <extent> element gives information on selected sizes of the complete corpus (in the corpus root) or of one corpus component, as illustrated below in the case of a corpus root extent:
<extent>
  <measure unit="texts" quantity="75122" xml:lang="sl">75.122 besedil</measure>
  <measure unit="texts" quantity="75122" xml:lang="en">75,122 texts</measure>
  <measure unit="words" quantity="20190034" xml:lang="sl">20.190.034 besed</measure>
  <measure unit="words" quantity="20190034" xml:lang="en">20,190,034 words</measure>
</extent>
PressMint requires two sizes to be given, preferably in both languages: the number of texts and the number of words, which are distinguished by the unit attribute. The exact quantity is given in the quantity attribute, while the text content of <measure> gives the quantity together with the unit; if possible, the number here should contain the thousands separator appropriate for the language.

It should be noted that while the number of texts corresponds to the number of corpus components, the number of words can be somewhat complex to compute. Both are, however, inserted into the TEI headers in the finalisation of a corpus (cf. the Section on Finalisation of corpora) by a common script, so it is not necessary to insert the extent in the process of developing a PressMint corpus.
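To make the relation between the quantity attribute and the text content of <measure> concrete, the following sketch builds one such element. It is not the common finalisation script mentioned above; the function make_measure and the THOUSANDS_SEPARATOR mapping are assumptions introduced for illustration, and the unit label is passed in by the caller.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Assumption: thousands separators for the two languages of the example.
THOUSANDS_SEPARATOR = {"en": ",", "sl": "."}


def make_measure(unit, quantity, lang, label):
    """Create a <measure> element for an <extent>.

    `quantity` is stored exactly in the attribute, while the text
    content carries the number with a language-specific separator.
    """
    separator = THOUSANDS_SEPARATOR.get(lang, ",")
    measure = ET.Element("measure", {
        "unit": unit,
        "quantity": str(quantity),
        XML_LANG: lang,
    })
    measure.text = f"{quantity:,}".replace(",", separator) + f" {label}"
    return measure
```

For example, make_measure("words", 20190034, "sl", "besed") produces an element whose quantity attribute is 20190034 and whose text content is "20.190.034 besed".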

4.1.4. Publication statement

The publication statement <publicationStmt> must appear in the corpus root as well as, in identical form, in the corpus components. As illustrated below, it contains information about the publisher of the corpus, the persistent identifier where the complete corpus can be found, under which licence it is distributed, and when it was released:
<publicationStmt>
  <publisher>
    <orgName xml:lang="sl">Raziskovalna infrastruktura CLARIN</orgName>
    <orgName xml:lang="en">CLARIN research infrastructure</orgName>
    <ref target="https://www.clarin.eu/">www.clarin.eu</ref>
  </publisher>
  <idno type="URI" subtype="handle">http://hdl.handle.net/11356/8943</idno>
  <availability status="free">
    <licence>http://creativecommons.org/licenses/by/4.0/</licence>
    <p xml:lang="sl">To delo je ponujeno pod
      <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Priznanje avtorstva 4.0
      mednarodna licenca</ref>.</p>
    <p xml:lang="en">This work is licensed under the
      <ref target="http://creativecommons.org/licenses/by/4.0/">Creative Commons Attribution 4.0
      International License</ref>.</p>
  </availability>
  <date when="2023-06-11">11. 6. 2023</date>
</publicationStmt>
The <publisher> is, at least for the corpora produced in the scope of the CLARIN PressMint project, the CLARIN research infrastructure, and the element also gives the home page of the infrastructure. The ‘identifier number’ element, <idno>, specifies via its type and subtype attributes with fixed values URI and handle that the identifier is a handle, and contains the handle where the complete corpus corresponding to the specified version can be found. The <availability> specifies, via its <licence> element the fixed-value CC BY 4.0 URL, and in the following paragraph gives a prose description of the licence, including its URL via the target attribute of <ref>. As usual, the textual information is given in both languages. Finally, the <date> gives the date of the release, where the when gives the date in the ISO 8601 format, while the textual content can give it according to the conventions used in the local language.

4.1.5. Source description

The source description <sourceDesc> of the corpus root encodes the immediate digital source(s) of the PressMint corpus in the <bibl> element(s), as shown in the following example:
<sourceDesc>
  <bibl>
    <author>Dobranić, Filip</author>
    <author>Evkoski, Bojan</author>
    <author>Ljubešić, Nikola</author>
    <title type="main" xml:lang="sl">Korpus slovenske periodike (1771-1914) sPeriodika 1.0</title>
    <title type="main" xml:lang="en">Corpus of Slovenian periodicals (1771-1914) sPeriodika 1.0</title>
    <idno type="URI" subtype="handle">http://hdl.handle.net/11356/8943</idno>
    <date>2023</date>
    <bibl>
      <title type="main" xml:lang="sl">Digitalna knjižnica Slovenije dLib</title>
      <title type="main" xml:lang="en">Digital library of Slovenia dLib</title>
      <idno type="URI">https://dlib.si/</idno>
    </bibl>
  </bibl>
</sourceDesc>
Apart from the <author>s and bi-lingual <title>s, the source description should also contain the <idno> element with the fixed type URI, which gives the URL where the source is available, if such exists, while the <date> gives the date of publication of the source. As can be seen, the source corpus was itself based on publications available in the digital library of Slovenia, and this is encoded in the nested <bibl> element.
For corpus components the source description encodes the edition of the newspaper that the component encodes, i.e. it gives the (bi-lingual) name of the newspaper, the date when the edition was published, and, if available, the URL of the edition on the Web.
<sourceDesc>
  <bibl>
    <title type="main" xml:lang="sl">Kmetijske in rokodelske novice</title>
    <title type="main" xml:lang="en">Agricultural and Artisan News</title>
    <date when="1899-04-16">16. 4. 1899</date>
    <idno type="URI">https://dlib.si/details/URN:NBN:SI:DOC-000TTDCE/</idno>
  </bibl>
</sourceDesc>

4.2. Encoding description

The encoding description <encodingDesc> of the corpus root contains the following elements:
<encodingDesc>
  <projectDesc>...</projectDesc>
  <editorialDecl>...</editorialDecl>
  <tagsDecl>...</tagsDecl>
</encodingDesc>

In contrast, the encoding description of a corpus component contains only two elements, namely (and redundantly) the <projectDesc> and the <tagsDecl>.

4.2.1. Project description

The project description <projectDesc> of the corpus root contains a short description of the project in the scope of which the corpus was compiled:
<projectDesc>
  <p xml:lang="sl">Projekt <ref target="https://www.clarin.eu/pressmint">PressMint</ref>
    bo izdelal večjezične, primerljive, označene in prevedene interoperabilne korpuse
    starejših Evropskih časopisov z začetka 20. stoletja. Korpusi PressMint bodo
    odprto dostopni, tako za prenos v raznovrstnih formatih, kot tudi v več spletnih
    platformah za analizo korpusov.</p>
  <p xml:lang="en">The <ref target="https://www.clarin.eu/pressmint">PressMint</ref>
    project aims to compile a multilingual, comparable, annotated, translated and
    interoperable set of corpora of European historical newspapers from around the
    start of the 20th century. The PressMint corpora will be openly available, both
    for download in a variety of instances and formats, as well as via several online
    corpus analysis tools.</p>
</projectDesc>

4.2.2. Editorial declaration

The editorial declaration, <editorialDecl>, is used only in the corpus root and contains prose descriptions of the editorial decisions made in the process of compiling the corpus, along several dimensions, in particular what types, if any, of <correction>, <normalization>, <quotation>, <hyphenation>, and <segmentation> were performed on the texts of the corpus. The example below illustrates the use of these elements:
<editorialDecl>
  <correction>
    <p xml:lang="en">In the source sPeriodika corpus the OCR-ed texts were corrected
      with <ref target="https://github.com/clarinsi/csmtiser">cSMTiser</ref>,
      a text normalisation tool based on character-level machine translation. No additional
      correction was performed in the scope of PressMint.</p>
  </correction>
  <normalization>
    <p xml:lang="en">Text has not been normalised, except for spacing.</p>
  </normalization>
  <hyphenation>
    <p xml:lang="en">In the source sPeriodika corpus heuristics were used to join end-of-line hyphenated words.</p>
  </hyphenation>
  <quotation>
    <p xml:lang="en">Quotation marks have been left in the text and are not explicitly marked up.</p>
  </quotation>
  <segmentation>
    <p xml:lang="en">In the source sPeriodika corpus the texts were segmented into pages and paragraphs. No
      additional segmentation was performed in the scope of PressMint.</p>
  </segmentation>
</editorialDecl>

4.2.3. Tags declaration

The tags declaration, <tagsDecl> of the corpus root gives the count of all the XML tags used in the data part (so, not in the TEI header) of the corpus (for the corpus root) or in an individual component of the corpus. To distinguish the TEI elements from the possible use of elements from other namespaces, a <namespace> element giving the TEI namespace in its name attribute is introduced first. Inside it, each TEI tag is listed in its own <tagUsage> element, with the attribute gi giving the name of the tag and occurs the number of occurrences, as shown in the following example:
<tagsDecl>
 <namespace name="http://www.tei-c.org/ns/1.0">
  <tagUsage gi="text" occurs="414"/>
  <tagUsage gi="body" occurs="414"/>
  <tagUsage gi="pb" occurs="75122"/>
  <tagUsage gi="p" occurs="280971"/>
  <tagUsage gi="gap" occurs="789"/>
 </namespace>
</tagsDecl>
It should be noted that, similarly to the extents (as explained in the Section on Extents), the tag usage is inserted into the TEI headers in the finalisation of a corpus (cf. the Section on Validation and conversion) by a common script, so it is not necessary to compute it in the process of developing a PressMint corpus.
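
To make concrete what such a count involves, the following Python sketch counts the TEI elements found inside the <text> elements of a document and prints them as <tagUsage> elements. It is not the common PressMint script: the function name tag_usage and the sample document are assumptions made for this illustration, and nested <text> elements are assumed not to occur.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical miniature component, used only for this illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
 <teiHeader><fileDesc><titleStmt><title>Sample</title></titleStmt></fileDesc></teiHeader>
 <text><body><pb/><p>First.</p><p>Second <gap reason="editorial"/></p></body></text>
</TEI>"""


def tag_usage(tei_xml):
    """Count TEI elements inside <text>, i.e. outside the TEI header."""
    root = ET.fromstring(tei_xml)
    counts = Counter()
    for text in root.iter("{%s}text" % TEI_NS):
        for el in text.iter():  # includes the <text> element itself
            ns, _, name = el.tag[1:].partition("}")
            if ns == TEI_NS:  # skip elements from other namespaces
                counts[name] += 1
    return counts


counts = tag_usage(SAMPLE)
for gi, occurs in counts.items():
    print('<tagUsage gi="%s" occurs="%d"/>' % (gi, occurs))
```

The <title> in the header of the sample is not counted, since only the content of <text> is traversed.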

4.2.4. Prefix definitions

Pointing attributes, such as url or ana, take as their value a reference, or a space-delimited series of references, to a URL and/or to the value of an xml:id attribute. If the reference is to an ID in the same document, it is prefixed by the hash character, #, e.g. #parla.uni; if it is to an ID in another XML document, the hash follows the URL of that document, e.g. https://nl.ijs.si/ME/V6/msd/tables/msd-fslib2-sl.xml#Vmpr1p.

Because complete URLs tend to be long, which is especially inconvenient when such references are given on every token in a corpus, TEI introduces the so-called Abbreviated Pointers. With these, a reference can be given in the form of a prefix separated by a colon from the local part of the reference, and the expansion of this prefix is determined via the <prefixDef> element in the <encodingDesc> of the TEI header.

PressMint can use this mechanism for pointing to facsimile images (cf. the Section on The facsimile element) and for linguistic annotations with a closed vocabulary, in particular for corpus-specific analytical part-of-speech tags (cf. the Section on Word-level annotation). The example below illustrates the prefix definition in a corpus component, giving the (partial) URL of the facsimile images of this particular newspaper issue:
<listPrefixDef>
 <prefixDef ident="facs"
  matchPattern="(.+)"
  replacementPattern="https://nl.ijs.si/imp/facs/NUKP14042344.$1">
  <p xml:lang="en">The URIs with this prefix point to the facsimile images of this
     corpus component.</p>
 </prefixDef>
</listPrefixDef>
In short, if the value of a pointing attribute is facs:page1.png this should be expanded to https://nl.ijs.si/imp/facs/NUKP14042344.page1.png. The second example illustrates the prefix definitions in the corpus root for the optional MULTEXT-East part-of-speech tags:
<listPrefixDef>
 <prefixDef ident="mte" matchPattern="(.+)"
  replacementPattern="https://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#$1">
  <p xml:lang="en">Private URIs with this prefix point to feature-structure elements defining the
     Slovenian MULTEXT-East Version 6 MSDs.</p>
 </prefixDef>
</listPrefixDef>
In detail, the specialised element for listing prefix definitions, <listPrefixDef>, gives one or more prefix definitions, i.e. <prefixDef> elements. Each prefix definition gives its prefix as the value of the ident attribute, a regular expression that matches the part of the reference after the prefix in its matchPattern attribute, and the substitution as the value of the replacementPattern attribute. The prefix definition above thus defines the mte prefix, so for any reference with this prefix, e.g. mte:Nps, the part after the prefix (Nps) is matched against (.+), and the matched group (here the entire local part, Nps) is substituted for $1 in the replacement pattern, which ends in the hash character followed by $1. In this way mte:Nps gives https://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#Nps.

Finally, each prefix definition also contains a possibly bi-lingual paragraph explaining the definition.
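
The expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name expand_pointer and the PREFIX_DEFS table are hypothetical (in practice the table would be read from the <prefixDef> elements of the TEI header), and only the group references $1 to $9 are handled.

```python
import re

# Prefix definitions copied from the two examples in this section:
# prefix -> (matchPattern, replacementPattern)
PREFIX_DEFS = {
    "facs": ("(.+)", "https://nl.ijs.si/imp/facs/NUKP14042344.$1"),
    "mte": ("(.+)", "https://nl.ijs.si/ME/V6/msd/tables/msd-fslib-sl.xml#$1"),
}


def expand_pointer(pointer, prefix_defs=PREFIX_DEFS):
    """Expand one abbreviated pointer; return any other reference unchanged."""
    prefix, sep, local = pointer.partition(":")
    if not sep or prefix not in prefix_defs:
        return pointer  # e.g. "#some.id" or a complete URL
    match_pattern, replacement = prefix_defs[prefix]
    match = re.fullmatch(match_pattern, local)
    if match is None:
        return pointer
    # TEI writes group references as $1, $2, ...; substitute them by hand,
    # since Python's own replacement syntax uses backslashes instead.
    return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))), replacement)


print(expand_pointer("facs:page1.png"))
print(expand_pointer("mte:Nps"))
```

Since a pointing attribute may hold several space-delimited references, a caller would typically split the attribute value on whitespace and expand each reference separately.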

4.3. Profile description

The profile description, <profileDesc> is the third main division of the metadata provided by the TEI header. It contains a description of non-bibliographic aspects of the corpus, in particular, the date range of the corpus. It is only used in the corpus root and contains two elements:
<profileDesc>
 <settingDesc>...</settingDesc>
 <langUsage>...</langUsage>
</profileDesc>
We explain the contents of each element in the following sections.

4.3.1. Setting description

The setting description, <settingDesc>, is used only in the corpus root and contains only one element, <setting>, which in turn gives information on the date (year) range that the corpus covers:
<settingDesc>
 <setting>
  <date from="1870" to="1915">1870--1915</date>
 </setting>
</settingDesc>

4.3.2. Language usage

The language usage, <langUsage> is the last element of the profile description of a corpus root and defines the languages that are used in the corpus. Typically the language use will define (bilingually) only two languages, the local language and English, as the language used in the metadata, for example:
<langUsage>
 <language ident="sl" xml:lang="sl"
  default="true">slovenski</language>
 <language ident="en" xml:lang="sl">angleški</language>
 <language ident="sl" xml:lang="en"
  default="true">Slovenian</language>
 <language ident="en" xml:lang="en">English</language>
</langUsage>
To distinguish the meta-language from the local language, the default attribute is used, which has to have the value true for the local language.
In cases where the corpus contains more than one language, the percentage of their use can also be indicated in the usage attribute of the <language> elements, as illustrated in the example below:
<langUsage>
 <language ident="en" xml:lang="en">English</language>
 <language ident="en" xml:lang="nl">Engels</language>
 <language default="true" usage="55"
  ident="nl" xml:lang="en">Dutch</language>
 <language default="true" usage="55"
  ident="nl" xml:lang="nl">Nederlands</language>
 <language usage="45" ident="fr"
  xml:lang="en">French</language>
 <language usage="45" ident="fr"
  xml:lang="nl">Frans</language>
</langUsage>
Note that only one of the local languages should have @default="true".

4.4. Revision description

The revision description, <revisionDesc> is the fourth, and last element of the TEI header. It is an optional element that can appear in the corpus root or component, and documents the revisions made in the corpus or component. Its structure is illustrated below:
<revisionDesc>
 <change when="2025-07-11">
  <name>Tomaž Erjavec</name>: Finalized encoding.</change>
 <change when="2025-07-03">
  <name>Tomaž Erjavec</name>: Built corpus.</change>
</revisionDesc>
The revision description consists of a series of <change> elements, with the attribute when giving the date of the change, and the content containing the <name> of the person responsible for the change, and a free-text description of the change. Note that the <change> elements follow reverse chronological order, i.e. the most recent changes are at the top.

5. Newspaper facsimile

Facsimiles (i.e. images) of the newspapers are highly useful, both for comparing the transcriptions with the original during analysis and for enabling better OCR as the state of the art improves. If the facsimile is available, it should also be published together with the PressMint corpus and be referred to from the corpus, in particular from each corpus component.

How to encode references to the facsimile images in TEI is, in the general case, explained in the Chapter on Representation of Primary Sources of the TEI Guidelines. In this chapter we only provide the basic representation that is directly supported in PressMint.

5.1. The facsimile element

The <facsimile> element should appear in a corpus component immediately after the <teiHeader>, cf. the Section on Overall XML corpus structure. It contains pointers to the complete facsimile or its parts, i.e. URLs of the images of an issue or its individual pages, and can further structure or document these images.

The <facsimile> element can contain a <graphic> element that points to the complete facsimile of the corpus component (typically an issue of a newspaper). This is followed by a series of <surface> elements, each one typically corresponding to a printed page. These, in turn, also contain <graphic> elements, each one pointing to the image of the page. It is important that each element containing a <graphic> has the xml:id attribute, as this serves to connect the text to the images. The following example illustrates this basic structure; note that the url values use the TEI extended pointers, cf. the Section on Prefix definitions:
<facsimile xml:id="PressMint-SI_1851-12-03_KRN-NUK.facsimile">
 <graphic xml:id="PressMint-SI_1851-12-03_KRN-NUK.graphic"
  url="facs:pdf"/>
 <surface xml:id="PressMint-SI_1851-12-03_KRN-NUK.page1">
  <graphic url="facs:page1.png"/>
 </surface>
 <surface xml:id="PressMint-SI_1851-12-03_KRN-NUK.page2">
  <graphic url="facs:page2.png"/>
 </surface>
</facsimile>

Apart from modelling pages with <surface>, areas inside them can also be specified. For this, <zone> elements inside <surface> are used; these can specify a rectangle or, in general, a polygon inside it; the details are given in the TEI Section on Digital Facsimiles. Note, however, that if this approach is used, a mechanism needs to be implemented to show the correct zone on the image.

5.1.1. Connecting the text to the facsimile

Elements of the transcript are connected to the <facsimile>, <surface> or <zone> elements using the facs attribute, which has a series of ID references as its value. For example, if a paragraph appears on the first and second page of a newspaper issue, this would be modelled as in the following example:
<body>
 <p facs="#PressMint-SI_1899_KRN-NUK.page1 #PressMint-SI_1899_KRN-NUK.page2"> ...
 </p>
</body>
The exact point where a new page or area starts can also be represented by using the empty page break (<pb>), column break (<cb>) or line break (<lb>) elements, as shown below:
<body>
 <pb facs="#PressMint-SI_1899_KRN-NUK.page1"/>
 <lb facs="#PressMint-SI_1899_KRN-NUK.page1.line1"/>
 <p>v Ljubljani, v četrtek 16. aprila 1899.</p>
</body>
By convention, if these empty elements are used, they should always appear in front of the part of the text they cover, i.e. a page/line/column break should come before the text, as above, which is different from the usual practice of putting line breaks at the end of the line.

Note that these breaks can appear anywhere in the text, including in the middle of an (end-of-line hyphenated) word, which makes the linguistic annotation of such text more complicated, as textual data is mixed with markup, which is typically not otherwise the case. Furthermore, by convention, the breaks should appear as high up in the hierarchy as possible, i.e. if a break should appear at the beginning of a paragraph, it should be encoded before its start, as in the example above.
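
The link from the text to the images can be followed programmatically. The Python sketch below resolves the facs attribute of an element to the url values of the <graphic> elements it leads to. The function name image_urls and the miniature component are assumptions made for the example; the returned values are still abbreviated pointers, which would be expanded as described in the Section on Prefix definitions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical miniature component (TEI header omitted for brevity).
COMPONENT = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
 <facsimile xml:id="PressMint-SI_1851-12-03_KRN-NUK.facsimile">
  <surface xml:id="PressMint-SI_1851-12-03_KRN-NUK.page1">
   <graphic url="facs:page1.png"/>
  </surface>
  <surface xml:id="PressMint-SI_1851-12-03_KRN-NUK.page2">
   <graphic url="facs:page2.png"/>
  </surface>
 </facsimile>
 <text><body>
  <pb facs="#PressMint-SI_1851-12-03_KRN-NUK.page1"/>
  <p facs="#PressMint-SI_1851-12-03_KRN-NUK.page1 #PressMint-SI_1851-12-03_KRN-NUK.page2">...</p>
 </body></text>
</TEI>"""


def tei(name):
    """Return the namespace-qualified form of a TEI element name."""
    return "{%s}%s" % (TEI_NS, name)


def image_urls(root, element):
    """Follow the ID references in @facs and collect the graphic URLs."""
    by_id = {el.get(XML_ID): el for el in root.iter() if el.get(XML_ID)}
    urls = []
    for ref in element.get("facs", "").split():
        target = by_id.get(ref.lstrip("#"))
        if target is not None:
            urls.extend(g.get("url") for g in target.iter(tei("graphic")))
    return urls


root = ET.fromstring(COMPONENT)
print(image_urls(root, root.find(".//" + tei("pb"))))
print(image_urls(root, root.find(".//" + tei("p"))))
```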

5.2. Structure of newspaper texts

The newspaper texts are encoded in the <text> element of corpus components. This element can contain <front> and <back> and must contain <body>. If they are used, <front> will typically contain the front matter of a newspaper issue, i.e. its banner, while <back> would contain material that does not fit in well with the article-based structure of a newspaper, e.g. advertisements. However, this material can also be included directly in the <body>, but at the risk of disrupting the flow of the article texts.

PressMint makes few assumptions on the structure of the texts in the TEI <body> and optional <front> and <back>. At minimum, they need to contain a series of paragraphs, i.e. <p> elements possibly with interspersed empty break elements, as discussed in the preceding Section on Connecting the text to the facsimile.

This kind of encoding is appropriate for text with little internal structure. However, if the text has been split into articles, these are encoded using the standard <div> element, possibly further qualified by the value of its type attribute, which will have a closed list of values, as illustrated below.
<text>
 <body>
  <div type="article">...</div>
  <div type="advertisement">...</div>
  <div type="article">...</div>
   ...
 </body>
</text>
Each division can also start with a <head> element giving the title of the article, and possibly contain other standard TEI elements, such as <byline> giving the article's author.

5.2.1. Gaps

For various reasons, such as too low OCR quality or portions of the text not being interesting to PressMint, such as tables, parts of the text can be omitted. To mark missing material, the <gap> element is used, with its reason attribute specifying why the material was omitted. If desired, a description of the omitted content can be given in the <desc> element of the <gap>, as illustrated below:
The city has provided us with a table giving the main expenditures and incomes:
<gap reason="editorial">
 <desc>Table of expenditures and incomes.</desc>
</gap>

6. Linguistic annotation

This section introduces the PressMint linguistic annotation. An important note is that a linguistically annotated PressMint corpus is stored separately from its base (or plain-text) TEI version, i.e. the version that has been discussed in the preceding sections. The encoding of the linguistically annotated version differs from the plain-text one in the following:

6.1. Linguistic markup

Linguistic annotation is added only to the text content of <p> elements. For this text, PressMint requires the following additional markup to be present:

  • tokens: what is a word, and what is punctuation, with preserved information on inter-token spaces;
  • sentences: what is a sentence;
  • normalised form (optional): what is the modernised spelling of archaically spelled words;
  • lemmas: what is the base form of each word;
  • Universal Dependencies (UD) part-of-speech and morphological features, and, optionally, part-of-speech tags from a different (local) tagset;
  • named entities (NE): what is a name, categorised at least into the standard four NE classes.

Below, we explain the encoding of each of these levels.

6.1.1. Word-level annotation

Basic linguistic annotation comprises tokenisation, sentence segmentation, part-of-speech tagging and lemmatisation, and this mark-up is illustrated in the example below:
<s>
 <w msd="UPosTag=DET|Case=Gen|Gender=Neut|Number=Sing|PronType=Dem"
  lemma="ta">Tega</w>
 <w msd="UPosTag=PRON|PronType=Prs|Reflex=Yes|Variant=Short"
  lemma="se">se</w>
 <w msd="UPosTag=PART" lemma="sploh">sploh</w>
 <w msd="UPosTag=AUX|Mood=Ind|Number=Sing|Person=1|Polarity=Neg|Tense=Pres|VerbForm=Fin"
  lemma="biti">nisem</w>
 <w msd="UPosTag=VERB|Aspect=Perf|Gender=Masc|Number=Sing|VerbForm=Part"
  lemma="zavesti" join="right">zavedel</w>
 <pc msd="UPosTag=PUNCT">.</pc>
</s>
Sentences are marked up using the <s> element, words with the <w> element and punctuation symbols with the <pc> element. To retain the linguistically significant whitespace, the join attribute with the fixed value right is used, meaning that there should be no whitespace to the right of the token. There can be an added complication with tokenisation, which is further taken up in the next Section on Text modernisation.

The base form, or lemma, of a word is given as the value of the lemma attribute; punctuation characters, <pc>, do not have this attribute.

The UD part-of-speech and morphological features are both packed into the msd attribute, with the part-of-speech given as the value of the UPosTag feature, and the individual features separated by the vertical bar.
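
The following Python sketch shows how this word-level markup might be read back: it reconstructs the running text of a sentence from its tokens and the join attribute, and splits an msd value into the UD part-of-speech and a dictionary of features. The function names sentence_text and split_msd are hypothetical, and the sample is given without the TEI namespace for brevity (the local_name helper strips a namespace if one is present).

```python
import xml.etree.ElementTree as ET

SENTENCE = """<s>
 <w msd="UPosTag=DET|Case=Gen|Gender=Neut|Number=Sing|PronType=Dem" lemma="ta">Tega</w>
 <w msd="UPosTag=PRON|PronType=Prs|Reflex=Yes|Variant=Short" lemma="se">se</w>
 <w msd="UPosTag=PART" lemma="sploh">sploh</w>
 <w msd="UPosTag=AUX|Mood=Ind|Number=Sing|Person=1|Polarity=Neg|Tense=Pres|VerbForm=Fin" lemma="biti">nisem</w>
 <w msd="UPosTag=VERB|Aspect=Perf|Gender=Masc|Number=Sing|VerbForm=Part" lemma="zavesti" join="right">zavedel</w>
 <pc msd="UPosTag=PUNCT">.</pc>
</s>"""


def local_name(el):
    """Return the element name without a namespace, if any."""
    return el.tag.rsplit("}", 1)[-1]


def sentence_text(s):
    """Rebuild the text of a sentence, honouring join="right"."""
    out = []
    for tok in s.iter():
        if local_name(tok) not in ("w", "pc"):
            continue
        form = (tok.text or "").strip()
        if not form:
            continue  # token without its own text content
        out.append(form)
        if tok.get("join") != "right":
            out.append(" ")
    return "".join(out).rstrip()


def split_msd(msd):
    """Split an msd value into (UD part-of-speech, feature dictionary)."""
    feats = dict(pair.split("=", 1) for pair in msd.split("|"))
    upos = feats.pop("UPosTag")
    return upos, feats


s = ET.fromstring(SENTENCE)
print(sentence_text(s))
print(split_msd(s[0].get("msd")))
```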

PressMint also allows (but does not require) part-of-speech tags from some other tagset3 to be added to the linguistic annotation. Where this information is encoded depends on the type of tagset.

Synthetic tagsets, such as the Penn Treebank tagset, have atomic tags that cannot always be decomposed into attribute-value pairs (e.g. the tag ‘TO’ for the word ‘to’). Such tags should be encoded using the pos attribute on words and punctuation symbols, as shown in the example below:
<s>
 <w lemma="I"
  msd="UPosTag=PRON|Case=Nom|Number=Sing|Person=1|PronType=Prs" pos="PRP">I</w>
 <w lemma="support"
  msd="UPosTag=VERB|Mood=Ind|Tense=Pres|VerbForm=Fin" pos="VBP">support</w>
 <w lemma="the"
  msd="UPosTag=DET|Definite=Def|PronType=Art" pos="DT">the</w>
 <w lemma="amendment"
  msd="UPosTag=NOUN|Number=Sing" pos="NN" join="right">amendment</w>
 <pc msd="UPosTag=PUNCT" pos=".">.</pc>
</s>
For analytic tagsets, where a part-of-speech tag can be always decomposed into a set of attribute-values, the pointing attribute ana should be used. An example of such a collection of tagsets for various languages is given in the MULTEXT-East morphosyntactic specifications, and we give below an example that uses this tagset:
<s>
 <w ana="mte:Vmpr1p" lemma="prehajati"
  msd="UPosTag=VERB|Aspect=Imp|Mood=Ind|Number=Plur|Person=1|Tense=Pres|VerbForm=Fin">Prehajamo</w>
 <w ana="mte:Sa" lemma="na"
  msd="UPosTag=ADP|Case=Acc">na</w>
 <w ana="mte:Ncnsa" join="right"
  lemma="odločanje"
  msd="UPosTag=NOUN|Case=Acc|Gender=Neut|Number=Sing">odločanje</w>
 <pc ana="mte:Z" msd="UPosTag=PUNCT">.</pc>
</s>
Here mte: is a prefix that is expanded via the TEI extended pointer syntax, as defined in the TEI header (cf. the Section on Prefix definitions), so that the value of such an ana attribute points to the expansion of the given tag into a feature structure. For example, the value mte:Vmpr1p would be expanded to https://nl.ijs.si/ME/V6/msd/tables/msd-fslib2-sl.xml#Vmpr1p, which then resolves to the feature structure below:
<fs xml:id="Vmpr1p" xml:lang="en"
 corresp="#Ggnspm">
 <f name="CATEGORY">
  <symbol value="Verb"/>
 </f>
 <f name="Type">
  <symbol value="main"/>
 </f>
 <f name="Aspect">
  <symbol value="progressive"/>
 </f>
 <f name="VForm">
  <symbol value="present"/>
 </f>
 <f name="Person">
  <symbol value="first"/>
 </f>
 <f name="Number">
  <symbol value="plural"/>
 </f>
</fs>

6.1.2. Text modernisation

The language of older newspapers might differ significantly from the contemporary norm. This has an impact on the quality of linguistic annotations in cases where the annotation tool has been trained on contemporary texts only, and it also hinders searching for particular words or lemmas in their contemporary spellings. To alleviate this, normalisation (i.e. modernisation) is often used on archaic texts, and the subsequent linguistic annotation is performed on such modernised text.

Modern neural approaches typically take a complete chunk of text and normalise it, while more traditional approaches perform the normalisation on individual words. The former has the advantage of being capable not only of modernising the spelling of individual words but also of substituting archaic words with their contemporary equivalents, and of modernising multi-word units or even syntactic constructions. However, if such a method is used on a PressMint corpus, this means that the linguistically annotated variant of the corpus will contain only the modernised text, and the alignment to the plain-text variant of the corpus will be at the paragraph level only. In other words, losing word-alignment with the original tokens also means losing the ability to search for or directly view the original tokens.

In contrast, traditional methods (such as cSMTiser) will typically normalise only the spelling of individual words, or, at most, sequences of words. This means that the text has to be first tokenised, normalisation applied to such (series of) tokens, and the resulting normalised word-tokens then linguistically annotated. Here both the original words and the normalised, annotated words are available in the linguistically annotated version of the corpus.

If the normalised word token is identical to the original one, then the annotation is exactly the same as for non-normalised text. If an original token is normalised into a single, but different token, then the norm attribute is used to record the value of the normalised token. These two cases are illustrated in the following example, where we also give the lemma of the words but, for simplicity, no further linguistic annotation:
<w lemma="lep">lepo</w>
<w norm="sonce" lemma="sonce">solnce</w>
A complication arises when one original token corresponds to several normalised tokens or vice versa. For these cases we use the same mechanism as was used in ParlaMint for splitting orthographic words into syntactic ones4, which is illustrated in the following two examples, the first where an archaic word was split into two contemporary ones, and the second where two archaic words form one contemporary word. Note that the linguistic annotation is given only to the normalised forms:
<w>neſkèrbite <w norm="ne" lemma="ne"/>
 <w norm="skrbite" lemma="skrbeti"/>
</w>
...
<w norm="najmanjši" lemma="lep">
 <w>nar</w>
 <w>manſhi</w>
</w>
Also note that if such nested tokens do not have a following space, join="right" should be added to the top level word as well as to the last nested word.
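
The four cases discussed in this section (identical, one-to-one, one-to-many and many-to-one normalisation) can be told apart mechanically, as the following Python sketch suggests. The function name original_and_normalised is hypothetical, the lemma attributes are left out of the sample for brevity, and the sample is given without the TEI namespace, so the code would need namespace-qualified names to run on a real corpus.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<s>
 <w>lepo</w>
 <w norm="sonce">solnce</w>
 <w>neſkèrbite <w norm="ne"/><w norm="skrbite"/></w>
 <w norm="najmanjši"><w>nar</w><w>manſhi</w></w>
</s>"""


def original_and_normalised(s):
    """Return (original tokens, normalised tokens) for each top-level word."""
    pairs = []
    for w in s.findall("w"):  # direct children only, i.e. top-level words
        nested = w.findall("w")
        own = (w.text or "").strip()
        if not nested:
            # unchanged word, or one-to-one normalisation via @norm
            pairs.append(([own], [w.get("norm", own)]))
        elif own:
            # one original token split into several normalised ones
            pairs.append(([own], [n.get("norm") for n in nested]))
        else:
            # several original tokens merged into one normalised token
            originals = [(n.text or "").strip() for n in nested]
            pairs.append((originals, [w.get("norm")]))
    return pairs


for originals, normalised in original_and_normalised(ET.fromstring(SAMPLE)):
    print(" ".join(originals), "->", " ".join(normalised))
```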

6.1.3. Named entities

PressMint also requires annotation of Named Entities (NE), which should be categorised into the following four types:

  • PER: person
  • LOC: location
  • ORG: organisation
  • MISC: miscellaneous
The identified names and their type are marked up as the <name> element with the appropriate value of its type attribute, as shown in the example below:
...
<w lemma="and" msd="UPosTag=CCONJ">and</w>
<name type="ORG">
 <w lemma="Westminster"
  msd="UPosTag=PROPN|Number=Sing">Westminster</w>
 <w join="right" lemma="Hall"
  msd="UPosTag=PROPN|Number=Sing">Hall</w>
</name>
<pc msd="UPosTag=PUNCT">,</pc>
...
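
As a brief illustration of how this markup can be consumed, the Python sketch below lists the named entities of a sentence together with their types. The function name named_entities is hypothetical, the sample is wrapped in an <s> element and given without the TEI namespace, and the tokens of a name are simply joined with spaces, which ignores the join attribute.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<s>
 <w lemma="and" msd="UPosTag=CCONJ">and</w>
 <name type="ORG">
  <w lemma="Westminster" msd="UPosTag=PROPN|Number=Sing">Westminster</w>
  <w join="right" lemma="Hall" msd="UPosTag=PROPN|Number=Sing">Hall</w>
 </name>
 <pc msd="UPosTag=PUNCT">,</pc>
</s>"""


def named_entities(s):
    """Return a list of (type, text) pairs for the <name> elements in s."""
    entities = []
    for name in s.iter("name"):
        tokens = [
            (tok.text or "").strip()
            for tok in name.iter()
            if tok.tag in ("w", "pc")
        ]
        entities.append((name.get("type"), " ".join(t for t in tokens if t)))
    return entities


print(named_entities(ET.fromstring(SAMPLE)))
```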

6.2. Metadata for linguistic annotation

What kind of metadata a plain-text PressMint corpus should contain was explained in the Section on Corpus metadata and in this section we detail what additions must be made to the metadata for the linguistically annotated version. Note that the other changes for this version of a corpus have been already explained at the start of this Chapter. For PressMint, this information has been, so far, simplified in comparison with ParlaMint; we do not foresee linguistic taxonomies (in particular, for NER), so there are one obligatory and one optional metadata element dedicated to linguistic processing. Both are added to the <teiHeader> of the root of the linguistically analysed corpus, namely a description of the tool(s) used to linguistically annotate the corpus, and optional taxonomies of corpus-specific PoS tags.

6.2.1. Application information for linguistic processing

As the linguistic analysis of a PressMint corpus will be performed by a tool, the information on which tool (or tools) have been used should be documented in the corpus root TEI header. This information is encoded in the <appInfo> element of the <encodingDesc>, as shown in the example below:
<appInfo>
 <application version="1.0" ident="classla">
  <label>CLASSLA</label>
  <desc xml:lang="en">Linguistic processing performed with CLASSLA trained for
     Slovene, available from <ref target="https://github.com/clarinsi/classla">https://github.com/clarinsi/classla</ref>.</desc>
 </application>
</appInfo>
The <appInfo> element contains, in general, a series of <application> elements, each one giving the information on one tool. Each <application> gives the version number of the tool in its version attribute and specifies, via ident, an identifying code. It has two subordinate elements, with <label> giving the name of the tool and <desc> a short description of it, preferably with a pointer to the URL where it can be found or is at least documented.

7. Validation and conversion

This chapter explains how to validate and finalise a PressMint corpus, and introduces scripts for converting a PressMint corpus to other, derived formats.

7.1. Validating PressMint corpora

The XML structure of PressMint corpora can be validated via a RelaxNG schema produced as a customisation of the TEI Guidelines.

The TEI customisation is written as a TEI ODD document, which is, in fact, the XML source of this document, and is available in the TEI/ directory of the PressMint GitHub repository. The XML contains not only the prose guidelines but also the formal specification of the TEI schema. In the on-line version this specification is converted to a reference of all the elements, attributes and classes used in PressMint corpora, given in Appendix A. There are quite a lot of them, as the PressMint schema has been left open enough to accommodate differing requirements in the encoding.

The ODD document is not immediately useful for XML validation but has to be converted with the standard TEI XSLT stylesheets to a RelaxNG schema. The TEI ODD, its RelaxNG schema (PressMint.rng) and the HTML guidelines are always kept in sync. This schema should be used to check that PressMint component files validate against TEI, typically using Jing (cf. the Section on Contributing to PressMint).

7.2. Finalisation of corpora

While the vast majority of the work of converting source encodings into the PressMint corpus format is left to the compilers of a corpus, a few metadata elements can be produced by a common script on the basis of a nearly finished corpus, which then results in the final version of the corpus for a particular release. This includes setting the date, edition and handle under which the corpus will be distributed, and also calculating the size of the corpus (cf. the Sections on Extents and on Tags declaration). The script for finalisation can be found in the Scripts/ directory of the PressMint GitHub repository and the README file briefly explains its function; more comments can be found in the script itself.

7.3. Conversions

A TEI encoded document is, in general, not meant to be used directly by software programs; rather, it serves as an interchange and storage format. The PressMint project has produced various scripts to down-convert the XML encoded corpora to other formats, and they can be found in the Scripts/ directory of the PressMint GitHub repository, with the README file listing them and explaining their function. In short, the scripts convert the PressMint XML to plain text, to CoNLL-U, and to vertical format. There is also a script that takes a PressMint corpus and makes from it a sample for inclusion in the PressMint GitHub repository.

8. Contributing to PressMint

The PressMint GitHub repository contains these guidelines, the PressMint XML schemas, the scripts used to validate, finalise and convert the PressMint TEI XML corpora to derived formats, and samples of the PressMint corpora. There are four main branches in the repository:

The validation procedure for corpora is explained in the Section on Validating PressMint corpora, while the technical aspects of contributing corpora are further explained in the CONTRIBUTING file of the repository.

9. Acknowledgements

The work on these recommendations was funded by the CLARIN Research Infrastructure for Language Resources and Tools.

Appendix A Formal specification

Appendix A.1 Elements

Appendix A.1.1 <TEI>

<TEI> (TEI document) contains a single TEI-conformant document, combining a single TEI header with one or more members of the model.resource class. Multiple <TEI> elements may be combined within a <TEI> (or <teiCorpus>) element. [4. Default Text Structure 16.1. Varieties of Composite Text]
Module: textstructure (Formal specification)
Attributes
version: specifies the version number of the TEI Guidelines against which this document is valid.
Status: Optional
Datatype: teidata.version
Note

Major editions of the Guidelines have long been informally referred to by a name made up of the letter P (for Proposal) followed by a digit. The current release is one of the many releases of the fifth major edition of the Guidelines, known as P5. This attribute may be used to associate a TEI document with a specific release of the P5 Guidelines, in the absence of a more precise association provided by the source attribute on the associated <schemaSpec>.

Member of
Contained by
core: teiCorpus
textstructure: TEI
May contain
header: teiHeader
textstructure: TEI text
Note

As with all elements in the TEI scheme (except <egXML>) this element is in the TEI namespace (see 5.7.2. Namespaces). Thus, when it is used as the outermost element of a TEI document, it is necessary to specify the TEI namespace on it. This is customarily achieved by including http://www.tei-c.org/ns/1.0 as the value of the XML namespace declaration (xmlns), without indicating a prefix, and then not using a prefix on TEI elements in the rest of the document. For example: <TEI version="4.8.1" xml:lang="it" xmlns="http://www.tei-c.org/ns/1.0">.

Example
<TEI version="3.3.0" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader>   <fileDesc>    <titleStmt>     <title>The shortest TEI Document Imaginable</title>    </titleStmt>    <publicationStmt>     <p>First published as part of TEI P2, this is the P5          version using a namespace.</p>    </publicationStmt>    <sourceDesc>     <p>No source: this is an original work.</p>    </sourceDesc>   </fileDesc>  </teiHeader>  <text>   <body>    <p>This is about the shortest TEI document imaginable.</p>   </body>  </text> </TEI>
Example
<TEI version="2.9.1" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader>   <fileDesc>    <titleStmt>     <title>A TEI Document containing four page images </title>    </titleStmt>    <publicationStmt>     <p>Unpublished demonstration file.</p>    </publicationStmt>    <sourceDesc>     <p>No source: this is an original work.</p>    </sourceDesc>   </fileDesc>  </teiHeader>  <facsimile>   <graphic url="page1.png"/>   <graphic url="page2.png"/>   <graphic url="page3.png"/>   <graphic url="page4.png"/>  </facsimile> </TEI>
Content model
<content>
 <sequence>
  <elementRef key="teiHeader"/>
  <alternate>
   <sequence>
    <classRef key="model.resource"
     minOccurs="1" maxOccurs="unbounded"/>
    <elementRef key="TEI" minOccurs="0"
     maxOccurs="unbounded"/>
   </sequence>
   <elementRef key="TEI" minOccurs="1"
    maxOccurs="unbounded"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element TEI
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   attribute version { text }?,
   ( tei_teiHeader, ( ( tei_model.resource+, tei_TEI* ) | tei_TEI+ ) )
}

Appendix A.1.2 <addSpan>

<addSpan> (added span of text) marks the beginning of a longer sequence of text added by an author, scribe, annotator or corrector (see also <add>). [12.3.1.4. Additions and Deletions]
Module: transcr (Formal specification)
Attributes
Member of
Contained by
May contain: Empty element
Note

Both the beginning and the end of the added material must be marked; the beginning by the <addSpan> element itself, the end by the spanTo attribute.

Example
<handNote xml:id="HEOL"
 scribe="HelgiÓlafsson"/>
<!-- ... -->
<body>
 <div>
<!-- text here -->
 </div>
 <addSpan n="added_gathering" hand="#HEOL"
  spanTo="#P025"/>
 <div>
<!-- text of first added poem here -->
 </div>
 <div>
<!-- text of second added poem here -->
 </div>
 <div>
<!-- text of third added poem here -->
 </div>
 <div>
<!-- text of fourth added poem here -->
 </div>
 <anchor xml:id="P025"/>
 <div>
<!-- more text here -->
 </div>
</body>
Schematron
<sch:rule context="tei:addSpan"> <sch:assert test="@spanTo">The @spanTo attribute of <sch:name/> is required.</sch:assert> </sch:rule>
Schematron
<sch:rule context="tei:addSpan"> <sch:assert test="@spanTo">L'attribut spanTo est requis.</sch:assert> </sch:rule>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element addSpan
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.spanning.attributes,
   tei_att.transcriptional.attributes,
   tei_att.typed.attributes,
   empty
}

Appendix A.1.3 <appInfo>

<appInfo> (application information) records information about an application which has edited the TEI file. [2.3.11. The Application Information Element]
Module: header (Formal specification)
Attributes
Member of
Contained by
header: encodingDesc
May contain
header: application
Example
<appInfo>
 <application version="1.24" ident="Xaira">
  <label>XAIRA Indexer</label>
  <ptr target="#P1"/>
 </application>
</appInfo>
Content model
<content>
 <classRef key="model.applicationLike"
  minOccurs="1" maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element appInfo { tei_att.global.attributes, tei_model.applicationLike+ }

Appendix A.1.4 <application>

<application> provides information about an application which has acted upon the document. [2.3.11. The Application Information Element]
Module: header — Formal specification
Attributes
ident supplies an identifier for the application, independent of its version number or display name.
Status: Required
Datatype: teidata.name
version supplies a version number for the application, independent of its identifier or display name.
Status: Required
Datatype: teidata.versionNumber
Member of
Contained by
header: appInfo
May contain
core: desc label p ref
Example
<appInfo>  <application version="1.5"   ident="ImageMarkupTool1" notAfter="2006-06-01">   <label>Image Markup Tool</label>   <ptr target="#P1"/>   <ptr target="#P2"/>  </application> </appInfo>
This example shows an <appInfo> element documenting the fact that version 1.5 of the Image Markup Tool application (identified as ImageMarkupTool1) has an interest in two parts of a document which was last saved no later than 1 June 2006, as recorded by the notAfter attribute. The parts concerned are accessible at the URLs given as target for the two <ptr> elements.
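The content model also permits the labels to be followed by paragraphs rather than pointers. The following is a minimal sketch of that alternative; the tool name, identifier, version number, and wording are hypothetical and are not taken from any PressMint corpus:

```xml
<appInfo>
 <!-- ident and version are both required; the values here are hypothetical -->
 <application ident="ExampleOCR" version="2.0">
  <label>Example OCR engine</label>
  <p>Used to recognise the text of the scanned newspaper pages.</p>
 </application>
</appInfo>
```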
Content model
<content>
 <sequence>
  <classRef key="model.labelLike"
   minOccurs="1" maxOccurs="unbounded"/>
  <alternate>
   <classRef key="model.ptrLike"
    minOccurs="0" maxOccurs="unbounded"/>
   <classRef key="model.pLike"
    minOccurs="0" maxOccurs="unbounded"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element application
{
   tei_att.global.attributes,
   tei_att.datable.attributes,
   tei_att.typed.attributes,
   attribute ident { text },
   attribute version { text },
   ( tei_model.labelLike+, ( tei_model.ptrLike* | tei_model.pLike* ) )
}

Appendix A.1.5 <availability>

<availability> (availability) supplies information about the availability of a text, for example any restrictions on its use or distribution, its copyright status, any licence applying to it, etc. [2.2.4. Publication, Distribution, Licensing, etc.]
Module: header — Formal specification
Attributes
status (status) supplies a code identifying the current availability of the text.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
free
(free) the text is freely available.
unknown
(unknown) the status of the text is unknown.
restricted
(restricted) the text is not freely available.
Member of
Contained by
core: bibl
May contain
core: p
header: licence
Note

A consistent format should be adopted.

Example
<availability status="restricted">  <p>Available for academic research purposes only.</p> </availability> <availability status="free">  <p>In the public domain</p> </availability> <availability status="restricted">  <p>Available under licence from the publishers.</p> </availability>
Example
<availability>  <licence target="http://opensource.org/licenses/MIT">   <p>The MIT License      applies to this document.</p>   <p>Copyright (C) 2011 by The University of Victoria</p>   <p>Permission is hereby granted, free of charge, to any person obtaining a copy      of this software and associated documentation files (the "Software"), to deal      in the Software without restriction, including without limitation the rights      to use, copy, modify, merge, publish, distribute, sublicense, and/or sell      copies of the Software, and to permit persons to whom the Software is      furnished to do so, subject to the following conditions:</p>   <p>The above copyright notice and this permission notice shall be included in      all copies or substantial portions of the Software.</p>   <p>THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR      IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,      FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE      AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER      LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,      OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN      THE SOFTWARE.</p>  </licence> </availability>
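The status attribute and the <licence> element can be combined. The sketch below shows how a freely available corpus might state its licence; the choice of licence is an assumption made for illustration only and says nothing about the licence of any actual PressMint corpus:

```xml
<availability status="free">
 <!-- target points to the licence text; the licence chosen here is illustrative -->
 <licence target="https://creativecommons.org/licenses/by/4.0/">
  <p>This work is licensed under the Creative Commons Attribution 4.0 International licence.</p>
 </licence>
</availability>
```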
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:availability"/> </sch:pattern>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.availabilityPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
    
Schema Declaration
element availability
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   attribute status { "free" | "unknown" | "restricted" }?,
   ( tei_model.availabilityPart | tei_model.pLike )+
}

Appendix A.1.6 <back>

<back> (back matter) contains any appendixes, etc. following the main part of a text. [4.7. Back Matter 4. Default Text Structure]
Module: textstructure — Formal specification
Attributes
Contained by
textstructure: text
transcr: facsimile
May contain
Note

Because cultural conventions differ as to which elements are grouped as back matter and which as front matter, the content models for the <back> and <front> elements are identical.

Example
<back>  <div type="appendix">   <head>The Golden Dream or, the Ingenuous Confession</head>   <p>TO shew the Depravity of human Nature, and how apt the Mind is to be misled by Trinkets      and false Appearances, Mrs. Two-Shoes does acknowledge, that after she became rich, she      had like to have been, too fond of Money <!-- .... -->   </p>  </div> <!-- ... -->  <div type="epistle">   <head>A letter from the Printer, which he desires may be inserted</head>   <salute>Sir.</salute>   <p>I have done with your Copy, so you may return it to the Vatican, if you please;    <!-- ... -->   </p>  </div>  <div type="advert">   <head>The Books usually read by the Scholars of Mrs Two-Shoes are these and are sold at Mr      Newbery's at the Bible and Sun in St Paul's Church-yard.</head>   <list>    <item n="1">The Christmas Box, Price 1d.</item>    <item n="2">The History of Giles Gingerbread, 1d.</item> <!-- ... -->    <item n="42">A Curious Collection of Travels, selected from the Writers of all Nations,        10 Vol, Pr. bound 1l.</item>   </list>  </div>  <div type="advert">   <head>By the KING's Royal Patent, Are sold by J. NEWBERY, at the Bible and Sun in St.      Paul's Church-Yard.</head>   <list>    <item n="1">Dr. James's Powders for Fevers, the Small-Pox, Measles, Colds, &amp;c. 2s.        6d</item>    <item n="2">Dr. Hooper's Female Pills, 1s.</item> <!-- ... -->   </list>  </div> </back>
Content model
<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.frontPart"/>
   <classRef key="model.pLike.front"/>
   <classRef key="model.pLike"/>
   <classRef key="model.listLike"/>
   <classRef key="model.global"/>
  </alternate>
  <alternate minOccurs="0">
   <sequence>
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.frontPart"/>
     <classRef key="model.div1Like"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
   <sequence>
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.frontPart"/>
     <classRef key="model.divLike"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0">
   <classRef key="model.divBottomPart"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.divBottomPart"/>
    <classRef key="model.global"/>
   </alternate>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element back
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   (
      (
         tei_model.frontPart
       | tei_model.pLike.front
       | tei_model.pLike
       | tei_model.listLike
       | tei_model.global
      )*,
      (
         (
            tei_model.div1Like,
            ( tei_model.frontPart | tei_model.div1Like | tei_model.global )*
         )
       | (
            tei_model.divLike,
            ( tei_model.frontPart | tei_model.divLike | tei_model.global )*
         )
      )?,
      (
         (
            tei_model.divBottomPart,
            ( tei_model.divBottomPart | tei_model.global )*
         )?
      )
   )
}

Appendix A.1.7 <bibl>

<bibl> (bibliographic citation) contains a loosely-structured bibliographic citation of which the sub-components may or may not be explicitly tagged. [3.12.1. Methods of Encoding Bibliographic References and Lists of References 2.2.7. The Source Description 16.3.2. Declarable Elements]
Module: core — Formal specification
Attributes
Member of
Contained by
May contain
Note

Contains phrase-level elements, together with any combination of elements from the model.biblPart class

Example
<bibl>Blain, Clements and Grundy: Feminist Companion to Literature in English (Yale, 1990)</bibl>
Example
<bibl>  <title level="a">The Interesting story of the Children in the Wood</title>. In <author>Victor E Neuberg</author>, <title>The Penny Histories</title>. <publisher>OUP</publisher>  <date>1968</date>. </bibl>
Example
<bibl type="article" subtype="book_chapter"  xml:id="carlin_2003">  <author>   <name>    <surname>Carlin</surname>      (<forename>Claire</forename>)</name>  </author>, <title level="a">The Staging of Impotence : France’s last    congrès</title> dans <bibl type="monogr">   <title level="m">Theatrum mundi : studies in honor of Ronald W.      Tobin</title>, éd.  <editor>    <name>     <forename>Claire</forename>     <surname>Carlin</surname>    </name>   </editor> et  <editor>    <name>     <forename>Kathleen</forename>     <surname>Wine</surname>    </name>   </editor>,  <pubPlace>Charlottesville, Va.</pubPlace>,  <publisher>Rookwood Press</publisher>,  <date when="2003">2003</date>.  </bibl> </bibl>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde" value="tei:bibl"/> </sch:pattern>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.highlighted"/>
  <classRef key="model.pPart.data"/>
  <classRef key="model.pPart.edit"/>
  <classRef key="model.segLike"/>
  <classRef key="model.ptrLike"/>
  <classRef key="model.biblPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration
element bibl
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_att.cmc.attributes,
   tei_att.declarable.attributes,
   tei_att.docStatus.attributes,
   tei_att.sortable.attributes,
   tei_att.typed.attributes,
   (
      text
    | tei_model.gLike
    | tei_model.highlighted
    | tei_model.pPart.data
    | tei_model.pPart.edit
    | tei_model.segLike
    | tei_model.ptrLike
    | tei_model.biblPart
    | tei_model.global
   )*
}

Appendix A.1.8 <body>

<body> (text body) contains the whole body of a single unitary text, excluding any front or back matter. [4. Default Text Structure]
Module: textstructure — Formal specification
Attributes
Contained by
textstructure: text
May contain
Example
<body>  <l>Nu scylun hergan hefaenricaes uard</l>  <l>metudæs maecti end his modgidanc</l>  <l>uerc uuldurfadur sue he uundra gihuaes</l>  <l>eci dryctin or astelidæ</l>  <l>he aerist scop aelda barnum</l>  <l>heben til hrofe haleg scepen.</l>  <l>tha middungeard moncynnæs uard</l>  <l>eci dryctin æfter tiadæ</l>  <l>firum foldu frea allmectig</l>  <trailer>primo cantauit Cædmon istud carmen.</trailer> </body>
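For newspaper material, the content model above also allows the body to consist of a sequence of divisions. The following is a minimal sketch, assuming that each article is encoded as one <div>; the type value, headings, and text are hypothetical and are not prescribed by this specification:

```xml
<body>
 <!-- each div holds one article; the type value is illustrative -->
 <div type="article">
  <head>Local news</head>
  <p>First paragraph of the article.</p>
  <p>Second paragraph of the article.</p>
 </div>
 <div type="article">
  <head>Weather</head>
  <p>Text of the second article.</p>
 </div>
</body>
```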
Content model
<content>
 <sequence>
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="0">
   <classRef key="model.divTop"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.global"/>
    <classRef key="model.divTop"/>
   </alternate>
  </sequence>
  <sequence minOccurs="0">
   <classRef key="model.divGenLike"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.global"/>
    <classRef key="model.divGenLike"/>
   </alternate>
  </sequence>
  <alternate>
   <sequence minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.divLike"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.div1Like"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.global"/>
     <classRef key="model.divGenLike"/>
    </alternate>
   </sequence>
   <sequence>
    <sequence minOccurs="1"
     maxOccurs="unbounded">
     <alternate minOccurs="1" maxOccurs="1">
      <elementRef key="schemaSpec"/>
      <classRef key="model.common"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <alternate minOccurs="0">
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <classRef key="model.divLike"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">
       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <classRef key="model.div1Like"/>
      <alternate minOccurs="0"
       maxOccurs="unbounded">
       <classRef key="model.global"/>
       <classRef key="model.divGenLike"/>
      </alternate>
     </sequence>
    </alternate>
   </sequence>
  </alternate>
  <sequence minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.divBottom"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element body
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   (
      tei_model.global*,
      ( ( tei_model.divTop, ( tei_model.global | tei_model.divTop )* )? ),
      (
         ( tei_model.divGenLike, ( tei_model.global | tei_model.divGenLike )* )?
      ),
      (
         (
            ( tei_model.divLike, ( tei_model.global | tei_model.divGenLike )* )+
         )
       | (
            (
               tei_model.div1Like,
               ( tei_model.global | tei_model.divGenLike )*
            )+
         )
       | (
            ( ( ( schemaSpec | tei_model.common ), tei_model.global* )+ ),
            (
               (
                  (
                     tei_model.divLike,
                     ( tei_model.global | tei_model.divGenLike )*
                  )+
               )
             | (
                  (
                     tei_model.div1Like,
                     ( tei_model.global | tei_model.divGenLike )*
                  )+
               )
            )?
         )
      ),
      ( ( tei_model.divBottom, tei_model.global* )* )
   )
}

Appendix A.1.9 <catDesc>

<catDesc> (category description) describes some category within a taxonomy or text typology, either in the form of a brief prose description or in terms of the situational parameters used by the TEI formal <textDesc>. [2.3.7. The Classification Declaration]
Module: header — Formal specification
Attributes
Contained by
header: category
May contain
header: idno
transcr: ex subst
character data
Example
<catDesc>Prose reportage</catDesc>
Example
<catDesc>  <textDesc n="novel">   <channel mode="w">print; part issues</channel>   <constitution type="single"/>   <derivation type="original"/>   <domain type="art"/>   <factuality type="fiction"/>   <interaction type="none"/>   <preparedness type="prepared"/>   <purpose type="entertain" degree="high"/>   <purpose type="inform" degree="medium"/>  </textDesc> </catDesc>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.limitedPhrase"/>
  <classRef key="model.catDescPart"/>
 </alternate>
</content>
    
Schema Declaration
element catDesc
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   ( text | tei_model.limitedPhrase | tei_model.catDescPart )*
}

Appendix A.1.10 <catRef>

<catRef> (category reference) specifies one or more defined categories within some taxonomy or text typology. [2.4.3. The Text Classification]
Module: header — Formal specification
Attributes
scheme identifies the classification scheme within which the set of categories concerned is defined, for example by a <taxonomy> element, or by some other resource.
Status: Optional
Datatype: teidata.pointer
Contained by
header: textClass
May contain: Empty element
Note

The scheme attribute needs to be supplied only if more than one taxonomy has been declared.

Example
<catRef scheme="#myTopics"  target="#news #prov #sales2"/> <!-- elsewhere --> <taxonomy xml:id="myTopics">  <category xml:id="news">   <catDesc>Newspapers</catDesc>  </category>  <category xml:id="prov">   <catDesc>Provincial</catDesc>  </category>  <category xml:id="sales2">   <catDesc>Low to average annual sales</catDesc>  </category> </taxonomy>
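When only one taxonomy is declared, the scheme attribute can be left out, as the note above states. A minimal sketch of that case, reusing the identifiers of the preceding example:

```xml
<!-- only one taxonomy is declared, so scheme is omitted -->
<catRef target="#news #prov"/>
<!-- elsewhere -->
<taxonomy xml:id="myTopics">
 <category xml:id="news">
  <catDesc>Newspapers</catDesc>
 </category>
 <category xml:id="prov">
  <catDesc>Provincial</catDesc>
 </category>
</taxonomy>
```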
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element catRef
{
   tei_att.global.attributes,
   tei_att.pointing.attributes,
   attribute scheme { text }?,
   empty
}

Appendix A.1.11 <category>

<category> (category) contains an individual descriptive category, possibly nested within a superordinate category, within a user-defined taxonomy. [2.3.7. The Classification Declaration]
Module: header — Formal specification
Attributes
Contained by
May contain
core: desc
Example
<category xml:id="b1">  <catDesc>Prose reportage</catDesc> </category>
Example
<category xml:id="b2">  <catDesc>Prose </catDesc>  <category xml:id="b11">   <catDesc>journalism</catDesc>  </category>  <category xml:id="b12">   <catDesc>fiction</catDesc>  </category> </category>
Example
<category xml:id="LIT">  <catDesc xml:lang="pl">literatura piękna</catDesc>  <catDesc xml:lang="en">fiction</catDesc>  <category xml:id="LPROSE">   <catDesc xml:lang="pl">proza</catDesc>   <catDesc xml:lang="en">prose</catDesc>  </category>  <category xml:id="LPOETRY">   <catDesc xml:lang="pl">poezja</catDesc>   <catDesc xml:lang="en">poetry</catDesc>  </category>  <category xml:id="LDRAMA">   <catDesc xml:lang="pl">dramat</catDesc>   <catDesc xml:lang="en">drama</catDesc>  </category> </category>
Content model
<content>
 <sequence>
  <alternate>
   <elementRef key="catDesc" minOccurs="1"
    maxOccurs="unbounded"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.descLike"/>
    <elementRef key="equiv"/>
    <elementRef key="gloss"/>
   </alternate>
  </alternate>
  <elementRef key="category" minOccurs="0"
   maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element category
{
   tei_att.global.attributes,
   tei_att.datcat.attributes,
   (
      ( tei_catDesc+ | ( tei_model.descLike | equiv | gloss )* ),
      tei_category*
   )
}

Appendix A.1.12 <change>

<change> (change) documents a change or set of changes made during the production of a source document, or during the revision of an electronic file. [2.6. The Revision Description 2.4.1. Creation 12.7. Identifying Changes and Revisions]
Module: header — Formal specification
Attributes
target (target) points to one or more elements that belong to this change.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Contained by
header: revisionDesc
May contain
Note

The who attribute may be used to point to any other element, but will typically specify a <respStmt> or <person> element elsewhere in the header, identifying the person responsible for the change and their role in making it.

It is recommended that changes be recorded with the most recent first. The status attribute may be used to indicate the status of a document following the change documented.

Example
<titleStmt>  <title> ... </title>  <editor xml:id="LDB">Lou Burnard</editor>  <respStmt xml:id="BZ">   <resp>copy editing</resp>   <name>Brett Zamir</name>  </respStmt> </titleStmt> <!-- ... --> <revisionDesc status="published">  <change who="#BZ" when="2008-02-02"   status="public">Finished chapter 23</change>  <change who="#BZ" when="2008-01-02"   status="draft">Finished chapter 2</change>  <change n="P2.2" when="1991-12-21"   who="#LDB">Added examples to section 3</change>  <change when="1991-11-11" who="#MSM">Deleted chapter 10</change> </revisionDesc>
Example
<profileDesc>  <creation>   <listChange>    <change xml:id="DRAFT1">First draft in pencil</change>    <change xml:id="DRAFT2"     notBefore="1880-12-09">First revision, mostly        using green ink</change>    <change xml:id="DRAFT3"     notBefore="1881-02-13">Final corrections as        supplied to printer.</change>   </listChange>  </creation> </profileDesc>
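Neither example above uses the target attribute. The following sketch shows one way it might be used to tie a revision to the elements it affected, with the most recent change listed first; the dates, identifiers, and descriptions are hypothetical:

```xml
<revisionDesc>
 <!-- most recent change first; target lists the elements affected -->
 <change when="2025-09-16" target="#art1 #art2">Corrected OCR errors in two articles</change>
 <change when="2025-09-01">Initial conversion to TEI</change>
</revisionDesc>
<!-- elsewhere in the text -->
<div xml:id="art1"> <!-- ... --> </div>
<div xml:id="art2"> <!-- ... --> </div>
```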
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration
element change
{
   tei_att.global.attributes,
   tei_att.ascribed.attributes,
   tei_att.datable.attributes,
   tei_att.docStatus.attributes,
   tei_att.typed.attributes,
   attribute target { list { + } }?,
   tei_macro.specialPara
}

Appendix A.1.13 <char>

<char> (character) provides descriptive information about a character. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module: gaiji — Formal specification
Attributes
Contained by
gaiji: charDecl
May contain
Example
<char xml:id="circledU4EBA">  <localProp name="Name"   value="CIRCLED IDEOGRAPH 4EBA"/>  <localProp name="daikanwa" value="36"/>  <unicodeProp name="Decomposition_Mapping"   value="circle"/>  <mapping type="standard"></mapping> </char>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="unicodeProp"/>
  <elementRef key="unihanProp"/>
  <elementRef key="localProp"/>
  <elementRef key="mapping"/>
  <elementRef key="figure"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.noteLike"/>
  <classRef key="model.descLike"/>
 </alternate>
</content>
    
Schema Declaration
element char
{
   tei_att.global.attributes,
   (
      tei_unicodeProp
    | tei_unihanProp
    | tei_localProp
    | tei_mapping
    | figure
    | tei_model.graphicLike
    | tei_model.noteLike
    | tei_model.descLike
   )*
}

Appendix A.1.14 <charDecl>

<charDecl> (character declarations) provides information about nonstandard characters and glyphs. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module: gaiji — Formal specification
Attributes
Member of
Contained by
header: encodingDesc
May contain
core: desc
gaiji: char glyph
Example
<charDecl>  <char xml:id="aENL">   <unicodeProp name="Name"    value="LATIN LETTER ENLARGED SMALL A"/>   <mapping type="standard">a</mapping>  </char> </charDecl>
Content model
<content>
 <sequence>
  <elementRef key="desc" minOccurs="0"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="char"/>
   <elementRef key="glyph"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element charDecl
{
   tei_att.global.attributes,
   ( tei_desc?, ( tei_char | tei_glyph )+ )
}

Appendix A.1.15 <classDecl>

<classDecl> (classification declarations) contains one or more taxonomies defining any classificatory codes used elsewhere in the text. [2.3.7. The Classification Declaration 2.3. The Encoding Description]
Module: header — Formal specification
Attributes
Member of
Contained by
header: encodingDesc
May contain
header: taxonomy
Example
<classDecl>  <taxonomy xml:id="LCSH">   <bibl>Library of Congress Subject Headings</bibl>  </taxonomy> </classDecl> <!-- ... --> <textClass>  <keywords scheme="#LCSH">   <term>Political science</term>   <term>United States -- Politics and government --      Revolution, 1775-1783</term>  </keywords> </textClass>
Content model
<content>
 <elementRef key="taxonomy" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element classDecl { tei_att.global.attributes, tei_taxonomy+ }

Appendix A.1.16 <correction>

<correction> (correction principles) states how and under what circumstances corrections have been made in the text. [2.3.3. The Editorial Practices Declaration 16.3.2. Declarable Elements]
Module: header — Formal specification
Attributes
status indicates the degree of correction applied to the text.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
high
the text has been thoroughly checked and proofread.
medium
the text has been checked at least once.
low
the text has not been checked.
unknown
the correction status of the text is unknown.
method indicates the method adopted to indicate corrections within the text.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
silent
corrections have been made silently [Default]
markup
corrections have been represented using markup
Member of
Contained by
May contain
core: p
Note

May be used to note the results of proof reading the text against its original, indicating (for example) whether discrepancies have been silently rectified, or recorded using the editorial tags described in section 3.5. Simple Editorial Changes.

Example
<correction>  <p>Errors in transcription controlled by using the WordPerfect spelling checker, with a user    defined dictionary of 500 extra words taken from Chambers Twentieth Century    Dictionary.</p> </correction>
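The example above does not use the status and method attributes. A minimal sketch of how they might be set for an OCR-derived corpus follows; the attribute values come from the lists above, while the described workflow is hypothetical:

```xml
<!-- status: checked at least once; method: corrections are visible in the markup -->
<correction status="medium" method="markup">
 <p>OCR errors were checked once against the page images; corrections are
    recorded in the text using markup rather than applied silently.</p>
</correction>
```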
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:correction"/> </sch:pattern>
Content model
<content>
 <classRef key="model.pLike" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element correction
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   attribute status { "high" | "medium" | "low" | "unknown" }?,
   attribute method { "silent" | "markup" }?,
   tei_model.pLike+
}

Appendix A.1.17 <damage>

<damage> (damage) contains an area of damage to the text witness. [12.3.3.1. Damage, Illegibility, and Supplied Text]
Module: transcr — Formal specification
Attributes
Member of
Contained by
May contain
Note

Since damage to text witnesses frequently makes them harder to read, the <damage> element will often contain an <unclear> element. If the damaged area is not continuous (e.g. a stain affecting several strings of text), the group attribute may be used to group together several related <damage> elements; alternatively the <join> element may be used to indicate which <damage> and <unclear> elements are part of the same physical phenomenon.

The <damage>, <gap>, <del>, <unclear> and <supplied> elements may be closely allied in use. See section 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for discussion of which element is appropriate for which circumstance.

Example
<l>The Moving Finger wri<damage agent="water" group="1">es; and</damage> having writ,</l> <l>Moves <damage agent="water" group="1">   <supplied>on: nor all your</supplied>  </damage> Piety nor Wit</l>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element damage
{
   tei_att.global.attributes,
   tei_att.damaged.attributes,
   tei_att.typed.attributes,
   tei_macro.paraContent
}

Appendix A.1.18 <damageSpan>

<damageSpan> (damaged span of text) marks the beginning of a longer sequence of text which is damaged in some way but still legible. [12.3.3.1. Damage, Illegibility, and Supplied Text]
Module: transcr — Formal specification
Attributes
Member of
Contained by
May contain: Empty element
Note

Both the beginning and ending of the damaged sequence must be marked: the beginning by the <damageSpan> element, the ending by the target of the spanTo attribute. If no other element is available, the <anchor> element may be used for this purpose.

The damaged text must be at least partially legible, in order for the encoder to be able to transcribe it. If it is not legible at all, the <damageSpan> element should not be used. Rather, the <gap> or <unclear> element should be employed, with the value of the reason attribute giving the cause. See further sections 12.3.3.1. Damage, Illegibility, and Supplied Text and 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination.

Example
<p>Paragraph partially damaged. This is the undamaged portion <damageSpan spanTo="#a34"/>and this the damaged portion of the paragraph.</p> <p>This paragraph is entirely damaged.</p> <p>Paragraph partially damaged; in the middle of this paragraph the damage ends and the anchor point marks the start of the <anchor xml:id="a34"/> undamaged part of the text. ...</p>
Schematron
<sch:rule context="tei:damageSpan"> <sch:assert test="@spanTo">The @spanTo attribute of <sch:name/> is required.</sch:assert> </sch:rule>
Schematron
<sch:rule context="tei:damageSpan"> <sch:assert test="@spanTo">L'attribut spanTo est requis.</sch:assert> </sch:rule>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element damageSpan
{
   tei_att.global.attributes,
   tei_att.damaged.attributes,
   tei_att.spanning.attributes,
   tei_att.typed.attributes,
   empty
}

Appendix A.1.19 <date>

<date> (date) contains a date in any format. [3.6.4. Dates and Times 2.2.4. Publication, Distribution, Licensing, etc. 2.6. The Revision Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 16.2.3. The Setting Description 14.4. Dates]
Module: core — Formal specification
Attributes
Member of
Contained by
May contain
Example
<date when="1980-02">early February 1980</date>
Example
Given on the <date when="1977-06-12">Twelfth Day of June in the Year of Our Lord One Thousand Nine Hundred and Seventy-seven of the Republic the Two Hundredth and first and of the University the Eighty-Sixth.</date>
Example
<date when="1990-09">September 1990</date>
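Because <date> carries the att.datable attributes, a normalised value can be given at whatever precision is known, which matters for corpus components that cover a single day, a month, or a year. The sketch below illustrates this with invented dates:

```xml
<!-- a single issue -->
<date when="1925-03-14">14 March 1925</date>
<!-- a component covering a whole month -->
<date from="1925-03-01" to="1925-03-31">March 1925</date>
<!-- only the year is known -->
<date when="1925">1925</date>
```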
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration
element date
{
   tei_att.global.attributes,
   tei_att.calendarSystem.attributes,
   tei_att.canonical.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   tei_att.typed.attributes,
   ( text | tei_model.gLike | tei_model.phrase | tei_model.global )*
}

Appendix A.1.20 <delSpan>

<delSpan> (deleted span of text) marks the beginning of a longer sequence of text deleted, marked as deleted, or otherwise signaled as superfluous or spurious by an author, scribe, annotator, or corrector. [12.3.1.4. Additions and Deletions]
Module: transcr — Formal specification
Attributes
Member of
Contained by
May contain: Empty element
Note

Both the beginning and ending of the deleted sequence must be marked: the beginning by the <delSpan> element, the ending by the target of the spanTo attribute.

The text deleted must be at least partially legible, in order for the encoder to be able to transcribe it. If it is not legible at all, the <delSpan> tag should not be used. Rather, the <gap> tag should be employed to signal that text cannot be transcribed, with the value of the reason attribute giving the cause for the omission from the transcription as deletion. If it is not fully legible, the <unclear> element should be used to signal the areas of text which cannot be read with confidence. See further sections 12.3.1.7. Text Omitted from or Supplied in the Transcription and, for the close association of the <delSpan> tag with the <gap>, <damage>, <unclear> and <supplied> elements, 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination.

The <delSpan> tag should not be used for deletions made by editors or encoders. In these cases, either the <corr> tag or the <gap> tag should be used.

Example
<p>Paragraph partially deleted. This is the undeleted portion <delSpan spanTo="#a23"/>and this the deleted portion of the paragraph.</p> <p>Paragraph deleted together with adjacent material.</p> <p>Second fully deleted paragraph.</p> <p>Paragraph partially deleted; in the middle of this paragraph the deletion ends and the anchor point marks the resumption <anchor xml:id="a23"/> of the text. ...</p>
Schematron
<sch:rule context="tei:delSpan"> <sch:assert test="@spanTo">The @spanTo attribute of <sch:name/> is required.</sch:assert> </sch:rule>
Schematron
<sch:rule context="tei:delSpan"> <sch:assert test="@spanTo">The spanTo attribute is required.</sch:assert> </sch:rule>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element delSpan
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.spanning.attributes,
   tei_att.transcriptional.attributes,
   tei_att.typed.attributes,
   empty
}

Appendix A.1.21 <desc>

<desc> (description) contains a short description of the purpose, function, or use of its parent element, or when the parent is a documentation element, describes or defines the object being documented. [23.4.1. Description of Components]
Module: core
Attributes
type: characterizes the element in some sense, using any convenient classification scheme or typology.
Derived from: att.typed
Status: Optional
Datatype: teidata.enumerated
Suggested values include:
deprecationInfo
(deprecation information) This element describes why or how its parent element is being deprecated, typically including recommendations for alternate encoding.
<dataSpec module="tei"  ident="teidata.point"  validUntil="2050-02-25">  <desc type="deprecationInfo"   versionDate="2018-09-14"   xml:lang="en">Several standards bodies, including NIST in the USA,    strongly recommend against ending the representation of a number    with a decimal point. So instead of <q>3.</q> use either <q>3</q>    or <q>3.0</q>.</desc> <!-- ... --> </dataSpec>
Member of
Contained by
May contain
header: idno
transcr: ex subst
character data
Note

When used in a specification element such as <elementSpec>, TEI convention requires that this be expressed as a finite clause, beginning with an active verb.

Example: a <desc> element inside a documentation element.
<dataSpec module="tei"  ident="teidata.point">  <desc versionDate="2010-10-17"   xml:lang="en">defines the data type used to express a point in cartesian space.</desc>  <content>   <dataRef name="token"    restriction="(-?[0-9]+(\.[0-9]+)?,-?[0-9]+(\.[0-9]+)?)"/>  </content> <!-- ... --> </dataSpec>
Example: a <desc> element in a non-documentation element.
<place xml:id="KERG2">  <placeName>Kerguelen Islands</placeName> <!-- ... -->  <terrain>   <desc>antarctic tundra</desc>  </terrain> <!-- ... --> </place>
Schematron: A <desc> with a type of deprecationInfo should only occur when its parent element is being deprecated. Furthermore, it should always occur in an element that is being deprecated when <desc> is a valid child of that element.
<sch:rule context="tei:desc[ @type eq 'deprecationInfo']"> <sch:assert test="../@validUntil">Information about a deprecation should only be present in a specification element that is being deprecated: that is, only an element that has a @validUntil attribute should have a child &lt;desc type="deprecationInfo"&gt;.</sch:assert> </sch:rule>
Content model
<content>
 <macroRef key="macro.limitedContent"/>
</content>
    
Schema Declaration
element desc
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { "deprecationInfo" }?,
   tei_macro.limitedContent
}

Appendix A.1.22 <div>

<div> (text division) contains a subdivision of the front, body, or back of a text. [4.1. Divisions of the Body]
Module: textstructure
Attributes
type
Status: Recommended
Sample values include:
article
An article in a newspaper
advert
An advertisement in a newspaper
Member of
Contained by
textstructure: back body div front
May contain
Example
<body>  <div type="part">   <head>Fallacies of Authority</head>   <p>The subject of which is Authority in various shapes, and the object, to repress all      exercise of the reasoning faculty.</p>   <div n="1" type="chapter">    <head>The Nature of Authority</head>    <p>With reference to any proposed measures having for their object the greatest        happiness of the greatest number [...]</p>    <div n="1.1" type="section">     <head>Analysis of Authority</head>     <p>What on any given occasion is the legitimate weight or influence to be attached to          authority [...] </p>    </div>    <div n="1.2" type="section">     <head>Appeal to Authority, in What Cases Fallacious.</head>     <p>Reference to authority is open to the charge of fallacy when [...] </p>    </div>   </div>  </div> </body>
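Example (an illustrative sketch; the headlines and text are invented): the sample type values article and advert listed above might separate the items of a newspaper issue as follows.

```xml
<body>
  <div type="article">
    <head>Harbour Bridge Opens to Traffic</head>
    <p>The new bridge was opened yesterday morning [...]</p>
  </div>
  <div type="advert">
    <head>Fresh Bread Daily</head>
    <p>Baked every morning at the bakery on Market Street.</p>
  </div>
</body>
```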
Schematron
<sch:rule context="tei:l//tei:div"> <sch:assert test="ancestor::tei:floatingText"> Abstract model violation: Metrical lines may not contain higher-level structural elements such as div, unless div is a descendant of floatingText. </sch:assert> </sch:rule>
Schematron
<sch:rule context="tei:div"> <sch:report test="(ancestor::tei:p or ancestor::tei:ab) and not(ancestor::tei:floatingText)"> Abstract model violation: p and ab may not contain higher-level structural elements such as div, unless div is a descendant of floatingText. </sch:report> </sch:rule>
Content model
<content>
 <sequence minOccurs="1" maxOccurs="1">
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.divTop"/>
   <classRef key="model.global"/>
  </alternate>
  <sequence minOccurs="0" maxOccurs="1">
   <alternate minOccurs="1" maxOccurs="1">
    <sequence minOccurs="1"
     maxOccurs="unbounded">
     <alternate minOccurs="1" maxOccurs="1">
      <classRef key="model.divLike"/>
      <classRef key="model.divGenLike"/>
     </alternate>
     <classRef key="model.global"
      minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <sequence minOccurs="1" maxOccurs="1">
     <sequence minOccurs="1"
      maxOccurs="unbounded">
      <alternate minOccurs="1"
       maxOccurs="1">
       <elementRef key="schemaSpec"/>
       <classRef key="model.common"/>
      </alternate>
      <classRef key="model.global"
       minOccurs="0" maxOccurs="unbounded"/>
     </sequence>
     <sequence minOccurs="0"
      maxOccurs="unbounded">
      <alternate minOccurs="1"
       maxOccurs="1">
       <classRef key="model.divLike"/>
       <classRef key="model.divGenLike"/>
      </alternate>
      <classRef key="model.global"
       minOccurs="0" maxOccurs="unbounded"/>
     </sequence>
    </sequence>
   </alternate>
   <sequence minOccurs="0"
    maxOccurs="unbounded">
    <classRef key="model.divBottom"/>
    <classRef key="model.global"
     minOccurs="0" maxOccurs="unbounded"/>
   </sequence>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element div
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   tei_att.divLike.attributes,
   tei_att.placement.attributes,
   tei_att.typed.attribute.subtype,
   tei_att.written.attributes,
   attribute type { text }?,
   (
      ( tei_model.divTop | tei_model.global )*,
      (
         (
            (
               (
                  (
                     ( tei_model.divLike | tei_model.divGenLike ),
                     tei_model.global*
                  )+
               )
             | (
                  ( ( ( schemaSpec | tei_model.common ), tei_model.global* )+ ),
                  (
                     (
                        ( tei_model.divLike | tei_model.divGenLike ),
                        tei_model.global*
                     )*
                  )
               )
            ),
            ( ( tei_model.divBottom, tei_model.global* )* )
         )?
      )
   )
}

Appendix A.1.23 <docDate>

<docDate> (document date) contains the date of a document, as given on a title page or in a dateline. [4.6. Title Pages]
Module: textstructure
Attributes
Member of
Contained by
textstructure: back body div front
May contain
Note

Cf. the general <date> element in the core tag set. This specialized element is provided for convenience in marking and processing the date of the documents, since it is likely to require specialized handling for many applications. It should be used only for the date of the entire document, not for any subset or part of it.

Example
<docImprint>Oxford, Clarendon Press, <docDate>1987</docDate> </docImprint>
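Example (an illustrative sketch; the date is invented): because <docDate> carries the att.datable attributes, the date as printed on a newspaper issue can be paired with a normalized value in the when attribute. Here it is assumed that the date is recorded in the <front> of the component.

```xml
<front>
  <!-- Printed form in the content, ISO form in @when -->
  <docDate when="1925-03-14">14 March 1925</docDate>
</front>
```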
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element docDate
{
   tei_att.global.attributes,
   tei_att.calendarSystem.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.24 <edition>

<edition> (edition) describes the particularities of one edition of a text. [2.2.2. The Edition Statement]
Module: header
Attributes
Member of
Contained by
core: bibl
header: editionStmt
May contain
Example
<edition>First edition <date>Oct 1990</date> </edition> <edition n="S2">Students' edition</edition>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element edition { tei_att.global.attributes, tei_macro.phraseSeq }

Appendix A.1.25 <editionStmt>

<editionStmt> (edition statement) groups information relating to one edition of a text. [2.2.2. The Edition Statement 2.2. The File Description]
Module: header
Attributes
Contained by
header: fileDesc
May contain
Example
<editionStmt>  <edition n="S2">Students' edition</edition>  <respStmt>   <resp>Adapted by </resp>   <name>Elizabeth Kirk</name>  </respStmt> </editionStmt>
Example
<editionStmt>  <p>First edition, <date>Michaelmas Term, 1991.</date>  </p> </editionStmt>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <sequence>
   <elementRef key="edition"/>
   <classRef key="model.respLike"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </alternate>
</content>
    
Schema Declaration
element editionStmt
{
   tei_att.global.attributes,
   ( tei_model.pLike+ | ( tei_edition, tei_model.respLike* ) )
}

Appendix A.1.26 <editorialDecl>

<editorialDecl> (editorial practice declaration) provides details of editorial principles and practices applied during the encoding of a text. [2.3.3. The Editorial Practices Declaration 2.3. The Encoding Description 16.3.2. Declarable Elements]
Module: header
Attributes
Member of
Contained by
header: encodingDesc
May contain
Example
<editorialDecl>  <normalization>   <p>All words converted to Modern American spelling using      Websters 9th Collegiate dictionary   </p>  </normalization>  <quotation marks="all">   <p>All opening quotation marks converted to “ all closing      quotation marks converted to &amp;cdq;.</p>  </quotation> </editorialDecl>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:editorialDecl"/> </sch:pattern>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.pLike"/>
  <classRef key="model.editorialDeclPart"/>
 </alternate>
</content>
    
Schema Declaration
element editorialDecl
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   ( tei_model.pLike | tei_model.editorialDeclPart )+
}

Appendix A.1.27 <email>

<email> (electronic mail address) contains an email address identifying a location to which email messages can be delivered. [3.6.2. Addresses]
Module: core
Attributes
Member of
Contained by
May contain
Note

The format of a modern Internet email address is defined in RFC 2822.

Example
<email>membership@tei-c.org</email>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element email
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.28 <encodingDesc>

<encodingDesc> (encoding description) documents the relationship between an electronic text and the source or sources from which it was derived. [2.3. The Encoding Description 2.1.1. The TEI Header and Its Components]
Module: header
Attributes
Member of
Contained by
header: teiHeader
May contain
Example
<encodingDesc>  <p>Basic encoding, capturing lexical information only. All    hyphenation, punctuation, and variant spellings normalized. No    formatting or layout information preserved.</p> </encodingDesc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.encodingDescPart"/>
  <classRef key="model.pLike"/>
 </alternate>
</content>
    
Schema Declaration
element encodingDesc
{
   tei_att.global.attributes,
   ( tei_model.encodingDescPart | tei_model.pLike )+
}

Appendix A.1.29 <ex>

<ex> (editorial expansion) contains a sequence of letters added by an editor or transcriber when expanding an abbreviation. [12.3.1.2. Abbreviation and Expansion]
Module: transcr
Attributes
Member of
Contained by
May contain
gaiji: g
character data
Example
The address is Southmoor <choice>  <expan>R<ex>oa</ex>d</expan>  <abbr>Rd</abbr> </choice>
Content model
<content>
 <macroRef key="macro.xtext"/>
</content>
    
Schema Declaration
element ex
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   tei_macro.xtext
}

Appendix A.1.30 <extent>

<extent> (extent) describes the approximate size of a text stored on some carrier medium or of some other object, digital or non-digital, specified in any convenient units. [2.2.3. Type and Extent of File 2.2. The File Description 3.12.2.4. Imprint, Size of a Document, and Reprint Information 11.7.1. Object Description]
Module: header
Attributes
Member of
Contained by
core: bibl
header: fileDesc
May contain
Example
<extent>3200 sentences</extent> <extent>between 10 and 20 Mb</extent> <extent>ten 3.5 inch high density diskettes</extent>
Example: The <measure> element may be used to supply normalized or machine tractable versions of the size or sizes concerned.
<extent>  <measure unit="MiB" quantity="4.2">About four megabytes</measure>  <measure unit="pages" quantity="245">245 pages of source    material</measure> </extent>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element extent { tei_att.global.attributes, tei_macro.phraseSeq }

Appendix A.1.31 <facsimile>

<facsimile> contains a representation of some written source in the form of a set of images rather than as transcribed or encoded text. [12.1. Digital Facsimiles]
Module: transcr
Attributes
Member of
Contained by
core: teiCorpus
textstructure: TEI
transcr: facsimile
May contain
textstructure: back front
Example
<facsimile>  <graphic url="page1.png"/>  <surface>   <graphic url="page2-highRes.png"/>   <graphic url="page2-lowRes.png"/>  </surface>  <graphic url="page3.png"/>  <graphic url="page4.png"/> </facsimile>
Example
<facsimile>  <surface ulx="0" uly="0" lrx="200" lry="300">   <graphic url="Bovelles-49r.png"/>  </surface> </facsimile>
Schematron
<sch:rule context="tei:facsimile//tei:line | tei:facsimile//tei:zone"> <sch:report test="child::text()[ normalize-space(.) ne '']"> A facsimile element represents a text with images, thus transcribed text should not be present within it. </sch:report> </sch:rule>
Content model
<content>
 <sequence>
  <elementRef key="front" minOccurs="0"/>
  <alternate>
   <alternate minOccurs="1"
    maxOccurs="unbounded">
    <classRef key="model.graphicLike"/>
    <elementRef key="surface"/>
    <elementRef key="surfaceGrp"/>
   </alternate>
   <elementRef key="facsimile"
    minOccurs="1" maxOccurs="unbounded"/>
  </alternate>
  <elementRef key="back" minOccurs="0"/>
 </sequence>
</content>
    
Schema Declaration
element facsimile
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   (
      tei_front?,
      (
         ( tei_model.graphicLike | tei_surface | tei_surfaceGrp )+
       | tei_facsimile+
      ),
      tei_back?
   )
}

Appendix A.1.32 <fileDesc>

<fileDesc> (file description) contains a full bibliographic description of an electronic file. [2.2. The File Description 2.1.1. The TEI Header and Its Components]
Module: header
Attributes
Contained by
header: teiHeader
May contain
Note

The major source of information for those seeking to create a catalogue entry or bibliographic citation for an electronic file. As such, it provides a title and statements of responsibility together with details of the publication or distribution of the file, of any series to which it belongs, and detailed bibliographic notes for matters not addressed elsewhere in the header. It also contains a full bibliographic description for the source or sources from which the electronic text was derived.

Example
<fileDesc>  <titleStmt>   <title>The shortest possible TEI document</title>  </titleStmt>  <publicationStmt>   <p>Distributed as part of TEI P5</p>  </publicationStmt>  <sourceDesc>   <p>No print source exists: this is an original digital text</p>  </sourceDesc> </fileDesc>
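Example (an illustrative sketch; the title, edition, extent and source statement are invented): a fuller description that also uses the optional <editionStmt> and <extent> elements, given in the order that the content model requires.

```xml
<fileDesc>
  <titleStmt>
    <title>An example newspaper corpus component</title>
  </titleStmt>
  <!-- Optional: must come directly after titleStmt -->
  <editionStmt>
    <edition>1.0</edition>
  </editionStmt>
  <!-- Optional: must come before publicationStmt -->
  <extent>245 pages of source material</extent>
  <publicationStmt>
    <p>Distributed for illustration only</p>
  </publicationStmt>
  <!-- Required: at least one sourceDesc, always last -->
  <sourceDesc>
    <p>Transcribed from digitised scans of the printed newspaper</p>
  </sourceDesc>
</fileDesc>
```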
Content model
<content>
 <sequence>
  <sequence>
   <elementRef key="titleStmt"/>
   <elementRef key="editionStmt"
    minOccurs="0"/>
   <elementRef key="extent" minOccurs="0"/>
   <elementRef key="publicationStmt"/>
   <elementRef key="seriesStmt"
    minOccurs="0" maxOccurs="unbounded"/>
   <elementRef key="notesStmt"
    minOccurs="0"/>
  </sequence>
  <elementRef key="sourceDesc"
   minOccurs="1" maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element fileDesc
{
   tei_att.global.attributes,
   (
      (
         tei_titleStmt,
         tei_editionStmt?,
         tei_extent?,
         tei_publicationStmt,
         seriesStmt*,
         notesStmt?
      ),
      tei_sourceDesc+
   )
}

Appendix A.1.33 <front>

<front> (front matter) contains any prefatory matter (headers, abstracts, title page, prefaces, dedications, etc.) found at the start of a document, before the main body. [4.6. Title Pages 4. Default Text Structure]
Module: textstructure
Attributes
Contained by
textstructure: text
transcr: facsimile
May contain
Note

Because cultural conventions differ as to which elements are grouped as front matter and which as back matter, the content models for the <front> and <back> elements are identical.

Example
<front>  <epigraph>   <quote>Nam Sibyllam quidem Cumis ego ipse oculis meis vidi in ampulla      pendere, et cum illi pueri dicerent: <q xml:lang="grc">Σίβυλλα τί        θέλεις</q>; respondebat illa: <q xml:lang="grc">ὰποθανεῖν θέλω.</q>   </quote>  </epigraph>  <div type="dedication">   <p>For Ezra Pound <q xml:lang="it">il miglior fabbro.</q>   </p>  </div> </front>
Example
<front>  <div type="dedication">   <p>To our three selves</p>  </div>  <div type="preface">   <head>Author's Note</head>   <p>All the characters in this book are purely imaginary, and if the      author has used names that may suggest a reference to living persons      she has done so inadvertently. ...</p>  </div> </front>
Example
<front>  <div type="abstract">   <div>    <head> BACKGROUND:</head>    <p>Food insecurity can put children at greater risk of obesity because        of altered food choices and nonuniform consumption patterns.</p>   </div>   <div>    <head> OBJECTIVE:</head>    <p>We examined the association between obesity and both child-level        food insecurity and personal food insecurity in US children.</p>   </div>   <div>    <head> DESIGN:</head>    <p>Data from 9,701 participants in the National Health and Nutrition        Examination Survey, 2001-2010, aged 2 to 11 years were analyzed.        Child-level food insecurity was assessed with the US Department of        Agriculture's Food Security Survey Module based on eight        child-specific questions. Personal food insecurity was assessed with        five additional questions. Obesity was defined, using physical        measurements, as body mass index (calculated as kg/m2) greater than        or equal to the age- and sex-specific 95th percentile of the Centers        for Disease Control and Prevention growth charts. Logistic        regressions adjusted for sex, race/ethnic group, poverty level, and        survey year were conducted to describe associations between obesity        and food insecurity.</p>   </div>   <div>    <head> RESULTS:</head>    <p>Obesity was significantly associated with personal food insecurity        for children aged 6 to 11 years (odds ratio=1.81; 95% CI 1.33 to        2.48), but not in children aged 2 to 5 years (odds ratio=0.88; 95%        CI 0.51 to 1.51). Child-level food insecurity was not associated        with obesity among 2- to 5-year-olds or 6- to 11-year-olds.</p>   </div>   <div>    <head> CONCLUSIONS:</head>    <p>Personal food insecurity is associated with an increased risk of        obesity only in children aged 6 to 11 years. 
Personal        food-insecurity measures may give different results than aggregate        food-insecurity measures in children.</p>   </div>  </div> </front>
Content model
<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.frontPart"/>
   <classRef key="model.pLike"/>
   <classRef key="model.pLike.front"/>
   <classRef key="model.global"/>
  </alternate>
  <sequence minOccurs="0">
   <alternate>
    <sequence>
     <classRef key="model.div1Like"/>
     <alternate minOccurs="0"
      maxOccurs="unbounded">
      <classRef key="model.div1Like"/>
      <classRef key="model.frontPart"/>
      <classRef key="model.global"/>
     </alternate>
    </sequence>
    <sequence>
     <classRef key="model.divLike"/>
     <alternate minOccurs="0"
      maxOccurs="unbounded">
      <classRef key="model.divLike"/>
      <classRef key="model.frontPart"/>
      <classRef key="model.global"/>
     </alternate>
    </sequence>
   </alternate>
   <sequence minOccurs="0">
    <classRef key="model.divBottom"/>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <classRef key="model.divBottom"/>
     <classRef key="model.global"/>
    </alternate>
   </sequence>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element front
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   (
      (
         tei_model.frontPart
       | tei_model.pLike
       | tei_model.pLike.front
       | tei_model.global
      )*,
      (
         (
            (
               (
                  tei_model.div1Like,
                  (
                     tei_model.div1Like
                   | tei_model.frontPart
                   | tei_model.global
                  )*
               )
             | (
                  tei_model.divLike,
                  (
                     tei_model.divLike
                   | tei_model.frontPart
                   | tei_model.global
                  )*
               )
            ),
            (
               (
                  tei_model.divBottom,
                  ( tei_model.divBottom | tei_model.global )*
               )?
            )
         )?
      )
   )
}

Appendix A.1.34 <funder>

<funder> (funding body) specifies the name of an individual, institution, or organization responsible for the funding of a project or text. [2.2.1. The Title Statement]
Module: header
Attributes
Member of
Contained by
May contain
Note

Funders provide financial support for a project; they are distinct from sponsors (see element <sponsor>), who provide intellectual support and authority.

Example
<funder>The National Endowment for the Humanities, an independent federal agency</funder> <funder>Directorate General XIII of the Commission of the European Communities</funder> <funder>The Andrew W. Mellon Foundation</funder> <funder>The Social Sciences and Humanities Research Council of Canada</funder>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element funder
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_att.datable.attributes,
   tei_macro.phraseSeq.limited
}

Appendix A.1.35 <fw>

<fw> (forme work) contains a running head (e.g. a header, footer), catchword, or similar material appearing on the current page. [12.6. Headers, Footers, and Similar Matter]
Module: transcr
Attributes
type: classifies the material encoded according to some useful typology.
Derived from: att.typed
Status: Recommended
Datatype: teidata.enumerated
Sample values include:
header
a running title at the top of the page
footer
a running title at the bottom of the page
pageNum
(page number) a page number or foliation symbol
lineNum
(line number) a line number, either of prose or poetry
sig
(signature) a signature or gathering symbol
catch
(catchword) a catch-word
Member of
Contained by
May contain
Note

Where running heads are consistent throughout a chapter or section, it is usually more convenient to relate them to the chapter or section, e.g. by use of the rend attribute. The <fw> element is intended for cases where the running head changes from page to page, or where details of page layout and the internal structure of the running heads are of paramount importance.

Example
<fw type="sig" place="bottom">C3</fw>
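Example (an illustrative sketch; the newspaper title and page number are invented): the running title and page number printed at the top of a newspaper page might be encoded with the sample values header and pageNum.

```xml
<fw type="header" place="top">The Example Gazette</fw>
<fw type="pageNum" place="top">3</fw>
```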
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element fw
{
   tei_att.global.attributes,
   tei_att.placement.attributes,
   tei_att.typed.attribute.subtype,
   tei_att.written.attributes,
   attribute type { text }?,
   tei_macro.phraseSeq
}

Appendix A.1.36 <g>

<g> (character or glyph) represents a glyph, or a non-standard character. [5. Characters, Glyphs, and Writing Modes]
Module: gaiji
Attributes
ref: points to a description of the character or glyph intended.
Status: Optional
Datatype: teidata.pointer
Member of
Contained by
May contain: Character data only
Note

The name g is short for gaiji, which is the Japanese term for a non-standardized character or glyph.

Example
<g ref="#ctlig">ct</g>
This example points to a <glyph> element with the identifier ctlig like the following:
<glyph xml:id="ctlig"> <!-- here we describe the particular ct-ligature intended --> </glyph>
Example
<g ref="#per-glyph">per</g>
The medieval brevigraph per could similarly be considered as an individual glyph, defined in a <glyph> element with the identifier per-glyph as follows:
<glyph xml:id="per-glyph"> <!-- ... --> </glyph>
Content model
<content>
 <textNode/>
</content>
    
Schema Declaration
element g
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   attribute ref { text }?,
   text
}

Appendix A.1.37 <gap>

<gap> (gap) indicates a point where material has been omitted in a transcription, whether for editorial reasons described in the TEI header, as part of sampling practice, or because the material is illegible, invisible, or inaudible. [3.5.3. Additions, Deletions, and Omissions]
Module: core
Attributes
reason: (reason) gives the reason for omission.
Status: Optional
Datatype: 1–∞ occurrences of teidata.enumerated separated by whitespace
Suggested values include:
cancelled
(cancelled)
deleted
(deleted)
editorial
(editorial) for features omitted from transcription due to editorial policy
illegible
(illegible)
inaudible
(inaudible)
irrelevant
(irrelevant)
sampling
(sampling)
agent: (agent) in the case of text omitted because of damage, categorizes the cause of the damage, if it can be identified.
Status: Optional
Datatype: teidata.enumerated
Sample values include:
rubbing
(rubbing) damage results from rubbing of the leaf edges
mildew
(mildew) damage results from mildew on the leaf surface
smoke
(smoke) damage results from smoke
Member of
Contained by
May contain
core: desc
Note

The <gap>, <unclear>, and <del> core tag elements may be closely allied in use with the <damage> and <supplied> elements, available when using the additional tagset for transcription of primary sources. See section 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for discussion of which element is appropriate for which circumstance.

The <gap> tag simply signals the editor's decision to omit or inability to transcribe a span of text. Other information, such as the interpretation that text was deliberately erased or covered, should be indicated using the relevant tags, such as <del> in the case of deliberate deletion.

Example
<gap quantity="4" unit="chars"  reason="illegible"/>
Example
<gap quantity="1" unit="essay"  reason="sampling"/>
Example
<del>  <gap atLeast="4" atMost="8" unit="chars"   reason="illegible"/> </del>
Example
<gap extent="several lines" reason="lost"/>
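Example (an illustrative sketch; the omitted material and its size are invented): the reason attribute takes one or more whitespace-separated values, and <gap> may contain a <desc> saying what was left out.

```xml
<!-- Two reasons given at once; both come from the value list above -->
<gap reason="editorial irrelevant" quantity="12" unit="lines">
  <desc>Table of shipping departures, not transcribed</desc>
</gap>
```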
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <classRef key="model.descLike"/>
  <classRef key="model.certLike"/>
 </alternate>
</content>
    
Schema Declaration
element gap
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   tei_att.timed.attributes,
   attribute reason
   {
      list
      {
         (
            "cancelled"
          | "deleted"
          | "editorial"
          | "illegible"
          | "inaudible"
          | "irrelevant"
          | "sampling"
         )+
      }
   }?,
   attribute agent { text }?,
   ( tei_model.descLike | tei_model.certLike )*
}

Appendix A.1.38 <glyph>

<glyph> (character glyph) provides descriptive information about a character glyph. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module: gaiji
Attributes
Contained by
gaiji: charDecl
May contain
Example
<glyph xml:id="rstroke">  <localProp name="Name"   value="LATIN SMALL LETTER R WITH A FUNNY STROKE"/>  <localProp name="entity" value="rstroke"/>  <figure>   <graphic url="glyph-rstroke.png"/>  </figure> </glyph>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="unicodeProp"/>
  <elementRef key="unihanProp"/>
  <elementRef key="localProp"/>
  <elementRef key="mapping"/>
  <elementRef key="figure"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.noteLike"/>
  <classRef key="model.descLike"/>
 </alternate>
</content>
    
Schema Declaration
element glyph
{
   tei_att.global.attributes,
   (
      tei_unicodeProp
    | tei_unihanProp
    | tei_localProp
    | tei_mapping
    | figure
    | tei_model.graphicLike
    | tei_model.noteLike
    | tei_model.descLike
   )*
}

Appendix A.1.39 <graphic>

<graphic> (graphic) indicates the location of a graphic or illustration, either forming part of a text, or providing an image of it. [3.10. Graphics and Other Non-textual Components 12.1. Digital Facsimiles]
Module: core
Attributes
Member of
Contained by
May contain
core: desc
Note

The mimeType attribute should be used to supply the MIME media type of the image specified by the url attribute.

Within the body of a text, a <graphic> element indicates the presence of a graphic component in the source itself. Within the context of a <facsimile> or <sourceDoc> element, however, a <graphic> element provides an additional digital representation of some part of the source being encoded.

Example
<figure>  <graphic url="fig1.png"/>  <head>Figure One: The View from the Bridge</head>  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a    series of buoys strung out between them.</figDesc> </figure>
Example
<facsimile>  <surfaceGrp n="leaf1">   <surface>    <graphic url="page1.png"/>   </surface>   <surface>    <graphic url="page2-highRes.png"/>    <graphic url="page2-lowRes.png"/>   </surface>  </surfaceGrp> </facsimile>
Example
<facsimile>  <surfaceGrp n="leaf1" xml:id="spi001">   <surface xml:id="spi001r">    <graphic type="normal"     subtype="thumbnail" url="spi/thumb/001r.jpg"/>    <graphic type="normal" subtype="low-res"     url="spi/normal/lowRes/001r.jpg"/>    <graphic type="normal"     subtype="high-res" url="spi/normal/highRes/001r.jpg"/>    <graphic type="high-contrast"     subtype="low-res" url="spi/contrast/lowRes/001r.jpg"/>    <graphic type="high-contrast"     subtype="high-res" url="spi/contrast/highRes/001r.jpg"/>   </surface>   <surface xml:id="spi001v">    <graphic type="normal"     subtype="thumbnail" url="spi/thumb/001v.jpg"/>    <graphic type="normal" subtype="low-res"     url="spi/normal/lowRes/001v.jpg"/>    <graphic type="normal"     subtype="high-res" url="spi/normal/highRes/001v.jpg"/>    <graphic type="high-contrast"     subtype="low-res" url="spi/contrast/lowRes/001v.jpg"/>    <graphic type="high-contrast"     subtype="high-res" url="spi/contrast/highRes/001v.jpg"/>    <zone xml:id="spi001v_detail01">     <graphic type="normal"      subtype="thumbnail" url="spi/thumb/001v-detail01.jpg"/>     <graphic type="normal"      subtype="low-res"      url="spi/normal/lowRes/001v-detail01.jpg"/>     <graphic type="normal"      subtype="high-res"      url="spi/normal/highRes/001v-detail01.jpg"/>     <graphic type="high-contrast"      subtype="low-res"      url="spi/contrast/lowRes/001v-detail01.jpg"/>     <graphic type="high-contrast"      subtype="high-res"      url="spi/contrast/highRes/001v-detail01.jpg"/>    </zone>   </surface>  </surfaceGrp> </facsimile>
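Example (an illustrative sketch; the file name is invented): as the Note above recommends, the mimeType attribute can state the media type of the image named in url.

```xml
<facsimile>
  <surface>
    <!-- mimeType records the MIME media type of the file given in @url -->
    <graphic url="page1.jpg" mimeType="image/jpeg"/>
  </surface>
</facsimile>
```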
Content model
<content>
 <classRef key="model.descLike"
  minOccurs="0" maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element graphic
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.declaring.attributes,
   tei_att.media.attributes,
   tei_att.resourced.attributes,
   tei_att.typed.attributes,
   tei_model.descLike*
}

Appendix A.1.40 <handNotes>

<handNotes> contains one or more <handNote> elements documenting the different hands identified within the source texts. [12.3.2.1. Document Hands]
Module transcr — Formal specification
Attributes
Member of
Contained by
header: profileDesc
May contain
Empty element
Example
<handNotes>  <handNote xml:id="H1" script="copperplate"   medium="brown-ink">Carefully written with regular descenders</handNote>  <handNote xml:id="H2" script="print"   medium="pencil">Unschooled scrawl</handNote> </handNotes>
Content model
<content>
 <elementRef key="handNote" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element handNotes { tei_att.global.attributes, handNote+ }

Appendix A.1.41 <handShift>

<handShift> (handwriting shift) marks the beginning of a sequence of text written in a new hand, or the beginning of a scribal stint. [12.3.2.1. Document Hands]
Module transcr — Formal specification
Attributes
new indicates a <handNote> element describing the hand concerned.
Status Recommended
Datatype teidata.pointer
Note

This attribute serves the same function as the hand attribute provided for those elements which are members of the att.transcriptional class. It may be renamed at a subsequent major release.

Member of
Contained by
May contain
Empty element
Note

The <handShift> element may be used either to denote a shift in the document hand (as from one scribe to another, or from one writing style to another), or to indicate a shift within a document hand, such as a change of writing style, character or ink. Like other milestone elements, it should appear at the point of transition from some other state to the state which it describes.

Example
<l>When wolde the cat dwelle in his ynne</l> <handShift medium="greenish-ink"/> <l>And if the cattes skynne be slyk <handShift medium="black-ink"/> and gaye</l>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element handShift
{
   tei_att.global.attributes,
   tei_att.handFeatures.attributes,
   attribute new { text }?,
   empty
}
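
Example (processing sketch)
The examples above do not use the new attribute. The following Python sketch, which is not part of the TEI specification, illustrates one way a script might resolve new pointers to the <handNote> elements they reference. The function name resolve_hand_shifts and the sample fragment are hypothetical; the fragment places <handNotes> and a <p> directly under a bare <TEI> root for brevity and is not a complete, valid document. Only local pointers of the form #id are handled.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical, simplified fragment (not a complete valid TEI document).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <handNotes>
    <handNote xml:id="H1" script="copperplate" medium="brown-ink">Carefully written with regular descenders</handNote>
    <handNote xml:id="H2" script="print" medium="pencil">Unschooled scrawl</handNote>
  </handNotes>
  <p>First stint <handShift new="#H2"/> second stint</p>
</TEI>"""


def resolve_hand_shifts(root):
    """Return (pointer, handNote text) pairs for every <handShift>.

    The text is None when the pointer is missing, is not a local
    '#id' reference, or does not match any <handNote>.
    """
    hands = {h.get(XML_ID): h for h in root.iter(TEI + "handNote")}
    resolved = []
    for shift in root.iter(TEI + "handShift"):
        pointer = shift.get("new", "")
        hand = hands.get(pointer[1:]) if pointer.startswith("#") else None
        resolved.append((pointer, hand.text if hand is not None else None))
    return resolved


if __name__ == "__main__":
    for pointer, description in resolve_hand_shifts(ET.fromstring(SAMPLE)):
        print(pointer, "->", description)
```

A script of this kind could be used to check that every <handShift> in a corpus component points at a declared hand.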

Appendix A.1.42 <head>

<head> (heading) contains any type of heading, for example the title of a section, or the heading of a list, glossary, manuscript description, etc. [4.2.1. Headings and Trailers]
Module core — Formal specification
Attributes
Member of
Contained by
textstructure: back body div front
May contain
Note

The <head> element is used for headings at all levels; software which treats (e.g.) chapter headings, section headings, and list titles differently must determine the proper processing of a <head> element based on its structural position. A <head> occurring as the first element of a list is the title of that list; one occurring as the first element of a <div1> is the title of that chapter or section.

Example
The most common use for the <head> element is to mark the headings of sections. In older writings, the headings or incipits may be rather longer than usual in modern works. If a section has an explicit ending as well as a heading, it should be marked as a <trailer>, as in this example:
<div1 n="I" type="book">  <head>In the name of Christ here begins the first book of the ecclesiastical history of    Georgius Florentinus, known as Gregory, Bishop of Tours.</head>  <div2 type="section">   <head>In the name of Christ here begins Book I of the history.</head>   <p>Proposing as I do ...</p>   <p>From the Passion of our Lord until the death of Saint Martin four hundred and twelve      years passed.</p>   <trailer>Here ends the first Book, which covers five thousand, five hundred and ninety-six      years from the beginning of the world down to the death of Saint Martin.</trailer>  </div2> </div1>
Example
When headings are not inline with the running text (see e.g. the heading "Secunda conclusio"), they may nevertheless be encoded as if they were. The actual placement in the source document can be captured with the place attribute.
<div type="subsection">  <head place="margin">Secunda conclusio</head>  <p>   <lb n="1251"/>   <hi rend="large">Potencia: habitus: et actus: recipiunt speciem ab obiectis<supplied>.</supplied>   </hi>   <lb n="1252"/>Probatur sic. Omne importans necessariam habitudinem ad proprium    [...]  </p> </div>
Example
The <head> element is also used to mark headings of other units, such as lists:
With a few exceptions, connectives are equally useful in all kinds of discourse: description, narration, exposition, argument. <list rend="bulleted">  <head>Connectives</head>  <item>above</item>  <item>accordingly</item>  <item>across from</item>  <item>adjacent to</item>  <item>again</item>  <item> <!-- ... -->  </item> </list>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <elementRef key="lg"/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.inter"/>
  <classRef key="model.lLike"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration
element head
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.placement.attributes,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   (
      text
    | lg
    | tei_model.gLike
    | tei_model.phrase
    | tei_model.inter
    | tei_model.lLike
    | tei_model.global
   )*
}

Appendix A.1.43 <hyphenation>

<hyphenation> (hyphenation) summarizes the way in which hyphenation in a source text has been treated in an encoded version of it. [2.3.3. The Editorial Practices Declaration 16.3.2. Declarable Elements]
Module header — Formal specification
Attributes
eol (end-of-line) indicates whether or not end-of-line hyphenation has been retained in a text.
Status Optional
Datatype teidata.enumerated
Legal values are:
all
all end-of-line hyphenation has been retained, even though the lineation of the original may not have been.
some
end-of-line hyphenation has been retained in some cases. [Default]
hard
all soft end-of-line hyphenation has been removed: any remaining end-of-line hyphenation should be retained.
none
all end-of-line hyphenation has been removed: any remaining hyphenation occurred within the line.
Member of
Contained by
May contain
core: p
Example
<hyphenation eol="some">  <p>End-of-line hyphenation silently removed where appropriate</p> </hyphenation>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:hyphenation"/> </sch:pattern>
Content model
<content>
 <classRef key="model.pLike" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element hyphenation
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   attribute eol { "all" | "some" | "hard" | "none" }?,
   tei_model.pLike+
}

Appendix A.1.44 <idno>

<idno> (identifier) supplies any form of identifier used to identify some object, such as a bibliographic item, a person, a title, an organization, etc. in a standardized way. [14.3.1. Basic Principles 2.2.4. Publication, Distribution, Licensing, etc. 2.2.5. The Series Statement 3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Module header — Formal specification
Attributes
type categorizes the identifier, for example as an ISBN, Social Security number, etc.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Suggested values include:
ISBN
International Standard Book Number: a 13- or (if assigned prior to 2007) 10-digit identifying number assigned by the publishing industry to a published book or similar item, registered with the International ISBN Agency.
ISSN
International Standard Serial Number: an eight-digit number to uniquely identify a serial publication.
DOI
Digital Object Identifier: a unique string of letters and numbers assigned to an electronic document.
URI
Uniform Resource Identifier: a string of characters to uniquely identify a resource, following the syntax of RFC 3986.
VIAF
A data number in the Virtual International Authority File assigned to link different names in catalogs around the world for the same entity.
ESTC
English Short-Title Catalogue number: an identifying number assigned to a document in English printed in the British Isles or North America before 1801.
OCLC
OCLC control number (record number) for the union catalog record in WorldCat, a union catalog for member libraries in the Online Computer Library Center global cooperative.
Member of
Contained by
May contain
gaiji: g
header: idno
character data
Note

<idno> should be used for labels which identify an object or concept in a formal cataloguing system such as a database or an RDF store, or in a distributed system such as the World Wide Web. Some suggested values for type on <idno> are ISBN, ISSN, DOI, and URI.

Example
<idno type="ISBN">978-1-906964-22-1</idno> <idno type="ISSN">0143-3385</idno> <idno type="DOI">10.1000/123</idno> <idno type="URI">http://www.worldcat.org/oclc/185922478</idno> <idno type="URI">http://authority.nzetc.org/463/</idno> <idno type="LT">Thomason Tract E.537(17)</idno> <idno type="Wing">C695</idno> <idno type="oldCat">  <g ref="#sym"/>345 </idno>
In the last case, the identifier includes a non-Unicode character which is defined elsewhere by means of a <glyph> or <char> element referenced here as #sym.
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <elementRef key="idno"/>
 </alternate>
</content>
    
Schema Declaration
element idno
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_att.sortable.attributes,
   tei_att.typed.attribute.subtype,
   attribute type
   {
      "ISBN" | "ISSN" | "DOI" | "URI" | "VIAF" | "ESTC" | "OCLC"
   }?,
   ( text | tei_model.gLike | tei_idno )*
}
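
Example (processing sketch)
The schema only constrains the type attribute; it does not check the identifier itself. As an illustration, the following Python sketch verifies the check digit of an ISSN, the eight-digit serial identifier listed above, using the standard ISSN check-digit rule (weights 8 down to 2 on the first seven digits, modulo 11, with X standing for 10). The function name issn_is_valid is hypothetical and the sketch is not part of the TEI or PressMint schemas.

```python
import re

# Four digits, optional hyphen, three digits, then a check character (digit or X).
_ISSN = re.compile(r"(\d{4})-?(\d{3})([\dX])", re.ASCII)


def issn_is_valid(issn: str) -> bool:
    """Return True if the string is a well-formed ISSN with a correct check digit."""
    match = _ISSN.fullmatch(issn.strip().upper())
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weighted sum of the first seven digits with weights 8, 7, ..., 2.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3) == expected


if __name__ == "__main__":
    # The ISSN used in the <idno> example above.
    print(issn_is_valid("0143-3385"))
```

A check of this kind can catch transcription errors in <idno type="ISSN"> values before a corpus is published.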

Appendix A.1.45 <label>

<label> (label) contains any label or heading used to identify part of a text, typically but not exclusively in a list or glossary. [3.8. Lists]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
Example
Labels are commonly used for the headwords in glossary lists; note the use of the global xml:lang attribute to set the default language of the glossary list to Middle English, and identify the glosses and headings as modern English or Latin:
<list type="gloss" xml:lang="enm">  <head xml:lang="en">Vocabulary</head>  <headLabel xml:lang="en">Middle English</headLabel>  <headItem xml:lang="en">New English</headItem>  <label>nu</label>  <item xml:lang="en">now</item>  <label>lhude</label>  <item xml:lang="en">loudly</item>  <label>bloweth</label>  <item xml:lang="en">blooms</item>  <label>med</label>  <item xml:lang="en">meadow</item>  <label>wude</label>  <item xml:lang="en">wood</item>  <label>awe</label>  <item xml:lang="en">ewe</item>  <label>lhouth</label>  <item xml:lang="en">lows</item>  <label>sterteth</label>  <item xml:lang="en">bounds, frisks (cf. <cit>    <ref>Chaucer, K.T.644</ref>    <quote>a courser, <term>sterting</term>as the fyr</quote>   </cit>  </item>  <label>verteth</label>  <item xml:lang="la">pedit</item>  <label>murie</label>  <item xml:lang="en">merrily</item>  <label>swik</label>  <item xml:lang="en">cease</item>  <label>naver</label>  <item xml:lang="en">never</item> </list>
Example
Labels may also be used to record explicitly the numbers or letters which mark list items in ordered lists, as in this extract from Gibbon's Autobiography. In this usage the <label> element is synonymous with the n attribute on the <item> element:
I will add two facts, which have seldom occurred in the composition of six, or at least of five quartos. <list rend="runon" type="ordered">  <label>(1)</label>  <item>My first rough manuscript, without any intermediate copy, has been sent to the press.</item>  <label>(2) </label>  <item>Not a sheet has been seen by any human eyes, excepting those of the author and the    printer: the faults and the merits are exclusively my own.</item> </list>
Example
Labels may also be used for other structured list items, as in this extract from the journal of Edward Gibbon:
<list type="gloss">  <label>March 1757.</label>  <item>I wrote some critical observations upon Plautus.</item>  <label>March 8th.</label>  <item>I wrote a long dissertation upon some lines of Virgil.</item>  <label>June.</label>  <item>I saw Mademoiselle Curchod — <quote xml:lang="la">Omnia vincit amor, et nos cedamus      amori.</quote>  </item>  <label>August.</label>  <item>I went to Crassy, and staid two days.</item> </list>
Note that the <label> might also appear within the <item> rather than as its sibling. Though syntactically valid, this usage is not recommended TEI practice.
Example
Labels may also be used to represent a label or heading attached to a paragraph or sequence of paragraphs not treated as a structural division, or to a group of verse lines. Note that, in this case, the <label> element appears within the <p> or <lg> element, rather than as a preceding sibling of it.
<p>[...] <lb/>&amp; n’entrer en mauuais &amp; mal-heu- <lb/>ré meſnage. Or des que le conſente- <lb/>ment des parties y eſt le mariage eſt <lb/> arreſté, quoy que de faict il ne ſoit <label place="margin">Puiſſance maritale    entre les Romains.</label>  <lb/> conſommé. Depuis la conſomma- <lb/>tion du mariage la femme eſt ſoubs <lb/> la puiſſance du mary, s’il n’eſt eſcla- <lb/>ue ou enfant de famille : car en ce <lb/> cas, la femme, qui a eſpouſé vn en- <lb/>fant de famille, eſt ſous la puiſſance [...]</p>
In this example the text of the label appears in the right hand margin of the original source, next to the paragraph it describes, but approximately in the middle of it. If so desired the type attribute may be used to distinguish different categories of label.
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element label
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.placement.attributes,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.46 <langUsage>

<langUsage> (language usage) describes the languages, sublanguages, registers, dialects, etc. represented within a text. [2.4.2. Language Usage 2.4. The Profile Description 16.3.2. Declarable Elements]
Module header — Formal specification
Attributes
Member of
Contained by
header: profileDesc
May contain
core: p
header: language
Example
<langUsage>  <language ident="fr-CA" usage="60">Québecois</language>  <language ident="en-CA" usage="20">Canadian business English</language>  <language ident="en-GB" usage="20">British English</language> </langUsage>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde" value="tei:langUsage"/> </sch:pattern>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="language" minOccurs="1"
   maxOccurs="unbounded"/>
 </alternate>
</content>
    
Schema Declaration
element langUsage
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   ( tei_model.pLike+ | tei_language+ )
}

Appendix A.1.47 <language>

<language> (language) characterizes a single language or sublanguage used within a text. [2.4.2. Language Usage]
Module header — Formal specification
Attributes
ident (identifier) supplies a language code constructed as defined in BCP 47 which is used to identify the language documented by this element, and which may be referenced by the global xml:lang attribute.
Status Required
Datatype teidata.language
usage specifies the approximate percentage of the text which uses this language.
Status Optional
Datatype nonNegativeInteger
Contained by
header: langUsage
May contain
Note

Particularly for sublanguages, an informal prose characterization should be supplied as content for the element.

Example
<langUsage>  <language ident="en-US" usage="75">modern American English</language>  <language ident="az-Arab" usage="20">Azerbaijani in Arabic script</language>  <language ident="x-lap" usage="05">Pig Latin</language> </langUsage>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element language
{
   tei_att.global.attributes,
   tei_att.scope.attributes,
   attribute ident { text },
   attribute usage { text }?,
   tei_macro.phraseSeq.limited
}
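
Example (processing sketch)
As an illustration of how a script might read the ident and usage attributes, the following Python sketch collects them from a <langUsage> element into a dictionary. The function name language_usage is hypothetical. Since usage is optional, a missing value is reported as None; the fact that the values in the sample add up to 100 is a property of the example, not a requirement of the schema.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# The <langUsage> example from this section, with the TEI namespace declared.
SAMPLE = """<langUsage xmlns="http://www.tei-c.org/ns/1.0">
  <language ident="en-US" usage="75">modern American English</language>
  <language ident="az-Arab" usage="20">Azerbaijani in Arabic script</language>
  <language ident="x-lap" usage="05">Pig Latin</language>
</langUsage>"""


def language_usage(xml_text):
    """Map each language ident to its usage percentage (None if not given)."""
    root = ET.fromstring(xml_text)
    usage = {}
    for language in root.iter(TEI + "language"):
        value = language.get("usage")
        # int() accepts zero-padded values such as "05".
        usage[language.get("ident")] = int(value) if value is not None else None
    return usage


if __name__ == "__main__":
    print(language_usage(SAMPLE))
```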

Appendix A.1.48 <licence>

<licence> contains information about a licence or other legal agreement applicable to the text. [2.2.4. Publication, Distribution, Licensing, etc.]
Module header — Formal specification
Attributes
Member of
Contained by
header: availability
May contain
Note

A <licence> element should be supplied for each licence agreement applicable to the text in question. The target attribute may be used to reference a full version of the licence. The when, notBefore, notAfter, from or to attributes may be used in combination to indicate the date or dates of applicability of the licence.

Example
<licence target="http://www.nzetc.org/tm/scholarly/tei-NZETC-Help.html#licensing"> Licence: Creative Commons Attribution-Share Alike 3.0 New Zealand Licence </licence>
Example
<availability>  <licence target="http://creativecommons.org/licenses/by/3.0/"   notBefore="2013-01-01">   <p>The Creative Commons Attribution 3.0 Unported (CC BY 3.0) Licence      applies to this document.</p>   <p>The licence was added on January 1, 2013.</p>  </licence> </availability>
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration
element licence
{
   tei_att.global.attributes,
   tei_att.datable.attributes,
   tei_att.pointing.attributes,
   tei_macro.specialPara
}

Appendix A.1.49 <line>

<line> contains the transcription of a topographic line in the source document. [12.2.2. Embedded Transcription]
Module transcr — Formal specification
Attributes
Member of
Contained by
transcr: line surface zone
May contain
Note

This element should be used only to mark up writing which is topographically organized as a series of lines, horizontal or vertical. It should not be used to mark lines of verse (for which use <l>) nor to mark line beginnings within text which has been encoded using structural elements such as <p> (for which use <lb>).

Example
This example shows topographical lines as a means of preserving the visual appearance of a poem:
<surface>  <zone>   <line>Poem</line>   <line>As in Visions of — at</line>   <line>night —</line>   <line>All sorts of fancies running through</line>   <line>the head</line>  </zone> </surface>
Example
<surface>  <zone>   <line>Hope you enjoyed</line>   <line>Wales, as they      said</line>   <line>to Mrs FitzHerbert</line>   <line>Mama</line>  </zone>  <zone>   <line>Printed in England</line>  </zone> </surface>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.global"/>
  <classRef key="model.gLike"/>
  <classRef key="model.linePart"/>
 </alternate>
</content>
    
Schema Declaration
element line
{
   tei_att.global.attributes,
   tei_att.coordinated.attributes,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   ( text | tei_model.global | tei_model.gLike | tei_model.linePart )*
}

Appendix A.1.50 <listPrefixDef>

<listPrefixDef> (list of prefix definitions) contains a list of definitions of prefixing schemes used in teidata.pointer values, showing how abbreviated URIs using each scheme may be expanded into full URIs. [17.2.3. Using Abbreviated Pointers]
Module header — Formal specification
Attributes
Member of
Contained by
May contain
Example
In this example, two private URI scheme prefixes are defined and patterns are provided for dereferencing them. Each prefix is also supplied with a human-readable explanation in a <p> element.
<listPrefixDef>  <prefixDef ident="psn"   matchPattern="([A-Z]+)"   replacementPattern="personography.xml#$1">   <p> Private URIs using the <code>psn</code>      prefix are pointers to <gi>person</gi>      elements in the personography.xml file.      For example, <code>psn:MDH</code>      dereferences to <code>personography.xml#MDH</code>.   </p>  </prefixDef>  <prefixDef ident="bibl"   matchPattern="([a-z]+[a-z0-9]*)"   replacementPattern="http://www.example.com/getBibl.xql?id=$1">   <p> Private URIs using the <code>bibl</code> prefix can be      expanded to form URIs which retrieve the relevant      bibliographical reference from www.example.com.   </p>  </prefixDef> </listPrefixDef>
Content model
<content>
 <sequence>
  <elementRef key="desc" minOccurs="0"
   maxOccurs="unbounded"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="prefixDef"/>
   <elementRef key="listPrefixDef"/>
  </alternate>
 </sequence>
</content>
    
Schema Declaration
element listPrefixDef
{
   tei_att.global.attributes,
   ( tei_desc*, ( tei_prefixDef | tei_listPrefixDef )+ )
}
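
Example (processing sketch)
The expansion described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a reference implementation: the function names load_prefix_defs and expand_pointer are hypothetical, the matchPattern values are assumed to be simple enough to be interpreted by Python's re module (the two regular-expression dialects differ in some details), and only $1 to $9 group references in replacementPattern are handled. Pointers with an unknown prefix, or whose remainder does not match the pattern, are returned unchanged.

```python
import re
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# The two prefix definitions from the example above, without their <p> explanations.
SAMPLE = """<listPrefixDef xmlns="http://www.tei-c.org/ns/1.0">
  <prefixDef ident="psn" matchPattern="([A-Z]+)"
             replacementPattern="personography.xml#$1"/>
  <prefixDef ident="bibl" matchPattern="([a-z]+[a-z0-9]*)"
             replacementPattern="http://www.example.com/getBibl.xql?id=$1"/>
</listPrefixDef>"""


def load_prefix_defs(xml_text):
    """Return (ident, matchPattern, replacementPattern) for every <prefixDef>."""
    root = ET.fromstring(xml_text)
    return [
        (d.get("ident"), d.get("matchPattern"), d.get("replacementPattern"))
        for d in root.iter(TEI + "prefixDef")
    ]


def expand_pointer(pointer, prefix_defs):
    """Expand an abbreviated pointer such as 'psn:MDH' into a full URI."""
    prefix, sep, rest = pointer.partition(":")
    if not sep:
        return pointer
    for ident, pattern, replacement in prefix_defs:
        if ident != prefix:
            continue
        match = re.fullmatch(pattern, rest)
        if match:
            # Substitute each $n with the text captured by group n.
            return re.sub(r"\$(\d)", lambda m: match.group(int(m.group(1))), replacement)
    return pointer


if __name__ == "__main__":
    defs = load_prefix_defs(SAMPLE)
    print(expand_pointer("psn:MDH", defs))
```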

Appendix A.1.51 <listTranspose>

<listTranspose> supplies a list of transpositions, each of which is indicated at some point in a document typically by means of metamarks. [12.3.4.5. Transpositions]
Module transcr — Formal specification
Attributes
Member of
Contained by
May contain
core: desc
transcr: transpose
Example
<listTranspose>  <transpose>   <ptr target="#ib02"/>   <ptr target="#ib01"/>  </transpose> </listTranspose>
This example might be used for a source document which indicates in some way that the elements identified by ib02 and ib01 should be read in that order (ib02 followed by ib01), rather than in the reading order in which they are presented in the source.
Content model
<content>
 <sequence>
  <elementRef key="desc" minOccurs="0"
   maxOccurs="unbounded"/>
  <elementRef key="transpose" minOccurs="1"
   maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element listTranspose
{
   tei_att.global.attributes,
   ( tei_desc*, tei_transpose+ )
}

Appendix A.1.52 <localProp>

<localProp> (locally defined property) provides a locally defined character (or glyph) property. [5.2.1. Character Properties]
Module gaiji — Formal specification
Attributes
Contained by
gaiji: char glyph
May contain
Empty element
Note

No definitive list of local names is proposed. However, the name entity is recommended as a means of naming the property identifying the recommended character entity name for this character or glyph.

Example
<char xml:id="daikanwaU4EBA">  <localProp name="name"   value="CIRCLED IDEOGRAPH 4EBA"/>  <localProp name="entity" value="daikanwa"/>  <unicodeProp name="Decomposition_Mapping"   value="circle"/>  <mapping type="standard"></mapping> </char>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element localProp
{
   tei_att.global.attributes,
   tei_att.gaijiProp.attributes,
   empty
}

Appendix A.1.53 <mapping>

<mapping> (character mapping) contains one or more characters which are related to the parent character or glyph in some respect, as specified by the type attribute. [5.2. Markup Constructs for Representation of Characters and Glyphs]
Module gaiji — Formal specification
Attributes
Contained by
gaiji: char glyph
May contain
gaiji: g
character data
Note

Suggested values for the type attribute include exact for exact equivalences, uppercase for uppercase equivalences, lowercase for lowercase equivalences, and simplified for simplified characters. The <g> elements contained by this element can point to either another <char> or <glyph> element or contain a character that is intended to be the target of this mapping.

Example
<mapping type="modern">r</mapping> <mapping type="standard"></mapping>
Content model
<content>
 <macroRef key="macro.xtext"/>
</content>
    
Schema Declaration
element mapping
{
   tei_att.global.attributes,
   tei_att.datable.attributes,
   tei_att.typed.attributes,
   tei_macro.xtext
}

Appendix A.1.54 <measure>

<measure> (measure) contains a word or phrase referring to some quantity of an object or commodity, usually comprising a number, a unit, and a commodity name. [3.6.3. Numbers and Measures]
Module core — Formal specification
Attributes
type specifies the type of measurement in any convenient typology.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Member of
Contained by
May contain
Example
This example references a definition of a measurement unit declared in the TEI header:
<measure type="weight">  <num>2</num> pounds of flesh </measure> <measure type="currency">£10-11-6d</measure> <measure type="area" unitRef="#merk">2 <unit>merks</unit> of old extent</measure> <!-- In the TEI Header: --> <encodingDesc>  <unitDecl>   <unitDef xml:id="merk" type="area">    <label>merk</label>    <placeName ref="#Scotland"/>    <desc>A merk was an area of land determined variably by its agricultural        productivity.</desc>   </unitDef>  </unitDecl> </encodingDesc>
Example
<measure quantity="40" unit="hogshead"  commodity="rum">2 score hh rum</measure> <measure quantity="12" unit="count"  commodity="roses">1 doz. roses</measure> <measure quantity="1" unit="count"  commodity="tulips">a yellow tulip</measure>
Example
<head>Long papers.</head> <p>Speakers will be given 30 minutes each: 20 minutes for presentation, 10 minutes for discussion. Proposals should not exceed <measure max="500" unit="count"   commodity="words">500    words</measure>. This presentation type is suitable for substantial research, theoretical or critical discussions.</p>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element measure
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.measurement.attributes,
   tei_att.ranging.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { text }?,
   tei_macro.phraseSeq
}

Appendix A.1.55 <media>

<media> indicates the location of any form of external media such as an audio or video clip etc. [3.10. Graphics and Other Non-textual Components]
Module core — Formal specification
Attributes
mimeType (MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type.
Derived from att.internetMedia
Status Required
Datatype 1–∞ occurrences of teidata.word separated by whitespace
Member of
Contained by
May contain
core: desc
Note

The attributes available for this element are not appropriate in all cases. For example, it makes no sense to specify the temporal duration of a graphic. Such errors are not currently detected.

The mimeType attribute must be used to specify the MIME media type of the resource specified by the url attribute.

Example
<figure>  <media mimeType="image/png" url="fig1.png"/>  <head>Figure One: The View from the Bridge</head>  <figDesc>A Whistleresque view showing four or five sailing boats in the foreground, and a    series of buoys strung out between them.</figDesc> </figure>
Example
<media mimeType="audio/wav"  url="dingDong.wav" dur="PT10S">  <desc>Ten seconds of bellringing sound</desc> </media>
Example
<media mimeType="video/mp4"  url="clip45.mp4" dur="PT45M" width="500px">  <desc>A 45 minute video clip to be displayed in a window 500    px wide</desc> </media>
Content model
<content>
 <classRef key="model.descLike"
  minOccurs="0" maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element media
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.declaring.attributes,
   tei_att.media.attribute.width,
   tei_att.media.attribute.height,
   tei_att.media.attribute.scale,
   tei_att.resourced.attributes,
   tei_att.timed.attributes,
   tei_att.typed.attributes,
   attribute mimeType { list { teidata.word+ } },
   tei_model.descLike*
}

Appendix A.1.56 <meeting>

<meeting> contains the formalized descriptive title for a meeting or conference, for use in a bibliographic description for an item derived from such a meeting, or as a heading or preamble to publications emanating from it. [3.12.2.2. Titles, Authors, and Editors]
Module core — Formal specification
Attributes
Member of
Contained by
core: bibl
textstructure: body div front
May contain
header: idno
transcr: ex subst
character data
Example
<div>  <meeting>Ninth International Conference on Middle High German Textual Criticism, Aachen,    June 1998.</meeting>  <list type="attendance">   <head>List of Participants</head>   <item>    <persName>...</persName>   </item>   <item>    <persName>...</persName>   </item> <!--...-->  </list>  <p>...</p> </div>
Content model
<content>
 <macroRef key="macro.limitedContent"/>
</content>
    
Schema Declaration
element meeting
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_macro.limitedContent
}

Appendix A.1.57 <metamark>

<metamark> contains or describes any kind of graphic or written signal within a document the function of which is to determine how it should be read rather than forming part of the actual content of the document. [12.3.4.2. Metamarks]
Module transcr — Formal specification
Attributes
function describes the function (for example status, insertion, deletion, transposition) of the metamark.
Status Optional
Datatype teidata.word
target identifies one or more elements to which the metamark applies.
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Member of
Contained by
May contain
Example
<surface>  <metamark function="used" rend="line"   target="#X2"/>  <zone xml:id="zone-X2">   <line>I am that halfgrown <add>angry</add> boy, fallen asleep</line>   <line>The tears of foolish passion yet undried</line>   <line>upon my cheeks.</line> <!-- ... -->   <line>I pass through <add>the</add> travels and <del>fortunes</del> of   <retrace>thirty</retrace>   </line>   <line>years and become old,</line>   <line>Each in its due order comes and goes,</line>   <line>And thus a message for me comes.</line>   <line>The</line>  </zone>  <metamark function="used"   target="#zone-X2">Entered - Yes</metamark> </surface>
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration
element metamark
{
   tei_att.global.attributes,
   tei_att.placement.attributes,
   tei_att.spanning.attributes,
   attribute function { text }?,
   attribute target { list { teidata.pointer+ } }?,
   tei_macro.specialPara
}

Appendix A.1.58 <mod>

<mod> represents any kind of modification identified within a single document. [12.3.4.1. Generic Modification]
Module transcr — Formal specification
Attributes
Member of
Contained by
May contain
Example
<mod type="subst">  <add>pleasing</add>  <del>agreable</del> </mod>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element mod
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.spanning.attributes,
   tei_att.transcriptional.attributes,
   tei_att.typed.attributes,
   tei_macro.paraContent
}

Appendix A.1.59 <name>

<name> (name, proper noun) contains a proper noun or noun phrase. [3.6.1. Referring Strings]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
Note

Proper nouns referring to people, places, and organizations may be tagged instead with <persName>, <placeName>, or <orgName>, when the TEI module for names and dates is included.

Example
<name type="person">Thomas Hoccleve</name> <name type="place">Villingaholt</name> <name type="org">Vetus Latina Institut</name> <name type="person" ref="#HOC001">Occleve</name>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element name
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_att.editLike.attributes,
   tei_att.personal.attributes,
   tei_att.typed.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.60 <namespace>

<namespace> (namespace) supplies the formal name of the namespace to which the elements documented by its children belong. [2.3.4. The Tagging Declaration]
Module header — Formal specification
Attributes
name specifies the full formal name of the namespace concerned.
Status Required
Datatype 0–1 occurrences of teidata.namespace separated by whitespace
Contained by
header: tagsDecl
May contain
header: tagUsage
Example
<namespace name="http://www.tei-c.org/ns/1.0">  <tagUsage gi="hi" occurs="28" withId="2"> Used only to mark English words    italicized in the copy text </tagUsage> </namespace>
Content model
<content>
 <elementRef key="tagUsage" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element namespace
{
   tei_att.global.attributes,
   attribute name { ? },
   tei_tagUsage+
}

Appendix A.1.61 <normalization>

<normalization> (normalization) indicates the extent of normalization or regularization of the original source carried out in converting it to electronic form. [2.3.3. The Editorial Practices Declaration 16.3.2. Declarable Elements]
Module header — Formal specification
Attributes
method indicates the method adopted to indicate normalizations within the text.
Status Optional
Datatype teidata.enumerated
Legal values are:
silent
normalization made silently [Default]
markup
normalization represented using markup
Member of
Contained by
May contain
core: p
Example
<editorialDecl>  <normalization method="markup">   <p>Where both upper- and lower-case i, j, u, v, and vv have been normalized, to modern      20th century typographical practice, the <gi>choice</gi> element has been used to      enclose <gi>orig</gi> and <gi>reg</gi> elements giving the original and new values      respectively. ... </p>  </normalization>  <normalization method="silent">   <p>Spacing between words and following punctuation has been regularized to zero spaces;      spacing between words has been regularized to one space.</p>  </normalization>  <normalization source="http://www.dict.sztaki.hu/webster">   <p>Spelling converted throughout to Modern American usage, based on Websters 9th      Collegiate dictionary.</p>  </normalization> </editorialDecl>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:normalization"/> </sch:pattern>
Content model
<content>
 <classRef key="model.pLike" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element normalization
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   attribute method { "silent" | "markup" }?,
   tei_model.pLike+
}

Appendix A.1.62 <note>

<note> (note) contains a note or annotation. [3.9.1. Notes and Simple Annotation 2.2.6. The Notes Statement 3.12.2.8. Notes and Statement of Language 10.3.5.4. Notes within Entries]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
Example
In the following example, the translator has supplied a footnote containing an explanation of the term translated as "painterly":
And yet it is not only in the great line of Italian renaissance art, but even in the painterly <note place="bottom" type="gloss"  resp="#MDMH">  <term xml:lang="de">Malerisch</term>. This word has, in the German, two distinct meanings, one objective, a quality residing in the object, the other subjective, a mode of apprehension and creation. To avoid confusion, they have been distinguished in English as <mentioned>picturesque</mentioned> and <mentioned>painterly</mentioned> respectively. </note> style of the Dutch genre painters of the seventeenth century that drapery has this psychological significance. <!-- elsewhere in the document --> <respStmt xml:id="MDMH">  <resp>translation from German to English</resp>  <name>Hottinger, Marie Donald Mackie</name> </respStmt>
For this example to be valid, the code MDMH must be defined elsewhere, for example by means of a responsibility statement in the associated TEI header.
Example
The global n attribute may be used to supply the symbol or number used to mark the note's point of attachment in the source text, as in the following example:
Mevorakh b. Saadya's mother, the matriarch of the family during the second half of the eleventh century, <note n="126" anchored="true"> The alleged mention of Judah Nagid's mother in a letter from 1071 is, in fact, a reference to Judah's children; cf. above, nn. 111 and 54. </note> is well known from Geniza documents published by Jacob Mann.
However, if notes are numbered in sequence and their numbering can be reconstructed automatically by processing software, it may well be considered unnecessary to record the note numbers.
Content model
<content>
 <macroRef key="macro.specialPara"/>
</content>
    
Schema Declaration
element note
{
   tei_att.global.attributes,
   tei_att.anchoring.attributes,
   tei_att.cmc.attributes,
   tei_att.placement.attributes,
   tei_att.pointing.attributes,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   tei_macro.specialPara
}

Appendix A.1.63 <num>

<num> (number) contains a number, written in any form. [3.6.3. Numbers and Measures]
Module core — Formal specification
Attributes
type indicates the type of numeric value.
Derived from att.typed
Status Optional
Datatype teidata.enumerated
Suggested values include:
cardinal
absolute number, e.g. 21, 21.5
ordinal
ordinal number, e.g. 21st
fraction
fraction, e.g. one half or three-quarters
percentage
a percentage
Note

If a different typology is desired, other values can be used for this attribute.

value supplies the value of the number in standard form.
Status Optional
Datatype teidata.numeric
Values a numeric value.
Note

The standard form used is defined by the TEI datatype teidata.numeric.

Member of
Contained by
May contain
Note

Detailed analyses of quantities and units of measure in historical documents may also use the feature structure mechanism described in chapter 19. Feature Structures. The <num> element is intended for use in simple applications.

Example
<p>I reached <num type="cardinal" value="21">twenty-one</num> on my <num type="ordinal" value="21">twenty-first</num> birthday</p> <p>Light travels at <num value="3E10">3×10<hi rend="sup">10</hi>  </num> cm per second.</p>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element num
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.ranging.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { "cardinal" | "ordinal" | "fraction" | "percentage" }?,
   attribute value { text }?,
   tei_macro.phraseSeq
}

Appendix A.1.64 <p>

<p> (paragraph) marks paragraphs in prose. [3.1. Paragraphs 7.2.5. Speech Contents]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
Example
<p>Hallgerd was outside. <q>There is blood on your axe,</q> she said. <q>What have you    done?</q> </p> <p>  <q>I have now arranged that you can be married a second time,</q> replied Thjostolf. </p> <p>  <q>Then you must mean that Thorvald is dead,</q> she said. </p> <p>  <q>Yes,</q> said Thjostolf. <q>And now you must think up some plan for me.</q> </p>
Schematron
<sch:rule context="tei:p"> <sch:report test="(ancestor::tei:ab or ancestor::tei:p) and not( ancestor::tei:floatingText | parent::tei:exemplum | parent::tei:item | parent::tei:note | parent::tei:q | parent::tei:quote | parent::tei:remarks | parent::tei:said | parent::tei:sp | parent::tei:stage | parent::tei:cell | parent::tei:figure )"> Abstract model violation: Paragraphs may not occur inside other paragraphs or ab elements. </sch:report> </sch:rule>
Schematron
<sch:rule context="tei:l//tei:p"> <sch:assert test="ancestor::tei:floatingText | parent::tei:figure | parent::tei:note"> Abstract model violation: Metrical lines may not contain higher-level structural elements such as div, p, or ab, unless p is a child of figure or note, or is a descendant of floatingText. </sch:assert> </sch:rule>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element p
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.declaring.attributes,
   tei_att.fragmentable.attributes,
   tei_att.written.attributes,
   tei_macro.paraContent
}

Appendix A.1.65 <particDesc>

<particDesc> (participation description) describes the identifiable speakers, voices, or other participants in any kind of text or other persons named or otherwise referred to in a text, edition, or metadata. [16.2. Contextual Information]
Module corpus — Formal specification
Attributes
Member of
Contained by
header: profileDesc
May contain
core: p
Note

May contain a prose description organized as paragraphs, or a structured list of persons and person groups, with an optional formal specification of any relationships amongst them.

Example
<particDesc>  <listPerson>   <person xml:id="P-1234" sex="2" age="mid">    <p>Female informant, well-educated, born in        Shropshire UK, 12 Jan 1950, of unknown occupation. Speaks French fluently.        Socio-Economic status B2.</p>   </person>   <person xml:id="P-4332" sex="1">    <persName>     <surname>Hancock</surname>     <forename>Antony</forename>     <forename>Aloysius</forename>     <forename>St John</forename>    </persName>    <residence notAfter="1959">     <address>      <street>Railway Cuttings</street>      <settlement>East Cheam</settlement>     </address>    </residence>    <occupation>comedian</occupation>   </person>   <listRelation>    <relation type="personal" name="spouse"     mutual="#P-1234 #P-4332"/>   </listRelation>  </listPerson> </particDesc>
This example shows both a very simple person description, and a very detailed one, using some of the more specialized elements from the module for Names and Dates.
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:particDesc"/> </sch:pattern>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <classRef key="model.personLike"/>
   <elementRef key="listPerson"/>
   <elementRef key="listOrg"/>
  </alternate>
 </alternate>
</content>
    
Schema Declaration
element particDesc
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   ( tei_model.pLike+ | ( tei_model.personLike | listPerson | listOrg )+ )
}

Appendix A.1.66 <path>

<path> (path) defines any line passing through two or more points within a <surface> element. [12.1. Digital Facsimiles 12.2.2. Embedded Transcription]
Module transcr — Formal specification
Attributes
points identifies a line within the container or bounding box specified by the parent element by means of a series of two or more pairs of numbers, each of which gives the x,y coordinates of a point on the line.
Derived from att.coordinated
Status Optional
Datatype 2–∞ occurrences of teidata.point separated by whitespace
Member of
Contained by
transcr: line surface zone
May contain
Empty element
Note

Although the simplest form of a path is a straight line between two points, a line with more than two points may bend at any point. The order of coordinates in points is significant, because the line follows the coordinate sequence.

To specify a closed polygon, use the <zone> element rather than the <path> element.

Example
<surface ulx="0" uly="0" lrx="443" lry="272">  <graphic url="facs-fig3.jpg"/>  <path points="74,73 171,244"/>  <path points="71,203 173,116"/> </surface>
Schematron
Since a <path> represents a line with distinct start and end points, the last coordinate should not be the same as the first coordinate.
<sch:rule context="tei:path[@points]"> <sch:let name="firstPair"  value="tokenize( normalize-space( @points ), ' ')[1]"/> <sch:let name="lastPair"  value="tokenize( normalize-space( @points ), ' ')[last()]"/> <sch:let name="firstX"  value="xs:float( substring-before( $firstPair, ',') )"/> <sch:let name="firstY"  value="xs:float( substring-after( $firstPair, ',') )"/> <sch:let name="lastX"  value="xs:float( substring-before( $lastPair, ',') )"/> <sch:let name="lastY"  value="xs:float( substring-after( $lastPair, ',') )"/> <sch:report test="$firstX eq $lastX and $firstY eq $lastY">The first and last elements of this path are the same. To specify a closed polygon, use the zone element rather than the path element. </sch:report> </sch:rule>
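Example
The following sketch, with invented coordinates, contrasts an open path of three points, which bends at its second point, with a closed outline. In line with the note and the Schematron rule above, the closed outline is encoded as a <zone> rather than as a <path> whose last point repeats its first.
<surface ulx="0" uly="0" lrx="443" lry="272">
 <!-- open line: follows the points in the order given -->
 <path points="74,73 120,150 171,244"/>
 <!-- closed polygon: encoded as a zone, not as a path -->
 <zone points="71,116 173,116 173,203 71,203"/>
</surface>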
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element path
{
   tei_att.global.attributes,
   tei_att.coordinated.attribute.start,
   tei_att.coordinated.attribute.ulx,
   tei_att.coordinated.attribute.uly,
   tei_att.coordinated.attribute.lrx,
   tei_att.coordinated.attribute.lry,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   attribute points { list { * } }?,
   empty
}

Appendix A.1.67 <pb>

<pb> (page beginning) marks the beginning of a new page in a paginated document. [3.11.3. Milestone Elements]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
Empty element
Note

A <pb> element should appear at the start of the page which it identifies. The global n attribute indicates the number or other value associated with this page. This will normally be the page number or signature printed on it, since the physical sequence number is implicit in the presence of the <pb> element itself.

The type attribute may be used to characterize the page beginning in any respect. The more specialized attributes break, ed, or edRef should be preferred when the intent is to indicate whether or not the page beginning is word-breaking, or to note the source from which it derives.

Example
Page numbers may vary in different editions of a text.
<p> ... <pb n="145" ed="ed2"/> <!-- Page 145 in edition "ed2" starts here --> ... <pb n="283" ed="ed1"/> <!-- Page 283 in edition "ed1" starts here--> ... </p>
Example
A page beginning may be associated with a facsimile image of the page it introduces by means of the facs attribute.
<body>  <pb n="1" facs="page1.png"/> <!-- page1.png contains an image of the page; the text it contains is encoded here -->  <p> <!-- ... -->  </p>  <pb n="2" facs="page2.png"/> <!-- similarly, for page 2 -->  <p> <!-- ... -->  </p> </body>
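Example
A minimal sketch of the break attribute mentioned in the note above, assuming a source in which a word is split across a page boundary. The page number and the text are invented for illustration; break="no" signals that the page beginning does not end the word.
<p> ... the corre<pb n="4" break="no"/>spondent reported ... </p>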
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element pb
{
   tei_att.global.attributes,
   tei_att.breaking.attributes,
   tei_att.cmc.attributes,
   tei_att.edition.attributes,
   tei_att.spanning.attributes,
   tei_att.typed.attributes,
   empty
}

Appendix A.1.68 <pc>

<pc> (punctuation character) contains a character or string of characters regarded as constituting a single punctuation mark. [18.1.2. Below the Word Level 18.4.2. Lightweight Linguistic Annotation]
Module analysis — Formal specification
Attributes
force indicates the extent to which this punctuation mark conventionally separates words or phrases.
Status Optional
Datatype teidata.enumerated
Legal values are:
strong
the punctuation mark is a word separator
weak
the punctuation mark is not a word separator
inter
the punctuation mark may or may not be a word separator
unit provides a name for the kind of unit delimited by this punctuation mark.
Status Optional
Datatype teidata.enumerated
pre indicates whether this punctuation mark precedes or follows the unit it delimits.
Status Optional
Datatype teidata.truthValue
Member of
Contained by
May contain
Example
<phr>  <w>do</w>  <w>you</w>  <w>understand</w>  <pc type="interrogative">?</pc> </phr>
Example
Example encoding of the German sentence Wir fahren in den Urlaub., encoded with attributes from att.linguistic discussed in section 18.4.2. Lightweight Linguistic Annotation.
<s>  <w pos="PPER" msd="1.Pl.*.Nom">Wir</w>  <w pos="VVFIN" msd="1.Pl.Pres.Ind">fahren</w>  <w pos="APPR" msd="--">in</w>  <w pos="ART" msd="Def.Masc.Akk.Sg.">den</w>  <w pos="NN" msd="Masc.Akk.Sg.">Urlaub</w>  <pc pos="$." msd="--" join="left">.</pc> </s>
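Example
The attributes force, unit and pre are not used in the examples above. The following sketch shows how they might be combined for a Spanish question. The sentence and the unit value question are invented for illustration, since unit has no fixed list of values.
<s>
 <!-- opening mark: precedes the unit it delimits -->
 <pc pre="true" unit="question">¿</pc>
 <w>Entiendes</w>
 <!-- closing mark: follows the unit and separates words -->
 <pc pre="false" force="strong" unit="question">?</pc>
</s>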
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <elementRef key="c"/>
  <classRef key="model.pPart.edit"/>
 </alternate>
</content>
    
Schema Declaration
element pc
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.linguistic.attributes,
   tei_att.segLike.attributes,
   tei_att.typed.attributes,
   attribute force { "strong" | "weak" | "inter" }?,
   attribute unit { text }?,
   attribute pre { text }?,
   ( text | tei_model.gLike | c | tei_model.pPart.edit )*
}

Appendix A.1.69 <prefixDef>

<prefixDef> (prefix definition) defines a prefixing scheme used in teidata.pointer values, showing how abbreviated URIs using the scheme may be expanded into full URIs. [17.2.3. Using Abbreviated Pointers]
Module header — Formal specification
Attributes
ident supplies a name which functions as the prefix for an abbreviated pointing scheme such as a private URI scheme. The prefix constitutes the text preceding the first colon.
Status Required
Datatype teidata.prefix
Note

The value is limited to teidata.prefix so that it may be mapped directly to a URI prefix.

Contained by
May contain
core: p
Note

The abbreviated pointer may be dereferenced to produce either an absolute or a relative URI reference. In the latter case it is combined with the value of xml:base in force at the place where the pointing attribute occurs to form an absolute URI in the usual manner as prescribed by XML Base.

Example
<prefixDef ident="ref"  matchPattern="([a-z]+)"  replacementPattern="../../references/references.xml#$1">  <p> In the context of this project, private URIs with    the prefix "ref" point to <gi>div</gi> elements in    the project's global references.xml file.  </p> </prefixDef>
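Example
To illustrate how the declaration above might be applied, assume a hypothetical pointer ref:locution somewhere in the text. The part after the colon, locution, is matched against matchPattern, and $1 in replacementPattern is replaced by the matched group. The result is a relative URI reference, which would then be resolved against xml:base as described in the note above.
<!-- abbreviated pointer as it appears in the text -->
<ref target="ref:locution">locution</ref>
<!-- the same pointer after expansion -->
<ref target="../../references/references.xml#locution">locution</ref>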
Content model
<content>
 <classRef key="model.pLike" minOccurs="0"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element prefixDef
{
   tei_att.global.attributes,
   tei_att.patternReplacement.attributes,
   attribute ident { text },
   tei_model.pLike*
}

Appendix A.1.70 <profileDesc>

<profileDesc> (text-profile description) provides a detailed description of non-bibliographic aspects of a text, specifically the languages and sublanguages used, the situation in which it was produced, the participants and their setting. [2.4. The Profile Description 2.1.1. The TEI Header and Its Components]
Module header — Formal specification
Attributes
Member of
Contained by
header: teiHeader
May contain
Note

Although the content model permits it, it is rarely meaningful to supply multiple occurrences for any of the child elements of <profileDesc> unless these are documenting multiple texts.

Example
<profileDesc>  <langUsage>   <language ident="fr">French</language>  </langUsage>  <textDesc n="novel">   <channel mode="w">print; part issues</channel>   <constitution type="single"/>   <derivation type="original"/>   <domain type="art"/>   <factuality type="fiction"/>   <interaction type="none"/>   <preparedness type="prepared"/>   <purpose type="entertain" degree="high"/>   <purpose type="inform" degree="medium"/>  </textDesc>  <settingDesc>   <setting>    <name>Paris, France</name>    <time>Late 19th century</time>   </setting>  </settingDesc> </profileDesc>
Content model
<content>
 <classRef key="model.profileDescPart"
  minOccurs="0" maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element profileDesc { tei_att.global.attributes, tei_model.profileDescPart* }

Appendix A.1.71 <projectDesc>

<projectDesc> (project description) describes in detail the aim or purpose for which an electronic file was encoded, together with any other relevant information concerning the process by which it was assembled or collected. [2.3.1. The Project Description 2.3. The Encoding Description 16.3.2. Declarable Elements]
Module header — Formal specification
Attributes
Member of
Contained by
header: encodingDesc
May contain
core: p
Example
<projectDesc>  <p>Texts collected for use in the Claremont Shakespeare Clinic, June 1990</p> </projectDesc>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:projectDesc"/> </sch:pattern>
Content model
<content>
 <classRef key="model.pLike" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element projectDesc
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   tei_model.pLike+
}

Appendix A.1.72 <pubPlace>

<pubPlace> (publication place) contains the name of the place where a bibliographic item was published. [3.12.2.4. Imprint, Size of a Document, and Reprint Information]
Module core — Formal specification
Attributes
Member of
Contained by
core: bibl
May contain
Example
<publicationStmt>  <publisher>Oxford University Press</publisher>  <pubPlace>Oxford</pubPlace>  <date>1989</date> </publicationStmt>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element pubPlace
{
   tei_att.global.attributes,
   tei_att.naming.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.73 <publicationStmt>

<publicationStmt> (publication statement) groups information concerning the publication or distribution of an electronic or other text. [2.2.4. Publication, Distribution, Licensing, etc. 2.2. The File Description]
Module header — Formal specification
Attributes
Contained by
header: fileDesc
May contain
Note

Where a publication statement contains several members of the model.publicationStmtPart.agency or model.publicationStmtPart.detail classes rather than one or more paragraphs or anonymous blocks, care should be taken to ensure that the repeated elements are presented in a meaningful order. It is a conformance requirement that elements supplying information about publication place, address, identifier, availability, and date be given following the name of the publisher, distributor, or authority concerned, and preferably in that order.

Example
<publicationStmt>  <publisher>C. Muquardt </publisher>  <pubPlace>Bruxelles &amp; Leipzig</pubPlace>  <date when="1846"/> </publicationStmt>
Example
<publicationStmt>  <publisher>Chadwyck Healey</publisher>  <pubPlace>Cambridge</pubPlace>  <availability>   <p>Available under licence only</p>  </availability>  <date when="1992">1992</date> </publicationStmt>
Example
<publicationStmt>  <publisher>Zea Books</publisher>  <pubPlace>Lincoln, NE</pubPlace>  <date>2017</date>  <availability>   <p>This is an open access work licensed under a Creative Commons Attribution 4.0 International license.</p>  </availability>  <ptr target="http://digitalcommons.unl.edu/zeabook/55"/> </publicationStmt>
Content model
<content>
 <alternate>
  <sequence minOccurs="1"
   maxOccurs="unbounded">
   <classRef key="model.publicationStmtPart.agency"/>
   <classRef key="model.publicationStmtPart.detail"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
 </alternate>
</content>
    
Schema Declaration
element publicationStmt
{
   tei_att.global.attributes,
   (
      (
         (
            tei_model.publicationStmtPart.agency,
            tei_model.publicationStmtPart.detail*
         )+
      )
    | tei_model.pLike+
   )
}

Appendix A.1.74 <publisher>

<publisher> (publisher) provides the name of the organization responsible for the publication or distribution of a bibliographic item. [3.12.2.4. Imprint, Size of a Document, and Reprint Information 2.2.4. Publication, Distribution, Licensing, etc.]
Module core — Formal specification
Attributes
Member of
Contained by
core: bibl
May contain
Note

Use the full form of the name by which a company is usually referred to, rather than any abbreviation of it which may appear on a title page.

Example
<imprint>  <pubPlace>Oxford</pubPlace>  <publisher>Clarendon Press</publisher>  <date>1987</date> </imprint>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element publisher
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.75 <quotation>

<quotation> (quotation) specifies editorial practice adopted with respect to quotation marks in the original. [2.3.3. The Editorial Practices Declaration 16.3.2. Declarable Elements]
Module header — Formal specification
Attributes
marks (quotation marks) indicates whether or not quotation marks have been retained as content within the text.
Status Optional
Datatype teidata.enumerated
Legal values are:
none
no quotation marks have been retained
some
some quotation marks have been retained
all
all quotation marks have been retained
Member of
Contained by
May contain
core: p
Example
<quotation marks="none">  <p>No quotation marks have been retained. Instead, the <att>rend</att> attribute on the  <gi>q</gi> element is used to specify what kind of quotation mark was used, according    to the following list: <list type="gloss">    <label>dq</label>    <item>double quotes, open and close</item>    <label>sq</label>    <item>single quotes, open and close</item>    <label>dash</label>    <item>long dash open, no close</item>    <label>dg</label>    <item>double guillemets, open and close</item>   </list>  </p> </quotation>
Example
<quotation marks="all">  <p>All quotation marks are retained in the text and are represented by appropriate Unicode    characters.</p> </quotation>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde" value="tei:quotation"/> </sch:pattern>
Schematron
<sch:rule context="tei:quotation"> <sch:report test="not( @marks ) and not( tei:p )"> On <sch:name/>, either the @marks attribute should be used, or a paragraph of description provided </sch:report> </sch:rule>
Content model
<content>
 <classRef key="model.pLike" minOccurs="0"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element quotation
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   attribute marks { "none" | "some" | "all" }?,
   tei_model.pLike*
}

Appendix A.1.76 <redo>

<redo> indicates one or more cancelled interventions in a document which have subsequently been marked as reaffirmed or repeated. [12.3.4.4. Confirmation, Cancellation, and Reinstatement of Modifications]
Module transcr — Formal specification
Attributes
target points to one or more elements representing the interventions which are being reasserted.
Status Optional
Datatype 1–∞ occurrences of teidata.pointer separated by whitespace
Member of
Contained by
May contain
Empty element
Example
<line>  <redo hand="#g_t" target="#redo-1"   cause="fix"/>  <mod xml:id="redo-1" rend="strikethrough"   spanTo="#anchor-1" hand="#g_bl"/>Ihr hagren, triſten, krummgezog<mod rend="strikethrough">nen</mod>ener Nacken </line> <line>Wenn ihr nur piepſet iſt die Welt ſchon matt.<anchor xml:id="anchor-1"/> </line>
This encoding represents the following sequence of events:
  • "Ihr hagren, triſten, krummgezog nenener Nacken/ Wenn ihr nur piepſet iſt die Welt ſchon matt." is written
  • the redundant letters "nen" in "nenener" are deleted
  • the whole passage is deleted by hand g_bl using strikethrough
  • the deletion is reasserted by another hand (identified here as g_t)
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element redo
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.spanning.attributes,
   tei_att.transcriptional.attributes,
   attribute target { list { + } }?,
   empty
}

Appendix A.1.77 <ref>

<ref> (reference) defines a reference to another location, possibly modified by additional text or comment. [3.7. Simple Links and Cross-References 17.1. Links]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
Note

The target and cRef attributes are mutually exclusive.

Example
See especially <ref target="http://www.natcorp.ox.ac.uk/Texts/A02.xml#s2">the second sentence</ref>
Example
See also <ref target="#locution">s.v. <term>locution</term> </ref>.
Schematron
<sch:rule context="tei:ref"> <sch:report test="@target and @cRef">Only one of the attributes @target and @cRef may be supplied on <sch:name/>.</sch:report> </sch:rule>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element ref
{
   tei_att.global.attributes,
   tei_att.cReferencing.attributes,
   tei_att.cmc.attributes,
   tei_att.declaring.attributes,
   tei_att.internetMedia.attributes,
   tei_att.pointing.attributes,
   tei_att.typed.attributes,
   tei_macro.paraContent
}

Appendix A.1.78 <resp>

<resp> (responsibility) contains a phrase describing the nature of a person's intellectual responsibility, or an organization's role in the production or distribution of a work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]
Module core — Formal specification
Attributes
Contained by
core: respStmt
May contain
Note

The attribute ref, inherited from the class att.canonical may be used to indicate the kind of responsibility in a normalized form by referring directly to a standardized list of responsibility types, such as that maintained by a naming authority, for example the list maintained at http://www.loc.gov/marc/relators/relacode.html for bibliographic usage.

Example
<respStmt>  <resp ref="http://id.loc.gov/vocabulary/relators/com.html">compiler</resp>  <name>Edward Child</name> </respStmt>
Content model
<content>
 <macroRef key="macro.phraseSeq.limited"/>
</content>
    
Schema Declaration
element resp
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_att.datable.attributes,
   tei_macro.phraseSeq.limited
}

Appendix A.1.79 <respStmt>

<respStmt> (statement of responsibility) supplies a statement of responsibility for the intellectual content of a text, edition, recording, or series, where the specialized elements for authors, editors, etc. do not suffice or do not apply. May also be used to encode information about individuals or organizations which have played a role in the production or distribution of a bibliographic work. [3.12.2.2. Titles, Authors, and Editors 2.2.1. The Title Statement 2.2.2. The Edition Statement 2.2.5. The Series Statement]
Module core — Formal specification
Attributes
Member of
Contained by
May contain
core: name note resp
Example
<respStmt>  <resp>transcribed from original ms</resp>  <persName>Claus Huitfeldt</persName> </respStmt>
Example
<respStmt>  <resp>converted to XML encoding</resp>  <name>Alan Morrison</name> </respStmt>
Content model
<content>
 <sequence>
  <alternate>
   <sequence>
    <elementRef key="resp" minOccurs="1"
     maxOccurs="unbounded"/>
    <classRef key="model.nameLike.agent"
     minOccurs="1" maxOccurs="unbounded"/>
   </sequence>
   <sequence>
    <classRef key="model.nameLike.agent"
     minOccurs="1" maxOccurs="unbounded"/>
    <elementRef key="resp" minOccurs="1"
     maxOccurs="unbounded"/>
   </sequence>
  </alternate>
  <elementRef key="note" minOccurs="0"
   maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element respStmt
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   (
      (
         ( tei_resp+, tei_model.nameLike.agent+ )
       | ( tei_model.nameLike.agent+, tei_resp+ )
      ),
      tei_note*
   )
}

Appendix A.1.80 <restore>

<restore> (restore) indicates restoration of text to an earlier state by cancellation of an editorial or authorial marking or instruction. [12.3.1.6. Cancellation of Deletions and Other Markings]
Module transcr — Formal specification
Attributes
Member of
Contained by
May contain
Note

On this element, the type attribute categorizes the way in which the cancelled intervention has been indicated, for example by means of a marginal note, over-inking, additional markup, etc.

Example
For I hate this <restore hand="#dhl"  type="marginalStetNote">  <del>my</del> </restore> body
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element restore
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.transcriptional.attributes,
   tei_att.typed.attributes,
   tei_macro.paraContent
}

Appendix A.1.81 <retrace>

<retrace> contains a sequence of writing which has been retraced, for example by over-inking, to clarify or fix it. [12.3.4.3. Fixation and Clarification]
Module transcr — Formal specification
Attributes
Member of
Contained by
May contain
Note

Multiple retraces are indicated by nesting one <retrace> within another. In principle, a retrace differs from a substitution in that second and subsequent rewrites do not materially alter the content of an element. Where minor changes have been made during the retracing action, however, these may be marked up using <del>, <add>, etc. with an appropriate value for the change attribute.

Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element retrace
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.spanning.attributes,
   tei_att.transcriptional.attributes,
   tei_macro.paraContent
}

Appendix A.1.82 <revisionDesc>

<revisionDesc> (revision description) summarizes the revision history for a file. [2.6. The Revision Description 2.1.1. The TEI Header and Its Components]
Module: header — Formal specification
Attributes
Contained by
header: teiHeader
May contain
header: change
Note

If present on this element, the status attribute should indicate the current status of the document. The same attribute may appear on any <change> to record the status at the time of that change. Conventionally <change> elements should be given in reverse date order, with the most recent change at the start of the list.

Example
<revisionDesc status="embargoed">  <change when="1991-11-11" who="#LB"> deleted chapter 10 </change> </revisionDesc>
Content model
<content>
 <alternate>
  <elementRef key="list" minOccurs="1"
   maxOccurs="unbounded"/>
  <elementRef key="listChange"
   minOccurs="1" maxOccurs="unbounded"/>
  <elementRef key="change" minOccurs="1"
   maxOccurs="unbounded"/>
 </alternate>
</content>
    
Schema Declaration
element revisionDesc
{
   tei_att.global.attributes,
   tei_att.docStatus.attributes,
   ( list+ | listChange+ | tei_change+ )
}
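
The convention of listing <change> elements in reverse date order can be checked mechanically. The following is a minimal sketch, not part of the PressMint tooling: the function name and the sample revision history are hypothetical, and it assumes that every when value is a full ISO date of the same precision, so that plain string comparison matches chronological order.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical revision history, used only to exercise the check.
SAMPLE_REVISION_DESC = """
<revisionDesc xmlns="http://www.tei-c.org/ns/1.0" status="embargoed">
  <change when="1991-11-11" who="#LB">deleted chapter 10</change>
  <change when="1991-11-02" who="#MSM">completed first draft</change>
</revisionDesc>
"""


def changes_newest_first(revision_desc):
    """Return True if the @when values of the child <change> elements never increase."""
    dates = [
        change.get("when")
        for change in revision_desc.findall("tei:change", TEI_NS)
        if change.get("when") is not None
    ]
    # Assumption: ISO dates of equal precision compare correctly as strings.
    return all(first >= second for first, second in zip(dates, dates[1:]))


revision_desc = ET.fromstring(SAMPLE_REVISION_DESC)
print(changes_newest_first(revision_desc))
```

The check only looks at <change> elements that are direct children of <revisionDesc>; changes grouped inside <listChange> would need a separate pass.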

Appendix A.1.83 <s>

<s> (s-unit) contains a sentence-like division of a text. [18.1. Linguistic Segment Categories 8.4.1. Segmentation]
Module: analysis — Formal specification
Attributes
Member of
Contained by
May contain
Note

The <s> element may be used to mark orthographic sentences, or any other segmentation of a text, provided that the segmentation is end-to-end, complete, and non-nesting. For segmentation which is partial or recursive, the <seg> element should be used instead.

The type attribute may be used to indicate the type of segmentation intended, according to any convenient typology.

Example
<head>  <s>A short affair</s> </head> <s>When are you leaving?</s> <s>Tomorrow.</s>
Schematron
<sch:rule context="tei:s"> <sch:report test="tei:s">You may not nest one s element within another: use seg instead</sch:report> </sch:rule>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element s
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.notated.attributes,
   tei_att.segLike.attributes,
   tei_att.typed.attributes,
   tei_macro.phraseSeq
}
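
For corpora that are processed with scripts rather than validated with Schematron, the non-nesting constraint on <s> can be approximated as follows. This is an illustrative sketch only: the function name and the sample paragraph are hypothetical. Like the Schematron rule, it reports an <s> that has another <s> as a direct child.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment: the second sentence wrongly nests another <s>.
SAMPLE_PARAGRAPH = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  <s>When are you leaving?</s>
  <s>Tomorrow, <s>I think</s>.</s>
</p>
"""


def nested_s_elements(root):
    """Return every <s> that has another <s> as a direct child."""
    return [s for s in root.iter(TEI + "s") if s.find(TEI + "s") is not None]


paragraph = ET.fromstring(SAMPLE_PARAGRAPH)
for s in nested_s_elements(paragraph):
    print("nested <s> found inside:", "".join(s.itertext()))
```

An <s> nested more deeply, for example inside a <hi> within another <s>, is not caught by this test, just as it is not caught by the Schematron rule.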

Appendix A.1.84 <secl>

<secl> (secluded text) Secluded. Marks text present in the source which the editor believes to be genuine but out of its original place (which is unknown). [12.3.1.7. Text Omitted from or Supplied in the Transcription]
Module: transcr — Formal specification
Attributes
reason: one or more words indicating why this text has been secluded, e.g. interpolated etc.
Status: Optional
Datatype: 1–∞ occurrences of teidata.enumerated separated by whitespace
Member of
Contained by
May contain
Example
<rdg source="#Pescani">  <secl>   <l n="15" xml:id="l15">Alphesiboea suos ulta est pro coniuge fratres,</l>   <l n="16" xml:id="l16">sanguinis et cari vincula rupit amor.</l>  </secl> </rdg> <note>secl. Pescani</note>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element secl
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   attribute reason { list { + } }?,
   tei_macro.paraContent
}

Appendix A.1.85 <segmentation>

<segmentation> (segmentation) describes the principles according to which the text has been segmented, for example into sentences, tone-units, graphemic strata, etc. [2.3.3. The Editorial Practices Declaration 16.3.2. Declarable Elements]
Module: header — Formal specification
Attributes
Member of
Contained by
May contain
core: p
Example
<segmentation>  <p>   <gi>s</gi> elements mark orthographic sentences and are numbered sequentially within    their parent <gi>div</gi> element </p> </segmentation>
Example
<p>  <gi>seg</gi> elements are used to mark functional constituents of various types within each <gi>s</gi>; the typology used is defined by a <gi>taxonomy</gi> element in the corpus header <gi>classDecl</gi> </p>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:segmentation"/> </sch:pattern>
Content model
<content>
 <classRef key="model.pLike" minOccurs="1"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element segmentation
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   tei_model.pLike+
}

Appendix A.1.86 <setting>

<setting> describes one particular setting in which a language interaction takes place. [16.2.3. The Setting Description]
Module: corpus — Formal specification
Attributes
Contained by
corpus: settingDesc
May contain
core: date name p time
Note

If the who attribute is not supplied, the setting is assumed to be that of all participants in the language interaction.

Example
<setting>  <placeName>New York City, US</placeName>  <date>1989</date>  <locale>on a park bench</locale>  <activity>feeding birds</activity> </setting>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.nameLike.agent"/>
   <classRef key="model.dateLike"/>
   <classRef key="model.settingPart"/>
  </alternate>
 </alternate>
</content>
    
Schema Declaration
element setting
{
   tei_att.global.attributes,
   tei_att.ascribed.attributes,
   (
      tei_model.pLike+
    | ( tei_model.nameLike.agent | tei_model.dateLike | tei_model.settingPart )*
   )
}

Appendix A.1.87 <settingDesc>

<settingDesc> (setting description) describes the setting or settings within which a language interaction takes place, or other places otherwise referred to in a text, edition, or metadata. [16.2. Contextual Information 2.4. The Profile Description]
Module: corpus — Formal specification
Attributes
Member of
Contained by
header: profileDesc
May contain
core: p
corpus: setting
Note

May contain a prose description organized as paragraphs, or a series of <setting> elements. If used to record not settings of language interactions, but other places mentioned in the text, then <place> optionally grouped by <listPlace> inside <standOff> should be preferred.

Example
<settingDesc>  <p>Texts recorded in the    Canadian Parliament building in Ottawa, between April and November 1988 </p> </settingDesc>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:settingDesc"/> </sch:pattern>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <elementRef key="setting"/>
   <classRef key="model.placeLike"/>
   <elementRef key="listPlace"/>
  </alternate>
 </alternate>
</content>
    
Schema Declaration
element settingDesc
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   ( tei_model.pLike+ | ( tei_setting | tei_model.placeLike | listPlace )+ )
}

Appendix A.1.88 <sourceDesc>

<sourceDesc> (source description) describes the source(s) from which an electronic text was derived or generated, typically a bibliographic description in the case of a digitized text, or a phrase such as ‘born digital’ for a text which has no previous existence. [2.2.7. The Source Description]
Module: header — Formal specification
Attributes
Contained by
header: fileDesc
May contain
core: bibl p
Example
<sourceDesc>  <bibl>   <title level="a">The Interesting story of the Children in the Wood</title>. In  <author>Victor E Neuberg</author>, <title>The Penny Histories</title>.  <publisher>OUP</publisher>   <date>1968</date>. </bibl> </sourceDesc>
Example
<sourceDesc>  <p>Born digital: no previous source exists.</p> </sourceDesc>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde"  value="tei:sourceDesc"/> </sch:pattern>
Content model
<content>
 <alternate>
  <classRef key="model.pLike" minOccurs="1"
   maxOccurs="unbounded"/>
  <alternate minOccurs="1"
   maxOccurs="unbounded">
   <classRef key="model.biblLike"/>
   <classRef key="model.sourceDescPart"/>
   <classRef key="model.listLike"/>
  </alternate>
 </alternate>
</content>
    
Schema Declaration
element sourceDesc
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   (
      tei_model.pLike+
    | ( tei_model.biblLike | tei_model.sourceDescPart | tei_model.listLike )+
   )
}

Appendix A.1.89 <sourceDoc>

<sourceDoc> contains a transcription or other representation of a single source document potentially forming part of a dossier génétique or collection of sources. [12.1. Digital Facsimiles 12.2.2. Embedded Transcription]
Module: transcr — Formal specification
Attributes
Member of
Contained by
core: teiCorpus
textstructure: TEI
May contain
Note

This element may be used as an alternative to <facsimile> for TEI documents containing only page images, or for documents containing both images and transcriptions. Transcriptions may be provided within the <surface> elements making up a source document, in parallel with them as part of a <text> element, or in both places if the encoder wishes to distinguish these two modes of transcription.

Example
<sourceDoc>  <surfaceGrp n="leaf1">   <surface facs="page1.png">    <zone>All the writing on page 1</zone>   </surface>   <surface>    <graphic url="page2-highRes.png"/>    <graphic url="page2-lowRes.png"/>    <zone>     <line>A line of writing on page 2</line>     <line>Another line of writing on page 2</line>    </zone>   </surface>  </surfaceGrp> </sourceDoc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.global"/>
  <classRef key="model.graphicLike"/>
  <elementRef key="surface"/>
  <elementRef key="surfaceGrp"/>
 </alternate>
</content>
    
Schema Declaration
element sourceDoc
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   ( tei_model.global | tei_model.graphicLike | tei_surface | tei_surfaceGrp )+
}

Appendix A.1.90 <space>

<space> (space) indicates the location of a significant space in the text. [12.4.1. Space]
Module: transcr — Formal specification
Attributes
resp: (responsible party) indicates the individual responsible for identifying and measuring the space.
Derived from: att.global.responsibility
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
dim: (dimension) indicates whether the space is horizontal or vertical.
Status: Recommended
Datatype: teidata.enumerated
Legal values are:
horizontal: the space is horizontal.
vertical: the space is vertical.
Note

For irregular shapes in two dimensions, the value for this attribute should reflect the more important of the two dimensions. In conventional left-right scripts, a space with both vertical and horizontal components should be classed as vertical.

Member of
Contained by
May contain
core: desc
Note

This element should be used wherever it is desired to record an unusual space in the source text, e.g. space left for a word to be filled in later, for later rubrication, etc. It is not intended to be used to mark normal inter-word space or the like.

Example
By god if wommen had writen storyes As <space quantity="7" unit="minims"/> han within her oratoryes
Example
στρατηλάτ<space quantity="1" unit="chars"/>ου
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <classRef key="model.descLike"/>
  <classRef key="model.certLike"/>
 </alternate>
</content>
    
Schema Declaration
element space
{
   tei_att.global.attribute.xmlid,
   tei_att.global.attribute.n,
   tei_att.global.attribute.xmllang,
   tei_att.global.attribute.xmlbase,
   tei_att.global.attribute.xmlspace,
   tei_att.global.analytic.attribute.ana,
   tei_att.global.change.attribute.change,
   tei_att.global.facs.attribute.facs,
   tei_att.global.rendition.attribute.rend,
   tei_att.global.rendition.attribute.style,
   tei_att.global.rendition.attribute.rendition,
   tei_att.global.responsibility.attribute.cert,
   tei_att.global.source.attribute.source,
   tei_att.dimensions.attributes,
   tei_att.typed.attributes,
   attribute resp { list { + } }?,
   attribute dim { "horizontal" | "vertical" }?,
   ( tei_model.descLike | tei_model.certLike )*
}

Appendix A.1.91 <subst>

<subst> (substitution) groups one or more deletions (or surplus text) with one or more additions when the combination is to be regarded as a single intervention in the text. [12.3.1.5. Substitutions]
Module: transcr — Formal specification
Attributes
Member of
Contained by
May contain
core: pb
transcr: fw surplus
Example
... are all included. <del hand="#RG">It is</del> <subst>  <add>T</add>  <del>t</del> </subst>he expressed
Example
that he and his Sister Miſs D — <lb/>who always lived with him, wd. be <subst>  <del>very</del>  <lb/>  <add>principally</add> </subst> remembered in her Will.
Example
<ab>τ<subst>   <add place="above">ῶν</add>   <del>α</del>  </subst> συνκυρόντ<subst>   <add place="above">ων</add>   <del>α</del>  </subst> ἐργαστηρί<subst>   <add place="above">ων</add>   <del>α</del>  </subst> </ab>
Example
<subst>  <del>   <gap reason="illegible" quantity="5"    unit="character"/>  </del>  <add>apple</add> </subst>
Schematron
<sch:rule context="tei:subst"> <sch:assert test="child::tei:add and (child::tei:del or child::tei:surplus)">  <sch:name/> must have at least one child add and at least one child del or surplus</sch:assert> </sch:rule>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <elementRef key="add"/>
  <elementRef key="surplus"/>
  <elementRef key="del"/>
  <classRef key="model.milestoneLike"/>
 </alternate>
</content>
    
Schema Declaration
element subst
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.transcriptional.attributes,
   ( add | tei_surplus | del | tei_model.milestoneLike )+
}
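
The Schematron constraint on <subst> (at least one <add>, together with at least one <del> or <surplus>) can also be tested in a processing script. The sketch below is an illustration under stated assumptions: the function name and the sample paragraph are hypothetical, and only direct children are inspected, as in the Schematron assertion.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment: the second <subst> lacks an <add>.
SAMPLE_PARAGRAPH = """
<p xmlns="http://www.tei-c.org/ns/1.0">
  would be <subst><del>very</del><add>principally</add></subst> remembered
  <subst><del>t</del></subst>he expressed
</p>
"""


def invalid_substs(root):
    """Return each <subst> lacking a child <add> or lacking both <del> and <surplus>."""
    invalid = []
    for subst in root.iter(TEI + "subst"):
        has_add = subst.find(TEI + "add") is not None
        has_removal = (
            subst.find(TEI + "del") is not None
            or subst.find(TEI + "surplus") is not None
        )
        if not (has_add and has_removal):
            invalid.append(subst)
    return invalid


paragraph = ET.fromstring(SAMPLE_PARAGRAPH)
print(len(invalid_substs(paragraph)), "invalid <subst> element(s)")
```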

Appendix A.1.92 <substJoin>

<substJoin> (substitution join) identifies a series of possibly fragmented additions, deletions, or other revisions on a manuscript that combine to make up a single intervention in the text. [12.3.1.5. Substitutions]
Module: transcr — Formal specification
Attributes
Member of
Contained by
May contain
core: desc
Example
While <del xml:id="r112">pondering</del> thus <add xml:id="r113">she mus'd</add>, her pinions fann'd <substJoin target="#r112 #r113"/>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <classRef key="model.descLike"/>
  <classRef key="model.certLike"/>
 </alternate>
</content>
    
Schema Declaration
element substJoin
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.pointing.attributes,
   tei_att.transcriptional.attributes,
   ( tei_model.descLike | tei_model.certLike )*
}

Appendix A.1.93 <supplied>

<supplied> (supplied) signifies text supplied by the transcriber or editor for any reason; for example because the original cannot be read due to physical damage, or because of an obvious omission by the author or scribe. [12.3.3.1. Damage, Illegibility, and Supplied Text]
Module: transcr — Formal specification
Attributes
reason: one or more words indicating why the text has had to be supplied, e.g. overbinding, faded-ink, lost-folio, omitted-in-original.
Status: Optional
Datatype: 1–∞ occurrences of teidata.enumerated separated by whitespace
Member of
Contained by
May contain
Note

The <damage>, <gap>, <del>, <unclear> and <supplied> elements may be closely allied in use. See section 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for discussion of which element is appropriate for which circumstance.

Example
I am dr Sr yr <supplied reason="illegible"  source="#amanuensis_copy">very humble Servt</supplied> Sydney Smith
Example
<supplied reason="omitted-in-original">Dedication</supplied> to the duke of Bejar
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element supplied
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   attribute reason { list { + } }?,
   tei_macro.paraContent
}

Appendix A.1.94 <surface>

<surface> defines a written surface as a two-dimensional coordinate space, optionally grouping one or more graphic representations of that space, zones of interest within that space, and, when using an embedded transcription approach, transcriptions of the writing within them. [12.1. Digital Facsimiles 12.2.2. Embedded Transcription]
Module: transcr — Formal specification
Attributes
attachment: describes the method by which this surface is or was connected to the main surface.
Status: Optional
Datatype: teidata.enumerated
Sample values include:
glued: glued in place
pinned: pinned or stapled in place
sewn: sewn in place
flipping: indicates whether the surface is attached and folded in such a way as to provide two writing surfaces.
Status: Optional
Datatype: teidata.truthValue
Contained by
May contain
Note

The <surface> element represents any two-dimensional space on some physical surface forming part of the source material, such as a piece of paper, a face of a monument, a billboard, a scroll, a leaf etc.

The coordinate space defined by this element may be thought of as a grid lrx - ulx units wide and uly - lry units high.

The <surface> element may contain graphic representations or transcriptions of written zones, or both. The coordinate values used by every <zone> element contained by this element are to be understood with reference to the same grid.

Where it is useful or meaningful to do so, any grouping of multiple <surface> elements may be indicated using the <surfaceGrp> element.

Example
<facsimile>  <surface ulx="0" uly="0" lrx="200" lry="300">   <graphic url="Bovelles-49r.png"/>  </surface> </facsimile>
Content model
<content>
 <sequence>
  <alternate minOccurs="0"
   maxOccurs="unbounded">
   <classRef key="model.global"/>
   <classRef key="model.labelLike"/>
   <classRef key="model.graphicLike"/>
  </alternate>
  <sequence minOccurs="0"
   maxOccurs="unbounded">
   <alternate>
    <elementRef key="zone"/>
    <elementRef key="line"/>
    <elementRef key="path"/>
    <elementRef key="surface"/>
    <elementRef key="surfaceGrp"/>
   </alternate>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element surface
{
   tei_att.global.attributes,
   tei_att.coordinated.attributes,
   tei_att.declaring.attributes,
   tei_att.typed.attributes,
   attribute attachment { text }?,
   attribute flipping { text }?,
   (
      ( tei_model.global | tei_model.labelLike | tei_model.graphicLike )*,
      (
         (
            ( tei_zone | tei_line | tei_path | tei_surface | tei_surfaceGrp ),
            tei_model.global*
         )*
      )
   )
}
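
The extent of the coordinate grid described in the note above can be derived from the four corner attributes. The following sketch is hypothetical in its function name and sample element. It takes absolute values for both dimensions, which is an assumption of this sketch: in the example above uly is 0 and lry is 300, so uly - lry on its own would be negative.

```python
import xml.etree.ElementTree as ET

# Sample element modelled on the <surface> example above.
SAMPLE_SURFACE = (
    '<surface xmlns="http://www.tei-c.org/ns/1.0" '
    'ulx="0" uly="0" lrx="200" lry="300">'
    '<graphic url="Bovelles-49r.png"/></surface>'
)


def grid_size(surface):
    """Return (width, height) of the coordinate grid of a <surface>, in grid units."""
    ulx, uly, lrx, lry = (
        float(surface.get(name)) for name in ("ulx", "uly", "lrx", "lry")
    )
    # Assumption: sizes are reported as non-negative numbers.
    return abs(lrx - ulx), abs(uly - lry)


surface = ET.fromstring(SAMPLE_SURFACE)
print(grid_size(surface))
```

The function raises an error if any of the four attributes is missing, which may be preferable to guessing a default for a grid that has not been declared.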

Appendix A.1.95 <surfaceGrp>

<surfaceGrp> (surface group) defines any kind of useful grouping of written surfaces, for example the recto and verso of a single leaf, which the encoder wishes to treat as a single unit. [12.1. Digital Facsimiles]
Module: transcr — Formal specification
Attributes
Contained by
May contain
Note

Where it is useful or meaningful to do so, any grouping of multiple <surface> elements may be indicated using the <surfaceGrp> elements.

Example
<sourceDoc>  <surfaceGrp>   <surface ulx="0" uly="0" lrx="200"    lry="300">    <graphic url="Bovelles-49r.png"/>   </surface>   <surface ulx="0" uly="0" lrx="200"    lry="300">    <graphic url="Bovelles-49v.png"/>   </surface>  </surfaceGrp> </sourceDoc>
Content model
<content>
 <alternate minOccurs="1"
  maxOccurs="unbounded">
  <classRef key="model.global"/>
  <elementRef key="surface"/>
  <elementRef key="surfaceGrp"/>
 </alternate>
</content>
    
Schema Declaration
element surfaceGrp
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   tei_att.typed.attributes,
   ( tei_model.global | tei_surface | tei_surfaceGrp )+
}

Appendix A.1.96 <surplus>

<surplus> (surplus) marks text present in the source which the editor believes to be superfluous or redundant. [12.3.3.1. Damage, Illegibility, and Supplied Text]
Module: transcr — Formal specification
Attributes
reason: one or more words indicating why this text is believed to be superfluous, e.g. repeated, interpolated etc.
Status: Optional
Datatype: 1–∞ occurrences of teidata.word separated by whitespace
Member of
Contained by
May contain
Example
I am dr Sr yrs <surplus reason="repeated">yrs</surplus> Sydney Smith
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element surplus
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   attribute reason { list { + } }?,
   tei_macro.paraContent
}

Appendix A.1.97 <tagUsage>

<tagUsage> (element usage) documents the usage of a specific element within a specified document. [2.3.4. The Tagging Declaration]
Module: header — Formal specification
Attributes
gi: (generic identifier) specifies the name (generic identifier) of the element indicated by the tag, within the namespace indicated by the parent <namespace> element.
Status: Required
Datatype: teidata.name
occurs: specifies the number of occurrences of this element within the text.
Status: Recommended
Datatype: teidata.count
withId: (with unique identifier) specifies the number of occurrences of this element within the text which bear a distinct value for the global xml:id attribute.
Status: Recommended
Datatype: teidata.count
Contained by
header: namespace
May contain
header: idno
transcr: ex subst
character data
Example
<tagsDecl partial="true">  <rendition xml:id="it" scheme="css"   selector="foreign, hi"> font-style: italic; </rendition> <!-- ... -->  <namespace name="http://www.tei-c.org/ns/1.0">   <tagUsage gi="hi" occurs="28" withId="2"> Used to mark English words italicized in the copy text.</tagUsage>   <tagUsage gi="foreign">Used to mark non-English words in the copy text.</tagUsage> <!-- ... -->  </namespace> </tagsDecl>
Content model
<content>
 <macroRef key="macro.limitedContent"/>
</content>
    
Schema Declaration
element tagUsage
{
   tei_att.global.attributes,
   tei_att.datcat.attributes,
   attribute gi { text },
   attribute occurs { text }?,
   attribute withId { text }?,
   tei_macro.limitedContent
}
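
The values of gi, occurs and withId are usually computed rather than written by hand. The following is a minimal sketch of such a computation, not the script used by any particular corpus: the function name and the sample <text> are hypothetical. It assumes that only elements in the TEI namespace are counted and that the <text> element itself is excluded.

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical text, used only to exercise the counting.
SAMPLE_TEXT = """
<text xmlns="http://www.tei-c.org/ns/1.0">
  <body>
    <p xml:id="p1">A <hi>b</hi> c <hi xml:id="h2">d</hi></p>
    <p>E</p>
  </body>
</text>
"""


def tag_usage_elements(text):
    """Build one <tagUsage> per TEI element type found below <text>."""
    occurs = Counter()
    with_id = Counter()
    for element in text.iter():
        # Skip the <text> root and anything outside the TEI namespace.
        if element is text or not element.tag.startswith(TEI):
            continue
        gi = element.tag[len(TEI):]
        occurs[gi] += 1
        if element.get(XML_ID):
            with_id[gi] += 1
    return [
        ET.Element(
            TEI + "tagUsage",
            {"gi": gi, "occurs": str(occurs[gi]), "withId": str(with_id[gi])},
        )
        for gi in sorted(occurs)
    ]


text = ET.fromstring(SAMPLE_TEXT)
for usage in tag_usage_elements(text):
    print(usage.get("gi"), usage.get("occurs"), usage.get("withId"))
```

The resulting elements would be placed inside a <namespace name="http://www.tei-c.org/ns/1.0"> element within <tagsDecl>.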

Appendix A.1.98 <tagsDecl>

<tagsDecl> (tagging declaration) provides detailed information about the tagging applied to a document. [2.3.4. The Tagging Declaration 2.3. The Encoding Description]
Module: header — Formal specification
Attributes
partial: indicates whether the element types listed exhaustively include all those found within <text>, or represent only a subset.
Status: Recommended
Datatype: teidata.truthValue
Note

TEI recommended practice is to specify this attribute. When the <tagUsage> elements inside <tagsDecl> are used to list each of the element types in the associated <text>, the value should be given as false. When the <tagUsage> elements inside <tagsDecl> are used to provide usage information or default renditions for only a subset of the elements types within the associated <text>, the value should be true.

Member of
Contained by
header: encodingDesc
May contain
header: namespace
Example
<tagsDecl partial="true">  <rendition xml:id="rend-it" scheme="css"   selector="emph, hi, name, title">font-style: italic;</rendition>  <namespace name="http://www.tei-c.org/ns/1.0">   <tagUsage gi="hi" occurs="467"/>   <tagUsage gi="title" occurs="45"/>  </namespace>  <namespace name="http://docbook.org/ns/docbook">   <tagUsage gi="para" occurs="10"/>  </namespace> </tagsDecl>
If the partial attribute were not specified here, the implication would be that the document in question contains only <hi>, <title>, and <para> elements.
Content model
<content>
 <sequence>
  <elementRef key="rendition" minOccurs="0"
   maxOccurs="unbounded"/>
  <elementRef key="namespace" minOccurs="0"
   maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element tagsDecl
{
   tei_att.global.attributes,
   attribute partial { text }?,
   ( rendition*, tei_namespace* )
}

Appendix A.1.99 <taxonomy>

<taxonomy> (taxonomy) defines a typology either implicitly, by means of a bibliographic citation, or explicitly by a structured taxonomy. [2.3.7. The Classification Declaration]
Module: header — Formal specification
Attributes
Contained by
May contain
core: bibl desc
Note

Nested taxonomies are common in many fields, so the <taxonomy> element can be nested.

Example
<taxonomy xml:id="tax.b">  <bibl>Brown Corpus</bibl>  <category xml:id="tax.b.a">   <catDesc>Press Reportage</catDesc>   <category xml:id="tax.b.a1">    <catDesc>Daily</catDesc>   </category>   <category xml:id="tax.b.a2">    <catDesc>Sunday</catDesc>   </category>   <category xml:id="tax.b.a3">    <catDesc>National</catDesc>   </category>   <category xml:id="tax.b.a4">    <catDesc>Provincial</catDesc>   </category>   <category xml:id="tax.b.a5">    <catDesc>Political</catDesc>   </category>   <category xml:id="tax.b.a6">    <catDesc>Sports</catDesc>   </category>  </category>  <category xml:id="tax.b.d">   <catDesc>Religion</catDesc>   <category xml:id="tax.b.d1">    <catDesc>Books</catDesc>   </category>   <category xml:id="tax.b.d2">    <catDesc>Periodicals and tracts</catDesc>   </category>  </category> </taxonomy>
Example
<taxonomy>  <category xml:id="literature">   <catDesc>Literature</catDesc>   <category xml:id="poetry">    <catDesc>Poetry</catDesc>    <category xml:id="sonnet">     <catDesc>Sonnet</catDesc>     <category xml:id="shakesSonnet">      <catDesc>Shakespearean Sonnet</catDesc>     </category>     <category xml:id="petraSonnet">      <catDesc>Petrarchan Sonnet</catDesc>     </category>    </category>    <category xml:id="haiku">     <catDesc>Haiku</catDesc>    </category>   </category>   <category xml:id="drama">    <catDesc>Drama</catDesc>   </category>  </category>  <category xml:id="meter">   <catDesc>Metrical Categories</catDesc>   <category xml:id="feet">    <catDesc>Metrical Feet</catDesc>    <category xml:id="iambic">     <catDesc>Iambic</catDesc>    </category>    <category xml:id="trochaic">     <catDesc>trochaic</catDesc>    </category>   </category>   <category xml:id="feetNumber">    <catDesc>Number of feet</catDesc>    <category xml:id="pentameter">     <catDesc>Pentameter</catDesc>    </category>    <category xml:id="tetrameter">     <catDesc>Tetrameter</catDesc>    </category>   </category>  </category> </taxonomy> <!-- elsewhere in document --> <lg ana="#shakesSonnet #iambic #pentameter">  <l>Shall I compare thee to a summer's day</l> <!-- ... --> </lg>
Content model
<content>
 <alternate>
  <alternate>
   <alternate minOccurs="1"
    maxOccurs="unbounded">
    <elementRef key="category"/>
    <elementRef key="taxonomy"/>
   </alternate>
   <sequence>
    <alternate minOccurs="1"
     maxOccurs="unbounded">
     <classRef key="model.descLike"
      minOccurs="1" maxOccurs="1"/>
     <elementRef key="equiv" minOccurs="1"
      maxOccurs="1"/>
     <elementRef key="gloss" minOccurs="1"
      maxOccurs="1"/>
    </alternate>
    <alternate minOccurs="0"
     maxOccurs="unbounded">
     <elementRef key="category"/>
     <elementRef key="taxonomy"/>
    </alternate>
   </sequence>
  </alternate>
  <sequence>
   <classRef key="model.biblLike"/>
   <alternate minOccurs="0"
    maxOccurs="unbounded">
    <elementRef key="category"/>
    <elementRef key="taxonomy"/>
   </alternate>
  </sequence>
 </alternate>
</content>
    
Schema Declaration
element taxonomy
{
   tei_att.global.attributes,
   tei_att.datcat.attributes,
   (
      (
         ( tei_category | tei_taxonomy )+
       | (
            ( tei_model.descLike | equiv | gloss )+,
            ( tei_category | tei_taxonomy )*
         )
      )
    | ( tei_model.biblLike, ( tei_category | tei_taxonomy )* )
   )
}
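
A script that consumes a corpus will typically need to resolve ana pointers against the taxonomy. The sketch below shows one possible way to do this. The function names are hypothetical, and the sample is a simplified, not schema-valid wrapper around a cut-down version of the second taxonomy example. It assumes that all pointers are local references of the form #id.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Simplified sample: a real corpus keeps the taxonomy in the header.
SAMPLE = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <taxonomy>
    <category xml:id="literature"><catDesc>Literature</catDesc>
      <category xml:id="shakesSonnet"><catDesc>Shakespearean Sonnet</catDesc></category>
    </category>
    <category xml:id="meter"><catDesc>Metrical Categories</catDesc>
      <category xml:id="iambic"><catDesc>Iambic</catDesc></category>
      <category xml:id="pentameter"><catDesc>Pentameter</catDesc></category>
    </category>
  </taxonomy>
  <lg ana="#shakesSonnet #iambic #pentameter">
    <l>Shall I compare thee to a summer's day</l>
  </lg>
</TEI>
"""


def category_labels(root):
    """Map each category xml:id to the text of its own <catDesc>."""
    labels = {}
    for category in root.iter(TEI + "category"):
        identifier = category.get(XML_ID)
        description = category.find(TEI + "catDesc")
        if identifier and description is not None:
            labels[identifier] = "".join(description.itertext()).strip()
    return labels


def resolve_ana(element, labels):
    """Return the label for each local pointer in @ana (None if unresolved)."""
    pointers = element.get("ana", "").split()
    return [labels.get(pointer[1:]) for pointer in pointers if pointer.startswith("#")]


document = ET.fromstring(SAMPLE)
labels = category_labels(document)
line_group = document.find(TEI + "lg")
print(resolve_ana(line_group, labels))
```

Pointers that use a prefix or point to another file are skipped by this sketch and would need to be resolved separately.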

Appendix A.1.100 <teiCorpus>

<teiCorpus> (TEI corpus) contains the whole of a TEI encoded corpus, comprising a single corpus header and one or more <TEI> elements, each containing a single text header and a text. [4. Default Text Structure 16.1. Varieties of Composite Text]
Module: core — Formal specification
Attributes
version: (version) specifies the version number of the TEI Guidelines against which this document is valid.
Status: Optional
Datatype: teidata.version
Note

Major editions of the Guidelines have long been informally referred to by a name made up of the letter P (for Proposal) followed by a digit. The current release is one of the many releases of the fifth major edition of the Guidelines, known as P5. This attribute may be used to associate a TEI document with a specific release of the P5 Guidelines, in the absence of a more precise association provided by the source attribute on the associated <schemaSpec>.

Member of
Contained by
core: teiCorpus
May contain
core: teiCorpus
header: teiHeader
textstructure: TEI text
Note

Should contain one <teiHeader> for the corpus, and a series of <TEI> elements, one for each text.

As with all elements in the TEI scheme (except <egXML>) this element is in the TEI namespace (see 5.7.2. Namespaces). Thus, when it is used as the outermost element of a TEI document, it is necessary to specify the TEI namespace on it. This is customarily achieved by including http://www.tei-c.org/ns/1.0 as the value of the XML namespace declaration (xmlns), without indicating a prefix, and then not using a prefix on TEI elements in the rest of the document. For example: <teiCorpus version="4.8.1" xml:lang="en" xmlns="http://www.tei-c.org/ns/1.0">.

Example
<teiCorpus version="3.3.0" xmlns="http://www.tei-c.org/ns/1.0">  <teiHeader> <!-- header for corpus -->  </teiHeader>  <TEI>   <teiHeader> <!-- header for first text -->   </teiHeader>   <text> <!-- content of first text -->   </text>  </TEI>  <TEI>   <teiHeader> <!-- header for second text -->   </teiHeader>   <text> <!-- content of second text -->   </text>  </TEI> <!-- more TEI elements here --> </teiCorpus>
Content model
<content>
 <sequence>
  <elementRef key="teiHeader"/>
  <classRef key="model.resource"
   minOccurs="0" maxOccurs="unbounded"/>
  <classRef key="model.describedResource"
   minOccurs="1" maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element teiCorpus
{
   tei_att.global.attributes,
   tei_att.typed.attributes,
   attribute version { text }?,
   ( tei_teiHeader, tei_model.resource*, tei_model.describedResource+ )
}
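
Because every PressMint corpus has the same top-level shape, a processing script can rely on it when walking the corpus components. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and the sample uses empty headers and texts, which would not validate but keep the example short.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Deliberately minimal sample; real components have full headers and texts.
SAMPLE_CORPUS = """
<teiCorpus xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader/>
  <TEI><teiHeader/><text/></TEI>
  <TEI><teiHeader/><text/></TEI>
</teiCorpus>
"""


def corpus_components(corpus):
    """Return the <TEI> children of a <teiCorpus>, after checking that the header comes first."""
    children = list(corpus)
    if not children or children[0].tag != TEI + "teiHeader":
        raise ValueError("teiCorpus should start with a corpus-level teiHeader")
    return [child for child in children if child.tag == TEI + "TEI"]


corpus = ET.fromstring(SAMPLE_CORPUS)
print(len(corpus_components(corpus)), "corpus component(s)")
```

Nested <teiCorpus> elements, which the content model permits, are ignored here; only <TEI> elements that are direct children are returned.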

Appendix A.1.101 <teiHeader>

<teiHeader> (TEI header) supplies descriptive and declarative metadata associated with a digital resource or set of resources. [2.1.1. The TEI Header and Its Components 16.1. Varieties of Composite Text]
Module: header — Formal specification
Attributes
Contained by
core: teiCorpus
textstructure: TEI
May contain
Note

One of the few elements unconditionally required in any TEI document.

Example
<teiHeader>  <fileDesc>   <titleStmt>    <title>Shakespeare: the first folio (1623) in electronic form</title>    <author>Shakespeare, William (1564–1616)</author>    <respStmt>     <resp>Originally prepared by</resp>     <name>Trevor Howard-Hill</name>    </respStmt>    <respStmt>     <resp>Revised and edited by</resp>     <name>Christine Avern-Carr</name>    </respStmt>   </titleStmt>   <publicationStmt>    <distributor>Oxford Text Archive</distributor>    <address>     <addrLine>13 Banbury Road, Oxford OX2 6NN, UK</addrLine>    </address>    <idno type="OTA">119</idno>    <availability>     <p>Freely available on a non-commercial basis.</p>    </availability>    <date when="1968">1968</date>   </publicationStmt>   <sourceDesc>    <bibl>The first folio of Shakespeare, prepared by Charlton Hinman (The Norton Facsimile, 1968)</bibl>   </sourceDesc>  </fileDesc>  <encodingDesc>   <projectDesc>    <p>Originally prepared for use in the production of a series of old-spelling concordances in 1968, this text was extensively checked and revised for use during the editing of the new Oxford Shakespeare (Wells and Taylor, 1989).</p>   </projectDesc>   <editorialDecl>    <correction>     <p>Turned letters are silently corrected.</p>    </correction>    <normalization>     <p>Original spelling and typography is retained, except that long s and ligatured forms are not encoded.</p>    </normalization>   </editorialDecl>   <refsDecl xml:id="ASLREF">    <cRefPattern matchPattern="(\S+) ([^.]+)\.(.*)" replacementPattern="#xpath(//div1[@n='$1']/div2[@n='$2']//lb[@n='$3'])">     <p>A reference is created by assembling the following, in the reverse of the order listed here: <list>       <item>the <att>n</att> value of the preceding <gi>lb</gi>       </item>       <item>a period</item>       <item>the <att>n</att> value of the ancestor <gi>div2</gi>       </item>       <item>a space</item>       <item>the <att>n</att> value of the parent <gi>div1</gi>       </item>      </list>     </p>    </cRefPattern>   </refsDecl>  </encodingDesc>  <revisionDesc>   <list>    <item>     <date when="1989-04-12">12 Apr 89</date> Last checked by CAC</item>    <item>     <date when="1989-03-01">1 Mar 89</date> LB made new file</item>   </list>  </revisionDesc> </teiHeader>
Content model
<content>
 <sequence>
  <elementRef key="fileDesc"/>
  <classRef key="model.teiHeaderPart"
   minOccurs="0" maxOccurs="unbounded"/>
  <elementRef key="revisionDesc"
   minOccurs="0"/>
 </sequence>
</content>
    
Schema Declaration
element teiHeader
{
   tei_att.global.attributes,
   ( tei_fileDesc, tei_model.teiHeaderPart*, tei_revisionDesc? )
}
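
As a rough illustration of the content model above, the following sketch shows about the smallest header that plain TEI accepts: a <fileDesc> with its three mandatory children, followed by an optional <revisionDesc>. The title, the wording of the paragraphs and the date are invented for illustration, and the PressMint schema may well require more than is shown here.

```xml
<teiHeader>
  <!-- fileDesc is the only obligatory child of teiHeader -->
  <fileDesc>
    <titleStmt>
      <title>A sample newspaper corpus component</title>
    </titleStmt>
    <publicationStmt>
      <p>Unpublished sample prepared for illustration.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Transcribed from a digitised newspaper issue.</p>
    </sourceDesc>
  </fileDesc>
  <!-- revisionDesc is optional and, when present, comes last -->
  <revisionDesc>
    <change when="2025-09-16">Initial version.</change>
  </revisionDesc>
</teiHeader>
```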

Appendix A.1.102 <term>

<term> (term) contains a single-word, multi-word, or symbolic designation which is regarded as a technical term. [3.4.1. Terms and Glosses]
Modulecore — Formal specification
Attributes
Member of
Contained by
May contain
Note

When this element appears within an <index> element, it is understood to supply the form under which an index entry is to be made for that location. Elsewhere, it is understood simply to indicate that its content is to be regarded as a technical or specialised term. It may be associated with a <gloss> element by means of its ref attribute; alternatively a <gloss> element may point to a <term> element by means of its target attribute.

In formal terminological work, there is frequently discussion over whether terms must be atomic or may include multi-word lexical items, symbolic designations, or phraseological units. The <term> element may be used to mark any of these. No position is taken on the philosophical issue of what a term can be; the looser definition simply allows the <term> element to be used by practitioners of any persuasion.

As with other members of the att.canonical class, instances of this element occurring in a text may be associated with a canonical definition, either by means of a URI (using the ref attribute), or by means of some system-specific code value (using the key attribute). Because the mutually exclusive target and cRef attributes overlap with the function of the ref attribute, they are deprecated and may be removed at a subsequent release.
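
The two ways of linking a term to a canonical definition might look as follows. This is only a sketch: the glossary URI and the key value GLOSS-0042 are hypothetical and do not refer to any real resource.

```xml
<!-- ref: the canonical definition is identified by a URI (hypothetical) -->
<p>The edition was set on a
  <term ref="https://example.org/glossary#linotype">linotype</term> machine.</p>

<!-- key: the canonical definition is identified by a system-specific code (hypothetical) -->
<p>The edition was set on a
  <term key="GLOSS-0042">linotype</term> machine.</p>
```

A URI in ref can be resolved without knowledge of the local system, whereas a key value is only meaningful to software that knows the glossary it comes from.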

Example
A computational device that infers structure from grammatical strings of words is known as a <term>parser</term>, and much of the history of NLP over the last 20 years has been occupied with the design of parsers.
Example
We may define <term xml:id="TDPV1" rend="sc">discoursal point of view</term> as <gloss target="#TDPV1">the relationship, expressed through discourse structure, between the implied author or some other addresser, and the fiction.</gloss>
Example
We may define <term ref="#TDPV2" rend="sc">discoursal point of view</term> as <gloss xml:id="TDPV2">the relationship, expressed through discourse structure, between the implied author or some other addresser, and the fiction.</gloss>
Example
We discuss Leech's concept of <term ref="myGlossary.xml#TDPV2" rend="sc">discoursal point of view</term> below.
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element term
{
   tei_att.global.attributes,
   tei_att.cReferencing.attributes,
   tei_att.canonical.attributes,
   tei_att.cmc.attributes,
   tei_att.declaring.attributes,
   tei_att.pointing.attributes,
   tei_att.sortable.attributes,
   tei_att.typed.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.103 <text>

<text> (text) contains a single text of any kind, whether unitary or composite, for example a poem or drama, a collection of essays, a novel, a dictionary, or a corpus sample. [4. Default Text Structure; 16.1. Varieties of Composite Text]
Moduletextstructure — Formal specification
Attributes
Member of
Contained by
core: teiCorpus
textstructure: TEI
May contain
Note

This element should not be used to represent a text which is inserted at an arbitrary point within the structure of another, for example as in an embedded or quoted narrative; the <floatingText> element is provided for this purpose.

Example
<text>  <front>   <docTitle>    <titlePart>Autumn Haze</titlePart>   </docTitle>  </front>  <body>   <l>Is it a dragonfly or a maple leaf</l>   <l>That settles softly down upon the water?</l>  </body> </text>
Example
The body of a text may be replaced by a group of nested texts, as in the following schematic:
<text>  <front> <!-- front matter for the whole group -->  </front>  <group>   <text> <!-- first text -->   </text>   <text> <!-- second text -->   </text>  </group> </text>
Content model
<content>
 <sequence>
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="0">
   <elementRef key="front"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
  <alternate>
   <elementRef key="body"/>
   <elementRef key="group"/>
  </alternate>
  <classRef key="model.global"
   minOccurs="0" maxOccurs="unbounded"/>
  <sequence minOccurs="0">
   <elementRef key="back"/>
   <classRef key="model.global"
    minOccurs="0" maxOccurs="unbounded"/>
  </sequence>
 </sequence>
</content>
    
Schema Declaration
element text
{
   tei_att.global.attributes,
   tei_att.declaring.attributes,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   (
      tei_model.global*,
      ( ( tei_front, tei_model.global* )? ),
      ( tei_body | group ),
      tei_model.global*,
      ( ( tei_back, tei_model.global* )? )
   )
}

Appendix A.1.104 <textClass>

<textClass> (text classification) groups information which describes the nature or topic of a text in terms of a standard classification scheme, thesaurus, etc. [2.4.3. The Text Classification]
Moduleheader — Formal specification
Attributes
Member of
Contained by
header: profileDesc
May contain
header: catRef
Example
<taxonomy>  <category xml:id="acprose">   <catDesc>Academic prose</catDesc>  </category> <!-- other categories here --> </taxonomy> <!-- ... --> <textClass>  <catRef target="#acprose"/>  <classCode scheme="http://www.udcc.org">001.9</classCode>  <keywords scheme="http://authorities.loc.gov">   <list>    <item>End of the world</item>    <item>History - philosophy</item>   </list>  </keywords> </textClass>
Schematron
<sch:pattern is-a="declarable"> <sch:param name="tde" value="tei:textClass"/> </sch:pattern>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <elementRef key="classCode"/>
  <elementRef key="catRef"/>
  <elementRef key="keywords"/>
 </alternate>
</content>
    
Schema Declaration
element textClass
{
   tei_att.global.attributes,
   tei_att.declarable.attributes,
   ( classCode | tei_catRef | keywords )*
}

Appendix A.1.105 <time>

<time> (time) contains a phrase defining a time of day in any format. [3.6.4. Dates and Times]
Modulecore — Formal specification
Attributes
Member of
Contained by
May contain
Example
As he sat smiling, the quarter struck — <time when="11:45:00">the quarter to twelve</time>.
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Schema Declaration
element time
{
   tei_att.global.attributes,
   tei_att.calendarSystem.attributes,
   tei_att.canonical.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_att.dimensions.attributes,
   tei_att.editLike.attributes,
   tei_att.typed.attributes,
   ( text | tei_model.gLike | tei_model.phrase | tei_model.global )*
}

Appendix A.1.106 <title>

<title> (title) contains a title for any kind of work. [3.12.2.2. Titles, Authors, and Editors; 2.2.1. The Title Statement; 2.2.5. The Series Statement]
Modulecore — Formal specification
Attributes
type: classifies the title according to some convenient typology.
Derived from: att.typed
Status: Optional
Datatype: teidata.enumerated
Sample values include:
main: main title
sub: (subordinate) subtitle, title of part
alt: (alternate) alternate title, often in another language, by which the work is also known
short: abbreviated form of title
desc: (descriptive) descriptive paraphrase of the work functioning as a title
Note

This attribute is provided for convenience in analysing titles and processing them according to their type; where such specialized processing is not necessary, there is no need for such analysis, and the entire title, including subtitles and any parallel titles, may be enclosed within a single <title> element.

level: indicates the bibliographic level for a title, that is, whether it identifies an article, book, journal, series, or unpublished material.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
a: (analytic) the title applies to an analytic item, such as an article, poem, or other work published as part of a larger item.
m: (monographic) the title applies to a monograph such as a book or other item considered to be a distinct publication, including single volumes of multi-volume works.
j: (journal) the title applies to any serial or periodical publication such as a journal, magazine, or newspaper.
s: (series) the title applies to a series of otherwise distinct publications such as a collection.
u: (unpublished) the title applies to any unpublished material (including theses and dissertations unless published by a commercial press).
Note

The level of a title is sometimes implied by its context: for example, a title appearing directly within an <analytic> element is ipso facto of level ‘a’, and one appearing within a <series> element of level ‘s’. For this reason, the level attribute is not required in contexts where its value can be unambiguously inferred. Where it is supplied in such contexts, its value should not contradict the value implied by its parent element.
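
For newspaper material, the level attribute is most useful where the parent element does not imply the level, as in a loose <bibl> citation. The following is a minimal sketch with an invented headline, newspaper name and date:

```xml
<bibl>
  <!-- level="a": the article, published as part of a larger item -->
  <title level="a">Sample article headline</title>,
  <!-- level="j": the newspaper in which it appeared -->
  <title level="j">Sample Daily Gazette</title>,
  <date when="1925-03-14">14 March 1925</date>
</bibl>
```

Marking the two titles explicitly lets a script tell the article from the newspaper without relying on their order inside the citation.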

Member of
Contained by
May contain
Note

The attributes key and ref, inherited from the class att.canonical, may be used to indicate the canonical form for the title: the former by supplying (for example) the identifier of a record in some external library system, the latter by pointing to an XML element that contains the canonical form of the title.

Example
<title>Information Technology and the Research Process: Proceedings of a conference held at Cranfield Institute of Technology, UK, 18–21 July 1989</title>
Example
<title>Hardy's Tess of the D'Urbervilles: a machine readable edition</title>
Example
<title type="full">  <title type="main">Synthèse</title>  <title type="sub">an international journal for    epistemology, methodology and history of    science</title> </title>
Content model
<content>
 <macroRef key="macro.paraContent"/>
</content>
    
Schema Declaration
element title
{
   tei_att.global.attributes,
   tei_att.canonical.attributes,
   tei_att.cmc.attributes,
   tei_att.datable.attributes,
   tei_att.typed.attribute.subtype,
   attribute type { text }?,
   attribute level { "a" | "m" | "j" | "s" | "u" }?,
   tei_macro.paraContent
}

Appendix A.1.107 <titleStmt>

<titleStmt> (title statement) groups information about the title of a work and those responsible for its content. [2.2.1. The Title Statement; 2.2. The File Description]
Moduleheader — Formal specification
Attributes
Contained by
header: fileDesc
May contain
Example
<titleStmt>  <title>Capgrave's Life of St. John Norbert: a machine-readable transcription</title>  <respStmt>   <resp>compiled by</resp>   <name>P.J. Lucas</name>  </respStmt> </titleStmt>
Content model
<content>
 <sequence>
  <elementRef key="title" minOccurs="1"
   maxOccurs="unbounded"/>
  <classRef key="model.respLike"
   minOccurs="0" maxOccurs="unbounded"/>
 </sequence>
</content>
    
Schema Declaration
element titleStmt
{
   tei_att.global.attributes,
   ( tei_title+, tei_model.respLike* )
}

Appendix A.1.108 <transpose>

<transpose> describes a single textual transposition as an ordered list of at least two pointers specifying the order in which the elements indicated should be re-combined. [12.3.4.5. Transpositions]
Moduletranscr — Formal specification
Attributes
Contained by
transcr: listTranspose
May contain: Empty element
Note

Transposition is usually indicated in a document by a metamark such as a wavy line or numbering.

The order in which <ptr> elements appear within a <transpose> element should correspond with the desired order, as indicated by the metamark.

Example
<transpose>  <ptr target="#ib02"/>  <ptr target="#ib01"/> </transpose>
The transposition recorded here indicates that the content of the element with identifier ib02 should appear before the content of the element with identifier ib01.
Content model
<content>
 <elementRef key="ptr" minOccurs="2"
  maxOccurs="unbounded"/>
</content>
    
Schema Declaration
element transpose { tei_att.global.attributes, ( ptr, ptr, ptr* ) }

Appendix A.1.109 <undo>

<undo> indicates one or more marked-up interventions in a document which have subsequently been marked for cancellation. [12.3.4.4. Confirmation, Cancellation, and Reinstatement of Modifications]
Moduletranscr — Formal specification
Attributes
target: points to one or more elements representing the interventions which are to be reverted or undone.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Member of
Contained by
May contain: Empty element
Example
<line>This is <del change="#s2" rend="overstrike">   <seg xml:id="undo-a">just some</seg>    sample <seg xml:id="undo-b">text</seg>,    we need</del>  <add change="#s2">not</add> a real example.</line> <undo target="#undo-a #undo-b"  rend="dotted" change="#s3"/>
This encoding represents the following sequence of events:
  • "This is just some sample text, we need a real example" is written
  • At stage s2, "just some sample text, we need" is deleted by overstriking, and "not" is added
  • At stage s3, parts of the deletion are cancelled by underdotting, thus reinstating the words "just some" and "text".
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element undo
{
   tei_att.global.attributes,
   tei_att.dimensions.attributes,
   tei_att.spanning.attributes,
   tei_att.transcriptional.attributes,
   attribute target { list { teidata.pointer+ } }?,
   empty
}

Appendix A.1.110 <unicodeProp>

<unicodeProp> (unicode property) provides a Unicode property for a character (or glyph). [5.2.1. Character Properties]
Modulegaiji — Formal specification
Attributes
name: specifies the normalized name of a Unicode property.
Status: Required
Datatype: teidata.xmlName
Legal values are:
Age
AHex
Alpha
Alphabetic
ASCII_Hex_Digit
bc
Bidi_C
Bidi_Class
Bidi_Control
Bidi_M
Bidi_Mirrored
Bidi_Mirroring_Glyph
Bidi_Paired_Bracket
Bidi_Paired_Bracket_Type
blk
Block
bmg
bpb
bpt
Canonical_Combining_Class
Case_Folding
Case_Ignorable
Cased
ccc
CE
cf
Changes_When_Casefolded
Changes_When_Casemapped
Changes_When_Lowercased
Changes_When_NFKC_Casefolded
Changes_When_Titlecased
Changes_When_Uppercased
CI
Comp_Ex
Composition_Exclusion
CWCF
CWCM
CWKCF
CWL
CWT
CWU
Dash
Decomposition_Mapping
Decomposition_Type
Default_Ignorable_Code_Point
Dep
Deprecated
DI
Dia
Diacritic
dm
dt
ea
East_Asian_Width
EqUIdeo
Equivalent_Unified_Ideograph
Expands_On_NFC
Expands_On_NFD
Expands_On_NFKC
Expands_On_NFKD
Ext
Extender
FC_NFKC
FC_NFKC_Closure
Full_Composition_Exclusion
gc
GCB
General_Category
Gr_Base
Gr_Ext
Gr_Link
Grapheme_Base
Grapheme_Cluster_Break
Grapheme_Extend
Grapheme_Link
Hangul_Syllable_Type
Hex
Hex_Digit
hst
Hyphen
ID_Continue
ID_Start
IDC
Ideo
Ideographic
IDS
IDS_Binary_Operator
IDS_Trinary_Operator
IDSB
IDST
Indic_Positional_Category
Indic_Syllabic_Category
InPC
InSC
isc
ISO_Comment
Jamo_Short_Name
jg
Join_C
Join_Control
Joining_Group
Joining_Type
JSN
jt
kAccountingNumeric
kCompatibilityVariant
kIICore
kIRG_GSource
kIRG_HSource
kIRG_JSource
kIRG_KPSource
kIRG_KSource
kIRG_MSource
kIRG_TSource
kIRG_USource
kIRG_VSource
kOtherNumeric
kPrimaryNumeric
kRSUnicode
lb
lc
Line_Break
LOE
Logical_Order_Exception
Lower
Lowercase
Lowercase_Mapping
Math
na
na1
Name
Name_Alias
NChar
NFC_QC
NFC_Quick_Check
NFD_QC
NFD_Quick_Check
NFKC_Casefold
NFKC_CF
NFKC_QC
NFKC_Quick_Check
NFKD_QC
NFKD_Quick_Check
Noncharacter_Code_Point
nt
Numeric_Type
Numeric_Value
nv
OAlpha
ODI
OGr_Ext
OIDC
OIDS
OLower
OMath
Other_Alphabetic
Other_Default_Ignorable_Code_Point
Other_Grapheme_Extend
Other_ID_Continue
Other_ID_Start
Other_Lowercase
Other_Math
Other_Uppercase
OUpper
Pat_Syn
Pat_WS
Pattern_Syntax
Pattern_White_Space
PCM
Prepended_Concatenation_Mark
QMark
Quotation_Mark
Radical
Regional_Indicator
RI
SB
sc
scf
Script
Script_Extensions
scx
SD
Sentence_Break
Sentence_Terminal
Simple_Case_Folding
Simple_Lowercase_Mapping
Simple_Titlecase_Mapping
Simple_Uppercase_Mapping
slc
Soft_Dotted
stc
STerm
suc
tc
Term
Terminal_Punctuation
Titlecase_Mapping
uc
UIdeo
Unicode_1_Name
Unified_Ideograph
Upper
Uppercase
Uppercase_Mapping
Variation_Selector
Vertical_Orientation
vo
VS
WB
White_Space
Word_Break
WSpace
XID_Continue
XID_Start
XIDC
XIDS
XO_NFC
XO_NFD
XO_NFKC
XO_NFKD
value: specifies the value of a named Unicode property.
Status: Required
Datatype: teidata.text
Contained by
gaiji: char glyph
May contain: Empty element
Note

A definitive list of current Unicode property names is provided in The Unicode Standard.

Example
<char xml:id="U4EBA_circled">  <unicodeProp name="Decomposition_Mapping"   value="circle" version="12.1"/>  <localProp name="Name"   value="CIRCLED IDEOGRAPH 4EBA"/>  <localProp name="daikanwa" value="36"/>  <mapping type="standard"></mapping> </char>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element unicodeProp
{
   tei_att.global.attributes,
   tei_att.gaijiProp.attribute.version,
   tei_att.gaijiProp.attribute.scheme,
   tei_att.datable.attribute.period,
   tei_att.datable.w3c.attribute.when,
   tei_att.datable.w3c.attribute.notBefore,
   tei_att.datable.w3c.attribute.notAfter,
   tei_att.datable.w3c.attribute.from,
   tei_att.datable.w3c.attribute.to,
   attribute name
   {
      "Age"
    | "AHex"
    | "Alpha"
    | "Alphabetic"
    | "ASCII_Hex_Digit"
    | "bc"
    | "Bidi_C"
    | "Bidi_Class"
    | "Bidi_Control"
    | "Bidi_M"
    | "Bidi_Mirrored"
    | "Bidi_Mirroring_Glyph"
    | "Bidi_Paired_Bracket"
    | "Bidi_Paired_Bracket_Type"
    | "blk"
    | "Block"
    | "bmg"
    | "bpb"
    | "bpt"
    | "Canonical_Combining_Class"
    | "Case_Folding"
    | "Case_Ignorable"
    | "Cased"
    | "ccc"
    | "CE"
    | "cf"
    | "Changes_When_Casefolded"
    | "Changes_When_Casemapped"
    | "Changes_When_Lowercased"
    | "Changes_When_NFKC_Casefolded"
    | "Changes_When_Titlecased"
    | "Changes_When_Uppercased"
    | "CI"
    | "Comp_Ex"
    | "Composition_Exclusion"
    | "CWCF"
    | "CWCM"
    | "CWKCF"
    | "CWL"
    | "CWT"
    | "CWU"
    | "Dash"
    | "Decomposition_Mapping"
    | "Decomposition_Type"
    | "Default_Ignorable_Code_Point"
    | "Dep"
    | "Deprecated"
    | "DI"
    | "Dia"
    | "Diacritic"
    | "dm"
    | "dt"
    | "ea"
    | "East_Asian_Width"
    | "EqUIdeo"
    | "Equivalent_Unified_Ideograph"
    | "Expands_On_NFC"
    | "Expands_On_NFD"
    | "Expands_On_NFKC"
    | "Expands_On_NFKD"
    | "Ext"
    | "Extender"
    | "FC_NFKC"
    | "FC_NFKC_Closure"
    | "Full_Composition_Exclusion"
    | "gc"
    | "GCB"
    | "General_Category"
    | "Gr_Base"
    | "Gr_Ext"
    | "Gr_Link"
    | "Grapheme_Base"
    | "Grapheme_Cluster_Break"
    | "Grapheme_Extend"
    | "Grapheme_Link"
    | "Hangul_Syllable_Type"
    | "Hex"
    | "Hex_Digit"
    | "hst"
    | "Hyphen"
    | "ID_Continue"
    | "ID_Start"
    | "IDC"
    | "Ideo"
    | "Ideographic"
    | "IDS"
    | "IDS_Binary_Operator"
    | "IDS_Trinary_Operator"
    | "IDSB"
    | "IDST"
    | "Indic_Positional_Category"
    | "Indic_Syllabic_Category"
    | "InPC"
    | "InSC"
    | "isc"
    | "ISO_Comment"
    | "Jamo_Short_Name"
    | "jg"
    | "Join_C"
    | "Join_Control"
    | "Joining_Group"
    | "Joining_Type"
    | "JSN"
    | "jt"
    | "kAccountingNumeric"
    | "kCompatibilityVariant"
    | "kIICore"
    | "kIRG_GSource"
    | "kIRG_HSource"
    | "kIRG_JSource"
    | "kIRG_KPSource"
    | "kIRG_KSource"
    | "kIRG_MSource"
    | "kIRG_TSource"
    | "kIRG_USource"
    | "kIRG_VSource"
    | "kOtherNumeric"
    | "kPrimaryNumeric"
    | "kRSUnicode"
    | "lb"
    | "lc"
    | "Line_Break"
    | "LOE"
    | "Logical_Order_Exception"
    | "Lower"
    | "Lowercase"
    | "Lowercase_Mapping"
    | "Math"
    | "na"
    | "na1"
    | "Name"
    | "Name_Alias"
    | "NChar"
    | "NFC_QC"
    | "NFC_Quick_Check"
    | "NFD_QC"
    | "NFD_Quick_Check"
    | "NFKC_Casefold"
    | "NFKC_CF"
    | "NFKC_QC"
    | "NFKC_Quick_Check"
    | "NFKD_QC"
    | "NFKD_Quick_Check"
    | "Noncharacter_Code_Point"
    | "nt"
    | "Numeric_Type"
    | "Numeric_Value"
    | "nv"
    | "OAlpha"
    | "ODI"
    | "OGr_Ext"
    | "OIDC"
    | "OIDS"
    | "OLower"
    | "OMath"
    | "Other_Alphabetic"
    | "Other_Default_Ignorable_Code_Point"
    | "Other_Grapheme_Extend"
    | "Other_ID_Continue"
    | "Other_ID_Start"
    | "Other_Lowercase"
    | "Other_Math"
    | "Other_Uppercase"
    | "OUpper"
    | "Pat_Syn"
    | "Pat_WS"
    | "Pattern_Syntax"
    | "Pattern_White_Space"
    | "PCM"
    | "Prepended_Concatenation_Mark"
    | "QMark"
    | "Quotation_Mark"
    | "Radical"
    | "Regional_Indicator"
    | "RI"
    | "SB"
    | "sc"
    | "scf"
    | "Script"
    | "Script_Extensions"
    | "scx"
    | "SD"
    | "Sentence_Break"
    | "Sentence_Terminal"
    | "Simple_Case_Folding"
    | "Simple_Lowercase_Mapping"
    | "Simple_Titlecase_Mapping"
    | "Simple_Uppercase_Mapping"
    | "slc"
    | "Soft_Dotted"
    | "stc"
    | "STerm"
    | "suc"
    | "tc"
    | "Term"
    | "Terminal_Punctuation"
    | "Titlecase_Mapping"
    | "uc"
    | "UIdeo"
    | "Unicode_1_Name"
    | "Unified_Ideograph"
    | "Upper"
    | "Uppercase"
    | "Uppercase_Mapping"
    | "Variation_Selector"
    | "Vertical_Orientation"
    | "vo"
    | "VS"
    | "WB"
    | "White_Space"
    | "Word_Break"
    | "WSpace"
    | "XID_Continue"
    | "XID_Start"
    | "XIDC"
    | "XIDS"
    | "XO_NFC"
    | "XO_NFD"
    | "XO_NFKC"
    | "XO_NFKD"
   },
   attribute value { text },
   empty
}

Appendix A.1.111 <unihanProp>

<unihanProp> (unihan property) holds the name and value of a normative or informative Unihan character (or glyph) property as part of its attributes. [5.2.1. Character Properties]
Modulegaiji — Formal specification
Attributes
name: specifies the normalized name of a Unicode Han Database (Unihan) property.
Status: Required
Datatype: teidata.xmlName
Legal values are:
kZVariant
kAccountingNumeric
kBigFive
kCCCII
kCNS1986
kCNS1992
kCangjie
kCantonese
kCheungBauer
kCheungBauerIndex
kCihaiT
kCompatibilityVariant
kCowles
kDaeJaweon
kDefinition
kEACC
kFenn
kFennIndex
kFourCornerCode
kFrequency
kGB0
kGB1
kGB3
kGB5
kGB7
kGB8
kGSR
kGradeLevel
kHDZRadBreak
kHKGlyph
kHKSCS
kHanYu
kHangul
kHanyuPinlu
kHanyuPinyin
kIBMJapan
kIICore
kIRGDaeJaweon
kIRGDaiKanwaZiten
kIRGHanyuDaZidian
kIRGKangXi
kIRG_GSource
kIRG_HSource
kIRG_JSource
kIRG_KPSource
kIRG_KSource
kIRG_MSource
kIRG_TSource
kIRG_USource
kIRG_VSource
kJIS0213
kJa
kJapaneseKun
kJapaneseOn
kJinmeiyoKanji
kJis0
kJis1
kJoyoKanji
kKPS0
kKPS1
kKSC0
kKSC1
kKangXi
kKarlgren
kKorean
kKoreanEducationHanja
kKoreanName
kLau
kMainlandTelegraph
kMandarin
kMatthews
kMeyerWempe
kMorohashi
kNelson
kOtherNumeric
kPhonetic
kPrimaryNumeric
kPseudoGB1
kRSAdobe_Japan1_6
kRSJapanese
kRSKanWa
kRSKangXi
kRSKorean
kRSUnicode
kSBGY
kSemanticVariant
kSimplifiedVariant
kSpecializedSemanticVariant
kTGH
kTaiwanTelegraph
kTang
kTotalStrokes
kTraditionalVariant
kVietnamese
kXHC1983
kXerox
value: specifies the value of a named Unihan property.
Status: Required
Datatype: teidata.word
Contained by
gaiji: char glyph
May contain: Empty element
Note

A definitive list of current Unihan property names is provided in the Unicode Han Database.

Example
<unihanProp name="kRSKangXi" value="120.5"  version="12.1"/>
Content model
<content>
 <empty/>
</content>
    
Schema Declaration
element unihanProp
{
   tei_att.global.attributes,
   tei_att.gaijiProp.attribute.version,
   tei_att.gaijiProp.attribute.scheme,
   tei_att.datable.attribute.period,
   tei_att.datable.w3c.attribute.when,
   tei_att.datable.w3c.attribute.notBefore,
   tei_att.datable.w3c.attribute.notAfter,
   tei_att.datable.w3c.attribute.from,
   tei_att.datable.w3c.attribute.to,
   attribute name
   {
      "kZVariant"
    | "kAccountingNumeric"
    | "kBigFive"
    | "kCCCII"
    | "kCNS1986"
    | "kCNS1992"
    | "kCangjie"
    | "kCantonese"
    | "kCheungBauer"
    | "kCheungBauerIndex"
    | "kCihaiT"
    | "kCompatibilityVariant"
    | "kCowles"
    | "kDaeJaweon"
    | "kDefinition"
    | "kEACC"
    | "kFenn"
    | "kFennIndex"
    | "kFourCornerCode"
    | "kFrequency"
    | "kGB0"
    | "kGB1"
    | "kGB3"
    | "kGB5"
    | "kGB7"
    | "kGB8"
    | "kGSR"
    | "kGradeLevel"
    | "kHDZRadBreak"
    | "kHKGlyph"
    | "kHKSCS"
    | "kHanYu"
    | "kHangul"
    | "kHanyuPinlu"
    | "kHanyuPinyin"
    | "kIBMJapan"
    | "kIICore"
    | "kIRGDaeJaweon"
    | "kIRGDaiKanwaZiten"
    | "kIRGHanyuDaZidian"
    | "kIRGKangXi"
    | "kIRG_GSource"
    | "kIRG_HSource"
    | "kIRG_JSource"
    | "kIRG_KPSource"
    | "kIRG_KSource"
    | "kIRG_MSource"
    | "kIRG_TSource"
    | "kIRG_USource"
    | "kIRG_VSource"
    | "kJIS0213"
    | "kJa"
    | "kJapaneseKun"
    | "kJapaneseOn"
    | "kJinmeiyoKanji"
    | "kJis0"
    | "kJis1"
    | "kJoyoKanji"
    | "kKPS0"
    | "kKPS1"
    | "kKSC0"
    | "kKSC1"
    | "kKangXi"
    | "kKarlgren"
    | "kKorean"
    | "kKoreanEducationHanja"
    | "kKoreanName"
    | "kLau"
    | "kMainlandTelegraph"
    | "kMandarin"
    | "kMatthews"
    | "kMeyerWempe"
    | "kMorohashi"
    | "kNelson"
    | "kOtherNumeric"
    | "kPhonetic"
    | "kPrimaryNumeric"
    | "kPseudoGB1"
    | "kRSAdobe_Japan1_6"
    | "kRSJapanese"
    | "kRSKanWa"
    | "kRSKangXi"
    | "kRSKorean"
    | "kRSUnicode"
    | "kSBGY"
    | "kSemanticVariant"
    | "kSimplifiedVariant"
    | "kSpecializedSemanticVariant"
    | "kTGH"
    | "kTaiwanTelegraph"
    | "kTang"
    | "kTotalStrokes"
    | "kTraditionalVariant"
    | "kVietnamese"
    | "kXHC1983"
    | "kXerox"
   },
   attribute value { text },
   empty
}

Appendix A.1.112 <unit>

<unit> contains a symbol, a word or a phrase referring to a unit of measurement in any kind of formal or informal system. [3.6.3. Numbers and Measures]
Modulecore — Formal specification
Attributes
Member of
Contained by
May contain
Example
Here is an example of a <unit> element holding a unitRef attribute that points to a definition of the unit in the TEI header.
<measure>  <num>3</num>  <unit unitRef="#ell">ells</unit> </measure> <!-- In the TEI Header: --> <encodingDesc>  <unitDecl>   <unitDef xml:id="ell">    <label>ell</label>    <placeName ref="#iceland"/>    <desc>A unit of measure for cloth, roughly equivalent to 18 inches, or from an adult male’s elbow to the tip of the middle finger.</desc>   </unitDef>  </unitDecl> </encodingDesc>
Example
<measure>  <num>2</num>  <unit>kg</unit> </measure>
Example
<measure type="value">  <num>3</num>  <unit type="time" unit="min">minute</unit> </measure>
Example
<measure type="interval">  <num atLeast="1.2">1.2</num> to <num atMost="5.6">5.6</num>  <unit type="velocity" unit="km/h">km/h</unit> </measure>
Example
<p>Light travels at <num value="3E10">3×10^10</num>  <unit type="rate" unit="cm/s">   <unit type="space">cm</unit> per <unit type="time">second</unit>  </unit>.</p>
Content model
<content>
 <macroRef key="macro.phraseSeq"/>
</content>
    
Schema Declaration
element unit
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.measurement.attributes,
   tei_att.typed.attributes,
   tei_macro.phraseSeq
}

Appendix A.1.113 <w>

<w> (word) represents a grammatical (not necessarily orthographic) word. [18.1. Linguistic Segment Categories; 18.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Attributes
Member of
Contained by
May contain
Example
This example is adapted from the Folger Library’s Early Modern English Drama version of The Wits: a Comedy by William Davenant.
<l>  <w lemma="it" pos="pn" xml:id="A19883-003-a-0100">IT</w>  <w lemma="have" pos="vvz" xml:id="A19883-003-a-0110">hath</w>  <w lemma="be" pos="vvn" xml:id="A19883-003-a-0120">been</w>  <w lemma="say" pos="vvn" xml:id="A19883-003-a-0130">said</w>  <w lemma="of" pos="acp-p" xml:id="A19883-003-a-0140">of</w>  <w lemma="old" pos="j" xml:id="A19883-003-a-0150">old</w>  <pc xml:id="A19883-003-a-0160">,</pc>  <w lemma="that" pos="cs" xml:id="A19883-003-a-0170">that</w>  <w lemma="play" pos="vvz" xml:id="A19883-003-a-0180">   <choice>    <orig>Playes</orig>    <reg>Plays</reg>   </choice>  </w>  <w lemma="be" pos="vvb" xml:id="A19883-003-a-0190">are</w>  <w lemma="feast" pos="n2" xml:id="A19883-003-a-0200">Feasts</w>  <pc xml:id="A19883-003-a-0210">,</pc> </l> <l xml:id="A19883-e100220">  <w lemma="poet" pos="n2" xml:id="A19883-003-a-0220">Poets</w>  <w lemma="the" pos="d" xml:id="A19883-003-a-0230">the</w>  <w lemma="cook" pos="n2" xml:id="A19883-003-a-0240">   <choice>    <orig>Cookes</orig>    <reg>Cooks</reg>   </choice>  </w>  <pc xml:id="A19883-003-a-0250">,</pc>  <w lemma="and" pos="cc" xml:id="A19883-003-a-0260">and</w>  <w lemma="the" pos="d" xml:id="A19883-003-a-0270">the</w>  <w lemma="spectator" pos="n2" xml:id="A19883-003-a-0280">Spectators</w>  <w lemma="guest" pos="n2" xml:id="A19883-003-a-0290">Guests</w>  <pc xml:id="A19883-003-a-0300">,</pc> </l> <l xml:id="A19883-e100230">  <w lemma="the" pos="d" xml:id="A19883-003-a-0310">The</w>  <w lemma="actor" pos="n2" xml:id="A19883-003-a-0320">Actors</w>  <w lemma="waiter" pos="n2" xml:id="A19883-003-a-0330">Waiters</w>  <pc xml:id="A19883-003-a-0340">:</pc> <!-- ... --> </l>
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <elementRef key="seg"/>
  <elementRef key="w"/>
  <elementRef key="m"/>
  <elementRef key="c"/>
  <elementRef key="pc"/>
  <classRef key="model.global"/>
  <classRef key="model.lPart"/>
  <classRef key="model.hiLike"/>
  <classRef key="model.pPart.edit"/>
 </alternate>
</content>
    
Schema Declaration
element w
{
   tei_att.global.attributes,
   tei_att.cmc.attributes,
   tei_att.linguistic.attributes,
   tei_att.notated.attributes,
   tei_att.segLike.attributes,
   tei_att.typed.attributes,
   (
      text
    | tei_model.gLike
    | seg
    | tei_w
    | m
    | c
    | tei_pc
    | tei_model.global
    | tei_model.lPart
    | tei_model.hiLike
    | tei_model.pPart.edit
   )*
}

Appendix A.1.114 <zone>

<zone> defines any two-dimensional area within a <surface> element. [12.1. Digital Facsimiles; 12.2.2. Embedded Transcription]
Moduletranscr — Formal specification
Attributes
rotate: indicates the amount by which this zone has been rotated clockwise, with respect to the normal orientation of the parent <surface> element as implied by the dimensions given in the <msDesc> element or by the coordinates of the <surface> itself. The orientation is expressed in arc degrees.
Status: Optional
Datatype: teidata.numeric
Default: 0
Member of
Contained by
transcr: line surface zone
May contain
Note

The position of every zone for a given surface is always defined by reference to the coordinate system defined for that surface.

A graphic element contained by a zone represents the whole of the zone.

A zone may be of any shape. The attribute points may be used to define a polygonal zone, using the coordinate system defined by its parent surface.

A zone is always a closed polygon. Repeating the initial coordinate at the end of the sequence is optional. To encode an unclosed path, use the <path> element.
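
The notes above can be sketched as follows, assuming a page whose coordinate system runs from 0,0 to 200,300. All identifiers and coordinates are invented for illustration.

```xml
<surface ulx="0" uly="0" lrx="200" lry="300">
  <!-- rectangular zone, given by its upper-left and lower-right corners -->
  <zone xml:id="z-headline" ulx="10" uly="10" lrx="190" lry="40"/>
  <!-- polygonal zone; the polygon is closed implicitly,
       so the first point is not repeated at the end -->
  <zone xml:id="z-advert"
        points="10,50 100,50 100,120 60,120 60,90 10,90"/>
  <!-- zone rotated 90 degrees clockwise relative to the surface -->
  <zone xml:id="z-margin" ulx="150" uly="60" lrx="190" lry="280"
        rotate="90"/>
</surface>
```

All three zones use the coordinate system of the enclosing <surface>, so their positions can be compared directly.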

Example
<surface ulx="14.54" uly="16.14" lrx="0"  lry="0">  <graphic url="stone.jpg"/>  <zone points="4.6,6.3 5.25,5.85 6.2,6.6 8.19222,7.4125 9.89222,6.5875 10.9422,6.1375 11.4422,6.7125 8.21722,8.3125 6.2,7.65"/> </surface>
This example defines a non-rectangular zone; see the corresponding illustration in the TEI Guidelines.
Example
<facsimile>
 <surface ulx="50" uly="20" lrx="400" lry="280">
  <zone ulx="0" uly="0" lrx="500" lry="321">
   <graphic url="graphic.png"/>
  </zone>
 </surface>
</facsimile>
This example defines a zone which has been defined as larger than its parent surface in order to match the dimensions of the graphic it contains.
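As a further illustration, the following minimal sketch shows how rectangular zones and the rotate attribute might be combined on a scanned newspaper page. The image file name, the coordinates, the xml:id values and the type values are all hypothetical; they are not prescribed by these guidelines.

```xml
<facsimile xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Hypothetical page scan; the surface coordinate system is assumed
       to match the pixel dimensions of the image -->
  <surface ulx="0" uly="0" lrx="2400" lry="3400">
    <graphic url="page-001.jpg"/>
    <!-- Upright headline region: rotate is omitted, so the default 0 applies -->
    <zone xml:id="z1" type="headline" ulx="120" uly="80" lrx="2280" lry="320"/>
    <!-- Caption printed sideways: rotated 90 degrees clockwise -->
    <zone xml:id="z2" type="caption" rotate="90" ulx="2300" uly="400" lrx="2380" lry="1600"/>
  </surface>
</facsimile>
```

Both zones are positioned in the coordinate system of their parent surface, as the Note above requires.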
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.graphicLike"/>
  <classRef key="model.global"/>
  <elementRef key="surface"/>
  <classRef key="model.linePart"/>
 </alternate>
</content>
    
Schema Declaration
element zone
{
   tei_att.global.attributes,
   tei_att.coordinated.attributes,
   tei_att.typed.attributes,
   tei_att.written.attributes,
   attribute rotate { text }?,
   (
      text
    | tei_model.gLike
    | tei_model.graphicLike
    | tei_model.global
    | tei_surface
    | tei_model.linePart
   )*
}

Appendix A.2 Model classes

Appendix A.2.1 model.addressLike

model.addressLike groups elements used to represent a postal or email address. [1. The TEI Infrastructure]
Moduletei — Formal specification
Used by
Membersemail

Appendix A.2.2 model.applicationLike

model.applicationLike groups elements used to record application-specific information about a document in its header.
Moduletei — Formal specification
Used by
Membersapplication

Appendix A.2.3 model.attributable

model.attributable groups elements that contain a word or phrase that can be attributed to a source. [3.3.3. Quotation 4.3.2. Floating Texts]
Moduletei — Formal specification
Used by
Membersmodel.quoteLike

Appendix A.2.4 model.availabilityPart

model.availabilityPart groups elements such as licences and paragraphs of text which may appear as part of an availability statement. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduletei — Formal specification
Used by
Memberslicence

Appendix A.2.5 model.biblLike

model.biblLike groups elements containing a bibliographic description. [3.12. Bibliographic Citations and References]
Moduletei — Formal specification
Used by
Membersbibl

Appendix A.2.6 model.biblPart

model.biblPart groups elements which represent components of a bibliographic description. [3.12. Bibliographic Citations and References]
Moduletei — Formal specification
Used by
Membersmodel.imprintPart[pubPlace publisher] model.respLike[funder meeting respStmt] availability bibl edition extent

Appendix A.2.7 model.common

model.common groups common chunk- and inter-level elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.cmc model.divPart[model.lLike model.pLike[p]] model.inter[model.attributable[model.quoteLike] model.biblLike[bibl] model.egLike model.labelLike[desc label] model.listLike model.oddDecl model.stageLike]
Note

This class defines the set of chunk- and inter-level elements; it is used in many content models, including those for textual divisions.

Appendix A.2.8 model.dateLike

model.dateLike groups elements containing temporal expressions. [3.6.4. Dates and Times 14.4. Dates]
Moduletei — Formal specification
Used by
Membersdate time

Appendix A.2.9 model.descLike

model.descLike groups elements which contain a description of their function.
Moduletei — Formal specification
Used by
Membersdesc

Appendix A.2.10 model.describedResource

model.describedResource groups elements which contain the content of a digital resource and its metadata; these elements may serve as the outermost or ‘root’ element of a TEI-conformant document. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
MembersTEI teiCorpus

Appendix A.2.11 model.divBottom

model.divBottom groups elements appearing at the end of a text division. [4.2. Elements Common to All Divisions]
Moduletei — Formal specification
Used by
Membersmodel.divBottomPart model.divWrapper[docDate meeting]

Appendix A.2.12 model.divLike

model.divLike groups elements used to represent un-numbered generic structural divisions.
Moduletei — Formal specification
Used by
Membersdiv

Appendix A.2.13 model.divPart

model.divPart groups paragraph-level elements appearing directly within divisions. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.lLike model.pLike[p]
Note

Note that this element class does not include members of the model.inter class, which can appear either within or between paragraph-level items.

Appendix A.2.14 model.divTop

model.divTop groups elements appearing at the beginning of a text division. [4.2. Elements Common to All Divisions]
Moduletei — Formal specification
Used by
Membersmodel.divTopPart[model.headLike[head]] model.divWrapper[docDate meeting]

Appendix A.2.15 model.divTopPart

model.divTopPart groups elements which can occur only at the beginning of a text division. [4.6. Title Pages]
Moduletei — Formal specification
Used by
Membersmodel.headLike[head]

Appendix A.2.16 model.divWrapper

model.divWrapper groups elements which can appear at either top or bottom of a textual division. [4.2. Elements Common to All Divisions]
Moduletei — Formal specification
Used by
MembersdocDate meeting

Appendix A.2.17 model.editorialDeclPart

model.editorialDeclPart groups elements which may be used inside <editorialDecl> and appear multiple times.
Moduletei — Formal specification
Used by
Memberscorrection hyphenation normalization quotation segmentation

Appendix A.2.18 model.emphLike

model.emphLike groups phrase-level elements which are typographically distinct and to which a specific function can be attributed. [3.3. Highlighting and Quotation]
Moduletei — Formal specification
Used by
Membersterm title

Appendix A.2.19 model.encodingDescPart

model.encodingDescPart groups elements which may be used inside <encodingDesc> and appear multiple times.
Moduletei — Formal specification
Used by
MembersappInfo charDecl classDecl editorialDecl listPrefixDef projectDesc tagsDecl

Appendix A.2.20 model.frontPart

model.frontPart groups elements which appear at the level of divisions within front or back matter. [7.1. Front and Back Matter]
Moduletei — Formal specification
Used by
Membersmodel.frontPart.drama

Appendix A.2.21 model.gLike

model.gLike groups elements used to represent individual non-Unicode characters or glyphs.
Moduletei — Formal specification
Used by
Membersg

Appendix A.2.23 model.global.edit

model.global.edit groups globally available elements which perform a specifically editorial function. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
MembersaddSpan damageSpan delSpan gap space

Appendix A.2.24 model.global.meta

model.global.meta groups globally available elements which describe the status of other elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
MemberslistTranspose substJoin
Note

Elements in this class are typically used to hold groups of links or of abstract interpretations, or to provide indications of certainty etc. It may be convenient to localize all metadata elements, for example to contain them within the same division as the elements that they relate to; or to locate them all in a division of their own. They may however appear at any point in a TEI text.

Appendix A.2.25 model.graphicLike

model.graphicLike groups elements containing images, formulae, and similar objects. [3.10. Graphics and Other Non-textual Components]
Moduletei — Formal specification
Used by
Membersgraphic media

Appendix A.2.26 model.headLike

model.headLike groups elements used to provide a title or heading at the start of a text division.
Moduletei — Formal specification
Used by
Membershead

Appendix A.2.27 model.highlighted

model.highlighted groups phrase-level elements which are typographically distinct. [3.3. Highlighting and Quotation]
Moduletei — Formal specification
Used by
Membersmodel.emphLike[term title] model.hiLike

Appendix A.2.28 model.imprintPart

model.imprintPart groups the bibliographic elements which occur inside imprints. [3.12. Bibliographic Citations and References]
Moduletei — Formal specification
Used by
MemberspubPlace publisher

Appendix A.2.29 model.inter

model.inter groups elements which can appear either within or between paragraph-like elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.attributable[model.quoteLike] model.biblLike[bibl] model.egLike model.labelLike[desc label] model.listLike model.oddDecl model.stageLike

Appendix A.2.30 model.labelLike

model.labelLike groups elements used to gloss or explain other parts of a document.
Moduletei — Formal specification
Used by
Membersdesc label

Appendix A.2.31 model.limitedPhrase

model.limitedPhrase groups phrase-level elements excluding those elements primarily intended for transcription of existing sources. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.emphLike[term title] model.hiLike model.pPart.data[model.addressLike[email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name] model.offsetLike model.placeStateLike[model.placeNamePart] idno]] model.pPart.editorial[ex subst] model.pPart.msdesc model.phrase.xml model.ptrLike[ref]

Appendix A.2.32 model.linePart

model.linePart groups transcriptional elements which appear within lines or zones of a source-oriented transcription within a <sourceDoc> element.
Moduletei — Formal specification
Used by
Membersmodel.hiLike damage handShift line mod path pc redo restore retrace undo w zone

Appendix A.2.33 model.measureLike

model.measureLike groups elements which denote a number, a quantity, a measurement, or similar piece of text that conveys some numerical meaning. [3.6.3. Numbers and Measures]
Moduletei — Formal specification
Used by
Membersmeasure num unit

Appendix A.2.34 model.milestoneLike

model.milestoneLike groups milestone-style elements used to represent reference systems. [1.3. The TEI Class System 3.11.3. Milestone Elements]
Moduletei — Formal specification
Used by
Membersfw pb

Appendix A.2.35 model.nameLike

model.nameLike groups elements which name or refer to a person, place, or organization.
Moduletei — Formal specification
Used by
Membersmodel.nameLike.agent[name] model.offsetLike model.placeStateLike[model.placeNamePart] idno
Note

A superset of the naming elements that may appear in datelines, addresses, statements of responsibility, etc.

Appendix A.2.36 model.nameLike.agent

model.nameLike.agent groups elements which contain names of individuals or corporate bodies. [3.6. Names, Numbers, Dates, Abbreviations, and Addresses]
Moduletei — Formal specification
Used by
Membersname
Note

This class is used in the content model of elements which reference names of people or organizations.

Appendix A.2.37 model.noteLike

model.noteLike groups globally-available note-like elements. [3.9. Notes, Annotation, and Indexing]
Moduletei — Formal specification
Used by
Membersnote

Appendix A.2.39 model.pLike.front

model.pLike.front groups paragraph-like elements which can occur as direct constituents of front matter. [4.6. Title Pages]
Moduletei — Formal specification
Used by
MembersdocDate head

Appendix A.2.40 model.pPart.data

model.pPart.data groups phrase-level elements containing names, dates, numbers, measures, and similar data. [3.6. Names, Numbers, Dates, Abbreviations, and Addresses]
Moduletei — Formal specification
Used by
Membersmodel.addressLike[email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name] model.offsetLike model.placeStateLike[model.placeNamePart] idno]

Appendix A.2.41 model.pPart.edit

model.pPart.edit groups phrase-level elements for simple editorial correction and transcription. [3.5. Simple Editorial Changes]
Moduletei — Formal specification
Used by
Membersmodel.pPart.editorial[ex subst] model.pPart.transcriptional[damage handShift mod redo restore retrace secl supplied surplus undo]

Appendix A.2.42 model.pPart.editorial

model.pPart.editorial groups phrase-level elements for simple editorial interventions that may be useful both in transcribing and in authoring. [3.5. Simple Editorial Changes]
Moduletei — Formal specification
Used by
Membersex subst

Appendix A.2.43 model.pPart.transcriptional

model.pPart.transcriptional groups phrase-level elements used for editorial transcription of pre-existing source materials. [3.5. Simple Editorial Changes]
Moduletei — Formal specification
Used by
Membersdamage handShift mod redo restore retrace secl supplied surplus undo

Appendix A.2.45 model.phrase

model.phrase groups elements which can occur at the level of individual words or phrases. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersmodel.graphicLike[graphic media] model.highlighted[model.emphLike[term title] model.hiLike] model.lPart model.pPart.data[model.addressLike[email] model.dateLike[date time] model.measureLike[measure num unit] model.nameLike[model.nameLike.agent[name] model.offsetLike model.placeStateLike[model.placeNamePart] idno]] model.pPart.edit[model.pPart.editorial[ex subst] model.pPart.transcriptional[damage handShift mod redo restore retrace secl supplied surplus undo]] model.pPart.msdesc model.phrase.xml model.ptrLike[ref] model.segLike[pc s w] model.specDescLike
Note

This class of elements can occur within paragraphs, list items, lines of verse, etc.

Appendix A.2.46 model.placeStateLike

model.placeStateLike groups elements which describe changing states of a place.
Moduletei — Formal specification
Used by
Membersmodel.placeNamePart

Appendix A.2.47 model.profileDescPart

model.profileDescPart groups elements which may be used inside <profileDesc> and appear multiple times.
Moduletei — Formal specification
Used by
MembershandNotes langUsage listTranspose particDesc settingDesc textClass

Appendix A.2.48 model.ptrLike

model.ptrLike groups elements used for purposes of location and reference. [3.7. Simple Links and Cross-References]
Moduletei — Formal specification
Used by
Membersref

Appendix A.2.49 model.publicationStmtPart.agency

model.publicationStmtPart.agency groups the child elements of a <publicationStmt> element of the TEI header that indicate an authorising agent. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduletei — Formal specification
Used by
Memberspublisher
Note

The ‘agency’ child elements are not required in general, but one must be present if any of the ‘detail’ child elements is to be used. It is not valid to have a ‘detail’ child element without a preceding ‘agency’ child element.

See also model.publicationStmtPart.detail.

Appendix A.2.50 model.publicationStmtPart.detail

model.publicationStmtPart.detail groups the agency-specific child elements of the <publicationStmt> element of the TEI header. [2.2.4. Publication, Distribution, Licensing, etc.]
Moduletei — Formal specification
Used by
Membersmodel.ptrLike[ref] availability date idno pubPlace
Note

A ‘detail’ child element may not occur unless an ‘agency’ child element precedes it.

See also model.publicationStmtPart.agency.
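The ordering constraint between ‘agency’ and ‘detail’ elements can be sketched as follows. This is only an illustration: the publisher name, place, identifier, licence and date are placeholder values and do not describe any actual PressMint corpus.

```xml
<publicationStmt xmlns="http://www.tei-c.org/ns/1.0">
  <!-- Agency element: must precede any detail element -->
  <publisher>Example National Library</publisher>
  <!-- Detail elements: valid only because an agency element comes first -->
  <pubPlace>Example City</pubPlace>
  <idno type="URI">https://example.org/corpus</idno>
  <availability status="free">
    <licence target="https://creativecommons.org/licenses/by/4.0/">CC BY 4.0</licence>
  </availability>
  <date when="2025-01-01">1 January 2025</date>
</publicationStmt>
```

Removing the <publisher> element from this sketch would leave the detail elements without a preceding agency element, which is not valid.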

Appendix A.2.51 model.resource

model.resource groups separate elements which constitute the content of a digital resource, as opposed to its metadata. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Membersfacsimile sourceDoc text

Appendix A.2.52 model.respLike

model.respLike groups elements which are used to indicate intellectual or other significant responsibility, for example within a bibliographic element.
Moduletei — Formal specification
Used by
Membersfunder meeting respStmt

Appendix A.2.53 model.segLike

model.segLike groups elements used for arbitrary segmentation. [17.3. Blocks, Segments, and Anchors 18.1. Linguistic Segment Categories]
Moduletei — Formal specification
Used by
Memberspc s w
Note

The principles on which segmentation is carried out, and any special codes or attribute values used, should be defined explicitly in the <segmentation> element of the <encodingDesc> within the associated TEI header.
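A header fragment documenting the segmentation might look like the sketch below. The wording of the paragraph is a hypothetical example, not text required by these guidelines; each corpus should describe its own practice.

```xml
<encodingDesc xmlns="http://www.tei-c.org/ns/1.0">
  <editorialDecl>
    <segmentation>
      <!-- Hypothetical description; adapt to the actual corpus practice -->
      <p>Sentences are marked with s elements. Within sentences, words are
        marked with w and punctuation characters with pc.</p>
    </segmentation>
  </editorialDecl>
</encodingDesc>
```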

Appendix A.2.54 model.teiHeaderPart

model.teiHeaderPart groups high level elements which may appear more than once in a TEI header.
Moduletei — Formal specification
Used by
MembersencodingDesc profileDesc

Appendix A.3 Attribute classes

Appendix A.3.1 att.anchoring

att.anchoring (anchoring) provides attributes for use on annotations, e.g. notes and groups of notes describing the existence and position of an anchor for annotations.
Moduletei — Formal specification
Membersnote
Attributes
anchored (anchored) indicates whether the copy text shows the exact place of reference for the note.
StatusOptional
Datatypeteidata.truthValue
Defaulttrue
Note

In modern texts, notes are usually anchored by means of explicit footnote or endnote symbols. An explicit indication of the phrase or line annotated may however be used instead (e.g. ‘page 218, lines 3–4’). The anchored attribute indicates whether any explicit location is given, whether by symbol or by prose cross-reference. The value true indicates that such an explicit location is indicated in the copy text; the value false indicates that the copy text does not indicate a specific place of attachment for the note. If the specific symbols used in the copy text at the location the note is anchored are to be recorded, use the n attribute.

targetEnd (target end) points to the end of the span to which the note is attached, if the note is not embedded in the text at that point.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

This attribute is retained for backwards compatibility; it may be removed at a subsequent release of the Guidelines. The recommended way of pointing to a span of elements is by means of the range function of XPointer, as further described in 17.2.4.6. range().

Example
<p>(...) tamen reuerendos dominos archiepiscopum et canonicos Leopolienses necnon episcopum in duplicibus Quatuortemporibus<anchor xml:id="A55234"/> totaliter expediui...</p> <!-- elsewhere in the document --> <noteGrp targetEnd="#A55234">  <note xml:lang="en"> Quatuor Tempora, so called dry fast days.  </note>  <note xml:lang="pl"> Quatuor Tempora, tzw. Suche dni postne.  </note> </noteGrp>
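The anchored attribute described above can be illustrated with a similar sketch. The text content and the footnote symbol are invented for the example.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0">The harvest figures were revised in March.<note n="*" anchored="true">Revised again in May.</note>
  <!-- No reference mark in the copy text points to this note -->
  <note anchored="false">General remark printed in the margin.</note></p>
```

The first note records the symbol used in the copy text in its n attribute; the second has no explicit place of attachment, so anchored is set to false.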

Appendix A.3.2 att.ascribed

att.ascribed provides attributes for elements representing speech or action that can be ascribed to a specific individual. [3.3.3. Quotation 8.3. Elements Unique to Spoken Texts]
Moduletei — Formal specification
Memberschange setting
Attributes
who indicates the person, or group of people, to whom the element content is ascribed.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
In the following example from Hamlet, speeches (<sp>) in the body of the play are linked to <role> elements in the <castList> using the who attribute.
<castItem type="role">  <role xml:id="Barnardo">Bernardo</role> </castItem> <castItem type="role">  <role xml:id="Francisco">Francisco</role>  <roleDesc>a soldier</roleDesc> </castItem> <!-- ... --> <sp who="#Barnardo">  <speaker>Bernardo</speaker>  <l n="1">Who's there?</l> </sp> <sp who="#Francisco">  <speaker>Francisco</speaker>  <l n="2">Nay, answer me: stand, and unfold yourself.</l> </sp>
Note

For transcribed speech, this will typically identify a participant or participant group; in other contexts, it will point to any identified <person> element.

Appendix A.3.3 att.breaking

att.breaking provides attributes to indicate whether or not the element concerned is considered to mark the end of an orthographic token in the same way as whitespace. [3.11.3. Milestone Elements]
Moduletei — Formal specification
Memberspb
Attributes
break indicates whether or not the element bearing this attribute should be considered to mark the end of an orthographic token in the same way as whitespace.
StatusRecommended
Datatypeteidata.enumerated
Sample values include
yes
the element bearing this attribute is considered to mark the end of any adjacent orthographic token irrespective of the presence of any adjacent whitespace
no
the element bearing this attribute is considered not to mark the end of any adjacent orthographic token irrespective of the presence of any adjacent whitespace
maybe
the encoding does not take any position on this issue.
In the following lines from the Dream of the Rood, the words lāðost and reord-berendum each start on one line and continue onto the next.
<ab> ...eƿesa tome iu icƿæs ȝeƿorden ƿita heardoſt . leodum la<lb break="no"/> ðost ærþan ichim lifes ƿeȝ rihtne ȝerymde reord be<lb break="no"/> rendum hƿæt me þaȝeƿeorðode ƿuldres ealdor ofer... </ab>
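Since <pb> is the member of this class listed above, the same mechanism can be sketched for a page break. The text and page numbers below are invented for illustration.

```xml
<p xmlns="http://www.tei-c.org/ns/1.0">The committee post<pb n="2" break="no"/>poned its decision
  until the following week. <pb n="3" break="yes"/>The next session opened on Monday.</p>
```

With break="no" the word "postponed" is treated as a single orthographic token that continues across the page boundary; with break="yes" the page break separates tokens regardless of adjacent whitespace.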

Appendix A.3.4 att.cReferencing

att.cReferencing provides attributes that may be used to supply a canonical reference as a means of identifying the target of a pointer.
Moduletei — Formal specification
Membersref term
Attributes
cRef (canonical reference) specifies the destination of the pointer by supplying a canonical reference expressed using the scheme defined in a <refsDecl> element in the TEI header.
StatusOptional
Datatypeteidata.text
Note

The value of cRef should be constructed so that when the algorithm for the resolution of canonical references (described in section 17.2.5. Canonical References) is applied to it the result is a valid URI reference to the intended target.

The <refsDecl> to use may be indicated with the decls attribute.

Currently these Guidelines only provide for a single canonical reference to be encoded on any given <ptr> element.

Appendix A.3.5 att.calendarSystem

att.calendarSystem provides attributes for indicating calendar systems to which a date belongs. [3.6.4. Dates and Times 14.4. Dates]
Moduletei — Formal specification
Membersdate docDate time
Attributes
calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Schematron
<sch:rule context="tei:*[@calendar]"> <sch:assert test="string-length( normalize-space(.) ) gt 0"> @calendar indicates one or more systems or calendars to which the date represented by the content of this element belongs, but this <sch:name/> element has no textual content.</sch:assert> </sch:rule>
He was born on <date calendar="#gregorian">Feb. 22, 1732</date> (<date calendar="#julian"  when="1732-02-22">Feb. 11, 1731/32, O.S.</date>).
He was born on <date calendar="#gregorian #julian"  when="1732-02-22">Feb. 22, 1732 (Feb. 11, 1731/32, O.S.)</date>.
Note

Note that the calendar attribute declares the calendar system used to interpret the textual content of an element, as it appears on an original source. It does not modify the interpretation of the normalization attributes provided by att.datable.w3c, att.datable.iso, or att.datable.custom. Attributes from those first two classes are always interpreted as Gregorian or proleptic Gregorian dates, as per the respective standards on which they are based. The calendar system used to interpret the last (att.datable.custom) may be specified with datingMethod.

Appendix A.3.6 att.canonical

att.canonical provides attributes that can be used to associate a representation such as a name or title with canonical information about the object being named or referenced. [14.1.1. Linking Names and Their Referents]
Moduletei — Formal specification
Membersatt.naming[att.personal[name] pubPlace] bibl catDesc date funder meeting publisher resp respStmt term time title
Attributes
key provides an externally-defined means of identifying the entity (or entities) being named, using a coded value of some kind.
StatusOptional
Datatypeteidata.text
<author>  <name key="Hugo, Victor (1802-1885)"   ref="http://www.idref.fr/026927608">Victor Hugo</name> </author>
Note

The value may be a unique identifier from a database, or any other externally-defined string identifying the referent. No particular syntax is proposed for the values of the key attribute, since its form will depend entirely on practice within a given project.

ref (reference) provides an explicit means of locating a full definition or identity for the entity being named by means of one or more URIs.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<name ref="http://viaf.org/viaf/109557338"  type="person">Seamus Heaney</name>
Note

The value must point directly to one or more XML elements or other resources by means of one or more URIs, separated by whitespace. If more than one is supplied the implication is that the name identifies several distinct entities.

Example
In this contrived example, a canonical reference to the same organisation is provided in four different ways.
<author n="1">  <name ref="http://nzetc.victoria.ac.nz/tm/scholarly/name-427308.html"   type="organisation">New Zealand Parliament, Legislative Council</name> </author>   <author n="2">  <name ref="nzvn:427308"   type="organisation">New Zealand Parliament, Legislative Council</name> </author>   <author n="3">  <name ref="./named_entities.xml#o427308"   type="organisation">New Zealand Parliament, Legislative Council</name> </author>   <author n="4">  <name key="name-427308"   type="organisation">New Zealand Parliament, Legislative Council</name> </author>
The first presumes the availability of an internet connection and a processor that can resolve a URI (most can). The second requires, in addition, a <prefixDef> that declares how the nzvn prefix should be interpreted. The third does not require an internet connection, but does require that a file named named_entities.xml be in the same directory as the TEI document. The fourth requires that an entire external system for key resolution be available.
Note

The key attribute is more flexible and general-purpose, but its use in interchange requires that documentation about how the key is to be resolved be sent to the recipient of the TEI document. In contrast values of the ref attribute are resolved using the widely accepted protocols for a URI, and thus less documentation, if any, is likely required by the recipient in data interchange.

These guidelines provide no semantic basis or suggested precedence when both key and ref are provided. For this reason simultaneous use of both is not recommended unless documentation explaining the use is provided, probably in an ODD customization, for interchange.

Appendix A.3.7 att.cmc

att.cmc (computer-mediated communication) provides attributes categorizing how the element content was created in a CMC environment.
Moduletei — Formal specification
Membersbibl date desc docDate email gap graphic head idno label measure media meeting name note num p pb pc ref s term time title unit w
Attributes
generatedBy (generated by) categorizes how the content of an element was generated in a CMC environment.
StatusOptional
Datatypeteidata.enumerated
Schematron
<sch:rule context="tei:*[@generatedBy]"> <sch:assert test="ancestor-or-self::tei:post">The @generatedBy attribute is for use within a <post> element.</sch:assert> </sch:rule>
Suggested values include:
human
the content was ‘naturally’ typed or spoken by a human user
template
the content was generated after a human user activated a template for its insertion
system
the content was generated by the system, i.e. the CMC environment
bot
the content was generated by a bot, i.e. a non-human agent, typically one that is not part of the CMC environment itself
unspecified
the content was generated by an unknown or unspecified process
automatic system message in chat: user moves on to another chatroom
<post type="event"  generatedBy="system"  who="#system"  rend="color:blue">  <p>   <name type="nickname"    corresp="#A02">McMike</name> geht    in einen anderen Raum: <name type="roomname">Kreuzfahrt</name>  </p> </post>
automatic system message in chat: user enters a chatroom
<post type="event"  generatedBy="system">  <p>   <name type="nickname"    corresp="#A08">c_bo</name> betritt    den Raum. </p> </post>
automatic system message in chat: user changes his font color
<post type="event"  generatedBy="system"  rend="color:red">  <p>   <name type="nickname"    corresp="#A08">c_bo</name> hat die    Farbe gewechselt.  </p> </post>
An automatic signature of user including an automatic timestamp (Wikipedia discussion, anonymized). The specification of generatedBy at the inner element <signed> is meant to override the specification at the outer element <post>. This is generally possible when the outer generatedBy value is "human".
<post type="standard"  generatedBy="human"  indentLevel="2"  synch="#t00394407"  who="#WU00005582">  <p> Kurze Nachfrage: Die Hieros für den Goldnamen stammen    auch von Beckerath gem. Literatur ? Grüße --</p>  <signed generatedBy="template"   rend="inline">   <gap reason="signatureContent"/>   <time generatedBy="template">18:50, 22. Okt. 2008 (CEST)</time>  </signed> </post>
Wikipedia talk page: user signature
<post type="written"  generatedBy="human"> <!-- ... main content of posting ... -->  <signed generatedBy="template">   <gap reason="signatureContent"/>   <time generatedBy="template">12:01, 12. Jun. 2009 (CEST)</time>  </signed> </post>

Appendix A.3.8 att.coordinated

att.coordinated provides attributes that can be used to position their parent element within a two dimensional coordinate system.
Moduletranscr — Formal specification
Membersline path surface zone
Attributes
start indicates the element within a transcription of the text containing at least the start of the writing represented by this zone or surface.
StatusOptional
Datatypeteidata.pointer
ulx gives the x coordinate value for the upper left corner of a rectangular space.
StatusOptional
Datatypeteidata.numeric
uly gives the y coordinate value for the upper left corner of a rectangular space.
StatusOptional
Datatypeteidata.numeric
lrx gives the x coordinate value for the lower right corner of a rectangular space.
StatusOptional
Datatypeteidata.numeric
lry gives the y coordinate value for the lower right corner of a rectangular space.
StatusOptional
Datatypeteidata.numeric
points identifies a two dimensional area by means of a series of pairs of numbers, each of which gives the x,y coordinates of a point on a line enclosing the area.
StatusOptional
Datatype3–∞ occurrences of teidata.point separated by whitespace
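The two ways of locating an area, by corner coordinates or by a list of points, can be compared in the following sketch. The coordinates and identifiers are hypothetical; both zones describe the same rectangle.

```xml
<surface xmlns="http://www.tei-c.org/ns/1.0" ulx="0" uly="0" lrx="1000" lry="1500">
  <!-- Rectangle given by its upper left and lower right corners -->
  <zone xml:id="col1-box" ulx="50" uly="100" lrx="450" lry="1400"/>
  <!-- The same rectangle given as four x,y points (at least three are required) -->
  <zone xml:id="col1-poly" points="50,100 450,100 450,1400 50,1400"/>
</surface>
```

The corner form is more compact for rectangular regions, while points is needed for any region that is not a rectangle.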

Appendix A.3.9 att.damaged

att.damaged provides attributes describing the nature of any physical damage affecting a reading. [12.3.3.1. Damage, Illegibility, and Supplied Text 1.3.1. Attribute Classes]
Moduletei — Formal specification
Membersdamage damageSpan
Attributes
agent categorizes the cause of the damage, if it can be identified.
StatusOptional
Datatypeteidata.enumerated
Sample values include:
rubbing
damage results from rubbing of the leaf edges
mildew
damage results from mildew on the leaf surface
smoke
damage results from smoke
degree provides a coded representation of the degree of damage, either as a number between 0 (undamaged) and 1 (very extensively damaged), or as one of the codes high, medium, low, or unknown. The <damage> element with the degree attribute should only be used where the text may be read with some confidence; text supplied from other sources should be tagged as <supplied>.
StatusOptional
Datatypeteidata.probCert
Note

The <damage> element is appropriate where it is desired to record the fact of damage although this has not affected the readability of the text, for example a weathered inscription. Where the damage has rendered the text more or less illegible either the <unclear> tag (for partial illegibility) or the <gap> tag (for complete illegibility, with no text supplied) should be used, with the information concerning the damage given in the attribute values of these tags. See section 12.3.3.2. Use of the gap, del, damage, unclear, and supplied Elements in Combination for discussion of the use of these tags in particular circumstances.

group assigns an arbitrary number to each stretch of damage regarded as forming part of the same physical phenomenon.
StatusOptional
Datatypeteidata.count
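Example: The fragment below is an illustrative sketch and is not taken from any PressMint corpus. Two stretches of text on a mildewed page remain readable, so they are wrapped in <damage>; the shared group value records that both result from the same physical phenomenon. The wording and the degree values are invented, and show both the numeric and the coded form of degree.

```xml
<p>The council met on
  <damage agent="mildew" degree="0.3" group="1">Tuesday evening</damage>
  to discuss the new bridge, and
  <damage agent="mildew" degree="medium" group="1">adjourned at nine</damage>.</p>
```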

Appendix A.3.10 att.datable

att.datable provides attributes for normalization of elements that contain dates, times, or datable events. [3.6.4. Dates and Times 14.4. Dates]
Module: tei (Formal specification)
Members: att.gaijiProp[localProp unicodeProp unihanProp] application change date docDate funder idno licence mapping meeting name resp time title
Attributes
period: supplies pointers to one or more definitions of named periods of time (typically <category>s, <date>s, or <event>s) within which the datable item is understood to have occurred.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Note

This ‘superclass’ provides attributes that can be used to provide normalized values of temporal information. By default, the attributes from the att.datable.w3c class are provided. If the module for names & dates is loaded, this class also provides attributes from the att.datable.iso and att.datable.custom classes. In general, the possible values of attributes restricted to the W3C datatypes form a subset of those values available via the ISO 8601 standard. However, the greater expressiveness of the ISO datatypes may not be needed, and there exists much greater software support for the W3C datatypes.
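
Example: A minimal sketch of the period attribute, assuming that the named period is defined as a <category> in the corpus header. The identifiers periods and period.interwar, and the period itself, are hypothetical.

```xml
<!-- In the header: a hypothetical definition of a named period -->
<taxonomy xml:id="periods">
  <category xml:id="period.interwar">
    <catDesc>Interwar period (1918 to 1939)</catDesc>
  </category>
</taxonomy>

<!-- In the text: a date assigned to that period -->
<date when="1925-03-14" period="#period.interwar">14 March 1925</date>
```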

Appendix A.3.11 att.datable.w3c

att.datable.w3c provides attributes for normalization of elements that contain datable events conforming to the W3C XML Schema Part 2: Datatypes Second Edition. [3.6.4. Dates and Times 14.4. Dates]
Module: tei (Formal specification)
Members: att.datable[att.gaijiProp[localProp unicodeProp unihanProp] application change date docDate funder idno licence mapping meeting name resp time title]
Attributes
when: supplies the value of the date or time in a standard form, e.g. yyyy-mm-dd.
Status: Optional
Datatype: teidata.temporal.w3c
Examples of W3C date, time, and date & time formats.
<p>  <date when="1945-10-24">24 Oct 45</date>  <date when="1996-09-24T07:25:00Z">September 24th, 1996 at 3:25 in the morning</date>  <time when="1999-01-04T20:42:00-05:00">Jan 4 1999 at 8 pm</time>  <time when="14:12:38">fourteen twelve and 38 seconds</time>  <date when="1962-10">October of 1962</date>  <date when="--06-12">June 12th</date>  <date when="---01">the first of the month</date>  <date when="--08">August</date>  <date when="2006">MMVI</date>  <date when="0056">AD 56</date>  <date when="-0056">56 BC</date> </p>
This list begins in the year 1632, more precisely on Trinity Sunday, i.e. the Sunday after Pentecost, in that year the <date calendar="#julian"  when="1632-06-06">27th of May (old style)</date>.
<opener>  <dateline>   <placeName>Dorchester, Village,</placeName>   <date when="1828-03-02">March 2d. 1828.</date>  </dateline>  <salute>To    Mrs. Cornell,</salute> Sunday <time when="12:00:00">noon.</time> </opener>
notBefore: specifies the earliest possible date for the event in standard form, e.g. yyyy-mm-dd.
Status: Optional
Datatype: teidata.temporal.w3c
notAfter: specifies the latest possible date for the event in standard form, e.g. yyyy-mm-dd.
Status: Optional
Datatype: teidata.temporal.w3c
from: indicates the starting point of the period in standard form, e.g. yyyy-mm-dd.
Status: Optional
Datatype: teidata.temporal.w3c
to: indicates the ending point of the period in standard form, e.g. yyyy-mm-dd.
Status: Optional
Datatype: teidata.temporal.w3c
Schematron
<sch:rule context="tei:*[@when]"> <sch:report test="@notBefore|@notAfter|@from|@to"  role="nonfatal">The @when attribute cannot be used with any other att.datable.w3c attributes.</sch:report> </sch:rule>
Schematron
<sch:rule context="tei:*[@from]"> <sch:report test="@notBefore"  role="nonfatal">The @from and @notBefore attributes cannot be used together.</sch:report> </sch:rule>
Schematron
<sch:rule context="tei:*[@to]"> <sch:report test="@notAfter"  role="nonfatal">The @to and @notAfter attributes cannot be used together.</sch:report> </sch:rule>
Example
<date from="1863-05-28" to="1863-06-01">28 May through 1 June 1863</date>
Note

The value of these attributes should be a normalized representation of the date, time, or combined date & time intended, in any of the standard formats specified by XML Schema Part 2: Datatypes Second Edition, using the Gregorian calendar.

The most commonly-encountered format for the date portion of a temporal attribute is yyyy-mm-dd, but yyyy, --mm, ---dd, yyyy-mm, or --mm-dd may also be used. For the time part, the form hh:mm:ss is used.

Note that this format does not currently permit use of the value 0000 to represent the year 1 BCE; instead the value -0001 should be used.
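
Example: The existing examples cover when and the from/to pair; the sketch below adds notBefore and notAfter, which may be useful when a newspaper issue can only be dated approximately. The dates are invented for illustration, and the last item is kept inside a comment because it shows a combination that the first Schematron rule above would report.

```xml
<!-- Issue known only to have appeared on some day in March 1848 -->
<date notBefore="1848-03-01" notAfter="1848-03-31">March 1848</date>

<!-- Year only: the reduced form yyyy is sufficient -->
<date when="1848">1848</date>

<!-- Reported by Schematron, because when is combined with a range:
     <date when="1848-03" from="1848-03-01" to="1848-03-31">March 1848</date> -->
```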

Appendix A.3.12 att.datcat

att.datcat provides attributes that are used to align XML elements or attributes with the appropriate Data Categories (DCs) defined by an external taxonomy, in this way establishing the identity of information containers and values, and providing means of interpreting them. [10.5.2. Lexical View 19.3. Other Atomic Feature Values]
Module: tei (Formal specification)
Members: att.segLike[pc s w] category tagUsage taxonomy
Attributes
datcat: provides a pointer to a definition of, and/or general information about, (a) an information container (element or attribute) or (b) a value of an information container (element content or attribute value), by referencing an external taxonomy or ontology. If valueDatcat is present in the immediate context, this attribute takes on role (a), while valueDatcat performs role (b).
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
valueDatcat: provides a definition of, and/or general information about a value of an information container (element content or attribute value), by reference to an external taxonomy or ontology. Used especially where a contrast with datcat is needed.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
targetDatcat: provides a definition of, and/or general information about, information structure of an object referenced or modeled by the containing element, by reference to an external taxonomy or ontology. This attribute has the characteristics of the datcat attribute, except that it addresses not its containing element, but an object that is being referenced or modeled by its containing element.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Example: The example below presents the TEI encoding of the name-value pair <part of speech, common noun>, where the name (key) ‘part of speech’ is abbreviated as ‘POS’, and the value, ‘common noun’ is symbolized by ‘NN’. The entire name-value pair is encoded by means of the element <f>. In TEI XML, that element acts as the container, labeled with the name attribute. Its contents may be complex or simple. In the case at hand, the content is the symbol ‘NN’. The datcat attribute relates the feature name (i.e., the key) to the data category ‘part of speech’, while the attribute valueDatcat relates the feature value to the data category common noun. Both these data categories should be defined in an external and preferably open reference taxonomy or ontology.
<fs>  <f name="POS"   datcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3">   <symbol valueDatcat="http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545"    value="NN"/>  </f> <!-- ... --> </fs>
‘NN’ is the symbol for common noun used e.g. in the CLAWS-7 tagset defined by the University Centre for Computer Corpus Research on Language at the University of Lancaster. The very same data category used for tagging an early version of the British National Corpus, and coming from the BNC Basic (C5) tagset, uses the symbol ‘NN0’ (rather than ‘NN’). Making these values semantically interoperable would be extremely difficult without a human expert if they were not anchored in a single point of an established reference taxonomy of morphosyntactic data categories. In the case at hand, the string http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545 is both a persistent identifier of the data category in question and a pointer to a shared definition of common noun.
While the symbols ‘NN’, ‘NN0’, and many others (often coming from languages other than English) are implicitly members of the container category ‘part of speech’, it is sometimes useful not to rely on such an implicit relationship but rather use an explicit identifier for that data category, to distinguish it from other morphosyntactic data categories, such as gender, tense, etc. For that purpose, the above example uses the datcat attribute to reference a definition of part of speech. The reference taxonomy in this example is the CLARIN Concept Registry.
If the feature structure markup exemplified above is to be repeated many times in a single document, it is much more efficient to gather the persistent identifiers in a single place and to only reference them, implicitly or directly, from feature structure markup. The following example is much more concise than the one above and relies on the concepts of feature structure declaration and feature value library, discussed in the chapter on Feature Structures.
<fs>  <f name="POS" fVal="#commonNoun"/> <!-- ... --> </fs>
The assumption here is that the relevant feature values are collected in a place that the annotation document in question has access to, preferably a single document per linguistic resource, for example an <fsdDecl> that is XIncluded as a sibling of <text> or a child of <encodingDesc>; a <taxonomy> available resource-wide (e.g., in a shared header) is also an option.
The example below presents an <fvLib> element that collects the relevant feature values (most of them omitted). At the same time, this example shows one way of encoding a tagset, i.e., an established inventory of values of (in the case at hand) morphosyntactic categories.
<fvLib n="POS values">  <symbol xml:id="commonNoun" value="NN"   datcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"/>  <symbol xml:id="properNoun" value="NP"   datcat="http://hdl.handle.net/11459/CCR_C-1371_fbebd9ec-a7f4-9a36-d6e9-88ee16b944ae"/> <!-- ... --> </fvLib>
Note that these Guidelines do not prescribe a specific choice between datcat and valueDatcat in such cases. The former is the generic way of referencing a data category, whereas the latter is more specific, in that it references a data category that represents a value. The choice between them comes into play where a single element (or a tight element complex, such as the <f>/<symbol> complex illustrated above) makes it necessary or useful to distinguish between the container data category and its value.
Example: In the context of dictionaries designed with semantic interoperability in mind, the following example ensures that the <pos> element is interpreted as the same information container as in the case of the example of <f name="POS"> above.
<gramGrp>  <pos datcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"   valueDatcat="http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545">NN</pos> </gramGrp>
For this type of interoperable markup to be efficient, the references to the particular data categories are best provided in a single place within the dictionary (or a single place within the project), rather than being repeated inside every entry. For the container elements, this can be achieved at the level of <tagUsage>, although here, the targetDatcat attribute should be used, because it is not the <tagUsage> element that is associated with the relevant data category, but rather the element <pos> (or <case>, etc.) that is described by <tagUsage>:
<tagsDecl partial="true"> <!-- ... -->  <namespace name="http://www.tei-c.org/ns/1.0">   <tagUsage gi="pos"    targetDatcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3">Contains the part of speech.</tagUsage>   <tagUsage gi="case"    targetDatcat="http://hdl.handle.net/11459/CCR_C-1840_9f4e319c-f233-6c90-9117-7270e215f039">Contains information about the grammatical case that the described form is inflected for.</tagUsage> <!-- ... -->  </namespace> </tagsDecl>
Another possibility is to shorten the URIs by means of the <prefixDef> mechanism, as illustrated below:
<listPrefixDef>  <prefixDef ident="ccr" matchPattern="pos"   replacementPattern="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3"/>  <prefixDef ident="ccr" matchPattern="adj"   replacementPattern="http://hdl.handle.net/11459/CCR_C-1230_23653c21-fca1-edf8-fd7c-3df2d6499157"/> </listPrefixDef> <!-- ... --> <entry> <!--...-->  <form>   <orth>isotope</orth>  </form>  <gramGrp>   <pos datcat="ccr:pos"    valueDatcat="ccr:adj">adj</pos>  </gramGrp> <!--...--> </entry>
This mechanism creates implications that are not always wanted. In the case at hand, for instance, it suggests that the identifiers ‘pos’ and ‘adj’ belong to a namespace associated with the CLARIN Concept Registry (CCR), whereas it is solely a shorthand mechanism whose scope is the current resource. Documenting this clearly in the header of the dictionary is therefore advised.
Yet another possibility is to state the relationship between a TEI markup element and the data category that it is intended to model already at the level of modeling the dictionary resource, that is, at the level of the ODD, in the <equiv> element that is a child of <elementSpec> or <attDef>.
Example: The <taxonomy> element is a handy tool for encoding taxonomies that are later referenced by att.datcat attributes, but it can also act as an intermediary device, for example holding a fragment of an external taxonomy (or ‘flattening’ an external ontology) that is relevant to the project or document at hand. (It is also imaginable that, for the purpose of the project at hand, the local <taxonomy> element combines vocabularies that originate from more than one external taxonomy or ontology.) In such cases, the <taxonomy> creates a local layer of indirection: the att.datcat attributes internal to the resource may reference the <category> elements stored in the header (as well as the <taxonomy> element itself), whereas these same <category> and <taxonomy> elements use att.datcat attributes to reference the original taxonomy or ontology.
<encodingDesc> <!-- ... -->  <classDecl> <!-- ... -->   <taxonomy xml:id="UD-SYN"    datcat="https://universaldependencies.org/u/dep/index.html">    <desc>     <term>UD syntactic relations</term>    </desc>    <category xml:id="acl"     valueDatcat="https://universaldependencies.org/u/dep/acl.html">     <catDesc>      <term>acl</term>: Clausal modifier of noun (adjectival clause)</catDesc>    </category>    <category xml:id="acl_relcl"     valueDatcat="https://universaldependencies.org/u/dep/acl-relcl.html">     <catDesc>      <term>acl:relcl</term>: relative clause modifier</catDesc>    </category>    <category xml:id="advcl"     valueDatcat="https://universaldependencies.org/u/dep/advcl.html">     <catDesc>      <term>advcl</term>: Adverbial clause modifier</catDesc>    </category> <!-- ... -->   </taxonomy>  </classDecl> </encodingDesc>
The above fragment was excerpted from the GB subset of the ParlaMint project in April 2023, and enriched with att.datcat attributes for the purpose of illustrating the mechanism described here.
Note that, in the ideal case, the values of att.datcat attributes should be persistent identifiers, and that the addressing scheme of Universal Dependencies is treated here as persistent for the sake of illustration. Note also that the contrast between datcat used on <taxonomy> on the one hand, and the valueDatcat used on <category> on the other, is not mandatory: both kinds of relations could be encoded by means of the generic datcat attribute, but using the former for the container and the latter for the content is more user-friendly.
Example: The targetDatcat attribute is designed to be used in, e.g., feature structure declarations, and is analogous to the targetLang attribute of the att.pointing class, in that it describes the object that is being referenced, rather than the referencing object.
<fDecl name="POS"  targetDatcat="http://hdl.handle.net/11459/CCR_C-396_5a972b93-2294-ab5c-a541-7c344c5f26c3">  <fDescr>part of speech (morphosyntactic category)</fDescr>  <vRange>   <vAlt>    <symbol value="NN"     datcat="http://hdl.handle.net/11459/CCR_C-1256_7ec6083c-23d4-224d-6f94-eecbe6861545"/>    <symbol value="NP"     datcat="http://hdl.handle.net/11459/CCR_C-1371_fbebd9ec-a7f4-9a36-d6e9-88ee16b944ae"/> <!-- ... -->   </vAlt>  </vRange> </fDecl>
Above, the <fDecl> uses targetDatcat, because if it were to use datcat, it would be asserting that it is an instance of the container data category part of speech, whereas it is not — it models a container (<f>) that encodes a part of speech. Note also that it is the <f> that is modeled above, not its values, which are used as direct references to data categories; hence the use of datcat in the <symbol> element.
Example: The att.datcat attributes can be used for any sort of taxonomy. The example below illustrates their usefulness for describing usage domain labels in dictionaries, using the Diccionario da Lingua Portugueza by António de Morais Silva, retro-digitised in the MORDigital project.
<!-- in the dictionary header --><encodingDesc>  <classDecl>   <taxonomy xml:id="domains"> <!--...-->    <category xml:id="domain.medical_and_health_sciences">     <catDesc xml:lang="en">Medical and Health Sciences</catDesc>     <catDesc xml:lang="pt">Ciências Médicas e da Saúde</catDesc>     <category xml:id="domain.medical_and_health_sciences.medicine"      valueDatcat="https://vocabs.rossio.fcsh.unl.pt/pub/morais_domains/pt/page/0025">      <catDesc xml:lang="en">       <term>Medicine</term>       <gloss> <!--...-->       </gloss>      </catDesc>      <catDesc xml:lang="pt">       <term>Medicina</term>       <gloss> <!--...-->       </gloss>      </catDesc>     </category>    </category> <!--...-->   </taxonomy>  </classDecl> </encodingDesc> <!-- inside an <entry> element: --> <usg type="domain"  valueDatcat="#domain.medical_and_health_sciences.medicine">Med.</usg>
In the Morais dictionary, the relevant domain labels are in the header, getting referenced inside the dictionary, from <usg> elements. The vocabulary used for dictionary-internal labelling is in turn anchored in the MorDigital controlled vocabulary service of the NOVA University of Lisbon – School of Social Sciences and Humanities (NOVA FCSH).
Note

The TEI Abstract Model can be expressed as a hierarchy of attribute-value matrices (AVMs) of various types and of various levels of complexity, nested or grouped in various ways. At the most abstract level, an AVM consists of an information container and the value (contents) of that container.

A simple example of an XML serialization of such structures is, on the one hand, the opening and closing tags that delimit and name the container, and, on the other, the content enclosed by the two tags that constitutes the value. An analogous example is an attribute name and the value of that attribute.

In a TEI XML example of two equivalent serializations expressing the name-value pair <part-of-speech,common-noun>, namely <pos>commonNoun</pos> and pos="common-noun", one would classify the element <pos> and the attribute pos as containers (mapping onto the first member of the relevant name-value pair), while the character data content of <pos> or the value of pos would be seen as mapping onto the second member of the pair.

The att.datcat class provides means of addressing the containers and their values, while at the same time providing a way to interpret them in the context of external taxonomies or ontologies. Aligning e.g. both the <pos> element and the pos attribute with the same value of an external reference point (i.e., an entry in an agreed taxonomy) affirms the identity of the concept serialised by both the element container and the attribute container, and optionally provides a definition of that concept (in the case at hand, the concept part of speech).

The value of the att.datcat attributes should be a PID (persistent identifier) that points to a specific — and, ideally, shared — taxonomy or ontology. Among the resources that can, to a lesser or greater extent, be used as inventories of (more or less) standardized linguistic categories are the GOLD ontology, CLARIN CCR, OLiA, or TermWeb's DatCatInfo, and also the Universal Dependencies inventory, on the assumption that its URIs are going to persist. It is imaginable that a project may choose to address a local taxonomy store instead, but this risks losing the advantage of interchangeability with other projects.

Historically, datcat and valueDatcat originate from the (now obsolete) ISO 12620:2009 standard, describing the data model and procedures for a Data Category Registry (DCR). The current version of that standard, ISO 12620-1, does not standardize the serialization of pointers, merely mentioning the TEI att.datcat as an example.

Note that no constraint prevents the occurrence of a combination of att.datcat attributes: the <fDecl> element, which is a natural bearer of the targetDatcat attribute, is an instance of a specific modeling element, and, in principle, could be semantically fixed by an appropriate reference taxonomy of modeling devices.

Appendix A.3.13 att.declarable

att.declarable provides attributes for those elements in the TEI header which may be independently selected by means of the special purpose decls attribute. [16.3. Associating Contextual Information with a Text]
Module: tei (Formal specification)
Members: availability bibl correction editorialDecl hyphenation langUsage normalization particDesc projectDesc quotation segmentation settingDesc sourceDesc textClass
Attributes
default: indicates whether or not this element is selected by default when its parent is selected.
Status: Optional
Datatype: teidata.truthValue
Legal values are:
true: This element is selected if its parent is selected
false: This element can only be selected explicitly, unless it is the only one of its kind, in which case it is selected if its parent is selected. [Default]
Schematron
<sch:pattern id="declarable" abstract="true"> <sch:rule context="$tde[ ancestor::tei:teiHeader and following-sibling::$tde and not( preceding-sibling::$tde ) ]">  <sch:report test="../child::$tde[ not( @xml:id ) ]"> When there is more than one <sch:name/>, each must have an @xml:id  </sch:report>  <sch:assert test="count( ../child::$tde[ normalize-space( @default ) = ('1','true') ] ) eq 1"> When there is more than one <sch:name/>, one and only one must have a @default of 'true'.  </sch:assert> </sch:rule> </sch:pattern>
Note

The rules governing the association of declarable elements with individual parts of a TEI text are fully defined in chapter 16.3. Associating Contextual Information with a Text. Only one element of a particular type may have a default attribute with a value of true.

Appendix A.3.14 att.declaring

att.declaring provides attributes for elements which may be independently associated with a particular declarable element within the header, thus overriding the inherited default for that element. [16.3. Associating Contextual Information with a Text]
Module: tei (Formal specification)
Members: back body div facsimile front graphic media p ref sourceDoc surface surfaceGrp term text
Attributes
decls: (declarations) identifies one or more declarable elements within the header, which are understood to apply to the element bearing this attribute and its content.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Note

The rules governing the association of declarable elements with individual parts of a TEI text are fully defined in chapter 16.3. Associating Contextual Information with a Text.
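
Example: A minimal sketch of how default (att.declarable) and decls (att.declaring) might work together. The header declares two alternative correction policies, each with an xml:id and exactly one marked as the default, as the Schematron rule for declarable elements requires; a division in the text then selects the non-default policy. The identifiers corr.none and corr.manual, the type value, and the policy wording are hypothetical.

```xml
<!-- In the header: two alternative correction policies; exactly one is the default -->
<editorialDecl>
  <correction xml:id="corr.none" default="true">
    <p>OCR output is reproduced without correction.</p>
  </correction>
  <correction xml:id="corr.manual">
    <p>OCR output has been manually corrected.</p>
  </correction>
</editorialDecl>

<!-- In the text: this division overrides the default policy -->
<div type="article" decls="#corr.manual">
  <p>...</p>
</div>
```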

Appendix A.3.15 att.dimensions

att.dimensions provides attributes for describing the size of physical objects.
Module: tei (Formal specification)
Members: att.damaged[damage damageSpan] addSpan date delSpan ex gap mod redo restore retrace secl space subst substJoin supplied surplus time undo
Attributes
unit: names the unit used for the measurement
Status: Optional
Datatype: teidata.enumerated
Suggested values include:
cm: centimetres
mm: millimetres
in: inches
line: lines of text
char: characters of text
quantity: specifies the length in the units specified
Status: Optional
Datatype: teidata.numeric
extent: indicates the size of the object concerned using a project-specific vocabulary combining quantity and units in a single string of words.
Status: Optional
Datatype: teidata.text
<gap extent="5 words"/>
<height extent="half the page"/>
precision: characterizes the precision of the values specified by the other attributes.
Status: Optional
Datatype: teidata.certainty
scope: where the measurement summarizes more than one observation, specifies the applicability of this measurement.
Status: Optional
Datatype: teidata.enumerated
Sample values include:
all: measurement applies to all instances.
most: measurement applies to most of the instances inspected.
range: measurement applies to only the specified range of instances.
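Example: A short sketch contrasting the unit and quantity pair with the free-text extent attribute, here on <gap>. The reason attribute and all values are illustrative assumptions rather than PressMint requirements.

```xml
<!-- About twelve characters are illegible; the count is only an estimate -->
<gap reason="illegible" unit="char" quantity="12" precision="low"/>

<!-- A similar loss described with a project-specific phrase instead -->
<gap reason="illegible" extent="two lines"/>
```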

Appendix A.3.16 att.divLike

att.divLike provides attributes common to all elements which behave in the same way as divisions. [4. Default Text Structure]
Module: tei (Formal specification)
Members: div
Attributes
org: (organization) specifies how the content of the division is organized.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
composite: no claim is made about the sequence in which the immediate contents of this division are to be processed, or their inter-relationships.
uniform: the immediate contents of this element are regarded as forming a logical unit, to be processed in sequence. [Default]
sample: indicates whether this division is a sample of the original source and if so, from which part.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
initial: division lacks material present at end in source.
medial: division lacks material at start and end.
final: division lacks material at start.
unknown: position of sampled material within original unknown.
complete: division is not a sample. [Default]
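Example: A minimal sketch of the sample attribute for a division of which only the opening was digitised. The type value and the text are invented for illustration.

```xml
<!-- Only the opening of the article is available; the end is missing -->
<div type="article" sample="initial">
  <head>Flood on the river</head>
  <p>The water rose throughout the night ...</p>
</div>
```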

Appendix A.3.17 att.docStatus

att.docStatus provides attributes for use on metadata elements describing the status of a document.
Module: tei (Formal specification)
Members: bibl change revisionDesc
Attributes
status: describes the status of a document either currently or, when associated with a dated element, at the time indicated.
Status: Optional
Datatype: teidata.enumerated
Sample values include:
approved
candidate
cleared
deprecated
draft [Default]
embargoed
expired
frozen
galley
proposed
published
recommendation
submitted
unfinished
withdrawn
Example
<revisionDesc status="published">  <change when="2010-10-21"   status="published"/>  <change when="2010-10-02" status="cleared"/>  <change when="2010-08-02"   status="embargoed"/>  <change when="2010-05-01" status="frozen"   who="#MSM"/>  <change when="2010-03-01" status="draft"   who="#LB"/> </revisionDesc>

Appendix A.3.18 att.editLike

att.editLike provides attributes describing the nature of an encoded scholarly intervention or interpretation of any kind. [3.5. Simple Editorial Changes 11.3.1. Origination 14.3.2. The Person Element 12.3.1.1. Core Elements for Transcriptional Work]
Module: tei (Formal specification)
Members: att.transcriptional[addSpan delSpan mod redo restore retrace subst substJoin undo] date ex gap name secl supplied surplus time
Attributes
evidence: indicates the nature of the evidence supporting the reliability or accuracy of the intervention or interpretation.
Status: Optional
Datatype: 1–∞ occurrences of teidata.enumerated separated by whitespace
Suggested values include:
internal: there is internal evidence to support the intervention.
external: there is external evidence to support the intervention.
conjecture: the intervention or interpretation has been made by the editor, cataloguer, or scholar on the basis of their expertise.
instant: indicates whether this is an instant revision or not.
Status: Optional
Datatype: teidata.xTruthValue
Default: false
Note

The members of this attribute class are typically used to represent any kind of editorial intervention in a text, for example a correction or interpretation, or to date or localize manuscripts etc.

Each pointer on the source (if present) corresponding to a witness or witness group should reference a bibliographic citation such as a <witness>, <msDesc>, or <bibl> element, or another external bibliographic citation, documenting the source concerned.
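
Example: An illustrative sketch of the evidence attribute on <supplied>, assuming an editor has restored an illegible place name on the basis of the rest of the article. The text and the reason value are invented.

```xml
<p>The ship arrived in
  <supplied reason="illegible" evidence="internal">Trieste</supplied>
  on Monday.</p>
```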

Appendix A.3.19 att.edition

att.edition provides attributes identifying the source edition from which some encoded feature derives.
Module: tei (Formal specification)
Members: pb
Attributes
ed: (edition) supplies a sigil or other arbitrary identifier for the source edition in which the associated feature (for example, a page, column, or line beginning) occurs at this point in the text.
Status: Optional
Datatype: 1–∞ occurrences of teidata.word separated by whitespace
edRef: (edition reference) provides a pointer to the source edition in which the associated feature (for example, a page, column, or line beginning) occurs at this point in the text.
Status: Optional
Datatype: 1–∞ occurrences of teidata.pointer separated by whitespace
Example
<l>Of Mans First Disobedience,<lb ed="1674"/> and<lb ed="1667"/> the Fruit</l> <l>Of that Forbidden Tree, whose<lb ed="1667 1674"/> mortal tast</l> <l>Brought Death into the World,<lb ed="1667"/> and all<lb ed="1674"/> our woe,</l>
Example
<listBibl>  <bibl xml:id="stapledon1937">   <author>Olaf Stapledon</author>,  <title>Starmaker</title>, <publisher>Methuen</publisher>, <date>1937</date>  </bibl>  <bibl xml:id="stapledon1968">   <author>Olaf Stapledon</author>,  <title>Starmaker</title>, <publisher>Dover</publisher>, <date>1968</date>  </bibl> </listBibl> <!-- ... --> <p>Looking into the future aeons from the supreme moment of the cosmos, I saw the populations still with all their strength maintaining the<pb n="411" edRef="#stapledon1968"/>essentials of their ancient culture, still living their personal lives in zest and endless novelty of action, … I saw myself still preserving, though with increasing difficulty, my lucid con<pb n="291" break="no"   edRef="#stapledon1937"/>sciousness;</p>
In the above example, the soft hyphen in Stapledon 1937 is omitted. Such decisions may be documented in the edition's declaration of editorial principles, e.g. with the <hyphenation> element in the <teiHeader>.
Note

These guidelines provide no semantic basis or suggested precedence when both ed and edRef are provided. For this reason simultaneous use of both is not recommended unless documentation explaining the use is provided, probably in an ODD customization, for interchange.

Appendix A.3.20 att.fragmentable

att.fragmentable provides attributes for representing fragmentation of a structural element, typically as a consequence of some overlapping hierarchy.
Module: tei (Formal specification)
Members: att.divLike[div] att.segLike[pc s w] p
Attributes
part: specifies whether or not its parent element is fragmented in some way, typically by some other overlapping structure: for example a speech which is divided between two or more verse stanzas, a paragraph which is split across a page division, a verse line which is divided between two speakers.
Status: Optional
Datatype: teidata.enumerated
Legal values are:
Y: (yes) the element is fragmented in some (unspecified) respect
N: (no) the element is not fragmented, or no claim is made as to its completeness [Default]
I: (initial) this is the initial part of a fragmented element
M: (medial) this is a medial part of a fragmented element
F: (final) this is the final part of a fragmented element
Note

The values I, M, or F should be used only where it is clear how the element may be reconstituted.
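
Example: A minimal sketch of the part attribute for a paragraph that is interrupted by a page break and resumed later, as can happen when a newspaper article continues on another page. The text and the page number are invented.

```xml
<p part="I">The delegates agreed that the harvest would</p>
<pb n="2"/>
<!-- ... other material on page 2 ... -->
<p part="F">be delayed by at least a fortnight.</p>
```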

Appendix A.3.21 att.gaijiProp

att.gaijiProp provides attributes for defining the properties of non-standard characters or glyphs. [5. Characters, Glyphs, and Writing Modes]
Modulegaiji — Formal specification
MemberslocalProp unicodeProp unihanProp
Attributes
nameprovides the name of the character or glyph property being defined.
StatusRequired
Datatypeteidata.xmlName
valueprovides the value of the character or glyph property being defined.
StatusRequired
Datatypeteidata.text
versionspecifies the version number of the Unicode Standard in which this property name is defined.
StatusOptional
Datatypeteidata.enumerated
Suggested values include:
1.0.1
1.1
2.0
2.1
3.0
3.1
3.2
4.0
4.1
5.0
5.1
5.2
6.0
6.1
6.2
6.3
7.0
8.0
9.0
10.0
11.0
12.0
12.1
13.0
14.0
15.0
unassigned
schemesupplies the name of the character set system from which this property is drawn.
StatusOptional
Datatypeteidata.enumerated
Sample values include:
Unicode
(Unicode) ISO 10646[Default]
Bridwell
(E. Nelson Bridwell) Original character set developed by E. Nelson Bridwell as described by Al Turniansky, in use from the 1950s to 1985.
Brewer
(Georg Brewer) Developed by Georg Brewer, with a look similar to the Byrne glyphs. Like the Byrne set, this is not a true character set, but rather a set of alternate glyphs.
Doyle
(Darren Doyle) Glyph set (in some cases associated with multiple characters) developed by Darren Doyle as part of a comprehensive version of the language created in part for an invented language class at UT Austin in 2006.
Schreyer
(Christine Schreyer) Character set (without actual codepoints) of 153 characters, developed by Christine Schreyer in 2012 for Warner Brothers along with pronunciation rules, roughly 300 words, and a grammar.
ExampleIn this example a definition for the Unicode property Decomposition Mapping is provided.
<unicodeProp name="Decomposition_Mapping"  value="circle"/>
Note

All name-only attributes need an xs:boolean attribute value inside value.

Appendix A.3.22 att.global

att.global provides attributes common to all elements in the TEI encoding scheme. [1.3.1.1. Global Attributes]
Moduletei — Formal specification
MembersTEI addSpan appInfo application availability back bibl body catDesc catRef category change char charDecl classDecl correction damage damageSpan date delSpan desc div docDate edition editionStmt editorialDecl email encodingDesc ex extent facsimile fileDesc front funder fw g gap glyph graphic handNotes handShift head hyphenation idno label langUsage language licence line listPrefixDef listTranspose localProp mapping measure media meeting metamark mod name namespace normalization note num p particDesc path pb pc prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation redo ref resp respStmt restore retrace revisionDesc s secl segmentation setting settingDesc sourceDesc sourceDoc space subst substJoin supplied surface surfaceGrp surplus tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt transpose undo unicodeProp unihanProp unit w zone
Attributes
xml:id(identifier) provides a unique identifier for the element bearing the attribute.
StatusOptional
DatatypeID
Note

The xml:id attribute may be used to specify a canonical reference for an element; see section 3.11. Reference Systems.

n(number) gives a number (or other label) for an element, which is not necessarily unique within the document.
StatusOptional
Datatypeteidata.text
Note

The value of this attribute is always understood to be a single token, even if it contains space or other punctuation characters, and need not be composed of numbers only. It is typically used to specify the numbering of chapters, sections, list items, etc.; it may also be used in the specification of a standard reference system for the text.
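
A minimal sketch of how xml:id and n can be combined is given below. The identifier scheme and the type value are hypothetical, chosen only for illustration. Note that the xml:id values must be unique within the document, whereas the n values (here simple sequence numbers) may repeat.

```xml
<!-- hypothetical identifiers: each xml:id is unique in the document -->
<div type="article" xml:id="XX_1890-01-05.a3" n="3">
  <head>Local news</head>
  <!-- n="1" may also occur on the first paragraph of other articles -->
  <p xml:id="XX_1890-01-05.a3.p1" n="1">...</p>
  <p xml:id="XX_1890-01-05.a3.p2" n="2">...</p>
</div>
```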

xml:lang(language) indicates the language of the element content using a ‘tag’ generated according to BCP 47.
StatusOptional
Datatypeteidata.language
<p> … The consequences of this rapid depopulation were the loss of the last <foreign xml:lang="rap">ariki</foreign> or chief (Routledge 1920:205,210) and their connections to ancestral territorial organization.</p>
Note

The xml:lang value will be inherited from the immediately enclosing element, or from its parent, and so on up the document hierarchy. It is generally good practice to specify xml:lang at the highest appropriate level, noticing that a different default may be needed for the <teiHeader> from that needed for the associated resource element or elements, and that a single TEI document may contain texts in many languages.

Only attributes with free text values (rare in these guidelines) will be in the scope of xml:lang.

The authoritative list of registered language subtags is maintained by IANA and is available at https://www.iana.org/assignments/language-subtag-registry. For a good general overview of the construction of language tags, see https://www.w3.org/International/articles/language-tags/, and for a practical step-by-step guide, see https://www.w3.org/International/questions/qa-choosing-language-tags.en.php.

The value used must conform with BCP 47. If the value is a private use code (i.e., starts with x- or contains -x-), a <language> element with a matching value for its ident attribute should be supplied in the TEI header to document this value. Such documentation may also optionally be supplied for non-private-use codes, though these must remain consistent with their IETF (Internet Engineering Task Force) definitions.
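
The following sketch shows how a private use code might be documented in the header and then used in the text. The subtag x-dialect and the descriptions are invented for this example; a corpus would define its own codes as needed.

```xml
<!-- in the <teiHeader> -->
<profileDesc>
  <langUsage>
    <language ident="en">English</language>
    <!-- private use code: documented here because BCP 47 does not define it -->
    <language ident="en-x-dialect">English in regional dialect spelling</language>
  </langUsage>
</profileDesc>
<!-- in the <text> -->
<div type="article" xml:lang="en">
  <p>...</p>
  <!-- overrides the language inherited from the enclosing <div> -->
  <p xml:lang="en-x-dialect">...</p>
</div>
```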

xml:baseprovides a base URI reference with which applications can resolve relative URI references into absolute URI references.
StatusOptional
Datatypeteidata.pointer
<div type="bibl">  <head>Selections from <title level="m">The Collected Letters of Robert Southey. Part 1: 1791-1797</title>  </head>  <listBibl xml:base="https://romantic-circles.org/sites/default/files/imported/editions/southey_letters/XML/">   <bibl>    <ref target="letterEEd.26.3.xml">     <title>Robert Southey to Grosvenor Charles Bedford</title>, <date when="1792-04-03">3 April 1792</date>.    </ref>   </bibl>   <bibl>    <ref target="letterEEd.26.57.xml">     <title>Robert Southey to Anna Seward</title>, <date when="1793-09-18">18 September 1793</date>.    </ref>   </bibl>   <bibl>    <ref target="letterEEd.26.85.xml">     <title>Robert Southey to Robert Lovell</title>, <date from="1794-04-05"      to="1794-04-06">5-6 April, 1794</date>.    </ref>   </bibl>  </listBibl> </div>
xml:spacesignals an intention about how white space should be managed by applications.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
default
signals that the application's default white-space processing modes are acceptable
preserve
indicates the intent that applications preserve all white space
Note

The XML specification provides further guidance on the use of this attribute. Note that many parsers may not handle xml:space correctly.

Appendix A.3.23 att.global.analytic

att.global.analytic provides additional global attributes for associating specific analyses or interpretations with appropriate portions of a text. [18.2. Global Attributes for Simple Analyses 18.3. Spans and Interpretations]
Moduleanalysis — Formal specification
Membersatt.global[TEI addSpan appInfo application availability back bibl body catDesc catRef category change char charDecl classDecl correction damage damageSpan date delSpan desc div docDate edition editionStmt editorialDecl email encodingDesc ex extent facsimile fileDesc front funder fw g gap glyph graphic handNotes handShift head hyphenation idno label langUsage language licence line listPrefixDef listTranspose localProp mapping measure media meeting metamark mod name namespace normalization note num p particDesc path pb pc prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation redo ref resp respStmt restore retrace revisionDesc s secl segmentation setting settingDesc sourceDesc sourceDoc space subst substJoin supplied surface surfaceGrp surplus tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt transpose undo unicodeProp unihanProp unit w zone]
Attributes
ana(analysis) indicates one or more elements containing interpretations of the element on which the ana attribute appears.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

When multiple values are given, they may reflect either multiple divergent interpretations of an ambiguous text, or multiple mutually consistent interpretations of the same passage in different contexts.
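
One possible use of ana, sketched here under the assumption that a corpus defines a topic taxonomy in its header, is to associate an article with one or more categories. The taxonomy, its identifiers, and the type value are hypothetical.

```xml
<!-- in the <teiHeader>, inside <classDecl> -->
<taxonomy xml:id="topics">
  <category xml:id="topic.politics">
    <catDesc>Politics</catDesc>
  </category>
  <category xml:id="topic.economy">
    <catDesc>Economy</catDesc>
  </category>
</taxonomy>
<!-- in the <text>: two pointers, separated by whitespace -->
<div type="article" ana="#topic.politics #topic.economy">
  <head>...</head>
  <p>...</p>
</div>
```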

Appendix A.3.24 att.global.change

Appendix A.3.26 att.global.rendition

att.global.rendition provides rendering attributes common to all elements in the TEI encoding scheme. [1.3.1.1.3. Rendition Indicators]
Moduletei — Formal specification
Membersatt.global[TEI addSpan appInfo application availability back bibl body catDesc catRef category change char charDecl classDecl correction damage damageSpan date delSpan desc div docDate edition editionStmt editorialDecl email encodingDesc ex extent facsimile fileDesc front funder fw g gap glyph graphic handNotes handShift head hyphenation idno label langUsage language licence line listPrefixDef listTranspose localProp mapping measure media meeting metamark mod name namespace normalization note num p particDesc path pb pc prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation redo ref resp respStmt restore retrace revisionDesc s secl segmentation setting settingDesc sourceDesc sourceDoc space subst substJoin supplied surface surfaceGrp surplus tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt transpose undo unicodeProp unihanProp unit w zone]
Attributes
rend(rendition) indicates how the element in question was rendered or presented in the source text.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
<head rend="align(center) case(allcaps)">  <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her <lb/>  <hi rend="case(mixed)">New Blazing-World</hi>. </head>
Note

These Guidelines make no binding recommendations for the values of the rend attribute; the characteristics of visual presentation vary too much from text to text and the decision to record or ignore individual characteristics varies too much from project to project. Some potentially useful conventions are noted from time to time at appropriate points in the Guidelines. The values of the rend attribute are a set of sequence-indeterminate individual tokens separated by whitespace.

stylecontains an expression in some formal style definition language which defines the rendering or presentation used for this element in the source text.
StatusOptional
Datatypeteidata.text
<head style="text-align: center; font-variant: small-caps">  <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her <lb/>  <hi style="font-variant: normal">New Blazing-World</hi>. </head>
Note

Unlike the attribute values of rend, which use whitespace as a separator, the style attribute may contain whitespace. This attribute is intended for recording inline stylistic information concerning the source, not any particular output.

The formal language in which values for this attribute are expressed may be specified using the <styleDefDecl> element in the TEI header.

If style and rendition are both present on an element, then style overrides or complements rendition. style should not be used in conjunction with rend, because the latter does not employ a formal style definition language.

renditionpoints to a description of the rendering or presentation used for this element in the source text.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
<head rendition="#ac #sc">  <lb/>To The <lb/>Duchesse <lb/>of <lb/>Newcastle, <lb/>On Her <lb/>  <hi rendition="#normal">New Blazing-World</hi>. </head> <!-- elsewhere... --> <rendition xml:id="sc"  scheme="css">font-variant: small-caps</rendition> <rendition xml:id="normal"  scheme="css">font-variant: normal</rendition> <rendition xml:id="ac"  scheme="css">text-align: center</rendition>
Note

The rendition attribute is used in a very similar way to the class attribute defined for XHTML but with the important distinction that its function is to describe the appearance of the source text, not necessarily to determine how that text should be presented on screen or paper.

If rendition is used to refer to a style definition in a formal language like CSS, it is recommended that it not be used in conjunction with rend. Where both rendition and rend are supplied, the latter is understood to override or complement the former.

Each URI provided should indicate a <rendition> element defining the intended rendition in terms of some appropriate style language, as indicated by the scheme attribute.

Appendix A.3.27 att.global.responsibility

att.global.responsibility provides attributes indicating the agent responsible for some aspect of the text, the markup or something asserted by the markup, and the degree of certainty associated with it. [1.3.1.1.4. Sources, certainty, and responsibility 3.5. Simple Editorial Changes 12.3.2.2. Hand, Responsibility, and Certainty Attributes 18.3. Spans and Interpretations 14.1.1. Linking Names and Their Referents]
Moduletei — Formal specification
Membersatt.global[TEI addSpan appInfo application availability back bibl body catDesc catRef category change char charDecl classDecl correction damage damageSpan date delSpan desc div docDate edition editionStmt editorialDecl email encodingDesc ex extent facsimile fileDesc front funder fw g gap glyph graphic handNotes handShift head hyphenation idno label langUsage language licence line listPrefixDef listTranspose localProp mapping measure media meeting metamark mod name namespace normalization note num p particDesc path pb pc prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation redo ref resp respStmt restore retrace revisionDesc s secl segmentation setting settingDesc sourceDesc sourceDoc space subst substJoin supplied surface surfaceGrp surplus tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt transpose undo unicodeProp unihanProp unit w zone]
Attributes
cert(certainty) signifies the degree of certainty associated with the intervention or interpretation.
StatusOptional
Datatypeteidata.probCert
resp(responsible party) indicates the agency responsible for the intervention or interpretation, for example an editor or transcriber.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

To reduce the ambiguity of a resp pointing directly to a person or organization, we recommend that resp be used to point not to an agent (<person> or <org>) but to a <respStmt>, <author>, <editor> or similar element which clarifies the exact role played by the agent. Pointing to multiple <respStmt>s allows the encoder to specify clearly each of the roles played in part of a TEI file (creating, transcribing, encoding, editing, proofing etc.).

Example
Blessed are the <choice>  <sic>cheesemakers</sic>  <corr resp="#editor" cert="high">peacemakers</corr> </choice>: for they shall be called the children of God.
Example
<!-- in the <text> ... --><lg> <!-- ... -->  <l>Punkes, Panders, baſe extortionizing    sla<choice>    <sic>n</sic>    <corr resp="#JENS1_transcriber">u</corr>   </choice>es,</l> <!-- ... --> </lg> <!-- in the <teiHeader> ... --> <!-- ... --> <respStmt xml:id="JENS1_transcriber">  <resp when="2014">Transcriber</resp>  <name>Janelle Jenstad</name> </respStmt>

Appendix A.3.28 att.global.source

att.global.source provides attributes used by elements to point to an external source. [1.3.1.1.4. Sources, certainty, and responsibility 3.3.3. Quotation 8.3.4. Writing]
Moduletei — Formal specification
Membersatt.global[TEI addSpan appInfo application availability back bibl body catDesc catRef category change char charDecl classDecl correction damage damageSpan date delSpan desc div docDate edition editionStmt editorialDecl email encodingDesc ex extent facsimile fileDesc front funder fw g gap glyph graphic handNotes handShift head hyphenation idno label langUsage language licence line listPrefixDef listTranspose localProp mapping measure media meeting metamark mod name namespace normalization note num p particDesc path pb pc prefixDef profileDesc projectDesc pubPlace publicationStmt publisher quotation redo ref resp respStmt restore retrace revisionDesc s secl segmentation setting settingDesc sourceDesc sourceDoc space subst substJoin supplied surface surfaceGrp surplus tagUsage tagsDecl taxonomy teiCorpus teiHeader term text textClass time title titleStmt transpose undo unicodeProp unihanProp unit w zone]
Attributes
sourcespecifies the source from which some aspect of this element is drawn.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Schematron
<sch:rule context="tei:*[@source]"> <sch:let name="srcs"  value="tokenize( normalize-space(@source),' ')"/> <sch:report test="( self::tei:classRef | self::tei:dataRef | self::tei:elementRef | self::tei:macroRef | self::tei:moduleRef | self::tei:schemaSpec ) and $srcs[2]"> When used on a schema description element (like <sch:value-of select="name(.)"/>), the @source attribute should have only 1 value. (This one has <sch:value-of select="count($srcs)"/>.) </sch:report> </sch:rule>
Note

The source attribute points to an external source. When used on an element describing a schema component (<classRef>, <dataRef>, <elementRef>, <macroRef>, <moduleRef>, or <schemaSpec>), it identifies the source from which declarations for the components should be obtained.

On other elements it provides a pointer to the bibliographical source from which a quotation or citation is drawn.

In either case, the location may be provided using any form of URI, for example an absolute URI, a relative URI, a private scheme URI of the form tei:x.y.z, where x.y.z indicates the version number, e.g. tei:4.3.2 for TEI P5 release 4.3.2 or (as a special case) tei:current for whatever is the latest release, or a private scheme URI that is expanded to an absolute URI as documented in a <prefixDef>.

When used on elements describing schema components, source should have only one value; when used on other elements multiple values are permitted.

Example
<p> <!-- ... --> As Willard McCarty (<bibl xml:id="mcc_2012">2012, p.2</bibl>) tells us, <quote source="#mcc_2012">‘Collaboration’ is a problematic and should be a contested    term.</quote> <!-- ... --> </p>
Example
<p> <!-- ... -->  <quote source="#chicago_15_ed">Grammatical theories are in flux, and the more we learn, the    less we seem to know.</quote> <!-- ... --> </p> <!-- ... --> <bibl xml:id="chicago_15_ed">  <title level="m">The Chicago Manual of Style</title>, <edition>15th edition</edition>. <pubPlace>Chicago</pubPlace>: <publisher>University of    Chicago Press</publisher> (<date>2003</date>), <biblScope unit="page">p.147</biblScope>. </bibl>
Example
<elementRef key="p" source="tei:2.0.1"/>
Include in the schema an element named <p> available from the TEI P5 2.0.1 release.
Example
<schemaSpec ident="myODD"  source="mycompiledODD.xml"> <!-- further declarations specifying the components required --> </schemaSpec>
Create a schema using components taken from the file mycompiledODD.xml.

Appendix A.3.29 att.handFeatures

att.handFeatures provides attributes describing aspects of the hand in which a manuscript is written. [12.3.2.1. Document Hands]
Moduletei — Formal specification
MembershandShift
Attributes
scribegives a name or other identifier for the scribe believed to be responsible for this hand.
StatusOptional
Datatypeteidata.name
scribeRefpoints to a full description of the scribe concerned, typically supplied by a <person> element elsewhere in the description.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
scriptcharacterizes the particular script or writing style used by this hand, for example secretary, copperplate, Chancery, Italian, etc.
StatusOptional
Datatype1–∞ occurrences of teidata.name separated by whitespace
scriptRefpoints to a full description of the script or writing style used by this hand, typically supplied by a <scriptNote> element elsewhere in the description.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
mediumdescribes the tint or type of ink, e.g. brown, or other writing medium, e.g. pencil.
StatusOptional
Datatype1–∞ occurrences of teidata.enumerated separated by whitespace
Note

Usually either script or scriptRef, and similarly, either scribe or scribeRef, will be supplied.
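
A minimal sketch of these attributes on <handShift> is given below, for a hypothetical case in which a handwritten annotation follows printed text. The scribe identifier and the attribute values are invented for illustration.

```xml
<p>... printed text of the notice ...
  <!-- from here on, the text is a pencil annotation in an unidentified hand -->
  <handShift scribe="unknown1" script="cursive" medium="pencil"/>
  ... handwritten annotation ...</p>
```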

Appendix A.3.30 att.internetMedia

att.internetMedia provides attributes for specifying the type of a computer resource using a standard taxonomy.
Moduletei — Formal specification
Membersatt.media[graphic media] ref
Attributes
mimeType(MIME media type) specifies the applicable multimedia internet mail extension (MIME) media type.
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
ExampleIn this example mimeType is used to indicate that the URL points to a TEI XML file encoded in UTF-8.
<ref mimeType="application/tei+xml; charset=UTF-8"  target="https://raw.githubusercontent.com/TEIC/TEI/dev/P5/Source/guidelines-en.xml"/>
Note

This attribute class provides an attribute for describing a computer resource, typically available over the internet, using a value taken from a standard taxonomy. At present only a single taxonomy is supported, the Multipurpose Internet Mail Extensions (MIME) Media Type system. This typology of media types is defined by the Internet Engineering Task Force in RFC 2046. The list of types is maintained by the Internet Assigned Numbers Authority (IANA). The mimeType attribute must have a value taken from this list.

Appendix A.3.31 att.lexicographic.normalized

att.lexicographic.normalized provides attributes for usage within word-level elements in the analysis module and within lexicographic microstructure in the dictionaries module.
Moduleanalysis — Formal specification
Membersatt.linguistic[pc w]
Attributes
norm(normalized) provides the normalized/standardized form of information present in the source text in a non-normalized form.
StatusOptional
Datatypeteidata.text
Normalization of part-of-speech information within a dictionary entry.
<gramGrp>  <pos norm="noun">n</pos> </gramGrp>
Normalization of a source form in a tokenized historical corpus.
<s>  <w>for</w>  <w norm="virtue's">vertues</w>  <w>sake</w> </s>
<s>  <w norm="persuasion">perswasion</w>  <w>of</w>  <w norm="Unity">Vnitie</w> </s>
Example of normalization from Aviso. Relation oder Zeitung. Wolfenbüttel, 1609. In: Deutsches Textarchiv.
<s>  <w norm="freiwillig">freywillig</w>  <pc norm=","   join="left">/</pc>  <w norm="unbedrängt">vnbedraͤngt</w>  <w norm="und">vnd</w>  <w norm="unverhindert">vnuerhindert</w> </s>
<w norm="Teil">Theyll</w>
<w norm="Freude">Frewde</w>
orig(original) gives the original string or is the empty string when the element does not appear in the source text.
StatusOptional
Datatypeteidata.text
Example from a language documentation project of the Mixtepec-Mixtec language (ISO 639-3: 'mix'). This is a use case where speakers spell something incorrectly but we would like to preserve the original spelling for any number of reasons. Here the use of orig is essential: it allows speakers to see past mistakes, researchers to get insight into how untrained speakers write their language instinctually (in contrast to prescribed convention), etc.:
<w orig="ntsa sia'i">ntsasia'i</w>
Example from the EarlyPrint project. Fragment of text where obvious errors have been corrected but the original forms remain recorded:
<w lemma="he"  pos="pns"  xml:id="b1afj-003-a-0950">he</w> <w lemma="have"  pos="vvz"  xml:id="b1afj-003-a-0960">hath</w> <w lemma="bring"  pos="vvn"  xml:id="b1afj-003-a-0970">brought</w> <w lemma="forth"  pos="av"  xml:id="b1afj-003-a-0980"  orig="sorth">forth</w>
An example from the EarlyPrint project showing the use of both norm and orig. The orig attribute preserves the original version (sometimes with spelling errors, often with printer abbreviations), the element content resolves printer abbreviations but retains the original orthography, and the norm attribute holds normalized values:
<w lemma="commandment"  pos="n1"  norm="commandment"  xml:id="b9avr-018-a-7720"  orig="commandemēt">commandement</w>
Note

It needs to be stressed that the two attributes in this class are meant for strictly lexicographic and linguistic uses, and not for editorial interventions. For the latter, the mechanism based on <choice>, <orig>, and <reg> needs to be employed.

Appendix A.3.32 att.linguistic

att.linguistic provides a set of attributes concerning linguistic features of tokens, for usage within token-level elements, specifically <w> and <pc> in the analysis module. [18.4.2. Lightweight Linguistic Annotation]
Moduleanalysis — Formal specification
Memberspc w
Attributes
lemmaprovides a lemma (base form) for the word, typically uninflected and serving both as an identifier (e.g. in dictionary contexts, as a headword), and as a basis for potential inflections.
StatusOptional
Datatypeteidata.text
<w lemma="wife">wives</w>
<w lemma="Arznei">Artzeneyen</w>
lemmaRefprovides a pointer to a definition of the lemma for the word, for example in an online lexicon.
StatusOptional
Datatypeteidata.pointer
<w type="verb"  lemma="hit"  lemmaRef="http://www.example.com/lexicon/hitvb.xml">hitt<m type="suffix">ing</m> </w>
pos(part of speech) indicates the part of speech assigned to a token (i.e. information on whether it is a noun, adjective, or verb), usually according to some official reference vocabulary (e.g. for German: STTS, for English: CLAWS, for Polish: NKJP, etc.).
StatusOptional
Datatypeteidata.text
The German sentence ‘Wir fahren in den Urlaub.’ tagged with the Stuttgart-Tuebingen-Tagset (STTS).
<s>  <w pos="PPER">Wir</w>  <w pos="VVFIN">fahren</w>  <w pos="APPR">in</w>  <w pos="ART">den</w>  <w pos="NN">Urlaub</w>  <w pos="$.">.</w> </s>
The English sentence ‘We're going to Brazil.’ tagged with the CLAWS-5 tagset, arranged inline (with significant whitespace).
<p><w pos="PNP">We</w><w pos="VBB">'re</w> <w pos="VVG">going</w> <w pos="PRP">to</w> <w pos="NP0">Brazil</w><pc pos="PUN">.</pc></p>         
The English sentence ‘We're going on vacation to Brazil for a month!’ tagged with the CLAWS-7 tagset and arranged sequentially.
<p>  <w pos="PPIS2">We</w>  <w pos="VBR">'re</w>  <w pos="VVG">going</w>  <w pos="II">on</w>  <w pos="NN1">vacation</w>  <w pos="II">to</w>  <w pos="NP1">Brazil</w>  <w pos="IF">for</w>  <w pos="AT1">a</w>  <w pos="NNT1">month</w>  <pc pos="!">!</pc> </p>
msd(morphosyntactic description) supplies morphosyntactic information for a token, usually according to some official reference vocabulary (e.g. for German: STTS-large tagset; for a feature description system designed as (pragmatically) universal, see Universal Features).
StatusOptional
Datatypeteidata.text
<ab>  <w pos="PPER"   msd="1.Pl.*.Nom">Wir</w>  <w pos="VVFIN"   msd="1.Pl.Pres.Ind">fahren</w>  <w pos="APPR"   msd="--">in</w>  <w pos="ART"   msd="Def.Masc.Akk.Sg">den</w>  <w pos="NN"   msd="Masc.Akk.Sg">Urlaub</w>  <pc pos="$."   msd="--">.</pc> </ab>
joinwhen present, provides information on whether the token in question is adjacent to another, and if so, on which side.
StatusOptional
Datatypeteidata.text
Legal values are:
no
the token is not adjacent to another
left
there is no whitespace on the left side of the token
right
there is no whitespace on the right side of the token
both
there is no whitespace on either side of the token
overlap
the token overlaps with another; other devices (specifying the extent and the area of overlap) are needed to more precisely locate this token in the character stream
The example below assumes that the lack of whitespace is marked redundantly, by using the appropriate values of join.
<s>  <pc join="right">"</pc>  <w join="left">Friends</w>  <w>will</w>  <w>be</w>  <w join="right">friends</w>  <pc join="both">.</pc>  <pc join="left">"</pc> </s>
Note that a project may make a decision to only indicate lack of whitespace in one direction, or do that non-redundantly. The existing proposal is the broadest possible, on the assumption that we adopt the "streamable view", where all the information on the current element needs to be represented locally.
The English sentence ‘We're going on vacation.’ tagged with the CLAWS-5 tagset, arranged sequentially, tagged on the assumption that only the lack of the preceding whitespace is indicated.
<p>  <w pos="PNP">We</w>  <w pos="VBB"   join="left">'re</w>  <w pos="VVG">going</w>  <w pos="PRP">on</w>  <w pos="NN1">vacation</w>  <pc pos="PUN"   join="left">.</pc> </p>
Note

The definition of this attribute is adapted from ISO MAF (Morpho-syntactic Annotation Framework), ISO 24611:2012.

Note

These attributes make it possible to encode simple language corpora and to add a layer of linguistic information to any tokenized resource. See section 18.4.2. Lightweight Linguistic Annotation for discussion.

Appendix A.3.33 att.measurement

att.measurement provides attributes to represent a regularized or normalized measurement.
Moduletei — Formal specification
Membersmeasure unit
Attributes
unit(unit) indicates the units used for the measurement, usually using the standard symbol for the desired units.
StatusOptional
Datatypeteidata.enumerated
Suggested values include:
m
(metre) SI base unit of length
kg
(kilogram) SI base unit of mass
s
(second) SI base unit of time
Hz
(hertz) SI unit of frequency
Pa
(pascal) SI unit of pressure or stress
Ω
(ohm) SI unit of electric resistance
L
(litre) 1 dm³
t
(tonne) 10³ kg
ha
(hectare) 1 hm²
Å
(ångström) 10⁻¹⁰ m
mL
(millilitre)
cm
(centimetre)
dB
(decibel) see remarks, below
kbit
(kilobit) 10³ or 1000 bits
Kibit
(kibibit) 2¹⁰ or 1024 bits
kB
(kilobyte) 10³ or 1000 bytes
KiB
(kibibyte) 2¹⁰ or 1024 bytes
MB
(megabyte) 10⁶ or 1 000 000 bytes
MiB
(mebibyte) 2²⁰ or 1 048 576 bytes
Note

If the measurement being represented is not expressed in a particular unit, but rather is a number of discrete items, the unit count should be used, or the unit attribute may be left unspecified.

Wherever appropriate, a recognized SI unit name should be used (see further http://www.bipm.org/en/publications/si-brochure/; http://physics.nist.gov/cuu/Units/). The list above is indicative rather than exhaustive.

unitRefpoints to a unique identifier stored in the xml:id of a <unitDef> element that defines a unit of measure.
StatusOptional
Datatypeteidata.pointer
quantity(quantity) specifies the number of the specified units that comprise the measurement
StatusOptional
Datatypeteidata.numeric
commodity(commodity) indicates the substance that is being measured
StatusOptional
Datatype1–∞ occurrences of teidata.word separated by whitespace
Note

In general, when the commodity is made of discrete entities, the plural form should be used, even when the measurement is of only one of them.

Schematron
<sch:rule context="tei:*[@unitRef]"> <sch:report test="@unit" role="info">The @unit attribute may be unnecessary when @unitRef is present.</sch:report> </sch:rule>
Note
This attribute class provides a triplet of attributes that may be used either to regularize the values of the measurement being encoded, or to normalize them with respect to a standard measurement system.
<l>So weren't you gonna buy <measure quantity="0.5" unit="gal"   commodity="ice cream">half    a gallon</measure>, baby</l> <l>So won't you go and buy <measure quantity="1.893" unit="L"   commodity="ice cream">half    a gallon</measure>, baby?</l>

The unit should normally be named using the standard symbol for an SI unit (see further http://www.bipm.org/en/publications/si-brochure/; http://physics.nist.gov/cuu/Units/). However, encoders may also specify measurements using informally defined units such as lines or characters.

Appendix A.3.34 att.media

att.media provides attributes for specifying display and related properties of external media.
Moduletei — Formal specification
Membersgraphic media
Attributes
widthWhere the media are displayed, indicates the display width.
StatusOptional
Datatypeteidata.outputMeasurement
heightWhere the media are displayed, indicates the display height.
StatusOptional
Datatypeteidata.outputMeasurement
scaleWhere the media are displayed, indicates a scale factor to be applied when generating the desired display size.
StatusOptional
Datatypeteidata.numeric
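
As an illustration of how these attributes combine, the following sketch shows a <graphic> given either explicit display dimensions or a scale factor. The file name is hypothetical, the url attribute comes from att.resourced rather than from att.media itself, and the values assume that teidata.outputMeasurement accepts a number followed by a unit such as px.

```xml
<!-- Hypothetical page scan; the file name is an assumption for illustration -->
<graphic url="scans/page-001.jpg" width="400px" height="600px"/>
<!-- Alternatively, give a scale factor instead of explicit dimensions -->
<graphic url="scans/page-001.jpg" scale="0.5"/>
```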

Appendix A.3.35 att.naming

att.naming provides attributes common to elements which refer to named persons, places, organizations etc. [3.6.1. Referring Strings 14.3.7. Names and Nyms]
Moduletei — Formal specification
Membersatt.personal[name] pubPlace
Attributes
rolemay be used to specify further information about the entity referenced by this name in the form of a set of whitespace-separated values, for example the occupation of a person, or the status of a place.
StatusOptional
Datatype1–∞ occurrences of teidata.enumerated separated by whitespace
nymRef(reference to the canonical name) provides a means of locating the canonical form (nym) of the names associated with the object named by the element bearing it.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

The value must point directly to one or more XML elements by means of one or more URIs, separated by whitespace. If more than one is supplied, the implication is that the name is associated with several distinct canonical names.

Appendix A.3.36 att.notated

att.notated provides attributes to indicate any specialised notation used for element content.
Moduletei — Formal specification
Memberss w
Attributes
notationnames the notation used for the content of the element.
StatusOptional
Datatypeteidata.enumerated

Appendix A.3.37 att.patternReplacement

att.patternReplacement provides attributes for regular-expression matching and replacement. [17.2.3. Using Abbreviated Pointers 2.3.6.3. Milestone Method 2.3.6. The Reference System Declaration 2.3.6.2. Search-and-Replace Method]
Moduleheader — Formal specification
MembersprefixDef
Attributes
matchPatternspecifies a regular expression against which the values of other attributes can be matched.
StatusRequired
Datatypeteidata.pattern
Note

The syntax used should follow that defined by W3C XPath syntax. Note that parenthesized groups are used not only for establishing order of precedence and atoms for quantification, but also for creating subpatterns to be referenced by the replacementPattern attribute.

replacementPatternspecifies a ‘replacement pattern’, that is, the skeleton of a relative or absolute URI containing references to groups in the matchPattern which, once subpattern substitution has been performed, complete the URI.
StatusRequired
Datatypeteidata.replacement
Note

The strings $1, $2 etc. are references to the corresponding group in the regular expression specified by matchPattern (counting opening parentheses from left to right). Processors are expected to replace them with whatever matched the corresponding group in the regular expression.

If a digit preceded by a dollar sign is needed in the actual replacement pattern (as opposed to being used as a back reference), the dollar sign must be written as %24.
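
The interplay of the two attributes can be sketched as follows. The prefix psn, the file name personography.xml and the identifier ABC are hypothetical; ident is the <prefixDef> attribute that names the prefix, and the enclosing <listPrefixDef> is assumed to be available as in standard TEI.

```xml
<listPrefixDef>
  <!-- Hypothetical prefix: "psn:ABC" is rewritten to "personography.xml#ABC" -->
  <prefixDef ident="psn"
             matchPattern="([A-Za-z0-9]+)"
             replacementPattern="personography.xml#$1"/>
</listPrefixDef>
```

With such a definition, a pointer value like psn:ABC would match the pattern with ABC as group 1, so a processor would be expected to resolve it to personography.xml#ABC.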

Appendix A.3.38 att.personal

att.personal (attributes for components of names usually, but not necessarily, personal names) provides common attributes for those elements which form part of a name that is usually, but not necessarily, a personal name. [14.2.1. Personal Names]
Moduletei — Formal specification
Membersname
Attributes
fullindicates whether the name component is given in full, as an abbreviation or simply as an initial.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
yes
(yes) the name component is spelled out in full.[Default]
abb
(abbreviated) the name component is given in an abbreviated form.
init
(initial letter) the name component is indicated only by one initial.
sort(sort) specifies the sort order of the name component in relation to others within the name.
StatusOptional
Datatypeteidata.count

Appendix A.3.39 att.placement

att.placement provides attributes for describing where on the source page or object a textual element appears. [3.5.3. Additions, Deletions, and Omissions 12.3.1.4. Additions and Deletions]
Moduletei — Formal specification
Membersatt.transcriptional[addSpan delSpan mod redo restore retrace subst substJoin undo] div fw head label metamark note
Attributes
placespecifies where this item is placed.
StatusRecommended
Datatype1–∞ occurrences of teidata.enumerated separated by whitespace
Suggested values include:
top
at the top of the page
bottom
at the foot of the page
margin
in the margin (left, right, or both)
opposite
on the opposite, i.e. facing, page
overleaf
on the other side of the leaf
above
above the line
right
to the right, e.g. to the right of a vertical line of text, or to the right of a figure
below
below the line
left
to the left, e.g. to the left of a vertical line of text, or to the left of a figure
end
at the end of e.g. chapter or volume.
inline
within the body of the text.
inspace
in a predefined space, for example left by an earlier scribe.
<add place="margin">[An addition written in the margin]</add> <add place="bottom opposite">[An addition written at the foot of the current page and also on the facing page]</add>
<note place="bottom">Ibid, p.7</note>

Appendix A.3.40 att.pointing

att.pointing provides a set of attributes used by all elements which point to other elements by means of one or more URI references. [1.3.1.1.2. Language Indicators 3.7. Simple Links and Cross-References]
Moduletei — Formal specification
MemberscatRef licence note ref substJoin term
Attributes
targetLangspecifies the language of the content to be found at the destination referenced by target, using a ‘language tag’ generated according to BCP 47.
StatusOptional
Datatypeteidata.language
Schematron
<sch:rule context="tei:*[not(self::tei:schemaSpec)][@targetLang]"> <sch:assert test="@target">@targetLang should only be used on <sch:name/> if @target is specified.</sch:assert> </sch:rule>
<linkGrp xml:id="pol-swh_aln_2.1-linkGrp">  <ptr xml:id="pol-swh_aln_2.1.1-ptr"   target="pol/UDHR/text.xml#pol_txt_1-head"   type="tuv"   targetLang="pl"/>  <ptr xml:id="pol-swh_aln_2.1.2-ptr"   target="swh/UDHR/text.xml#swh_txt_1-head"   type="tuv"   targetLang="sw"/> </linkGrp>
In the example above, the <linkGrp> combines pointers at parallel fragments of the Universal Declaration of Human Rights: one of them is in Polish, the other in Swahili.
Note

The value must conform to BCP 47. If the value is a private use code (i.e., starts with x- or contains -x-), a <language> element with a matching value for its ident attribute should be supplied in the TEI header to document this value. Such documentation may also optionally be supplied for non-private-use codes, though these must remain consistent with their IETF (Internet Engineering Task Force) definitions.

targetspecifies the destination of the reference by supplying one or more URI References.
StatusOptional
Datatype1–∞ occurrences of teidata.pointer separated by whitespace
Note

One or more syntactically valid URI references, separated by whitespace. Because whitespace is used to separate URIs, no whitespace is permitted inside a single URI. If a whitespace character is required in a URI, it should be escaped with the normal mechanism, e.g. TEI%20Consortium.
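
A brief sketch of both points made in this note, using hypothetical identifiers and file names:

```xml
<!-- Two targets in one attribute value, separated by whitespace -->
<ref target="#article-1 #article-2">see both articles</ref>
<!-- A space inside a single URI is escaped as %20 -->
<ref target="files/TEI%20Consortium.xml">a file whose name contains a space</ref>
```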

evaluate(evaluate) specifies the intended meaning when the target of a pointer is itself a pointer.
StatusOptional
Datatypeteidata.enumerated
Legal values are:
all
if the element pointed to is itself a pointer, then the target of that pointer will be taken, and so on, until an element is found which is not a pointer.
one
if the element pointed to is itself a pointer, then its target (whether a pointer or not) is taken as the target of this pointer.
none
no further evaluation of targets is carried out beyond that needed to find the element specified in the pointer's target.
Note

If no value is given, the application program is responsible for deciding (possibly on the basis of user input) how far to trace a chain of pointers.

Appendix A.3.41 att.ranging

att.ranging provides attributes for describing numerical ranges.
Moduletei — Formal specification
Membersatt.dimensions[att.damaged[damage damageSpan] addSpan date delSpan ex gap mod redo restore retrace secl space subst substJoin supplied surplus time undo] measure num
Attributes
atLeastgives a minimum estimated value for the approximate measurement.
StatusOptional
Datatypeteidata.numeric
atMostgives a maximum estimated value for the approximate measurement.
StatusOptional
Datatypeteidata.numeric
minwhere the measurement summarizes more than one observation or a range, supplies the minimum value observed.
StatusOptional
Datatypeteidata.numeric
maxwhere the measurement summarizes more than one observation or a range, supplies the maximum value observed.
StatusOptional
Datatypeteidata.numeric
confidencespecifies the degree of statistical confidence (between zero and one) that a value falls within the range specified by min and max, or the proportion of observed values that fall within that range.
StatusOptional
Datatypeteidata.probability
Example
The MS. was lost in transmission by mail from <del rend="overstrike">  <gap reason="illegible" extent="one or two letters" atLeast="1" atMost="2" unit="chars"/> </del> Philadelphia to the Graphic office, New York.
Example
Americares has been supporting the health sector in Eastern Europe since 1986, and since 1992 has provided <measure atLeast="120000000" unit="USD" commodity="currency">more than $120m</measure> in aid to Ukrainians.

Appendix A.3.42 att.resourced

att.resourced provides attributes by which a resource (such as an externally held media file) may be located.
Moduletei — Formal specification
Membersgraphic media
Attributes
url(uniform resource locator) specifies the URL from which the media concerned may be obtained.
StatusRequired
Datatypeteidata.pointer

Appendix A.3.43 att.scope

att.scope provides attributes to describe, in general terms, the scope of an element’s application.
Moduletei — Formal specification
Membersatt.handFeatures[handShift] language
Attributes
scopeindicates the scope of application of the element
StatusOptional
Datatypeteidata.enumerated
Suggested values include:
sole
only this particular feature is used throughout the document
major
this feature is used through most of the document
minor
this feature is used occasionally through the document
<langUsage>  <language ident="en"   scope="major"/>  <language ident="es"   scope="minor"/>  <language ident="x-ww"   scope="minor">An invented language the children call <name>Wikwah</name>.</language> </langUsage>
<handNote scope="sole">  <p>Written in insular phase II half-uncial with    interlinear Old English gloss in an Anglo-Saxon    pointed minuscule.</p> </handNote>

Appendix A.3.44 att.segLike

att.segLike provides attributes for elements used for arbitrary segmentation. [17.3. Blocks, Segments, and Anchors 18.1. Linguistic Segment Categories]
Moduletei — Formal specification
Memberspc s w
Attributes
function(function) characterizes the function of the segment.
StatusOptional
Datatypeteidata.enumerated
Note

Attribute values will often vary depending on the type of element to which they are attached. For example, a <cl> may take values such as coordinate, subject, adverbial etc. For a <phr>, such values as subject, predicate etc. may be more appropriate. Such constraints will typically be implemented by a project-defined customization.

Appendix A.3.45 att.sortable

att.sortable provides attributes for elements in lists or groups that are sortable, but whose sorting key cannot be derived mechanically from the element content. [10.1. Dictionary Body and Overall Structure]
Moduletei — Formal specification
Membersbibl idno term
Attributes
sortKeysupplies the sort key for this element in an index, list or group which contains it.
StatusOptional
Datatypeteidata.word
David's other principal backer, Josiah ha-Kohen <index indexName="NAMES">  <term sortKey="Azarya_Josiah_Kohen">Josiah ha-Kohen b. Azarya</term> </index> b. Azarya, son of one of the last gaons of Sura was David's own first cousin.
Note

The sort key is used to determine the sequence and grouping of entries in an index. It provides a sequence of characters which, when sorted with the other values, will produce the desired order; specifics of sort key construction are application-dependent.

Dictionary order often differs from the collation sequence of machine-readable character sets; in English-language dictionaries, an entry for 4-H will often appear alphabetized under ‘fourh’, and McCoy may be alphabetized under ‘maccoy’, while A1, A4, and A5 may all appear in numeric order ‘alphabetized’ between ‘a-’ and ‘AA’. The sort key is required if the orthography of the dictionary entry does not suffice to determine its location.

Appendix A.3.46 att.spanning

att.spanning provides attributes for elements which delimit a span of text by pointing mechanisms rather than by enclosing it. [12.3.1.4. Additions and Deletions 1.3.1. Attribute Classes]
Moduletei — Formal specification
MembersaddSpan damageSpan delSpan metamark mod pb redo retrace undo
Attributes
spanToindicates the end of a span initiated by the element bearing this attribute.
StatusOptional
Datatypeteidata.pointer
Schematron
The @spanTo attribute must point to an element following the current element; however, this can only be tested if both this element and the one pointed to are in the same document.
<sch:rule context="tei:*[ starts-with( @spanTo, '#') ]"> <sch:assert test="id( substring( @spanTo, 2 ) ) >> .">The element indicated by @spanTo (<sch:value-of select="@spanTo"/>) must follow the current <sch:name/> element </sch:assert> </sch:rule>
Note

The span is defined as running in document order from the start of the content of the pointing element to the end of the content of the element pointed to by the spanTo attribute (if any). If no value is supplied for the attribute, the assumption is that the span is coextensive with the pointing element. If no content is present, the assumption is that the starting point of the span is immediately following the element itself.
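
The pointing mechanism described in this note can be sketched as follows. The identifier is hypothetical, and the example assumes that an empty <anchor> element is permitted by the schema to mark the end of the span.

```xml
<!-- The deletion starts at <delSpan> and ends at the element with xml:id="del-end" -->
<p>Text before the deletion. <delSpan spanTo="#del-end"/>This sentence was struck out.</p>
<p>So was this one.<anchor xml:id="del-end"/> Text after the deletion.</p>
```

Because the target follows the <delSpan> in document order, such markup would satisfy the Schematron rule on spanTo.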

Appendix A.3.47 att.timed

att.timed provides attributes common to those elements which have a duration in time, expressed either absolutely or by reference to an alignment map. [8.3.5. Temporal Information]
Moduletei — Formal specification
Membersgap media
Attributes
startindicates the location within a temporal alignment at which this element begins.
StatusOptional
Datatypeteidata.pointer
Note

If no value is supplied, the element is assumed to follow the immediately preceding element at the same hierarchic level.

endindicates the location within a temporal alignment at which this element ends.
StatusOptional
Datatypeteidata.pointer
Note

If no value is supplied, the element is assumed to precede the immediately following element at the same hierarchic level.

Appendix A.3.48 att.transcriptional

att.transcriptional provides attributes specific to elements encoding authorial or scribal intervention in a text when transcribing manuscript or similar sources. [12.3.1.4. Additions and Deletions]
Moduletei — Formal specification
MembersaddSpan delSpan mod redo restore retrace subst substJoin undo
Attributes
statusindicates the effect of the intervention, for example in the case of a deletion, strikeouts which include too much or too little text, or in the case of an addition, an insertion which duplicates some of the text already present.
StatusOptional
Datatypeteidata.enumerated
Sample values include:
duplicate
all of the text indicated as an addition duplicates some text that is in the original, whether the duplication is word-for-word or less exact.
duplicate-partial
part of the text indicated as an addition duplicates some text that is in the original
excessStart
some text at the beginning of the deletion is marked as deleted even though it clearly should not be deleted.
excessEnd
some text at the end of the deletion is marked as deleted even though it clearly should not be deleted.
shortStart
some text at the beginning of the deletion is not marked as deleted even though it clearly should be.
shortEnd
some text at the end of the deletion is not marked as deleted even though it clearly should be.
partial
some text in the deletion is not marked as deleted even though it clearly should be.
unremarkable
the deletion is not faulty.[Default]
Note

Status information on each deletion is needed rather rarely except in critical editions from authorial manuscripts; status information on additions is even less common.

Marking a deletion or addition as faulty is inescapably an interpretive act; the usual test applied in practice is the linguistic acceptability of the text with and without the letters or words in question.

causedocuments the presumed cause for the intervention.
StatusOptional
Datatypeteidata.enumerated
seq(sequence) assigns a sequence number related to the order in which the encoded features carrying this attribute are believed to have occurred.
StatusOptional
Datatypeteidata.count

Appendix A.3.49 att.typed

att.typed provides attributes that can be used to classify or subclassify elements in any way. [1.3.1. Attribute Classes 18.1.1. Words and Above 3.6.1. Referring Strings 3.7. Simple Links and Cross-References 3.6.5. Abbreviations and Their Expansions 3.13.1. Core Tags for Verse 7.2.5. Speech Contents 4.1.1. Un-numbered Divisions 4.1.2. Numbered Divisions 4.2.1. Headings and Trailers 4.4. Virtual Divisions 14.3.2.3. Personal Relationships 12.3.1.1. Core Elements for Transcriptional Work 17.1.1. Pointers and Links 17.3. Blocks, Segments, and Anchors 13.2. Linking the Apparatus to the Text 23.5.1.2. Defining Content Models: RELAX NG 8.3. Elements Unique to Spoken Texts 24.3.1.3. Modification of Attribute and Attribute Value Lists]
Moduletei — Formal specification
MembersTEI addSpan application bibl change damage damageSpan date delSpan desc div fw g graphic head idno label line mapping measure media mod name note num path pb pc ref restore s space surface surfaceGrp teiCorpus term text time title unit w zone
Attributes
typecharacterizes the element in some sense, using any convenient classification scheme or typology.
StatusOptional
Datatypeteidata.enumerated
<div type="verse">  <head>Night in Tarras</head>  <lg type="stanza">   <l>At evening tramping on the hot white road</l>   <l></l>  </lg>  <lg type="stanza">   <l>A wind sprang up from nowhere as the sky</l>   <l></l>  </lg> </div>
Note

The type attribute is present on a number of elements, not all of which are members of att.typed, usually because these elements restrict the possible values for the attribute in a specific way.

subtype(subtype) provides a sub-categorization of the element, if needed.
StatusOptional
Datatypeteidata.enumerated
Note

The subtype attribute may be used to provide any sub-classification for the element additional to that provided by its type attribute.

Schematron
<sch:rule context="tei:*[@subtype]"> <sch:assert test="@type">The <sch:name/> element should not be categorized in detail with @subtype unless also categorized in general with @type</sch:assert> </sch:rule>
Note

When appropriate, values from an established typology should be used. Alternatively a typology may be defined in the associated TEI header. If values are to be taken from a project-specific list, this should be defined using the <valList> element in the project-specific schema description, as described in 24.3.1.3. Modification of Attribute and Attribute Value Lists.
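
To illustrate the Schematron constraint on subtype, here is a minimal sketch. The values article and editorial are hypothetical and are not necessarily part of any PressMint typology.

```xml
<!-- Accepted: @subtype refines a category already given by @type -->
<div type="article" subtype="editorial">
  <head>Example heading</head>
</div>
<!-- Would fail the assertion: @subtype is given without @type -->
<div subtype="editorial">
  <head>Example heading</head>
</div>
```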

Appendix A.3.50 att.written

att.written provides attributes to indicate the hand in which the content of an element was written in the source being transcribed. [1.3.1. Attribute Classes]
Moduletei — Formal specification
Membersatt.damaged[damage damageSpan] att.transcriptional[addSpan delSpan mod redo restore retrace subst substJoin undo] div fw head label line note p path text zone
Attributes
handpoints to a <handNote> element describing the hand considered responsible for the content of the element concerned.
StatusOptional
Datatypeteidata.pointer

Appendix A.4 Macros

Appendix A.4.1 macro.limitedContent

macro.limitedContent (paragraph content) defines the content of prose elements that are not used for transcription of extant materials. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.limitedPhrase"/>
  <classRef key="model.inter"/>
 </alternate>
</content>
    
Declaration
tei_macro.limitedContent =
   ( text | tei_model.limitedPhrase | tei_model.inter )*

Appendix A.4.2 macro.paraContent

macro.paraContent (paragraph content) defines the content of paragraphs and similar elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.paraPart"/>
 </alternate>
</content>
    
Declaration
tei_macro.paraContent = ( text | tei_model.paraPart )*

Appendix A.4.3 macro.phraseSeq

macro.phraseSeq (phrase sequence) defines a sequence of character data and phrase-level elements. [1.4.1. Standard Content Models]
Moduletei — Formal specification
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.attributable"/>
  <classRef key="model.phrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Declaration
tei_macro.phraseSeq =
   (
      text
    | tei_model.gLike
    | tei_model.attributable
    | tei_model.phrase
    | tei_model.global
   )*

Appendix A.4.4 macro.phraseSeq.limited

macro.phraseSeq.limited (limited phrase sequence) defines a sequence of character data and those phrase-level elements that are not typically used for transcribing extant documents. [1.4.1. Standard Content Models]
Moduletei — Formal specification
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.limitedPhrase"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Declaration
tei_macro.phraseSeq.limited =
   ( text | tei_model.limitedPhrase | tei_model.global )*

Appendix A.4.5 macro.specialPara

macro.specialPara ('special' paragraph content) defines the content model of elements such as notes or list items, which either contain a series of component-level elements or else have the same structure as a paragraph, containing a series of phrase-level and inter-level elements. [1.3. The TEI Class System]
Moduletei — Formal specification
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
  <classRef key="model.phrase"/>
  <classRef key="model.inter"/>
  <classRef key="model.divPart"/>
  <classRef key="model.global"/>
 </alternate>
</content>
    
Declaration
tei_macro.specialPara =
   (
      text
    | tei_model.gLike
    | tei_model.phrase
    | tei_model.inter
    | tei_model.divPart
    | tei_model.global
   )*

Appendix A.4.6 macro.xtext

macro.xtext (extended text) defines a sequence of character data and gaiji elements.
Moduletei — Formal specification
Used by
Content model
<content>
 <alternate minOccurs="0"
  maxOccurs="unbounded">
  <textNode/>
  <classRef key="model.gLike"/>
 </alternate>
</content>
    
Declaration
tei_macro.xtext = ( text | tei_model.gLike )*

Appendix A.5 Datatypes

Appendix A.5.1 teidata.certainty

teidata.certainty defines the range of attribute values expressing a degree of certainty.
Moduletei — Formal specification
Used by
Content model
<content>
 <valList type="closed">
  <valItem ident="high"/>
  <valItem ident="medium"/>
  <valItem ident="low"/>
  <valItem ident="unknown"/>
 </valList>
</content>
    
Declaration
tei_teidata.certainty = "high" | "medium" | "low" | "unknown"
Note

Certainty may be expressed by one of the predefined symbolic values high, medium, or low. The value unknown should be used in cases where the encoder does not wish to assert an opinion about the matter.

Appendix A.5.2 teidata.count

teidata.count defines the range of attribute values used for a non-negative integer value used as a count.
Moduletei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="nonNegativeInteger"/>
</content>
    
Declaration
tei_teidata.count = xsd:nonNegativeInteger
Note

Any positive integer value or zero is permitted.

Appendix A.5.3 teidata.duration.iso

teidata.duration.iso defines the range of attribute values available for representation of a duration in time using ISO 8601 standard formats.
Moduletei — Formal specification
Used by
Content model
<content>
 <dataRef name="token"
  restriction="[0-9.,DHMPRSTWYZ/:+\-]+"/>
</content>
    
Declaration
tei_teidata.duration.iso = token { pattern = "[0-9.,DHMPRSTWYZ/:+\-]+" }
Example
<time dur-iso="PT0,75H">three-quarters of an hour</time>
Example
<date dur-iso="P1,5D">a day and a half</date>
Example
<date dur-iso="P14D">a fortnight</date>
Example
<time dur-iso="PT0.02S">20 ms</time>
Note

A duration is expressed as a sequence of number-letter pairs, preceded by the letter P; the letter gives the unit and may be Y (year), M (month), D (day), H (hour), M (minute), or S (second), in that order. The numbers are all unsigned integers, except for the last, which may have a decimal component (using either . or , as the decimal point; the latter is preferred). If any number is 0, then that number-letter pair may be omitted. If any of the H (hour), M (minute), or S (second) number-letter pairs are present, then the separator T must precede the first ‘time’ number-letter pair.

For complete details, see ISO 8601 Data elements and interchange formats — Information interchange — Representation of dates and times.

Appendix A.5.4 teidata.duration.w3c

teidata.duration.w3c defines the range of attribute values available for representation of a duration in time using W3C datatypes.
Moduletei — Formal specification
Used by
Content model
<content>
 <dataRef name="duration"/>
</content>
    
Declaration
tei_teidata.duration.w3c = xsd:duration
Example
<time dur="PT45M">forty-five minutes</time>
Example
<date dur="P1DT12H">a day and a half</date>
Example
<date dur="P7D">a week</date>
Example
<time dur="PT0.02S">20 ms</time>
Note

A duration is expressed as a sequence of number-letter pairs, preceded by the letter P; the letter gives the unit and may be Y (year), M (month), D (day), H (hour), M (minute), or S (second), in that order. The numbers are all unsigned integers, except for the S number, which may have a decimal component (using . as the decimal point). If any number is 0, then that number-letter pair may be omitted. If any of the H (hour), M (minute), or S (second) number-letter pairs are present, then the separator T must precede the first ‘time’ number-letter pair.

For complete details, see the W3C specification.
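
Because the letter M stands for both months and minutes, its position relative to the T separator decides its meaning. A brief illustration, using the same elements and attribute as the examples above:

```xml
<!-- Before the T separator, M means months -->
<date dur="P1M">one month</date>
<!-- After the T separator, M means minutes -->
<time dur="PT1M">one minute</time>
<!-- Date and time parts combined: 1 day, 2 hours and 30 minutes -->
<time dur="P1DT2H30M">one day, two and a half hours</time>
```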

Appendix A.5.5 teidata.enumerated

teidata.enumerated defines the range of attribute values expressed as a single XML name taken from a list of documented possibilities.
Moduletei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef key="teidata.word"/>
</content>
    
Declaration
tei_teidata.enumerated = teidata.word
Note

Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.

Typically, the list of documented possibilities will be provided (or exemplified) by a value list in the associated attribute specification, expressed with a <valList> element.

Appendix A.5.6 teidata.language

teidata.language defines the range of attribute values used to identify a particular combination of human language and writing system. [6.1. Language Identification]
Moduletei — Formal specification
Used by
Element:
Content model
<content>
 <alternate>
  <dataRef name="language"/>
  <valList>
   <valItem ident=""/>
  </valList>
 </alternate>
</content>
    
Declaration
tei_teidata.language = xsd:language | ( "" )
Note

The values for this attribute are language ‘tags’ as defined in BCP 47. Currently BCP 47 comprises RFC 5646 and RFC 4647; over time, other IETF documents may succeed these as the best current practice.

A ‘language tag’, per BCP 47, is assembled from a sequence of components or subtags separated by the hyphen character (-, U+002D). The tag is made of the following subtags, in the following order. Every subtag except the first is optional. If present, each occurs only once, except the fourth and fifth components (variant and extension), which are repeatable.

language
The IANA-registered code for the language. This is almost always the same as the ISO 639 2-letter language code if there is one. The list of available registered language subtags can be found at https://www.iana.org/assignments/language-subtag-registry. It is recommended that this code be written in lower case.
script
The ISO 15924 code for the script. These codes consist of 4 letters, and it is recommended they be written with an initial capital, the other three letters in lower case. The canonical list of codes is maintained by the Unicode Consortium, and is available at https://unicode.org/iso15924/iso15924-codes.html. The IETF recommends this code be omitted unless it is necessary to make a distinction you need.
region
Either an ISO 3166 country code or a UN M.49 region code that is registered with IANA (not all such codes are registered, e.g. UN codes for economic groupings or codes for countries for which there is already an ISO 3166 2-letter code are not registered). The former consist of 2 letters, and it is recommended they be written in upper case; the list of codes can be searched or browsed at https://www.iso.org/obp/ui/#search/code/. The latter consist of 3 digits; the list of codes can be found at http://unstats.un.org/unsd/methods/m49/m49.htm.
variant
An IANA-registered variation. These codes ‘are used to indicate additional, well-recognized variations that define a language or its dialects that are not covered by other available subtags’.
extension
An extension has the format of a single letter followed by a hyphen followed by additional subtags. There are currently only two extensions in use. Extension T indicates that the content was transformed. For example en-t-it could be used for content in English that was translated from Italian. Extension T is described in the informational RFC 6497. Extension U can be used to embed a variety of locale attributes. It is described in the informational RFC 6067.
private use
An extension that uses the initial subtag of the single letter x (i.e., starts with x-) has no meaning except as negotiated among the parties involved. These should be used with great care, since they interfere with the interoperability that use of BCP 47 is intended to promote. In order for a document that makes use of these subtags to be TEI-conformant, a corresponding <language> element must be present in the TEI header.

There are two exceptions to the above format. First, there are language tags in the IANA registry that do not match the above syntax, but are present because they have been ‘grandfathered’ from previous specifications.

Second, an entire language tag can consist of only a private use subtag. These tags start with x-, and do not need to follow any further rules established by the IETF and endorsed by these Guidelines. Like all language tags that make use of private use subtags, the language in question must be documented in a corresponding <language> element in the TEI header.

Examples include

sn
Shona
zh-TW
Taiwanese
zh-Hant-HK
Chinese written in traditional script as used in Hong Kong
en-SL
English as spoken in Sierra Leone
pl
Polish
es-MX
Spanish as spoken in Mexico
es-419
Spanish as spoken in Latin America

The W3C Internationalization Activity has published a useful introduction to BCP 47, Language tags in HTML and XML.
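
The rule that every tag containing a private use subtag must be documented in a <language> element can be checked mechanically. The following is a minimal sketch in Python, not a full BCP 47 parser; the function name and the private use tags in the tests are made up for illustration.

```python
def needs_language_element(tag: str) -> bool:
    """Return True if a language tag contains a private use subtag.

    Simplified assumption: a subtag consisting of the single letter "x"
    marks private use, either at the start of the tag (the whole tag is
    private use) or later in it. The rest of the tag is not validated.
    """
    subtags = tag.lower().split("-")
    return "x" in subtags
```

A corpus check could collect all xml:lang values, pass them through such a function, and report those that lack a matching <language> element in the TEI header.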

Appendix A.5.7 teidata.name

teidata.name defines the range of attribute values expressed as an XML Name.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="Name"/>
</content>
    
Declaration
tei_teidata.name = xsd:Name
Note

Attributes using this datatype must contain a single word which follows the rules defining a legal XML name (see https://www.w3.org/TR/REC-xml/#dt-name): for example they cannot include whitespace or begin with digits.

Appendix A.5.8 teidata.namespace

teidata.namespace defines the range of attribute values used to indicate XML namespaces as defined by the W3C Namespaces in XML Technical Recommendation.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef restriction="\S+" name="anyURI"/>
</content>
    
Declaration
tei_teidata.namespace = xsd:anyURI { pattern = "\S+" }
Note

The range of syntactically valid values is defined by RFC 3986 Uniform Resource Identifier (URI): Generic Syntax

Appendix A.5.9 teidata.numeric

teidata.numeric defines the range of attribute values used for numeric values.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <alternate>
  <dataRef name="double"/>
  <dataRef name="token"
   restriction="(\-?[\d]+/\-?[\d]+)"/>
  <dataRef name="decimal"/>
 </alternate>
</content>
    
Declaration
tei_teidata.numeric =
   xsd:double | token { pattern = "(\-?[\d]+/\-?[\d]+)" } | xsd:decimal
Note

Any numeric value, represented as a decimal number, in floating point format, or as a ratio.

To represent a floating point number, expressed in scientific notation, ‘E notation’, a variant of ‘exponential notation’, may be used. In this format, the value is expressed as two numbers separated by the letter E. The first number, the significand (sometimes called the mantissa), is given in decimal format, while the second is an integer. The value is obtained by multiplying the significand by 10 the number of times indicated by the integer. Thus the value represented in decimal notation as 1000.0 might be represented in scientific notation as 1E3.

A value expressed as a ratio is represented by two integer values separated by a solidus (/) character. Thus, the value represented in decimal notation as 0.5 might be represented as a ratio by the string 1/2.
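
The three notations can be told apart with a few lines of Python. This is a sketch under stated assumptions: the function name is hypothetical, and Python's float() and Decimal() accept a slightly different set of strings than xsd:double and xsd:decimal.

```python
import re
from decimal import Decimal
from fractions import Fraction

# The ratio pattern from the content model above.
RATIO = re.compile(r"-?\d+/-?\d+")

def parse_numeric(value: str):
    """Parse a teidata.numeric value into a Fraction, float or Decimal."""
    value = value.strip()
    if RATIO.fullmatch(value):
        numerator, denominator = value.split("/")
        # A zero denominator matches the pattern but raises ZeroDivisionError here.
        return Fraction(int(numerator), int(denominator))
    if "e" in value.lower():
        # E notation: significand, the letter E, then an integer exponent.
        return float(value)
    return Decimal(value)
```

With this sketch, parse_numeric("1E3") yields the float 1000.0 and parse_numeric("1/2") yields a Fraction equal to 0.5.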

Appendix A.5.10 teidata.outputMeasurement

teidata.outputMeasurement defines a range of values for use in specifying the size of an object that is intended for display.
Module tei — Formal specification
Used by
Content model
<content>
 <dataRef name="token"
  restriction="[\-+]?\d+(\.\d+)?(%|cm|mm|in|pt|pc|px|em|ex|ch|rem|vw|vh|vmin|vmax)"/>
</content>
    
Declaration
tei_teidata.outputMeasurement =
   token
   {
      pattern = "[\-+]?\d+(\.\d+)?(%|cm|mm|in|pt|pc|px|em|ex|ch|rem|vw|vh|vmin|vmax)"
   }
Example
<figure>
 <head>The TEI Logo</head>
 <figDesc>Stylized yellow angle brackets with the letters <mentioned>TEI</mentioned> in
   between and <mentioned>text encoding initiative</mentioned> underneath, all on a white
   background.</figDesc>
 <graphic height="600px" width="600px"
  url="http://www.tei-c.org/logos/TEI-600.jpg"/>
</figure>
Note

These values map directly onto the values used by XSL-FO and CSS. For definitions of the units see those specifications; at the time of this writing the most complete list is in the CSS3 working draft.
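
As a rough illustration, the pattern in the declaration can be applied directly with Python's re module; the helper name is an assumption of this sketch. XSD patterns are implicitly anchored, which is why fullmatch is used.

```python
import re

# Pattern copied from the declaration above.
OUTPUT_MEASUREMENT = re.compile(
    r"[\-+]?\d+(\.\d+)?(%|cm|mm|in|pt|pc|px|em|ex|ch|rem|vw|vh|vmin|vmax)"
)

def is_output_measurement(value: str) -> bool:
    """Return True if the whole string is a number followed by a known unit."""
    return OUTPUT_MEASUREMENT.fullmatch(value) is not None
```

Values such as 600px, 50% or 1.5em pass, while a bare number (600), a value with a space before the unit (600 px) or one without a leading digit (.5em) does not.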

Appendix A.5.11 teidata.pattern

teidata.pattern defines attribute values which are expressed as a regular expression.
Module tei — Formal specification
Used by
Content model
<content>
 <dataRef name="token"/>
</content>
    
Declaration
tei_teidata.pattern = token
Note
A regular expression, often called a pattern, is an expression that describes a set of strings. They are usually used to give a concise description of a set, without having to list all elements. For example, the set containing the three strings Handel, Händel, and Haendel can be described by the pattern H(ä|ae?)ndel (or alternatively, it is said that the pattern H(ä|ae?)ndel matches each of the three strings)
Wikipedia

This TEI datatype is mapped to the XSD token datatype, and may therefore contain any string of characters. However, it is recommended that the value used conform to the particular flavour of regular expression syntax supported by XSD Schema.
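
The example pattern quoted above can be tried out as follows. Note that this sketch uses Python's re module, whose syntax overlaps with, but is not identical to, the XML Schema regular expression flavour; fullmatch imitates the implicit anchoring of XSD patterns.

```python
import re

# The pattern from the quoted example.
HANDEL = re.compile(r"H(ä|ae?)ndel")

def matches_handel(name: str) -> bool:
    """Return True if the whole string is one of the spellings the pattern describes."""
    return HANDEL.fullmatch(name) is not None
```

All three spellings (Handel, Händel and Haendel) are accepted, whereas a spelling such as Hendel is rejected.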

Appendix A.5.12 teidata.point

teidata.point defines the data type used to express a point in cartesian space.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="token"
  restriction="(-?[0-9]+(\.[0-9]+)?,-?[0-9]+(\.[0-9]+)?)"/>
</content>
    
Declaration
tei_teidata.point =
   token { pattern = "(-?[0-9]+(\.[0-9]+)?,-?[0-9]+(\.[0-9]+)?)" }
Example
<facsimile>
 <surface ulx="0" uly="0" lrx="400" lry="280">
  <zone points="220,100 300,210 170,250 123,234">
   <graphic url="handwriting.png"/>
  </zone>
 </surface>
</facsimile>
Note

A point is defined by two numeric values, which should be expressed as decimal numbers. Neither number can end in a decimal point. E.g., both 0.0,84.2 and 0,84 are allowed, but 0.,84. is not.
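
A minimal sketch of how a processing script might validate and read such values, assuming Python; the function names are hypothetical. A points attribute holding several points, as in the example above, is split on whitespace first.

```python
import re

# Pattern copied from the declaration above.
POINT = re.compile(r"(-?[0-9]+(\.[0-9]+)?,-?[0-9]+(\.[0-9]+)?)")

def is_point(value: str) -> bool:
    """Return True if the string is a single valid teidata.point."""
    return POINT.fullmatch(value) is not None

def parse_points(attribute_value: str):
    """Split a whitespace-separated list of points into (x, y) float pairs."""
    pairs = []
    for item in attribute_value.split():
        if not is_point(item):
            raise ValueError(f"not a valid point: {item!r}")
        x, y = item.split(",")
        pairs.append((float(x), float(y)))
    return pairs
```

Here is_point("0.,84.") is False because both numbers end in a decimal point, matching the rule stated in the note.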

Appendix A.5.13 teidata.pointer

teidata.pointer defines the range of attribute values used to provide a single URI, absolute or relative, pointing to some other resource, either within the current document or elsewhere.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef restriction="\S+" name="anyURI"/>
</content>
    
Declaration
tei_teidata.pointer = xsd:anyURI { pattern = "\S+" }
Note

The range of syntactically valid values is defined by RFC 3986 Uniform Resource Identifier (URI): Generic Syntax. Note that the values themselves are encoded using RFC 3987 Internationalized Resource Identifiers (IRIs) mapping to URIs. For example, https://secure.wikimedia.org/wikipedia/en/wiki/% is encoded as https://secure.wikimedia.org/wikipedia/en/wiki/%25 while http://موقع.وزارة-الاتصالات.مصر/ is encoded as http://xn--4gbrim.xn----rmckbbajlc6dj7bxne2c.xn--wgbh1c/
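
The percent-encoding in the first example can be reproduced with Python's standard library. This is only a sketch: escape_path is a hypothetical helper, it treats the path part alone, and it does not cover the host name conversion shown in the second example.

```python
from urllib.parse import quote

def escape_path(path: str) -> str:
    """Percent-encode characters that may not appear literally in a URI path.

    Assumption: the input is not yet escaped; an already escaped path
    would be escaped a second time.
    """
    return quote(path, safe="/")
```

Applied to /wikipedia/en/wiki/%, the helper returns /wikipedia/en/wiki/%25, as in the example above.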

Appendix A.5.14 teidata.prefix

teidata.prefix defines a range of values that may function as a URI scheme name.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="token"
  restriction="[a-z][a-z0-9\+\.\-]*"/>
</content>
    
Declaration
tei_teidata.prefix = token { pattern = "[a-z][a-z0-9\+\.\-]*" }
Note

This datatype is used to constrain a string of characters to one that can be used as a URI scheme name according to RFC 3986, section 3.1. Thus only the 26 lowercase letters a–z, the 10 digits 0–9, the plus sign, the period, and the hyphen are permitted, and the value must start with a letter.
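
The constraint can be sketched in Python as follows; the helper name and the sample values are illustrative only.

```python
import re

# Pattern copied from the declaration above.
PREFIX = re.compile(r"[a-z][a-z0-9\+\.\-]*")

def is_prefix(value: str) -> bool:
    """Return True if the string could serve as a URI scheme name."""
    return PREFIX.fullmatch(value) is not None
```

Lowercase values such as http or git+ssh are accepted, while HTTP (upper case), 1abc (leading digit) and the empty string are rejected.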

Appendix A.5.15 teidata.probCert

teidata.probCert defines a range of attribute values which can be expressed either as a numeric probability or as a coded certainty value.
Module tei — Formal specification
Used by
Content model
<content>
 <alternate>
  <dataRef key="teidata.probability"/>
  <dataRef key="teidata.certainty"/>
 </alternate>
</content>
    
Declaration
tei_teidata.probCert = teidata.probability | teidata.certainty

Appendix A.5.16 teidata.probability

teidata.probability defines the range of attribute values expressing a probability.
Module tei — Formal specification
Used by
Content model
<content>
 <dataRef name="double">
  <dataFacet name="minInclusive" value="0"/>
  <dataFacet name="maxInclusive" value="1"/>
 </dataRef>
</content>
    
Declaration
tei_teidata.probability = xsd:double { minInclusive = "0" maxInclusive = "1" }
Note

Probability is expressed as a real number between 0 and 1; 0 representing certainly false and 1 representing certainly true.
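
A range check along these lines might look as follows in Python. It is a sketch only: the function name is hypothetical, and Python's float() is somewhat more permissive than xsd:double.

```python
def is_probability(value: str) -> bool:
    """Return True if the string parses as a number between 0 and 1 inclusive."""
    try:
        number = float(value)
    except ValueError:
        return False
    # NaN fails both comparisons and is therefore rejected as well.
    return 0.0 <= number <= 1.0
```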

Appendix A.5.17 teidata.replacement

teidata.replacement defines attribute values which contain a replacement template.
Module tei — Formal specification
Used by
Content model
<content>
 <textNode/>
</content>
    
Declaration
tei_teidata.replacement = text

Appendix A.5.18 teidata.temporal.w3c

teidata.temporal.w3c defines the range of attribute values expressing a temporal expression such as a date, a time, or a combination of them, that conform to the W3C XML Schema Part 2: Datatypes Second Edition specification.
Module tei — Formal specification
Used by
Content model
<content>
 <alternate>
  <dataRef name="date"/>
  <dataRef name="gYear"/>
  <dataRef name="gMonth"/>
  <dataRef name="gDay"/>
  <dataRef name="gYearMonth"/>
  <dataRef name="gMonthDay"/>
  <dataRef name="time"/>
  <dataRef name="dateTime"/>
 </alternate>
</content>
    
Declaration
tei_teidata.temporal.w3c =
   xsd:date
 | xsd:gYear
 | xsd:gMonth
 | xsd:gDay
 | xsd:gYearMonth
 | xsd:gMonthDay
 | xsd:time
 | xsd:dateTime
Note

If it is likely that the value used is to be compared with another, then a time zone indicator should always be included, and only the dateTime representation should be used.
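
The reason for this recommendation can be shown with a short Python sketch. The function name and the sample timestamps are assumptions made for illustration; datetime.fromisoformat accepts dateTime values with a numeric offset such as +02:00.

```python
from datetime import datetime

def same_instant(first: str, second: str) -> bool:
    """Compare two dateTime values that both carry a time zone indicator."""
    a = datetime.fromisoformat(first)
    b = datetime.fromisoformat(second)
    if a.tzinfo is None or b.tzinfo is None:
        # Without a time zone the two values cannot be placed on one time line.
        raise ValueError("both values need a time zone indicator")
    return a == b
```

Two values with different offsets, for example 10:00 at +02:00 and 08:00 at +00:00 on the same day, compare as equal because they denote the same instant.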

Appendix A.5.19 teidata.text

teidata.text defines the range of attribute values used to express some kind of identifying string as a single sequence of Unicode characters possibly including whitespace.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="string"/>
</content>
    
Declaration
tei_teidata.text = string
Note

Attributes using this datatype must contain a single ‘token’ in which whitespace and other punctuation characters are permitted.

Appendix A.5.20 teidata.truthValue

teidata.truthValue defines the range of attribute values used to express a truth value.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="boolean"/>
</content>
    
Declaration
tei_teidata.truthValue = xsd:boolean
Note

The possible values of this datatype are 1 or true, or 0 or false.

This datatype applies only for cases where uncertainty is inappropriate; if the attribute concerned may have a value other than true or false, e.g. unknown, or inapplicable, it should have the extended version of this datatype: teidata.xTruthValue.

Appendix A.5.21 teidata.version

teidata.version defines the range of attribute values which may be used to specify a TEI or Unicode version number.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="token"
  restriction="[\d]+(\.[\d]+){0,2}"/>
</content>
    
Declaration
tei_teidata.version = token { pattern = "[\d]+(\.[\d]+){0,2}" }
Note

The value of this attribute follows the pattern specified by the Unicode consortium for its version number (https://unicode.org/versions/). A version number contains digits and fullstop characters only. The first number supplied identifies the major version number. A second and third number, for minor and sub-minor version numbers, may also be supplied.

Appendix A.5.22 teidata.versionNumber

teidata.versionNumber defines the range of attribute values used for version numbers.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="token"
  restriction="[\d]+[a-z]*[\d]*(\.[\d]+[a-z]*[\d]*){0,3}"/>
</content>
    
Declaration
tei_teidata.versionNumber =
   token { pattern = "[\d]+[a-z]*[\d]*(\.[\d]+[a-z]*[\d]*){0,3}" }
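
The difference between teidata.version and teidata.versionNumber is easiest to see by testing a few strings against both patterns. The following Python sketch does this; the helper names and sample values are illustrative assumptions.

```python
import re

# Patterns copied from the two declarations above.
VERSION = re.compile(r"[\d]+(\.[\d]+){0,2}")
VERSION_NUMBER = re.compile(r"[\d]+[a-z]*[\d]*(\.[\d]+[a-z]*[\d]*){0,3}")

def is_version(value: str) -> bool:
    """teidata.version: up to three dot-separated numbers, digits only."""
    return VERSION.fullmatch(value) is not None

def is_version_number(value: str) -> bool:
    """teidata.versionNumber: up to four parts, each digits plus an optional suffix."""
    return VERSION_NUMBER.fullmatch(value) is not None
```

A value such as 4.9.0 satisfies both datatypes, whereas 1.2.3.4 and 2.0rc1 are valid only as teidata.versionNumber.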

Appendix A.5.23 teidata.word

teidata.word defines the range of attribute values expressed as a single word or token.
Module tei — Formal specification
Used by
teidata.enumerated
Element:
Content model
<content>
 <dataRef name="token"
  restriction="[^\p{C}\p{Z}]+"/>
</content>
    
Declaration
tei_teidata.word = token { pattern = "[^\p{C}\p{Z}]+" }
Note

Attributes using this datatype must contain a single ‘word’ which contains only letters, digits, punctuation characters, or symbols: thus it cannot include whitespace.
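
Python's re module does not support the \p{C} and \p{Z} classes used in the pattern, so the sketch below approximates the check with the unicodedata module. The function name is hypothetical, and the Unicode tables of a given Python version may differ slightly from those of an XSD validator.

```python
import unicodedata

def is_tei_word(value: str) -> bool:
    """Return True if the string is non-empty and has no control or separator characters.

    Categories starting with "C" (other: control, format, ...) and
    "Z" (separators, including spaces) are excluded, as in the pattern.
    """
    if not value:
        return False
    return not any(unicodedata.category(ch)[0] in "CZ" for ch in value)
```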

Appendix A.5.24 teidata.xTruthValue

teidata.xTruthValue (extended truth value) defines the range of attribute values used to express a truth value which may be unknown.
Module tei — Formal specification
Used by
Content model
<content>
 <alternate>
  <dataRef name="boolean"/>
  <valList>
   <valItem ident="unknown"/>
   <valItem ident="inapplicable"/>
  </valList>
 </alternate>
</content>
    
Declaration
tei_teidata.xTruthValue = xsd:boolean | ( "unknown" | "inapplicable" )
Note

In cases where uncertainty is inappropriate, use the datatype teidata.truthValue.
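
A script reading such attributes has to handle four boolean spellings plus the two extra values. A minimal sketch in Python, with a hypothetical function name:

```python
def parse_x_truth_value(value: str):
    """Map a teidata.xTruthValue string to True, False, "unknown" or "inapplicable"."""
    value = value.strip()
    if value in ("true", "1"):
        return True
    if value in ("false", "0"):
        return False
    if value in ("unknown", "inapplicable"):
        return value
    raise ValueError(f"not a valid extended truth value: {value!r}")
```

Returning the two extra values as strings keeps them distinct from both booleans, so callers cannot mistake unknown for false.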

Appendix A.5.25 teidata.xmlName

teidata.xmlName defines attribute values which contain an XML name.
Module tei — Formal specification
Used by
Element:
Content model
<content>
 <dataRef name="NCName"/>
</content>
    
Declaration
tei_teidata.xmlName = xsd:NCName
Note

The rules defining an XML name form a part of the XML Specification.

Appendix A.5.26 teidata.xpath

teidata.xpath defines attribute values which contain an XPath expression.
Module tei — Formal specification
Used by
Content model
<content>
 <textNode/>
</content>
    
Declaration
tei_teidata.xpath = text
Note

Any XPath expression using the syntax defined in 6.2..

When writing programs that evaluate XPath expressions, programmers should be mindful of the possibility of malicious code injection attacks. For further information about XPath injection attacks, see the article at OWASP.
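
One common precaution is to validate any externally supplied value before it is placed into an XPath expression. The sketch below assumes Python's xml.etree.ElementTree, which supports only a subset of XPath; the function name, the n attribute and the allowed-character rule are assumptions chosen for the illustration, not part of PressMint.

```python
import re
import xml.etree.ElementTree as ET

# Assumed whitelist: a letter or underscore, then letters, digits, "_", "." or "-".
SAFE_VALUE = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def find_by_n(root, user_value: str):
    """Find elements by their n attribute, refusing values that could alter the query."""
    if not SAFE_VALUE.fullmatch(user_value):
        raise ValueError(f"rejected value: {user_value!r}")
    # The value cannot contain quotes here, so it cannot break out of the predicate.
    return root.findall(f".//*[@n='{user_value}']")
```

With this check, a value such as a1' or '1'='1 is rejected before any expression is built, instead of being interpolated into the query.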

Notes
1
Note that this is an illustrative example, i.e. a valid PressMint corpus would also need certain attributes to be defined on the illustrated elements. This holds for all the examples in this chapter.
2
Note that this is different from ParlaMint, where a hyphen, not an underscore, is used.
3
These are typically tagsets developed and used for specific languages; they can be found in the XPOS column of CoNLL-U files, the native format for UD treebanks.
4
Note that PressMint does not foresee syntactic parsing, so there is no ambiguity as to whether a word is split because of normalisation or because of its syntactic analysis. However, if both were present, the outer one would correspond to normalisation and the inner to syntactic words.
Tomaž Erjavec, tomaz.erjavec@ijs.si and Matyáš Kopp, kopp@ufal.mff.cuni.cz. Date: 2025-09-16