Published

January 13, 2025

Assembling parts of the archive

HmtArchive includes functions to instantiate parts of the archive as collections of citable content. Behind the scenes, this is how the archive is published, since all of the citable objects built from it can be serialized as simple delimited text.

Documentation incomplete

This page lacks documentation of some of the content that is included in published releases of the HMT archive.

In the following examples, root refers to the archive directory of the hmt-archive github repository. We begin by creating an Archive object.

using HmtArchive, EditorsRepo
hmt = Archive(root)

Text corpora

Create a normalized corpus of all texts in the repository:

normedcorpus = normalizedcorpus(hmt)
Corpus with 40485 citable passages in 12 documents.

Create a diplomatic corpus of all texts in the repository:

diplcorpus = diplomaticcorpus(hmt)
Corpus with 40485 citable passages in 12 documents.

These are instances of a CitableTextCorpus from the Julia CitableCorpus package.

Citable physical texts

HMT project editions follow a model of a citable physical text connecting text contents, physical object such as a manuscript page, and documentary image.

Create a collection of these relations (the “Digital Scholarly Edition,” or DSE, model):

dsecollection = HmtArchive.dse(hmt)
urn:cite2:hmt:hmtdse.v1:all Homer Multitext project indexing of digital scholarly editions

The result is a DSECollection from the Julia CitablePhysicalText package.

Indexing content of XML editions

Index scholia commenting on Iliad:

commentaryindex = HmtArchive.commentpairs(hmt)

The result is a Vector pairing two CtsUrns, the first refering to the scholion, and the second to the Iliad passage.

commentaryindex[1]
(urn:cts:greekLit:tlg5026.msAint.hmt:1.27, urn:cts:greekLit:tlg0012.tlg001.msA:1.8)

CtsUrns are defined in the Julia CitableText package.

Summary

Documentation out of date

This section needs to be updated.

In the following summary lines, hmt is an instance of an Archive; archivecex is the output of cex(hmt) (a complete representation of the archive in CEX format).

Each line of the table represents a complete round trip beginning from the archive, to instantiate objects in Julia, serialize the object to CEX, and then instantiate equivalent objects from a CEX serialization of the entire archive.

Component Instantiate from archive Serialize to CEX Instantiate from CEX
collection of all images TBD imagecex(hmt) TBA
collection of all codices TBD codexcex(hmt) (but currently failing to include CEX for data model?) fromcex(archivecex, Codex)
catalog of all texts fromcex(textcatalogcex(hmt), TextCatalogCollection) textcatalogcex(hmt) fromcex(archivecex, TextCatalogCollection)
diplomatic editions of all texts diplomaticcorpus(hmt) cex(diplomaticcorpus(hmt)) filter full corpus created with fromcex(archivecex, CitableTextCorpus)
normalized editions of all texts normalizedcorpus(hmt) cex(normalizedcorpus(hmt)) filter full corpus created with fromcex(archivecex, CitableTextCorpus)
all DSE records dse(hmt) cex(dse(hmt)) TBA (fromcex(archivecex, DseCollection)) is broken in current version of CitablePhysicalText)
indexes of scholia to Iliad passages TBD scholiaindexcex(hmt) TBA (fromcex(archivecex, CitableCommentary) is broken in current version of CitableAnnotations)
other indexes (including Iliad passages to pages) TBD relationsetscex(hmt) TBA (should be instantiated using fromcex with an implementation of the index’s data model)
collections of authority lists for personal names, place names, astronomical entities, and texts no longer extant fromcex(acex, CatalogedCollection) acex = authlistscex(hmt) TBA
collection of all data models in the library TBD: accounted for in cex methods for DSE records, codices and images datamodelcex() TBA