Find words by canonical reference

Published

November 9, 2024

OSHB’s annotated words are ordered in the sequence in which they occur in the Hebrew Bible, and each includes a reference to its verse, expressed as a CTS URN. Here is the URN for the first word of the first verse of the Hebrew Bible.

using OpenScripturesHebrew
words = tanakh()

words[1].urn

"urn:cts:compnov:bible.genesis.osh:1.1"

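To make the structure of that URN explicit, here is a minimal sketch that splits it into its parts using only Julia's standard library. It relies on the general CTS URN layout, in which colons separate the namespace, the work component, and the passage component, and periods subdivide the work component. The helper `parse_cts` is a hypothetical name introduced for illustration; it is not part of OpenScripturesHebrew.

```julia
# Hypothetical helper (not part of OpenScripturesHebrew):
# split a CTS URN string into namespace, work hierarchy, and passage.
function parse_cts(urn::AbstractString)
    parts = split(urn, ":")
    length(parts) == 5 || error("expected 5 colon-separated components in $urn")
    workparts = split(parts[4], ".")
    (namespace = String(parts[3]), work = String.(workparts), passage = String(parts[5]))
end

ref = parse_cts("urn:cts:compnov:bible.genesis.osh:1.1")
ref.namespace   # "compnov"
ref.work        # ["bible", "genesis", "osh"]
ref.passage     # "1.1"
```

The passage component `1.1` is the chapter.verse reference; every word of a verse carries the same value.
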
From a list of OSHB word annotations, you can select the words of a single verse by giving the OSHB abbreviation for a book and the canonical reference in chapter.verse form (chapter and verse separated by a period).

versewords = oshverse("Gen", "1.1", words)

11-element Vector{Any}:
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HR", mtoken = "בְּ", otoken = "בְּ/רֵאשִׁ֖ית", otoken_num = 1, lemma = "b")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcfsa", mtoken = "רֵאשִׁ֖ית", otoken = "בְּ/רֵאשִׁ֖ית", otoken_num = 1, lemma = "7225")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HVqp3ms", mtoken = "בָּרָ֣א", otoken = "בָּרָ֣א", otoken_num = 2, lemma = "1254 a")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcmpa", mtoken = "אֱלֹהִ֑ים", otoken = "אֱלֹהִ֑ים", otoken_num = 3, lemma = "430")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTo", mtoken = "אֵ֥ת", otoken = "אֵ֥ת", otoken_num = 4, lemma = "particle")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTd", mtoken = "הַ", otoken = "הַ/שָּׁמַ֖יִם", otoken_num = 5, lemma = "particle")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcmpa", mtoken = "שָּׁמַ֖יִם", otoken = "הַ/שָּׁמַ֖יִם", otoken_num = 5, lemma = "8064")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HC", mtoken = "וְ", otoken = "וְ/אֵ֥ת", otoken_num = 6, lemma = "c")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTo", mtoken = "אֵ֥ת", otoken = "וְ/אֵ֥ת", otoken_num = 6, lemma = "particle")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTd", mtoken = "הָ", otoken = "הָ/אָֽרֶץ", otoken_num = 7, lemma = "particle")
 (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcbsa", mtoken = "אָֽרֶץ", otoken = "הָ/אָֽרֶץ", otoken_num = 7, lemma = "776")
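
Conceptually, selecting a verse amounts to keeping the records whose URN names that verse. The following is a rough sketch of that idea, not the implementation of `oshverse`: the helper `select_passage` is hypothetical, the records are toy data with only a `urn` and a made-up `label` field, and the work identifier `genesis` is passed directly because the mapping from the abbreviation `Gen` to the URN's work identifier is assumed to be handled by the package.

```julia
# Toy records for illustration only; real OSHB annotations carry more fields.
words = [
    (urn = "urn:cts:compnov:bible.genesis.osh:1.1", label = "a"),
    (urn = "urn:cts:compnov:bible.genesis.osh:1.1", label = "b"),
    (urn = "urn:cts:compnov:bible.genesis.osh:1.2", label = "c"),
    (urn = "urn:cts:compnov:bible.genesis.osh:1.10", label = "d"),
]

# Hypothetical helper: keep the records whose URN has the given work
# identifier and passage reference. Comparing whole components avoids
# matching "1.10" when "1.1" is requested.
function select_passage(words, workid::AbstractString, ref::AbstractString)
    filter(words) do w
        parts = split(w.urn, ":")
        length(parts) == 5 || return false
        work = split(parts[4], ".")
        length(work) >= 2 && work[2] == workid && parts[5] == ref
    end
end

selected = select_passage(words, "genesis", "1.1")
[w.label for w in selected]   # ["a", "b"]
```

Because `filter` preserves the order of its input, the selected records stay in the order of the original list.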

Notice that although all eleven words selected for Genesis 1.1 share the same URN value (urn:cts:compnov:bible.genesis.osh:1.1), they remain in document order. This is easy to see from the string value of each analyzed token.

# Extract the analyzed token string from each word, then join them with spaces
tokens = map(w -> w.mtoken, versewords)
join(tokens, " ")

"בְּ רֵאשִׁ֖ית בָּרָ֣א אֱלֹהִ֑ים אֵ֥ת הַ שָּׁמַ֖יִם וְ אֵ֥ת הָ אָֽרֶץ"

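The verse yields eleven analyzed tokens, but the `otoken_num` field in the annotations shown earlier runs only from 1 to 7, because a prefix and the word it attaches to share one orthographic token. As a sketch of how the two levels relate, the code below regroups analyzed tokens by `otoken_num`. The records are copied from the Genesis 1.1 output and reduced to two fields; the helper `orthographic` is a hypothetical name, and it assumes the records are already in document order.

```julia
# Records copied from the Genesis 1.1 output, reduced to two fields.
versewords = [
    (mtoken = "בְּ", otoken_num = 1),
    (mtoken = "רֵאשִׁ֖ית", otoken_num = 1),
    (mtoken = "בָּרָ֣א", otoken_num = 2),
    (mtoken = "אֱלֹהִ֑ים", otoken_num = 3),
    (mtoken = "אֵ֥ת", otoken_num = 4),
    (mtoken = "הַ", otoken_num = 5),
    (mtoken = "שָּׁמַ֖יִם", otoken_num = 5),
    (mtoken = "וְ", otoken_num = 6),
    (mtoken = "אֵ֥ת", otoken_num = 6),
    (mtoken = "הָ", otoken_num = 7),
    (mtoken = "אָֽרֶץ", otoken_num = 7),
]

# Hypothetical helper: collect consecutive analyzed tokens that share an
# otoken_num, then join each group with "/" as in the otoken field.
function orthographic(words)
    groups = Vector{String}[]
    lastnum = nothing
    for w in words
        if w.otoken_num == lastnum
            push!(groups[end], w.mtoken)
        else
            push!(groups, [w.mtoken])
            lastnum = w.otoken_num
        end
    end
    map(g -> join(g, "/"), groups)
end

otokens = orthographic(versewords)
length(otokens)   # 7
```
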
Full list of OSH abbreviations

See this page for a complete list of OSH abbreviations for books of the Hebrew Bible.