OpenScripturesHebrew.jl
OpenScripturesHebrew.jl is a Julia package for working with the morphological data of the Open Scriptures Hebrew Bible project (OSHB). OpenScripturesHebrew.jl parses the OSHB data into easily manipulated Julia tuples, and provides a Julia data model for the OSHB morphological data.
Quickest start
Download data
Get OSH annotations on all words in the Hebrew Bible:
using OpenScripturesHebrew
words = tanakh()
length(words)
432307
That’s a lot of words!
What’s in a word
Each word is represented by a named tuple.
words[1]
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HR", mtoken = "בְּ", otoken = "בְּ/רֵאשִׁ֖ית", otoken_num = 1, lemma = "b")
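Because each word is an ordinary Julia NamedTuple, the standard field tools apply to it. The following is a minimal sketch, not part of the package: it rebuilds the tuple shown above by hand so that it runs without downloading any data, and it assumes (from the sample output only) that a slash in otoken separates the morphological tokens that make up one orthographic token.

```julia
# Hand-built copy of the first word shown above; in practice this would be words[1].
w = (urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HR",
     mtoken = "בְּ", otoken = "בְּ/רֵאשִׁ֖ית", otoken_num = 1, lemma = "b")

# List the field names of the named tuple.
fields = keys(w)
println(fields)

# Fields can be read with dot syntax or indexed by symbol.
println(w.code)
println(w[:lemma])

# Assumption: "/" marks the boundary between morphological tokens in otoken.
pieces = split(w.otoken, "/")
println(length(pieces), " morphological tokens in orthographic token ", w.otoken_num)
```

Working with plain named tuples means no package-specific accessor functions are needed for everyday inspection.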
In the tuple, the morphologically analyzed token is named mtoken:
words[1].mtoken
"בְּ"
Morphological analysis
The morphological analysis for the token is represented by a code string, but you can use the parseword function to create an OSHMorphologicalForm.
words[1].code
"HR"
parseword(words[1])
preposition
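To give a feel for what the code string encodes, here is a rough sketch of decoding its first two characters by hand. It is not how parseword is implemented: the function sketch_pos and both lookup tables are hypothetical names written for this illustration, and the sketch assumes that the first character marks the language and the second the part of speech, as the codes on this page suggest ("HR" for a preposition, "HC" for a conjunction, "HN…" for nouns).

```julia
# Hypothetical lookup tables, written for illustration only (not the package's data model).
const SKETCH_LANGUAGES = Dict('H' => "Hebrew", 'A' => "Aramaic")
const SKETCH_PARTS_OF_SPEECH = Dict(
    'A' => "adjective", 'C' => "conjunction", 'D' => "adverb",
    'N' => "noun", 'P' => "pronoun", 'R' => "preposition",
    'S' => "suffix", 'T' => "particle", 'V' => "verb",
)

# Decode only the language and part of speech; later characters
# (gender, number, state, verb stem, ...) are deliberately ignored here.
function sketch_pos(code::AbstractString)
    length(code) >= 2 || return nothing
    language = get(SKETCH_LANGUAGES, code[1], "unknown language")
    pos = get(SKETCH_PARTS_OF_SPEECH, code[2], "unknown part of speech")
    return (language = language, pos = pos)
end

for code in ["HR", "HNcfsa", "HVqp3ms", "HTo", "HC"]
    println(code, " => ", sketch_pos(code))
end
```

For real work, prefer parseword, which returns a full OSHMorphologicalForm rather than this two-character summary.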
Select by canonical reference
Select words for a passage identified by the OSH name and passage conventions:
versewords = oshverse("Gen", "1.1", words)
11-element Vector{Any}:
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HR", mtoken = "בְּ", otoken = "בְּ/רֵאשִׁ֖ית", otoken_num = 1, lemma = "b")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcfsa", mtoken = "רֵאשִׁ֖ית", otoken = "בְּ/רֵאשִׁ֖ית", otoken_num = 1, lemma = "7225")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HVqp3ms", mtoken = "בָּרָ֣א", otoken = "בָּרָ֣א", otoken_num = 2, lemma = "1254 a")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcmpa", mtoken = "אֱלֹהִ֑ים", otoken = "אֱלֹהִ֑ים", otoken_num = 3, lemma = "430")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTo", mtoken = "אֵ֥ת", otoken = "אֵ֥ת", otoken_num = 4, lemma = "particle")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTd", mtoken = "הַ", otoken = "הַ/שָּׁמַ֖יִם", otoken_num = 5, lemma = "particle")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcmpa", mtoken = "שָּׁמַ֖יִם", otoken = "הַ/שָּׁמַ֖יִם", otoken_num = 5, lemma = "8064")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HC", mtoken = "וְ", otoken = "וְ/אֵ֥ת", otoken_num = 6, lemma = "c")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTo", mtoken = "אֵ֥ת", otoken = "וְ/אֵ֥ת", otoken_num = 6, lemma = "particle")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HTd", mtoken = "הָ", otoken = "הָ/אָֽרֶץ", otoken_num = 7, lemma = "particle")
(urn = "urn:cts:compnov:bible.genesis.osh:1.1", code = "HNcbsa", mtoken = "אָֽרֶץ", otoken = "הָ/אָֽרֶץ", otoken_num = 7, lemma = "776")
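Note that the verse yields 11 morphological tokens but only 7 orthographic tokens, because prefixed prepositions, articles and conjunctions are analyzed separately. As a small sketch of how the two levels relate, the following groups analyses by otoken_num. To stay self-contained it uses a hand-copied, trimmed version of the output above that keeps only the code and otoken_num fields; with the package loaded, the same loop should work on versewords directly.

```julia
# Trimmed, hand-copied version of the Genesis 1.1 output above:
# only the code and otoken_num fields are kept.
sample = [
    (code = "HR",      otoken_num = 1),
    (code = "HNcfsa",  otoken_num = 1),
    (code = "HVqp3ms", otoken_num = 2),
    (code = "HNcmpa",  otoken_num = 3),
    (code = "HTo",     otoken_num = 4),
    (code = "HTd",     otoken_num = 5),
    (code = "HNcmpa",  otoken_num = 5),
    (code = "HC",      otoken_num = 6),
    (code = "HTo",     otoken_num = 6),
    (code = "HTd",     otoken_num = 7),
    (code = "HNcbsa",  otoken_num = 7),
]

# Collect the morphological codes belonging to each orthographic token.
grouped = Dict{Int, Vector{String}}()
for w in sample
    push!(get!(grouped, w.otoken_num, String[]), w.code)
end

# Print the groups in reading order.
for n in sort(collect(keys(grouped)))
    println(n, ": ", join(grouped[n], " + "))
end
```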
Apply regular Julia mapping to a selection of words
Extract the token and the morphological analysis from the tuple:
map(versewords) do w
    string(w.mtoken, " = ", parseword(w))
end
11-element Vector{String}:
"בְּ = preposition"
"רֵאשִׁ֖ית = noun (common noun): feminine singular absolute state"
"בָּרָ֣א = finite verb: qal perfect third singular masculine"
"אֱלֹהִ֑ים = noun (common noun): masculine plural absolute state"
"אֵ֥ת = particle"
"הַ = particle"
"שָּׁמַ֖יִם = noun (common noun): masculine plural absolute state"
"וְ = conjunction"
"אֵ֥ת = particle"
"הָ = particle"
"אָֽרֶץ = noun (common noun): common gender singular absolute state"
More information
- See some tutorials to get started
- See reference documentation (APIs, data sources)
Status
The package is not yet registered with the central Julia registry. You can add it directly to your Julia environment from its GitHub repository, https://github.com/neelsmith/OpenScripturesHebrew.jl. For example:
using Pkg
Pkg.add(url = "https://github.com/neelsmith/OpenScripturesHebrew.jl")