Published

October 21, 2024

Sefaria’s digital data from Strong

Key points

Retrieve Sefaria’s digital data derived from Strong: strong

For a given record:

  1. find headword (lemma): headword
  2. find list of short definitions: definitions
  3. find part of speech code: pos

Finding data from Strong for a token

Sefaria has indexed individual tokens in the Hebrew Bible to key data from the corresponding entry in Strong’s Concordance. Pass a token to the strong function to retrieve the matching records.

using BrownDriverBriggs 
records = strong("בָּרָ֣א")
5-element Vector{Strong}:
 1247 בַּר, (n-m): son
 1248 בַּר, (n-m): son,  heir
 1249 בַּר, (a): adj; pure,  clear,  sincere; clean,  empty; adv; purely
 1251 בַּר, (n-m): field
 1254 בָּרָא, (v): to create,  shape,  form; to be fat

Working with elements of a Strong record

Each record includes an id, a headword (or lemma), a list of short definitions, and a part of speech. You can extract this information with the id, headword, definitions and pos functions, respectively, as the following examples illustrate.
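To make the shape of a record concrete, here is a minimal sketch that runs without the package. DemoRecord and the demo_ accessors are hypothetical stand-ins invented for illustration; the internal layout of the package's Strong type is not shown on this page and may differ. The two sample records are copied from the search output above.

```julia
# Hypothetical stand-in for the package's record type; the real `Strong`
# type's internal layout is an assumption here and may differ.
struct DemoRecord
    id::String
    headword::String
    pos::String
    definitions::Vector{String}
end

# Stand-in accessors (the demo_ prefix marks them as illustrative only).
demo_id(r::DemoRecord) = r.id
demo_headword(r::DemoRecord) = r.headword
demo_pos(r::DemoRecord) = r.pos
demo_definitions(r::DemoRecord) = r.definitions

# Two records copied from the search output shown earlier on this page.
demo_records = [
    DemoRecord("1247", "בַּר", "n-m", ["son"]),
    DemoRecord("1254", "בָּרָא", "v", ["to create,  shape,  form", "to be fat"]),
]

# The dot after a function name applies it to every element of the vector.
demo_ids = demo_id.(demo_records)
demo_parts = demo_pos.(demo_records)
```

The dot syntax used here (demo_id.(demo_records)) is ordinary Julia broadcasting, the same mechanism behind calls such as id.(records) in the examples on this page.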

The previous search yields a list of five records. Here are their unique IDs.

id.(records)
5-element Vector{String}:
 "1247"
 "1248"
 "1249"
 "1251"
 "1254"

Which distinct headwords do the five records derive from?

headword.(records) |> unique
2-element Vector{String}:
 "בַּר"
 "בָּרָא"

What distinct parts of speech do they represent?

pos.(records) |> unique
3-element Vector{String}:
 "n-m"
 "a"
 "v"

Keep only the verb forms:

verbs = filter(record -> pos(record) == "v", records)
1-element Vector{Strong}:
 1254 בָּרָא, (v): to create,  shape,  form; to be fat
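Filtering selects one part of speech at a time. To see every group at once, one option is to collect record IDs by part-of-speech code. The sketch below is self-contained and uses plain (id, code) pairs copied from the five records above as stand-in data; the names demo_pairs and ids_by_pos are illustrative and are not part of the package.

```julia
# Stand-in data: (id, part-of-speech code) pairs copied from the five records above.
demo_pairs = [("1247", "n-m"), ("1248", "n-m"), ("1249", "a"), ("1251", "n-m"), ("1254", "v")]

# Map each part-of-speech code to the IDs of the records that carry it.
ids_by_pos = Dict{String,Vector{String}}()
for (record_id, code) in demo_pairs
    # get! returns the vector stored under `code`, creating an empty one if needed.
    push!(get!(ids_by_pos, code, String[]), record_id)
end

# Print the groups in alphabetical order of their codes.
for code in sort(collect(keys(ids_by_pos)))
    println(code, " => ", join(ids_by_pos[code], ", "))
end
```

With the package loaded, the same pairs could presumably be built from the real records with zip(id.(records), pos.(records)), which relies only on the id and pos functions described on this page.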

Find the short definitions for the single verb form:

definitions(verbs[1])
2-element Vector{String}:
 "to create,  shape,  form"
 "to be fat"