Published

January 17, 2025

Mining Lewis-Short’s Latin Dictionary

LexiconMining.jl is an unpublished Julia package for extracting latent morphological data and short definitions from a digital edition of Lewis-Short’s Latin Dictionary. It can read the automatically extracted morphological information and generate from it a structured morphological lexicon for use with Tabulae, a system for building Latin morphological parsers.

With version 0.3, the LexiconMining package is being completely rewritten to use ChatGPT for the first round of data extraction. The process can be outlined as follows:

  1. Use ChatGPT to extract morphological data from Christopher Blackwell’s Markdown edition of Lewis-Short.
  2. Parse the ChatGPT output into a Julia object model for each morphological type (“part of speech”).
  3. Create a morphological dataset usable with Tabulae.

More information about Tabulae
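
Step 2 of the outline above, parsing extracted text into one object type per part of speech, can be illustrated with a minimal sketch. It is written in Python purely for illustration and is not the LexiconMining.jl API. The pipe-delimited record layout, the `Noun` and `Verb` classes, the `parse_record` function, and the identifiers `id1` and `id2` are all assumptions made for this example; the actual ChatGPT output format and the Julia types used by the package are not described here.

```python
from dataclasses import dataclass


# Hypothetical object model: one class per morphological type.
@dataclass(frozen=True)
class Noun:
    lemma_id: str
    nominative: str
    genitive: str
    gender: str


@dataclass(frozen=True)
class Verb:
    lemma_id: str
    principal_parts: tuple
    conjugation: int


def parse_record(line):
    """Parse one record in an assumed format: pos|id|field|field|..."""
    fields = [f.strip() for f in line.split("|")]
    if len(fields) < 2:
        raise ValueError(f"too few fields: {line!r}")
    pos, lemma_id, rest = fields[0], fields[1], fields[2:]
    if pos == "noun":
        # Assumed layout: nominative, genitive, gender.
        if len(rest) != 3:
            raise ValueError(f"noun needs 3 data fields: {line!r}")
        return Noun(lemma_id, rest[0], rest[1], rest[2])
    if pos == "verb":
        # Assumed layout: principal parts, then the conjugation number.
        if len(rest) < 2:
            raise ValueError(f"verb needs at least 2 data fields: {line!r}")
        return Verb(lemma_id, tuple(rest[:-1]), int(rest[-1]))
    raise ValueError(f"unrecognized part of speech: {pos!r}")


noun = parse_record("noun|id1|amor|amoris|masculine")
verb = parse_record("verb|id2|amo|amare|amavi|amatum|1")
```

Dispatching on the first field keeps the rules for each part of speech in one place, and rejecting malformed records early matters when the input is machine-generated text that may not follow the expected layout.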