Formatting numeric tokens

Published

July 29, 2024

Manuscripts of Greek scientific and mathematical texts use the “Milesian” system of numeric notation.

Integers in the “Milesian” system of notation

Values from 1-9,999

The Milesian system is essentially a place-value system. The 27 values for ones (1-9), tens (10-90), and hundreds (100-900) are noted with specific alphabetic characters in normal alphabetic order, with the additions of ϛ for 6, ϙ for 90 and ϡ for 900.

Note

Note that in GreekScientificOrthography, characters used to write the basic integer values must be in lower-case.

Ones Tens Hundreds
α ι ρ
β κ σ
γ λ τ
δ μ υ
ε ν φ
ϛ ξ χ
ζ ο ψ
η π ω
θ ϙ ϡ

Since most of the characters used to write integers in the Milesian system can also be read as alphabetic characters, integer tokens are flagged with a special marker, the numeric tick mark, ʹ (Unicode x0374). The integer value 1 is written like this:

one = "αʹ" 

As in our place-value notation, the digits run left to right from largest to smallest. Unlike our notation, however, the Milesian system needs no zero character to mark an empty hundreds, tens or ones column, since the characters for each column are distinct.

eleven = "ιαʹ"
one_hundred_one = "ραʹ"
one_hundred_eleven = "ριαʹ"

All of these are valid strings in a GreekSciOrthography.

ortho = stemortho()
all(s -> validstring(s, ortho), [one, eleven, one_hundred_one, one_hundred_eleven])
true
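The arithmetic behind these strings can be sketched in a few lines of plain Julia. This is only an illustration and not part of GreekScientificOrthography: the names `ONES`, `TENS`, `HUNDREDS`, `TICKS`, `digitvalue` and `milesian_value` are hypothetical, and the sketch assumes a well-formed token in the range 1-999.

```julia
# Illustrative helpers only; these names are assumptions, not package API.
const ONES     = collect("αβγδεϛζηθ")   # 1-9
const TENS     = collect("ικλμνξοπϙ")   # 10-90
const HUNDREDS = collect("ρστυφχψωϡ")   # 100-900

# The numeric tick is U+0374; Unicode normalization can turn it into U+02B9,
# so this sketch accepts both.
const TICKS = ('\u0374', '\u02b9')

# Value of a single digit character: its position in its column
# multiplied by the column's factor.
function digitvalue(c::Char)
    for (column, factor) in ((ONES, 1), (TENS, 10), (HUNDREDS, 100))
        i = findfirst(==(c), column)
        i === nothing || return i * factor
    end
    throw(ArgumentError("not a Milesian digit: $c"))
end

# Value of a tick-marked token with no comma: the sum of its digit values.
milesian_value(token::AbstractString) =
    sum(digitvalue(c) for c in rstrip(token, TICKS))

milesian_value("ριαʹ")   # 100 + 10 + 1 = 111
```

Because every column has its own characters, the sketch never needs to know a digit's position in the string: it simply adds the values up.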

A comma separates the thousands column (with values from 1,000 to 9,000) on its left from the hundreds, tens and ones values on its right, again just like our practice. The thousands column reuses the same characters as the ones column.

one_thousand_one = "α,αʹ"
validstring(one_thousand_one, ortho)
true

All of these strings represent a single integer token (type MilesianIntegerToken).

tokenize(one, ortho)
1-element Vector{OrthographicToken}:
 OrthographicToken("αʹ", MilesianIntegerToken())
tokenize(one_thousand_one, ortho)
1-element Vector{OrthographicToken}:
 OrthographicToken("α,αʹ", MilesianIntegerToken())
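As a rough sketch of how the comma changes the arithmetic, the following plain-Julia snippet multiplies whatever stands to the left of the comma by 1,000 and adds the rest. The names `DIGITS`, `TICKS`, `columns` and `with_thousands` are hypothetical and are not part of GreekScientificOrthography; the sketch assumes a well-formed token.

```julia
# Illustrative sketch; all names here are assumptions, not package API.
const DIGITS = Dict(zip("αβγδεϛζηθικλμνξοπϙρστυφχψωϡ",
                        [1:9; 10:10:90; 100:100:900]))
const TICKS = ('\u0374', '\u02b9')   # numeric tick and its normalized form

# Sum of the digit values in a run of Milesian digits (an empty run is 0).
columns(s) = sum(c -> DIGITS[c], s; init = 0)

function with_thousands(token::AbstractString)
    body = rstrip(token, TICKS)                    # drop the numeric tick
    occursin(',', body) || return columns(body)    # no thousands column
    thousands, rest = split(body, ','; limit = 2)
    return 1000 * columns(thousands) + columns(rest)
end

with_thousands("α,αʹ")   # 1000 * 1 + 1 = 1001
```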

Integers from 10,000-19,999

The value for 10,000 is written with the upper-case mu, Μ (for μυριάς, “myriad”).

myriad = "Μʹ"
validstring(myriad, ortho)
true

Conventionally, the myriads value is written as its own token, separated by white space from the smaller columns. The string for a value like 10,001 is therefore represented by two tokens.

tenthousand_one = "Μʹ αʹ"
tokenize(tenthousand_one, ortho)
2-element Vector{OrthographicToken}:
 OrthographicToken("Μʹ", MilesianIntegerToken())
 OrthographicToken("αʹ", MilesianIntegerToken())

Note

Greek manuscripts do not normally repeat the tick mark on the myriad marker; GreekScientificOrthography requires the tick there so that every token can be parsed independently of its context.

You can express whole numbers up to 19,999 in this way.

nineteen_999 = "Μʹ θ,ϡϙθʹ"
validstring(nineteen_999, ortho)
true
tokenize(nineteen_999, ortho)
2-element Vector{OrthographicToken}:
 OrthographicToken("Μʹ", MilesianIntegerToken())
 OrthographicToken("θ,ϡϙθʹ", MilesianIntegerToken())

Integers greater than 19,999

To write values larger than 19,999, Milesian notation multiplies the myriad character. In manuscripts, the multiplier is normally written above the Μ; GreekScientificOrthography instead brackets the multiplier with carets, the markdown convention for superscript that is supported by pandoc, among others. In this context, neither the Μ nor its multiplier requires the numeric tick.

20,000, for example, is written as Μ multiplied by β:

twentyk = "Μ^β^"
validstring(twentyk, ortho)
true

In an environment that supports pandoc’s markdown extension, the multiplier will display as a superscript.

using Markdown
Markdown.parse(twentyk)

Μβ

Archimedes uses this notation as he derives limiting values for pi in his treatise Measurement of a Circle. The value 349,450, for example, is written with these two tokens:

threefortynine450 = "Μ^λδ^ θ,υνʹ"
Markdown.parse(threefortynine450)

Μλδ θ,υνʹ

tokenize(threefortynine450, ortho)
2-element Vector{OrthographicToken}:
 OrthographicToken("Μ^λδ^", MilesianIntegerToken())
 OrthographicToken("θ,υνʹ", MilesianIntegerToken())
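To see how the myriad tokens combine with the smaller columns, here is a minimal plain-Julia sketch that evaluates each white-space-delimited token and sums the results. It is an illustration under stated assumptions, not the package's implementation: `DIGITS`, `TICKS`, `MYRIAD`, `columns`, `tokenvalue` and `milesian_sum` are hypothetical names, the input is assumed to be well formed, and the thousands digit is assumed to stand to the left of the comma.

```julia
# Illustrative sketch; all names here are assumptions, not package API.
const DIGITS = Dict(zip("αβγδεϛζηθικλμνξοπϙρστυφχψωϡ",
                        [1:9; 10:10:90; 100:100:900]))
const TICKS  = ('\u0374', '\u02b9')   # numeric tick and its normalized form
const MYRIAD = '\u039c'               # Greek capital mu

# Sum of the digit values in a run of Milesian digits (an empty run is 0).
columns(s) = sum(c -> DIGITS[c], s; init = 0)

# Value of one white-space-delimited token.
function tokenvalue(token::AbstractString)
    body = rstrip(token, TICKS)
    if !isempty(body) && first(body) == MYRIAD
        # "Μ" alone is one myriad; "Μ^λδ^" is 34 myriads.
        multiplier = strip(body, (MYRIAD, '^'))
        return 10_000 * (isempty(multiplier) ? 1 : columns(multiplier))
    end
    thousands, rest = occursin(',', body) ? split(body, ','; limit = 2) : ("", body)
    return 1000 * columns(thousands) + columns(rest)
end

# The value of the whole string is the sum of its tokens.
milesian_sum(s::AbstractString) = sum(tokenvalue, split(s))

milesian_sum("Μʹ θ,ϡϙθʹ")   # 10_000 + 9_999 = 19_999
milesian_sum("Μ^β^")        # 2 * 10_000 = 20_000
```

Under the same assumptions, 34 myriads followed by 9,450 evaluates to 349,450.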

Fractional values

GreekScientificOrthography includes characters for three fractional values that are often written in manuscripts with special symbols: 𐅵 for one half (Unicode x10175), 𐅷 for two-thirds (Unicode x10177), and 𐅸 for three quarters (Unicode x10178). (The package also makes these characters available with the constant names ONE_HALF, TWO_THIRDS and THREE_FOURTHS.)

Apart from these special cases, the only notation for fractional values available to Greek mathematicians was normal integer notation flagged with a special double-prime marker, which indicates that the value is the reciprocal of the integer written.

sixth = "ϛ″"
thirtysixth = "λϛ″"
all(s -> validstring(s, ortho), [sixth, thirtysixth])
true

Other fractional values are written as sums of these reciprocals. The fraction 2/3, for example, can appear as 1/2 + 1/6, written simply as a succession of individual fractional tokens. As with integer values, the left-to-right sequence runs from greatest to least, and the value is the sum of the tokens.

twothirds = "β″ ϛ″"
tokenize(twothirds, ortho)
2-element Vector{OrthographicToken}:
 OrthographicToken("β′′", MilesianFractionToken())
 OrthographicToken("ϛ′′", MilesianFractionToken())
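The sum-of-reciprocals reading can be checked with Julia's built-in `Rational` type. The snippet below is a minimal sketch, not package code: `DIGITS`, `PRIMES`, `unitfraction` and `fraction_sum` are hypothetical names, and the sketch accepts either a single double prime (U+2033) or a pair of primes (U+2032) as the reciprocal marker, since both forms appear in the example above.

```julia
# Illustrative sketch; all names here are assumptions, not package API.
const DIGITS = Dict(zip("αβγδεϛζηθικλμνξοπϙρστυφχψωϡ",
                        [1:9; 10:10:90; 100:100:900]))

# Reciprocal markers: double prime (U+2033) or repeated prime (U+2032).
const PRIMES = ('\u2033', '\u2032')

# A fraction token such as "λϛ″" denotes the reciprocal of 36.
function unitfraction(token::AbstractString)
    n = sum(c -> DIGITS[c], rstrip(token, PRIMES))
    return 1 // n
end

# The value of a run of fraction tokens is the sum of the reciprocals.
fraction_sum(s::AbstractString) = sum(unitfraction, split(s))

fraction_sum("β″ ϛ″")   # 1//2 + 1//6 = 2//3
```

Using `Rational` rather than floating point keeps the result exact, so the sum can be compared directly with `2//3`.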

Fractions can of course be mixed with integer numbers.

six_and_twothirds = "ϛʹ β″ ϛ″"
validstring(six_and_twothirds, stemortho())
true
tokenize(six_and_twothirds, stemortho())
3-element Vector{OrthographicToken}:
 OrthographicToken("ϛʹ", MilesianIntegerToken())
 OrthographicToken("β′′", MilesianFractionToken())
 OrthographicToken("ϛ′′", MilesianFractionToken())