TY - GEN
T1 - A Formal Description of an Algorithm Suitable for Parsing the Language of Mathematics
AU - Vrečar, Luka
AU - Wells, Joe
AU - Kamareddine, Fairouz
N1 - Publisher Copyright:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
N2 - Kofler’s DynGenPar (DGP) generalized dynamic parsing algorithm [6, 7] is aimed at parsing the language of mathematics. Notably, DGP uses a notion of initial graph instead of parsing tables in the style of GLR and LR parsers. We distinguish the two previously existing definitions of DGP: (1) The high-level “Abstract DGP” (ADGP) is a non-deterministic algorithm which reaches any particular parse tree via a sequence of non-deterministic choices. It is not obvious just from reading ADGP how to implement it well. (2) Kofler’s C++ implementation (KDGP) implements its own concurrency engine for finding all possible parse trees simultaneously while synchronizing on each input token. It is challenging to comprehend how ADGP corresponds to KDGP, and it would be hard to formally prove properties of either ADGP or KDGP. We present a new mathematical definition of Core DGP (CDGP), the context-free grammar (CFG) core of DGP, that is both declarative and deterministic. We formalize the concurrency with a notion of continuation, and define a way to match continuations to parse trees in proving that our algorithm is sound and complete, i.e., it terminates and returns all valid parse trees excluding trees with superfluous recursion. We also make available a Python implementation which corresponds quite directly to the mathematical definition of CDGP.
AB - Kofler’s DynGenPar (DGP) generalized dynamic parsing algorithm [6, 7] is aimed at parsing the language of mathematics. Notably, DGP uses a notion of initial graph instead of parsing tables in the style of GLR and LR parsers. We distinguish the two previously existing definitions of DGP: (1) The high-level “Abstract DGP” (ADGP) is a non-deterministic algorithm which reaches any particular parse tree via a sequence of non-deterministic choices. It is not obvious just from reading ADGP how to implement it well. (2) Kofler’s C++ implementation (KDGP) implements its own concurrency engine for finding all possible parse trees simultaneously while synchronizing on each input token. It is challenging to comprehend how ADGP corresponds to KDGP, and it would be hard to formally prove properties of either ADGP or KDGP. We present a new mathematical definition of Core DGP (CDGP), the context-free grammar (CFG) core of DGP, that is both declarative and deterministic. We formalize the concurrency with a notion of continuation, and define a way to match continuations to parse trees in proving that our algorithm is sound and complete, i.e., it terminates and returns all valid parse trees excluding trees with superfluous recursion. We also make available a Python implementation which corresponds quite directly to the mathematical definition of CDGP.
U2 - 10.1007/978-3-032-07021-0_10
DO - 10.1007/978-3-032-07021-0_10
M3 - Conference contribution
AN - SCOPUS:105020566349
SN - 9783032070203
T3 - Lecture Notes in Computer Science
SP - 171
EP - 188
BT - Intelligent Computer Mathematics. CICM 2025
A2 - de Paiva, Valeria
A2 - Koepke, Peter
PB - Springer
T2 - 18th International Conference on Intelligent Computer Mathematics 2025
Y2 - 6 October 2025 through 10 October 2025
ER -