A Formal Description of an Algorithm Suitable for Parsing the Language of Mathematics

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

Kofler’s DynGenPar (DGP) generalized dynamic parsing algorithm [6, 7] is aimed at parsing the language of mathematics. Notably, DGP uses a notion of initial graph instead of parsing tables in the style of GLR and LR parsers. We distinguish the two previously existing definitions of DGP: (1) The high-level “Abstract DGP” (ADGP) is a non-deterministic algorithm which reaches any particular parse tree via a sequence of non-deterministic choices. It is not obvious just from reading ADGP how to implement it well. (2) Kofler’s C++ implementation (KDGP) implements its own concurrency engine for finding all possible parse trees simultaneously while synchronizing on each input token. It is challenging to comprehend how ADGP corresponds to KDGP, and it would be hard to formally prove properties of either ADGP or KDGP. We present a new mathematical definition of Core DGP (CDGP), the context-free grammar (CFG) core of DGP, that is both declarative and deterministic. We formalize the concurrency with a notion of continuation, and define a way to match continuations to parse trees in proving that our algorithm is sound and complete, i.e., it terminates and returns all valid parse trees excluding trees with superfluous recursion. We also make available a Python implementation which corresponds quite directly to the mathematical definition of CDGP.

Original languageEnglish
Title of host publicationIntelligent Computer Mathematics. CICM 2025
EditorsValeria de Paiva, Peter Koepke
PublisherSpringer
Pages171-188
Number of pages18
ISBN (Electronic)9783032070210
ISBN (Print)9783032070203
DOIs
Publication statusPublished - 2026
Event18th International Conference on Intelligent Computer Mathematics 2025 - Brasilia, Brazil
Duration: 6 Oct 202510 Oct 2025

Publication series

NameLecture Notes in Computer Science
Volume16136
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Conference

Conference18th International Conference on Intelligent Computer Mathematics 2025
Abbreviated titleCICM 2025
Country/TerritoryBrazil
CityBrasilia
Period6/10/2510/10/25

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science

Fingerprint

Dive into the research topics of 'A Formal Description of an Algorithm Suitable for Parsing the Language of Mathematics'. Together they form a unique fingerprint.

Cite this