Incremental Scannerless Generalized LR Parsing

Maarten P. Sijm. Incremental Scannerless Generalized LR Parsing. Master's thesis, Delft University of Technology, July 2021. [doi]

Abstract

The Scannerless Generalized LR (SGLR) parsing algorithm supports the development of composed languages seamlessly but does not support incremental parsing. The Incremental Generalized LR (IGLR) parsing algorithm, on the other hand, does not support the seamless composition of languages. This thesis presents the Incremental Scannerless Generalized LR (ISGLR) parsing algorithm and investigates the effects of combining the SGLR and IGLR parsing algorithms. While the algorithmic differences are orthogonal, the fact that scannerless parsing relies on non-deterministic parsing for disambiguation has a negative impact on incrementality. Nonetheless, we show that the ISGLR parsing algorithm performs better than the batch SGLR parsing algorithm in typical scenarios. On average, the ISGLR parser can reuse 99% of a previous parse result. When parsing from scratch, the ISGLR parser has a 24% run time overhead compared to the SGLR parser, but when parsing incrementally for changes that are smaller than 1% of the input size on average, it has a 9× speedup.

Predecessors

The following are older variants of this publication: