From 8a393f554ab1b68f3ab74f634898bad41e9135e1 Mon Sep 17 00:00:00 2001
From: Josh Holtrop <jholtrop@gmail.com>
Date: Thu, 15 Jan 2026 20:22:54 -0500
Subject: [PATCH] Document p_lex and p_token_info_t in user guide - fix #37

---
 doc/user_guide.md | 54 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 51 insertions(+), 3 deletions(-)

diff --git a/doc/user_guide.md b/doc/user_guide.md
index 2c4f8f1..db66bed 100644
--- a/doc/user_guide.md
+++ b/doc/user_guide.md
@@ -751,7 +751,7 @@ Some example uses of this functionality could be to:
 
   * Detect integer overflow when lexing an integer literal constant.
   * Detect and report an error as soon as possible during parsing before continuing to parse any more of the input.
-  * Determine whether parsing should stop and instead be performed using a different parser version.
+  * Determine whether parsing should stop and instead be retried using a different parser version.
 
 To terminate parsing from a lexer or parser user code block, use the
 `$terminate(code)` function, passing an integer expression argument.
@@ -787,7 +787,7 @@ Propane generates the following result code constants:
 * `P_EOF`: The lexer reached the end of the input string.
 * `P_USER_TERMINATED`: A parser user code block has requested to terminate the parser.
 
-Result codes are returned by the functions `p_decode_code_point()`, `p_lex()`, and `p_parse()`.
+Result codes are returned by the API functions `p_decode_code_point()`, `p_lex()`, and `p_parse()`.
 
 ##> Types
 
@@ -807,7 +807,7 @@ A pointer to this instance is passed to the generated functions.
 
 ### `p_position_t`
 
-The `p_position_t` structure contains two fields `row` and `col`.
+The `p_position_t` structure contains two fields: `row` and `col`.
 These fields contain the 1-based row and column describing a parser position.
 
 For D targets, the `p_position_t` structure can be checked for validity by
@@ -817,6 +817,16 @@ For C targets, the `p_position_t` structure can be checked for validity by
 calling `p_position_valid(pos)` where `pos` is a `p_position_t` structure
 instance.
 
+### `p_token_info_t`
+
+The `p_token_info_t` structure contains the following fields:
+
+* `position` (`p_position_t`) holds the text position of the first code point in the token.
+* `end_position` (`p_position_t`) holds the text position of the last code point in the token.
+* `length` (`size_t`) holds the number of input bytes used by the token.
+* `token` (`p_token_t`) holds the token ID of the lexed token.
+* `pvalue` (`p_value_t`) holds the parser value associated with the token.
+
 ### AST Node Types
 
 If AST generation mode is enabled, a structure type for each rule will be
@@ -927,6 +937,44 @@ p_context_t context;
 p_context_init(&context, input);
 ```
 
+### `p_lex`
+
+The `p_lex()` function is the main entry point to the lexer.
+The generated parser normally calls it automatically to retrieve the next
+input token, so the user does not need to call it directly.
+However, the user may initialize a context and call `p_lex()` to use the
+generated lexer in a standalone mode.
+
+Example:
+
+```
+p_context_t context;
+p_context_init(&context, input, input_length);
+p_token_info_t token_info;
+size_t result = p_lex(&context, &token_info);
+switch (result)
+{
+case P_DECODE_ERROR:
+    /* UTF-8 decode error */
+    break;
+case P_UNEXPECTED_INPUT:
+    /* Input text does not match any lexer pattern. */
+    break;
+case P_USER_TERMINATED:
+    /* Lexer user code block requested to terminate the lexer. */
+    break;
+case P_SUCCESS:
+    /*
+     * token_info.position holds the text position of the first code point in the token.
+     * token_info.end_position holds the text position of the last code point in the token.
+     * token_info.length holds the number of input bytes used by the token.
+     * token_info.token holds the token ID of the lexed token.
+     * token_info.pvalue holds the parser value associated with the token.
+     */
+    break;
+}
+```
+
 ### `p_parse`
 
 The `p_parse()` function is the main entry point to the parser.