Lexical Structure

This chapter describes the lexical structure of Rue programs, including tokens, comments, and whitespace.

The lexer processes source text and produces a sequence of tokens. Comments and whitespace are consumed during lexing but do not produce tokens of their own.
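For example, the line comments and the irregular spacing in the following function are discarded by the lexer, so it produces exactly the same token sequence as a tightly formatted version:

fn main() -> i32 {
    // this line comment produces no token
    let   x   =   1   <<   2;    // extra whitespace between tokens is skipped
    if x <= 4 { 1 } else { 0 }
}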

Maximal Munch

The lexer uses the maximal munch (or longest match) principle: at each position in the source text, the lexer consumes the longest sequence of characters that forms a valid token.

This principle resolves ambiguity when more than one token pattern could match at a given position. For example, <= is lexed as a single less-than-or-equal token rather than as < followed by =, and && is lexed as a single logical AND token rather than as two & tokens.

fn main() -> i32 {
    let x = 1 << 2;   // << is a single left-shift token
    let y = x <= 10;  // <= is a single less-than-or-equal token
    if true && false { 0 } else { 1 }  // && is a single logical AND token
}
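A common way to implement maximal munch for fixed operators is to try the longer spellings before their one-character prefixes. The sketch below is written in Rust purely to illustrate the idea; it is not the actual Rue lexer, and the Tok variants and lex_operator helper are hypothetical names.

#[derive(Debug, PartialEq)]
enum Tok {
    Shl,    // <<
    Le,     // <=
    Lt,     // <
    AndAnd, // &&
}

// Lex one operator starting at byte offset `pos`, preferring the longest match.
// Returns the token and the number of bytes it consumed.
fn lex_operator(src: &[u8], pos: usize) -> Option<(Tok, usize)> {
    // Two-character operators are checked before their one-character
    // prefixes, so "<<" and "<=" win over "<" (maximal munch).
    match (src.get(pos), src.get(pos + 1)) {
        (Some(&b'<'), Some(&b'<')) => Some((Tok::Shl, 2)),
        (Some(&b'<'), Some(&b'=')) => Some((Tok::Le, 2)),
        (Some(&b'<'), _)           => Some((Tok::Lt, 1)),
        (Some(&b'&'), Some(&b'&')) => Some((Tok::AndAnd, 2)),
        _ => None,
    }
}

fn main() {
    assert_eq!(lex_operator(b"<< 2", 0), Some((Tok::Shl, 2)));
    assert_eq!(lex_operator(b"<= 10", 0), Some((Tok::Le, 2)));
    assert_eq!(lex_operator(b"< y", 0), Some((Tok::Lt, 1)));
}

Because << and <= are checked before <, an input such as 1 << 2 cannot be mis-lexed as 1 < < 2.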
