Ruta graveolens  ·  notes from a language experiment  ·  cultivated since 2025

Introduction

This document is the Rue Language Specification. It defines the syntax and semantics of the Rue programming language.

Scope

This specification describes the Rue programming language as implemented by the reference compiler. It covers:

  • Lexical structure (tokens, comments, whitespace)
  • Types (integers, booleans, arrays, structs, enums, strings, move semantics, destructors)
  • Expressions and operators (including compile-time expressions)
  • Statements
  • Items (functions, structs, enums, constants)
  • Arrays
  • Runtime behavior (overflow, bounds, panics)
  • Unchecked code and raw pointers
  • Modules and program composition

This specification does not cover:

  • The standard library (when one exists)
  • Compiler implementation details
  • Platform-specific behavior beyond what is explicitly documented

Conformance

A conforming implementation MUST implement all normative requirements of this specification.

Paragraphs marked with a rule category are normative, except those marked informative or example. The following categories are used:

  Category            Description
  legality-rule       Compile-time requirements that must be enforced
  syntax              Grammar rules defining valid program structure
  dynamic-semantics   Runtime behavior requirements
  informative         Explanatory text that is not normative
  example             Code examples that are not normative

Behavior Categories

Beyond the paragraph categories above, this specification classifies program behavior into four categories, following C, C++, and Rust. Rue's guiding preference when assigning a behavior to a category is to choose the most-defined category and to confine undefined behavior to unchecked operations that cannot be checked otherwise. That preference is a design decision recorded in ADR-0036, not a normative rule of this document.

Undefined behavior imposes no requirements on a conforming implementation: a program that exhibits undefined behavior is invalid, and the implementation may do anything. In Rue, undefined behavior arises only within unchecked code — raw-pointer operations whose validity cannot be checked without changing a value's representation. The safe subset of Rue has no undefined behavior; this is the language's central memory-safety guarantee.

Unspecified behavior is behavior for which this specification permits a set of possibilities and does not require any particular one to be chosen or documented. Rue currently specifies most such choices — for example evaluation order (4.0) and drop order (3.9) — so this category has few instances today; it is defined for future use.

Implementation-defined behavior is behavior for which this specification permits a set of possibilities and requires the implementation to choose one and document its choice. Examples: the growth strategy and resulting capacity of String (3.7); the in-memory layout of struct and array types (3.6); the width of usize and isize on a target (3.1); and the implementation limits of Appendix C.

Erroneous behavior is behavior that is well-defined but constitutes a program error that a conforming implementation is encouraged to diagnose. It is distinct from undefined behavior, which imposes no requirements at all. Rue currently has no erroneous behavior: conditions that other languages leave erroneous, such as integer overflow, instead trap in Rue as a defined runtime panic (3.1, 8.1). The category is defined so that behaviors of this kind have a home if they later arise (for example, an opt-in wrapping-arithmetic mode).
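Example (informative). The difference between trapping on overflow and a wrapping mode can be sketched in Rust, used here only as a stand-in because this section does not define Rue syntax. The function names add_or_trap and add_wrapping are hypothetical, and a Result value stands in for the runtime panic so that the outcome can be observed; none of this is Rue's actual API.

```rust
/// Adds two 32-bit integers with overflow detection.
/// An `Err` models the defined runtime panic described in the text;
/// a real trap would terminate the program instead of returning.
fn add_or_trap(a: i32, b: i32) -> Result<i32, &'static str> {
    a.checked_add(b).ok_or("integer overflow")
}

/// Models the hypothetical opt-in wrapping mode mentioned in the text:
/// the result wraps around instead of trapping.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

fn main() {
    // In-range addition succeeds in both models.
    println!("{:?}", add_or_trap(1, 2));
    // At the upper bound, the checked model reports the overflow.
    println!("{:?}", add_or_trap(i32::MAX, 1));
    // The wrapping model silently produces a value instead.
    println!("{}", add_wrapping(i32::MAX, 1));
}
```

In both models the outcome is fully defined, which is what separates these categories from undefined behavior.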

Normative Language

This specification uses terminology from RFC 2119 to indicate requirement levels. The key words are interpreted as follows:

MUST and SHALL: An absolute requirement. A conforming implementation is required to satisfy this.

MUST NOT and SHALL NOT: An absolute prohibition. A conforming implementation is required not to do this.

SHOULD and RECOMMENDED: There may be valid reasons to ignore this requirement, but the implications must be understood.

SHOULD NOT and NOT RECOMMENDED: There may be valid reasons to accept this behavior, but the implications must be understood.

MAY and OPTIONAL: An item is truly optional. Implementations may or may not include it.

These keywords appear in bold throughout this specification to distinguish normative requirements from descriptive text.

Definitions

The following terms are used throughout this specification:

Expression: A syntactic construct that evaluates to a value.

Statement: A syntactic construct that performs an action but does not produce a value.

Item: A top-level definition in a program, such as a function or struct.

Type: A classification that determines what values an expression can produce and what operations are valid on those values.

Normative: Content that defines required behavior for conforming implementations.

Informative: Content that provides explanation or context but does not define required behavior.

Value: An instance of a type. Expressions evaluate to values.

Coercion: An implicit type conversion that occurs automatically during type checking. See section 3.4 for the complete set of coercions in Rue.

Compatible type: A type is compatible with another type if they are the same type, or if the first type can be coerced to the second type.

Panic: A runtime error condition that terminates program execution with a specific exit code. See Appendix B for the complete list of panic conditions.

Notation

Spec paragraph identifiers follow the format {chapter}.{section}:{paragraph}. For example, 3.1:5 refers to Chapter 3, Section 1, Paragraph 5.
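Example (informative). A tool that cross-references this specification might split an identifier into its three parts as sketched below. The sketch is in Rust, and the names ParagraphId and parse_paragraph_id are hypothetical. It assumes all three components are decimal numbers, so it does not handle references into the lettered appendices.

```rust
/// A parsed paragraph identifier such as `3.1:5`.
#[derive(Debug, PartialEq)]
struct ParagraphId {
    chapter: u32,
    section: u32,
    paragraph: u32,
}

/// Parses `{chapter}.{section}:{paragraph}`.
/// Returns `None` if a separator is missing or a component is not a number.
fn parse_paragraph_id(s: &str) -> Option<ParagraphId> {
    // Split off the paragraph number after the first ':'.
    let (head, paragraph) = s.split_once(':')?;
    // Split the remainder into chapter and section at the first '.'.
    let (chapter, section) = head.split_once('.')?;
    Some(ParagraphId {
        chapter: chapter.parse().ok()?,
        section: section.parse().ok()?,
        paragraph: paragraph.parse().ok()?,
    })
}

fn main() {
    println!("{:?}", parse_paragraph_id("3.1:5"));
    println!("{:?}", parse_paragraph_id("3.1"));
}
```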

Grammar rules use Extended Backus-Naur Form (EBNF) notation:

  • = defines a production
  • | separates alternatives
  • ( ) groups elements
  • { } indicates zero or more repetitions
  • [ ] indicates optional elements
  • " " indicates literal text
  • ; terminates a production
  • UPPERCASE indicates terminal symbols (tokens)

For example:

if_expr     = "if" expression "{" block "}" [ else_clause ] ;
else_clause = "else" ( "{" block "}" | if_expr ) ;
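
Example (informative). To show how the notation maps onto a parser, the two productions above can be read as a pair of mutually recursive functions. The Rust sketch below is a recognizer only, not the reference compiler's parser. It assumes a hypothetical token type Tok in which Expr and Block stand in for complete expression and block productions, which are defined elsewhere in the grammar.

```rust
/// Tokens for the sketch. `Expr` and `Block` are placeholders for whole
/// `expression` and `block` productions.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok { If, Else, LBrace, RBrace, Expr, Block }

/// Consumes one expected token and returns the remaining input.
fn eat(toks: &[Tok], want: Tok) -> Option<&[Tok]> {
    match toks.split_first() {
        Some((&t, rest)) if t == want => Some(rest),
        _ => None,
    }
}

/// "{" block "}"
fn braced_block(toks: &[Tok]) -> Option<&[Tok]> {
    let toks = eat(toks, Tok::LBrace)?;
    let toks = eat(toks, Tok::Block)?;
    eat(toks, Tok::RBrace)
}

/// if_expr = "if" expression "{" block "}" [ else_clause ] ;
fn if_expr(toks: &[Tok]) -> Option<&[Tok]> {
    let toks = eat(toks, Tok::If)?;
    let toks = eat(toks, Tok::Expr)?;
    let toks = braced_block(toks)?;
    // [ else_clause ] is optional: descend only when "else" comes next.
    if toks.first() == Some(&Tok::Else) {
        else_clause(toks)
    } else {
        Some(toks)
    }
}

/// else_clause = "else" ( "{" block "}" | if_expr ) ;
fn else_clause(toks: &[Tok]) -> Option<&[Tok]> {
    let toks = eat(toks, Tok::Else)?;
    // ( A | B ): try the braced block first, then a chained `if`.
    braced_block(toks).or_else(|| if_expr(toks))
}

/// True when the whole token sequence is exactly one `if_expr`.
fn is_if_expr(toks: &[Tok]) -> bool {
    matches!(if_expr(toks), Some(rest) if rest.is_empty())
}

fn main() {
    use Tok::*;
    // if e { b } else if e { b }
    let chained = [If, Expr, LBrace, Block, RBrace, Else, If, Expr, LBrace, Block, RBrace];
    println!("{}", is_if_expr(&chained));
}
```

The recursion from else_clause back into if_expr is what allows else-if chains of any length without a dedicated production.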

Organization

This specification is organized as follows:

  • Chapter 2: Lexical Structure - Tokens, comments, whitespace, keywords
  • Chapter 3: Types - Integer types, booleans, unit, never, arrays, structs, enums, strings, move semantics, destructors
  • Chapter 4: Expressions - Operators, control flow, function calls, compile-time expressions
  • Chapter 5: Statements - Variable bindings, assignment
  • Chapter 6: Items - Functions, structs, enums, constants
  • Chapter 7: Arrays - Fixed-size array behavior
  • Chapter 8: Runtime Behavior - Overflow, bounds checking, panics
  • Chapter 9: Unchecked Code - Raw pointers and unchecked intrinsics
  • Chapter 10: Modules - Module forms, import resolution, visibility, program composition
  • Appendix A: Grammar - Complete EBNF grammar
  • Appendix B: Runtime Panics - Summary of panic conditions
  • Appendix C: Implementation Limits - Minimum limits for conforming implementations

Version

This specification corresponds to version 0.1.0 of the Rue language.