Ripper: incompatibility for uppercase UTF-8 constant names in aliases #3457

Open
@noahgibbs


In aliases and many other cases, CRuby's Ripper emits different lexer tokens depending on the symbol's name. For instance, a name that starts with an uppercase letter emits :@const instead of :@ident.

TruffleRuby does this correctly for ASCII (7-bit) constant names like "A", but not for non-ASCII Unicode uppercase constant names like "Ñ".

CRuby:

irb(main):001:0> require "ripper"
=> false
irb(main):002:0> Ripper.sexp_raw("alias :foo :Ñ")
=>
[:program,
 [:stmts_add,
  [:stmts_new],
  [:alias, [:symbol_literal, [:symbol, [:@ident, "foo", [1, 7]]]], [:symbol_literal, [:symbol, [:@const, "Ñ", [1, 12]]]]]]]

TruffleRuby:

irb(main):001:0> require "ripper"
=> false
irb(main):002:0> Ripper.sexp_raw("alias :foo :Ñ")
=>
[:program,
 [:stmts_add,
  [:stmts_new],
  [:alias, [:symbol_literal, [:symbol, [:@ident, "foo", [1, 7]]]], [:symbol_literal, [:symbol, [:@ident, "Ñ", [1, 12]]]]]]]
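For anyone who wants to compare implementations without reading the full s-expression, here is a minimal sketch that pulls out only the scanner event for the second alias operand. The helper name alias_target_event is hypothetical (not part of Ripper), and the indexing assumes the sexp_raw shape shown in the transcripts above.

```ruby
require "ripper"

# Hypothetical helper: returns the scanner event type (:@const or :@ident)
# that Ripper reports for the second operand of `alias :foo :<name>`.
def alias_target_event(name)
  sexp = Ripper.sexp_raw("alias :foo :#{name}")
  # Assumed shape, as in the transcripts above:
  #   [:program, [:stmts_add, [:stmts_new], [:alias, first, second]]]
  alias_node = sexp[1][2]
  # second is [:symbol_literal, [:symbol, [<event>, "<name>", [line, col]]]]
  alias_node[2][1][1][0]
end

# Print the event for an ASCII uppercase, a non-ASCII uppercase,
# and a lowercase name so the three cases can be compared side by side.
%w[A Ñ a].each do |name|
  puts "#{name}: #{alias_target_event(name).inspect}"
end
```

Going by the transcripts above, the "Ñ" line should show :@const on CRuby and :@ident on TruffleRuby, while "A" and "a" should agree on both.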
