SourceFile.ContainsNonASCII can be false for non-ASCII in string literal fast path #4358

Description

@CPunisher

Summary

SourceFile.ContainsNonASCII can remain false even when the source file contains non-ASCII characters, if those characters appear inside a simple string literal handled by the scanner's string fast path.

This can make SourceFile.GetPositionMap() return an ASCII identity map, causing encoder node positions to remain UTF-8 byte offsets instead of being converted to UTF-16 offsets.
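To illustrate why an identity map gives wrong positions here, the following is a minimal sketch that compares a UTF-8 byte offset with the corresponding UTF-16 code unit offset. The helpers `utf16Offset` and `offsetsOf` and the constant `sample` are hypothetical names made up for this illustration; they are not part of the project's API.

```go
package main

import (
	"fmt"
	"strings"
)

// sample is a small source file with one non-ASCII character
// inside a string literal (illustration only).
const sample = `const x = "─";

namespace N {
  export const y = x;
}
`

// utf16Offset converts a UTF-8 byte offset in text into a UTF-16
// code unit offset by counting the code units of every rune before it.
// byteOffset is assumed to fall on a rune boundary.
func utf16Offset(text string, byteOffset int) int {
	units := 0
	for _, r := range text[:byteOffset] {
		if r >= 0x10000 {
			units += 2 // encoded as a surrogate pair in UTF-16
		} else {
			units++
		}
	}
	return units
}

// offsetsOf returns the UTF-8 byte offset and the UTF-16 code unit offset
// of the first occurrence of needle in text, or (-1, -1) if it is absent.
func offsetsOf(text, needle string) (byteOff, utf16Off int) {
	byteOff = strings.Index(text, needle)
	if byteOff < 0 {
		return -1, -1
	}
	return byteOff, utf16Offset(text, byteOff)
}

func main() {
	b, u := offsetsOf(sample, "namespace")
	fmt.Printf("byte offset: %d, UTF-16 offset: %d\n", b, u)
}
```

For pure ASCII input the two offsets are always equal, which is why an identity map is safe only when the file really contains no non-ASCII characters. In the sample, every position after the string literal differs between the two encodings.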

AI Analysis

const x = "─";

namespace N {
  export const y = x;
}

The character inside the string literal is non-ASCII. However, scanString can take the following fast path for simple strings that contain no escapes or line breaks:

strLen := strings.IndexByte(s.text[s.pos:], byte(quote))
if strLen > 0 {
    str := s.text[s.pos : s.pos+strLen]
    if jsxAttributeString ||
        strings.IndexByte(str, '\\') < 0 && strings.IndexByte(str, '\r') < 0 && strings.IndexByte(str, '\n') < 0 {
        s.pos += strLen + 1
        return str
    }
}

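This path advances over the string contents by byte length without decoding runes, so it never sets s.containsNonASCII = true. If no other part of the file causes the flag to be set, it stays false even though the file contains non-ASCII text.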
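One possible direction for a fix is to check the fast-path slice for bytes at or above 0x80 before returning it. The sketch below uses a hypothetical, stripped-down `toyScanner` type and a hypothetical `hasNonASCII` helper. It only mirrors the shape of the snippet above (without the JSX attribute case) and is not the project's actual scanner or a proposed patch.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// toyScanner is a hypothetical stand-in for the real scanner;
// it keeps only the fields needed for this illustration.
type toyScanner struct {
	text             string
	pos              int
	containsNonASCII bool
}

// hasNonASCII reports whether s contains any byte >= utf8.RuneSelf (0x80).
func hasNonASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= utf8.RuneSelf {
			return true
		}
	}
	return false
}

// scanSimpleString mirrors the fast path quoted in the issue, with one
// added check. s.pos must point just past the opening quote. It returns
// the string contents and true if the fast path applied.
func (s *toyScanner) scanSimpleString(quote byte) (string, bool) {
	strLen := strings.IndexByte(s.text[s.pos:], quote)
	if strLen > 0 {
		str := s.text[s.pos : s.pos+strLen]
		if strings.IndexByte(str, '\\') < 0 &&
			strings.IndexByte(str, '\r') < 0 &&
			strings.IndexByte(str, '\n') < 0 {
			// Added check: record non-ASCII content before skipping it.
			if hasNonASCII(str) {
				s.containsNonASCII = true
			}
			s.pos += strLen + 1
			return str, true
		}
	}
	return "", false
}

func main() {
	src := `const x = "─";`
	s := &toyScanner{text: src, pos: strings.IndexByte(src, '"') + 1}
	str, ok := s.scanSimpleString('"')
	fmt.Println(str, ok, s.containsNonASCII)
}
```

The byte scan keeps the fast path free of rune decoding, at the cost of one extra pass over the string contents. Whether that trade-off is acceptable for the real scanner is a question for the maintainers.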

Metadata

Labels

bug: Something isn't working
