Summary
SourceFile.ContainsNonASCII can remain false even when the source file contains non-ASCII characters, if those characters appear inside a simple string literal handled by the scanner's string fast path.
This can make SourceFile.GetPositionMap() return an ASCII identity map, causing encoder node positions to remain UTF-8 byte offsets instead of being converted to UTF-16 offsets.
AI Analysis
const x = "─";
namespace N {
    export const y = x;
}
The ─ character is non-ASCII. However, scanString can take the fast path for simple strings without escapes or line breaks:
strLen := strings.IndexByte(s.text[s.pos:], byte(quote))
if strLen > 0 {
	str := s.text[s.pos : s.pos+strLen]
	if jsxAttributeString ||
		strings.IndexByte(str, '\\') < 0 && strings.IndexByte(str, '\r') < 0 && strings.IndexByte(str, '\n') < 0 {
		s.pos += strLen + 1
		return str
	}
}
This path advances over the string contents by byte length without decoding runes or setting s.containsNonASCII = true.
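One possible direction, sketched under assumptions: have the fast path scan the skipped bytes for any value at or above `utf8.RuneSelf` and set the flag before returning. The `miniScanner` type, `hasNonASCII`, and `scanSimpleString` below are hypothetical stand-ins that mirror only the fields used in the snippet above; this is not the actual scanner, and the real fix may look different.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// miniScanner is a hypothetical stand-in for the real scanner; it keeps
// only the fields that the fast path touches.
type miniScanner struct {
	text             string
	pos              int
	containsNonASCII bool
}

// hasNonASCII reports whether s contains any byte outside the ASCII range.
func hasNonASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] >= utf8.RuneSelf {
			return true
		}
	}
	return false
}

// scanSimpleString assumes s.pos is just past the opening quote. It returns
// the string contents and true when the fast path applies, and records
// non-ASCII content before advancing past the closing quote.
func (s *miniScanner) scanSimpleString(quote byte) (string, bool) {
	strLen := strings.IndexByte(s.text[s.pos:], quote)
	if strLen <= 0 {
		return "", false
	}
	str := s.text[s.pos : s.pos+strLen]
	if strings.ContainsAny(str, "\\\r\n") {
		return "", false // escapes or line breaks need the slow path
	}
	if hasNonASCII(str) {
		s.containsNonASCII = true
	}
	s.pos += strLen + 1
	return str, true
}

func main() {
	s := &miniScanner{text: "\"─\";", pos: 1}
	str, ok := s.scanSimpleString('"')
	fmt.Println(str, ok, s.containsNonASCII)
}
```

The extra byte scan costs one more linear pass over the literal. Whether that is acceptable on this hot path, or whether the check should be folded into the existing `IndexByte` passes, is a design decision for the maintainers.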