Testcase:
Array.from({length:256},(x,i)=>[i,new TextDecoder('windows-1252').decode(Uint8Array.of(i)).codePointAt(0)]).filter(([a,b])=>a!==b).map(x=>x.join())
Node.js decodes windows-1252 as if it were a subset of Unicode, mapping every byte to the code point of the same value (the code above returns an empty array), which is incorrect.
E.g. Node.js:
> new TextDecoder('windows-1252').decode(Uint8Array.of(128)).codePointAt(0)
128
> new TextDecoder('windows-1252').decode(Uint8Array.of(130)).codePointAt(0)
130
> new TextDecoder('windows-1252').decode(Uint8Array.of(131)).codePointAt(0)
131
> new TextDecoder('windows-1252').decode(Uint8Array.of(159)).codePointAt(0)
159
Browsers (expected):
> new TextDecoder('windows-1252').decode(Uint8Array.of(128)).codePointAt(0)
8364
> new TextDecoder('windows-1252').decode(Uint8Array.of(130)).codePointAt(0)
8218
> new TextDecoder('windows-1252').decode(Uint8Array.of(131)).codePointAt(0)
402
> new TextDecoder('windows-1252').decode(Uint8Array.of(159)).codePointAt(0)
376
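To make the expected behavior checkable without a browser, here is a minimal sketch that compares the runtime's decoder against a reference table. The table is the windows-1252 index from the WHATWG Encoding Standard for bytes 0x80 to 0x9F. The names `WINDOWS_1252_HIGH`, `expectedCodePoint`, and `findMismatches` are illustrative helpers, not part of any Node.js API.

```javascript
// Reference mapping for bytes 0x80..0x9F, from the WHATWG Encoding
// Standard's windows-1252 index. Every other byte maps to the code point
// of the same value.
const WINDOWS_1252_HIGH = [
  0x20ac, 0x0081, 0x201a, 0x0192, 0x201e, 0x2026, 0x2020, 0x2021,
  0x02c6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008d, 0x017d, 0x008f,
  0x0090, 0x2018, 0x2019, 0x201c, 0x201d, 0x2022, 0x2013, 0x2014,
  0x02dc, 0x2122, 0x0161, 0x203a, 0x0153, 0x009d, 0x017e, 0x0178,
];

// Hypothetical helper: the code point a conforming decoder yields for one byte.
function expectedCodePoint(byte) {
  return byte >= 0x80 && byte <= 0x9f ? WINDOWS_1252_HIGH[byte - 0x80] : byte;
}

// Hypothetical helper: list every byte where the runtime disagrees with the table.
function findMismatches() {
  const decoder = new TextDecoder('windows-1252');
  const mismatches = [];
  for (let byte = 0; byte < 256; byte++) {
    const actual = decoder.decode(Uint8Array.of(byte)).codePointAt(0);
    const expected = expectedCodePoint(byte);
    if (actual !== expected) mismatches.push({ byte, expected, actual });
  }
  return mismatches;
}

console.log(findMismatches());
```

On a conforming runtime this should print an empty array; on the affected Node.js versions it should list the bytes in 0x80 to 0x9F whose mapping differs from Latin1.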
This also directly contradicts the doc (which is aware that windows-1252 and Latin1 are different):
Modern Web browsers follow the [WHATWG Encoding Standard][] which aliases
both `'latin1'` and `'ISO-8859-1'` to `'win-1252'`. This means that while doing
something like `http.get()`, if the returned charset is one of those listed in
the WHATWG specification it is possible that the server actually returned
`'win-1252'`-encoded data, and using `'latin1'` encoding may incorrectly decode
the characters.
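The distinction the doc describes can be seen with `Buffer` alone. This is a small sketch, assuming only that `Buffer`'s `'latin1'` encoding maps each byte to the code point of the same value; the windows-1252 values are copied from the browser transcript above rather than computed.

```javascript
// Bytes taken from the examples earlier in this report.
const bytes = [0x80, 0x82, 0x83, 0x9f];

// Buffer's 'latin1' encoding maps each byte to the code point of the same value.
const latin1CodePoints = bytes.map(
  (byte) => Buffer.from([byte]).toString('latin1').codePointAt(0)
);

// Code points browsers report for the same bytes decoded as windows-1252
// (copied from the "Browsers (expected)" transcript, not computed here).
const windows1252CodePoints = [8364, 8218, 402, 376];

bytes.forEach((byte, i) => {
  console.log(byte, 'latin1:', latin1CodePoints[i], 'windows-1252:', windows1252CodePoints[i]);
});
```

The two columns differ for every byte shown, which is why `TextDecoder('windows-1252')` returning the Latin1 values is a bug and not an equivalent result.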
It's also a regression since v20.18.3 and v22.13.0
Node.js <=20.18.2 behaves correctly, v22 <=22.12.0 also behaves correctly
This regressed in 20.x and 22.x this year, after they were labeled as LTS
20.x regressed during Maintenance.
Whatever caused this in 20/22 should be reverted
Doc passage quoted from node/doc/api/buffer.md, lines 229 to 234 in 7643c2a.