[e-lang] Source code character sets and Unicode

Kevin Reid kpreid at mac.com
Mon May 28 22:18:16 EDT 2007


On May 28, 2007, at 21:25, Mark S. Miller wrote:

> I have been thinking of requiring something like a
>
>      pragma.charset("unicode")
>
> before allowing non-ascii characters. This discussion is probably an
> opportune time to decide on this matter.

I suggest not using the term "charset", which is commonly misused to
refer to encodings. In fact, when I first read your message, I thought
you were proposing an encoding pragma.

Spelling it out as "characterSet" would be sufficient to avoid this
confusion.
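
To make the distinction concrete, here is a small sketch in Python
(chosen only for illustration; nothing here is E syntax). A character
set assigns each character a code point; an encoding decides which
bytes represent that code point, and different encodings give
different bytes for the same character:

```python
# One character from the Unicode character set.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Its identity within the character set: a single code point.
code_point = ord(ch)

# Its representation depends on the encoding chosen.
utf8_bytes = ch.encode("utf-8")
latin1_bytes = ch.encode("latin-1")
utf16_bytes = ch.encode("utf-16-be")

print(hex(code_point), utf8_bytes, latin1_bytes, utf16_bytes)
```

A pragma that names a character set says which code points may appear
in the source; it says nothing about how the file is stored as bytes.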


If we introduce something like this, perhaps we should also allow
specifying subsets of Unicode, e.g. excluding writing systems whose
characters resemble the ones actually used in the text (Cyrillic
letters that look like Latin ones, say)? This might be too complex for
the benefit, though.
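
As a rough idea of what such a subset check might look like, here is a
sketch in Python. Everything in it is hypothetical: the function name
find_disallowed and the allowed_scripts parameter are mine, and it
guesses a character's writing system from the first word of its
Unicode name, which is only an approximation of the real Unicode
Script property (Python's standard library does not expose that
property).

```python
import unicodedata


def find_disallowed(source, allowed_scripts=("LATIN",)):
    """Return (line, column, char, name) for each non-ASCII character
    whose Unicode name does not start with an allowed script name.

    Assumption: the leading word of the character name stands in for
    the writing system. Unnamed characters are always reported.
    """
    problems = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) < 128:
                continue  # ASCII is always permitted
            name = unicodedata.name(ch, "")
            if not any(name.startswith(s + " ") for s in allowed_scripts):
                problems.append((line_no, col, ch, name))
    return problems


# The second line uses CYRILLIC SMALL LETTER A in place of Latin "a".
sample = "def caf\u00e9 := 1\ndef p\u0430y := 2"
for line_no, col, ch, name in find_disallowed(sample):
    print(f"line {line_no}, column {col}: {name}")
```

With the default of Latin only, the accented "e" on the first line is
accepted and the Cyrillic letter on the second line is reported. Even
this toy version needs a policy for which scripts to allow, which
hints at the complexity I mentioned.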


I have not yet read the other documents you linked to in the rest of
your message.

-- 
Kevin Reid                            <http://homepage.mac.com/kpreid/>



