<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: WTF-16</title>
	<atom:link href="http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/</link>
	<description>Just another Blog.mozilla.com weblog</description>
	<lastBuildDate>Wed, 25 Jan 2012 23:29:35 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
	<item>
		<title>By: blassey</title>
		<link>http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/comment-page-1/#comment-50</link>
		<dc:creator>blassey</dc:creator>
		<pubDate>Mon, 18 Feb 2008 21:01:35 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/#comment-50</guid>
		<description>This is just a guess, but I would assume that the cost of the conversions in processing time, heap size and code size would hurt performance on mobile devices more than the reduction in memory use would help.

Perhaps what is really needed is an abstraction layer that only converts the string when necessary. For instance, if you follow xulrunner&#039;s startup on Windows, command line arguments are passed in as UTF-16 and immediately converted to UTF-8. Various string manipulations are performed and then various system APIs are called, but only after the UTF-8 strings are converted back to UTF-16. The memory for the UTF-8 strings isn&#039;t freed until shutdown.</description>
		<content:encoded><![CDATA[<p>This is just a guess, but I would assume that the cost of the conversions in processing time, heap size and code size would hurt performance on mobile devices more than the reduction in memory use would help.</p>
<p>Perhaps what is really needed is an abstraction layer that only converts the string when necessary. For instance, if you follow xulrunner&#8217;s startup on Windows, command line arguments are passed in as UTF-16 and immediately converted to UTF-8. Various string manipulations are performed and then various system APIs are called, but only after the UTF-8 strings are converted back to UTF-16. The memory for the UTF-8 strings isn&#8217;t freed until shutdown.</p>
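<p>One way to picture such an abstraction layer, as a minimal sketch only: the class below keeps the bytes it was handed and transcodes them the first time another encoding is requested. The name <code>LazyString</code> and its methods are hypothetical and are not Mozilla string classes; Python is used purely for illustration.</p>

```python
class LazyString:
    """Hypothetical wrapper: keeps the bytes it was given and transcodes only on demand."""

    def __init__(self, data: bytes, encoding: str):
        self._source = encoding
        self._cache = {encoding: data}  # encodings materialized so far

    def as_encoding(self, encoding: str) -> bytes:
        # Convert at most once per target encoding, then reuse the cached result.
        if encoding not in self._cache:
            text = self._cache[self._source].decode(self._source)
            self._cache[encoding] = text.encode(encoding)
        return self._cache[encoding]

    def release(self, encoding: str) -> None:
        # Drop a converted copy that is no longer needed (never the original bytes).
        if encoding != self._source:
            self._cache.pop(encoding, None)


# A command line argument arrives as UTF-16, as in the Windows startup case.
arg = LazyString("--profile".encode("utf-16-le"), "utf-16-le")
utf8 = arg.as_encoding("utf-8")      # converted here, on first use
same = arg.as_encoding("utf-16-le")  # original bytes returned, no conversion
arg.release("utf-8")                 # converted copy freed before shutdown
```

<p>Under these assumptions, a caller that hands the argument straight to a UTF-16 system API never pays for a conversion, and a converted copy can be released as soon as it is no longer needed.</p>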
]]></content:encoded>
	</item>
	<item>
		<title>By: dmandelin</title>
		<link>http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/comment-page-1/#comment-48</link>
		<dc:creator>dmandelin</dc:creator>
		<pubDate>Fri, 15 Feb 2008 21:47:25 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/#comment-48</guid>
		<description>Matthew: Good points. IMO, ease of use for programmers really is the biggest reason to think about decreasing the number of encodings. Based on what I hear, it sounds like we would need some more help to make such a big change happen for Mozilla 2.

Kim: That&#039;s the only problem I know of with UCS-2. I&#039;m not sure how important Linear B and Phoenician support really are anyway, but it just sort of bugs me that UCS-2 isn&#039;t truly a Unicode encoding any more.</description>
		<content:encoded><![CDATA[<p>Matthew: Good points. IMO, ease of use for programmers really is the biggest reason to think about decreasing the number of encodings. Based on what I hear, it sounds like we would need some more help to make such a big change happen for Mozilla 2.</p>
<p>Kim: That&#8217;s the only problem I know of with UCS-2. I&#8217;m not sure how important Linear B and Phoenician support really are anyway, but it just sort of bugs me that UCS-2 isn&#8217;t truly a Unicode encoding any more.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Kim Sullivan</title>
		<link>http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/comment-page-1/#comment-47</link>
		<dc:creator>Kim Sullivan</dc:creator>
		<pubDate>Fri, 15 Feb 2008 10:48:59 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/#comment-47</guid>
		<description>UCS-2 is mostly the same as UTF-16 - UTF-16 just has the advantage of supporting surrogate pairs. Why exactly do you revile UCS-2, while UTF-16 seems to be considered a viable alternative?

(I agree that not supporting U+10000 through U+10FFFF is a serious drawback; I just wanted to know if there was something else to it.)</description>
		<content:encoded><![CDATA[<p>UCS-2 is mostly the same as UTF-16 &#8211; UTF-16 just has the advantage of supporting surrogate pairs. Why exactly do you revile UCS-2, while UTF-16 seems to be considered a viable alternative?</p>
<p>(I agree that not supporting U+10000 through U+10FFFF is a serious drawback; I just wanted to know if there was something else to it.)</p>
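<p>To make the surrogate-pair difference concrete, here is a small sketch (Python, for illustration only; the helper name <code>utf16_code_units</code> is made up for this example). It lists the 16-bit code units UTF-16 uses for a character inside the Basic Multilingual Plane and for U+10000, the first code point that UCS-2 cannot represent.</p>

```python
def utf16_code_units(ch: str) -> list:
    """Return the UTF-16 code units (as ints) that encode the given text."""
    data = ch.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp = "\u4e2d"           # U+4E2D, inside the Basic Multilingual Plane
linear_b = "\U00010000"  # U+10000, a Linear B syllable, first code point beyond the BMP

print([hex(u) for u in utf16_code_units(bmp)])       # ['0x4e2d']
print([hex(u) for u in utf16_code_units(linear_b)])  # ['0xd800', '0xdc00']
```

<p>The BMP character fits in one code unit, which is all UCS-2 offers; U+10000 needs a high and a low surrogate, which only UTF-16 defines.</p>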
]]></content:encoded>
	</item>
	<item>
		<title>By: Matthew Gertner</title>
		<link>http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/comment-page-1/#comment-46</link>
		<dc:creator>Matthew Gertner</dc:creator>
		<pubDate>Fri, 15 Feb 2008 09:44:48 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/#comment-46</guid>
		<description>Although this will never be the primary consideration, I wouldn&#039;t discount the value of a more consistent API for developers. The Mozilla platform is large, complex and daunting as it is. It&#039;s pretty baffling for the newbie Mozilla coder to figure out why certain APIs take UTF-8 and others UTF-16, seemingly at random. For people designing new APIs for Mozilla, there is a constant decision of whether to use one (&quot;but then I&#039;ll have to convert to use so-and-so API&quot;) or the other (&quot;but then I&#039;ll have to convert for some other API&quot;). Conversion code isn&#039;t that slow, perhaps, but it does muck up and obfuscate code.

As I say, not the most important factor, perhaps, but all things being equal (and it sounds like they are quite emphatically not equal and that UTF-8 is superior), harmonizing on a single Unicode encoding would be a great step. Given all the big changes planned for Mozilla 2, I&#039;d be disappointed if this one didn&#039;t make the cut.</description>
		<content:encoded><![CDATA[<p>Although this will never be the primary consideration, I wouldn&#8217;t discount the value of a more consistent API for developers. The Mozilla platform is large, complex and daunting as it is. It&#8217;s pretty baffling for the newbie Mozilla coder to figure out why certain APIs take UTF-8 and others UTF-16, seemingly at random. For people designing new APIs for Mozilla, there is a constant decision of whether to use one (&#8220;but then I&#8217;ll have to convert to use so-and-so API&#8221;) or the other (&#8220;but then I&#8217;ll have to convert for some other API&#8221;). Conversion code isn&#8217;t that slow, perhaps, but it does muck up and obfuscate code.</p>
<p>As I say, not the most important factor, perhaps, but all things being equal (and it sounds like they are quite emphatically not equal and that UTF-8 is superior), harmonizing on a single Unicode encoding would be a great step. Given all the big changes planned for Mozilla 2, I&#8217;d be disappointed if this one didn&#8217;t make the cut.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert O'Callahan</title>
		<link>http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/comment-page-1/#comment-39</link>
		<dc:creator>Robert O'Callahan</dc:creator>
		<pubDate>Thu, 14 Feb 2008 23:47:44 +0000</pubDate>
		<guid isPermaLink="false">http://blog.mozilla.com/dmandelin/2008/02/14/wtf-16/#comment-39</guid>
		<description>CJK characters are 3 bytes each in UTF-8. That&#039;s why the memory usage reduction is surprising.

The Windows and Mac text APIs are not a problem for UTF-8. We already have a string copy in the textrun construction path that we could use to convert from UTF-8 to UTF-16 as well.</description>
		<content:encoded><![CDATA[<p>CJK characters are 3 bytes each in UTF-8. That&#8217;s why the memory usage reduction is surprising.</p>
<p>The Windows and Mac text APIs are not a problem for UTF-8. We already have a string copy in the textrun construction path that we could use to convert from UTF-8 to UTF-16 as well.</p>
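<p>The byte counts are easy to check. A minimal sketch (Python, for illustration only; the sample strings and the helper name <code>encoded_sizes</code> are made up for this example) compares the two encodings for CJK text and for ASCII-only markup:</p>

```python
def encoded_sizes(text: str) -> tuple:
    """Return (utf8_bytes, utf16_bytes) for text, with no byte order mark."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


cjk = "日本語"             # three CJK characters from the Basic Multilingual Plane
markup = '<p class="x">'  # ASCII-only markup

print(encoded_sizes(cjk))     # (9, 6)
print(encoded_sizes(markup))  # (13, 26)
```

<p>CJK text alone is larger in UTF-8, so any overall saving has to come from ASCII-heavy content such as markup, where UTF-8 takes half the space.</p>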
]]></content:encoded>
	</item>
</channel>
</rss>

