<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Casaba Security &#187; whitelist</title>
	<atom:link href="http://www.casaba.com/blog/tag/whitelist/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.casaba.com/blog</link>
	<description>Building and breaking software and robots</description>
	<lastBuildDate>Wed, 11 Jan 2012 18:08:45 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>I18N input validation whitelist filter with System.Globalization and GetUnicodeCategory</title>
		<link>http://www.casaba.com/blog/2007/04/i18n-input-validation-whitelist-filter-with-system-globalization-and-getunicodecategory/</link>
		<comments>http://www.casaba.com/blog/2007/04/i18n-input-validation-whitelist-filter-with-system-globalization-and-getunicodecategory/#comments</comments>
		<pubDate>Tue, 24 Apr 2007 05:33:20 +0000</pubDate>
		<dc:creator>Chris Weber</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Unicode]]></category>
		<category><![CDATA[whitelist]]></category>

		<guid isPermaLink="false"></guid>
		<description><![CDATA[Maybe you’re building internationalized code and wondering how to build a whitelist filter that will support all the different character sets you’re planning to support. If you support more than ten, especially some of the larger East Asian sets, this might seem like an unwieldy or tricky process. Luckily, it’s easier than most people [...]]]></description>
			<content:encoded><![CDATA[<p>Maybe you’re building internationalized code and wondering how to build a whitelist filter that will support all the different character sets you’re planning to support. If you support more than ten, especially some of the larger East Asian sets, this might seem like an unwieldy or tricky process.<br />
Luckily, it’s easier than most people would think. Building a good input validation filter can be simplified with .NET’s <a linkindex="84" href="http://msdn2.microsoft.com/en-us/library/system.globalization.charunicodeinfo.getunicodecategory.aspx">GetUnicodeCategory</a>. Use the CharUnicodeInfo method from the <strong>System.Globalization</strong> namespace rather than the one in System.Char, as the System.Char version looks like it may become the subordinate of the two. </p>
<p>With <strong>GetUnicodeCategory </strong>you can simply build a <strong>whitelist </strong>of the character <em><strong>categories </strong></em>you want to allow. So get away from thinking you have to write a regex filter that lists every character range you want to allow in each character set. It’s much simpler than that! </p>
<p>The Unicode standard assigns every character to one of <strong>30 general categories</strong>. They make sense too, for example Control (Cc), Lowercase Letter (Ll), Uppercase Letter (Lu), and Math Symbol (Sm). So, for example, you might want to allow only letters, decimal digits, combining marks, and dash and connector punctuation in your whitelist. This could be achieved with the following snippet: </p>
<p><code><br />
// requires: using System.Globalization;<br />
char cUntrustedInput; // one character of untrusted user input; assign it before testing<br />
UnicodeCategory cTestCategory = CharUnicodeInfo.GetUnicodeCategory(cUntrustedInput);<br />
if (cTestCategory == UnicodeCategory.LowercaseLetter ||<br />
cTestCategory == UnicodeCategory.UppercaseLetter ||<br />
cTestCategory == UnicodeCategory.DecimalDigitNumber ||<br />
cTestCategory == UnicodeCategory.TitlecaseLetter ||<br />
cTestCategory == UnicodeCategory.OtherLetter ||<br />
cTestCategory == UnicodeCategory.NonSpacingMark ||<br />
cTestCategory == UnicodeCategory.DashPunctuation ||<br />
cTestCategory == UnicodeCategory.ConnectorPunctuation)<br />
{<br />
// character looks safe, continue<br />
}<br />
else<br />
{<br />
// character is not allowed, fail<br />
}<br />
</code></p>
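<p>The snippet above tests a single character. As a rough sketch of how the same check might be applied to a whole string, something like the following could work. The names CategoryWhitelist, IsAllowed and IsValid are made up for this illustration, and rejecting empty input is an assumption rather than a requirement. It uses the GetUnicodeCategory(string, int) overload so that a surrogate pair is classified as one character instead of being rejected as two surrogates: </p>
<p><code><br />
using System.Globalization;<br />
<br />
static class CategoryWhitelist // hypothetical name, for illustration only<br />
{<br />
&nbsp;&nbsp;// same categories as the single-character snippet above<br />
&nbsp;&nbsp;static bool IsAllowed(UnicodeCategory category)<br />
&nbsp;&nbsp;{<br />
&nbsp;&nbsp;&nbsp;&nbsp;switch (category)<br />
&nbsp;&nbsp;&nbsp;&nbsp;{<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.LowercaseLetter:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.UppercaseLetter:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.TitlecaseLetter:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.OtherLetter:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.NonSpacingMark:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.DecimalDigitNumber:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.DashPunctuation:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;case UnicodeCategory.ConnectorPunctuation:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return true;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;default:<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;return false;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;}<br />
<br />
&nbsp;&nbsp;// true only if every character in the input is in an allowed category<br />
&nbsp;&nbsp;public static bool IsValid(string untrustedInput)<br />
&nbsp;&nbsp;{<br />
&nbsp;&nbsp;&nbsp;&nbsp;// assumption: null or empty input is treated as invalid<br />
&nbsp;&nbsp;&nbsp;&nbsp;if (string.IsNullOrEmpty(untrustedInput)) return false;<br />
&nbsp;&nbsp;&nbsp;&nbsp;for (int i = 0; i &lt; untrustedInput.Length; i++)<br />
&nbsp;&nbsp;&nbsp;&nbsp;{<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// for a valid surrogate pair this returns the category of the pair<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (!IsAllowed(CharUnicodeInfo.GetUnicodeCategory(untrustedInput, i))) return false;<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;// skip the low surrogate that was just classified with its partner<br />
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if (char.IsSurrogatePair(untrustedInput, i)) i++;<br />
&nbsp;&nbsp;&nbsp;&nbsp;}<br />
&nbsp;&nbsp;&nbsp;&nbsp;return true;<br />
&nbsp;&nbsp;}<br />
}<br />
</code></p>
<p>With this sketch, CategoryWhitelist.IsValid("hello_world-42") should pass, while any input containing an angle bracket should fail, because the angle brackets fall under Math Symbol (Sm). </p>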
<p>Not too bad, eh?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.casaba.com/blog/2007/04/i18n-input-validation-whitelist-filter-with-system-globalization-and-getunicodecategory/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

