
Generating test cases for Unicode-enabled software

September 10th, 2008 by

Unicode implementations call for a rich set of test cases. Recognizing
that is the first step. Automating those tests is the next.

At a high level, Unicode-related security bugs can be categorized by the following root causes:

Canonicalization

  • Interpreting non-shortest forms (e.g. UTF-8 encoding trickery)
  • Other decoding issues
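
As a starting point for automation, non-shortest-form test cases can be generated mechanically. The following is a minimal sketch, not a complete test suite; the function names (`encode_utf8_with_length`, `overlong_cases`, `is_rejected`) are illustrative, and Python's built-in strict UTF-8 decoder stands in for the decoder under test.

```python
def encode_utf8_with_length(code_point: int, length: int) -> bytes:
    """Encode code_point as a UTF-8-style sequence of exactly `length` bytes.

    If a shorter encoding exists, the result is a non-shortest (overlong)
    form that a conforming decoder must reject.
    """
    payload_bits = {2: 11, 3: 16, 4: 21}
    lead_marker = {2: 0xC0, 3: 0xE0, 4: 0xF0}
    if length not in payload_bits:
        raise ValueError("length must be 2, 3, or 4")
    if not 0 <= code_point < (1 << payload_bits[length]):
        raise ValueError("code point does not fit in the requested length")
    trailing = []
    remaining = code_point
    for _ in range(length - 1):
        trailing.append(0x80 | (remaining & 0x3F))  # low 6 bits per continuation byte
        remaining >>= 6
    return bytes([lead_marker[length] | remaining] + trailing[::-1])


def overlong_cases(char: str):
    """Yield the 2-, 3-, and 4-byte overlong forms of one ASCII character."""
    if len(char) != 1 or ord(char) > 0x7F:
        raise ValueError("expected a single ASCII character")
    for length in (2, 3, 4):
        yield encode_utf8_with_length(ord(char), length)


def is_rejected(data: bytes) -> bool:
    """Return True if the decoder under test (here: Python's) refuses the input."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


for case in overlong_cases("/"):
    hex_bytes = " ".join(f"{b:02X}" for b in case)
    print(hex_bytes, "rejected" if is_rejected(case) else "ACCEPTED (bug)")
```

To test other software, `is_rejected` would be swapped for a call into that software's decoder; any overlong form it accepts is a canonicalization bug.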

Absorption (over-consumption)

  • Over-consuming invalid byte sequences, or silently correcting them rather than failing
  • E.g. when <41 C2 C3 B1 42> becomes <41 42>
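
The byte sequence above makes a compact test case. This sketch feeds it to Python's UTF-8 decoder under three error-handling modes to show the range of outcomes a test harness should distinguish; the helper name `decode_outcomes` is illustrative, and other decoders may behave differently.

```python
# 'A', a lone lead byte C2, then a well-formed C3 B1 ("ñ"), then 'B'.
ILL_FORMED = bytes([0x41, 0xC2, 0xC3, 0xB1, 0x42])


def decode_outcomes(data: bytes) -> dict:
    """Decode `data` as UTF-8 under strict, replace, and ignore error handling."""
    outcomes = {}
    try:
        outcomes["strict"] = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        outcomes["strict"] = f"rejected: {exc.reason}"
    # Substitutes U+FFFD for the bad byte and keeps the valid sequence after it.
    outcomes["replace"] = data.decode("utf-8", errors="replace")
    # Silently drops the bad byte: a "correcting rather than failing" behavior.
    outcomes["ignore"] = data.decode("utf-8", errors="ignore")
    return outcomes


for mode, result in decode_outcomes(ILL_FORMED).items():
    print(f"{mode:8} {result!r}")
```

A decoder that returns "AB" for this input has consumed the well-formed C3 B1 along with the stray C2, which is the over-consumption described above.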

Character deletion and swallowing

  • “Deletion of noncharacters” (UTR #36)
  • E.g. when <scr[U+FEFF]ipt> becomes <script>
  • Use a replacement character (U+FFFD) instead!
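
The difference between deleting and replacing can be shown in a few lines. This is a sketch under simple assumptions: `naive_filter_allows` is a hypothetical stand-in for an input filter that runs before the character handling, not any real product's logic.

```python
BOM = "\ufeff"          # U+FEFF, the code point from the example above
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def naive_filter_allows(text: str) -> bool:
    """Hypothetical filter: allow input unless it contains a literal <script> tag."""
    return "<script>" not in text.lower()


def delete_code_point(text: str, code_point: str = BOM) -> str:
    """Risky: removing the character joins the text on either side of it."""
    return text.replace(code_point, "")


def replace_code_point(text: str, code_point: str = BOM) -> str:
    """Safer: the substitute keeps the two halves apart."""
    return text.replace(code_point, REPLACEMENT)


payload = "<scr" + BOM + "ipt>"
print("filter allows payload:", naive_filter_allows(payload))
print("after deletion:       ", repr(delete_code_point(payload)))
print("after replacement:    ", repr(replace_code_point(payload)))
```

The payload passes the filter, deletion then reassembles the tag, and replacement leaves a string that no longer parses as the tag.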

Interpreting syntax replacements

  • White space and line feeds
  • E.g. when U+180E acts like U+0020
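
Test cases for this category can be produced by substituting space-like code points into a template. The candidate list below is an assumption chosen for illustration, not an exhaustive set, and how each code point is classified depends on the Unicode version the software was built against, so the sketch reports the classification rather than assuming it.

```python
import unicodedata

# Illustrative candidates: controls, line/paragraph separators, and space-like
# characters, including U+180E from the example above.
CANDIDATES = [
    0x0009, 0x000B, 0x000C, 0x0085, 0x00A0, 0x1680, 0x180E,
    0x2000, 0x2003, 0x2028, 0x2029, 0x202F, 0x205F, 0x3000,
]


def space_substitution_cases(template: str, candidates=CANDIDATES):
    """Yield (code_point, test_string) with every U+0020 replaced by a candidate."""
    for code_point in candidates:
        yield code_point, template.replace(" ", chr(code_point))


for code_point, case in space_substitution_cases("a b"):
    ch = chr(code_point)
    print(
        f"U+{code_point:04X}",
        unicodedata.category(ch),      # general category in this Python's Unicode data
        f"isspace={ch.isspace()}",     # whether this runtime treats it as white space
        repr(case),
    )
```

Each generated string is then fed to the parser under test; a finding is any candidate that the parser treats as a separator while an upstream filter does not.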

Best-fit mappings

  • When σ becomes s
  • When ′ (U+2032) becomes ' (U+0027)
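
Python's codecs do not perform best-fit mapping, so this sketch simulates it with a small hypothetical table (`BEST_FIT`) containing only the two examples above; real best-fit tables are specific to a platform and code page. It flags inputs that contain no dangerous character until the mapping is applied.

```python
# Hypothetical excerpt of a best-fit table, for illustration only.
BEST_FIT = {
    "\u03c3": "s",  # GREEK SMALL LETTER SIGMA -> LATIN SMALL LETTER S
    "\u2032": "'",  # PRIME -> APOSTROPHE
}


def best_fit(text: str) -> str:
    """Simulate a lossy conversion that substitutes look-alike characters."""
    return "".join(BEST_FIT.get(ch, ch) for ch in text)


def strictly_encodable(text: str, encoding: str = "cp1252") -> bool:
    """Return True only if every character has an exact mapping in `encoding`."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def introduces(text: str, dangerous: str = "'") -> bool:
    """True if `dangerous` is absent from the input but present after best-fit."""
    return dangerous not in text and dangerous in best_fit(text)


sample = "\u2032 OR 1=1"
print(repr(sample), "->", repr(best_fit(sample)))
print("strictly encodable:", strictly_encodable(sample))
print("introduces apostrophe:", introduces(sample))
```

A strict encoder refuses the sample outright, which is the safe behavior; the simulated best-fit conversion quietly turns it into a string with an apostrophe.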

Buffer overruns

  • Incorrect assumptions about string sizes (chars vs. bytes)
  • Improper width calculations
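
The gap between character counts, byte counts, and display width is easy to measure. This sketch reports several sizes for the same string; `size_report` and its field names are illustrative, and the column count is a rough approximation based only on East Asian width.

```python
import unicodedata


def size_report(text: str) -> dict:
    """Return several 'lengths' of the same string that are easy to confuse."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        # Case mapping can change the length: one character may become two.
        "upper_code_points": len(text.upper()),
        # Approximate terminal columns: wide and fullwidth characters take two.
        "columns": sum(
            2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
            for ch in text
        ),
    }


for sample in ["A", "\u00f1", "\u00df", "\u65e5\u672c", "\U0001F600"]:
    print(repr(sample), size_report(sample))
```

A buffer sized from any one of these numbers and filled according to another is a candidate overrun, so test inputs should include strings where the values differ.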

Timing issues

  • Handling Unicode after security gates
  • Sometimes handling Unicode before a gate can be a problem too! E.g. BOM handling
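
The ordering problem can be reproduced with Unicode normalization. In this sketch the `gate` function is a hypothetical security check, and NFKC normalization stands in for whatever Unicode handling the application performs; fullwidth angle brackets slip past the gate and are folded to ASCII afterward.

```python
import unicodedata


def gate(text: str) -> str:
    """Hypothetical security gate: refuse anything containing '<script'."""
    if "<script" in text.lower():
        raise ValueError("blocked by gate")
    return text


def check_then_normalize(text: str) -> str:
    """Vulnerable order: the gate sees the raw text, later code sees the folded text."""
    return unicodedata.normalize("NFKC", gate(text))


def normalize_then_check(text: str) -> str:
    """Safer order: the gate sees the same text that later code will see."""
    return gate(unicodedata.normalize("NFKC", text))


# U+FF1C and U+FF1E are the fullwidth forms of '<' and '>'.
payload = "\uff1cscript\uff1e"

print("check then normalize:", repr(check_then_normalize(payload)))
try:
    normalize_then_check(payload)
except ValueError as exc:
    print("normalize then check:", exc)
```

As the list notes, moving all Unicode handling ahead of the gate is not automatically safe either, so each transformation needs its own test for what it adds or removes.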

Unicode formatter characters lead to cross-site scripting in popular browsers

September 5th, 2008 by

I'll be discussing some of the issues recently reported to Opera, Apple, and Mozilla at the 32nd Unicode Conference in San Jose next week. We discovered that certain Unicode characters can be leveraged to enable cross-site scripting attacks in popular web browsers (aka user agents). These issues involve using Unicode characters in ways that might bypass most filters, intrusion prevention systems (IPS), and intrusion detection systems (IDS).