Unicode can be quite complex, so we set out to create a truly ergonomic Unicode experience for Smalltalk. We went beyond "basic" Unicode support, and instead designed an advanced implementation that focused on "Unicode Correctness" and performance.
For example, all Unicode normalization is handled automatically by default to help prevent silent errors and other issues. VAST's new Unicode strings also maintain full API compatibility with the current String and Character classes while using a modern, compact UTF-8 representation. Our Unicode library even allows for representational flexibility and bridging to existing string classes through the use of "Views".
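To see the kind of silent error that automatic normalization helps prevent, consider the minimal sketch below. It uses Python's standard unicodedata module purely for illustration. It is not VAST's API, and the variable names are hypothetical. The same visible text, "café", can be stored as two different code point sequences, and a naive comparison treats them as different strings.

```python
import unicodedata

# Illustration only: Python standard library, not the VAST Unicode API.
# Two spellings of the same user-perceived text, "café".
composed = "caf\u00e9"      # 'é' as a single code point (U+00E9)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (U+0301)

# Without normalization, the strings compare unequal and report different lengths.
raw_equal = composed == decomposed
raw_lengths = (len(composed), len(decomposed))

# Normalizing both sides to one form (NFC here) makes the comparison succeed.
nfc_equal = (
    unicodedata.normalize("NFC", composed)
    == unicodedata.normalize("NFC", decomposed)
)

# The UTF-8 encodings differ in size as well until the text is normalized.
utf8_sizes = (len(composed.encode("utf-8")), len(decomposed.encode("utf-8")))

print(raw_equal, raw_lengths, nfc_equal, utf8_sizes)
```

A library that normalizes by default spares callers from remembering the explicit normalize step before every comparison, lookup, or hash.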
To keep these features performant, Unicode strings are optimized at creation (and during usage) through SIMD algorithms, just-in-time optimizations, and copy-on-write methodologies. They provide rapid UTF-8 and ASCII validation, fast string search, and exceptional memory efficiency. In fact, validation can occur at over 20 GB/sec!
See our Unicode documentation for a full review of this library's capabilities.