Difference between revisions of "Element type of string ranges"

From D Wiki

Revision as of 03:07, 9 March 2014

This article attempts to summarize the arguments in the thread "Major performance problem with std.array.front()".

Comparison

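One of the proposals in the thread is to switch the iteration type of string ranges from dchar to the string's own character type (char for string, wchar for wstring, dchar for dstring), so that string ranges iterate over code units instead of decoded code points.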
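The difference between the two iteration types can be seen in a small sketch. It only illustrates the current decoding behaviour next to a view of the raw code units obtained through std.string.representation; the helper names codePointCount and codeUnitCount are made up for this example and are not part of Phobos or of the proposal.

```d
import std.range.primitives : ElementType, ElementEncodingType, front, walkLength;
import std.string : representation;

// Current behaviour: a string range yields decoded code points.
static assert(is(ElementType!string == dchar));
// The encoded (array) element type, which the proposal would use instead.
static assert(is(ElementEncodingType!string == immutable(char)));

/// Hypothetical helper: number of range elements with decoding.
size_t codePointCount(string s)
{
    return s.walkLength;
}

/// Hypothetical helper: number of elements when the string is
/// treated like any other array (UTF-8 code units).
size_t codeUnitCount(string s)
{
    return s.representation.length;
}

void main()
{
    string s = "\u00E9"; // precomposed 'é': one code point, two UTF-8 code units

    assert(codePointCount(s) == 1);
    assert(codeUnitCount(s) == 2);

    assert(s.front == '\u00E9');            // decoded dchar
    assert(s.representation.front == 0xC3); // first raw UTF-8 byte of 'é'
}
```

For pure ASCII input both counts agree, which is why the difference tends to go unnoticed in simple tests.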

{| class="wikitable"
! Argument !! Old !! New
|-
| Searching for a particular dchar in a string. || {{Yes}} <tt>s.canFind('é')</tt> || {{No}} Will result in a pragma warning in some places, will fail silently in others (when specified via predicate).
|-
| Searching for a particular dchar in a non-normalized string. || {{No}} Above fails for combining marks, as that requires normalization.
|-
| Case conversion, insensitive comparison in ranges for certain languages || {{Yes}} <tt>s.count!((a, b) => std.uni.toLower(a) == std.uni.toLower(b))("é")</tt> || {{No}} Fails silently.
|-
| Case conversion, insensitive comparison in ranges for other languages || {{No}} Correct case conversion for all languages is dependent on locale information. (E.g. Turkish I / ı and İ / i).
|-
| Correctness || {{No}} Only works for certain languages and alphabets || {{No}} Only works for ASCII
|-
| Performance || {{No}} Implicit decoding everywhere, unless each algorithm is specialized not to || {{Yes}} As fast as <tt>ubyte[]</tt>
|-
| Implementation difficulty
| {{No}}
<pre>
phobos/std $ grep ElementEncodingType *.d | wc -l
80
</pre>
| {{Yes}} Strings are treated as any other arrays
|-
| Consistency || {{No}} Range algorithms return values different from array algorithms || {{Yes}} String ranges work like ranges of any other arrays
|}
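The searching and case-conversion rows of the comparison can be illustrated with a short sketch. It assumes std.uni.normalize with the NFC form as one possible way to handle combining marks, which the thread summary does not prescribe; canFindNormalized is a hypothetical helper, not a Phobos function.

```d
import std.algorithm.searching : canFind;
import std.string : representation;
import std.uni : normalize, NFC, toLower;

/// Hypothetical helper: compose the haystack to NFC before searching.
bool canFindNormalized(string haystack, dchar needle)
{
    return normalize!NFC(haystack).canFind(needle);
}

void main()
{
    string composed   = "caf\u00E9";  // 'é' as a single code point
    string decomposed = "cafe\u0301"; // 'e' followed by a combining acute accent

    // Decoded search finds the precomposed 'é' ...
    assert(composed.canFind('\u00E9'));
    // ... but misses the canonically equivalent decomposed form.
    assert(!decomposed.canFind('\u00E9'));
    // Normalizing first makes the search succeed.
    assert(canFindNormalized(decomposed, '\u00E9'));

    // Searching the raw code units: no single UTF-8 byte equals 0xE9,
    // so a per-code-unit search for 'é' fails silently.
    assert(!composed.representation.canFind(0xE9));

    // toLower uses no locale information: Turkish text would expect
    // the dotless 'ı' (U+0131) as the lowercase of 'I'.
    dchar capitalI = 'I';
    assert(toLower(capitalI) == 'i');
}
```

Neither iteration type addresses the last assertion, since the mapping depends on the language of the text rather than on how the string is iterated.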