2011 NEWS 2011-04-12 - NEW METHOD: Uize.String.split
The new Uize.String.split method, implemented in the Uize.String module, splits a string into an array of elements using the specified splitter string or regular expression, in strict accordance with the behavior that the ECMA-262 language specification defines for the String object's split instance method.
As you may be aware, JavaScript's built-in String object provides a split instance method. Unfortunately, this method is poorly implemented in some JavaScript interpreters, such as Microsoft's JScript interpreter, which is used by Internet Explorer and WSH (Windows Script Host). Such implementations can cause well-written code to behave inconsistently and exhibit bugs in the faulty interpreters.
The Uize.String.split method addresses this issue, so it can be used in Internet Explorer and WSH (Windows Script Host) to safely split strings using a regular expression splitter. Specifically, the Uize.String.split method addresses two known issues when using a regular expression splitter: incorrect dropping of empty split values and incorrect omission of captures in the result array.
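Both issues can be probed at run time with the built-in split method. The following is a minimal sketch, not code taken from the UIZE framework; the variable names are hypothetical and chosen only for illustration.

```javascript
// Hypothetical feature tests for the two issues described above.
// Each test is true in an interpreter whose split follows ECMA-262.

// Issue 1: the empty value between the two commas must be kept.
var keepsEmptyValues = 'foo,,bar'.split(/,/).length === 3;

// Issue 2: the captured comma must appear in the result array.
var keepsCaptures = 'a,b'.split(/(,)/).length === 3;

console.log('keeps empty values: ' + keepsEmptyValues);
console.log('keeps captures: ' + keepsCaptures);
```

Code that must run in a faulty interpreter could use checks like these to decide whether the built-in method is safe for a given splitter.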
1. Incorrect Dropping of Empty Split Values
Microsoft's JScript interpreter exhibits an issue where empty split values are omitted when a regular expression splitter is used (but not when a string splitter is used).
EXAMPLE
result = 'foo,,bar'.split (/,/);
In the above example, a string is being split using a regular expression splitter that matches a single comma. In compliant JavaScript interpreters, the above statement would produce a result array with the value ['foo','','bar'] - exactly the same result as if you used a simple string splitter (i.e. 'foo,,bar'.split (',')).
For a reason that is hard to fathom, the JScript interpreter omits the empty second element and instead produces the result ['foo','bar']. It's hard to justify or defend this implementation choice, as it wreaks havoc with using the split instance method to parse lists of values that were serialized using the Array object's join instance method when some of the values were empty strings.
The Uize.String.split method fixes this issue, so it can be used in Internet Explorer and WSH (Windows Script Host) to safely split strings using a regular expression splitter.
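The join and split round trip described above can be illustrated with a short sketch. It uses only the built-in methods and shows the behavior expected from a compliant interpreter; the variable names are hypothetical.

```javascript
// Serialize a list that contains an empty string, then parse it back.
var values = ['foo', '', 'bar'];
var serialized = values.join(',');  // 'foo,,bar'

// A compliant interpreter keeps the empty element for both splitter types.
var viaString = serialized.split(',');
var viaRegExp = serialized.split(/,/);

// True only when the regular expression split preserved the empty element.
var roundTripIsLossless = viaRegExp.length === values.length;

console.log(viaString, viaRegExp, roundTripIsLossless);
```

In an interpreter with the issue, viaRegExp would contain only two elements, so the parsed list would no longer match the list that was serialized.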
2. Incorrect Omission of Captures in the Result Array
While the split instance method of JavaScript's built-in String object is supposed to include captures from a regular expression splitter in the returned array, this behavior is not supported by some JavaScript interpreters - notably Microsoft's JScript interpreter.
This means that the statement 'line 1\rline 2\nline 3\r\nline 4'.split (/(\r\n|[\r\n])/) would return the result array ['line 1','line 2','line 3','line 4'] in the JScript interpreter, and not the array ['line 1','\r','line 2','\n','line 3','\r\n','line 4'] as it should. The Uize.String.split method fixes this issue, so it can be used in Internet Explorer and WSH (Windows Script Host) to safely split strings using a regular expression splitter.
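One way to obtain both behaviors without relying on the built-in split method is to walk the string with the RegExp object's exec method. The sketch below is an assumption about how such a workaround could look, not the actual Uize.String.split implementation. The function name splitKeepingCaptures is hypothetical, and the sketch simplifies the specified behavior by skipping empty matches and omitting the optional limit argument.

```javascript
// Hypothetical helper (not part of UIZE): splits sourceStr on a regular
// expression, keeping empty split values and inserting captures.
function splitKeepingCaptures(sourceStr, splitterRegExp) {
  // Work on a global copy of the splitter so exec() advances through the string.
  var flags =
    'g' +
    (splitterRegExp.ignoreCase ? 'i' : '') +
    (splitterRegExp.multiline ? 'm' : '');
  var regExp = new RegExp(splitterRegExp.source, flags);
  var result = [];
  var lastEnd = 0;
  var match;

  while ((match = regExp.exec(sourceStr)) !== null) {
    if (match[0].length === 0) {
      // Simplification: skip empty matches and step forward to avoid looping forever.
      regExp.lastIndex++;
      continue;
    }
    // Keep the text before the match, even when it is an empty string.
    result.push(sourceStr.slice(lastEnd, match.index));
    // Append every capture from the splitter match.
    for (var i = 1; i < match.length; i++) {
      result.push(match[i]);
    }
    lastEnd = match.index + match[0].length;
  }

  // Keep whatever follows the last match.
  result.push(sourceStr.slice(lastEnd));
  return result;
}

var lines = splitKeepingCaptures(
  'line 1\rline 2\nline 3\r\nline 4',
  /(\r\n|[\r\n])/
);
console.log(lines);
```

Because the helper never calls the built-in split method, its result does not depend on how an interpreter implements that method, although it still assumes that exec itself behaves as specified.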