Trim leading whitespace

I would like to trim leading whitespace from a text string and retain the remainder of the string including whitespace in the middle of the string. For example, a string such as

" !]L1lU22ut@#% T*)rM! 8" becomes "!]L1lU22ut@#% T*)rM! 8"

where the leading and internal whitespace can be any form of whitespace (blank, tab, etc.). I don't think that Igor has a trim function to remove leading or trailing characters. Is this correct? I am no regular Regular Expression person, but (through trial and error) I arrived at the following, using SplitString:

splitstring /E="(\\s*)([[:ascii:]]*)" sString, sFirst, sSecond

where sSecond returns the desired portion of the input string. Is this a reasonable solution to the problem or is there a better way?

Also, ideally, trailing whitespace would be removed as well, but I haven't figured that out using Regular Expressions.

Thanks,

Jeff
Function/S RemoveLeadingWhitespace(str)
    String str
   
    do
        String firstChar= str[0]
        if( CmpStr(firstChar," ") == 0 )
            str= str[1,inf]
        else
            break
        endif  
    while(1)
   
    return str
End

Function/S RemoveEndingWhitespace(str)
    String str
   
    do
        String str2= RemoveEnding(str," ")
        if( CmpStr(str2, str) == 0 )
            break
        endif
        str= str2
    while( 1 )
    return str
End

Function test()
    String str= "   !]L1lU22ut@#% T*)rM! 8    "
   
    String str2= RemoveLeadingWhitespace(str)

    Print "x"+str2+"x"

    String str3= RemoveEndingWhitespace(str2)

    Print "x"+str3+"x"
End

•test()
  x!]L1lU22ut@#% T*)rM! 8    x
  x!]L1lU22ut@#% T*)rM! 8x

--Jim Prouty
Software Engineer, WaveMetrics, Inc.
Jim,

That is certainly the bulletproof way to go. It is my backup if SplitString fails me. Thanks for the response.

I wonder if there are any comments on the regular expression method. The archives & help files didn't cough up any useful insights on this question.
Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.

http://www.codinghorror.com/blog/2008/06/regular-expressions-now-you-ha…

I wrote SplitString and the regular expression routines. I use them only if straightforward approaches are insufficient. Leading and trailing spaces are pretty simple to remove using easily understood foolproof code like the above.

Regular expressions are great until they don't work for some mysterious reason. Then they're a pain.

--Jim Prouty
Software Engineer, WaveMetrics, Inc.
jtigor wrote:
where the leading and internal whitespace can be any form of whitespace (blank, tab, etc.).


JimProuty wrote:
Leading and trailing spaces are pretty simple to remove using easily understood foolproof code like the above.



While I agree with Jim's thoughts on regular expressions, I believe his code will work only with spaces. It's possible to expand his code to handle other whitespace characters, of course, but the problem is that you'll need to add these cases explicitly. But what if you get exotic stuff like a nonbreaking space? I don't like dangling ends like that, even though the odds are minuscule.
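
To make "add these cases explicitly" concrete, here is a minimal sketch of what such an explicit test could look like. The function name IsListedWhitespace and the particular character list are my own assumptions, not anything from Jim's code, and the list is exactly where the dangling ends come from: anything not listed (a nonbreaking space, say) slips through.

```igor
// Hypothetical helper: returns 1 if char is one of the explicitly listed
// whitespace characters, 0 otherwise.
Function IsListedWhitespace(char)
    String char

    // Assumed list: space, tab, carriage return, line feed.
    // Any other kind of whitespace would have to be added here by hand.
    String whitespaceChars = " \t\r\n"

    // Guard against an empty or multi-character argument.
    if (strlen(char) != 1)
        return 0
    endif

    // strsearch returns -1 when char does not occur in the list.
    return strsearch(whitespaceChars, char, 0) >= 0
End
```

Swapping this in for the CmpStr(firstChar," ") comparison would cover tabs and line endings, but only those.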

So, while filtering for tabs and spaces only wouldn't be too bad, and might not hit the threshold for RegExps, I would probably use one anyway just for the peace of mind that it should handle *any* whitespace character.

jtigor wrote:
Also, ideally, trailing whitespace would be removed as well, but I haven't figured that out using Regular Expressions.


It would be tempting to use (note the use of the '^' and '$' anchors):
SplitString /E="^\\s*(.*)\\s*$" MyString, noTrailingOrLeadingSpaces

But it won't work. The '.*' pattern is greedy, so it swallows the trailing whitespace along with everything else, and the '\\s*' at the end is left to match only the empty string. The leading whitespace goes; the trailing whitespace stays in the captured string. We've got a clear problem definition and a deceptively simple RegExp, but we're already running into problems. That's the two problems for you ;).
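
If you want to see this for yourself, a small test along these lines should show it. The function name and the sample string are made up for illustration, and the brackets are only there to make the leftover blanks visible.

```igor
// Hypothetical demo of the greedy pattern keeping the trailing whitespace.
Function DemoGreedyTrim()
    String str = "  abc def  "
    String captured

    // '^\s*' eats the two leading blanks. The greedy '(.*)' then runs to the
    // end of the string, trailing blanks included, so the final '\s*' is
    // left with nothing to match.
    SplitString /E="^\\s*(.*)\\s*$" str, captured

    // Expected history output: [abc def  ]
    Print "[" + captured + "]"
End
```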


So, what do I propose? Simply make a slight change to Jim's code, like this:
Function/S RemoveLeadingWhitespace(str)
    String str
 
    if (strlen(str) == 0)
        return ""
    endif
   
    do
        String firstChar= str[0]
        if (IsWhiteSpace(firstChar))
            str= str[1,inf]
        else
            break
        endif  
    while (strlen(str) > 0)
 
    return str
End
 
Function/S RemoveEndingWhitespace(str)
    String str
   
    if (strlen(str) == 0)
        return ""
    endif
 
    do
        String lastChar = str[strlen(str) - 1]
        if (IsWhiteSpace(lastChar))
            str = str[0, strlen(str) - 2]
        else
            break   // last character is not whitespace: done (without this the loop never ends)
        endif
    while (strlen(str) > 0)
    return str
End

Function IsWhiteSpace(char)
    String char
   
    return GrepString(char, "\\s")
End


Which, I think, is still sufficiently in line with Jim's simple answer, while also satisfying my tendency to fret over obscure edge cases ;).
jtigor wrote:
Jim,

That is certainly the bulletproof way to go. It is my backup if SplitString fails me. Thanks for the response.

I wonder if there are any comments on the regular expression method. The archives & help files didn't cough up any useful insights on this question.


I believe the following does what you want (trimming both leading and trailing whitespace):

SplitString/E="^\\s*(.*?)\\s*$" testString, trimmedString

I assume you don't need to capture the leading whitespace as you did in your original question. As 741 mentioned, you should use the leading and trailing anchors in your expressions also.
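
If you want this as a reusable function, a sketch could look like the following. The name TrimWhitespace is just my choice. I have also added '(?s)' in front of the expression above so that '.' matches line breaks too; without it, a string with a newline in the middle would not match at all. That relies on Igor's PCRE honoring the inline '(?s)' option, which I believe it does, but treat it as an assumption and test it on your own data.

```igor
// Hypothetical wrapper: returns str without leading and trailing whitespace.
Function/S TrimWhitespace(str)
    String str

    String trimmed = ""

    // '^\s*'  skips the leading whitespace.
    // '(.*?)' is lazy: it captures as little as possible, which lets the
    //         final '\s*' take all of the trailing whitespace.
    // '(?s)'  (assumed to be supported) lets '.' match line breaks as well.
    SplitString /E="(?s)^\\s*(.*?)\\s*$" str, trimmed

    return trimmed
End
```

Whitespace in the middle of the string is left alone, so TrimWhitespace("  abc def  ") should give "abc def".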
Oh, and by the way, if you are using regular expressions and have access to a Windows machine, RegexBuddy is the best $39.95 you could spend.
Thanks to everyone for your comments.

Is there any chance that a replace-string function using regular expressions could be included in Igor Pro 7?