Removing all escape sequences from string

I am writing a procedure to load data from folders whose paths are given in the Windows file format. I get a funky error when loading the folder 200808 using LoadWave. The file path string is something like "C:\xxx\200808\xxx". I have discovered that a \ followed by three digits can form an octal number and is interpreted as an escape sequence. So, I want to write a function that replaces every \ with \\ when it is followed by three digits.

print replacestring("\200","dfds\200807\fsd", "\\\\200")

gives as output

dfds\\200807\fsd

which is exactly what I want. However, when I try to write a loop to remove all possible octal numbers, like this:

Function/S cleanPathString(str)
// Version 1, 24/06/2022
// Written in Igor Pro 9.0.0.10, build 37840, 64 bits

    string str
    string str2
    variable i,j,k
    str=replacestring("\t",str,"\\\\t")
    str=replacestring("\r",str,"\\\\r")
    str=replacestring("\n",str,"\\\\n")
    str=replacestring("\'",str,"\\\\'")
   
    for(i=0;i<8;i+=1)
        for(j=0;j<8;j+=1)
            for(k=0;K<8;K+=1)
                str2="\\"+num2str(i)+num2str(j)+num2str(k)
                str=replacestring(str2,str,"\\\\"+num2str(i))  
            endfor
        endfor
    endfor

    return str
End

print cleanPathString("dfds\200807fsd")

The output is:

dfds�807fsd

Hopefully, someone has an explanation for this behavior and can offer a good suggestion. Much appreciated!

Thanks,

 

Matthijs A. van Spronsen

Beamline scientist

Diamond Light Source

You need to use double-backslashes only when entering a literal string, like this:

String path = "C:\\Users\\Someone\\Data"

If the path is not entered as a literal string, for example, if you read it from a file, then you must not double-backslash it. The reason for this is that Igor's interpretation of backslash as introducing an escape sequence occurs only when Igor interprets a literal string (from the command line, from a procedure, or from the Execute operation).

Where are you getting "C:\xxx\200808\xxx" from?

I don't see any sign here that Windows accepts octal escape sequences.

Igor does accept octal escape sequences. If you entered "\200" as a literal string, Igor would convert it to a byte with the value octal 200 = 128 = hex 80 = 0x80.

I think the problem is that 0x80 is not a valid byte in a UTF-8 string.

0x80 is the Euro sign in Windows-1252 text encoding. I think you need to convert from Windows-1252 to UTF-8, like this:

path = ConvertTextEncoding(path, 3, 1, 1, 0)    // Convert Windows-1252 to UTF-8

If you were entering the path as a literal string, you would do this:

String path = "C:\\xxx\200808\\xxx" // Windows-1252 text encoding
path = ConvertTextEncoding(path, 3, 1, 1, 0)    // Convert Windows-1252 to UTF-8

Here we use double-backslashes but not for the "\200" part because we want Igor to interpret that as an escape sequence.

Maybe I'm misunderstanding, but can't you just replace the backslash with a colon in the path, as in:

LoadWave/A/G "C:temp:200808:subfolder:MyFile.txt"

This appears to work for me on Windows.

 

Thanks for your quick reply, Howard. The string with the path would be given as a function parameter by a user, something like

generaldataLoader("C:\xxx\200807")

I would like to have another function that is called by this generaldataLoader that converts this string into

"C:\xxx\\200807"

I do not want to interpret the "\200" escape sequence. Apologies, if my original question wasn't clear.

So, my question boils down to: how can I transform the numerical values of three variables, i,j,k, into a literal string, which equals

string test="\ijk"

 


KurtB wrote:

Maybe I'm misunderstanding, but can't you just replace the backslash with a colon in the path, as in:

LoadWave/A/G "C:temp:200808:subfolder:MyFile.txt"

This appears to work for me on Windows.

Thanks for your reaction. This would be a workaround. But I want to share these functions with team members, and it would be great if the function would work when given the path in the Windows file format.

You should write this:

generaldataLoader("C:\xxx\200807")

like this:

generaldataLoader("C:\\xxx\\200807")

Igor converts "\\" to "\" *only* when it converts a literal string to bytes in memory.

You can avoid the backslash issue altogether by using colons, like this:

generaldataLoader("C:xxx:200807")

Igor supports paths using colons on both Macintosh and Windows.

So, my question boils down to: how can I transform the numerical values of three variables, i,j,k, into a literal string, which equals

string test="\ijk"

Perhaps you are saying that you want to generate the "200807" part based on the values of variables. If so, you should use the sprintf function, something like this:

// Example: Demo(20,08,07) -> "C:xxx:200807"
Function Demo(a,b,c)
    int a,b,c
   
    String path
    sprintf path, "C:xxx:%02d%02d%02d", a, b, c
    Print path
End

 


hrodstein wrote:

Perhaps you are saying that you want to generate the "200807" part based on the values of variables. If so, you should use the sprintf function, something like this:

// Example: Demo(20,08,07) -> "C:xxx:200807"
Function Demo(a,b,c)
    int a,b,c
   
    String path
    sprintf path, "C:xxx:%02d%02d%02d", a, b, c
    Print path
End

Many thanks for your support. Unfortunately, that is not what I want.

Ultimately, I would like some code that would change the literal string containing the escape sequences into either a string containing only "\\" escape sequences or into a string where all backslashes are replaced by colons.

print replacestring("\200","dfds\200807fsd", "\\\\200")

and

print replacestring("\200","dfds\200807fsd", ":")

work perfectly from the command line. However, I want to make a loop to go through all possible 512 values to remove all possible escape sequences. So the loop would need to construct this literal string "\xxx" for all possible values, which I cannot seem to do.

The only way I can see to do this is by brute force:

 

Function/S cleanPathString(str)
    string str
    str=replacestring("\001",str,"\\\\001")
    str=replacestring("\002",str,"\\\\002")
    str=replacestring("\003",str,"\\\\003")
    str=replacestring("\004",str,"\\\\004")
    str=replacestring("\005",str,"\\\\005")

   ...

    return str
End

But there has got to be a more elegant way of doing this?

 

Strike that. Brute force method also doesn't work. Some of these combinations correspond to numbers and letters. So once the literal string has been interpreted by calling the function, there is no way of knowing where the original escape sequences were.

The ReplaceString commands that you use in your cleanPathString don't do what you want. "\001" is a literal string, so Igor parses it. In the parsing, it converts "\ddd" into a single byte. If you want to change "\001" to "\\001" (as stored in memory, not as entered in a command), you would need to pass "\\001" to ReplaceString as the string to search for, not "\001".

That said, I don't think replacing single backslashes with double backslashes in a string variable stored in memory will do any good.

I have discovered that a \ followed by three digits can form an octal number and is interpreted as an escape sequence. So, I want to write a function that replaces every \ with \\ when it is followed by three digits.

Rereading your original question, yes Igor interprets "\ddd" as an escape sequence but only when converting a literal string (a double-quoted string that you type into the command line or into a procedure window). It does not interpret "\ddd" once the string has been parsed and stored in a string variable.

So, if you are entering the path as a literal string, just use double-backslashes. If the string containing the backslashes is already stored in a string variable, you don't need to convert it because "\ddd" will no longer be treated as an escape sequence.
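A minimal sketch of this distinction, assuming it is compiled in a procedure window. The function name DemoLiteralEscapes is hypothetical, and the byte counts in the comments are what the parsing rules described above imply, not captured output:

```igor
// Hypothetical demo: escape sequences are resolved only when Igor parses
// a literal string, not when it handles a string already in memory.
Function DemoLiteralEscapes()
    // "\200" is parsed as ONE byte (octal 200 = 0x80), so the original
    // backslash and digits are gone before any function sees the string.
    String escaped = "dfds\200807"
    Print strlen(escaped)       // expected: 8 bytes (dfds + 0x80 + 807)

    // "\\" is parsed as ONE backslash character; "200807" stays as digits.
    String doubled = "dfds\\200807"
    Print strlen(doubled)       // expected: 11 bytes (dfds + \ + 200807)

    // A string assembled at run time is never re-parsed, so the backslash
    // followed by digits is left alone.
    String assembled = "dfds" + "\\" + "200807"
    Print cmpstr(doubled, assembled)    // expected: 0 (identical strings)
End
```

This is also why no function can repair "dfds\200807" after the fact: by the time the string reaches the function, the backslash is no longer in it.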

If the Windows path containing the backslashes is stored in a variable, you can convert it to a Macintosh path using colons like this:

// Example: Demo2("C:\\xxx\\200807")
Function Demo2(String windowsPath)
    // "HFS" is the Mac OS 9 Hierarchical File System
    String hfsPath = ParseFilePath(5, windowsPath, ":", 0, 0)
    Print hfsPath
End

When you execute the example, Igor parses the literal string that you type into the command line and converts double-backslashes to single backslashes. This creates a string parameter variable containing "C:\xxx\200807". That parameter string variable is passed to the Demo2 function. The ParseFilePath function converts "C:\xxx\200807" to "C:xxx:200807" which is accepted by all Igor commands.
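Building on Demo2, a loader that takes a Windows-style folder path might look roughly like the sketch below. The function name LoadFolderSketch, the symbolic path name sketchPath, and the ".txt" extension are all assumptions made for illustration, not part of the original procedure:

```igor
// Hypothetical sketch: accept a Windows-style folder path held in a string
// variable and list the .txt files it contains.
// Call it with double-backslashes in the literal, for example:
//   LoadFolderSketch("C:\\xxx\\200807")
Function LoadFolderSketch(String windowsPath)
    // Convert backslash separators to colons, as in Demo2
    String hfsPath = ParseFilePath(5, windowsPath, ":", 0, 0)

    // Create (or overwrite) a symbolic path pointing at the folder.
    // /Z suppresses the error if the folder is missing; V_flag reports it.
    NewPath/O/Q/Z sketchPath, hfsPath
    if (V_flag != 0)
        Print "Folder not found: " + hfsPath
        return -1
    endif

    // Semicolon-separated list of all .txt files in the folder
    String fileNames = IndexedFile(sketchPath, -1, ".txt")
    Print fileNames
    return 0
End
```

The conversion only helps if the backslashes survived parsing, so the caller still has to type double-backslashes (or colons) in the literal.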

My suggestion is to open a file explorer dialog for your users to navigate to the files to be opened.  Something like the following is what I often use when loading data files...

Function/S GetFileListToOpen()
    Variable nRefNum            // not used here; files are not opened
    String sFileList = ""
    String sMsg = "Select file(s) to open:"

    // /D shows the dialog without opening a file; /MULT=1 allows multiple selection
    Open/D/R/MULT=1/M=sMsg nRefNum

    // S_fileName holds the selected full paths, separated by CR.
    if( strlen(S_fileName) == 0)
        printf "\nUser Canceled file open."     // user canceled
        return ""
    else
        sFileList = S_fileName
    endif
    return sFileList
End

This function only returns a list of files without opening them.  The list can then be passed on to another function to do the loading.
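As a rough illustration of that second step, the list could be walked like this. The function name LoadSelectedFiles is hypothetical, it relies on GetFileListToOpen as defined above, and the LoadWave flags (/A/G, as used earlier in the thread) are an assumption about the file type:

```igor
// Hypothetical sketch: load every file chosen in the dialog.
Function LoadSelectedFiles()
    String fileList = GetFileListToOpen()   // CR-separated full paths, "" if canceled
    String path
    Variable i
    Variable numFiles = ItemsInList(fileList, "\r")

    for(i = 0; i < numFiles; i += 1)
        path = StringFromList(i, fileList, "\r")
        // The path comes from the dialog, not from a literal string,
        // so its backslashes need no doubling or conversion.
        LoadWave/A/G path
    endfor

    return numFiles
End
```

Because the user never types a path, the escape sequence problem does not arise with this approach.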