Delimited Text

I am having an issue loading a comma delimited excel file (csv).

The data is formatted like this:
a,b,c,d,e
a,b,c,d,e
a,b,c,d,e
a,b,c,d,e,f,g,h,i,j,k,l,m,n
a,b,c,d,e
a,b,c,d,e

Where columns f-n are comments that only appear once in a while.
When I try to load this data using:
Data >> Load Waves >> Load Delimited Text
It does not see columns f-n, since they do not appear in the first row.

Any ideas how to find these columns?
There is no easy way to load this format.

Probably the easiest (though not easy) way to do it is to create a modfied version of the file with commas added so all of the lines have the same number of columns, like this:

a,b,c,d,e,,,,,,,,,
a,b,c,d,e,f,g,h,i,j,k,l,m,n

If you want help with this, send a zipped copy of a sample file as an attachment to support@wavemetrics.com. Mention your Igor version and OS in the body of the email and any other instructions.


hrodstein wrote:
There is no easy way to load this format.

Probably the easiest (though not easy) way to do it is to create a modfied version of the file with commas added so all of the lines have the same number of columns, like this:


Rather than modifying each file, this type of format might be easier to load ...

a) as a text file into a notebook, with subsequent line-by-line parsing into text waves or
b) as a string, with subsequent string manipulations

You would have to write the functions needed to do either operation. The /r (carriage return) and , (comma) characters would be the key searches for parsing.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
I figured out an ad-hoc solution to the problem, I guess it will work for now.
I have another question however.
One of the columns has text before the value:

voltage=6
voltage=7
voltage=4
voltage=3
.....

What is the function in Igor to strip off a certain number of characters?
Or since the text before the value is always a certain length, could I index the string as string[7:end] or something to the effect?
Will this do what I was describing above?
The string looks like this:
APD Voltage= #####

String expr="([[:alpha:]]+) ([[:alpha:]]+)=([[:digit:]]+)"
String APDtext, APDvalue
Variable voltage
SplitString/E=(expr) 'APD Voltage', APDtext, APDvalue
voltage = Str2Num(APDvalue)
I bad (actually awful) at regular expressions. Here are two other solutions, assuming that I have understood the question:

Function Test()
    String APDText = "voltage=6"
   
    Variable val
   
    // Solution #1
    String tmp
    tmp = APDText[8,100]
    val = str2num(tmp)
    Print val
   
    // Solution #2
    sscanf APDText, "voltage=%g", val
    Print val
End


Cory K wrote:
Will this do what I was describing above?
The string looks like this:
APD Voltage= ##### ....


Function GetV(theStr)
        string theStr

    string regEx = "=[ ]*(.*)", vStr
    SplitString/E=(regEx) theStr, vStr
    return str2num(vStr)   
end

print GetV("ADP Voltage = 1.234")
1.234


--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
Thats exactly what I wanted, thanks guys.

For my initial problem, where Igor wouldnt recognize any columns that werent in row 0, I have a quick question.
Can I write text to the file, then read the new version of the file?

For example:

If the original file was:
a,b,c,d,e
a,b,c,d,e
a,b,c,d,e
a,b,c,d,e,f,g,h,i,j,k,l,m,n
a,b,c,d,e
a,b,c,d,e

Can I write "f,g,h,i,j,k,l,m,n" as text appended to the end of the first row?
Then I think I can get around this issue.
OK, instead of asking little bits and pieces of the issue, I have a better idea.
Can I post what I have so far of my code, then it may be easier to see what my main goal is.

What this eventually supposed to do is:
- Load a csv file
- Delete all the columns I dont need
- For the column that only has data every once in a while (wave20), fill in the points that are empty.
To do this, if the space is blank, I just use the most recent non-blank value. When I get to a non-blank index, I use that value until I reach the next non-blank index and so forth
- Lastly I rename all the relevant waves and add them to a table

I will attach a random data file if that will make it easier to see what I am talking about.
(*Note: I cannot attach a .csv file in IgorExchange, so I resaved it as tab delimeted text)

#pragma rtGlobals=1     // Use modern global access method.
Menu "Macros"
    "Load TTI Data", LoadTTI()
End

Function GetV(theStr)
       string theStr
    string regEx = "=[ ]*(.*)", vStr
    SplitString/E=(regEx) theStr, vStr
    return str2num(vStr)   
end
 
Function LoadTTI()
    variable x, length, APDvalue
    String APDtext, wave20
    LoadWave/J/D/A/E=0/K=0/V={","," $",0,0};
    KillWaves wave3,wave7,wave8,wave9,wave10,wave11,wave12,wave13,wave14,wave15;
    KillWaves wave16,wave17,wave18,wave19,wave21,wave22,wave23,wave24,wave25;
    Wave APDwave = $wave20
    length = strlen(wave20[0]) - 1
    APDvalue = GetV(APDwave[0]
    For (x=0;x<(numpnts(wave0)-1);x+=1)
        if (APDwave[x] != NaN)
            APDtext = num2str(APDwave[x])
            APDvalue = GetV(APDtext)                   
            APDwave[x] =  APDvalue
        else
            APDwave[x] = APDvalue
        endif
    Endfor
    Edit wave0, wave2, wave5, wave4, wave6, wave20, wave1;
    Rename wave0, 'Date Points'; DelayUpdate
    Rename wave1, 'Time (sec)'; DelayUpdate
    Rename wave2, 'Lifetime (µsec)'; DelayUpdate
    Rename wave4, 'Phase deg'; DelayUpdate
    Rename wave5, 'Temperature (°C)'; DelayUpdate
    Rename wave6, 'Signal Amplitude'; DelayUpdate
    Rename wave20, 'APD Voltage';
    DeletePoints 0,2, 'Date Points','Time (sec)','Lifetime (µsec)','Phase deg'; DelayUpdate
    DeletePoints 0,2, 'Temperature (°C)','Signal Amplitude','APD Voltage';
End
Cory K wrote:
Thats exactly what I wanted, thanks guys.

For my initial problem, where Igor wouldnt recognize any columns that werent in row 0, I have a quick question.
Can I write text to the file, then read the new version of the file?

For example:

If the original file was:
a,b,c,d,e
a,b,c,d,e
a,b,c,d,e
a,b,c,d,e,f,g,h,i,j,k,l,m,n
a,b,c,d,e
a,b,c,d,e

Can I write "f,g,h,i,j,k,l,m,n" as text appended to the end of the first row?
Then I think I can get around this issue.


No, that is 1) difficult, and 2) generally not a good idea to go about modifying input data.

There's gotta be a better way.

Software Engineer, WaveMetrics, Inc.
JimProuty wrote:
Cory K wrote:
Thats exactly what I wanted, thanks guys.
...
Then I think I can get around this issue.


No, that is 1) difficult, and 2) generally not a good idea to go about modifying input data.

There's gotta be a better way.


It seems to me, using LoadWave with the /B flag is the way to go for this convoluted file structure.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
Cory K wrote:

(*Note: I cannot attach a .csv file in IgorExchange, so I resaved it as tab delimeted text)

Thanks for pointing that out. I've added .csv to the list of allowed extensions.
Also /L Flag could be used. I modified your LoadTTI() function, using both, /B flag and /L flag.
For details, the LoadWave's help explain very well how to use them.

Both of function's work whit .txt file. For .csv file you should modify the LoadWave's /V Flag.

Function GetV(theStr)
       string theStr
    string regEx = "=[ ]*(.*)", vStr
    SplitString/E=(regEx) theStr, vStr
    return str2num(vStr)   
end
 
Function LoadTTI_L()            //using /L Flag
   

    LoadWave/O/Q/A/J/D/W/K=0/V={"\t"," $",0,0}/L={0,0,0,0,22}//load the firts 22 waves, skipping all waves after column 22. Column 22 is APD_Voltage
    Wave/T APD_Voltage                          //declaration of text wave to convert in numeric wave
    Make/O/N=(numPnts(APD_Voltage)) APD_Wave        //create the new wave
   
    Variable i, APD_Value, Last_APD                 //Fill the new wave whit proper values
    For(i=0;i<=NumPnts(APD_Wave);i+=1)              //
        APD_Value = GetV(APD_Voltage[i])            //
        If(NumType(APD_Value) == 0)                 // APD_Voltage is not emty
            Last_APD = APD_Value                    //
        EndIf                                       //
        APD_Wave[i] = Last_APD                      //
    EndFor                                          //
   
    KillWaves APD_Voltage                          
   
    Wave Date_Points, Time_Sec                      //Declaration of waves to edit
    Edit Date_Points, Time_Sec
   
End


Function LoadTTI_B()            //using /B Flag

    String columnInfoStr = ""
    columnInfoStr += "C=1,F=8,T=4,W=16,N=Date_Points;"              //description for first 9 column
    columnInfoStr += "C=1,F=0,T=4,W=16,N=Time_sec;"
    columnInfoStr += "C=1,F=0,T=4,W=16,N=Lifetime_usec;"
    columnInfoStr += "C=1,F=0,T=4,W=16,N=Frequency_kHz;"
    columnInfoStr += "C=1,F=0,T=4,W=16,N=Phase_deg;"
    columnInfoStr += "C=1,F=0,T=4,W=20,N=Temperature_C;"
    columnInfoStr += "C=1,F=0,T=4,W=16,N=Signal_Amplitude;"
    columnInfoStr += "C=1,F=0,T=2,W=16,N=Ambient_Pressure_Torr;"
    columnInfoStr += "C=1,F=0,T=2,W=16,N=Aux_Data;"
   
    columnInfoStr += "C=12,N='_skip_';"     //skip 12 column
   
    columnInfoStr += "C=1,F=-2,T=4,W=16,N=APD_Voltage;"     //description for text wave
   
    columnInfoStr += "C=5,N='_skip_';"          //skip last 5 columns
   
    LoadWave/B=columnInfoStr/O/Q/A/J/V={"\t"," $",0,0}      //load the waves, skipping all waves after column 22. Column 22 is APD_Voltage
   
    Wave/T APD_Voltage                              //declaration of text wave to convert in numeric wave
    Make/O/N=(numPnts(APD_Voltage)) APD_Wave        //create the new wave
   
    Variable i, APD_Value, Last_APD                 //Fill the new wave whit proper values
    For(i=0;i<=NumPnts(APD_Wave);i+=1)              //
        APD_Value = GetV(APD_Voltage[i])            //
        If(NumType(APD_Value) == 0)                 // APD_Voltage is not emty
            Last_APD = APD_Value                    //
        EndIf                                       //
        APD_Wave[i] = Last_APD                      //
    EndFor                                          //
   
    KillWaves APD_Voltage                          
   
    Wave Date_Points, Time_Sec                      //Declaration of waves to edit
    Edit Date_Points, Time_Sec
   
End
I notice in one of your code segments:

if (APDwave[x] != NaN)


Unfortunately, this does not work as you would expect. A comparison of any number to NaN will always return 0 (false). To test if a number is NaN you must use the numtype function. So your code would be written like this:

if (numtype(APDwave[x]) != 2)  // Not NaN?


I modified the code based on the above advice.
For some reason, Igor is only finding wave0-wave9.
There should be 26 waves, which I then sort through later in the macro.

Can someone please take a look at the attached macro and see if they can spot any errors?
I also attached a random data file you can use to check the macro with.

Thanks again for all your help.
Cory K wrote:
For some reason, Igor is only finding wave0-wave9.
There should be 26 waves, which I then sort through later in the macro.


The LoadWave /B format only loads 9 columns, not all 26.

--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH
jjweimer wrote:

The LoadWave /B format only loads 9 columns, not all 26.
--
J. J. Weimer
Chemistry / Chemical & Materials Engineering, UAH


Correct, I only want to load 9 waves. However, I do not want the FIRST 9 waves,
I am trying to parse only specific columns.

I have found a way around this, but it isnt automated, so its not a great solution.
If I go into the CSV file, and insert 15 commas after the last column heading, everything loads correctly and without issue.
Is there a way to programmatically do this?
I thought of another solution if there is a way to do this in Igor.
Can you specify what row to start loading data from?
If so, I know that by default, row 2 will always have all 26 rows, so everything should go smoothly.
Is there a flag for that?
Cory K wrote:
I thought of another solution if there is a way to do this in Igor.
Can you specify what row to start loading data from?
If so, I know that by default, row 2 will always have all 26 rows, so everything should go smoothly.
Is there a flag for that?


I answered my own question.
I needed to add:
 /L={0,2,0,0,26}
to the LoadWave function.
This did the trick.

Thanks again for your help everyone.