Error: Wave is in use by preemptive thread. Can't be resized or killed.

Hi

I am getting this error executing some code.  It runs fine normally, but after running for a while on many gigabytes of data (around 500 GB) I get this error: Wave is in use by preemptive thread. Can't be resized or killed.

The code is essentially for detecting corrupted data in frames of video data

The error occurs on the following line:

  make /O /N=(NumOut) FSSeq

The 4 arrays FSSeq, FESeq, LSSeq, and TSSeq are used in the multithreaded and threadsafe FrameData_AnalyzeFrame function

The starting point for execution is the FrameData_Analyze_Example function

Due to the amount of data required to trigger this error, it's a bit hard to give you any firm examples.  Each of the data files we process (SrcWave) is generally just under 4GB of 16 bit unsigned integers loaded from hdf5 files.  The SrcWave for each file processed is deleted, but the summary arrays (FrameData_AnalyzeSummary) created for each file are renamed and kept.

Any suggestions would be welcome

Thanks

Kent

#pragma rtGlobals=1     // Use modern global access method.
 
// Example Usage
function FrameData_Analyze_Example(SrcWave, [SkipAlign, KeepSrc, KeepRes])
  wave SrcWave             // Aligned or unaligned data can be 1d or 2D,  Must be 2D if SkipAlign is set
  variable SkipAlign       // If set skips the frame alignment process
  variable KeepSrc,KeepRes // Setting KeepSrc keeps the SrcWave, Setting KeepRes keeps the FrameData_AnalyzeResults wave
 
  If (ParamIsDefault(SkipAlign))
    SkipAlign = 0
  EndIf
 
  if (paramIsDefault(KeepSrc))
    KeepSrc=0
  endif
 
  if (paramIsDefault(KeepRes))
    KeepRes=0
  endif
 
  string refResultsName = "FrameData_ResultsRef"
  FrameData_MakeResults(refResultsName)
  wave refResults = $refResultsName
 
  // set values to -1 to not use that value to find bad frames
  // values greater than -1 consider a frame bad if the results dont match
  refResults[%FSCnt]       = 1
  refResults[%TSCnt]       = 1
  refResults[%LSCnt]       = 1024
  refResults[%FECnt]       = 1
  refResults[%FSInitIdx]   = 520
  refResults[%TSInitIdx]   = 0
  refResults[%FEInitIdx]   = 4203024
  refResults[%LSInitIdx]   = 528
  refResults[%LSFinIdx]    = 4198920
  refResults[%LSIdxDifMin] = 4104
  refResults[%LSIdxDifMax] = 4104
  refResults[%LSIdxDifAvg] = 4104
  refResults[%BadLine]     = 0
  refResults[%NumWord]     = -1
  refResults[%NumSamp]     = 4203032
  refResults[%NumDisp]     = 0
  refResults[%NumK]        = 16432
 
// Fill these in so the code can identify the framing characters
  variable FSCode = 0x5C5C
  variable FECode = 0x9C9C
  variable LSCode = 0x7C7C
  variable TSCode = 0xDCDC
  variable FBCode = TSCode  // Frame Begin char (First char of the Frame)
  variable NumOut = 8 // 8 outputs from this
 
  FrameData_Analyze(SrcWave,NumOut,FSCode,FECode,LSCode,TSCode,refResults,KeepSrc=KeepSrc,KeepRes=KeepRes)
 
end
// End of example Usage
 
 
// This analyzes the framing data of a 2D data set and characterizes it for each frame
// non -1 values in the RefResults array are then used to flag bad frames for inspection.
// Bad frames are summarized in the FrameData_AnalyzeSummary array
// This function outputs 2 arrays: FrameData_AnalyzeSummary and FrameData_AnalyzeResults
// FrameData_AnalyzeResults lists the analyzed results for every frame.  This wave is normally deleted but can be kept by setting KeepRes=1
// FrameData_AnalyzeSummary lists the RefResults and the analyzed results for any frame that does not match the set parameters in RefResults
Function FrameData_Analyze(SrcWave,NumOut,FSCode,FECode,LSCode,TSCode,RefResults,[KeepSrc,KeepRes,OutStatSize,OutStat32])
  wave SrcWave                          // Aligned 2D Source Wave
  variable NumOut                       // Number of FPA Outputs
  variable FSCode,FECode,LSCode,TSCode  // FrameStart Code, Frame End Code, Line Start code, Telem Start code
  wave RefResults                       // 1D FrameData_Results array used to ignore good frames in the summary results array
  variable KeepSrc,KeepRes              // Setting KeepSrc keeps the SrcWave, Setting KeepRes keeps the FrameData_Results wave
  variable OutStat32                    // Overrides the default OutStats Word size (1 = 32 bit, 0 = 16 bit) (Autocalculated)
  variable OutStatSize                  // Overrides the default OutStats size (4 for 32 bit, 8 for 16 bit)
 
  //Variable RunTime = datetime
  variable FrSize = DimSize(SrcWave,0)
  variable LineLen = RefResults[%LSIdxDifAvg]
 
  if (paramIsDefault(KeepSrc))
    KeepSrc=0
  endif
 
  if (paramIsDefault(KeepRes))
    KeepRes=0
  endif
 
  variable debugval = WaveType(SrcWave)
  if (paramIsDefault(OutStat32))
    if ( WaveType(SrcWave) == 0x60 )
      OutStat32 = 1
    elseif ( WaveType(SrcWave) == 0x50 )
      OutStat32 = 0
    else
      Print ("\rFrameData_Analyze Error: OutStat32 can't be deduced.  Please specify it with the optional OutStat32 parameter.\r")
      return 0
    endif
  endif
 
  if (paramIsDefault(OutStatSize))
    if ( OutStat32 == 1 )
      OutStatSize = 4
    else
      OutStatSize = 8
    endif
  endif
 
  if ( WaveDims(SrcWave) != 2 )
      Print ("\rFrameData_Analyze Error: SrcWave must be a 2D wave to be analyzed.\r")
      return 0
  endif
 
  make /O /N=(NumOut) FSSeq
  make /O /N=(NumOut) FESeq
  make /O /N=(NumOut) LSSeq
  make /O /N=(NumOut) TSSeq
  FSSeq = FSCode
  FESeq = FECode
  LSSeq = LSCode
  TSSeq = TSCode
 
  string resultsName = "FrameData_AnalyzeResults"
  FrameData_MakeResults(resultsName, Size=DimSize(SrcWave,1))
  wave results = $resultsName
 
  // Multithread Analyzing each frame of data
  Variable nthreads= ThreadProcessorCount
  Variable threadGroupID= ThreadGroupCreate(nthreads)
  variable ii, frameIdx
  variable numFrames = DimSize(SrcWave,1)
 
  for(frameIdx=0; frameIdx < numFrames;)
    for(ii=0; ii<nthreads; ii+=1)
      if (frameIdx < numFrames) // Data Maps
        ThreadStart threadGroupID,ii,FrameData_AnalyzeFrame(SrcWave,NumOut,FSSeq,FESeq,LSSeq,TSSeq,frameIdx,LineLen,OutStatSize,OutStat32,results)
      else
        break
      endif
      frameIdx++
    endfor
    variable threadGroupStatus = 0
    do
      threadGroupStatus = ThreadGroupWait(threadGroupID,100)
    while( threadGroupStatus != 0 )
  endfor
  Variable dummy= ThreadGroupRelease(threadGroupID)
 
  resultsName = "FrameData_AnalyzeSummary"
  FrameData_MakeResults(resultsName, Size=1)
  wave summary = $resultsName
 
  // Stuff the RefResults into the first index of the Summary array for reference
  summary[][0] = RefResults[P]
  summary[%FRNum][0] = NaN
  SetDimLabel 1, 0, RefResults, summary
  string dimLabel
 
  variable numBad = 0
  for(frameIdx=0; frameIdx < numFrames;frameidx++)
    for(ii=0; ii<DimSize(results,0); ii+=1)
      if((RefResults[ii] >= 0) && (RefResults[ii] != results[ii][frameIdx]))
        numBad++
        InsertPoints /M=1 numBad, 1, summary
        summary[][numBad] = results[P][frameidx]
        dimLabel = "BadFr_" + num2str(numBad)
        SetDimLabel 1, numBad, $dimLabel, summary
        break
      endif
    endfor
  endfor
 
  KillWaves FSSeq, FESeq, LSSeq, TSSeq
 
  if (KeepSrc == 0)
    KillWaves SrcWave
  endif
 
  if (KeepRes == 0)
    KillWaves results
  endif
 
  //Printf "RunTime: %d\r" , datetime-RunTime
  Printf "%d Bad Frames Found\r" , numBad
  return numBad
 
end
 
// This function gets called.  Not for direct use
threadsafe Function FrameData_AnalyzeFrame(SrcWave,NumOut,FSSeq,FESeq,LSSeq,TSSeq,FrameIdx,LineLen,OutStatSize,OutStat32,AnalyzeResults)
  wave SrcWave
  variable NumOut
  wave FSSEQ,FESEQ,LSSeq,TSSeq
  variable FrameIdx
  variable LineLen
  variable OutStatSize, OutStat32
  wave AnalyzeResults
 
  variable FrLen = DimSize(SrcWave,0)
  variable ii
  make /O /N=(NumOut) CmpWave
  variable FSCnt = 0
  variable TSCnt = 0
  variable LSCnt = 0
  variable FECnt = 0
  variable FSInitIdx = -1
  variable TSInitIdx = -1
  variable LSInitIdx = -1
  variable LSFinIdx  = -1
  variable LSIdxDifSum = 0
  variable LSIdxDifMax = 0
  variable LSIdxDifMin = 0xFFFFFFFF
  variable FEInitIdx = -1
  variable BadLine = 0
  variable LSDiff = 0
 
  for (ii=0;ii<FrLen;ii+=NumOut)
    CmpWave[] = SrcWave[ii+P][FrameIdx]
    If ((EqualWaves(CmpWave,FSSeq,1)) == 1)
      FSCnt++
      FSInitIdx = ((FSInitIdx < 0) ? ii : FSInitIdx)
    endif
    If ((EqualWaves(CmpWave,FESeq,1)) == 1)
      FECnt++
      FEInitIdx = ((FEInitIdx < 0) ? ii : FEInitIdx)
    endif
    If ((EqualWaves(CmpWave,LSSeq,1)) == 1)
      LSCnt++
      if (LSCnt == 1)
        LSInitIdx = ii
      else
        LSDiff = ii - LSFinIdx
        LSIdxDifSum += LSDiff
        LSIdxDifMin = (LSDiff < LSIdxDifMin) ? LSDiff : LSIdxDifMin
        LSIdxDifMax = (LSDiff > LSIdxDifMax) ? LSDiff : LSIdxDifMax
        BadLine = (((LineLen != LSDiff)) ? BadLine+1 : BadLine )
      endif
      LSFinIdx = ii
    endif
    If ((EqualWaves(CmpWave,TSSeq,1)) == 1)
      TSCnt++
      TSInitIdx = ((TSInitIdx < 0) ? ii : TSInitIdx)
    endif
  endfor
 
  AnalyzeResults[%FRNum][FrameIdx]       = FrameIdx
  AnalyzeResults[%FSCnt][FrameIdx]       = FSCnt
  AnalyzeResults[%TSCnt][FrameIdx]       = TSCnt
  AnalyzeResults[%LSCnt][FrameIdx]       = LSCnt
  AnalyzeResults[%FECnt][FrameIdx]       = FECnt
  AnalyzeResults[%FSInitIdx][FrameIdx]   = FSInitIdx
  AnalyzeResults[%TSInitIdx][FrameIdx]   = TSInitIdx
  AnalyzeResults[%FEInitIdx][FrameIdx]   = FEInitIdx
  AnalyzeResults[%LSInitIdx][FrameIdx]   = LSInitIdx
  AnalyzeResults[%LSFinIdx][FrameIdx]    = LSFinIdx
  AnalyzeResults[%LSIdxDifMin][FrameIdx] = LSIdxDifMin
  AnalyzeResults[%LSIdxDifMax][FrameIdx] = LSIdxDifMax
  AnalyzeResults[%LSIdxDifAvg][FrameIdx] = (LSCnt > 1) ? LSIdxDifSum/(LSCnt-1) : 0
  AnalyzeResults[%BadLine][FrameIdx]     = (LineLen > 0) ? BadLine : 0
 
  if (((OutStatSize > 4) && (OutStatSize != 0)) || ((OutStatSize > 8) && (OutStatSize == 0)))
    AnalyzeResults[%NumWord][FrameIdx] = 0
    AnalyzeResults[%NumSamp][FrameIdx] = 0
    AnalyzeResults[%NumDisp][FrameIdx] = 0
    AnalyzeResults[%NumK][FrameIdx]    = 0
    for (ii=0;ii<NumOut;ii+=1)
      if (OutStat32 == 0)
        AnalyzeResults[%NumWord][FrameIdx] += (SrcWave[FrLen-7*NumOut + ii][FrameIdx] << 16) + SrcWave[FrLen-8*NumOut + ii][FrameIdx] 
        AnalyzeResults[%NumSamp][FrameIdx] += (SrcWave[FrLen-5*NumOut + ii][FrameIdx] << 16) + SrcWave[FrLen-6*NumOut + ii][FrameIdx] 
        AnalyzeResults[%NumDisp][FrameIdx] += (SrcWave[FrLen-3*NumOut + ii][FrameIdx] << 16) + SrcWave[FrLen-4*NumOut + ii][FrameIdx] 
        AnalyzeResults[%NumK][FrameIdx]    += SrcWave[FrLen-2*NumOut + ii][FrameIdx]
      else
        AnalyzeResults[%NumWord][FrameIdx] += SrcWave[FrLen-4*NumOut + ii][FrameIdx]
        AnalyzeResults[%NumSamp][FrameIdx] += SrcWave[FrLen-3*NumOut + ii][FrameIdx]
        AnalyzeResults[%NumDisp][FrameIdx] += SrcWave[FrLen-2*NumOut + ii][FrameIdx]
        AnalyzeResults[%NumK][FrameIdx]    += (SrcWave[FrLen-1*NumOut + ii][FrameIdx] & 0xFFFF0000) >> 16  // bitwise & (not logical &&) to mask the upper 16 bits
      endif
    endfor
  endif
  
  killwaves CmpWave
  return 0
end
 
// Using this to create the Results/Ref waves allows them to be modified in the future and keep the code
// Backward compatible
function FrameData_MakeResults(Name, [Size])
  string Name       // The name of the array created
  variable Size     // If set the returned array will be 2D with Dim 1 set by size (Results array type) Default for 1D array (Ref array type)
 
  variable NumParams = 18
 
  if (paramIsDefault(Size))
    make /O /N=((NumParams)) $Name
  else
    make /O /N=((NumParams),Size) $Name
  endif
 
  wave FrameData_Results = $Name
  FrameData_Results = (paramIsDefault(Size)) ? -1 : NaN
 
  
  if (!paramIsDefault(Size))
    SetDimLabel 1, -1,  FrIdx,        FrameData_Results  // Dim 1 - Frame Index
  endif
 
  SetDimLabel 0,  0,  FrNum,        FrameData_Results  // Dim 0 Idx 0  - Frame Number Index
  SetDimLabel 0,  1,  FSCnt,        FrameData_Results  // Dim 0 Idx 1  - Number of Frame Starts detected
  SetDimLabel 0,  2,  TSCnt,        FrameData_Results  // Dim 0 Idx 2  - Number of Telem Starts detected
  SetDimLabel 0,  3,  FECnt,        FrameData_Results  // Dim 0 Idx 3  - Number of Frame Ends detected
  SetDimLabel 0,  4,  LSCnt,        FrameData_Results  // Dim 0 Idx 4  - Number of Line Starts detected
  SetDimLabel 0,  5,  FSInitIdx,    FrameData_Results  // Dim 0 Idx 5  - Offset from start of frame to the first Frame Start character
  SetDimLabel 0,  6,  TSInitIdx,    FrameData_Results  // Dim 0 Idx 6  - Offset from start of frame to the first Telem Start character
  SetDimLabel 0,  7,  FEInitIdx,    FrameData_Results  // Dim 0 Idx 7  - Offset from start of frame to the first Frame End character
  SetDimLabel 0,  8,  LSInitIdx,    FrameData_Results  // Dim 0 Idx 8  - Offset from start of frame to the first Line Start character
  SetDimLabel 0,  9,  LSFinIdx,     FrameData_Results  // Dim 0 Idx 9  - Offset from start of frame to the last Line Start character
  SetDimLabel 0,  10, LSIdxDifMin,  FrameData_Results  // Dim 0 Idx 10 - Minimum distance from one Line Start to the next Line Start
  SetDimLabel 0,  11, LSIdxDifMax,  FrameData_Results  // Dim 0 Idx 11 - Maximum distance from one Line Start to the next Line Start
  SetDimLabel 0,  12, LSIdxDifAvg,  FrameData_Results  // Dim 0 Idx 12 - Average distance from one Line Start to the next Line Start
  SetDimLabel 0,  13, BadLine,      FrameData_Results  // Dim 0 Idx 13 - Number of Lines where the distance to the last Line Start to the current Line Start is not the passed in value
  SetDimLabel 0,  14, NumWord,      FrameData_Results  // Dim 0 Idx 14 - Total number of words in the frame (All words - Idles + Data)
  SetDimLabel 0,  15, NumSamp,      FrameData_Results  // Dim 0 Idx 15 - Total number of data words in the frame (Data words - Non Idle words)
  SetDimLabel 0,  16, NumDisp,      FrameData_Results  // Dim 0 Idx 16 - Total number of words with disparity errors in the frame 
  SetDimLabel 0,  17, NumK,         FrameData_Results  // Dim 0 Idx 17 - Total number of K code words in the frame
 
end

 

Seems like the only way that error would happen is if one of your threads is still running. But that would seem to indicate that the loop calling `ThreadGroupWait()` exited without getting a zero return, which doesn't seem possible. Can you print out the value of the variable `dummy`? Is it ever non-zero?

The fact that this problem appears only after a lot of processing of huge waves suggests some sort of memory leak (perhaps). That might simply make Igor erratic. It's hard to test for such problems since it requires running with lots of huge waves.

Are you able to create a test of some sort? Can you manufacture dummy input that exercises the code in the same way as your real data? If you can write code to create the needed input for such a test, then it might be feasible to test it here at WaveMetrics.

This code generates the "Wave is in use by pre-emptive thread" error the second time I run it. It uses an Abort to abort the function before ThreadGroupRelease is called. I think any error could take the place of the abort but I would expect that the user would get the original error message before getting the "Wave is in use by pre-emptive thread" error.

ThreadSafe Function TestThreadFunc(WAVE SrcWave)
 
End
 
Function Test()
  Make /O /N=(10000,100) SrcWave
 
  // Multithread Analyzing each frame of data
  Variable nthreads= ThreadProcessorCount
  Variable threadGroupID= ThreadGroupCreate(nthreads)
  variable ii, frameIdx
  variable numFrames = DimSize(SrcWave,1)
 
  for(frameIdx=0; frameIdx < numFrames;)
    for(ii=0; ii<nthreads; ii+=1)
      if (frameIdx < numFrames) // Data Maps
        ThreadStart threadGroupID,ii,TestThreadFunc(SrcWave)
      else
        break
      endif
      frameIdx++
    endfor
    if (frameIdx == 80)
        Abort "Aborting"
    endif
    variable threadGroupStatus = 0
    Printf "Entering ThreadGroupWait Loop. frameIdx=%d\r", frameIdx
    do
     threadGroupStatus = ThreadGroupWait(threadGroupID,100)
    while(threadGroupStatus != 0)
    Print "Finished ThreadGroupWait Loop"
  endfor
  Variable dummy= ThreadGroupRelease(threadGroupID)
  
  Print "Finished Test"
End

Executing this kills all threads and allows the function to run again without throwing the "Wave is in use by pre-emptive thread" error:

Print ThreadGroupRelease(-2)    // Kill all threads

Doing "New Experiment" also kills all threads.
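
Since the stuck state comes from leaving the function before ThreadGroupRelease is called, one possible guard is to wrap the dispatch loop in try/catch so the release runs on both the normal and the aborted path. The following is only a sketch built on the Test() function above, not a confirmed fix for the original problem. TestWithCleanup is a hypothetical name, and the placement of AbortOnRTE is an assumption about where a runtime error would be worth catching.

```igor
ThreadSafe Function TestThreadFunc(WAVE SrcWave)
	return 0
End

// Hypothetical variant of Test() that always releases the thread group
Function TestWithCleanup()
	Make /O /N=(10000,100) SrcWave

	Variable nthreads = ThreadProcessorCount
	Variable threadGroupID = ThreadGroupCreate(nthreads)
	Variable ii, frameIdx, threadGroupStatus
	Variable numFrames = DimSize(SrcWave,1)

	try
		for(frameIdx=0; frameIdx < numFrames;)
			// Start up to nthreads workers, one frame each
			for(ii=0; ii<nthreads && frameIdx<numFrames; ii+=1)
				ThreadStart threadGroupID, ii, TestThreadFunc(SrcWave)
				frameIdx++
			endfor
			// Wait for the whole batch to finish
			do
				threadGroupStatus = ThreadGroupWait(threadGroupID,100)
			while(threadGroupStatus != 0)
			AbortOnRTE	// assumption: jump to catch on any runtime error
		endfor
	catch
		Variable err = GetRTError(1)	// read and clear any pending runtime error
		Printf "Aborted (V_AbortCode=%d, RTE=%d); releasing thread group\r", V_AbortCode, err
	endtry

	// Reached whether or not the try block was aborted
	Variable dummy = ThreadGroupRelease(threadGroupID)
	Printf "ThreadGroupRelease returned %d\r", dummy
	return dummy
End
```

Printing the return value of ThreadGroupRelease also answers the earlier question about whether `dummy` is ever non-zero.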

We will try ThreadGroupRelease(-2) and see if we can execute afterwards.  The error doesn't really happen until we have been processing continuously for 4-5 hours on beefy ~200 ThreadProcessorCount machines.  I will see if I can come up with a test case.  Generally, to recover from the error we have to save the experiment, exit Igor, restart Igor, then kill the XXSeq waves.  Next time it happens we will do a little bit of diagnosing to get more details.

Thanks for the insights

> The error doesn't really happen until we have been processing continuously for 4-5 hours on beefy ~200 ThreadProcessorCount machines. 

Is that on Windows? I'm asking because here I can only use up to 64 cores in IP.

 

Regarding your threadsafe functions, I would think you can make them quite a bit faster:

- Use FindDimLabel to translate dimension labels to numeric indices and prefer that in inner loops

- Precompute things like

FrLen-2*NumOut

- I would play around with translating

AnalyzeResults[%NumK][FrameIdx]    += SrcWave[FrLen-2*NumOut + ii][FrameIdx]

to something that uses implied loops (p, q, r, s), so that you don't need the inner for loop. Something like
 

Make/FREE/N=(numOut) junkWave
offset2 = FrLen-2*NumOut
junkWave[] = SrcWave[offset2 + p][FrameIdx]
AnalyzeResults[%NumK][FrameIdx] = Sum(junkWave)

could be a start. Or maybe also matrixop would help.
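
Putting the three suggestions together (FindDimLabel, a precomputed offset, and an implied loop) might look roughly like the sketch below. It covers only the 16 bit NumK case from the original loop. SumNumK and rowNumK are hypothetical names, and the free wave is made double precision as an assumption so the sum is not limited by single-precision storage.

```igor
// Hypothetical helper: sums the NumK words of one frame without an explicit loop
ThreadSafe Function SumNumK(WAVE SrcWave, WAVE AnalyzeResults, Variable NumOut, Variable FrameIdx)
	Variable FrLen = DimSize(SrcWave, 0)

	// Translate the dimension label to a numeric row index once
	Variable rowNumK = FindDimLabel(AnalyzeResults, 0, "NumK")
	if (rowNumK < 0)
		return -1	// label not found
	endif

	// Precompute the starting row instead of recalculating it per element
	Variable offset2 = FrLen - 2*NumOut

	// Copy the NumOut words with an implied loop over p, then sum them
	Make/FREE/D/N=(NumOut) junkWave
	junkWave[] = SrcWave[offset2 + p][FrameIdx]
	AnalyzeResults[rowNumK][FrameIdx] = Sum(junkWave)
	return 0
End
```

Whether this is measurably faster than the explicit loop would need timing on real data, since that loop only runs NumOut times per frame.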

Hi Thomas

Thanks for the input

It is running on Windows.  I will have to look into how many threads our higher core count machines were being given.  My understanding is that the number of threads caps out at 200, but I have not looked specifically at that in a while.  I do know that when they are processing, Task Manager shows the cores are 100% utilized.  But again, that is something I will check into.

As for the optimizations, by far the lion's share of the processing time is spent in the for loop early in the threadsafe function, just after the variable declarations, and in that loop I try to keep the janky dereferencing to a minimum.

For instance, in a single threadsafe execution the initial for loop may have a million or more iterations, while the for loop working on AnalyzeResults might run 4 or 8 times.

The threadsafe function itself is generally called less than 1000 times per data set

I have always wondered what, if any, optimizations the scripting language is able to accomplish, but I generally assume it's not able to do any and I arrange my code accordingly.

Prior to Windows 11, an application was limited to 64 logical processors unless it had code to specifically allow it to use more, which Igor does not have. Starting with Windows 11 (and Windows Server 2022) it is no longer necessary for an application to use special code to access more than 64 logical processors. See https://learn.microsoft.com/en-us/windows/win32/procthread/processor-gr… for the details.

We don't have access to a machine with more than 64 logical processors, so we have no idea how this works when running Igor on Windows 11.

Note that Igor does limit the number of threads to 100 in most situations, including the MultiThread keyword for wave assignment and ThreadGroupCreate. That limit was primarily to prevent the user from accidentally trying to use an inappropriate number of threads and probably should be raised in Igor Pro 10, currently in beta.

If you're willing to beta test IP10, please sign up at https://www.wavemetrics.com/form/igor-pro-10-beta-tester-signup. In the comments, please mention that you're interested in testing a version of Igor that has an increased value for MAXIMUM_NUMBER_OF_THREADS. I'll get in touch with you about testing.

Note that Igor's ThreadProcessorCount function will return the actual number of logical processors, not the number that Igor can use.
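
Because ThreadProcessorCount reports the number of logical processors rather than the number Igor can actually use, it may be worth capping the value before passing it to ThreadGroupCreate. This is a minimal sketch under that assumption. ChooseThreadCount and maxThreads are hypothetical names, and the right cap depends on the Igor version and operating system discussed above.

```igor
// Hypothetical helper: limit the thread count to a caller-supplied cap
Function ChooseThreadCount(Variable maxThreads)
	Variable nthreads = min(ThreadProcessorCount, maxThreads)
	return max(nthreads, 1)	// always ask for at least one thread
End
```

It could be used as `Variable nthreads = ChooseThreadCount(64)` in place of the bare ThreadProcessorCount, with 64 chosen here only as an example value.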

 

Regarding Igor doing optimizations, your assumption is correct. Igor's compiler doesn't do much to optimize your code.

We had enquired about this a few years ago and I had thought the number was 200, but I am clearly mistaken.  Next time I am with the machines that do that stuff I will do some investigation of both CPU usage and the answers I am getting in Igor code.  We had discussed upping the thread count with you guys while Igor 9 was being developed, and I guess I was assuming the limit was changed.  I'll sign up for the beta and see what I can do to help test.

The newest system we have is running a single Threadripper PRO 7995WX with 192 threads.  In the past it has been multi-Xeon CPU systems with similar or higher thread counts.  For reference, it takes about 3 seconds to run that routine on a 4 GB data file.

 

Thanks

Kent