Hierarchical clustering 1: Dissimilarity Matrix

The first step in hierarchical clustering is to create a dissimilarity matrix (wave: DisSimilarityMatrix).

The input to the algorithm is the 2D wave Cluster_Waves. Each column holds one trace (for example, a time trace from one repeat or experiment), and the rows are the points within that trace.

The function MakeDisSimilarityMatrix takes Cluster_Waves as input and computes the dissimilarity matrix.

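The function ComputeDissimilarity2waves computes the dissimilarity between two columns of Cluster_Waves: the square root of the summed squared differences, divided by the number of points. It is threadsafe, so MakeDisSimilarityMatrix can run it on multiple cores to speed up the computation.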
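
Written out, the value returned by ComputeDissimilarity2waves for columns i and j appears to be the following. The symbols are notation introduced here, not names from the code: x_{k,i} is the point in row k of column i, and N is the number of rows.

```latex
d(i,j) = \frac{1}{N}\sqrt{\sum_{k=0}^{N-1}\left(x_{k,i}-x_{k,j}\right)^{2}}
```

Because the division is by N rather than by the square root of N, this is a scaled Euclidean distance, not a root-mean-square difference.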

The helper function ImageTransformRowCol is based on Igor's ImageTransform getRow and getCol operations, but it also works on 1D waves.

Example:

//--- create a test wave with 10 columns (traces of 100 points each) forming 2 clusters; the first 5 columns belong to the first cluster

make/o/n=(100,10) testWave=gnoise(1)+(q>=5)
MakeDisSimilarityMatrix(testWave)//---compute the dissimilarity table
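
After running the example, the result can be inspected from the command line. The following is a minimal sketch that assumes DisSimilarityMatrix was created in the current data folder. The column indices are chosen only for illustration: columns 0 and 1 lie in the first cluster, and column 9 lies in the second.

```igor
print DisSimilarityMatrix[0][1]   // two columns from the same cluster
print DisSimilarityMatrix[0][9]   // two columns from different clusters
NewImage DisSimilarityMatrix      // show the whole matrix as an image
```

With this test data the second value is typically the larger one, although the added noise means this is not guaranteed for every pair. The diagonal stays NaN, because the matrix is initialized to NaN and only the off-diagonal entries are filled.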

 
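//---a helper function, similar to ImageTransform getRow/getCol, that also accepts 1D waves
//---pass a valid number in row to extract that row; pass row=NaN to extract column col instead
threadsafe Function/wave  ImageTransformRowCol(variable row,variable col,wave inputwave)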
	make/o/n=1 ExtractedWave
	if(numpnts(inputwave)>1)
		if(numtype(row)==0)
			if((dimsize(inputwave,0)<=1))
				duplicate/o inputwave,ExtractedWave
			elseif((dimsize(inputwave,0)>1)&&(dimsize(inputwave,1)>0))
				imagetransform/g=(row) getrow inputwave
				wave W_ExtractedRow
				duplicate/o W_ExtractedRow,ExtractedWave
			endif
		else
			if((dimsize(inputwave,1)<=1))
				duplicate/o inputwave,ExtractedWave
			elseif((dimsize(inputwave,1)>1)&&(dimsize(inputwave,1)>col))
				imagetransform/g=(col) getcol inputwave
				wave W_ExtractedCol
				duplicate/o W_ExtractedCol,ExtractedWave
			endif
		endif
	endif
	return ExtractedWave
end
//---actual clustering
//---compute the dissimilarity between columns pos1 and pos2 of Cluster_Waves
Threadsafe function ComputeDissimilarity2waves(wave Cluster_Waves,variable  pos1,variable  pos2)
	duplicate/o ImageTransformRowCol(nan,pos1,Cluster_Waves),waveI //--- row=nan selects column extraction
	duplicate/o ImageTransformRowCol(nan,pos2,Cluster_Waves),waveJ
	waveI=(waveI-waveJ)^2  //--- squared difference between the two waves
	return sqrt(sum(waveI))/numpnts(waveI) //--- Euclidean distance divided by the number of points
end

//---creates the dissimilarity matrix based on the selected experiments
//---we assume that Cluster_Waves is a 2D wave in which each column is one sample (for example, a recorded time trace) and the rows are the points of that sample

Function MakeDisSimilarityMatrix(wave Cluster_Waves)
	variable i,j,reg
	make/o/n=(dimsize(Cluster_Waves,1),dimsize(Cluster_Waves,1)) DisSimilarityMatrix=nan
	Variable n_,nthreads= ThreadProcessorCount
	Variable threadGroupID= ThreadGroupCreate(nthreads)		
	for(i=0;i<dimsize(DisSimilarityMatrix,0)-1;i++)
		j=i+1
		do
			for(n_=0;n_<nthreads;n_++)
				if(n_+j<dimsize(DisSimilarityMatrix,0))
					ThreadStart threadGroupID,n_,ComputeDissimilarity2waves(Cluster_Waves,i,n_+j)//---start one thread per column pair (i, n_+j)
				endif
			endfor
			do
				Variable threadGroupStatus = ThreadGroupWait(threadGroupID,1000)
			while( threadGroupStatus != 0 )//---wait until all threads have finished
			for(n_=0;n_<nthreads;n_++)
				if(n_+j<dimsize(DisSimilarityMatrix,0))
					DisSimilarityMatrix[i][n_+j]=ThreadReturnValue(threadGroupID,n_)
					DisSimilarityMatrix[n_+j][i]=DisSimilarityMatrix[i][n_+j]
				endif
			endfor							
			j+=n_			
		while(j<dimsize(DisSimilarityMatrix,0))
	endfor
	Variable dummy= ThreadGroupRelease(threadGroupID)	//---release only the thread group created above
end
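
//---A single-threaded reference version can be handy for checking the threaded result.
//---This is only a sketch: the function name MakeDisSimilarityMatrixSerial and the output
//---wave DisSimilarityMatrixSerial are hypothetical and not part of the original snippet.

```igor
Function MakeDisSimilarityMatrixSerial(wave Cluster_Waves)
	variable i,j
	variable nCols=dimsize(Cluster_Waves,1)
	make/o/n=(nCols,nCols) DisSimilarityMatrixSerial=nan
	for(i=0;i<nCols-1;i++)
		for(j=i+1;j<nCols;j++)
			//---same pairwise measure as the threaded version, computed one pair at a time
			DisSimilarityMatrixSerial[i][j]=ComputeDissimilarity2waves(Cluster_Waves,i,j)
			DisSimilarityMatrixSerial[j][i]=DisSimilarityMatrixSerial[i][j]	//---the matrix is symmetric
		endfor
	endfor
end
```

//---Usage: after running both functions on the same input wave, the two values printed by
//---print DisSimilarityMatrixSerial[0][1], DisSimilarityMatrix[0][1]
//---should agree if the threaded version behaves as intended.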

 

 

 
