
Clustering fMRI data with a robust unsupervised learning algorithm for neuroscience data mining


Contents lists available at ScienceDirect

Journal of Neuroscience Methods

journal homepage: www.elsevier.com/locate/jneumeth


Hadeel K. Aljobouri a,b,∗, Hussain A. Jaber a, Orhan M. Koçak c, Oktay Algin d,e, Ilyas Çankaya a

a Electrical and Electronics Engineering Department, Graduate School of Natural Science, Ankara Yıldırım Beyazıt University, Ankara, Turkey

b Biomedical Engineering Department, College of Engineering, Al-Nahrain University, Baghdad, Iraq

c Psychiatry Department, School of Medicine, Kırıkkale University, Kırıkkale, Turkey

d Department of Radiology, Ataturk Training and Research Hospital, Ankara Yıldırım Beyazıt University, Ankara, Turkey

e National MR Research Center, Bilkent University, Ankara, Turkey

Highlights

• A novel application of the robust unsupervised learning approach is proposed in the current study. The robust growing neural gas (RGNG) algorithm, which has not been used for this purpose or any other medical application, was applied to fMRI data and compared with the growing neural gas (GNG) algorithm.

• The learning algorithms proposed in the current study are fed with real, freely available auditory fMRI datasets.

• Another comparison was conducted with the model-based (hypothesis) data analysis approach using the statistical parametric mapping (SPM) package, which is based on the general linear model.

• The fMRI result obtained by running RGNG was within the expected outcome and is similar to those found with the hypothesis method in detecting active areas within the expected auditory cortices.

• Results show that the fMRI application of the presented RGNG approach is clearly superior to other approaches in terms of its insensitivity to different initializations and the presence of outliers, as well as its ability to determine the actual number of clusters successfully, as indicated by its performance measured by minimum description length (MDL) and receiver operating characteristic (ROC) analysis.

Article info

Article history:
Received 17 December 2017
Received in revised form 13 February 2018
Accepted 14 February 2018
Available online 20 February 2018

Keywords:
Clustering technique
Data mining
Growing neural gas (GNG)
Robust growing neural gas (RGNG)

Abstract

Background: Clustering approaches used in functional magnetic resonance imaging (fMRI) research use brain activity to divide the brain into various parcels with some degree of homogeneous characteristics, but choosing the appropriate clustering algorithms remains a problem.

New method: A novel application of the robust unsupervised learning approach is proposed in the current study. The robust growing neural gas (RGNG) algorithm, which has not been used for this purpose or any other medical application, was applied to fMRI data and compared with the growing neural gas (GNG) algorithm. The learning algorithms proposed in the current study are fed with real, freely available auditory fMRI datasets.

Results: The fMRI result obtained by running RGNG was within the expected outcome and is similar to those found with the hypothesis method in detecting active areas within the expected auditory cortices.

Comparison with existing method(s): The fMRI application of the presented RGNG approach is clearly superior to other approaches in terms of its insensitivity to different initializations and the presence of outliers, as well as its ability to determine the actual number of clusters successfully, as indicated by its performance measured by minimum description length (MDL) and receiver operating characteristic (ROC) analysis.

Conclusions: The RGNG can detect the active zones in the brain, analyze brain function, and determine the optimal number of underlying clusters in fMRI datasets. This algorithm can define the positions of the center of an output cluster corresponding to the minimal MDL value.

© 2018 Elsevier B.V. All rights reserved.

∗ Corresponding author at: Biomedical Engineering Department, College of Engineering, Al-Nahrain University, Baghdad, Iraq.

E-mail address: hadeelbme77@eng.nahrainuniv.edu.iq (H.K. Aljobouri).

https://doi.org/10.1016/j.jneumeth.2018.02.007
0165-0270/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

Functional magnetic resonance imaging (fMRI) is a powerful tool used by neuroscientists to examine brain activity by calculating the levels of oxygen in the blood. The blood oxygenation level dependent (BOLD) signal represents the ratio of oxygenated to deoxygenated hemoglobin measurements in the blood and is closely related to neural activity. fMRI considers metabolic function in measuring neural activity because it determines the hemodynamic response function (HRF) or metabolic demands (oxygen consumption) in the brain or spinal cord (Aljobouri et al., 2015).

fMRI is used to understand neuronal mechanisms behind many disorders, such as bipolar disorder, schizophrenia, Parkinson's disease, autism spectrum disorders, and Alzheimer's disease.

The fMRI dataset is acquired from a scanner machine in the form of raw data as sequences of 3D images because of the variations of voxel intensities over time. With different experimental conditions, the acquired fMRI data are formed as a combination of BOLD signal changes and noises or artifacts. These artifacts are attributed to hardware systems (the MRI scanner itself), individuals themselves (e.g., head motion), or physiological effects.

Clustering techniques in fMRI research are considered model-free or exploratory data analysis approaches. These techniques can define the active zones and find structures in the brain and fMRI data competently without the need for prior knowledge about activation patterns or experiments. However, choosing the appropriate clustering algorithms remains a problem. Independent component analysis (ICA) and principal component analysis (PCA) algorithms are regarded as fine methods to separate fMRI signals into a group of defined components. These algorithms cannot easily predict occurrences during acquisition and have limitations in terms of independence and orthogonality, respectively (Korczak, 2012). Various clustering algorithms are applied in fMRI for data mining instead of the previous classical methods, which cannot easily predict occurrences during acquisition. The classical methods include K-means, fuzzy classification, hierarchical classifications, Linde–Buzo–Gray (LBG), clustering using representatives (CURE), the neural models Kohonen's self-organizing map (SOM), neural gas (NG), and Fritzke's growing neural gas (GNG) algorithms. However, one of the main problems of fMRI clustering algorithms is deciding the number of clusters as an input (Dimitriadou et al., 2004; Wismuller et al., 2004). Results with a high level of interpretation were obtained using clustering approaches, but these approaches are associated with high cost in terms of computing time and memory space (Bock and Diday, 2000; Lindquist, 2008; Goutte et al., 1999; Baumgartner et al., 1998; Liao et al., 2008; Katwal, 2011; Pereira et al., 2009).

The GNG algorithm exhibits the best clustering performance and robustness; however, this algorithm has limitations associated with its sensitivity to initialization (choosing a set of neuron vectors), the order of input vectors, and the existence of many outliers (Qin and Suganthan, 2004). Therefore, a novel application, which relies on using the robust growing neural gas (RGNG) algorithm with fMRI datasets, is proposed to detect the active zones in the brain. The RGNG algorithm, which has not yet been used for this purpose, was compared with the GNG algorithm.

Unlike other clustering approaches, RGNG is proposed to identify activated regions in the brain in various fMRI datasets with different and important features. Several robustness properties are associated with the RGNG network: it is insensitive to initialization, input sequence ordering, and outliers; it determines the optimal number of underlying clusters during different growth stages; and it deals with multimodal datasets effectively.

The approach of using RGNG with an fMRI dataset is the first attempt in the literature. The current study is organized as follows: Section 2 provides the most important packages used with fMRI data analysis in comparison with the clustering and especially the proposed RGNG approach. Section 3 describes the proposed work and algorithms using simple flowcharts and tables. Section 4 describes the preprocessing and performance measures. Section 5 presents the experimental output results. Finally, Section 6 concludes the paper and introduces future research directions.

2. fMRI data analysis techniques

fMRI data analysis methods can be divided mainly into two categories, namely, model-driven or model-based (hypothesis) and data-driven or model-free (exploratory) approaches. The model-driven methods deal with definite activation patterns, response functions, or experiments. These models require previous knowledge and statistically test the analyzed data on the presence or absence of a response. The methods related to this category differ either by statistical method or signal estimation procedure in performing the activation. An example is the commonly used general linear model (GLM), which is the most fundamental and basic approach used for fMRI data analysis with statistical parametric mapping (SPM) (SPM, 1991).

Data-driven methods, in contrast, have the ability to count all of the voxels simultaneously, define the active zones and find structures in the brain and fMRI data competently without previous knowledge about activation patterns or experimental paradigms. These methods can be divided mainly into two groups, namely, blind source separation (BSS) and clustering approach.

BSS attempts to find unobserved signals or "sources" from several observed mixtures and generate a model of the data. Various methods are used for BSS: PCA (Friston et al., 1993; Friston et al., 1996), ICA (Hyvarinen et al., 2001; McKeown et al., 1998; Calhoun et al., 2001; Mckeown, 2000), and canonical correlation analysis (CCA) (Friman et al., 2002) methods are used to separate these mixtures to obtain source signals. The FMRIB Software Library (FSL) package (The Analysis Group, 2012) uses MELODIC ICA, which is a data-driven (model-free) approach, but is insufficient for most fMRI datasets because ICA has some limitations. ICA attempts to find maximally independent maps and split the wide activation areas into a number of maps, which have a strong correlation between time courses (TCs) of different components. The independent components (ICs) from ICA decomposition are unordered, that is, this feature is associated with the model order selection for linear model-based region extraction, which remains an open problem. Thus, determining whether or not ICs are correlated with nonlinear activation is difficult.

Clustering analysis (Chen et al., 2006; Seghier et al., 2007) groups voxels whose TC signals show a similar hemodynamic response (HDR) over time. The advantages of the proposed RGNG clustering algorithm will be explained. This approach is mainly a data-driven or model-free (exploratory) approach. Table 1 compares statistical, transformation and clustering methods.

Clustering analysis is widely used for fMRI data processing to detect the brain active area effectively. In the following paragraph, the data mining idea is identified based on the GNG network.

Lachiche et al. (2005) introduced a new interactive data mining approach to fMRI images, which has not been used for the purpose of the current study, and showed that GNG successfully recognized the active areas in the fMRI images of the brain (Lachiche et al., 2005). The idea of defining a distance between voxels of fMRI images was argued, and this distance is proposed to be based on the signal only.

Korczak (2007) introduced a new interactive data mining technique to fMRI images to observe cerebral activity; this technique is based on a data-driven approach (Korczak, 2007). Different unsupervised clustering algorithms were presented, developed, and tested on sequences of fMRI images. Five clustering algorithms (GNG, SOM, LBG, K-means, and CURE) were applied to synthetic […] performance of the GNG algorithm was the best among all other clustering methods, with acceptable robustness.

Table 1
Comparison among statistical, transformation and clustering methods.

| Based approach | fMRI analysis methods | Approach properties |
|---|---|---|
| Statistical method | Model-driven / model-based / hypothesis. Used with the SPM package and based on GLM. | The most fundamental, basic and commonly used approach for fMRI data analysis, but it needs previous knowledge about activation patterns or experiments. |
| Transformation method | Data-driven / model-free / exploratory. Used with the FSL package and based on MELODIC IC analysis. | It is based on linear mixing and is unordered. Thus, it must deal with independent data. |
| Clustering method | Data-driven / model-free / exploratory. RGNG is an example, used in this study. | It can define active zones and identify structures in the brain and fMRI data competently without the need for previous knowledge about activation patterns or experiments. |

Heydar et al. (2009) developed a version of the GNG network algorithm that can determine the optimal number of clusters automatically (Heydar et al., 2009). The experiments used artificial and real fMRI datasets with the proposed algorithm, which is an improved version of the GNG algorithm. They compared the Jaccard coefficient of the proposed algorithm with some well-known clustering algorithms, such as K-means, NG, GNG, and fuzzy C-means (FCM); the results showed that the proposed algorithm outperformed the other algorithms.

The GNG originates from the NG algorithm by Fritzke (Fritzke, 1995; Fritzke, 1997), and the RGNG algorithm was introduced by Qin and Suganthan (2004) within the GNG structure. The work presented in this paper using RGNG with fMRI data will be the first attempt in the literature. The current study presented how to feed RGNG with real and free auditory fMRI datasets.

GNG and RGNG, both artificial neural network approaches based on unsupervised clustering for fMRI analysis, are compared in Table 2. This table presents the researchers who introduced these approaches, the researchers who used these approaches in fMRI research, and the advantages and limitations of each approach (AlJobouri et al., 2017).

3. Methodology and proposed work

The GNG algorithm is reviewed before introducing the proposed RGNG algorithm for feeding with fMRI data. The GNG and RGNG algorithms are extensive and complex. Thus, flowcharts and a mathematical model were developed for convenience and easier writing of the related codes.

3.1. GNG algorithm

The GNG algorithm was developed by Fritzke (1995, 1997); he proposed changing the unit numbers (mostly increased) in a SOM network with a variable topological structure. The GNG is a growing soft competitive learning algorithm, which combines the topology formation rules of the competitive Hebbian learning (Martinetz and Schulten, 1991) with the growing cell structures (Fritzke, 1994) into a new model.

Before feeding the GNG algorithm, the following parameters must be defined:

In the subsequent experiments, the parameter settings are fixed for each algorithm, with typical values proposed in the literature. The GNG algorithm was set with typical values as in (Fritzke, 1997): $\varepsilon_b = 0.05$, $\varepsilon_n = 0.006$, $\alpha_{max} = 100$, $\beta = 0.0005$, and $\lambda = 300$.

Fig. 1 presents the flowchart of the GNG algorithm and shows that the inactive neurons that do not win during a long time interval may be detected through the GNG algorithm by tracing the changes of an age variable associated with each edge. The proposed flowchart can be summarized in the following steps:

The GNG starts with a minimal network size, and a small number of new neurons and connections are inserted into a growing structure using vector quantization until the desired quality of the model is achieved (e.g., net size, time limit, predefined numbers of neurons inserted, or some performance measure).
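The growth scheme described above can be illustrated with a compact, standard-library Python sketch of a Fritzke-style GNG. It is not the authors' implementation: the function and variable names are hypothetical, the stopping rule (a node budget plus a fixed number of epochs) is one of the options listed above, and the error-reduction factor `alpha = 0.5` and the reading of 300 as the insertion interval `lam` are assumptions. The remaining defaults follow the values quoted in Section 3.1.

```python
import random

def gng(data, max_nodes=10, epochs=10, eps_b=0.05, eps_n=0.006,
        age_max=100, beta=0.0005, lam=300, alpha=0.5, seed=0):
    """Minimal growing neural gas; returns (nodes, edges)."""
    rng = random.Random(seed)
    # Start with two prototypes drawn from the data.
    nodes = {0: list(rng.choice(data)), 1: list(rng.choice(data))}
    error = {0: 0.0, 1: 0.0}
    edges = {}  # frozenset({i, j}) -> age
    next_id, step = 2, 0

    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    def pull(i, x, rate):
        nodes[i] = [w + rate * (xi - w) for w, xi in zip(nodes[i], x)]

    for _ in range(epochs):
        for x in rng.sample(data, len(data)):
            step += 1
            # Winner s1 and runner-up s2 for the current input.
            s1, s2 = sorted(nodes, key=lambda i: sqdist(x, nodes[i]))[:2]
            error[s1] += sqdist(x, nodes[s1])
            # Age the winner's edges and pull its neighbours towards x.
            for e in edges:
                if s1 in e:
                    edges[e] += 1
                    (neighbour,) = e - {s1}
                    pull(neighbour, x, eps_n)
            pull(s1, x, eps_b)
            # Competitive Hebbian learning: (re)create a fresh s1-s2 edge.
            edges[frozenset((s1, s2))] = 0
            # Remove edges that are too old, then nodes left without edges.
            edges = {e: a for e, a in edges.items() if a <= age_max}
            connected = set().union(*edges)
            for i in [i for i in nodes if i not in connected]:
                del nodes[i]
                del error[i]
            # Every lam inputs, insert a node between the node q with the
            # largest error and its worst neighbour f.
            if step % lam == 0 and len(nodes) < max_nodes:
                q = max(error, key=error.get)
                nbrs = [next(iter(e - {q})) for e in edges if q in e]
                f = max(nbrs, key=error.get)
                r, next_id = next_id, next_id + 1
                nodes[r] = [(a + b) / 2 for a, b in zip(nodes[q], nodes[f])]
                del edges[frozenset((q, f))]
                edges[frozenset((q, r))] = 0
                edges[frozenset((f, r))] = 0
                error[q] *= alpha
                error[f] *= alpha
                error[r] = error[q]
            # Decay all accumulated errors.
            for i in error:
                error[i] *= 1 - beta
    return nodes, edges

# Toy usage: two well-separated 2-D blobs (synthetic, for illustration only).
_rng = random.Random(1)
points = ([[_rng.gauss(0, 0.1), _rng.gauss(0, 0.1)] for _ in range(200)]
          + [[_rng.gauss(5, 0.1), _rng.gauss(5, 0.1)] for _ in range(200)])
prototypes, links = gng(points, max_nodes=6, epochs=10)
```

For fMRI, each element of `data` would be one voxel's time course rather than a 2-D point. The node budget is what makes the number of clusters an input here, which is the limitation the RGNG addresses with a validity index.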

3.2. Robust growing neural gas (RGNG) algorithm

The "dead node" problem occurs in the GNG algorithm because of its growth scheme. Dead nodes arise from inappropriate initializations, which cause some prototypes to never win during the training process. Even with initialization-insensitive clustering methods, good clustering results may not be obtained if the order of the input sequence is not chosen properly.

Aside from problems related to the sensitivity to initialization and the ordering of input vectors, other problems arise from the presence and position of outliers. When various outliers exist in a dataset, the GNG network may fail to differentiate the outliers from the inliers through the original prototype updating rule.

Because of these limitations of the GNG algorithm, a novel RGNG was presented within the GNG structure (Qin and Suganthan, 2004). The RGNG improves robustness toward initialization, input vector ordering, and the existence and position of various outliers, and it can find the optimal number of neurons dynamically during runtime.

Fig. 2 presents the flowchart of the RGNG algorithm. The proposed flowchart can be summarized in the following steps:

Before feeding the RGNG algorithm, the following parameters must be defined:

Table 2
Comparison between two artificial neural network approaches based on unsupervised clustering for fMRI analysis.

| Methods | GNG | RGNG |
|---|---|---|
| Introduced by | Fritzke (1995) | Qin and Suganthan (2004) |
| Used with fMRI | Lachiche et al. (2005); Korczak (2007); Heydar et al. (2009) | It was not previously proposed |
| Advantages | Its ability to modify the network topology by removing edges with its age variable. The neighborhood sorting step is unnecessary. It can find a network size and structure automatically, continue learning, and add units and connections until a performance criterion is fulfilled. The number of classes is not fixed in advance as in most clustering algorithms. | Insensitive to initialization, input sequence ordering, and the presence of outliers during different growth stages. Can automatically determine the optimal number of clusters. Deals with multimodal datasets effectively. |
| Limitations | Its sensitivity for: initialization; the order of input vectors; existence of many outliers. | |

Fig. 2. Flowchart design of the RGNG algorithm.

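In the subsequent experiments, the parameter settings were fixed for each algorithm, with typical values proposed in the literature. The RGNG algorithm was set with typical values as in [12]: $\varepsilon_{bi} = 0.1$, $\varepsilon_{bf} = 0.01$, $\varepsilon_{ni} = 0.005$, $\varepsilon_{nf} = 0.0005$, $\alpha_{max} = 100$, $k = 1.3$, and $\eta = 1 \times 10^{-4}$.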

For each reference vector $w_i$, $i = 1, 2, \ldots, c$, a series of edges emanates from its location to join it with its direct topological neighbors. Similar to the GNG, the RGNG algorithm starts in step 1 with the initialization of a few prototype vectors (usually two), $W = \{w_1, w_2\}$.

In fMRI, $W$ represents the TC of the fMRI dataset (see Fig. 3), $w_i$ denotes the TC of exemplar $i$, and $w_c$ is the TC of the closest exemplar $c$. Prototype vectors $w_1$, $w_2$ are randomly chosen with reference vectors from the TC of all voxels $P(x)$, and a data voxel $x$ is generated as an input signal from the fMRI dataset used for training, $X = \{x_1, x_2, \ldots, x_N\}$.

The maximum number of neurons to grow is defined as prenumnode, and the maximum predefined training epoch is defined as Max iter during each growth stage with a certain prototype number. The initial or current training epoch number is set as $m = 0$. The iteration point within training epoch (task period) $m$ is set as $t = 0$. Thus, the full iteration step iter over each growth step is expressed as:

$$\mathrm{iter} = m \cdot N + t, \tag{1}$$

where $N$ is the length of the fMRI TC.

Clustering algorithms attempt to classify the TC signals of the voxels into different groups according to the similarity among the groups. The temporal information is ordered in clusters and is independent of its spatial neighborhood. These clusters are described by an average TC or a cluster center obtained by averaging all of the TCs of the cluster. The fMRI data are transformed into a TC of voxel intensity variations proportional to its average, as follows:

$$I_{av}^{x} = \frac{1}{N} \sum_{i=1}^{N} I_{i}^{x}, \tag{2}$$

where $I_{av}^{x}$ is the average intensity of the voxel over a series of $N$ images;

$$W_{i} = I_{av}^{x} - I_{i}^{x}. \tag{3}$$

The distance between two fMRI signals $W_a$ and $W_b$ may be computed as a Euclidean distance:

$$d_E = \sqrt{\sum_{i=1}^{N} \left(W_{ai} - W_{bi}\right)^{2}}. \tag{4}$$

The activity level of the dataset is generally based on the distance between the input vector $x$ and all of the exemplar TCs $W_i$. In RGNG, the smallest Euclidean distance $\lVert x - w_i \rVert$ defines the best matching node.
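Eqs. (2) to (4) and the best-matching-node rule can be sketched in a few lines of standard-library Python. This is an illustrative sketch rather than the authors' code: the function names and the toy intensity values are assumptions introduced here.

```python
import math

def mean_centered_tc(intensities):
    """Return W_i = I_av - I_i for one voxel's intensity series (Eqs. 2 and 3)."""
    average = sum(intensities) / len(intensities)       # Eq. (2)
    return [average - value for value in intensities]   # Eq. (3)

def euclidean_distance(tc_a, tc_b):
    """Euclidean distance between two equal-length time courses (Eq. 4)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(tc_a, tc_b)))

def best_matching_node(x, prototypes):
    """Index of the prototype with the smallest distance to input x."""
    distances = [euclidean_distance(x, w) for w in prototypes]
    return min(range(len(prototypes)), key=distances.__getitem__)

# Toy example: three voxels, four time points each (made-up intensities).
voxels = [
    [100.0, 104.0, 100.0, 104.0],
    [200.0, 208.0, 200.0, 208.0],
    [50.0, 50.0, 50.0, 50.0],
]
time_courses = [mean_centered_tc(v) for v in voxels]
prototypes = [[2.0, -2.0, 2.0, -2.0], [0.0, 0.0, 0.0, 0.0]]
winners = [best_matching_node(tc, prototypes) for tc in time_courses]
```

Subtracting the voxel's own average removes the baseline intensity, so voxels with very different mean brightness but the same temporal pattern end up close to the same prototype.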

The RGNG algorithm uses the MDL value as the clustering validity index: the optimal number of clusters and their center positions are those corresponding to the smallest MDL value. Thus, the optimal number of clusters is determined automatically by searching for the minimum value of the MDL measure through the network-growing process. The RGNG approach has the smallest MDL value recorded with respect to the GNG combined with the MDL principle; thus, the approach can find the optimal number of clusters and their center positions corresponding to the smallest MDL value.

Fig. 3. Proposed data mining system architecture. (a) Main block diagram. (b) Experimental paradigm "silence" and "talk".
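The selection rule itself is simple once a validity index has been recorded at every growth stage. The sketch below assumes the MDL values are already available in a dictionary keyed by the number of prototypes; the MDL formula is not reproduced here, and both the function name and the numbers are hypothetical placeholders rather than results from this study.

```python
def optimal_cluster_count(index_by_count):
    """Return the cluster count whose validity index (here MDL) is smallest."""
    return min(index_by_count, key=index_by_count.get)

# Placeholder MDL values per growth stage (illustrative only, not measured).
mdl_trace = {2: 1510.0, 3: 1320.5, 4: 1188.2, 5: 1204.7, 6: 1251.9}
best_count = optimal_cluster_count(mdl_trace)
```

In a full implementation the prototype positions would be stored alongside each MDL value, so that the centers belonging to `best_count` can be returned together with it.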

4. Simulation design

The block diagram shows the process of brain function data analysis, which is performed in the current study. The process is composed of five stages (see Fig. 3B):

• preprocessing of the raw data;

• clustering voxels together based on the similarity of their intensity profile in the TCs of the image;

• overlay with the structural image;

• visual fMRI image;

• validation.

4.1. Image spatial preprocessing

The experiments in the present work were performed in MATLAB 2016a, with the SPM12 package used for the preprocessing stage. Various noise factors interfere with the fMRI signals of interest. The subject is typically never completely motionless. Thus, the preprocessing steps must be adapted to each identified artifact before the clustering phase. The present work used SPM for the auditory fMRI data spatial preprocessing stages. The fMRI dataset is preprocessed by applying the following steps:

• Realignment;
• Coregistration;
• Segmentation;
• Normalize;
• Smoothing using FWHM = 6 mm.

The functional images were reoriented to MNI space, which is a standard brain formed by using a large series of MRI scans on normal controls developed at the Montreal Neurological Institute. Then the functional raw data were realigned to correct for head movements. The high-resolution anatomical T1 images were coregistered with the realigned functional images to enable anatomical localization of the activations. The segmentation process is not mandatory. SPM12 uses MNI template images, which are the most common templates used for fMRI spatial normalization. In this step, the anatomical and functional images were spatially normalized into MNI space. Finally, the functional raw data were spatially smoothed with a Gaussian smoothing kernel of 6 mm FWHM.
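A smoothing kernel specified by its full width at half maximum (FWHM) corresponds to a Gaussian standard deviation of FWHM / (2√(2 ln 2)). The short sketch below performs this conversion; it assumes the kernel width of 6 is expressed in millimetres and uses the 3 mm isotropic acquisition voxel size reported for the dataset to express the result in voxels. The names are hypothetical, and the voxel size after normalization may differ.

```python
import math

def fwhm_to_sigma(fwhm):
    """Convert a Gaussian kernel's FWHM to its standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))

fwhm_mm = 6.0         # kernel width from the text (assumed to be in mm)
voxel_size_mm = 3.0   # acquisition voxel size (assumption for this example)
sigma_mm = fwhm_to_sigma(fwhm_mm)
sigma_voxels = sigma_mm / voxel_size_mm
```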

4.2. fMRI dataset

Quantitative performance assessment uses an auditory fMRI dataset. The auditory data are composed of whole-brain BOLD/EPI images acquired on a modified 2 T Siemens MAGNETOM Vision system (John et al., 2013). Each acquisition consists of 64 contiguous slices (64 × 64 × 64, 3 × 3 × 3 mm voxels). Acquisition took 6.05 s, with a scan-to-scan repetition time (TR) set arbitrarily to 7 s. A total of 96 acquisitions were made from a single subject in blocks of 6 scans (all acquired during the same condition, stimulation or rest), yielding 16 blocks of 42 s each.

The experimental paradigm for successive blocks alternated between rest and auditory stimulation, starting with rest (see Fig. 3B). The functional data start at acquisition 4, functional image (fM4). Auditory stimulation was composed of bisyllabic words (e.g., "mother," "house," "weather," and "movie") presented binaurally at a rate of 60/min. The first few scans must be discarded ("dummy" lead did not exist in scans) because of T1 effects.
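The block structure described above can be written down as a boxcar vector, which is the reference signal a model-based analysis would use and which a clustering result can be compared against afterwards. The sketch below is a minimal illustration using the figures from the text (96 scans, 6 scans per block, TR of 7 s, starting with rest); the names are hypothetical, and the discarding of the first scans is left out.

```python
TR_SECONDS = 7
SCANS_PER_BLOCK = 6
N_SCANS = 96

def boxcar(n_scans, scans_per_block, first_condition=0):
    """Per-scan condition labels: 0 = rest, 1 = auditory stimulation."""
    return [(first_condition + scan // scans_per_block) % 2
            for scan in range(n_scans)]

paradigm = boxcar(N_SCANS, SCANS_PER_BLOCK)
n_blocks = N_SCANS // SCANS_PER_BLOCK
block_seconds = SCANS_PER_BLOCK * TR_SECONDS
```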

4.3. Performance measure

A novel application of the RGNG algorithm was compared with GNG and SPM using two performance measures, namely, the MDL value and receiver operating characteristic (ROC) analysis.

In the RGNG algorithm, the MDL value, one of the well-known information theory evaluation measures, has been used as the clustering validity index (Rissanen, 1983). The average MDL values during the growth stages have been plotted versus the length […] approaches combined with the MDL criterion; the length of the fMRI TC is selected randomly as N = 16. Each detected cluster number corresponded to the MDL value. The RGNG approach has the smallest MDL value recorded with respect to GNG combined with the MDL principle; thus, it can successfully determine the actual number of clusters.

Fig. 4. MDL values versus N.

Fig. 5. ROC curves analyses of the auditory fMRI dataset.

The ROC analysis is another index of the performance of RGNG in comparison with SPM (Skudlarski et al., 1999). The ROC is well known in medical imaging and machine learning applications; the ROC space consists of the false positive ratio (FPR) on the x-axis and the true positive ratio (TPR) on the y-axis (Sun and Xu, 2014). The good classifier space is indicated by a high TPR and a low FPR, whereas the bad classifier space is indicated by a low TPR and a high FPR.

The curves in Fig. 5 generally indicate that the two methods work as good classifiers with a high TPR and a low FPR. The RGNG method can detect real activations under the same FPR.

In fMRI, the FPR is calculated by dividing the number of misclassified inactivated voxels by the total number of voxels considered, whereas the TPR is calculated by dividing the number of correct classifications of activated voxels by the total number of voxels considered (Lange et al., 1999). In the same situation, the ROC curves for the RGNG and SPM methods are compared, as shown in Fig. 5.

Fig. 6. Active areas in the brain auditory cortex area within the SPM package.
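An ROC curve of this kind can be traced by sweeping a threshold over a per-voxel activation score and recording one (FPR, TPR) point per threshold. The sketch below uses the conventional normalization, TPR = TP / (TP + FN) and FPR = FP / (FP + TN), which may differ from the normalization described in the text. The scores and ground-truth labels are made-up toy values and the function names are hypothetical.

```python
def roc_point(scores, truth, threshold):
    """Return (FPR, TPR) when voxels with score >= threshold are called active."""
    tp = fp = tn = fn = 0
    for score, active in zip(scores, truth):
        detected = score >= threshold
        if detected and active:
            tp += 1
        elif detected:
            fp += 1
        elif active:
            fn += 1
        else:
            tn += 1
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr

def roc_curve(scores, truth):
    """ROC points from the strictest to the most permissive threshold."""
    thresholds = sorted(set(scores), reverse=True)
    return [(0.0, 0.0)] + [roc_point(scores, truth, t) for t in thresholds]

def area_under_curve(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy data: six voxels with made-up scores; 1 marks a truly active voxel.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
truth = [1, 1, 0, 1, 0, 0]
curve = roc_curve(scores, truth)
auc = area_under_curve(curve)
```

A curve that rises quickly toward the upper-left corner, and therefore has an area close to 1, corresponds to the "good classifier" region described above.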

5. fMRI results

The principles behind the prototype-based clustering algorithms were introduced in this work. The validity of the performance of the RGNG was analyzed and verified with fMRI experiments. fMRI analysis involves known areas and functions of the brain. Thus, the common and expected results must be used in the experiments. One of these areas is the auditory cortex. Real auditory fMRI data, which are freely available for education and evaluation purposes, were used in the experiments [http://www.fil.ion.ucl.ac.uk/spm/data/auditory/]. These data were utilized by previous works (Lachiche et al., 2005; Korczak, 2007; Heydar et al., 2009).

One of the decisive advantages of fMRI is that fMRI studies do not require the analysis of a group of volunteers, but can produce valuable results at the level of single individuals. The analysis of single volunteers is crucial in analyzing small structures, which exhibit strong interindividual variation (Campain and Minckler, 1976; Francesco et al., 2003), similar to the auditory cortex, as shown in Fig. 6.

5.1. Comparing auditory data running RGNG with that of GNG

A block design experiment was conducted using an auditory stimulus. Figs. 7 and 8A and B show the active areas in the auditory cortex of the entire brain when running the GNG and RGNG algorithms. Although auditory cortex regions were found by both the GNG and RGNG algorithms, in GNG other areas outside this cortex are also activated. In RGNG, these areas are reduced or nearly absent under the same experiment and the same auditory stimulus of the whole brain.

Fig. 7 shows the clusters in a transparent or glass brain image, which is a more flexible approach that specifies a real RGB (red-green-blue) color value for every voxel in the image. Fig. 8A and B show the alignment of the obtained clusters into a structural space of the brain when running the GNG and RGNG, respectively. With regard to the output results obtained by running the three unsupervised clustering algorithms, spatial information is visualized as fine clusters in the auditory cortex area. The GNG algorithm was used in […] 2009). The ROI obtained within the auditory cortex when running the GNG algorithm (Fig. 8A) is similar to that obtained by the same approach in the literature. In general, a cluster corresponds to a group of voxels with a similar HDR over a TC.

Fig. 7. Clusters in a transparent brain image when running the (a) GNG and (b) RGNG clustering techniques.

Fig. 8. Clusters overlaid onto the anatomical image when running the (a) GNG, (b) RGNG and (c) SPM.

The block design experiment was conducted by running the proposed RGNG approach using auditory data. The activation shown in Fig. 8B is located in the temporal lobe. The spatial information shows that the areas of activation obtained are similar to those expected from the auditory cortex experiments, which are detected as variations of voxel intensity over time. In the current study, the data obtained by the RGNG approach were separated according to the TC signals of voxel intensity variations relative to their average. Similar to all clustering algorithms, the RGNG attempted to partition homogeneous areas of activation in the brain that were comparable to those areas located using other approaches and found in the recognized cortices related to the experiment. These areas or clusters are described by an average TC or a cluster center obtained by averaging all of the TCs of the cluster.
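Describing a cluster by its average TC amounts to a per-time-point mean over the voxels assigned to it. The sketch below illustrates this with standard-library Python and also computes each voxel's Euclidean distance to its own cluster center. The labels would normally come from the clustering step; here the labels, the time courses, and the function names are made-up assumptions.

```python
import math

def cluster_centers(time_courses, labels):
    """Average the time courses that share a label into one center per cluster."""
    sums, counts = {}, {}
    for tc, label in zip(time_courses, labels):
        acc = sums.setdefault(label, [0.0] * len(tc))
        for i, value in enumerate(tc):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc]
            for label, acc in sums.items()}

def distance_to_center(tc, center):
    """Euclidean distance between one time course and a cluster center."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(tc, center)))

# Toy data: three voxel time courses and their (made-up) cluster labels.
tcs = [[1.0, -1.0, 1.0, -1.0], [3.0, -3.0, 3.0, -3.0], [0.0, 0.0, 0.0, 0.0]]
labels = [0, 0, 1]
centers = cluster_centers(tcs, labels)
distances = [distance_to_center(tc, centers[lab]) for tc, lab in zip(tcs, labels)]
```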

The clustering results of the novel RGNG application can be regarded as better than those of the GNG approach because the clusters delineate the specific auditory cortex area. Moreover, the fMRI output results obtained by running the RGNG were the same as the outcome obtained by running the SPM using the same dataset and the same paradigm, as will be discussed in the next subsection.

5.2. Comparing auditory data running RGNG with that of SPM

The paradigm of the block design experiment alternates two conditions, namely, without the stimulus and with auditory stimuli, which consist of repetitions of two-syllable words, such as "mother," "house," "weather," and "movie". Fig. 8B shows the ability of the RGNG clustering technique to identify winner nodes, determine the optimal number of underlying clusters, and produce a TC for activation detection in an auditory dataset. Fig. 8C shows the area of activation in the auditory cortex of the whole brain when running SPM with F-contrast test results with a family-wise error (FWE) threshold, with no masking, and an FWE-corrected p value = 0.05.

The results of SPM based on GLM, which uses the paradigm as a reference signal, introduced bias in the experiment. By contrast, the RGNG approach did not use the paradigm as the reference signal because it works as a model-free method. In summary, the RGNG results were within the expected outputs and are similar to those found with the hypothesis method in detecting active areas within the expected auditory cortices. The RGNG signal changes over a TC in auditory fMRI datasets can be calculated by labeling the pixels of the same cluster (membership TC) or by plotting the distance of the TCs to a given cluster center (distance TC).

Novel and extensive simulation studies on real fMRI datasets were conducted using the RGNG unsupervised clustering algorithm. A potential problem associated with the GLM model is the requirement of an accurate estimate of the fMRI paradigm design. In many cases, it is difficult to provide precise model designs: subjects may perform the task incorrectly, the same subject may give a different response to the same paradigm at a different time, and different subjects may give different BOLD signals during the same paradigm. The result in Fig. 8B shows that this method can complement the model-based method to cope with the difficulties and challenges in fMRI data analysis. The findings can improve the recognition of the nature of the fMRI data and the underlying mechanisms.

6. Conclusions

The major objective of this study is to detect and classify the activated areas of the brain using a robust and efficient algorithm. This type of study has not yet been conducted, and the current study, which uses RGNG with fMRI, is the first attempt to do so.

In conclusion, the RGNG can detect the active zones in the brain, analyze brain function, and determine the optimal number of underlying clusters in fMRI datasets. This algorithm can define the positions of the center of an output cluster corresponding to the minimal MDL value. The validity of the performance of the RGNG algorithm was tested using real auditory fMRI data, which are based […]

Some difficulties arise when using the conventional clustering algorithms. For example, the number of clusters must be defined in advance, and the cluster detection problem has different dimensions within the same dataset. The RGNG merges the GNG structure with robust properties and uses MDL to address the problems of optimal network representations and parameters, which made the RGNG insensitive to the initializations, input sequence ordering, and outliers, and more robust toward noisy input data. During the network-growing process, the RGNG can effectively determine the optimal number of clusters and their corresponding positions, which are closer to the actual cluster centers (with the smallest MDL value) with minimal influence from the outliers.

The experimental output results showed the superior performance of the RGNG over model-based approaches and one of the prototype-based clustering algorithms on real fMRI datasets, as revealed by their performance measured by MDL and ROC analysis. This work proposed novel and powerful methods for fMRI data analysis, which integrate the advantages of the hypothesis and exploratory analysis methods.

Two types of fMRI analysis methods were compared, namely, GLM and data-driven analyses using machine learning classifiers. The GLM is the most common method for fMRI data analysis but is based heavily on an a priori BOLD model design. In some cases, the GLM cannot be used for brain activation detection because previous information about the data is unavailable. Examples are research involving mental subjects or daydreaming and mind-wandering (the default mode of brain function) (Yongnan, 2010). Thus, effective alternative approaches using data-driven analysis were introduced to detect brain activity based on the data structure. The proposed application of RGNG on a real fMRI dataset was examined on single-subject auditory fMRI data. This method can also be extended to multi-subject data-driven analysis of fMRI datasets. The RGNG approach may be preferable for multiple-subject studies rather than analyses of single-subject data such as the auditory data used here.

Theparadigmoftheauditorydatasetusedintheexperiment

wasablock-typedatadesign.Forfuture,thisworkcanbeextended

towardexperimentswithevent-relateddatadesign.

The RGNG deals well with fMRI data, which are composed of multimodal datasets. The approach can therefore be applied to other real multimodal datasets, such as MRI image segmentation in the brain and other regions of the body. Clusters of different organ shapes in the body can be detected using other distance metrics, because the Euclidean distance metric used with the RGNG detects the clusters of the brain, which is an approximately spherical or ellipsoidal region with minimal differences in the variance in each dimension (Frigui and Krishnapuram, 1999).
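
The following sketch illustrates why the choice of distance metric may matter for clusters that are not approximately spherical. It compares the Euclidean distance with a variance-scaled distance (the Mahalanobis distance for a diagonal covariance matrix) when assigning a point to one of two cluster centers. The Mahalanobis distance is only one possible alternative chosen here for illustration; the text does not name a specific metric, and all centers, variances, and function names are hypothetical.

```python
import math


def euclidean(point, center):
    """Plain Euclidean distance between a point and a cluster center."""
    return math.sqrt(sum((p - c) ** 2 for p, c in zip(point, center)))


def scaled_distance(point, center, variances):
    """Mahalanobis distance for a diagonal covariance matrix (per-axis variances)."""
    return math.sqrt(
        sum((p - c) ** 2 / v for p, c, v in zip(point, center, variances))
    )


# Hypothetical clusters: A is elongated along the first axis, B is compact.
clusters = {
    "A": {"center": (0.0, 0.0), "variances": (9.0, 0.25)},
    "B": {"center": (6.0, 2.0), "variances": (0.25, 0.25)},
}
point = (5.0, 0.0)  # lies on the long axis of cluster A

nearest_euclidean = min(
    clusters, key=lambda name: euclidean(point, clusters[name]["center"])
)
nearest_scaled = min(
    clusters,
    key=lambda name: scaled_distance(
        point, clusters[name]["center"], clusters[name]["variances"]
    ),
)
print(nearest_euclidean, nearest_scaled)
```

In this constructed case the Euclidean metric assigns the point to the compact cluster B, whereas the variance-scaled metric assigns it to the elongated cluster A, whose spread along the first axis makes the point a plausible member.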

In future studies, cluster validity measures other than the MDL criterion can be used with the RGNG. Minimum message length, the Bayesian information criterion, and Akaike’s information criterion can be applied in place of the common MDL validity index used in this work. The findings from this work can help address the various difficulties that neurologists and psychologists encounter during analysis and improve the interpretation of fMRI data.
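
As a minimal sketch of how such alternative criteria could be used to select the number of clusters, the code below evaluates Akaike's information criterion, AIC = 2k - 2 ln L, and the Bayesian information criterion, BIC = k ln n - 2 ln L, for several candidate models and picks the minimum of each. The log-likelihoods, parameter counts, and sample size are hypothetical placeholders, not results from this study, and the sketch does not reproduce the MDL formulation used with the RGNG.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


# Hypothetical fits: number of clusters -> (log-likelihood, number of free parameters).
n_samples = 500
candidates = {
    2: (-1250.0, 5),
    3: (-1100.0, 8),
    4: (-1095.0, 11),
    5: (-1093.0, 14),
}

aic_scores = {k: aic(ll, p) for k, (ll, p) in candidates.items()}
bic_scores = {k: bic(ll, p, n_samples) for k, (ll, p) in candidates.items()}

# The preferred model is the one with the smallest criterion value.
best_aic = min(aic_scores, key=aic_scores.get)
best_bic = min(bic_scores, key=bic_scores.get)
print(best_aic, best_bic)
```

With these illustrative numbers the two criteria disagree: the BIC penalty grows with the sample size and therefore favors the smaller model, which is one reason a comparison of validity indices could be informative.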

Acknowledgment

The authors would like to thank the reviewers whose comments greatly improved the quality of the manuscript.

References

AlJobouri, H.K., Jaber, H.A., Çankaya, I., 2017. Performance evaluation of prototype-based clustering algorithms combined MDL index. Comput. Appl. Eng. Educ. 25 (4), 642–654 (Wiley Inc.).

Aljobouri, H.K., Çankaya, I., Karal, O., 2015. From biomedical signal processing techniques to fMRI parcellation. Biosci. Biotechnol. Res. Asia 12, 1115–1138.

Baumgartner, R., Windischberger, C., Moser, E., 1998. Quantification in functional magnetic resonance imaging: fuzzy clustering vs correlation analysis. Magn. Reson. Imaging 16, 115–125.

Bock, H.H., Diday, E., 2000. Analysis of Symbolic Data, Exploratory Methods for Extracting Statistical Information from Complex Data. Studies in Classification, Data Analysis and Knowledge Organization. Springer-Verlag.

Calhoun, V., Adali, T., Pearlson, G., Pekar, J., 2001. Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related waveforms. Hum. Brain Mapp. 13, 43–53.

Campain, R., Minckler, J., 1976. A note on the gross configurations of the human auditory cortex. Brain Lang. 3, 318–323.

Chen, H., Yuan, H., Yao, D., Chen, L., Chen, W., 2006. An integrated neighborhood correlation and hierarchical clustering approach of functional MRI. IEEE Trans. Biomed. Eng. 53, 452–458.

Dimitriadou, E., Barth, M., Windischberger, C., Hornika, K., Moser, E., 2004. A quantitative comparison of functional cluster analysis. Artif. Intell. Med. 31, 57–71.

Francesco, D.S., Fabrizio, E., Tommaso, S., Elia, F., Elio, M., Claudio, S., Sossio, C., Raffaele, E., Klaus, S., Erich, S., 2003. fMRI of the auditory system: understanding the neural basis of auditory gestalt. Magn. Reson. Imaging 21, 1213–1224.

Frigui, H., Krishnapuram, R., 1999. A robust competitive clustering algorithm with applications in computer vision. IEEE Trans. Patt. Anal. Mach. Intell. 21, 450–465.

Friman, O., Borga, M., Lundberg, P., Knutsson, H., 2002. Exploratory fMRI analysis by autocorrelation maximization. Neuroimage 16, 454–464.

Friston, K.J., Frith, C.D., Liddle, P.F., Frackowiak, R.S., 1993. Functional connectivity: the principal component analysis of large PET data sets. J. Cereb. Blood Flow Metab. 13, 5–14.

Friston, K.J., Poline, J.B., Strother, S., Holmes, A.P., Frith, C.D., Frackowiak, R.S., 1996. A multivariate analysis of PET activation studies. Hum. Brain Mapp. 4, 140–151.

Fritzke, B., 1994. Growing cell structures: a self-organizing network for unsupervised and supervised learning. Neural Netw. 7, 1441–1460.

Fritzke, B., 1995. A Growing Neural Gas Network Learns Topologies, Advances in Neural Information Processing Systems 7. MIT Press, Cambridge, pp. 625–632.

Fritzke, B., 1997. Some Competitive Learning Methods (draft), Technical Report. Institute for Neural Computation, Ruhr-University, Bochum.

Goutte, C., Toft, P., Rostrup, E., Nielsen, E.F., Hansen, L., 1999. On clustering fMRI time series. Neuroimage 9, 298–310.

Heydar, D., Ali, T., Emad, F., 2009. Extracting activated regions of fMRI data using unsupervised learning. In: Proceedings of International Joint Conference on Neural Networks, Atlanta, Georgia, USA, pp. 641–645.

Hyvarinen, A., Karhunen, J., Oja, E., 2001. Independent Component Analysis. John Wiley & Sons.

John, A., Gareth, B., Chun-Chuan, C., Jean, D., Guillaume, F., Karl, F., Stefan, K., James, K., Vladimir, L., Rosalyn, M., Will, P., Maria, R., Klaas, S., Darren, G., Rik, H., Chloe, H., Volkmar, G., Jeremie, M., Christophe, P., 2013. SPM8 Manual, Functional Imaging Laboratory, Trust Centre for Neuroimaging. Institute of Neurology, London, UK.

Katwal, S.B., 2011. Analyzing fMRI data with graph-based visualizations of self-organizing maps. In: IEEE International Symposium on Biomedical Imaging, Chicago, pp. 1577–1580.

Korczak, J., 2007. Interactive Mining of Functional MRI Data, Signal-Image Technologies and Internet-Based System (SITIS ‘07). IEEE Computer Society, Washington, DC, USA, pp. 912–917.

Korczak, J., 2012. Visual exploration of functional MRI data. In: Karahoca, A., INTECH (Eds.), Data Mining Applications in Engineering and Medicine, pp. 249–264.

Lachiche, N., Hommet, J., Korczak, J., Braud, A., 2005. Neuronal clustering of brain fMRI images. Proceeding of Pattern Recognition and Machine Inference, 300–305.

Lange, N., Strother, S.C., Anderson, J.R., Nielsen, F.A., Holmes, A.P., Kolenda, T., Savoy, R., Hansen, L.K., 1999. Plurality and resemblance in fMRI data analysis. Neuroimage 10, 1999.

Liao, W., Chen, H., Yang, Q., Lei, X., 2008. Analysis of fMRI data using improved self-organizing mapping and spatio-temporal metric hierarchical clustering. IEEE Trans. Med. Imaging 27, 1472–1483.

Lindquist, M.A., 2008. The statistical analysis of fMRI data. Stat. Sci. 23, 439–464.

Martinetz, T., Schulten, K., 1991. A Neural-Gas Network Learns Topologies, Artificial Neural Networks. Elsevier, pp. 397–402.

McKeown, M., Makeig, S., Brown, G., Jung, T., Kindermann, S., Bell, A., Sejnowski, T., 1998. Analysis of fMRI data by blind separation into independent spatial components. Hum. Brain Mapp. 6, 160–188.

Mckeown, M.J., 2000. Detection of consistently task-related activations in fMRI data with hybrid independent component analysis. Neuroimage 11, 24–35.

Pereira, F., Mitchell, T., Botvinick, M., 2009. Machine learning classifiers and fMRI: a tutorial overview. Neuroimage 45.

Qin, A.K., Suganthan, P.N., 2004. Robust growing neural gas algorithm with application in cluster analysis. Neural Netw. 17, 1135–1148.

Rissanen, J., 1983. A universal prior for integers and estimation by minimum description length. Ann. Stat. 11, 416–431.

SPM, 1991. Statistical Parametric Mapping. http://www.fil.ion.ucl.ac.uk/spm/.

Seghier, M.L., Friston, K.J., Price, C.J., 2007. Detecting subject-specific activations using fuzzy clustering. Neuroimage 36, 594–605.

Skudlarski, P., Constable, R.T., Gore, J.C., 1999. ROC analysis of statistical methods used in functional MRI: individual subjects. Neuroimage 9, 311–329.


Sun, X., Xu, W., 2014. Fast implementation of DeLong’s algorithm for comparing the areas under correlated receiver operating characteristic curves. IEEE Signal Process Lett. 21, 1389–1393.

The Analysis Group, 2012. FMRIB (Oxford, UK). http://fsl.fmrib.ox.ac.uk/fsl/fslwiki/.

Wismuller, A., Meyer-Base, A., Lange, O., Auer, D., Reiser, M.F., Sumners, D., 2004. Model-free functional MRI analysis based on unsupervised clustering. J. Biomed. Inform. 37, 10–18.

Yongnan, J., 2010. Data-driven fMRI Data Analysis Based on Parcellation, Ph.D. Thesis. University of Nottingham (October).

