Biomedical Signal Processing and Control
journal homepage: www.elsevier.com/locate/bspc

Technical note
Sparse spatial filter via a novel objective function minimization with smooth ℓ1 regularization
Ibrahim Onaran a,b,∗, N. Firat Ince a,c, A. Enis Cetin b
a Department of Neurosurgery, University of Minnesota, Minneapolis, MN 55455, USA
b Department of Electrical Engineering, Bilkent University, Ankara, Turkey
c Department of Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN 55455, USA
Article info

Article history:
Received 19 April 2012
Received in revised form 7 September 2012
Accepted 8 October 2012
Available online 8 November 2012

Keywords: Brain machine interfaces; Common spatial patterns; Sparse spatial projections; Rayleigh quotient; Unconstrained optimization
Abstract
The common spatial pattern (CSP) method is widely used in brain machine interface (BMI) applications to extract features from multichannel neural activity through a set of spatial projections. These spatial projections minimize the Rayleigh quotient (RQ) as the objective function, which is the variance ratio of the classes. The CSP method easily overfits the data when the number of training trials is not sufficiently large, and it is sensitive to daily variation of multichannel electrode placement, which limits its applicability for everyday use in BMI systems. To overcome these problems, the number of channels used in the projections should be limited to some adequate number. We introduce a spatially sparse projection (SSP) method that exploits the unconstrained minimization of a new objective function with an approximated ℓ1 penalty. Unlike the RQ, this new objective function depends on the magnitude of the sparse filter. The SSP method is employed to classify the multiclass ECoG and two class EEG datasets. We compared our results with a recently introduced sparse CSP solution based on the ℓ0 norm. Our method outperforms the standard CSP method and provides results comparable to the ℓ0 norm based solution, while being associated with less computational complexity. We also conducted several simulation studies on the effect of a noisy channel and of intersession variability on the performance of the CSP and sparse filters.
© 2012 Elsevier Ltd. All rights reserved.
1. Introduction
BMI technology aims to help disabled people establish communication with their environment solely through their brain signals. With the recent advances in electrode design and recording technology, the number of recording channels used in BMI applications is increasing, either to capture signals from a larger area of the brain or to get more information from smaller regions using dense electrode grids. Therefore, a dimension reduction algorithm needs to be employed to decrease the correlation between channels and improve the signal to noise ratio (SNR). In this scheme, the CSP algorithm is widely used to extract features from high-density recordings, with both noninvasive and invasive modalities, due to its simplicity and low computational complexity [1,2].
Despite the benefits of the CSP method, it also has a number of drawbacks. One major problem of the CSP is that it generally overfits the data when it is recorded from a large number of electrodes and when there is a limited number of training trials. Moreover, the chance that CSP uses a noisy or corrupted channel increases linearly with the number of recording channels. Robustness over time is also a major drawback in CSP applications [3,4]. Since all channels are used in the spatial projections of CSP, the classification accuracy may drop if the electrode locations change slightly between sessions. This requires almost identical electrode positions over time, which is difficult to realize [5]. The sparseness of the spatial filter might play an important role in increasing the robustness and generalization capacity of the BMI system.

∗ Corresponding author at: Department of Electrical Engineering, Bilkent University, Ankara, Turkey.
E-mail addresses: [email protected], [email protected] (I. Onaran), fi[email protected] (N.F. Ince), [email protected] (A.E. Cetin).
The CSP method minimizes the Rayleigh quotient (RQ) of the spatial covariance matrices to achieve the variance imbalance between the classes of interest. The RQ is defined as

$$R(w) = \frac{w^T A w}{w^T B w} \tag{1}$$

where A and B are the spatial covariance matrices of two different classes and w is the spatial filter that we want to find. One way to reduce the number of channels used in the projection w is to transform the CSP algorithm into a regularized optimization problem of the form

$$L(w) = R(w) + \lambda \|w\|_1 \tag{2}$$

where R(w) is the objective function, ||w||_1 is the ℓ1 norm based penalty, and λ is a constant that controls the sparsity of the solution. In the past few years, there has been growing interest in using the ℓ1 penalty to construct sparse solutions. However, the RQ does not depend on the magnitude of the sparse filter. Therefore, the RQ cannot be used directly in a norm based minimization problem, since the optimizer can always shrink the norm along the direction in which the RQ has been minimized.

1746-8094/$ – see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.bspc.2012.10.003
A number of studies investigated putting the CSP into alternative optimization forms to obtain a sparse solution for it. In [6] the authors converted CSP into a quadratically constrained quadratic optimization problem with an ℓ1 penalty; others used an ℓ1/ℓ2 norm based solution [3,7]. These studies have reported a slight decrease or no change in the classification accuracy while decreasing the number of channels significantly. Recently, in [8] a quasi ℓ0 norm based criterion was used for obtaining the sparse solution, which resulted in an improved classification accuracy. Since the ℓ0 norm is non-convex, combinatorial and NP-hard, they implemented greedy solutions such as Forward Selection (FS) and Backward Elimination (BE) to decrease the computational complexity. It has been shown that BE was better than FS (less myopic) in terms of classification error and sparseness level, but it is associated with very high complexity, making it difficult to use in rapid prototyping scenarios.
In this paper, we construct a computationally efficient spatially sparse projection (SSP) based on a novel objective function with characteristics similar to the RQ. This new objective function can be minimized in the form of (2) to address the drawbacks of the regular CSP method. We show that our new objective function has the same minimization solution as the RQ and that it depends on the magnitude of the spatial filter. The magnitude dependency of our new objective function allows us to use a continuous and differentiable function approximating the ℓ1 norm [9] as the regularization term in an unconstrained optimization framework, which can be solved using standard algorithms with low complexity. The rest of the paper is organized as follows. In the following section, we describe our novel objective function and its relation to the RQ. Then we explain its use in an unconstrained optimization problem. Next, we apply our method to the BCI competition IV ECoG dataset involving individuated movements of five fingers [10] and the BCI competition III EEG dataset IVa [11] involving imaginary foot and hand movements. We also compare our method to the standard CSP and the ℓ0 norm based BE solution given in [8]. Finally, we study the effect of additional Gaussian noise and simulated channel displacements on the classification accuracy, discuss our results, and provide future directions.
2. Materials and methods
2.1. Standard CSP and a new objective function
In the CSP framework, the spatial filters are weighted linear combinations of recording channels, which are tuned to produce spatial projections maximizing the variance of one class and minimizing the other. The spatial projection is computed using

$$X_{CSP} = W^T X \tag{3}$$

where the columns of W are the vectors representing each spatial projection and X is the multichannel ECoG data.
Maximizing the RQ (1) is identical to the following optimization problem:

$$\underset{w}{\text{maximize}} \quad w^T A w \quad \text{subject to} \quad w^T B w = 1. \tag{4}$$
After writing this optimization problem in the Lagrange form and taking the derivative with respect to w, we obtain the identical problem in the form of Aw = λBw, which is the Generalized Eigenvalue Decomposition (GED). The solutions of this equation are the joint eigenvectors of A and B, and λ is the eigenvalue associated with a particular eigenvector.
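The GED step can be illustrated numerically. The following is a minimal sketch, not the authors' implementation: it assumes NumPy, synthetic covariance matrices, and a hypothetical helper named `csp_filters`, and it solves Aw = λBw by whitening with a Cholesky factor of B.

```python
import numpy as np

def csp_filters(A, B):
    """Solve A w = lambda B w for symmetric A and positive-definite B.

    Returns eigenvalues in ascending order and the matching filters as
    columns of W, normalised so that W.T @ B @ W is the identity.
    """
    L = np.linalg.cholesky(B)          # B = L L^T
    L_inv = np.linalg.inv(L)
    M = L_inv @ A @ L_inv.T            # symmetric, same eigenvalues as the GED
    eigvals, V = np.linalg.eigh(M)     # ascending eigenvalues
    W = L_inv.T @ V                    # map back: w = L^{-T} v
    return eigvals, W

# Synthetic two-class data (assumption: 6 channels, 500 samples per class).
rng = np.random.default_rng(0)
X1 = rng.standard_normal((6, 500))
X2 = rng.standard_normal((6, 500)) * np.linspace(0.5, 2.0, 6)[:, None]
A = np.cov(X1)
B = np.cov(X2)

eigvals, W = csp_filters(A, B)
w_min = W[:, 0]    # minimises the Rayleigh quotient
w_max = W[:, -1]   # maximises the Rayleigh quotient
```

The first and last columns of `W` correspond to the two ends of the eigenvalue spectrum, which is where the most discriminative CSP filters are taken from.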
The drawbacks of the CSP method described earlier lead us to find a way to sparsify the spatial filter in order to increase the classification accuracy and the generalization capability of the method. We assume that the discriminatory information is embedded in a few channels, where the number of these channels is much smaller than the actual number of all recording channels. So the discrimination can be obtained with a sparse spatial projection, which uses only informative channels. In this scheme, assume that the data was recorded from K channels. We are interested in obtaining a sparse spatial projection using an unconstrained minimization problem in the form of (2), where w has only k nonzero entries, card(w) = k and k ≪ K. We note that R(w) does not depend on the magnitude of w, as shown in the following equation. Let w* = αw, then

$$R(w^*) = R(\alpha w) = \frac{\alpha^2 w^T A w}{\alpha^2 w^T B w} = R(w) \tag{5}$$

where α is any nonzero scalar. Since R(w) does not depend on the gain of w, the optimizer arbitrarily reduces the gain of w to minimize the regularization term ||w||_1 after finding the direction that minimizes R(w). Thus, the solution of the optimization problem that uses R(w) as an objective function is essentially the same as the GED solution.
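The scale invariance in Eq. (5), and its consequence for the ℓ1-penalized problem (2), can be checked on a toy example. This is an illustrative sketch only; the 2×2 matrices, the vector w, and the value of λ are arbitrary assumptions, and the function names are hypothetical.

```python
import numpy as np

def rayleigh_quotient(w, A, B):
    return (w @ A @ w) / (w @ B @ w)

def l1_penalised_rq(w, A, B, lam):
    # Objective of Eq. (2): R(w) + lambda * ||w||_1
    return rayleigh_quotient(w, A, B) + lam * np.sum(np.abs(w))

A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, 0.1], [0.1, 3.0]])
w = np.array([0.4, -1.2])
lam = 0.5

# Shrinking w leaves R(w) unchanged but keeps lowering the penalty,
# so the penalised objective simply approaches R(w) as the gain goes to zero.
values = [l1_penalised_rq(alpha * w, A, B, lam) for alpha in (1.0, 0.1, 0.001)]
```

Because any direction can be rescaled toward zero at no cost to R(w), the penalty in (2) exerts no pressure toward a sparse direction, which is why a magnitude-dependent objective is needed.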
To find a sparse solution we need an objective function that depends on the gain of w. In this scheme, we replaced R(w) with the following objective function:

$$G(w) = w^T A w + \frac{1}{w^T B w} \tag{6}$$
This function is bounded from below and has interesting properties. Let us define a = w^T A w and b = w^T B w. If we define the RQ in terms of a and b such that R = a/b, then our new objective function can be expressed as

$$G(w) = a + \frac{1}{b} = \frac{ab}{b} + \frac{1}{b} = Rb + \frac{1}{b} \tag{7}$$
The derivative of G(w) with respect to R is equal to b, which is always positive. This indicates that our objective function G(w) decreases with a decrease in the R value. After taking the derivative of G(w) with respect to b and solving Eq. (8),

$$\frac{\partial G(w)}{\partial b} = R - \frac{1}{b^2} = 0 \tag{8}$$

we note that b is equal to 1/√R. By inserting this b value into Eq. (7), we obtain the minimum value of G(w) as 2√R. This result shows that the direction that minimizes R also minimizes G(w).
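The result G_min = 2√R can be verified numerically along a fixed direction. In the sketch below (toy matrices and hypothetical function names, not taken from the paper), the gain α that minimizes G(αw) is obtained in closed form by setting the derivative of α²a + 1/(α²b) with respect to α² to zero.

```python
import numpy as np

def rayleigh_quotient(w, A, B):
    return (w @ A @ w) / (w @ B @ w)

def G(w, A, B):
    # Objective of Eq. (6): w'Aw + 1 / (w'Bw)
    return w @ A @ w + 1.0 / (w @ B @ w)

def best_gain(w, A, B):
    """Gain alpha minimising G(alpha * w) for a fixed direction w."""
    a = w @ A @ w
    b = w @ B @ w
    # With t = alpha^2: d/dt (t*a + 1/(t*b)) = 0  =>  t = (a*b)^(-1/2)
    return (a * b) ** -0.25

A = np.array([[2.0, 0.3], [0.3, 1.0]])
B = np.array([[1.0, 0.1], [0.1, 3.0]])
w = np.array([0.4, -1.2])

alpha = best_gain(w, A, B)
R = rayleigh_quotient(w, A, B)
g_min = G(alpha * w, A, B)                 # equals 2 * sqrt(R)
b_opt = (alpha * w) @ B @ (alpha * w)      # equals 1 / sqrt(R)
```

Since the best achievable value along any direction is 2√R, which increases monotonically with R, ranking directions by G (after optimal scaling) is the same as ranking them by R.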
We plug G(w) into the unconstrained optimization formulation in (2) as the objective function. Rather than working to solve (2) with a non-differentiable ℓ1 penalty, we replaced it with a twice differentiable smooth version of ℓ1 (epsL1), which is sufficiently close to minimizing ℓ1 [9]. The main advantage of this approach is that, since epsL1 and G(w) are both twice differentiable, we can directly apply an unconstrained optimization method to minimize L(w) [12]. The epsL1 is defined as

$$\|w\|_{1,\epsilon} = \sum_{i=1}^{K} \sqrt{w_i^2 + \epsilon} \tag{9}$$

where ε is a sufficiently small parameter and K is the dimension of w. The epsL1 approximates the ℓ1 norm and they are identical when ε is equal to zero. Twice differentiability of the epsL1 norm allows us to use it when w_i is equal to zero, unlike the regular ℓ1 norm, which is not differentiable at zero.

The solution w that minimizes the function L(w) = G(w) + λ||w||_{1,ε} tends to become sparse as λ gets bigger. The entries of w were generally not exactly equal to zero, so we normalized w to its maximum absolute value and eliminated the weights, and consequently the corresponding channels, that do not exceed a predefined threshold (10⁻²). We used the "fminunc" function of Matlab to find the solution of our unconstrained minimization problem. We obtained the desired cardinality, which is the number of channels to be selected for the spatial projection, by implementing a bisection search [13] on λ. The upper border of λ was determined initially using the G(w_c)/||w_c|| ratio, where w_c is the full CSP solution. In case the initial upper border results in a cardinality larger than the desired value, we kept doubling the λ parameter until we obtained a λ that results in a cardinality less than or equal to the target value.
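The procedure above can be sketched in Python. This is a hedged illustration under stated assumptions, not the authors' Matlab code: SciPy's BFGS solver stands in for `fminunc`, the optimizer is started from the full CSP solution, the lower bisection bound is λ = 0, and the caps on doublings and bisection steps are safeguards added here. All function names are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.optimize import minimize

def ssp_objective(w, A, B, lam, eps=1e-6):
    """Value and gradient of L(w) = w'Aw + 1/(w'Bw) + lam * epsL1(w)."""
    b = w @ B @ w
    root = np.sqrt(w * w + eps)                      # smooth |w_i|
    value = w @ A @ w + 1.0 / b + lam * root.sum()
    grad = 2.0 * (A @ w) - 2.0 * (B @ w) / b**2 + lam * w / root
    return value, grad

def sparse_filter(A, B, lam, w0, threshold=1e-2):
    """Minimise L(w), normalise to max |w_i| = 1, zero the small weights."""
    res = minimize(ssp_objective, w0, args=(A, B, lam), jac=True, method="BFGS")
    w = res.x / np.max(np.abs(res.x))
    w[np.abs(w) <= threshold] = 0.0
    return w

def filter_with_cardinality(A, B, k, w_c, n_bisect=30, max_doublings=20):
    """Search lam so that at most k weights survive the threshold.

    If the doubling cap is hit, the returned filter may still have more
    than k nonzero entries.
    """
    g_c = w_c @ A @ w_c + 1.0 / (w_c @ B @ w_c)
    lam_lo, lam_hi = 0.0, g_c / np.sum(np.abs(w_c))  # initial upper border
    w_best = sparse_filter(A, B, lam_hi, w_c)
    for _ in range(max_doublings):                   # grow lam until sparse enough
        if np.count_nonzero(w_best) <= k:
            break
        lam_lo, lam_hi = lam_hi, 2.0 * lam_hi
        w_best = sparse_filter(A, B, lam_hi, w_c)
    for _ in range(n_bisect):                        # bisection on lam
        if np.count_nonzero(w_best) == k:
            break
        lam = 0.5 * (lam_lo + lam_hi)
        w = sparse_filter(A, B, lam, w_c)
        if np.count_nonzero(w) > k:
            lam_lo = lam
        else:
            lam_hi, w_best = lam, w
    return w_best

# Synthetic demo (assumption): channel 2 has low variance in class A only.
rng = np.random.default_rng(0)
K = 8
X_a = rng.standard_normal((K, 300))
X_a[2] *= 0.2
X_b = rng.standard_normal((K, 300))
A, B = np.cov(X_a), np.cov(X_b)
_, V = eigh(A, B)          # generalized eigenvectors, ascending eigenvalues
w_c = V[:, 0]              # full CSP filter minimising the RQ
w_sparse = filter_with_cardinality(A, B, k=2, w_c=w_c)
print(np.count_nonzero(w_sparse), np.round(w_sparse, 3))
```

Supplying the analytic gradient is a design choice made for this sketch; it avoids finite-difference gradient estimates, which become inaccurate near w_i = 0 where the curvature of the smoothed penalty is large.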
Following the above procedure, we computed the first spatial filter w that minimizes G(w), which also minimizes R(w). The solution that maximizes R(w) is also a useful spatial filter. Therefore, we interchanged the matrices A and B to find a solution that maximizes R(w). In order to find multiple sparse filters, we deflated the covariance matrices with the sparse vectors using the Schur complement deflation method described in [14].
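For reference, Schur complement deflation of a single covariance matrix C by a vector w replaces C with C − (C w wᵀ C)/(wᵀ C w). The sketch below shows only this generic update on synthetic data; how the deflation was applied to the pair of class covariance matrices is not spelled out in the text, so that part is left out, and the helper name is hypothetical.

```python
import numpy as np

def schur_deflate(C, w):
    """Schur complement deflation of a covariance matrix C by the vector w."""
    Cw = C @ w
    return C - np.outer(Cw, Cw) / (w @ Cw)

# Synthetic covariance and a sparse vector (assumptions for illustration).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 200))
C = np.cov(X)
w = np.array([1.0, 0.0, -0.5, 0.0, 0.0])

C_deflated = schur_deflate(C, w)
```

After the update, the deflated matrix assigns zero variance to the direction w while remaining positive semidefinite, so the next filter cannot simply rediscover the same projection.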
2.2. ECoG & EEG datasets
We applied the SSP method to two different datasets, the multiclass ECoG and the two class EEG datasets of BCI competitions IV and III, respectively. The ECoG data was recorded from three subjects during finger flexions and extensions [10] with a sampling rate of 1 kHz. The electrode grid was placed on the surface of the brain. Each electrode array contained 48 (8×6) or 64 (8×8) platinum electrodes. The finger to be moved was indicated with a cue on a computer monitor. The subjects moved one of their five fingers 3–5 times during the cue period. The ECoG data of each subject was subband filtered in the gamma frequency band (65–200 Hz) as in [15]. We used 1 s of data following the movement onset in the analysis. The dataset contains around 146 trials for each subject.
We also used the BCI competition III dataset IVa [11]. The dataset was recorded from five subjects (aa, al, av, aw, ay) who were asked to imagine either right foot or right index finger movements. The sampling rate of the data was 1 kHz and the data was recorded from 118 channels. The EEG signal was filtered in the range of 8–30 Hz. There were 140 trials available for each class. Once again, 1 s of data following the cue was used in the analysis.
For both ECoG and EEG datasets, the signal was transformed by four spatial filters, obtained by taking the first and last two eigenvectors for each CSP method. After computing the spatial filter outputs, we calculated the energy of the signal and converted it to log scale for each sparse filter, and we used these values as input features to a lib-SVM classifier with an RBF kernel [16]. We also investigated the efficacy of using a Linear Discriminant Analysis (LDA) classifier [17], which is a parameter free decision function.
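The feature extraction step can be sketched as follows. This is a minimal illustration that assumes "energy" means the sum of squared samples of each filter output over the trial; the function name and the toy arrays are hypothetical.

```python
import numpy as np

def log_energy_features(W, X):
    """Log-energy of each spatial filter output for one trial.

    W: (n_channels, n_filters) spatial filters as columns.
    X: (n_channels, n_samples) single-trial multichannel data.
    """
    Z = W.T @ X                       # Eq. (3): filter outputs
    energy = np.sum(Z ** 2, axis=1)   # assumed definition of signal energy
    return np.log(energy)

# Toy trial: 3 channels, 3 samples, two filters that each pick one channel.
W = np.eye(3)[:, :2]
X = np.array([[1.0, 2.0, 2.0],
              [0.0, 3.0, 4.0],
              [5.0, 5.0, 5.0]])
features = log_energy_features(W, X)
```

With four filters per trial, each trial would yield a four-dimensional feature vector for the SVM or LDA classifier.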
Since we are tackling a multiclass problem for the ECoG dataset, we used the pairwise discrimination strategy of [2] to apply the CSP to the five-class finger movement data. In other words, we constructed sparse spatial filters tuned to contrast pairs of finger movements such as 1 vs. 2; 1 vs. 3; 2 vs. 4, etc.
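As a small illustration of the pairwise strategy, the class pairs can be enumerated as below. The sketch assumes that every unordered pair of the five fingers is contrasted; the text only lists examples of pairs.

```python
from itertools import combinations

fingers = [1, 2, 3, 4, 5]   # thumb, index, middle, ring, little

# One set of sparse spatial filters would be trained per finger pair.
pairs = list(combinations(fingers, 2))
```

Under this assumption, five classes give ten pairwise contrasts.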
In each dataset, we compared the SSP to the standard CSP and to the ℓ0 norm based BE method of [8], as it provided superior results in terms of classification accuracy and reduced cardinality. We studied the classification accuracy as a function of cardinality. To find the optimum sparseness level for the classification, we computed several sparse solutions with decreasing cardinality on the training data. For the ECoG dataset, the sparse CSP methods were employed with k ∈ {40, 30, 20, 15, 10, 5, 2, 1}. For the EEG dataset, we computed the sparse filters with k ∈ {80, 60, 40, 30, 20, 15, 10, 5, 2, 1}. For each cardinality, we computed the corresponding RQ value. We studied the inverse of the RQ (IRQ) curve and determined the optimal cardinality where its value suddenly dropped, indicating that we started to lose informative channels. For the ECoG dataset, half of the trials were used in training and the remaining half for testing. On average, we used 15±2 training trials per finger (the thumb, index, middle, ring and little fingers). The EEG dataset contains 140 trials per class and subject. We used 70 trials in training to estimate the sparse filters, and 70 trials for testing. In both datasets, the value of ε in the epsL1 regularization term was chosen to be 10⁻⁶.

3. Results

Table 1
ECoG dataset classification error rates (%) for each subject using SVM classifier.

Method  Cardinality  Subject 1  Subject 2  Subject 3  Avg
BE      5            19.8       17.1       16.8       18
SSP     5            18.4       13.4       18         17
CSP     All          30.7       26         32.8       30
We observed that for the SSP method, any particular λ value can lead to different cardinality and normalized IRQ values for different subjects, as shown in Fig. 1. In particular, this intersubject variability of IRQ did not allow us to use the same λ value for all subjects (see Fig. 1a and b). However, the variability of the IRQ values of different subjects was lower when we fixed the cardinality, as shown in Fig. 1e and f. Consequently, due to this reduced variability and to compare our method to the BE technique, we studied the classification error as a function of cardinality. In order to decide on the optimal cardinality level to be used on the test data, the IRQ values were computed on the training data, scaled to their maximum value and averaged over subjects. In the following step, we computed the slope of the IRQ curve and normalized it to its maximum value to get an idea about the relative change in the IRQ.
We depicted the change in IRQ values for each cardinality, as shown in Fig. 2a and b. As expected, decreasing the cardinality of the spatial projection resulted in a decrease in the IRQ value. To determine the optimum cardinality to be used in classification on the test data, we selected the cardinality that is below 10% of the maximum relative change (see the dashed lines in Fig. 2a and b). For the ECoG dataset, the cardinality value was found to be 5 and for the EEG dataset, it was found to be 15 for the SSP method. For the BE method these values were 5 and 10, respectively. These indices perfectly corresponded to the elbow of the IRQ curve, which indicates loss of informative channels. In Tables 1 and 2, we provide the classification results and selected cardinalities for the ECoG and EEG datasets using different methods including SSP, CSP and the ℓ0 based greedy solution, BE. In order to illustrate the change in error rate versus the cardinality, we provided the related classification error curves in Fig. 2c and d. For the ECoG data, the selected cardinality provided minimum test errors. However, for the EEG data, although the minimum classification error was obtained at cardinality 5 for the BE method, we identified the optimum cardinality as 10 on the training data.
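One possible reading of this selection rule is sketched below. It is an interpretation, not the authors' exact procedure: the slope of the averaged IRQ curve is taken between consecutive cardinalities, normalized to its maximum, and the chosen cardinality is the last one reached before the normalized slope first exceeds 10%. The IRQ numbers are made up purely for illustration and are not results from the paper.

```python
import numpy as np

def select_cardinality(cards, irq, rel_threshold=0.1):
    """Pick the cardinality just before the IRQ curve starts to drop sharply.

    cards: cardinalities in decreasing order.
    irq:   subject-averaged IRQ values (scaled to max 1) at those cardinalities.
    """
    cards = np.asarray(cards, dtype=float)
    irq = np.asarray(irq, dtype=float)
    slope = np.abs(np.diff(irq) / np.diff(cards))   # change per removed channel
    rel = slope / slope.max()                       # normalised to its maximum
    steep = np.nonzero(rel > rel_threshold)[0]
    idx = steep[0] if steep.size else len(cards) - 1
    return int(cards[idx])

# Hypothetical averaged IRQ curve (illustrative values only).
cards = [40, 30, 20, 15, 10, 5, 2, 1]
irq = [1.00, 0.995, 0.99, 0.985, 0.98, 0.97, 0.80, 0.50]
k_opt = select_cardinality(cards, irq)
```

For this made-up curve the sharp drop begins between cardinalities 5 and 2, so the rule returns 5.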
On all subjects we studied, we observed that the SSP method consistently outperformed the CSP method. We noted that the minimum error rate was obtained with the SSP method for the ECoG data. Both SSP and BE methods used a cardinality of 5 to achieve the minimum error rate. As expected, the full CSP solution did not perform as well as the sparse methods and likely overfitted the training data. The SSP method improved the classification error rate with an error difference of 13.2%. We obtained comparable results on the EEG data using the SSP and BE methods (p-value = 0.5, paired t-test); however, BE provided a significant improvement over the SSP method in the number of channels used in the spatial projection (p-value = 0.003). The error difference between regular CSP and SSP is less apparent in the EEG dataset, where the difference between classification accuracies was 3.8%. This could be due to the high number of training trials used in the EEG data. We studied the effect of the amount of training data on the classification accuracy and presented the results in Fig. 4a. When a small number of training trials, as low as 15, was used in the EEG dataset, the difference between the sparse and standard CSP techniques was more than 6%. Interestingly, with an increasing number of training trials the SSP method consistently provided better results and the difference remained between 3 and 4%. There was no noticeable difference between SSP and BE.

Table 2
EEG dataset classification error rates (%) for each subject using SVM classifier.

Method  Cardinality  aa    al   av    aw   ay   Avg
BE      10           13.6  2.9  30.7  2.1  5.0  10.9
SSP     15           19.3  1.4  23.6  4.3  5.7  10.9

Fig. 1. Normalized IRQ values vs. λ are shown in (a) for ECoG and in (b) for EEG data. The cardinality vs. the λ value of the minimization function L(w) = G(w) + λ||w|| is shown in (c) for ECoG and (d) for EEG data for each subject. The vertical lines indicate the λ values that are initially chosen for bisection search. The initial data point corresponds to λ = 0, which produces the regular CSP solution. The normalized IRQ values vs. cardinality for each subject are shown in (e) and (f).
The classification results obtained with the LDA classifier are given in Table 3. We observed that the LDA classifier, which unlike SVM does not involve parameter selection, provided slightly higher error rates for the sparse solutions. This could be due to the nonlinear decision surface and maximum margin identified by the SVM classifier. Interestingly, in both datasets, the LDA classifier resulted in lower error rates with the non-sparse CSP solution.
Table 3
Average test error rate (%) and corresponding cardinality.

             ECoG                 EEG
             BE    SSP   CSP     BE    SSP   CSP
LDA          19.9  18.9  24.1    11.4  11.1  14.3
SVM          17.9  16.6  29.8    10.9  10.9  14.7
Cardinality  5     5     All     10    15    All
Fig. 3 illustrates the distribution of the spatial filters obtained using the SSP and CSP algorithms for all subjects. We observed that the SSP filter coefficients are localized on the left hemisphere and the central area, which is in accordance with the cortical regions related to right hand and foot movement generation.
Arvaneh et al. [7] used the ℓ1/ℓ2 ratio as a penalty term and applied their algorithm to the BCI competition III EEG dataset IVa [11], which we used in this paper as well. They achieved a mean error rate of 17.7±15.4% using 22.6±11 channels. Here, we compared our method with the study of Arvaneh et al. by extracting one filter from each end of the sparse solutions. The SSP method achieved a mean error rate of 12±11.3% with an average of 25.6±2.3 channels. The obtained results indicated that the SSP method provided a significant improvement (p-value = 0.024, paired t-test) over the ℓ1/ℓ2 based algorithm in classification accuracy, without any significant difference in the number of channels used (p-value = 0.28).
In order to compare the computational complexity of the SSP method to that of BE, we computed sparse filters with a cardinality of two from an increasing number of recording channels on simulated data. The training was performed on a regular desktop computer with 4 GB of RAM and equipped with a CPU running at 2.66 GHz. The elapsed time per filter computation increased exponentially for the BE method and linearly for the SSP method, as shown in Fig. 4b. With 128 channels, the BE algorithm computed a single spatial filter with two nonzero entries in 90 s. For the SSP method with the same setup, the elapsed time was less than a second. Although we used the relative change in the IRQ to identify the optimum sparsity level, one can also run a typical k-fold cross validation procedure to identify the optimum level. However, in such a case training the system with the BE method will take several hours, which may not be feasible for BMI applications. On the other hand, with the SSP method, training through cross validation can be executed in a few minutes.
Fig. 2. The average IRQ of all subjects versus cardinality for (a) ECoG and (b) EEG data. The red line is the 10 percent threshold that determines the optimum cardinality to be used in the test data. The optimum cardinality levels for ECoG and EEG are five and 15, respectively. The classification error curves of the SSP and BE methods versus the cardinality are given in (c) for ECoG and in (d) for EEG. The last data point corresponds to the results obtained from standard CSP, which uses all channels. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
Fig. 3. The CSP and SSP filters for hand and foot movement imagination.

In order to evaluate the effect of noise and intersession variability on the performance of our approach, we conducted two different controlled experiments:

i Adding Gaussian noise to a randomly selected channel in ECoG and EEG data.
ii Simulation of electrode displacement in EEG data.
During the first experiment, we added zero mean Gaussian noise to one of the channels and calculated the resulting classification error for the noisy data. This experiment was repeated for all channels and then the resulting average classification error was computed. The variance of the additional noise was increased in a controlled manner, in proportion to the average variance of all channels. The ratio of the noise variance to the variance of the original data was named the noise to signal plus noise ratio (NSNR), since the original signal is already contaminated by different noise sources. In Fig. 5, we depicted the classification error vs. NSNR, which was expressed in decibel scale. It was noted that an increase in NSNR caused the error to increase for all methods; however, the increase for the CSP method was larger than for the sparse filters. While using the sparse methods on the ECoG data, the classification error reached a plateau after 5 dB, whereas the standard CSP error increased monotonically. A similar behavior was observed in the EEG data.
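The noise injection can be sketched as follows. The snippet assumes that an NSNR of x dB means the added noise variance equals 10^(x/10) times the average variance of all channels; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def add_channel_noise(X, channel, nsnr_db, rng):
    """Return a copy of X with zero mean Gaussian noise added to one channel.

    X: (n_channels, n_samples) array of floats.
    nsnr_db: noise variance relative to the mean channel variance, in dB.
    """
    mean_var = np.mean(np.var(X, axis=1))
    noise_var = 10.0 ** (nsnr_db / 10.0) * mean_var
    X_noisy = X.copy()
    X_noisy[channel] += rng.normal(0.0, np.sqrt(noise_var), size=X.shape[1])
    return X_noisy

# Synthetic 4-channel recording with different channel variances.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 20000)) * np.array([[1.0], [2.0], [0.5], [1.5]])
X_noisy = add_channel_noise(X, channel=1, nsnr_db=0.0, rng=rng)
```

Repeating the call for every channel index and averaging the resulting classification errors would mirror the averaging described above.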
In the second experiment, in order to study intersession variability, we simulated electrode displacements by interpolating the EEG test data at different positions. Since electrode locations for the ECoG data were not available, we conducted this experiment with the EEG data only. We randomly determined the direction and the amount of displacement of the new electrode locations. To be more realistic, we introduced a displacement that was uniformly distributed over the 118 electrodes, varying between 0 and 50% of the distance to the nearest electrode, as shown in Fig. 6a. The testing of the algorithms was conducted on the EEG data interpolated at the displaced electrode locations. The classification results vs. cardinality are shown in Fig. 6b. It was noted that the error changes for the CSP, BE and SSP methods were 13%, 12% and 10%, respectively. The increase in error rate was not as severe as in the previous noise contaminated channel experiment. The sparse methods had a slightly lower error increase than the standard CSP. Initially, it was expected that the sparse methods would not be this vulnerable to displacement in the electrode locations. At this point, we speculate that the smoothness of the projections has a certain advantage on the displaced electrodes. For instance, due to sparseness, it is likely that the shifted electrode locations are weighted with zero. This makes the sparse projections discard the shifted channels that fall outside the projection zone. In contrast, the standard CSP is associated with some sort of smoothness, and many crucial channels, although displaced, could still be projected with similar weights due to the correlated distribution. Consequently, it should be noted that not only sparseness but also smoothness of the projections could be an important parameter contributing to the generalization capability of the methods. Nevertheless, once again we need to highlight that the increase in error rate was lower for the sparse methods.

Fig. 4. (a) The minimum error vs. the number of trials. (b) The average elapsed time to estimate a spatial filter with a cardinality of two vs. the number of total recording channels.

Fig. 5. The noise to signal plus noise ratio (NSNR) vs. classification error for (a) ECoG and (b) EEG sets.
Fig. 6. (a) The histogram of the number of electrodes with respect to the displacement induced on the test data. (b) The cardinality vs. classification error on the test data with displaced electrode locations.
4. Conclusion
The need for sparse filters is apparent when there is a large number of recording electrodes and an insufficient amount of training data. To minimize overfitting on the training data and eliminate noisy channels, we introduced a spatially sparse projection (SSP) technique based on a novel objective function. Unlike the RQ, this new objective function depends on the filter magnitude. By using an approximated ℓ1 norm, we computed the sparse spatial filters through an unconstrained minimization formulation with a standard optimization algorithm. We applied our method to ECoG and EEG datasets and compared its efficiency to standard CSP and to an ℓ0 norm based greedy technique. The SSP method outperformed the standard CSP on both datasets and provided results comparable to the ℓ0 norm based method, which is associated with higher computational complexity. On the ECoG data, the SSP method provided a 44% decrease in the error rate compared to the standard CSP method and used only five channels in each spatial projection. The error difference between regular CSP and SSP is less apparent in the EEG dataset, where the SSP method provided a 26% decrease in the error rate. In contrast to the ECoG data, we also observed that more channels were needed to achieve the minimum classification error in the EEG dataset. This could be due to the low spatial resolution originating from volume conduction and the low SNR of the EEG. Nevertheless, the SSP algorithm was able to reach a minimum error rate with only 15 channels. Our results indicate that the SSP method can be effectively used to extract features from both EEG and ECoG datasets with a large number of recording channels. We note that the sparse methods provided superior results compared to the standard CSP when there is a noisy/corrupted channel in the test data. In another setup, where displaced channels were used to simulate intersession variability, we note that the sparse methods had slightly better robustness than the standard CSP. These observations indicate that the sparse spatial projection framework can be effectively used as a robust feature extraction engine of future BCI systems.
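The relative error reductions quoted here appear to follow from the average SVM error rates reported in Table 3 (CSP vs. SSP); the short check below is our reading of where the 44% and 26% figures come from:

$$\text{ECoG:}\quad \frac{29.8 - 16.6}{29.8} = \frac{13.2}{29.8} \approx 0.44, \qquad \text{EEG:}\quad \frac{14.7 - 10.9}{14.7} = \frac{3.8}{14.7} \approx 0.26$$

The numerators, 13.2 and 3.8 percentage points, match the absolute error differences reported in the Results section.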
Acknowledgments
This research was supported in part by the National Science Foundation, award CBET-1067488, and by a grant from the University of Minnesota Interdisciplinary Informatics (UMII). It was also supported in part by the Scientific and Technological Research Council of Turkey (TUBITAK), 2211 PhD fellowship for Turkish citizens and project 111E057.
References
[1] B. Blankertz, R. Tomioka, S. Lemm, M. Kawanabe, K.-R. Muller, Optimizing spatial filters for robust EEG single-trial analysis, Signal Processing Magazine, IEEE 25 (2008) 41–56.
[2] N.F. Ince, R. Gupta, S. Arica, A.H. Tewfik, J. Ashe, G. Pellizzer, High accuracy decoding of movement target direction in non-human primates based on common spatial patterns of local field potentials, PLoS One 5 (2010) e14384.
[3] J. Farquhar, N.J. Hill, T.N. Lal, B. Schölkopf, Regularised CSP for sensor selection in BCI, in: 3rd International Brain-Computer Interface Workshop and Training Course, 14-15 Aug 2006, Graz, Austria, 2006, http://eprints.pascal-network.org/archive/00002709/
[4] B. Reuderink, M. Poel, Robustness of the Common Spatial Patterns algorithm in the BCI-pipeline, 2008.
[5] H. Ramoser, J. Muller-Gerking, G. Pfurtscheller, Optimal spatial filtering of single trial EEG during imagined hand movement, IEEE Transactions on Rehabilitation Engineering 8 (2000) 441–446.
[6] X. Yong, R. Ward, G. Birch, Sparse spatial filter optimization for EEG channel reduction in brain–computer interface, in: Acoustics, Speech and Signal Processing, IEEE International Conference on ICASSP 2008, 2008, pp. 417–420.
[7] M. Arvaneh, C. Guan, K.K. Ang, C. Quek, Optimizing the channel selection and classification accuracy in EEG-based BCI, IEEE Transactions on Biomedical Engineering 58 (2011) 1865–1873.
[8] F. Goksu, N. Ince, A. Tewfik, Sparse common spatial patterns in brain computer interface applications, in: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2011, pp. 533–536.
[9] S.-i. Lee, H. Lee, P. Abbeel, A.Y. Ng, Efficient L1 Regularized Logistic Regression, in: AAAI.
[10] K.J. Miller, G. Schalk, Prediction of Finger Flexion, 4th Brain–Computer Interface Data Competition, 2008.
[11] G. Dornhege, B. Blankertz, G. Curio, K.-R. Müller, BCI Competition III, Dataset IVa, 2005.
[12] M. Schmidt, G. Fung, R. Rosales, Fast Optimization Methods for L1 Regularization: A Comparative Study and Two New Approaches, 2009.
[13] R. Burden, J. Faires, Numerical Analysis, 8th ed., Thomson Brooks/Cole, 2005.
[14] L. Mackey, Deflation Methods for Sparse PCA, in: D. Koller, D. Schuurmans, Y. Bengio, L. Bottou (Eds.), Advances in Neural Information Processing Systems, vol. 21, 2009, pp. 1017–1024.
[15] I. Onaran, N.F. Ince, A.E. Cetin, Classification of Multichannel ECoG Related to Individual Finger Movements with Redundant Spatial Projections, in: International IEEE EMBS Conference, 2011, http://www.ieeeexplore.info/search/searchresult.jsp?newsearch=true&queryText=Classification+of+Multichannel+ECoG+Related+to+Individual&x=34&y=18
[16] C.-C. Chang, C.-J. Lin, A library for Support Vector Machines, 2001.
[17] R.O. Duda, P.E. Hart, D.G. Stork, Pattern Classification, 2nd ed., John Wiley & Sons, Inc., New York, N.Y., 2000.