Resource-aware event triggered distributed estimation over adaptive networks


Contents lists available at ScienceDirect

Digital Signal Processing

www.elsevier.com/locate/dsp

Ihsan Utlu a,b, O. Fatih Kilic a, Suleyman S. Kozat a,*

a Department of Electrical and Electronics Engineering, Bilkent University, Ankara, Turkey
b ASELSAN Research Center, Ankara 06370, Turkey

Article info

Article history:
Available online 1 June 2017

Keywords:
Distributed estimation
Adaptive networks
Event-triggered communication
Level-crossing quantization

Abstract

We propose a novel algorithm for distributed processing applications constrained by the available communication resources. The algorithm uses diffusion strategies and achieves up to a $10^3$-fold reduction in the communication load over the network, while delivering performance comparable to the state of the art. After computation of local estimates, the information is diffused among the processing elements (or nodes) non-uniformly in time by conditioning the information transfer on level-crossings of the diffused parameter, resulting in a greatly reduced communication requirement. We provide the mean and mean-square stability analyses of our algorithms, and illustrate the gain in communication efficiency compared to other reduced-communication distributed estimation schemes.

© 2017 Elsevier Inc. All rights reserved.

1. Introduction

In tandem with the increasing computational capabilities of processing units and the growing amount of generated data, distributed networks and decentralized data processing algorithms have remained an area of growing interest [1–3]. With intrinsic characteristics such as robustness and scalability, distributed architectures provide enhanced efficiency and performance for a wide variety of applications ranging from adaptive filtering, sequential detection, and sensor networks to distributed resource allocation [4–9]. However, successful implementation of such applications depends on a substantial amount of communication resources. As an example, in smart grid applications, measurement units operating at high frequency put the communication infrastructure of the grid under significant pressure [10]. This calls for resource-efficient, event-triggered distributed estimation solutions that incorporate event-driven communication [11–15]. To this end, in this paper, we construct distributed architectures that have a significantly reduced communication load without compromising performance. We achieve this by introducing novel event-triggered communication architectures over distributed networks.

In a distributed processing framework, a group of measurement-capable agents, termed nodes, in a network cooperate with one another in order to estimate an unknown common phenomenon [16]. Among the different approaches for distributed estimation, we specifically consider diffusion-based protocols that exploit the spatial diversity of the network by restricting information sharing to neighboring nodes, without considering any central processing unit or a fusion center [16,17]. Diffusion protocols provide an inherently scalable data processing framework that is resilient to changes in network topology, such as link failures, as well as changes in the statistical properties of the unknown phenomenon that is measured [16]. However, the requirement for all nodes to exchange their current estimates with their neighbors at each iteration places a heavy burden on the available communication resources [18].

* Corresponding author.
E-mail addresses: utlu@ee.bilkent.edu.tr (I. Utlu), kilic@ee.bilkent.edu.tr (O. Fatih Kilic), kozat@ee.bilkent.edu.tr (S.S. Kozat).

Here, we propose novel event-triggered distributed estimation algorithms for communication-constrained applications that achieve up to a $10^3$-fold reduction in the communication load over the network. We achieve this by leveraging the uneven distribution of the events over time to efficiently reduce the communication load in real-life applications. In particular, we condition an information exchange between the neighboring nodes on the level-crossings of the diffused parameter [19], instead of using a fixed rate of diffusion, cf. [16,17]. Furthermore, we show that it is sufficient to only diffuse the information indicating the direction of the change in the levels, which can be handled using only two bits for a slowly-varying parameter.

Reduced communication diffusion is extensively studied in the signal processing literature [18,20–23]. In [18,20,21], the authors restrict the number of active links between neighbors using a probabilistic framework, or by adaptively choosing a single link of communication for each node. In [22], local estimates are randomly projected, and the information transfer between the nodes is reduced to a single bit. In [23], only certain dimensions of the parameter vector are transmitted. In this paper, in contrast, we reduce the communication load down to only a single bit or a couple of bits, unlike [18,20,21,23], in which the authors diffuse parameters in full precision. Furthermore, we regulate the frequency of information exchange depending on the rate of change of the parameter, unlike [22], where the authors transfer information at every time instant.

Fig. 1. An example distributed network with bidirectional connections. The circular area represents the neighborhood of the ith node.

http://dx.doi.org/10.1016/j.dsp.2017.05.011
1051-2004/© 2017 Elsevier Inc. All rights reserved.

Our main contributions are as follows. We introduce algorithms for distributed estimation that (i) significantly reduce the communication load on the network, while (ii) continuing to provide performance equal to the state of the art. We also perform the mean and mean-square stability analyses of our algorithms. Through numerical examples, we show that our algorithms provide a significant reduction in the communication load over the network.

The paper is organized as follows. In Section 2, we introduce the distributed estimation framework and discuss the adapt-then-combine (ATC) diffusion strategy. We further detail our algorithms in Section 3, where we formulate the level-triggered distributed estimation algorithm. In Section 4, we present the algorithmic description of the proposed scheme. In Sections 5 and 6, we provide, respectively, the mean and mean-square stability analyses of the proposed distributed adaptive filter and state the conditions for stability. We provide experimental verification of the algorithm in Section 7, and concluding remarks in Section 8.

2. Problem description

Consider a network with $N$ nodes that are distributed spatially as shown in Fig. 1. Each node sequentially observes a noise-corrupted transformation of an unknown parameter $\mathbf{w}_o$ through a linear model

$$d_{i,t} = \mathbf{u}_{i,t}^T \mathbf{w}_o + v_{i,t}, \quad i = 1, \dots, N \tag{1}$$

and diffuses information to its neighboring nodes $j \in \mathcal{N}_i$,$^1$ where $\mathbf{w}_o \in \mathbb{R}^M$ is the unknown phenomenon, with $\mathbf{u}_{i,t}$ and $v_{i,t}$ representing the regressor and the noise processes, respectively. The additive observation noise $v_{i,t}$ and the regressor $\mathbf{u}_{i,t}$ are assumed to be temporally and spatially independent, and independent of one another, with $E\left[\mathbf{u}_{i,t}\mathbf{u}_{i,t}^T\right] = \sigma_{u,i}^2 \mathbf{I}_M$ and $E\left[v_{i,t}^2\right] = \sigma_{v,i}^2$. For each node $i$, we assume that at time $t$ only the regressor $\mathbf{u}_{i,t}$ and the observation $d_{i,t}$, along with the parameter estimates from neighboring nodes $\boldsymbol{\phi}_{j,t}$, $j \in \mathcal{N}_i$, are available to it. Therefore each node incurs the cost for the parameter $\mathbf{w}$ [17]

$^1$ We represent vectors (matrices) by bold lower (upper) case letters. For a vector $\mathbf{a}$ (a matrix $\mathbf{A}$), $\mathbf{a}^T$ ($\mathbf{A}^T$) is the transpose. $\|\mathbf{a}\|$ represents the Euclidean norm. $\mathrm{diag}\{\mathbf{A}\}$ returns a new matrix with only the main diagonal of $\mathbf{A}$, while $\mathrm{diag}\{\mathbf{a}\}$ puts $\mathbf{a}$ on the main diagonal of the new matrix. $\mathrm{col}\{\mathbf{a}_1, \dots, \mathbf{a}_N\}$ produces a column vector formed by stacking its arguments on top of one another. $\mathbf{I}_M$ represents the $M \times M$ identity matrix. $\otimes$ stands for the Kronecker product, and $\mathrm{Tr}\{\cdot\}$ stands for the trace.

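$$J_i(\mathbf{w}) = \frac{1}{2} E\left|d_{i,t} - \mathbf{u}_{i,t}^T \mathbf{w}\right|^2 + \frac{1}{2} \sum_{j \in \mathcal{N}_i \setminus \{i\}} \alpha_{i,j} \left\|\mathbf{w} - \boldsymbol{\phi}_j\right\|_2^2, \tag{2}$$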

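where $\alpha_{i,j}$ is a non-negative, real coefficient satisfying $\sum_{j=1}^{N} \alpha_{i,j} = 1$ that assigns different weights to different neighbors. In order to minimize (2) in an online manner, we employ the stochastic gradient approach [24]. To this end, we calculate the gradient for (2) as

$$\left[\nabla_{\mathbf{w}} J_i(\mathbf{w})\right]^T = \mathbf{R}_{u,i}\mathbf{w} - \mathbf{R}_{du,i} + \sum_{j \in \mathcal{N}_i \setminus \{i\}} \alpha_{i,j} (\mathbf{w} - \boldsymbol{\phi}_j), \tag{3}$$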

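where $\mathbf{R}_{u,i} = E[\mathbf{u}_{i,t}\mathbf{u}_{i,t}^T]$ and $\mathbf{R}_{du,i} = E[\mathbf{u}_{i,t} d_{i,t}]$. Using the instantaneous approximations $\mathbf{R}_{u,i} \approx \mathbf{u}_{i,t}\mathbf{u}_{i,t}^T$ and $\mathbf{R}_{du,i} \approx \mathbf{u}_{i,t} d_{i,t}$ in (3), we obtain an approximate expression for the gradient of the cost function in (3) as

$$\left[\nabla_{\mathbf{w}} J_i(\mathbf{w})\right]^T \approx \mathbf{u}_{i,t}\left(\mathbf{u}_{i,t}^T \mathbf{w}_{i,t} - d_{i,t}\right) - \sum_{j \in \mathcal{N}_i \setminus \{i\}} \alpha_{i,j} (\boldsymbol{\phi}_j - \mathbf{w}_{i,t}). \tag{4}$$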

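Considering that we are optimizing a sum of two convex cost functions in (2) with the use of (4), we note that we can carry out the optimization using incremental solutions over (2), where the update is performed in two steps. Since we consider the adapt-then-combine (ATC) diffusion strategy in this paper, we first create an intermediate estimate by using the gradient of the first summand in (2) and then update the estimate using the second summand in (2) as [17]

$$\boldsymbol{\phi}_{i,t+1} = \mathbf{w}_{i,t} + \mu_i \mathbf{u}_{i,t}\left(d_{i,t} - \mathbf{u}_{i,t}^T \mathbf{w}_{i,t}\right), \tag{5}$$

$$\mathbf{w}_{i,t+1} = \boldsymbol{\phi}_{i,t+1} + \eta_i \sum_{j \in \mathcal{N}_i \setminus \{i\}} \alpha_{i,j} \left(\boldsymbol{\phi}_{j,t+1} - \boldsymbol{\phi}_{i,t+1}\right), \tag{6}$$

where $\mu_i$ and $\eta_i$ are positive step sizes. Note that we have replaced the estimates coming from neighbors $\boldsymbol{\phi}_j$ with their instantaneous approximations $\boldsymbol{\phi}_{j,t+1}$. Now, we represent the equation in (6) as

$$\mathbf{w}_{i,t+1} = \sum_{j \in \mathcal{N}_i} p_{i,j} \boldsymbol{\phi}_{j,t+1}, \tag{7}$$

where we have defined $p_{i,i} = \left(1 - \sum_{j \in \mathcal{N}_i \setminus \{i\}} \eta_i \alpha_{i,j}\right)$ and $p_{i,j} = \eta_i \alpha_{i,j}$ for $j \neq i$ to obtain (7), yielding the network matrix $\mathbf{P} = [p_{i,j}]$ comprised of the combination weights, which satisfy $\sum_{j=1}^{N} p_{i,j} = 1$ with $p_{i,j} \geq 0$.

3. Distributed estimation with level triggered sampling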

The well-known ATC full diffusion scheme (7) requires all nodes in the network to communicate their current estimates (i) in their entirety, and (ii) at a fixed rate to all their neighboring nodes [17]. We propose a new scheme, which achieves an increased communication efficiency by conditioning the diffusion of information on the trigger of an event, instead of relying on a fixed rate of diffusion. Our approach considerably reduces the load on communication resources since only "significant changes" in the diffused parameter, e.g., an abrupt change in the local estimate, are conveyed based on the particular realization of the signal.

To clarify the framework, we consider the diffusion of a scalar parameter $\xi_{i,t}$ from a given node $i$ to a neighboring node $j$. As an example, this information can be a single component of the estimates [23], or the error associated with an additional estimation layer [22]. In our distributed framework, due to communication constraints, a quantized version of the original parameter, $\xi^q_{i,t}$, is shared. We aim to form a quantization scheme which guarantees that $\xi_{i,t}$ and $\xi^q_{i,t}$ are approximately equal to each other for all $t$, while at the same time keeping the load on communication resources relatively small.

Fig. 2. Illustration of the operation of the LC quantizer. Blue dots represent the original node estimates, while red ones represent the quantized version of the corresponding estimates. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

To solve this problem, we propose an event-triggered communication algorithm in which we specifically use level-crossing (LC) quantization [19] as the event-triggering mechanism. To clarify the framework, suppose we have a discrete-time signal $\xi_{i,t}$, as shown in Fig. 2, that represents the information to be communicated from the node $i$ to the node $j$, e.g., the estimated parameter, or the estimation error. In conventional quantization, at each time instant, we sample and quantize this parameter. On the other hand, in the LC quantization, we consider a set of levels $\mathcal{S} \triangleq \{l_1, \dots, l_K\}$, which is illustrated in Fig. 2. At each discrete time index $t$, the node $i$ checks whether a level-crossing has occurred on $\xi_{i,t}$. When the parameter $\xi_{i,t}$ crosses a level $l_{i,t}$, i.e., $(\xi_{i,t-1} - l_{i,t})(\xi_{i,t} - l_{i,t}) < 0$ for some $l_{i,t} \in \mathcal{S}$, the node $i$ transmits information to its neighboring nodes. For example, this information can be the direction of the level-crossing [19]. A neighboring node $j$ uses this received information to form an estimate $\xi^q_{i,t}$ for $\xi_{i,t}$.

If there is an information transfer by the node $i$ at time $t$, the receiving node $j$ estimates the parameter as the level through which a level crossing has occurred:

$$\xi^q_{i,t} = l_{i,t}. \tag{8}$$

For the time instants when the node $i$ is silent, the node $j$ infers that no significant change in the parameter has taken place, and uses the estimated parameter value from the previous time instant:

$$\xi^q_{i,t} = \xi^q_{i,t-1}. \tag{9}$$

We note that the set of levels $\mathcal{S}$ is known by all nodes in the network. Hence, as the diffused information, it is sufficient for the node $i$ to only convey how $\xi^q_{i,t}$ changes compared to the previously-crossed level $\xi^q_{i,t-1}$. In particular, we note the following two cases. In the first case, the parameter $\xi_{i,t}$ changes slowly enough that a crossing through multiple levels does not occur, so that the node $i$ only needs to indicate the direction of the change in levels. Therefore, we transmit two bits for this case, one indicating that a single level crossing occurs and the other indicating the direction of the crossing. In the second case, we may have multiple crossings, where we directly code the full location information of the new level value $\xi^q_{i,t}$, together with a flag bit indicating that a multiple level crossing occurred, using $\lceil \log_2(K) \rceil + 1$ bits. As shown, this approach significantly lowers the amount of communication while maintaining estimation performance.
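The two-case encoding described above can be sketched in Python for a single scalar parameter. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `lc_encode`, the representation of the level set as a sorted list, and the choice of the crossed level nearest to the new sample as the reconstruction value when several levels are crossed are all hypothetical choices made for this example.

```python
import math


def lc_encode(ref_idx, value, levels):
    """Level-crossing encoder for one scalar sample (illustrative sketch).

    levels  : sorted list of the K levels shared by all nodes
    ref_idx : index of the most recently crossed level (the reference)
    value   : current sample of the diffused parameter

    Returns (new_ref_idx, bits_sent). bits_sent is 0 when the node stays silent.
    """
    ref = levels[ref_idx]
    # A level l is crossed when (ref - l) * (value - l) < 0, i.e. when l lies
    # strictly between the reference and the new sample.
    crossed = [k for k, l in enumerate(levels) if (ref - l) * (value - l) < 0]
    if not crossed:
        return ref_idx, 0  # silent: the receiver keeps the previous level
    # Assumption: reconstruct at the crossed level nearest to the new sample.
    new_idx = max(crossed) if value > ref else min(crossed)
    if abs(new_idx - ref_idx) == 1:
        return new_idx, 2  # one flag bit plus one direction bit
    # Multiple crossing: flag bit plus the full index of the new level.
    return new_idx, math.ceil(math.log2(len(levels))) + 1
```

With five levels, a silent sample costs no bits, an adjacent-level crossing costs two bits, and a jump across several levels costs $\lceil \log_2 5 \rceil + 1 = 4$ bits.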

4. Algorithm description

In this section, we present the full algorithmic description of the proposed diffusion scheme with the level-crossing quantization [19]. At time $t$, a given node $i$ in the network makes the scalar observation $d_{i,t}$ through the linear model $d_{i,t} = \mathbf{u}_{i,t}^T \mathbf{w}_o + v_{i,t}$, which is then used to update its intermediary local estimate using the LMS adaptation

$$\boldsymbol{\phi}_{i,t+1} = \left(\mathbf{I}_M - \mu_i \mathbf{u}_{i,t}\mathbf{u}_{i,t}^T\right)\mathbf{w}_{i,t} + \mu_i \mathbf{u}_{i,t} d_{i,t}.$$

Due to the quantized communication framework, a neighboring node $j$ does not have access to the true value of the parameter $\boldsymbol{\phi}_{i,t+1}$, which has $M$ entries. As such, based on the limited information it receives from the node $i$, the node $j$ tries to estimate this parameter as the $M$-entry vector $\boldsymbol{\phi}^q_{i,t+1}$. Specifically, in the LC quantization, the node $j$ receives information about how the current values of the entries of the parameter $\boldsymbol{\phi}_{i,t+1}$ have changed relative to the most recent estimate the node $j$ has access to, namely $\boldsymbol{\phi}^q_{i,t}$. The node $i$ records the most recent estimate, $\boldsymbol{\phi}^q_{i,t}$, as a reference and diffuses information to the neighboring nodes $j \in \mathcal{N}_i$ indicating how the current estimate $\boldsymbol{\phi}_{i,t+1}$ compares to this reference on a per-entry basis. In particular, the node $i$ makes this comparison by checking for a level crossing between corresponding entries of the two vector quantities $\boldsymbol{\phi}^q_{i,t}$ and $\boldsymbol{\phi}_{i,t+1}$. If there is a level crossing on an entry, the node $i$ transmits information to its neighbors through a channel frequency allocated to this particular entry. If there is a single level-crossing, this information indicates the direction of the level crossing; otherwise, the transmitted information directly specifies the location of the new level. A neighboring node $j$ then constructs the estimate $\boldsymbol{\phi}^q_{i,t+1}$ using (8) or (9) on a per-entry basis, depending on whether the node $i$ diffuses information or not, respectively, at time $t$.

While diffusing information related to its own local estimate, the node $i$ also receives information from the neighboring nodes $j$ representing their local estimates $\boldsymbol{\phi}_{j,t+1}$. For each neighboring node $j$, the node $i$ uses this diffused information to reconstruct $\boldsymbol{\phi}^q_{j,t+1}$ using (8) or (9). The final estimate $\mathbf{w}_{i,t+1}$ is then constructed using the combination

$$\mathbf{w}_{i,t+1} = p_{i,i}\boldsymbol{\phi}_{i,t+1} + \sum_{j \in \mathcal{N}_i \setminus \{i\}} p_{i,j}\boldsymbol{\phi}^q_{j,t+1}.$$

Remark. In order to keep the presentation clear, we illustrate the special case of $M = 1$ of the proposed algorithm in Algorithm 1, which can be generalized to arbitrary $M$ in a straightforward manner.

Remark. We note that an alternative approach to dealing with the $M > 1$ case is to have the nodes in the network transmit only a certain entry of their intermediary estimates $\boldsymbol{\phi}_{i,t}$. As an example, in this case, the nodes can cycle through different entries across time in a round-robin fashion. The non-communicated entries are replaced by the corresponding entries in the local intermediary estimate [23]. This approach is explored in Section 7.
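The scheme of this section, in the scalar case $M = 1$, can be sketched as a short Python simulation. The sketch below is only an illustration under assumptions that are not taken from the paper: a fully connected three-node network, a self-weight of 0.5, uniformly spaced levels with spacing 0.01, a common step size of 0.1, and the hypothetical helper names `last_crossed_level`, `atc_lc_step` and `simulate`. When several levels are crossed at once, the sketch reconstructs at the crossed level nearest to the new estimate, which is also an assumption.

```python
import random


def last_crossed_level(ref, value, levels):
    """Return the crossed level nearest to `value`, or None if none is crossed."""
    crossed = [l for l in levels if (ref - l) * (value - l) < 0]
    if not crossed:
        return None
    return max(crossed) if value > ref else min(crossed)


def atc_lc_step(w, phi_q, u, d, mu, P, levels):
    """One network-wide iteration for scalar estimates (M = 1)."""
    n = len(w)
    # Local LMS adaptation at every node.
    phi = [(1 - mu * u[i] ** 2) * w[i] + mu * u[i] * d[i] for i in range(n)]
    # Level-crossing check against each node's own reference level.
    new_phi_q, transmissions = list(phi_q), 0
    for i in range(n):
        level = last_crossed_level(phi_q[i], phi[i], levels)
        if level is not None:  # event: diffuse and update the reference
            new_phi_q[i] = level
            transmissions += 1
    # Combination: own intermediate estimate plus neighbors' quantized ones.
    new_w = [P[i][i] * phi[i]
             + sum(P[i][j] * new_phi_q[j] for j in range(n) if j != i)
             for i in range(n)]
    return new_w, new_phi_q, transmissions


def simulate(w_o=0.5, n=3, steps=3000, mu=0.1, noise_std=0.05, seed=0):
    """Run the sketch on a fully connected toy network (illustrative values)."""
    rng = random.Random(seed)
    levels = [k * 0.01 for k in range(-100, 101)]
    P = [[0.5 if i == j else 0.5 / (n - 1) for j in range(n)] for i in range(n)]
    w, phi_q, total = [0.0] * n, [0.0] * n, 0
    for _ in range(steps):
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        d = [u[i] * w_o + rng.gauss(0.0, noise_std) for i in range(n)]
        w, phi_q, sent = atc_lc_step(w, phi_q, u, d, mu, P, levels)
        total += sent
    return w, total
```

Calling `simulate()` returns the final node estimates and the number of transmissions over the run; a node that stays silent contributes nothing to that count, which is where the communication saving comes from.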

5. Mean stability analysis

To continue with the stability analysis of the proposed scheme, we assume that the regressors $\mathbf{u}_{i,t}$ are temporally and spatially independent, zero mean and white, with covariance matrix $\boldsymbol{\Lambda}_i \triangleq E\left[\mathbf{u}_{i,t}\mathbf{u}_{i,t}^T\right] = \sigma_{u,i}^2 \mathbf{I}_M$. The observation $d_{i,t}$ at node $i$ is assumed to follow a linear model of the form

$$d_{i,t} = \mathbf{u}_{i,t}^T \mathbf{w}_o + v_{i,t}, \tag{10}$$

where $\{v_{i,t}\}_{t \geq 1}$ is a zero mean white Gaussian noise process with variance $\sigma_{v,i}^2$, independent of $\{\mathbf{u}_{j,t}\}_{t \geq 1}$ $\forall i, j$.

Algorithm 1 ATC diffusion LMS with the LC quantization, $M = 1$.

    for $i = 1$ to $N$ do
        Initialization: $w_{i,0} = \phi^q_{i,0} = 0$
    end for
    for $t \geq 0$ do
        for $i = 1$ to $N$ do
            Local adaptation:
                $\phi_{i,t+1} = (1 - \mu_i u_{i,t}^2) w_{i,t} + \mu_i u_{i,t} d_{i,t}$
            Check for level crossing:
                if $\exists\, l_{i,t} \in \mathcal{S}$ such that $(\phi^q_{i,t} - l_{i,t})(\phi_{i,t+1} - l_{i,t}) < 0$ then
                    if the crossing is to an adjacent level then
                        Diffuse the direction of the crossing
                    else
                        Diffuse the location of the new level
                    end if
                    Locally store $\phi^q_{i,t+1} = l_{i,t}$ in record
                else
                    Remain silent
                    Locally set $\phi^q_{i,t+1} = \phi^q_{i,t}$
                end if
            Reconstruction:
                for all $j \in \mathcal{N}_i \setminus \{i\}$ do
                    if node $j$ is silent then
                        Reconstruct as $\phi^q_{j,t+1} = \phi^q_{j,t}$
                    else
                        Reconstruct $\phi^q_{j,t+1}$ using the diffused information
                    end if
                end for
            Combination:
                $w_{i,t+1} = p_{i,i}\phi_{i,t+1} + \sum_{j \in \mathcal{N}_i \setminus \{i\}} p_{i,j}\phi^q_{j,t+1}$
        end for
    end for

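In our proposed level-triggered estimation framework, at each node $i$, the diffusion LMS update for the ATC strategy takes the form

$$\boldsymbol{\phi}_{i,t+1} = \left(\mathbf{I}_M - \mu_i \mathbf{u}_{i,t}\mathbf{u}_{i,t}^T\right)\mathbf{w}_{i,t} + \mu_i \mathbf{u}_{i,t} d_{i,t}, \tag{11}$$

$$\mathbf{w}_{i,t+1} = p_{i,i}\boldsymbol{\phi}_{i,t+1} + \sum_{j \in \mathcal{N}_i \setminus \{i\}} p_{i,j}\boldsymbol{\phi}^q_{j,t+1}, \tag{12}$$

where the combination matrix $\mathbf{P}$ is taken to be stochastic, with its rows summing up to unity. We rewrite the expressions (11) and (12) as

$$\boldsymbol{\phi}_{i,t+1} = \left(\mathbf{I}_M - \mu_i \mathbf{u}_{i,t}\mathbf{u}_{i,t}^T\right)\mathbf{w}_{i,t} + \mu_i \mathbf{u}_{i,t} d_{i,t}, \tag{13}$$

$$\mathbf{w}_{i,t+1} = \sum_{j \in \mathcal{N}_i} p_{i,j}\boldsymbol{\phi}_{j,t+1} - \sum_{j \in \mathcal{N}_i \setminus \{i\}} p_{i,j}\boldsymbol{\alpha}_{j,t+1}, \tag{14}$$

by defining the quantization error for node $j$

$$\boldsymbol{\alpha}_{j,t} \triangleq \boldsymbol{\phi}_{j,t} - \boldsymbol{\phi}^q_{j,t}.$$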

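We represent the diffusion update over the network $\mathcal{N}$ in state-space form by introducing the following global quantities:

$$
\begin{aligned}
\mathbf{d}_t &\triangleq \mathrm{col}\{d_{1,t}, \dots, d_{N,t}\}, & \mathbf{v}_t &\triangleq \mathrm{col}\{v_{1,t}, \dots, v_{N,t}\},\\
\mathbf{w}_o &\triangleq \mathrm{col}\{\mathbf{w}_o, \dots, \mathbf{w}_o\}, & \mathbf{U}_t &\triangleq \mathrm{diag}\{\mathbf{u}_{1,t}, \dots, \mathbf{u}_{N,t}\},\\
\mathbf{M} &\triangleq \mathrm{diag}\{\mu_1 \mathbf{I}_M, \dots, \mu_N \mathbf{I}_M\}, & \mathbf{w}_t &\triangleq \mathrm{col}\{\mathbf{w}_{1,t}, \dots, \mathbf{w}_{N,t}\},\\
\boldsymbol{\phi}_t &\triangleq \mathrm{col}\{\boldsymbol{\phi}_{1,t}, \dots, \boldsymbol{\phi}_{N,t}\}, & \boldsymbol{\phi}^q_t &\triangleq \mathrm{col}\{\boldsymbol{\phi}^q_{1,t}, \dots, \boldsymbol{\phi}^q_{N,t}\},\\
\boldsymbol{\alpha}_t &\triangleq \mathrm{col}\{\boldsymbol{\alpha}_{1,t}, \dots, \boldsymbol{\alpha}_{N,t}\}, & \mathbf{G} &\triangleq \mathbf{P} \otimes \mathbf{I}_M,\\
\mathbf{P}_C &\triangleq \mathbf{P} - \mathrm{diag}\{\mathbf{P}\}, & \mathbf{G}_C &\triangleq \mathbf{P}_C \otimes \mathbf{I}_M.
\end{aligned}
$$

Here the left-hand side $\mathbf{w}_o$ denotes the $MN \times 1$ stacked vector formed from $N$ copies of the unknown parameter.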

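Using the above-defined quantities, the diffusion updates (13), (14) take the following global state-space form:

$$\boldsymbol{\phi}_{t+1} = \left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\mathbf{w}_t + \mathbf{M}\mathbf{U}_t\mathbf{d}_t, \tag{15}$$

$$\mathbf{w}_{t+1} = \mathbf{G}\boldsymbol{\phi}_{t+1} - \mathbf{G}_C\boldsymbol{\alpha}_{t+1}. \tag{16}$$

Similarly, the data model (10) can be expressed in terms of the global quantities as

$$\mathbf{d}_t = \mathbf{U}_t^T\mathbf{w}_o + \mathbf{v}_t. \tag{17}$$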

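To facilitate the mean stability analysis, we define the global deviation parameters

$$\tilde{\mathbf{w}}_t \triangleq \mathbf{w}_o - \mathbf{w}_t, \qquad \tilde{\boldsymbol{\phi}}_t \triangleq \mathbf{w}_o - \boldsymbol{\phi}_t.$$

After substituting (17) and subtracting both sides of (15), (16) from $\mathbf{w}_o$, the diffusion updates in terms of the deviation parameters take the following form:

$$\tilde{\boldsymbol{\phi}}_{t+1} = \left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\tilde{\mathbf{w}}_t - \mathbf{M}\mathbf{U}_t\mathbf{v}_t, \tag{18}$$

$$\tilde{\mathbf{w}}_{t+1} = \mathbf{G}\tilde{\boldsymbol{\phi}}_{t+1} + \mathbf{G}_C\boldsymbol{\alpha}_{t+1}, \tag{19}$$

where we have used the relation $\mathbf{G}\mathbf{w}_o = \mathbf{w}_o$, which results from the stochastic nature of $\mathbf{P}$.

The expressions (18), (19) can be expressed compactly as

$$\tilde{\mathbf{w}}_{t+1} = \mathbf{G}\left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\tilde{\mathbf{w}}_t - \mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t + \mathbf{G}_C\boldsymbol{\alpha}_{t+1}. \tag{20}$$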

Assumption. The quantization error over the network $\boldsymbol{\alpha}_t$ has zero mean. This is a reasonable assumption for the analysis of quantization effects [24]. The applicability of the assumption is verified by our experiments in Section 7.

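Taking expectations of both sides of (20) yields

$$E\left[\tilde{\mathbf{w}}_{t+1}\right] = \mathbf{G}\left(\mathbf{I}_{MN} - \mathbf{M}\boldsymbol{\Lambda}\right)E\left[\tilde{\mathbf{w}}_t\right], \tag{21}$$

where $\boldsymbol{\Lambda} \triangleq \mathrm{diag}\{\boldsymbol{\Lambda}_1, \dots, \boldsymbol{\Lambda}_N\}$ is block diagonal. For mean stability and asymptotic unbiasedness of the distributed filter (11)–(12), we require that the spectral radius satisfy $\rho\left(\mathbf{G}(\mathbf{I}_{MN} - \mathbf{M}\boldsymbol{\Lambda})\right) < 1$, which, noting that $\mathbf{G}$ is stochastic with nonnegative entries, is equivalent to requiring

$$\rho\left(\mathbf{I}_{MN} - \mathbf{M}\boldsymbol{\Lambda}\right) < 1, \tag{22}$$

by Theorem 4.4 of [25]. Noting that the eigenvalues of the block diagonal matrix $\mathbf{I}_{MN} - \mathbf{M}\boldsymbol{\Lambda}$ are the union of the eigenvalues of its individual blocks $\mathbf{I}_M - \mu_i\boldsymbol{\Lambda}_i$, where $\boldsymbol{\Lambda}_i = \sigma_{u,i}^2\mathbf{I}_M$, we conclude that the distributed filter is mean stable if $|1 - \mu_i\sigma_{u,i}^2| < 1$, $i = 1, \dots, N$, i.e., if

$$0 < \mu_i < \frac{2}{\sigma_{u,i}^2}, \quad i = 1, \dots, N.$$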
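The step-size condition above can be checked numerically with a short Python sketch. It is an illustration only, restricted to the scalar case $M = 1$, where $\mathbf{G} = \mathbf{P}$ and $\mathbf{I}_{MN} - \mathbf{M}\boldsymbol{\Lambda}$ reduces to $\mathrm{diag}\{1 - \mu_i\sigma_{u,i}^2\}$. The function names `mean_stable_step_sizes` and `mean_error_recursion` are hypothetical and do not come from the paper.

```python
def mean_stable_step_sizes(mu, sigma_u2):
    """Check 0 < mu_i < 2 / sigma_{u,i}^2 for every node."""
    return all(0 < m < 2 / s for m, s in zip(mu, sigma_u2))


def mean_error_recursion(P, mu, sigma_u2, e0, steps):
    """Iterate the mean recursion (21) for M = 1.

    With scalar estimates the recursion reads
        E[w~_{t+1}] = P diag(1 - mu_i sigma_{u,i}^2) E[w~_t].
    """
    e = list(e0)
    n = len(e)
    for _ in range(steps):
        scaled = [(1 - mu[i] * sigma_u2[i]) * e[i] for i in range(n)]
        e = [sum(P[i][j] * scaled[j] for j in range(n)) for i in range(n)]
    return e
```

For example, with two nodes, regressor variances 0.09 and 0.64 and step sizes of 0.05, the condition holds and the mean deviation decays toward zero, whereas step sizes of 4 violate the bound at the second node.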

6. Mean-square stability

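We utilize the weighted energy relation approach [24] to proceed with the mean-square transient analysis of the distributed filter. Through a positive-definite weighting matrix $\boldsymbol{\Sigma}$, taking the weighted norm of both sides of (20) yields:

$$
\begin{aligned}
\tilde{\mathbf{w}}_{t+1}^T\boldsymbol{\Sigma}\tilde{\mathbf{w}}_{t+1} ={}& \tilde{\mathbf{w}}_t^T\left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)^T\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\tilde{\mathbf{w}}_t\\
&- 2\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\tilde{\mathbf{w}}_t\\
&+ 2\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}\left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\tilde{\mathbf{w}}_t\\
&- 2\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}_C\boldsymbol{\alpha}_{t+1}\\
&+ \mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t + \boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}_C\boldsymbol{\alpha}_{t+1}.
\end{aligned}
\tag{23}
$$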

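Noting that $\mathbf{v}_t$ is zero-mean and independent of $\mathbf{U}_t$ and $\tilde{\mathbf{w}}_t$, and taking the expected value of both sides of (23) yields the following variance relation:

$$
\begin{aligned}
E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\boldsymbol{\Sigma}} ={}& E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\Sigma}'} + 2E\left[\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}\left(\mathbf{I}_{MN} - \mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right)\tilde{\mathbf{w}}_t\right]\\
&- 2E\left[\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}_C\boldsymbol{\alpha}_{t+1}\right] + E\left[\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\right]\\
&+ E\left[\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}_C\boldsymbol{\alpha}_{t+1}\right],
\end{aligned}
\tag{24}
$$

where

$$\boldsymbol{\Sigma}' \triangleq \mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G} - \mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{U}_t^T - \mathbf{U}_t\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G} + \mathbf{U}_t\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{U}_t^T.$$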

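By the temporal independence of the regressor process $\mathbf{U}_t$ and the independence of the noise process $\mathbf{v}_t$ from $\mathbf{U}_t$, we have the result that $\mathbf{U}_t$ is independent of $\tilde{\mathbf{w}}_t$. Hence, the random weighting matrix $\boldsymbol{\Sigma}'$ can be replaced by its mean value $E[\boldsymbol{\Sigma}']$ in (24), which we continue to denote by $\boldsymbol{\Sigma}'$. Thus,

$$\boldsymbol{\Sigma}' = \mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G} - \mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\boldsymbol{\Lambda} - \boldsymbol{\Lambda}\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G} + E\left[\mathbf{U}_t\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right], \tag{25}$$

where $\boldsymbol{\Lambda} \triangleq E\left[\mathbf{U}_t\mathbf{U}_t^T\right]$. Substituting the $\tilde{\boldsymbol{\phi}}_{t+1}$ expression from (18) into (24) yields the following final form of the variance relation:

$$
\begin{aligned}
E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\boldsymbol{\Sigma}} ={}& E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\Sigma}'} + 2E\left[\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}\tilde{\boldsymbol{\phi}}_{t+1}\right] + E\left[\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}_C\boldsymbol{\alpha}_{t+1}\right]\\
&+ E\left[\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\right].
\end{aligned}
\tag{26}
$$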

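To capture the mean-square behavior of the adaptive network, we express the relations (25), (26) in a compact form by using the convenient vector notation [24]. In particular, we use the $\mathrm{bvec}\{\cdot\}$ block vectorization operation [16], which transforms an arbitrary $MN \times MN$ block matrix $\boldsymbol{\Sigma}$ with the $(i,j)$th block $\boldsymbol{\Sigma}_{ij}$ of size $M \times M$ into the vector $\mathrm{col}\{\boldsymbol{\sigma}_1, \dots, \boldsymbol{\sigma}_N\}$, where $\boldsymbol{\sigma}_j \triangleq \mathrm{col}\{\mathrm{vec}\{\boldsymbol{\Sigma}_{1j}\}, \dots, \mathrm{vec}\{\boldsymbol{\Sigma}_{Nj}\}\}$. We also use the block Kronecker product $\mathbf{A} \odot \mathbf{B}$, defined as having the $(i,j)$th block

$$
[\mathbf{A} \odot \mathbf{B}]_{ij} =
\begin{bmatrix}
\mathbf{A}_{ij} \otimes \mathbf{B}_{11} & \dots & \mathbf{A}_{ij} \otimes \mathbf{B}_{1N}\\
\vdots & \ddots & \vdots\\
\mathbf{A}_{ij} \otimes \mathbf{B}_{N1} & \dots & \mathbf{A}_{ij} \otimes \mathbf{B}_{NN}
\end{bmatrix},
\tag{27}
$$

which is related to the $\mathrm{bvec}\{\cdot\}$ operator via $\mathrm{bvec}\{\mathbf{A}\mathbf{B}\mathbf{C}\} = (\mathbf{C}^T \odot \mathbf{A})\,\mathrm{bvec}\{\mathbf{B}\}$. Defining $\boldsymbol{\sigma} \triangleq \mathrm{bvec}\{\boldsymbol{\Sigma}\}$ and vectorizing both sides of (25) yields

$$
\begin{aligned}
\mathrm{bvec}\{\boldsymbol{\Sigma}'\} ={}& \left((\mathbf{I}_{MN} \odot \mathbf{I}_{MN}) - (\boldsymbol{\Lambda}\mathbf{M} \odot \mathbf{I}_{MN}) - (\mathbf{I}_{MN} \odot \boldsymbol{\Lambda}\mathbf{M})\right)(\mathbf{G}^T \odot \mathbf{G}^T)\boldsymbol{\sigma}\\
&+ \mathrm{bvec}\left\{E\left[\mathbf{U}_t\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right]\right\}.
\end{aligned}
\tag{28}
$$

The term $E\left[\mathbf{U}_t\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{U}_t^T\right]$ on the right-hand side of (28) can be vectorized by resorting to the Gaussian factorization theorem [16,17]. We let $\tilde{\boldsymbol{\Sigma}} = \mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}$ with $(i,j)$th block $\tilde{\boldsymbol{\Sigma}}_{i,j}$ and with the vectorized form $\mathrm{bvec}\{\tilde{\boldsymbol{\Sigma}}\} = \mathrm{col}\{\tilde{\boldsymbol{\sigma}}_1, \dots, \tilde{\boldsymbol{\sigma}}_N\}$, where $\tilde{\boldsymbol{\sigma}}_j = \mathrm{col}\{\tilde{\boldsymbol{\sigma}}_{1j}, \dots, \tilde{\boldsymbol{\sigma}}_{Nj}\}$. Then, the $(k,l)$th block $\boldsymbol{\Gamma}_{kl}$ of $\boldsymbol{\Gamma} \triangleq E\left[\mathbf{U}_t\mathbf{U}_t^T\tilde{\boldsymbol{\Sigma}}\mathbf{U}_t\mathbf{U}_t^T\right]$ is given by

$$
\boldsymbol{\Gamma}_{kl} =
\begin{cases}
\boldsymbol{\Lambda}_k\tilde{\boldsymbol{\Sigma}}_{kl}\boldsymbol{\Lambda}_l & \text{for } k \neq l,\\
\boldsymbol{\Lambda}_k\tilde{\boldsymbol{\Sigma}}_{kl}\boldsymbol{\Lambda}_k + 2\boldsymbol{\Lambda}_k\mathrm{Tr}\{\tilde{\boldsymbol{\Sigma}}_{kk}\boldsymbol{\Lambda}_k\} & \text{for } k = l,
\end{cases}
$$

with the vectorized form

$$
\boldsymbol{\gamma}_{kl} =
\begin{cases}
(\boldsymbol{\Lambda}_l \otimes \boldsymbol{\Lambda}_k)\tilde{\boldsymbol{\sigma}}_{kl} & \text{for } k \neq l,\\
\left((\boldsymbol{\Lambda}_l \otimes \boldsymbol{\Lambda}_k) + 2\mathbf{r}_k\mathbf{r}_k^T\right)\tilde{\boldsymbol{\sigma}}_{kl} & \text{for } k = l,
\end{cases}
$$

by the factorization theorem, where $\boldsymbol{\Lambda}_k \triangleq E\left[\mathbf{u}_{k,t}\mathbf{u}_{k,t}^T\right]$ and $\mathbf{r}_k \triangleq \mathrm{vec}\{\boldsymbol{\Lambda}_k\}$. Letting $\mathrm{bvec}\{\boldsymbol{\Gamma}\} = \mathrm{col}\{\boldsymbol{\gamma}_1, \dots, \boldsymbol{\gamma}_N\}$, where $\boldsymbol{\gamma}_j = \mathrm{col}\{\boldsymbol{\gamma}_{1j}, \dots, \boldsymbol{\gamma}_{Nj}\}$, we observe that we can express $\boldsymbol{\gamma}_j$ in the form

$$\boldsymbol{\gamma}_j = \mathcal{A}_j\tilde{\boldsymbol{\sigma}}_j,$$

where $\mathcal{A}_j \triangleq \mathrm{diag}\left\{\boldsymbol{\Lambda}_j \otimes \boldsymbol{\Lambda}_1, \dots, \boldsymbol{\Lambda}_j \otimes \boldsymbol{\Lambda}_j + 2\mathbf{r}_j\mathbf{r}_j^T, \dots, \boldsymbol{\Lambda}_j \otimes \boldsymbol{\Lambda}_N\right\}$. Further defining $\mathcal{A} \triangleq \mathrm{diag}\{\mathcal{A}_1, \dots, \mathcal{A}_N\}$, we arrive at the representation

$$\mathrm{bvec}\{\boldsymbol{\Gamma}\} = \mathcal{A}\,\mathrm{bvec}\{\tilde{\boldsymbol{\Sigma}}\} = \mathcal{A}(\mathbf{M} \odot \mathbf{M})(\mathbf{G}^T \odot \mathbf{G}^T)\boldsymbol{\sigma}. \tag{29}$$

Substituting (29) into (28) yields

$$\mathrm{bvec}\{\boldsymbol{\Sigma}'\} = \left((\mathbf{I}_{MN} \odot \mathbf{I}_{MN}) - (\boldsymbol{\Lambda}\mathbf{M} \odot \mathbf{I}_{MN}) - (\mathbf{I}_{MN} \odot \boldsymbol{\Lambda}\mathbf{M}) + \mathcal{A}(\mathbf{M} \odot \mathbf{M})\right)(\mathbf{G}^T \odot \mathbf{G}^T)\boldsymbol{\sigma}. \tag{30}$$

The term $E\left[\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\right]$ in (26) can be verified to be

$$
\begin{aligned}
E\left[\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\right] &= E\left[\mathrm{Tr}\{\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\}\right]\\
&= E\left[\mathrm{Tr}\{\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\}\right]\\
&= \mathrm{Tr}\{\mathbf{G}\mathbf{M}\mathbf{H}\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\},
\end{aligned}
\tag{31}
$$

where we have defined $\mathbf{H} = E\left[\mathbf{U}_t\mathbf{v}_t\mathbf{v}_t^T\mathbf{U}_t^T\right]$. We observe that $\mathbf{H}$ has the $(k,l)$th block $\mathbf{H}_{kl} = \sigma_{v,k}^2\boldsymbol{\Lambda}_k\delta_{kl}$, which yields $\mathbf{H} = (\boldsymbol{\Sigma}_v \otimes \mathbf{I}_M)\boldsymbol{\Lambda}$, where $\boldsymbol{\Sigma}_v \triangleq E\left[\mathbf{v}_t\mathbf{v}_t^T\right]$. Thus (31) becomes

$$
\begin{aligned}
E\left[\mathbf{v}_t^T\mathbf{U}_t^T\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\mathbf{G}\mathbf{M}\mathbf{U}_t\mathbf{v}_t\right] &= \mathrm{Tr}\{\mathbf{G}\mathbf{M}(\boldsymbol{\Sigma}_v \otimes \mathbf{I}_M)\boldsymbol{\Lambda}\mathbf{M}\mathbf{G}^T\boldsymbol{\Sigma}\}\\
&= \left((\mathbf{G}\mathbf{M} \odot \mathbf{G}\mathbf{M})\,\mathrm{bvec}\{(\boldsymbol{\Sigma}_v \otimes \mathbf{I}_M)\boldsymbol{\Lambda}\}\right)^T\boldsymbol{\sigma}.
\end{aligned}
\tag{32}
$$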

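Similarly, the remaining terms on the RHS of (26) can be verified to be

$$
\begin{aligned}
E\left[\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}\tilde{\boldsymbol{\phi}}_{t+1}\right] &= \left((\mathbf{G} \odot \mathbf{G}_C)\,\mathrm{bvec}\{E[\boldsymbol{\alpha}_{t+1}\tilde{\boldsymbol{\phi}}_{t+1}^T]\}\right)^T\boldsymbol{\sigma},\\
E\left[\boldsymbol{\alpha}_{t+1}^T\mathbf{G}_C^T\boldsymbol{\Sigma}\mathbf{G}_C\boldsymbol{\alpha}_{t+1}\right] &= \left((\mathbf{G}_C \odot \mathbf{G}_C)\,\mathrm{bvec}\{E[\boldsymbol{\alpha}_{t+1}\boldsymbol{\alpha}_{t+1}^T]\}\right)^T\boldsymbol{\sigma}.
\end{aligned}
\tag{33}
$$

Defining the quantities

$$
\begin{aligned}
\mathbf{b}_t \triangleq{}& (\mathbf{G}\mathbf{M} \odot \mathbf{G}\mathbf{M})\,\mathrm{bvec}\{(\boldsymbol{\Sigma}_v \otimes \mathbf{I}_M)\boldsymbol{\Lambda}\} + (\mathbf{G} \odot \mathbf{G}_C)\,\mathrm{bvec}\{E[\boldsymbol{\alpha}_t\tilde{\boldsymbol{\phi}}_t^T]\}\\
&+ (\mathbf{G}_C \odot \mathbf{G}_C)\,\mathrm{bvec}\{E[\boldsymbol{\alpha}_t\boldsymbol{\alpha}_t^T]\},\\
\mathbf{F} \triangleq{}& \left((\mathbf{I}_{MN} \odot \mathbf{I}_{MN}) - (\boldsymbol{\Lambda}\mathbf{M} \odot \mathbf{I}_{MN}) - (\mathbf{I}_{MN} \odot \boldsymbol{\Lambda}\mathbf{M}) + \mathcal{A}(\mathbf{M} \odot \mathbf{M})\right)(\mathbf{G}^T \odot \mathbf{G}^T),
\end{aligned}
\tag{34}
$$

and further using the shorthand $E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\sigma}}$ for $E\|\tilde{\mathbf{w}}_t\|^2_{\mathrm{bvec}^{-1}(\boldsymbol{\sigma})}$ yields the following compact form for the weighted energy recursion:

$$E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\boldsymbol{\sigma}} = E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}\boldsymbol{\sigma}} + \mathbf{b}_{t+1}^T\boldsymbol{\sigma}. \tag{35}$$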

Remark. We note that the expectations $E[\boldsymbol{\alpha}_{t+1}\tilde{\boldsymbol{\phi}}_{t+1}^T]$ and $E[\boldsymbol{\alpha}_{t+1}\boldsymbol{\alpha}_{t+1}^T]$ present some difficulty for further analytical simplification in closed form, in exact or approximate terms. This is caused by the large degree to which the quantization error term $\boldsymbol{\alpha}_t$ is coupled, nonlinearly, with itself as well as with the intermediary parameter deviation $\tilde{\boldsymbol{\phi}}_t$ through the non-deterministic reference levels $\{\boldsymbol{\phi}^q_{i,t'}\}_{t' \leq t}$ against which the level crossing events are checked, and which evolve through (13)–(14). We further note that invoking an approximation based on independence arguments for $E[\boldsymbol{\alpha}_{t+1}\tilde{\boldsymbol{\phi}}_{t+1}^T]$, which captures the covariances between the intermediary parameter deviations and the quantization errors over arbitrary pairs of nodes on the network, is not feasible in general unless further assumptions are made on the number of quantization levels employed so that the deviations become statistically less sensitive to the error terms. We stress that the lack of closed-form expressions for these expectations does not hamper our analysis of the mean-square stability, since requiring that the aforementioned terms remain bounded is sufficient for the purposes of establishing a bound for the (weighted) mean-square deviation $E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\sigma}}$.

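Iteration of (35) yields the recursions

$$
\begin{aligned}
E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\boldsymbol{\sigma}} &= E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}\boldsymbol{\sigma}} + \mathbf{b}_{t+1}^T\boldsymbol{\sigma},\\
E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\mathbf{F}\boldsymbol{\sigma}} &= E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}^2\boldsymbol{\sigma}} + \mathbf{b}_{t+1}^T\mathbf{F}\boldsymbol{\sigma},\\
&\;\;\vdots\\
E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma}} &= E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}^{N^2M^2}\boldsymbol{\sigma}} + \mathbf{b}_{t+1}^T\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma}.
\end{aligned}
\tag{36}
$$

Using the Cayley–Hamilton theorem with characteristic polynomial $p(x)$ for $\mathbf{F}$ results in

$$\mathbf{F}^{N^2M^2} = -p_{N^2M^2-1}\mathbf{F}^{N^2M^2-1} - \dots - p_1\mathbf{F} - p_0\mathbf{I}.$$

Substituting into (36) then results in the expression

$$E\|\tilde{\mathbf{w}}_{t+1}\|^2_{\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma}} = -p_{N^2M^2-1}E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma}} - \dots - p_0E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\sigma}} + \mathbf{b}_{t+1}^T\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma},$$

which can be placed into the state-space form

$$\mathcal{W}_{t+1} = \mathcal{F}\mathcal{W}_t + \mathcal{Y}_{t+1}, \tag{37}$$

where

$$
\mathcal{W}_t \triangleq
\begin{bmatrix}
E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\sigma}}\\
E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}\boldsymbol{\sigma}}\\
\vdots\\
E\|\tilde{\mathbf{w}}_t\|^2_{\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma}}
\end{bmatrix},
\qquad
\mathcal{Y}_t \triangleq
\begin{bmatrix}
\mathbf{b}_t^T\boldsymbol{\sigma}\\
\mathbf{b}_t^T\mathbf{F}\boldsymbol{\sigma}\\
\vdots\\
\mathbf{b}_t^T\mathbf{F}^{N^2M^2-1}\boldsymbol{\sigma}
\end{bmatrix},
\tag{38}
$$

$$
\mathcal{F} \triangleq
\begin{bmatrix}
0 & 1 & 0 & \dots & 0\\
0 & 0 & 1 & \dots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
-p_0 & -p_1 & -p_2 & \dots & -p_{N^2M^2-1}
\end{bmatrix}.
\tag{39}
$$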

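To make the mean-square stability analysis more tractable, we introduce the following assumption:

Assumption. The quantization error covariances $E\left[\boldsymbol{\alpha}_{t+1}\tilde{\boldsymbol{\phi}}_{t+1}^T\right]$ and $E\left[\boldsymbol{\alpha}_{t+1}\boldsymbol{\alpha}_{t+1}^T\right]$ remain bounded, with $\left\|E\left[\boldsymbol{\alpha}_{t+1}\tilde{\boldsymbol{\phi}}_{t+1}^T\right]\right\|_F, \left\|E\left[\boldsymbol{\alpha}_{t+1}\boldsymbol{\alpha}_{t+1}^T\right]\right\|_F < A$ for some $A > 0$ for the Frobenius norms.

Using the assumption, we obtain a bound on the norm $\|\mathbf{b}_t\|_2$ as

$$
\begin{aligned}
\|\mathbf{b}_t\|_2 \leq{}& \left\|(\mathbf{G}\mathbf{M} \odot \mathbf{G}\mathbf{M})\,\mathrm{bvec}\{(\boldsymbol{\Sigma}_v \otimes \mathbf{I}_M)\boldsymbol{\Lambda}\}\right\|_2 + \|\mathbf{G} \odot \mathbf{G}_C\|_2\left\|\mathrm{bvec}\{E[\boldsymbol{\alpha}_t\tilde{\boldsymbol{\phi}}_t^T]\}\right\|_2\\
&+ \|\mathbf{G}_C \odot \mathbf{G}_C\|_2\left\|\mathrm{bvec}\{E[\boldsymbol{\alpha}_t\boldsymbol{\alpha}_t^T]\}\right\|_2\\
\leq{}& \left\|(\mathbf{G}\mathbf{M} \odot \mathbf{G}\mathbf{M})\,\mathrm{bvec}\{(\boldsymbol{\Sigma}_v \otimes \mathbf{I}_M)\boldsymbol{\Lambda}\}\right\|_2 + A\left(\|\mathbf{P}\|_2 + \|\mathbf{P}_C\|_2\right)\|\mathbf{P}_C\|_2 \triangleq B.
\end{aligned}
$$

Inspecting (39), we observe that the boundedness of $\|\mathbf{b}_t\|_2$ implies the boundedness of $\|\mathcal{Y}_t\|_2$; hence $\exists\, C > 0$ s.t. $\|\mathcal{Y}_t\|_2 < C$ $\forall t$.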

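The recursion (37) can be solved for $\mathcal{W}_t$ in closed form as

$$\mathcal{W}_t = \mathcal{F}^t\mathcal{W}_0 + \sum_{n=0}^{t-1}\mathcal{F}^n\mathcal{Y}_{t-n}. \tag{40}$$

Using (40), we can obtain a bound for $\|\mathcal{W}_t\|_2$ as

$$
\begin{aligned}
\|\mathcal{W}_t\|_2 &\leq \|\mathcal{F}\|_2^t\|\mathcal{W}_0\|_2 + \sum_{n=0}^{t-1}\|\mathcal{F}\|_2^n\|\mathcal{Y}_{t-n}\|_2\\
&\leq \|\mathcal{F}\|_2^t\|\mathcal{W}_0\|_2 + C\sum_{n=0}^{t-1}\|\mathcal{F}\|_2^n\\
&= \|\mathcal{F}\|_2^t\|\mathcal{W}_0\|_2 + C\,\frac{1 - \|\mathcal{F}\|_2^t}{1 - \|\mathcal{F}\|_2},
\end{aligned}
\tag{41}
$$

where we have used the fact that, since $\mathcal{F}$ is in the form of a companion matrix for $\mathbf{F}$, they share the same set of eigenvalues.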

We note that requiring that $\|\mathcal{W}_t\|_2$ remains bounded is sufficient to guarantee the mean-square stability of the overall system, since doing so ensures that $E\|\tilde{\mathbf{w}}_t\|^2_{\boldsymbol{\sigma}}$ remains bounded. Thus, by (41), the mean-square stability condition reduces to the matrix $\mathbf{F}$ given by (34) being stable. Hence, in order to ensure MS stability, it is sufficient that the step sizes $\mu_i$ are chosen such that the matrix $\mathbf{F}$ is stable.

7. Experiments

In this section, we demonstrate the significant reduction in the communication load achieved by our algorithms while providing equal performance with respect to the state of the art.

For the first part of the simulations, we consider a sample network consisting of $N = 10$ nodes, where each node makes its observation through the linear model

$$d_{i,t} = \mathbf{u}_{i,t}^T\mathbf{w}_o + v_{i,t}, \quad i = 1, \dots, N. \tag{42}$$

The regressor data $\mathbf{u}_{i,t}$ are zero mean i.i.d. Gaussian with standard deviations $\sigma_{u,i}$ chosen randomly from the interval $(0.3, 0.8)$. The observation noises are generated from a Normal distribution with standard deviations $\sigma_{v,i}$ chosen randomly from the interval $(0.1, 0.3)$. In Fig. 3, we depict the network topology and the network's statistical profile to show how the signal power and the noise power vary across the network.

The unknown vector parameter $\mathbf{w}_o$ with $M = 10$ components is randomly chosen from a Normal distribution and normalized to have unit energy. We change the source statistics in the middle of the simulations to observe how well the proposed algorithm is able to track sudden changes in the unknown parameter.

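We use the Metropolis combination rule to generate the network matrix $\mathbf{P}$ such that

$$
p_{i,j} =
\begin{cases}
\text{2 M2max(1Ni,Nj)} & \text{if } i \neq j \text{ are linked},\\
0 & \text{for } i \text{ and } j \text{ not linked},\\
1 - \sum_{j \in \mathcal{N}_i \setminus i} p_{i,j} & \text{for } i = j,
\end{cases}
$$

using the randomly selected network adjacency matrix given by

$$
\begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 1 & 1 & 0 & 0 & 0\\
1 & 1 & 1 & 0 & 1 & 0 & 1 & 0 & 0 & 0\\
0 & 1 & 1 & 1 & 0 & 1 & 0 & 0 & 0 & 1\\
0 & 0 & 1 & 1 & 0 & 1 & 0 & 0 & 1 & 1\\
0 & 1 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1\\
1 & 0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 0\\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 & 0 & 1\\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 1 & 0\\
0 & 0 & 0 & 1 & 1 & 0 & 0 & 1 & 1 & 0\\
0 & 0 & 1 & 1 & 1 & 0 & 1 & 0 & 0 & 1
\end{bmatrix}.
$$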
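As an illustration of how a combination matrix can be built from this adjacency matrix, the Python sketch below applies the commonly used form of the Metropolis rule, $p_{i,j} = 1/\max(n_i, n_j)$ for linked $i \neq j$, where $n_i$ is the neighborhood size of node $i$ counted from the adjacency row (including the node itself). This standard form is an assumption and may differ from the exact scaling used in the paper, and the names `ADJ` and `metropolis_weights` are hypothetical.

```python
# Adjacency matrix of the N = 10 node network, copied row by row.
ADJ = [
    [1, 1, 0, 0, 0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 1, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 0, 0, 0, 1],
    [0, 0, 1, 1, 0, 1, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 1, 1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 1, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1, 0, 1, 0, 0, 1],
]


def metropolis_weights(adj):
    """Build a row-stochastic combination matrix (standard Metropolis form)."""
    n = len(adj)
    # Neighborhood size of each node; the diagonal ones count the node itself.
    size = [sum(row) for row in adj]
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and adj[i][j]:
                P[i][j] = 1.0 / max(size[i], size[j])
        # The self-weight absorbs the remainder so that the row sums to one.
        P[i][i] = 1.0 - sum(P[i])
    return P
```

The resulting matrix is symmetric with nonnegative entries and rows summing to one, which is the stochasticity property used in the stability analysis.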

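We configure the nodes such that they cycle through the entries of the intermediary estimates $\boldsymbol{\phi}_{i,t}$ in a round-robin fashion, and exchange only this selected $L = 1$ dimension out of $M$ in one time instant. For instance, for an $L = 1$, $M = 3$ system at time instants $t = 1, \dots, 4$, the $i$th node will send the entries of its intermediary estimate $\boldsymbol{\phi}_{i,t}$ as in (43):

$$
\boldsymbol{\phi}_{1,i} = \begin{bmatrix}\phi_{1,1,i}\\ 0\\ 0\end{bmatrix},\quad
\boldsymbol{\phi}_{2,i} = \begin{bmatrix}0\\ \phi_{2,2,i}\\ 0\end{bmatrix},\quad
\boldsymbol{\phi}_{3,i} = \begin{bmatrix}0\\ 0\\ \phi_{3,3,i}\end{bmatrix},\quad
\boldsymbol{\phi}_{4,i} = \begin{bmatrix}\phi_{1,4,i}\\ 0\\ 0\end{bmatrix},
\tag{43}
$$

where $\phi_{l,t,i}$ is the $l$th dimension of the intermediary estimate $\boldsymbol{\phi}_{i,t}$ of the $i$th node at time $t$ that is sent to the neighbors.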
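The round-robin selection in (43) can be sketched in a few lines of Python. The function names `round_robin_mask` and `masked_estimate` are hypothetical, indices are zero-based in the code, and the extension to $L > 1$ consecutive entries per time instant is an assumption added for generality.

```python
def round_robin_mask(t, M, L=1):
    """Zero-based indices of the L entries diffused at time instant t = 1, 2, ..."""
    start = ((t - 1) * L) % M
    return [(start + k) % M for k in range(L)]


def masked_estimate(phi, t, L=1):
    """Keep only the entries selected at time t; the others are set to zero."""
    keep = set(round_robin_mask(t, len(phi), L))
    return [x if k in keep else 0.0 for k, x in enumerate(phi)]
```

For $M = 3$ and $L = 1$, the selected index cycles through the first, second and third entries and returns to the first entry at $t = 4$, matching the pattern in (43).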

We evaluate the communication reduction performance of the proposed algorithm with respect to the algorithm in [23], where only one entry of the intermediate estimates is exchanged by the nodes at each round in a sequential order, as explained in (43).

In Fig. 4, the MSD performance of the proposed algorithm is demonstrated, where, as a reference, we consider the algorithm in [23] with an adaptive Lloyd–Max quantizer and with a no-quantization (scalar diffusion) implementation of the system. Note that in both the scalar diffusion algorithm and the Lloyd–Max quantized algorithm, which is referred to as the conventionally quantized algorithm below, the nodes exchange the information of one dimension per communication round. However, in the scalar diffusion algorithm, the information of the exchanged dimension is diffused with full precision, while in the Lloyd–Max case, it is quantized with finite precision. We selected the quantization interval so that we do not suffer from any saturation effects, and we chose the number of quantization levels so that no further significant improvement can be made on the MSD performance of the algorithms by increasing the number of levels. We observed that 53 quantization levels for the LC algorithm and 31 quantization levels for the conventional algorithm were sufficient. We use a step size of $\mu = 0.05$ during the simulations due to its good learning rate and convergence results. The results that we obtained in the experiments are averaged over 100 independent trials.

Fig. 3. Network topology and statistical profile.

Fig. 4. The global MSD curves of the proposed algorithm, displayed with the label 'LC', in comparison with the conventional quantization and the scalar diffusion algorithms ($N = 10$, $M = 10$). The magnified figure provides the transient performance of the algorithms. Source statistics change at time $t = 2 \times 10^4$.

Fig. 5. Time evolution of the number of bits transmitted across the network. The sudden increase in the 'LC' curve corresponds to the time where the source statistics are changed.

From these simulations, we observe that the convergence rates of the scalar diffusion and the conventionally quantized diffusion algorithms are superior to that of the proposed algorithm, while the steady-state MSD values of all three systems are identical. We note that it was our aim to get equal steady-state MSD values, allowing a fair comparison in terms of the convergence speeds. Also, it is observed that the proposed algorithm is able to adapt well when faced with a sudden change in the source statistics.

In Fig. 5, we present the communication load that each algorithm incurs on the network. We exclude the scalar (infinite-precision) diffusion algorithm from this comparison since it requires an infinite number of bits to encode the information exchanged among the nodes. We observe a substantial enhancement in the communication efficiency achieved by the proposed algorithm, in terms of the total number of bits exchanged between the nodes across the entire adaptive network, with respect to the algorithm that uses the conventional quantization. Particularly, for this $N = 10$ node network, we note that the proposed algorithm incurs $10^3$ times less communication load than the reference implementation with the same steady-state MSD values. We also observe that at the time of change in the source statistics, there is a sudden increase in the number of bits used by the proposed algorithm. This is because multiple level crossings occur due to the sudden change in the parameter of interest at that time, which requires more than two bits to encode. However, we observe that the system quickly adapts itself to using two bits again. The same behavior is not present for the conventional quantization case since it already encodes the true values of the levels at every single time instant. We stress further that we achieve this improvement with relatively little complexity, since we have shown that using a simple non-adaptive quantizer is sufficient to realize the improvements.
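To make the notion of multiple level crossings concrete, the sketch below counts how many quantization levels lie between two successive values of a diffused entry. It is only an illustration under stated assumptions: the levels are taken to be uniformly spaced at integer multiples of a hypothetical spacing `delta`, the function name `count_level_crossings` is made up for this example, and the bit-level encoding of the crossings used by the LC quantizer is not modeled.

```python
import math


def count_level_crossings(previous, current, delta):
    """Number of levels k * delta crossed when moving from previous to current.

    Assumption: uniformly spaced levels; not the paper's exact LC quantizer.
    """
    return abs(math.floor(current / delta) - math.floor(previous / delta))


delta = 0.25
no_crossing = count_level_crossings(0.1, 0.2, delta)      # stays between 0 and 0.25
single_crossing = count_level_crossings(0.1, 0.3, delta)  # crosses 0.25 only
jump_crossings = count_level_crossings(0.1, 0.9, delta)   # crosses 0.25, 0.5, 0.75
```

In this picture a slowly varying estimate triggers at most one crossing per round, whereas an abrupt change such as the switch in source statistics produces several crossings at once, consistent with the temporary increase in transmitted bits described above.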

In the second part of the experiments, we aim to observe the performance of the proposed algorithm over high dimensional data. Therefore, we have changed the former setup so that the unknown vector parameter $w_o$ with $M = 100$ components is randomly chosen from a Normal distribution and normalized to have unit energy. We use the same distributed network with the connections given in Fig. 3c. Quantization levels for the algorithms are again chosen so that no further significant improvement can be made by increasing the number of levels. We observed that 53 quantization levels for the LC algorithm and 31 quantization levels for the conventional algorithm were sufficient. We again use a step size of $\mu = 0.05$, and the results are averaged over 10 independent trials. We have decreased the number of independent trials to be averaged since processing high dimensional data takes significantly more time.

We present the MSD performance of the proposed algorithm in comparison with the sequential variant of the algorithm in [23] with the parameters $M = 100$, $L = 1$ in Fig. 6. We observe that in the high dimensional data case, the convergence rate of the proposed algorithm is the same as those of the compared algorithms. They also have the same steady-state MSD values. These results indicate that the adaptation performances of the scalar diffusion algorithm and the conventionally quantized diffusion algorithm decrease for the high dimensional case, since the nodes are allowed to share only one dimension per round, which prevents them from quickly sending their entire intermediary estimates to their neighboring nodes. Therefore, we observe that for such systems, the proposed algorithm performs similarly to the scalar diffusion and the conventionally quantized algorithms.

In Fig. 7, we illustrate the communication load for each algorithm. We observe an improvement in the communication requirements in a similar vein to the previous experiments. Ultimately, the proposed algorithm incurs $10^2$ times less communication load compared to the baseline, so the number of transmitted bits is significantly reduced. On the other hand, the magnitude of this reduction is smaller than in the lower dimensional case, mainly due to the extra bits required to encode the higher dimensions for multiple level crossings in the LC quantization.

In the third part of the experiments, in order to observe the possible effects of the number of quantization levels, we simulate the algorithms within an identical experimental setup, except that the number of quantization levels is no longer optimized as in the previous cases. To this end, we have arbitrarily chosen 25 quantization levels for the LC algorithm and again 25 levels for the conventional algorithm. We use the same distributed network connections given in Fig. 3c. We have used a step size of $\mu = 0.05$, and the results are averaged over 100 independent trials.

Fig. 6. The global MSD curves of the proposed algorithm, displayed with the label ‘LC’, in comparison with the conventional quantization and the scalar diffusion algorithms over high dimensional data ($N = 10$, $M = 100$). The magnified figure provides the transient performance of the algorithms. Source statistics change at time $t = 10^4$.

Fig. 7. Time evolution of the number of bits transmitted by the algorithms across the network over high dimensional data ($N = 10$, $M = 100$). The sudden increase in the ‘LC’ curve corresponds to the time at which the source statistics are changed.

We present the MSD performances of the algorithms in Fig. 8. We observe that when sub-optimal quantization levels are used, the compared algorithms exhibit superior performance compared to the proposed algorithm both in terms of the convergence rate and the steady-state MSD. We also note that the quantized algorithms could not reach the steady-state performance of the scalar diffusion due to the deliberately poor selection of the number of quantization levels.

These results are observed due to a failure on the system's part to satisfy the assumed quantization error model. The statistical model that we used for the quantization error $\phi_i^q$ assumes that it has zero mean such that $E[\phi_i^q] = 0$ [24]. However, when such a low number of quantization levels is selected, this model ceases to be applicable and the quantized algorithms are no longer guaranteed to converge to the steady-state MSD values of the scalar diffusion algorithm.

Fig. 8. The global MSD curves of the proposed algorithm, displayed with the label ‘LC’, in comparison with the conventional quantization and the scalar diffusion algorithms with sub-optimal quantization levels ($N = 10$, $M = 10$). Source statistics change at time $t = 10^4$.

Fig. 9. Time evolution of the number of bits transmitted by the algorithms across the network with sub-optimal quantization levels ($N = 10$, $M = 10$). The sudden increase in the ‘LC’ curve corresponds to the time at which the source statistics are changed.

In Fig. 9, we present the communication load of the algorithms over the network for the case of a sub-optimal level selection. We again observe a similar behavior, where the proposed algorithm diffuses more than $10^3$ times fewer bits through the network compared to the baseline. We note that the difference in the number of bits exchanged between the two algorithms is larger compared with the previous results. This can be explained by the fact that we use fewer quantization levels for the LC algorithm, which makes the occurrence of multiple level crossings a rarer phenomenon. Thus, it becomes less likely for each node to send more than two bits of information for a given iteration. Ultimately, this particular experiment illustrates the existence of a trade-off between the estimation performance and the communication load imposed on the network.
