Robust least squares methods under bounded data uncertainties

Digital Signal Processing

www.elsevier.com/locate/dsp

N. Denizcan Vanli a,∗, Mehmet A. Donmez b, Suleyman S. Kozat a

a Department of Electrical and Electronics Engineering, Bilkent University, Ankara, Turkey

b Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, IL, USA

Article history: Available online 24 October 2014

Keywords: Data estimation; Least squares; Robust; Minimax; Regret

Abstract

We study the problem of estimating an unknown deterministic signal that is observed through an unknown deterministic data matrix under additive noise. In particular, we present a minimax optimization framework to the least squares problems, where the estimator has imperfect data matrix and output vector information. We define the performance of an estimator relative to the performance of the optimal least squares (LS) estimator tuned to the underlying unknown data matrix and output vector, which is defined as the regret of the estimator. We then introduce an efficient robust LS estimation approach that minimizes this regret for the worst possible data matrix and output vector, where we refrain from any structural assumptions on the data. We demonstrate that minimizing this worst-case regret can be cast as a semi-definite programming (SDP) problem. We then consider the regularized and structured LS problems and present novel robust estimation methods by demonstrating that these problems can also be cast as SDP problems. We illustrate the merits of the proposed algorithms with respect to the well-known alternatives in the literature through our simulations.

1. Introduction

In this paper, we investigate estimation of an unknown deterministic signal that is observed through a deterministic data matrix under additive noise, which models a wide range of problems in signal processing applications [1–14]. In this framework, the data matrix and the output vector are not exactly known, however, estimates for both of them as well as uncertainty bounds on the estimates are given [2,8,15–19]. Since the model parameters are not known exactly, the performances of the classical LS estimators may significantly degrade, especially when the perturbations on the data matrix and the output vector are relatively high [9,15,16,20–22]. Hence, robust estimation algorithms are needed to obtain a satisfactory performance under such perturbations. This generic framework models several real-life applications, which require estimation of a signal observed through a linear model [9,16]. As an example, this setup models realistic channel equalization scenarios, where the data matrix represents a communication channel and the data vector is the transmitted information. The channel is usually unknown, especially for wireless communications applications, and possibly can be time-varying. Hence, in practical applications, the communication channel is estimated, where this estimate is usually subject to distortions [9,16]. Under such possible perturbations, robust equalization methods can be used to obtain a more consistent and acceptable performance compared to the LS (or MMSE) equalizer. In this sense, this formulation is comprehensive and can be used in other applications such as in feedback control systems to estimate a desired data under imperfect system knowledge.

* Corresponding author. E-mail addresses: [email protected] (N.D. Vanli), [email protected] (M.A. Donmez), [email protected] (S.S. Kozat).

A prevalent approach to find robust solutions to such estimation problems is the robust minimax LS method [8,9,16,23–27], in which the uncertainties in the data matrix and the output vector are incorporated into optimization framework via a minimax residual formulation and a worst-case optimization within the uncertainty bounds is performed. Although the robust LS methods are able to minimize the LS error for the worst-case perturbations, they usually provide unsatisfactory results on the average [15,23–27] due to their conservative nature. This issue is significantly exacerbated especially when the actual perturbations do not result in significant performance degradation. Another well-known approach to compensate for errors in the data matrix and the output vector is the total least squares method (TLS) [15], which may yield undesirable results since it employs a conservative approach due to data de-regularization. On the other hand, the data matrix usually has a known special structure, such as Toeplitz and Hankel, in many linear regression problems [9,15]. Hence, in [9,15], the authors illustrate that the performances of the estimators based on minimax approaches improve when such a prior knowledge on data matrix structure is integrated into the problem formulation. In all these methods, LS estimators under worst case perturbations are introduced to achieve robustness. However, due to this conservative problem formulation, in many practical applications, these approaches yield unsatisfactory performances [2,8,18,28–30].

http://dx.doi.org/10.1016/j.dsp.2014.10.004

In order to counterbalance this conservative nature of the robust LS methods [9], we propose a novel robust LS approach that minimizes a worst case “regret” that is defined as the difference between the squared residual error and the smallest attainable squared residual error with an LS estimator [2,8,18,28–30]. By this regret formulation, we seek a linear estimator whose performance is as close as possible to that of the optimal estimator for all possible perturbations on the data matrix and the output vector. Our main goal in proposing the minimax regret formulation is to provide a trade-off between the robust LS methods tuned to the worst possible data parameters (under the uncertainty bounds) and the optimal LS estimator tuned to the underlying unknown model parameters. Minimax regret approaches have been presented in signal processing literature to alleviate the pessimistic nature of the worst case optimization methods [2,8,18,28–30]. In [18,29], linear minimax regret estimators are introduced to minimize the mean squared error (MSE) under imperfect knowledge of channel statistics and true parameters, respectively. In [28], a minimum mean squared error (MMSE) estimation technique under imperfect channel and data knowledge is investigated. In [2], these robust estimation methods are extended to flat fading channels to perform channel equalization. These methods are shown to provide a better average performance compared to the minimax estimators, whereas under large perturbations the robustness of the minimax estimators is superior to that of these competitive methods. On the other hand, the optimization frameworks investigated in this paper are significantly different from those in [9,16,23–27], where the regret terms are directly adjoined in the cost functions. In particular, unlike [2,18,28,29], where the uncertainties are in the statistics of the transmitted signal or channel parameters, in this paper, the uncertainty is both on the data matrix and the output vector without any statistical assumptions. While the authors of [8] have considered a similar framework, the results of this paper build upon them and, unlike [8], provide a complete solution to the regret based robust LS estimation methods. We emphasize that perturbation bounds on the data matrix and the output vector heavily depend on the estimation algorithms employed to obtain them. Since our methods are formulated for given perturbation bounds, different estimation algorithms can be readily incorporated into our framework with the corresponding perturbation bounds [16].

Our main contributions in this paper are as follows. i) We introduce a novel and efficient robust LS estimation method in which we find the transmitted signal by minimizing the worst-case regret, i.e., the worst-case difference between the residual error of the LS estimator and the residual error of the optimal LS estimator tuned to the underlying model. In this sense, we present a robust estimation method that achieves a tradeoff between the robust LS estimation methods and the direct LS estimation method tuned to the estimates of the data matrix and output vector. ii) We next propose a minimax regret formulation for the regularized LS estimation problem. iii) We then introduce a structured robust LS estimation method in which the data matrix is known to have a special structure such as Toeplitz or Hankel. iv) We demonstrate that the robust estimation methods we propose can be cast as SDP problems, hence our methods can be efficiently implemented in real-time [31]. v) In our simulations, we observe that our approaches provide better performance compared to the robust methods that are optimized with respect to the worst-case residual error [9,32], and the conventional methods that directly solve the estimation problem using the perturbed data.

The organization of the paper is as follows. An overview to the problem is provided in Section 2. In Section 3.1, we first introduce the LS estimation method based on our regret formulation, and then present the regularized LS estimation approach in Section 3.2. We then consider the structured LS approach in Section 3.3 and provide the explicit SDP formulations for all problems. The numerical examples are demonstrated in Section 4. Finally, the paper concludes with certain remarks in Section 5.

2. System overview

2.1. Notation

In this paper, all vectors are column vectors and represented by boldface lowercase letters. Matrices are represented by boldface uppercase letters. For a matrix $\mathbf{H}$, $\mathbf{H}^H$ is the conjugate transpose, $\|\mathbf{H}\|$ is the spectral norm, $\mathbf{H}^+$ is the pseudo-inverse, $\mathbf{H} > 0$ represents a positive definite matrix and $\mathbf{H} \ge 0$ represents a positive semi-definite matrix. For a square matrix $\mathbf{H}$, $\mathrm{Tr}(\mathbf{H})$ is the trace. Naturally, for a vector $\mathbf{x}$, $\|\mathbf{x}\| = \sqrt{\mathbf{x}^H \mathbf{x}}$ is the $\ell_2$-norm. Here, $\mathbf{0}$ denotes a vector or matrix with all zero elements and the dimensions can be understood from the context. Similarly, $\mathbf{I}$ represents the appropriate sized identity matrix. The operator $\mathrm{vec}(\cdot)$ is the vectorization operator, i.e., it stacks the columns of a matrix of dimension $m \times n$ into an $mn \times 1$ column vector. Finally, the operator $\otimes$ is the Kronecker product [33].

2.2. Problem description

We investigate the problem of estimating an unknown deterministic vector $\mathbf{x} \in \mathbb{C}^{n}$ which is observed through a deterministic data matrix. However, instead of the actual data matrix and the output vector, their estimates $\mathbf{H} \in \mathbb{C}^{m \times n}$ and $\mathbf{y} \in \mathbb{C}^{m}$ and uncertainty bounds on these estimates are provided. In this sense, our aim is to find a solution to the following data estimation problem

$$\mathbf{y} \approx \mathbf{H}\mathbf{x},$$

such that

$$\mathbf{y} + \Delta\mathbf{y} = (\mathbf{H} + \Delta\mathbf{H})\,\mathbf{x},$$

for deterministic perturbations $\Delta\mathbf{H} \in \mathbb{C}^{m \times n}$, $\Delta\mathbf{y} \in \mathbb{C}^{m}$. Although these perturbations are unknown, a bound on each perturbation is provided, i.e.,

$$\|\Delta\mathbf{H}\| \le \delta_H \quad \text{and} \quad \|\Delta\mathbf{y}\| \le \delta_Y,$$

where $\delta_H, \delta_Y \ge 0$. In this sense, we refrain from any assumptions on the data matrix and the output vector, yet consider that the estimates $\mathbf{H}$ and $\mathbf{y}$ are at least accurate to “some degree” but their actual values under these uncertainties are completely unknown to the estimator.

Even in the presence of these uncertainties, the symbol vector $\mathbf{x}$ can be naively estimated by simply substituting the estimates $\mathbf{H}$ and $\mathbf{y}$ into the LS estimator [10]. For the LS estimator we have

$$\hat{\mathbf{x}} = \mathbf{H}^{+}\mathbf{y},$$

where $\mathbf{H}^{+}$ is the pseudo-inverse of $\mathbf{H}$ [33]. However, this approach yields unsatisfactory results, when the errors in the estimates of the data matrix and the output vector are relatively high [9,18,29,32]. A common approach to find a robust solution is to employ a worst-case residual minimization [9]

$$\hat{\mathbf{x}} = \arg\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\Delta\mathbf{H}\| \le \delta_H,\; \|\Delta\mathbf{y}\| \le \delta_Y} \big\|(\mathbf{y} + \Delta\mathbf{y}) - (\mathbf{H} + \Delta\mathbf{H})\mathbf{x}\big\|^{2},$$

where $\mathbf{x}$ is chosen to minimize the worst-case residual error in the uncertainty region. However, since the solution is found with respect to the worst possible data matrix and output vector in the uncertainty regions, it may be highly conservative [15,18,29].

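Here, we propose a novel LS estimation approach that provides a tradeoff between performance and robustness in order to mitigate the conservative nature of the worst-case residual minimization approach as well as to preserve robustness [18,29]. The regret for not using the optimal LS estimator is defined as the difference between the residual error with an estimate of the input vector and the residual error with the optimal LS estimator, i.e.,

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}) \triangleq \big\|(\mathbf{y} + \Delta\mathbf{y}) - (\mathbf{H} + \Delta\mathbf{H})\mathbf{x}\big\|^{2} - \min_{\mathbf{w} \in \mathbb{C}^{n}} \big\|(\mathbf{y} + \Delta\mathbf{y}) - (\mathbf{H} + \Delta\mathbf{H})\mathbf{w}\big\|^{2}. \tag{1}$$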
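As a rough numerical illustration of the regret in (1), the following NumPy sketch evaluates it for a given candidate estimate. The function name `regret`, the random test data, and the perturbation scale are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def regret(x, H, y, dH, dy):
    """Regret (1): residual of x on the perturbed data minus the best LS residual."""
    H_t, y_t = H + dH, y + dy                            # perturbed matrix and output vector
    w_star, *_ = np.linalg.lstsq(H_t, y_t, rcond=None)   # optimal LS solution in hindsight
    return (np.linalg.norm(y_t - H_t @ x) ** 2
            - np.linalg.norm(y_t - H_t @ w_star) ** 2)

# Hypothetical example data (assumption: complex Gaussian entries, m = 5, n = 3).
rng = np.random.default_rng(0)
m, n = 5, 3
H = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)
dH = 0.1 * (rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n)))
dy = 0.1 * (rng.standard_normal(m) + 1j * rng.standard_normal(m))

x_naive = np.linalg.pinv(H) @ y                  # LS estimate built from the nominal data
r_naive = regret(x_naive, H, y, dH, dy)          # nonnegative by construction
r_zero = regret(x_naive, H, y, np.zeros((m, n)), np.zeros(m))  # about 0 without perturbation
```

The regret is nonnegative for any candidate, and it vanishes for the naive LS estimate only when the perturbations are zero, which is the gap the minimax formulation tries to control.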

By making such a regret definition, we force our estimator not to construct the symbol vector according to the worst possible scenario considering that it may be too conservative. Instead, we define the regret of any estimator by the difference in the estimation performances of that estimator and the “smartest” estimator knowing both data matrix and output vector in hindsight, so that we achieve a tradeoff between robustness and estimation performance.

We emphasize that the regret defined in (1) is completely different than the regret formulation introduced in [18,29]. In (1), the uncertainty is on the data matrix where the desired data vector $\mathbf{x}$ is completely unknown, unlike [18,29]. We emphasize that we use the residual error $\|(\mathbf{y} + \Delta\mathbf{y}) - (\mathbf{H} + \Delta\mathbf{H})\mathbf{x}\|^{2}$ instead of the estimation error $\|\hat{\mathbf{x}} - \mathbf{x}\|$ since the estimation error directly depends on the vector $\mathbf{x}$ and cannot be used in the regret formulation since $\mathbf{x}$ is assumed to be unknown in the presence of data uncertainties. Moreover, in our formulation, the estimate $\hat{\mathbf{x}}$ is not constrained to be linear unlike [18,29] since our regret formulation is well-defined without any limitations on the estimated $\hat{\mathbf{x}}$.

In the next sections, the proposed approaches to the robust LS estimation problems are provided. We first introduce the regret based unstructured LS estimation method. We next present the unstructured regularized LS estimation approach in which the worst-case regret is optimized. Finally, we investigate the structured LS estimation approach.

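3. Robust least squares estimation methods

3.1. Unstructured robust least squares estimation

In this section, we provide a novel robust unstructured LS estimator based on a certain minimax criterion. We consider the most generic estimation problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\Delta\mathbf{H}\| \le \delta_H,\; \|\Delta\mathbf{y}\| \le \delta_Y} \mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}), \tag{2}$$

where $\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y})$ is defined as in (1). Now considering the second term in (1), we define $\tilde{\mathbf{H}} \triangleq \mathbf{H} + \Delta\mathbf{H}$, $\tilde{\mathbf{y}} \triangleq \mathbf{y} + \Delta\mathbf{y}$, where $\tilde{\mathbf{H}}$ is a full rank matrix, and denote the estimation performance of the optimal LS estimator for some given $\tilde{\mathbf{H}}$ and $\tilde{\mathbf{y}}$ by

$$f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) \triangleq \min_{\mathbf{w} \in \mathbb{C}^{n}} \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}\|^{2}.$$

Since we consider an unconstrained minimization over $\mathbf{w}$, we have [10]

$$\mathbf{w}^{*} \triangleq \arg\min_{\mathbf{w} \in \mathbb{C}^{n}} \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}\|^{2} = \tilde{\mathbf{H}}^{+}\tilde{\mathbf{y}}, \tag{3}$$

as the optimal data vector minimizing the residual error. Then we have

$$\begin{aligned}
f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) &= \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}^{*}\|^{2} \\
&= (\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}^{*})^{H}(\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}^{*}) \\
&= \tilde{\mathbf{y}}^{H}(\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}^{*}) \\
&= \tilde{\mathbf{y}}^{H}\tilde{\mathbf{P}}\tilde{\mathbf{y}},
\end{aligned}$$

where the third line follows from $\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}}\mathbf{w}^{*} = \tilde{\mathbf{H}}^{H}\tilde{\mathbf{y}}$ [10] and $\tilde{\mathbf{P}} \triangleq \mathbf{I} - \tilde{\mathbf{H}}\tilde{\mathbf{H}}^{+}$ is the projection matrix of the space perpendicular to the range space of $\tilde{\mathbf{H}}$.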
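The identity between the optimal LS residual and the projected energy of the output vector can be checked numerically. The sketch below is only an illustration on random complex data (the sizes and the random seed are assumptions); it compares the residual at the pseudo-inverse solution (3) with the quadratic form built from the projection matrix.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3
# Hypothetical full-rank data matrix and output vector (assumption: complex Gaussian).
H_t = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
y_t = rng.standard_normal(m) + 1j * rng.standard_normal(m)

w_star = np.linalg.pinv(H_t) @ y_t                    # optimal LS solution, as in (3)
f_residual = np.linalg.norm(y_t - H_t @ w_star) ** 2  # minimum residual error

P_t = np.eye(m) - H_t @ np.linalg.pinv(H_t)           # projector onto the orthogonal complement of range(H_t)
f_projection = np.real(np.vdot(y_t, P_t @ y_t))       # y^H P y (np.vdot conjugates its first argument)
```

Both quantities agree up to floating-point error, and the projector is Hermitian and idempotent, which is what the third line of the derivation relies on.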

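If we use the Taylor series expansion based on Wirtinger calculus [33] for $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})$ around $\tilde{\mathbf{H}} = \mathbf{H}$ and $\tilde{\mathbf{y}} = \mathbf{y}$, then

$$f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) = f(\mathbf{H}, \mathbf{y}) + 2\,\mathrm{Re}\Big\{\mathrm{Tr}\Big(\nabla f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})^{H}\big|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} \,[\Delta\mathbf{H} \;\; \Delta\mathbf{y}]\Big)\Big\} + O\big(\|[\Delta\mathbf{H} \;\; \Delta\mathbf{y}]\|^{2}\big). \tag{4}$$

Note that the first order Taylor approximation is introduced in order to obtain a tractable solution. Clearly, the effect of using this approximation vanishes as $\|[\Delta\mathbf{H} \;\; \Delta\mathbf{y}]\|$ decreases and for distortions with larger $\|[\Delta\mathbf{H} \;\; \Delta\mathbf{y}]\|$, one can easily use higher order approximations instead. However, we observe through our simulations that even for relatively large perturbations, a satisfactory performance is obtained using this approximation.

We now introduce the following lemma in order to obtain the first order Taylor approximation in (4) in a closed form.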

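Lemma 1. Let $\tilde{\mathbf{H}} = \mathbf{H} + \Delta\mathbf{H}$ be a full rank matrix and $\tilde{\mathbf{y}} = \mathbf{y} + \Delta\mathbf{y}$, where $\tilde{\mathbf{H}} \in \mathbb{C}^{m \times n}$ and $\tilde{\mathbf{y}} \in \mathbb{C}^{m}$. Then defining $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) \triangleq \tilde{\mathbf{y}}^{H}\tilde{\mathbf{P}}\tilde{\mathbf{y}}$, where $\tilde{\mathbf{P}} \triangleq \mathbf{I} - \tilde{\mathbf{H}}\tilde{\mathbf{H}}^{+}$, we have

$$\frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{H}}}\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} = -\mathbf{P}\mathbf{y}\,(\mathbf{H}^{+}\mathbf{y})^{H},$$

and

$$\frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{y}}}\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} = \mathbf{P}\mathbf{y},$$

where $\mathbf{P} \triangleq \mathbf{I} - \mathbf{H}\mathbf{H}^{+}$.

Proof of Lemma 1. Since $\tilde{\mathbf{H}}$ is full rank and $m \ge n$, the pseudo-inverse of $\tilde{\mathbf{H}}$ is found by [33]

$$\tilde{\mathbf{H}}^{+} \triangleq (\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}})^{-1}\tilde{\mathbf{H}}^{H}.$$

Hence, we have [33]

$$\begin{aligned}
\mathbf{D} &= \frac{\partial}{\partial \tilde{\mathbf{H}}}\Big(\tilde{\mathbf{y}}^{H}\tilde{\mathbf{y}} - \tilde{\mathbf{y}}^{H}\tilde{\mathbf{H}}(\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}})^{-1}\tilde{\mathbf{H}}^{H}\tilde{\mathbf{y}}\Big)\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} \\
&= \mathbf{H}(\mathbf{H}^{H}\mathbf{H})^{-1}\mathbf{H}^{H}\mathbf{y}\mathbf{y}^{H}\mathbf{H}(\mathbf{H}^{H}\mathbf{H})^{-1} - \mathbf{y}\mathbf{y}^{H}\mathbf{H}(\mathbf{H}^{H}\mathbf{H})^{-1} \\
&= \mathbf{H}\mathbf{H}^{+}\mathbf{y}\,(\mathbf{H}^{+}\mathbf{y})^{H} - \mathbf{y}\,(\mathbf{H}^{+}\mathbf{y})^{H} \\
&= -\mathbf{P}\mathbf{y}\,(\mathbf{H}^{+}\mathbf{y})^{H},
\end{aligned} \tag{5}$$

and

$$\mathbf{b} = \frac{\partial}{\partial \tilde{\mathbf{y}}}\Big(\tilde{\mathbf{y}}^{H}\tilde{\mathbf{y}} - \tilde{\mathbf{y}}^{H}\tilde{\mathbf{H}}(\tilde{\mathbf{H}}^{H}\tilde{\mathbf{H}})^{-1}\tilde{\mathbf{H}}^{H}\tilde{\mathbf{y}}\Big)\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} = \mathbf{P}\mathbf{y}, \tag{6}$$

where the last line of the equality follows since $\mathbf{H}\mathbf{H}^{+}$ is a symmetric matrix according to the definition of the pseudo-inverse operation. This concludes the proof of Lemma 1. □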
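The closed forms of Lemma 1 can be sanity-checked against the first order expansion in (4). The following sketch is only a numerical illustration under assumed random data: the helper names `f` and `lemma1_gradients`, the seed, and the step size `eps` are not part of the paper.

```python
import numpy as np

def f(H, y):
    """Optimal LS residual f(H, y) = y^H (I - H H^+) y."""
    P = np.eye(H.shape[0]) - H @ np.linalg.pinv(H)
    return np.real(np.vdot(y, P @ y))

def lemma1_gradients(H, y):
    """Closed-form D and b stated in Lemma 1."""
    H_pinv = np.linalg.pinv(H)
    P = np.eye(H.shape[0]) - H @ H_pinv
    D = -np.outer(P @ y, np.conj(H_pinv @ y))   # -P y (H^+ y)^H, shape m x n
    b = P @ y
    return D, b

rng = np.random.default_rng(3)
m, n = 5, 3
crandn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)
H, y = crandn(m, n), crandn(m)
E, e = crandn(m, n), crandn(m)
E /= np.linalg.norm(E)      # unit-norm perturbation directions (assumption)
e /= np.linalg.norm(e)

D, b = lemma1_gradients(H, y)
eps = 1e-6
exact = f(H + eps * E, y + eps * e)
# First-order model from (4): kappa + 2 Re{ Tr(D^H dH) + b^H dy }
first_order = f(H, y) + 2 * eps * np.real(np.vdot(D, E) + np.vdot(b, e))
err = abs(exact - first_order)   # expected to shrink like eps**2
```

If the gradients are correct, `err` is of second order in the perturbation size, so halving `eps` should reduce it by roughly a factor of four until rounding error dominates.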

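Now turning our attention back to (4), we denote

$$\mathbf{D} \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{H}}}\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}},$$

and

$$\mathbf{b} \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{y}}}\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}},$$

where we emphasize that the closed form definitions of $\mathbf{D}$ and $\mathbf{b}$ can be obtained from Lemma 1. We then approximate (4) and obtain the first order Taylor approximation as follows

$$\begin{aligned}
f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) &\approx f(\mathbf{H}, \mathbf{y}) + 2\,\mathrm{Re}\Big\{\mathrm{Tr}\big([\mathbf{D} \;\; \mathbf{b}]^{H}[\Delta\mathbf{H} \;\; \Delta\mathbf{y}]\big)\Big\} \\
&= \kappa + 2\,\mathrm{Re}\big\{\mathrm{vec}(\mathbf{D})^{H}\mathrm{vec}(\Delta\mathbf{H}) + \mathbf{b}^{H}\Delta\mathbf{y}\big\} \\
&= \kappa + \mathbf{d}^{H}\Delta\mathbf{h} + \Delta\mathbf{h}^{H}\mathbf{d} + \mathbf{b}^{H}\Delta\mathbf{y} + \Delta\mathbf{y}^{H}\mathbf{b},
\end{aligned} \tag{7}$$

where $\kappa \triangleq f(\mathbf{H}, \mathbf{y})$, $\mathbf{d} \triangleq \mathrm{vec}(\mathbf{D})$, and $\Delta\mathbf{h} \triangleq \mathrm{vec}(\Delta\mathbf{H})$. Hence we can approximate the regret in (1) as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}) \approx \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} - \big(\kappa + \mathbf{d}^{H}\Delta\mathbf{h} + \Delta\mathbf{h}^{H}\mathbf{d} + \mathbf{b}^{H}\Delta\mathbf{y} + \Delta\mathbf{y}^{H}\mathbf{b}\big). \tag{8}$$

In the following theorem, we illustrate how the optimization (or equivalently estimation) problem in (8) can be put in an SDP form.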

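Theorem 1. Let $\mathbf{H} \in \mathbb{C}^{m \times n}$ and $\mathbf{y} \in \mathbb{C}^{m}$ be the estimates of the data matrix and the output vector, respectively, both having deterministic additive perturbations $\|\Delta\mathbf{H}\| \le \delta_H$ and $\|\Delta\mathbf{y}\| \le \delta_Y$, respectively, i.e., $\tilde{\mathbf{H}} = \mathbf{H} + \Delta\mathbf{H}$ and $\tilde{\mathbf{y}} = \mathbf{y} + \Delta\mathbf{y}$, where $\tilde{\mathbf{H}}$ is the full rank data matrix, $\tilde{\mathbf{y}}$ is the output vector, and $m \ge n$. Then the problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\Delta\mathbf{H}\| \le \delta_H,\; \|\Delta\mathbf{y}\| \le \delta_Y} \mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}), \tag{9}$$

where $\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y})$ is defined as in (8), is equivalent to solving the following SDP problem

$$\min \gamma$$

subject to $\tau_1 \ge 0$, $\tau_2 \ge 0$, and

$$\begin{bmatrix}
\gamma + \kappa - \tau_1 - \tau_2 & (\mathbf{y} - \mathbf{H}\mathbf{x})^{H} & \delta_Y \mathbf{b}^{H} & \delta_H \mathbf{d}^{H} \\
\mathbf{y} - \mathbf{H}\mathbf{x} & \mathbf{I} & -\delta_Y \mathbf{I} & \delta_H \mathbf{X} \\
\delta_Y \mathbf{b} & -\delta_Y \mathbf{I} & \tau_1 \mathbf{I} & \mathbf{0} \\
\delta_H \mathbf{d} & \delta_H \mathbf{X}^{H} & \mathbf{0} & \tau_2 \mathbf{I}
\end{bmatrix} \ge 0, \tag{10}$$

where $\mathbf{X}$ is the $m \times mn$ matrix defined as $\mathbf{X} \triangleq \mathbf{x}^{H} \otimes \mathbf{I}$.

The proof of Theorem 1 is provided in Appendix A.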
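To make the block structure of the constraint in (10) concrete, the sketch below assembles that matrix with NumPy for a given candidate point and tests positive semi-definiteness through its smallest eigenvalue. This is only a feasibility check under assumed example data, not a solver: minimizing $\gamma$ would require an SDP solver, which is outside this illustration. The function name `lmi_matrix` and the chosen values of the bounds and multipliers are hypothetical.

```python
import numpy as np

def lmi_matrix(x, gamma, tau1, tau2, H, y, delta_H, delta_Y):
    """Assemble the Hermitian block matrix of constraint (10)."""
    m, n = H.shape
    H_pinv = np.linalg.pinv(H)
    P = np.eye(m) - H @ H_pinv
    kappa = np.real(np.vdot(y, P @ y))                       # kappa = f(H, y)
    D = -np.outer(P @ y, np.conj(H_pinv @ y))                # Lemma 1
    d = D.reshape(-1, 1, order="F")                          # vec(D): stack columns
    b = (P @ y).reshape(-1, 1)
    r = (y - H @ x).reshape(-1, 1)
    X = np.kron(np.conj(x).reshape(1, -1), np.eye(m))        # x^H kron I, size m x mn
    I_m, I_mn, Z = np.eye(m), np.eye(m * n), np.zeros((m, m * n))
    top = np.array([[gamma + kappa - tau1 - tau2]])
    return np.block([
        [top,         r.conj().T,           delta_Y * b.conj().T, delta_H * d.conj().T],
        [r,           I_m,                  -delta_Y * I_m,       delta_H * X],
        [delta_Y * b, -delta_Y * I_m,       tau1 * I_m,           Z],
        [delta_H * d, delta_H * X.conj().T, Z.T,                  tau2 * I_mn],
    ])

rng = np.random.default_rng(5)
m, n = 5, 3
H = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)
delta_H = delta_Y = 0.3                                      # illustrative bounds (assumption)
x = np.linalg.pinv(H) @ y                                    # candidate estimate
tau1 = 4 * delta_Y ** 2                                      # hand-picked multipliers (assumption)
tau2 = 4 * delta_H ** 2 * np.linalg.norm(x) ** 2

M_ok = lmi_matrix(x, 1e6, tau1, tau2, H, y, delta_H, delta_Y)
M_bad = lmi_matrix(x, -1e6, tau1, tau2, H, y, delta_H, delta_Y)
feasible_ok = np.linalg.eigvalsh(M_ok).min() >= 0            # True: very loose gamma
feasible_bad = np.linalg.eigvalsh(M_bad).min() >= 0          # False: gamma far too small
```

A very loose $\gamma$ makes the candidate feasible while a strongly negative one does not; the SDP in Theorem 1 searches for the smallest feasible $\gamma$ jointly over $\mathbf{x}$, $\tau_1$, and $\tau_2$.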

Remark 1. In the proof of Theorem 1, we use Proposition 1, which relies on the lossless S-procedure. The S-procedure is lossless with two constraints when the corresponding two quadratic (Hermitian) forms are defined on a complex linear space [34]. However, the classical S-procedure for quadratic forms is, in general, lossy with two constraints in the real case [35]. Hence, Theorem 1 cannot be extended to real linear spaces.

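Now we can consider two important corollaries of Theorem 1. First, a special case of Theorem 1 in which the uncertainty is only in the data matrix. We emphasize that the perturbation errors only in the data matrix are also common in a wide range of real life applications [10]. Here, we can define the regret as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}) \triangleq \|\mathbf{y} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} - \min_{\mathbf{w} \in \mathbb{C}^{n}} \|\mathbf{y} - \tilde{\mathbf{H}}\mathbf{w}\|^{2}, \tag{11}$$

and similar to the previous case, we calculate the optimal estimation performance under a given uncertainty bound

$$\begin{aligned}
f(\tilde{\mathbf{H}}) \triangleq \min_{\mathbf{w} \in \mathbb{C}^{n}} \|\mathbf{y} - \tilde{\mathbf{H}}\mathbf{w}\|^{2} &\approx \kappa + 2\,\mathrm{Re}\Big\{\mathrm{Tr}\Big(\nabla f(\tilde{\mathbf{H}}, \mathbf{y})^{H}\big|_{\tilde{\mathbf{H}} = \mathbf{H}}\, \Delta\mathbf{H}\Big)\Big\} \\
&= \kappa + 2\,\mathrm{Re}\big\{\mathrm{vec}(\mathbf{D})^{H}\mathrm{vec}(\Delta\mathbf{H})\big\} \\
&= \kappa + \mathbf{d}^{H}\Delta\mathbf{h} + \Delta\mathbf{h}^{H}\mathbf{d}.
\end{aligned}$$

Hence we approximate the regret in (11) as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}) \approx \|\mathbf{y} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} - \big(\kappa + \mathbf{d}^{H}\Delta\mathbf{h} + \Delta\mathbf{h}^{H}\mathbf{d}\big). \tag{12}$$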

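Corollary 1. Let $\mathbf{H} \in \mathbb{C}^{m \times n}$ and $\mathbf{y} \in \mathbb{C}^{m}$ be the estimates of the data matrix and the output vector, respectively, where $m \ge n$. Suppose there is a bounded uncertainty on the full rank data matrix $\tilde{\mathbf{H}}$, i.e., $\tilde{\mathbf{H}} = \mathbf{H} + \Delta\mathbf{H}$, $\|\Delta\mathbf{H}\| \le \delta_H$. Then the problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\Delta\mathbf{H}\| \le \delta_H} \mathcal{R}(\mathbf{x}; \Delta\mathbf{H}), \tag{13}$$

where $\mathcal{R}(\mathbf{x}; \Delta\mathbf{H})$ is defined as in (12), is equivalent to solving the following SDP problem

$$\min \gamma$$

subject to $\tau \ge 0$ and

$$\begin{bmatrix}
\gamma + \kappa - \tau & (\mathbf{y} - \mathbf{H}\mathbf{x})^{H} & \delta_H \mathbf{d}^{H} \\
\mathbf{y} - \mathbf{H}\mathbf{x} & \mathbf{I} & \delta_H \mathbf{X} \\
\delta_H \mathbf{d} & \delta_H \mathbf{X}^{H} & \tau \mathbf{I}
\end{bmatrix} \ge 0. \tag{14}$$

Outline of the proof of Corollary 1. The proof of Corollary 1 can be explicitly derived from the proof of Theorem 1 by simply setting $\delta_Y = 0$ and $\tau_1 = 0$, hence is omitted. □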

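Second, we consider another special case of Theorem 1 in which the uncertainty is only in the output vector. We emphasize that similar to the previous case, this one is also a common case in a wide range of real-life applications [10], and studied under a similar framework in [18]. Here, we can define the regret as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{y}) \triangleq \|\tilde{\mathbf{y}} - \mathbf{H}\mathbf{x}\|^{2} - \min_{\mathbf{w} \in \mathbb{C}^{n}} \|\tilde{\mathbf{y}} - \mathbf{H}\mathbf{w}\|^{2}, \tag{15}$$

and similar to the previous case, we also calculate the optimal estimation performance under a given uncertainty bound

$$\begin{aligned}
f(\tilde{\mathbf{y}}) \triangleq \min_{\mathbf{w} \in \mathbb{C}^{n}} \|\tilde{\mathbf{y}} - \mathbf{H}\mathbf{w}\|^{2} &\approx \kappa + 2\,\mathrm{Re}\Big\{\mathrm{Tr}\Big(\nabla f(\mathbf{H}, \tilde{\mathbf{y}})^{H}\big|_{\tilde{\mathbf{y}} = \mathbf{y}}\, \Delta\mathbf{y}\Big)\Big\} \\
&= \kappa + 2\,\mathrm{Re}\big\{\mathbf{b}^{H}\Delta\mathbf{y}\big\} \\
&= \kappa + \mathbf{b}^{H}\Delta\mathbf{y} + \Delta\mathbf{y}^{H}\mathbf{b}.
\end{aligned}$$

Hence we approximate the regret in (15) as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{y}) \approx \|\tilde{\mathbf{y}} - \mathbf{H}\mathbf{x}\|^{2} - \big(\kappa + \mathbf{b}^{H}\Delta\mathbf{y} + \Delta\mathbf{y}^{H}\mathbf{b}\big). \tag{16}$$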

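Corollary 2. Let $\mathbf{H} \in \mathbb{C}^{m \times n}$ and $\mathbf{y} \in \mathbb{C}^{m}$ be the estimates of the data matrix and the output vector, respectively, where $m \ge n$. Suppose there is a bounded uncertainty on the output vector $\tilde{\mathbf{y}}$, i.e., $\tilde{\mathbf{y}} = \mathbf{y} + \Delta\mathbf{y}$, $\|\Delta\mathbf{y}\| \le \delta_Y$. Then the problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\Delta\mathbf{y}\| \le \delta_Y} \mathcal{R}(\mathbf{x}; \Delta\mathbf{y}), \tag{17}$$

where $\mathcal{R}(\mathbf{x}; \Delta\mathbf{y})$ is defined as in (16), is equivalent to solving the following SDP problem

$$\min \gamma$$

subject to $\tau \ge 0$ and

$$\begin{bmatrix}
\gamma + \kappa - \tau & (\mathbf{y} - \mathbf{H}\mathbf{x})^{H} & \delta_Y \mathbf{b}^{H} \\
\mathbf{y} - \mathbf{H}\mathbf{x} & \mathbf{I} & -\delta_Y \mathbf{I} \\
\delta_Y \mathbf{b} & -\delta_Y \mathbf{I} & \tau \mathbf{I}
\end{bmatrix} \ge 0. \tag{18}$$

Outline of the proof of Corollary 2. The proof of Corollary 2 can be explicitly derived from the proof of Theorem 1 by simply setting $\delta_H = 0$ and $\tau_2 = 0$, hence is omitted. □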

Remark 2. Corollaries 1 and 2 follow from the proof of Theorem 1, which relies on the lossless S-procedure. Under the frameworks presented in Corollaries 1 and 2, one can safely extend the same conclusions for the real case also, since S-procedure is lossless for quadratic forms with one constraint both in complex and real spaces [36,37].

3.2. Unstructured robust regularized least squares estimation

In this section, we introduce a worst-case regret optimization approach to solve the regularized LS estimation problem in [32]. The regret for not using the optimal regularized LS estimator is defined by

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}) \triangleq \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} + \mu\|\mathbf{x}\|^{2} - \min_{\mathbf{w} \in \mathbb{C}^{n}} \big\{\|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}\|^{2} + \mu\|\mathbf{w}\|^{2}\big\}, \tag{19}$$

where $\mu > 0$ is the regularization parameter. We emphasize that there are different approaches to choose $\mu$, however, for the focus of this paper, we assume that it is already set before the optimization so that these methods can be readily incorporated in our framework. Hence, we solve the regularized LS estimation problem for an arbitrary $\mu > 0$ and note that we have already covered the $\mu = 0$ case in Section 3.1.

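Similar to the previous case, we denote the estimation error of the optimal LS estimator for some estimated data matrix $\mathbf{H}$ and output vector $\mathbf{y}$ by

$$f(\mathbf{H}, \mathbf{y}) \triangleq \min_{\mathbf{w} \in \mathbb{C}^{n}} \big\{\|\mathbf{y} - \mathbf{H}\mathbf{w}\|^{2} + \mu\|\mathbf{w}\|^{2}\big\} = \|\mathbf{P}^{-1/2}\mathbf{y}\|^{2} = \mathbf{y}^{H}\mathbf{P}^{-1}\mathbf{y},$$

where $\mathbf{P} \triangleq \mathbf{I} + \mu^{-1}\mathbf{H}\mathbf{H}^{H}$. Considering the first order Taylor series expansion based on Wirtinger calculus [33] for $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})$ around $\tilde{\mathbf{H}} = \mathbf{H}$ and $\tilde{\mathbf{y}} = \mathbf{y}$, we have

$$\begin{aligned}
f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) &\approx \kappa + 2\,\mathrm{Re}\Big\{\mathrm{Tr}\Big(\nabla f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})^{H}\big|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}}\,[\Delta\mathbf{H} \;\; \Delta\mathbf{y}]\Big)\Big\} \\
&= \kappa + \mathbf{d}^{H}\Delta\mathbf{h} + \Delta\mathbf{h}^{H}\mathbf{d} + \mathbf{b}^{H}\Delta\mathbf{y} + \Delta\mathbf{y}^{H}\mathbf{b},
\end{aligned}$$

where $\mathbf{d} \triangleq \mathrm{vec}(\mathbf{D})$, $\Delta\mathbf{h} \triangleq \mathrm{vec}(\Delta\mathbf{H})$,

$$\mathbf{D} \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{H}}}\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} = -\mathbf{P}^{-1}\mathbf{y}\mathbf{y}^{H}\mathbf{P}^{-1}\mathbf{H}, \tag{20}$$

and

$$\mathbf{b} \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{y}}}\bigg|_{\tilde{\mathbf{H}} = \mathbf{H},\, \tilde{\mathbf{y}} = \mathbf{y}} = \mathbf{P}^{-1}\mathbf{y},$$

where the last line follows since $\mathbf{P}$ is symmetric. Hence we can approximate the regret in (19) as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}) \approx \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} + \mu\|\mathbf{x}\|^{2} - \big(\kappa + \mathbf{d}^{H}\Delta\mathbf{h} + \Delta\mathbf{h}^{H}\mathbf{d} + \mathbf{b}^{H}\Delta\mathbf{y} + \Delta\mathbf{y}^{H}\mathbf{b}\big), \tag{21}$$

similar to (8). In the following theorem, we illustrate how the optimization problem in (21) can be put in an SDP form.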
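The closed form of the optimal regularized LS cost, written as a quadratic form in the inverse of $\mathbf{P} = \mathbf{I} + \mu^{-1}\mathbf{H}\mathbf{H}^{H}$, can be verified numerically. The sketch below is an illustration on assumed random data with an arbitrary choice of $\mu$; it computes the minimizer from the regularized normal equations and compares the attained cost with the quadratic form.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, mu = 5, 3, 0.5      # sizes and regularization weight are illustrative assumptions
H = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
y = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# Minimizer of ||y - H w||^2 + mu ||w||^2 from the regularized normal equations.
w_star = np.linalg.solve(H.conj().T @ H + mu * np.eye(n), H.conj().T @ y)
f_direct = np.linalg.norm(y - H @ w_star) ** 2 + mu * np.linalg.norm(w_star) ** 2

# Closed form y^H P^{-1} y with P = I + H H^H / mu.
P = np.eye(m) + (H @ H.conj().T) / mu
f_closed = np.real(np.vdot(y, np.linalg.solve(P, y)))
```

The agreement of `f_direct` and `f_closed` is an instance of the matrix inversion lemma, which turns the $n \times n$ regularized inverse into the $m \times m$ matrix $\mathbf{P}^{-1}$.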

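Theorem 2. Let $\mathbf{H} \in \mathbb{C}^{m \times n}$ and $\mathbf{y} \in \mathbb{C}^{m}$ be the estimates of the data matrix and the output vector, respectively, both having deterministic additive perturbations $\|\Delta\mathbf{H}\| \le \delta_H$ and $\|\Delta\mathbf{y}\| \le \delta_Y$, respectively, i.e., $\tilde{\mathbf{H}} = \mathbf{H} + \Delta\mathbf{H}$ and $\tilde{\mathbf{y}} = \mathbf{y} + \Delta\mathbf{y}$, where $\tilde{\mathbf{H}}$ is the full rank data matrix, $\tilde{\mathbf{y}}$ is the output vector, and $m \ge n$. Then the problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\Delta\mathbf{H}\| \le \delta_H,\; \|\Delta\mathbf{y}\| \le \delta_Y} \mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}), \tag{22}$$

where $\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y})$ is defined as in (21), is equivalent to solving the following SDP problem

$$\min \gamma$$

subject to $\tau_1 \ge 0$, $\tau_2 \ge 0$, and

$$\begin{bmatrix}
\gamma + \kappa - \tau_1 - \tau_2 & (\mathbf{y} - \mathbf{H}\mathbf{x})^{H} & \mathbf{x}^{H} & \delta_Y \mathbf{b}^{H} & \delta_H \mathbf{d}^{H} \\
\mathbf{y} - \mathbf{H}\mathbf{x} & \mathbf{I} & \mathbf{0} & -\delta_Y \mathbf{I} & \delta_H \mathbf{X} \\
\mathbf{x} & \mathbf{0} & \mu \mathbf{I} & \mathbf{0} & \mathbf{0} \\
\delta_Y \mathbf{b} & -\delta_Y \mathbf{I} & \mathbf{0} & \tau_1 \mathbf{I} & \mathbf{0} \\
\delta_H \mathbf{d} & \delta_H \mathbf{X}^{H} & \mathbf{0} & \mathbf{0} & \tau_2 \mathbf{I}
\end{bmatrix} \ge 0. \tag{23}$$

Proof of Theorem 2. The proof of Theorem 2 follows similar lines to the proof of Theorem 1, hence is omitted here. □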

Remark 3. Under the framework introduced in this section, one can straightforwardly obtain the corollaries similar to Corollaries 1 and 2 by considering cases in which the uncertainty is either only on the data matrix or only on the output vector, i.e., the $\delta_Y = 0$ and $\delta_H = 0$ cases, respectively. The derivations follow similar lines to Corollaries 1, 2 and Theorem 2, hence are omitted. However, similar results can be readily derived from the result in Theorem 2 with suitable changes in the SDP formulations.

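3.3. Structured robust least squares estimation

There are various communication systems where the data matrix and the perturbation on it have a special structure such as Toeplitz, Hankel, or Vandermonde [9,15]. Incorporating this prior knowledge into the estimation framework could improve the performance of the regret based minimax LS estimation approach [9,15]. Hence, in this section, we investigate a special case of the problem in (2), where the associated perturbations for the data matrix $\mathbf{H}$ and the output vector $\mathbf{y}$ have special structures. The structure on the perturbations is defined as follows

$$\Delta\mathbf{H} = \sum_{i=1}^{p} \alpha_i \mathbf{H}_i, \tag{24}$$

and

$$\Delta\mathbf{y} = \sum_{i=1}^{p} \beta_i \mathbf{y}_i, \tag{25}$$

where $\mathbf{H}_i \in \mathbb{C}^{m \times n}$, $\mathbf{y}_i \in \mathbb{C}^{m}$, and $p$ are known but $\alpha_i, \beta_i \in \mathbb{C}$, $i = 1, \ldots, p$, are unknown. However, the bounds on the norm of $\boldsymbol{\alpha} \triangleq [\alpha_1, \ldots, \alpha_p]^{H}$ and $\boldsymbol{\beta} \triangleq [\beta_1, \ldots, \beta_p]^{H}$ are provided as $\|\boldsymbol{\alpha}\| \le \delta_\alpha$ and $\|\boldsymbol{\beta}\| \le \delta_\beta$, where $\delta_\alpha, \delta_\beta \ge 0$. We emphasize that this formulation can represent a wide range of constraints on the structure of perturbations of the data matrix and the output vector such as Toeplitz and Hankel [9,10]. Our aim is to solve the following optimization problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\boldsymbol{\alpha}\| \le \delta_\alpha,\; \|\boldsymbol{\beta}\| \le \delta_\beta} \mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}),$$

where

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}) \triangleq \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} - \min_{\mathbf{w} \in \mathbb{C}^{n}} \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{w}\|^{2}, \tag{26}$$

$$\tilde{\mathbf{H}} \triangleq \mathbf{H} + \Delta\mathbf{H} = \mathbf{H} + \sum_{i=1}^{p} \alpha_i \mathbf{H}_i, \tag{27}$$

$$\tilde{\mathbf{y}} \triangleq \mathbf{y} + \Delta\mathbf{y} = \mathbf{y} + \sum_{i=1}^{p} \beta_i \mathbf{y}_i. \tag{28}$$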
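As one possible instance of the structured model in (24), a Toeplitz perturbation can be written as a weighted sum of 0/1 basis matrices, one per diagonal. The sketch below is an assumed construction for illustration only: the paper does not specify the basis matrices, and the helper `toeplitz_basis`, the bound `delta_alpha`, and the random weights are hypothetical.

```python
import numpy as np

def toeplitz_basis(m, n):
    """One 0/1 basis matrix per diagonal of an m x n Toeplitz matrix (p = m + n - 1)."""
    return [np.eye(m, n, k=k) for k in range(-(m - 1), n)]

rng = np.random.default_rng(4)
m, n = 5, 3
delta_alpha = 0.5                                  # illustrative bound (assumption)
basis = toeplitz_basis(m, n)
p = len(basis)

alpha = rng.standard_normal(p) + 1j * rng.standard_normal(p)
alpha *= delta_alpha / np.linalg.norm(alpha)       # scale so that ||alpha|| = delta_alpha

dH = sum(a * Hi for a, Hi in zip(alpha, basis))    # structured perturbation as in (24)
```

With this basis, bounding the coefficient vector restricts the perturbation to Toeplitz matrices only, which is a much smaller uncertainty set than the unstructured spectral-norm ball of Section 3.1.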

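After following similar lines to Section 3.1, and introducing the first order Taylor approximation to $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})$ around $\boldsymbol{\alpha} = \mathbf{0}$ and $\boldsymbol{\beta} = \mathbf{0}$, we obtain

$$f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) \approx \kappa + 2\,\mathrm{Re}\Big\{\mathrm{Tr}\Big(\nabla f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})^{H}\big|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}}\,[\boldsymbol{\alpha} \;\; \boldsymbol{\beta}]\Big)\Big\}, \tag{29}$$

where $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) = \tilde{\mathbf{y}}^{H}\tilde{\mathbf{P}}\tilde{\mathbf{y}}$ and $\tilde{\mathbf{P}} = \mathbf{I} - \tilde{\mathbf{H}}\tilde{\mathbf{H}}^{+}$. We next introduce the following lemma to calculate the first order Taylor approximation in (29) in a closed form.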

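Lemma 2. Let $\tilde{\mathbf{H}} = \mathbf{H} + \Delta\mathbf{H}$ be a full rank matrix and $\tilde{\mathbf{y}} = \mathbf{y} + \Delta\mathbf{y}$, where $\tilde{\mathbf{H}} \in \mathbb{C}^{m \times n}$, $\tilde{\mathbf{y}} \in \mathbb{C}^{m}$, $\Delta\mathbf{H}$ and $\Delta\mathbf{y}$ are defined as in (24) and (25), respectively. Then denoting $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) \triangleq \tilde{\mathbf{y}}^{H}\tilde{\mathbf{P}}\tilde{\mathbf{y}}$, where $\tilde{\mathbf{P}} \triangleq \mathbf{I} - \tilde{\mathbf{H}}\tilde{\mathbf{H}}^{+}$, we have

$$\frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \boldsymbol{\alpha}}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}} = \big[-\mathbf{y}^{H}\mathbf{P}^{H}\mathbf{H}_1\mathbf{H}^{+}\mathbf{y}, \; \ldots, \; -\mathbf{y}^{H}\mathbf{P}^{H}\mathbf{H}_p\mathbf{H}^{+}\mathbf{y}\big]^{H}, \tag{30}$$

and

$$\frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \boldsymbol{\beta}}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}} = \big[\mathbf{y}^{H}\mathbf{P}\mathbf{y}_1, \; \ldots, \; \mathbf{y}^{H}\mathbf{P}\mathbf{y}_p\big]^{H}, \tag{31}$$

where $\mathbf{P} \triangleq \mathbf{I} - \mathbf{H}\mathbf{H}^{+}$.

Proof of Lemma 2. Note that the derivative of $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})$ is taken with respect to $[\boldsymbol{\alpha} \;\; \boldsymbol{\beta}]$, hence we can use the Chain Rule to calculate the derivatives by using the results we have obtained in Lemma 1.

First, we consider the derivative of $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})$ with respect to $\alpha_i$, $i = 1, \ldots, p$, i.e.,

$$\begin{aligned}
d_i \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \alpha_i}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}} &= \mathrm{Tr}\bigg\{\bigg(\frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{H}}}\bigg)^{H} \frac{\partial \tilde{\mathbf{H}}}{\partial \alpha_i}\bigg\}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}} \\
&= \mathrm{Tr}\big\{-\mathbf{H}^{+}\mathbf{y}\mathbf{y}^{H}\mathbf{P}^{H}\mathbf{H}_i\big\} \\
&= -\mathbf{y}^{H}\mathbf{P}^{H}\mathbf{H}_i\mathbf{H}^{+}\mathbf{y},
\end{aligned}$$

where the last line follows from the cyclic property of the trace operator.

Similarly, we next consider the derivative of $f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})$ with respect to $\beta_i$, $i = 1, \ldots, p$, i.e.,

$$\begin{aligned}
b_i \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \beta_i}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}} &= \mathrm{Tr}\bigg\{\bigg(\frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \tilde{\mathbf{y}}}\bigg)^{H} \frac{\partial \tilde{\mathbf{y}}}{\partial \beta_i}\bigg\}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}} \\
&= \mathbf{y}^{H}\mathbf{P}\mathbf{y}_i.
\end{aligned}$$

This concludes the proof of Lemma 2. □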
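The chain-rule expressions of Lemma 2 can be checked in the same spirit as Lemma 1. The sketch below uses randomly drawn basis matrices and vectors purely as stand-ins for a real structure (an assumption for illustration); the helper names, the seed, and the step size are hypothetical.

```python
import numpy as np

def f(H, y):
    """Optimal LS residual f(H, y) = y^H (I - H H^+) y."""
    P = np.eye(H.shape[0]) - H @ np.linalg.pinv(H)
    return np.real(np.vdot(y, P @ y))

def lemma2_gradients(H, y, H_basis, y_basis):
    """Vectors d and b of Lemma 2, i.e., the conjugated entries of (30) and (31)."""
    H_pinv = np.linalg.pinv(H)
    P = np.eye(H.shape[0]) - H @ H_pinv
    yP = y.conj() @ P.conj().T                  # row vector y^H P^H (P is Hermitian)
    w = H_pinv @ y                              # H^+ y
    d = np.conj(np.array([-(yP @ Hi @ w) for Hi in H_basis]))
    b = np.conj(np.array([yP @ yi for yi in y_basis]))
    return d, b

rng = np.random.default_rng(6)
m, n, p = 5, 3, 3
crandn = lambda *s: rng.standard_normal(s) + 1j * rng.standard_normal(s)
H, y = crandn(m, n), crandn(m)
H_basis = [B / np.linalg.norm(B) for B in (crandn(m, n) for _ in range(p))]
y_basis = [v / np.linalg.norm(v) for v in (crandn(m) for _ in range(p))]

eps = 1e-6
alpha, beta = crandn(p), crandn(p)
alpha *= eps / np.linalg.norm(alpha)            # small structured perturbation
beta *= eps / np.linalg.norm(beta)

H_t = H + sum(a * Hi for a, Hi in zip(alpha, H_basis))
y_t = y + sum(c * yi for c, yi in zip(beta, y_basis))

d, b = lemma2_gradients(H, y, H_basis, y_basis)
first_order = f(H, y) + 2 * np.real(np.vdot(d, alpha) + np.vdot(b, beta))
err = abs(f(H_t, y_t) - first_order)            # expected to be second order in eps
```

The linear term here has only $2p$ unknowns instead of the $mn + m$ of the unstructured case, which is why the structured SDP involves the smaller blocks built from the basis elements.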

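Now turning our attention back to (29), we denote

$$\mathbf{d} \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \boldsymbol{\alpha}}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}},$$

and

$$\mathbf{b} \triangleq \frac{\partial f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}})}{\partial \boldsymbol{\beta}}\bigg|_{\boldsymbol{\alpha} = \mathbf{0},\, \boldsymbol{\beta} = \mathbf{0}},$$

where we emphasize that the closed form definitions of $\mathbf{d}$ and $\mathbf{b}$ can be obtained from Lemma 2. We then approximate (29) and obtain the first order Taylor approximation as follows

$$f(\tilde{\mathbf{H}}, \tilde{\mathbf{y}}) \approx \kappa + \mathbf{d}^{H}\boldsymbol{\alpha} + \boldsymbol{\alpha}^{H}\mathbf{d} + \mathbf{b}^{H}\boldsymbol{\beta} + \boldsymbol{\beta}^{H}\mathbf{b}.$$

Therefore, we can approximate the regret in (26) as follows

$$\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}) \approx \|\tilde{\mathbf{y}} - \tilde{\mathbf{H}}\mathbf{x}\|^{2} - \big(\kappa + \mathbf{d}^{H}\boldsymbol{\alpha} + \boldsymbol{\alpha}^{H}\mathbf{d} + \mathbf{b}^{H}\boldsymbol{\beta} + \boldsymbol{\beta}^{H}\mathbf{b}\big). \tag{32}$$

In the following theorem, we illustrate how the optimization problem in (32) can be put in an SDP form.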

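Theorem 3. Let $\mathbf{H}, \mathbf{H}_1, \ldots, \mathbf{H}_p \in \mathbb{C}^{m \times n}$, $\mathbf{y}, \mathbf{y}_1, \ldots, \mathbf{y}_p \in \mathbb{C}^{m}$, $\delta_\alpha, \delta_\beta \ge 0$, $m \ge n$, where $\tilde{\mathbf{H}}$ is the full rank data matrix defined as in (27), $\tilde{\mathbf{y}}$ is the output vector defined as in (28), with the corresponding estimates $\mathbf{H}$ and $\mathbf{y}$, respectively. Then the problem

$$\min_{\mathbf{x} \in \mathbb{C}^{n}} \; \max_{\|\boldsymbol{\alpha}\| \le \delta_\alpha,\; \|\boldsymbol{\beta}\| \le \delta_\beta} \mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y}), \tag{33}$$

where $\mathcal{R}(\mathbf{x}; \Delta\mathbf{H}, \Delta\mathbf{y})$ is defined as in (32), is equivalent to solving the following SDP problem

$$\min \gamma$$

subject to $\tau_1 \ge 0$, $\tau_2 \ge 0$, and

$$\begin{bmatrix}
\gamma + \kappa - \tau_1 - \tau_2 & (\mathbf{y} - \mathbf{H}\mathbf{x})^{H} & \delta_\alpha \mathbf{d}^{H} & \delta_\beta \mathbf{b}^{H} \\
\mathbf{y} - \mathbf{H}\mathbf{x} & \mathbf{I} & -\delta_\alpha \mathbf{G} & \delta_\beta \mathbf{Q} \\
\delta_\alpha \mathbf{d} & -\delta_\alpha \mathbf{G}^{H} & \tau_1 \mathbf{I} & \mathbf{0} \\
\delta_\beta \mathbf{b} & \delta_\beta \mathbf{Q}^{H} & \mathbf{0} & \tau_2 \mathbf{I}
\end{bmatrix} \ge 0, \tag{34}$$

where $\mathbf{G} \triangleq [\mathbf{H}_1\mathbf{x}, \ldots, \mathbf{H}_p\mathbf{x}]$ and $\mathbf{Q} \triangleq [\mathbf{y}_1, \ldots, \mathbf{y}_p]$.

Proof of Theorem 3. The proof of Theorem 3 follows similar lines to the proof of Theorem 1, hence is omitted here. □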

Remark 4. Under the framework introduced in this section, one can straightforwardly obtain the corollaries similar to Corollaries 1 and 2 by considering cases in which the uncertainty is either only on the data matrix or only on the output vector, i.e., the $\delta_\beta = 0$ and $\delta_\alpha = 0$ cases, respectively. The derivations follow similar lines to Corollaries 1, 2 and Theorem 3, hence are omitted. However, similar results can be readily derived from the result in Theorem 3 with suitable changes in the SDP formulations.

Remark 5. The proofs of Theorem 2 and Theorem 3 follow from the results of Theorem 1, which relies on the lossless S-procedure. The S-procedure is lossless with two constraints when the corresponding two quadratic (Hermitian) forms are defined on a complex linear space [34]. However, the classical S-procedure for quadratic forms is, in general, lossy with two constraints in the real case [35]. Hence, Theorem 2 and Theorem 3 cannot be extended to real linear spaces. On the other hand, under the frameworks described in Remark 3 and Remark 4, one can safely extend the same conclusions to the real case also, since the S-procedure is lossless for quadratic forms with one constraint both in complex and real spaces [36,37].

Fig. 1. Sorted residual errors for the rgrt-LS, rbst-LS, LS, and TLS estimators over 1000 trials when $\delta_H = \delta_Y = 1.2$, $m = 5$, and $n = 3$.

4. Simulations

We provide numerical examples in different scenarios in order to illustrate the merits of the proposed algorithms. In the first set of the experiments, we randomly generate a data matrix of size $m \times n$, and an output vector of size $m \times 1$, which are normalized to have unit norms. Then, we generate 1000 random perturbations $\Delta\mathbf{H}$, $\Delta\mathbf{y}$, where $\|\Delta\mathbf{H}\| \le \delta_H$, $\|\Delta\mathbf{y}\| \le \delta_Y$, $m = 5$, $n = 3$, and $\delta_H = \delta_Y = 1.2$. Here, we label the algorithm in Theorem 1 as “rgrt-LS”, the robust LS algorithm of [9] as “rbst-LS”, the total LS algorithm [9] as “TLS”, and finally the LS algorithm tuned to the estimates of the data matrix and the output vector as “LS”, where we directly use $\hat{\mathbf{x}} = \mathbf{H}^{+}\mathbf{y}$.

For each algorithm and for each random perturbation, we find the corresponding $\hat{\mathbf{x}}$ and calculate the error $\|\tilde{\mathbf{H}}\hat{\mathbf{x}} - \tilde{\mathbf{y}}\|^{2}$. After we calculate the errors for each algorithm and for all random perturbations, we plot the corresponding sorted errors in ascending order in Fig. 1 for 1000 perturbations. Since the rbst-LS algorithm optimizes the worst-case residual error with respect to the worst possible disturbance, it usually yields the smallest worst-case residual error among all algorithms for these simulations. On the other hand, since the LS algorithm directly uses the estimates, it usually yields the smallest residual error when the perturbations on the data matrix and the output vector are significantly small.
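The experimental protocol above can be sketched for the plain LS baseline only, since the rgrt-LS, rbst-LS, and TLS estimators require solvers that are not reproduced here. The sketch assumes complex Gaussian draws, spectral-norm normalization of the data matrix, and perturbation norms drawn uniformly below the bound; none of these sampling details are stated in the paper, so the resulting numbers are not expected to match the reported ones.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, trials = 5, 3, 1000
delta_H = delta_Y = 1.2

def crandn(*shape):
    """Complex Gaussian array (hypothetical helper)."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

H = crandn(m, n)
H /= np.linalg.norm(H, 2)            # unit spectral norm (assumed normalization)
y = crandn(m)
y /= np.linalg.norm(y)               # unit Euclidean norm
x_ls = np.linalg.pinv(H) @ y         # "LS": estimator tuned to the nominal estimates

errors = np.empty(trials)
for t in range(trials):
    dH = crandn(m, n)
    dH *= delta_H * rng.uniform() / np.linalg.norm(dH, 2)   # ||dH|| <= delta_H
    dy = crandn(m)
    dy *= delta_Y * rng.uniform() / np.linalg.norm(dy)      # ||dy|| <= delta_Y
    errors[t] = np.linalg.norm((H + dH) @ x_ls - (y + dy)) ** 2

sorted_errors = np.sort(errors)      # ascending order, as plotted in Fig. 1
mean_error = errors.mean()
```

Plotting `sorted_errors` gives a curve of the same kind as the LS trace in Fig. 1; the other estimators would be evaluated on the same perturbation draws for a fair comparison.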

These results can be observed in Fig. 1, where, in one extreme, the largest residual errors are observed as 2.9762 for the TLS estimator, 2.2557 for the LS estimator, 1.9275 for the rbst-LS estimator, and 1.9325 for the rgrt-LS estimator. In the other extreme, i.e., when there is almost no perturbation, the smallest estimation errors are observed as 0.3035 for the LS estimator, 0.4036 for the TLS estimator, 0.8727 for the rbst-LS estimator, and 0.6387 for the rgrt-LS estimator. While the LS estimator can be preferable when the perturbations are relatively small and the rbst-LS estimator can be preferable when the perturbations are significantly higher, the introduced algorithm provides a tradeoff between these algorithms and achieves the smallest average error. The average residual error of the rgrt-LS estimator is observed as 1.1928, whereas this value is 1.2180 for the LS estimator, 1.2708 for the rbst-LS estimator, and 1.3826 for the TLS estimator. Hence, the rgrt-LS estimator is not only robust but also efficient in terms of the average error performance compared to its well-known alternatives. Owing to the competitive formulation of our estimators, we achieve such average performance gains especially when the perturbations are moderate.

Fig. 2. Averaged residual errors for the rgrt-LS, rbst-LS, LS, and TLS estimators over 2000 trials for $m = 5$ and $n = 3$, when $\delta = \delta_H = \delta_Y \in [0.5, 1]$.

Fig. 3. Averaged residual errors for the rgrt-LS, rbst-LS, LS, and TLS estimators over 2000 trials for $m = 5$ and $n = 3$, when $\delta_H \in [0.5, 1]$ and $\delta_Y = 1$.

In the second set of experiments, we illustrate the performances of the proposed algorithms under various $\delta_H$ and $\delta_Y$ values. For these experiments, we generate 2000 random perturbations $\Delta H$, $\Delta y$, where $\|\Delta H\| \le \delta_H$, $\|\Delta y\| \le \delta_Y$, $m = 5$, $n = 3$, for different perturbation bounds and compute the averaged error over 2000 trials for the rgrt-LS, LS, rbst-LS, and TLS algorithms. In Fig. 2, we present the averaged residual errors of these algorithms for different values of perturbation bounds, i.e., $\delta = \delta_H = \delta_Y \in [0.5, 1]$. We observe that the proposed rgrt-LS algorithm has the best average residual error performance over different perturbation bounds compared to the LS, the rbst-LS, and the TLS algorithms. Furthermore, in Fig. 3 and Fig. 4, we present the averaged residual errors of these algorithms for different perturbation bounds, i.e., when $\delta_H \neq \delta_Y$. Particularly, in Fig. 3, we set $\delta_H \in [0.5, 1]$, $\delta_Y = 1$, and in Fig. 4, we set $\delta_H = 1$, $\delta_Y \in [0.5, 1]$.

As can be observed from Fig. 2, as the perturbation bounds increase, the performances of the LS and the TLS estimators significantly deteriorate, whereas the rgrt-LS estimator provides an excellent performance. The residual error of the rbst-LS estimator, on the other hand, slightly increases as the perturbation bounds increase, i.e., it is the most robust algorithm against the perturbations due to its highly conservative nature. Yet, the performance of this estimator is significantly inferior to that of the rgrt-LS estimator. Furthermore, the rgrt-LS estimator provides the best performance under different $\delta_H$ and $\delta_Y$ values. Particularly, in Fig. 3, we observe a similar behavior to the one in Fig. 2, where our algorithm provides a robust performance while also providing the smallest residual error (especially for high $\delta_H$). On the other hand, in Fig. 4, we observe that the performance of the rgrt-LS estimator is less sensitive to the changes in $\delta_Y$ compared to the rbst-LS, LS, and TLS estimators.

Fig. 4. Averaged residual errors for the rgrt-LS, rbst-LS, LS, and TLS estimators over 2000 trials for $m = 5$ and $n = 3$, when $\delta_H = 1$ and $\delta_Y \in [0.5, 1]$.

Fig. 5. Sorted residual errors for the str-rgrt-LS, str-rbst-LS, SLS-BDU, and LS estimators over 1000 trials when $\delta_H = \delta_Y = 0.75$, $m = 5$, and $n = 3$.

In the next experiment, we examine a system identification problem [15], which can be formulated as $H_0 x = y_0$, where $H = H_0 + W$ is the observed noisy Toeplitz matrix and $y = y_0 + w$ is the observed noisy output vector. Here, the convolution matrix $H$ (which is Toeplitz) is constructed from $h$, which is selected as a random sequence of $\pm 1$'s. We then generate 1000 random structured perturbations for $H_0$ and $y_0$, where $\|\alpha\| \le 0.75\,\|H_0\|$, and plot the sorted estimation errors in ascending order in Fig. 5.
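To make the structured setup concrete, the following sketch builds a Toeplitz convolution matrix from a random $\pm 1$ sequence and applies a perturbation that preserves the Toeplitz structure. Several details are assumptions rather than statements from the text: the full-convolution layout (so that a length-3 sequence and $n = 3$ give $m = 5$), the reading of $\alpha$ as a perturbation of the generating sequence, the Frobenius norm in the bound, and the helper name `convolution_matrix`.

```python
import numpy as np

def convolution_matrix(h, n):
    """Toeplitz matrix T of shape (len(h) + n - 1, n) with T @ x == np.convolve(h, x)."""
    h = np.asarray(h, dtype=float)
    T = np.zeros((len(h) + n - 1, n))
    for j in range(n):
        T[j:j + len(h), j] = h             # each column is a shifted copy of h
    return T

rng = np.random.default_rng(0)
n = 3
h0 = rng.choice([-1.0, 1.0], size=3)       # random sequence of +/-1's
H0 = convolution_matrix(h0, n)             # unperturbed Toeplitz matrix (5 x 3)
x_true = rng.standard_normal(n)
y0 = H0 @ x_true                           # unperturbed output, H0 x = y0

# Structured perturbation (assumed form): perturb the generating sequence,
# with ||alpha|| bounded by 0.75 * ||H0||, so that H stays Toeplitz.
bound = 0.75 * np.linalg.norm(H0)
alpha = rng.standard_normal(len(h0))
alpha *= rng.uniform(0.0, bound) / np.linalg.norm(alpha)
W = convolution_matrix(alpha, n)
H = H0 + W                                 # observed noisy Toeplitz matrix

x_ls = np.linalg.pinv(H) @ y0              # LS estimate from the noisy matrix
print("LS residual on the true system:", np.linalg.norm(H0 @ x_ls - y0))
```

Perturbing the generating sequence rather than the matrix entries is what distinguishes the structured case: the perturbation has only `len(h0)` degrees of freedom instead of $mn$.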

The average residual errors are observed as 1.1155 for the structured regret LS estimator “str-rgrt-LS” of Remark 4, 1.1807 for the structured robust LS algorithm “str-rbst-LS”, 1.1138 for the LS estimator, and 1.2576 for the structured least squares bounded data uncertainties estimator “SLS-BDU” of [15]. Therefore, we observe that the str-rgrt-LS algorithm yields a smaller average residual error than the other robust estimators and achieves the average performance of the LS estimator. In addition, the maximum residual error is observed as 1.5554 for the str-rgrt-LS estimator, whereas it is 1.6659 for the LS estimator. Hence, the introduced algorithm can be used to obtain robustness without significant losses in the average estimation performance, unlike the conventional robust estimation methods. Nevertheless, we emphasize that for a structured system, the performance of these algorithms is highly sensitive to the structures of the matrices and the vectors. If the perturbation bound is quite high, the robustness may not be preserved.

Fig. 6. Sorted residual errors for the rgrt-reg-LS, rbst-reg-LS, and LS estimators over 1000 trials when $\delta_H = \delta_Y = 0.65$, $\mu = 0.5$, $m = 3$, and $n = 2$.

In the fourth experiment, i.e., in Fig. 6, we provide errors sorted in ascending order for the algorithm in Theorem 2 as “rgrt-reg-LS”, for the robust regularized LS algorithm in [16] as “rbst-reg-LS”, and finally for the regularized LS algorithm as “reg-LS” [10], where the experiment setup is the same as in the first experiment except that the perturbation bounds are set to 0.65 and the regularization parameter is chosen as $\mu = 0.5$. In Fig. 6, we observe the tradeoff that the introduced rgrt-reg-LS algorithm provides between robustness and performance, i.e., between the rbst-reg-LS and the reg-LS algorithms.
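For reference, the reg-LS baseline can be sketched in closed form. The sketch assumes the standard regularized LS cost $\|Hx - y\|^2 + \mu \|x\|^2$, which this section does not spell out; the function names `reg_ls` and `reg_cost` are illustrative.

```python
import numpy as np

def reg_ls(H, y, mu):
    """Minimizer of ||H x - y||^2 + mu * ||x||^2 (assumed form of the regularized LS cost)."""
    n = H.shape[1]
    # Normal equations of the regularized problem: (H^T H + mu I) x = H^T y.
    return np.linalg.solve(H.T @ H + mu * np.eye(n), H.T @ y)

def reg_cost(H, y, x, mu):
    """Regularized residual error of a candidate x."""
    return np.linalg.norm(H @ x - y) ** 2 + mu * np.linalg.norm(x) ** 2

rng = np.random.default_rng(0)
H = rng.standard_normal((5, 3))
H /= np.linalg.norm(H)                     # unit-norm data matrix estimate
y = rng.standard_normal(5)
y /= np.linalg.norm(y)                     # unit-norm output vector estimate

x_reg = reg_ls(H, y, mu=0.5)
print("reg-LS cost on the nominal data:", reg_cost(H, y, x_reg, mu=0.5))
```

For $\mu > 0$ the matrix $H^T H + \mu I$ is positive definite, so the linear solve is well posed even when $H$ is rank deficient.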

When there are small perturbations on the data matrix and the output vector, i.e., in the best-case scenario, the residual error of the reg-LS estimator is 0.1045, whereas it is 0.2416 for the rgrt-reg-LS estimator and 0.4282 for the rbst-reg-LS estimator. As can be observed from Fig. 6, for higher perturbations, the performance of the reg-LS estimator significantly deteriorates, whereas the rgrt-reg-LS and rbst-reg-LS algorithms provide a robust performance. On the other hand, the rgrt-reg-LS estimator significantly outperforms the rbst-reg-LS estimator in terms of the average error performance and achieves an even more desirable error performance than the reg-LS estimator. The average residual errors are calculated as 0.9059 for the rgrt-reg-LS estimator, 0.9177 for the reg-LS estimator, and 1.0316 for the rbst-reg-LS estimator. This experiment illustrates the sensitivity of the reg-LS estimator to the perturbations. On the other hand, the rgrt-reg-LS and rbst-reg-LS estimators provide more robust performances compared to the reg-LS estimator. Yet, the highly pessimistic nature of the rbst-reg-LS estimator deteriorates its estimation performance and yields an unacceptable performance. Our algorithm, on the other hand, not only yields a robust performance compared to the reg-LS estimator but also does not cause any average performance degradation, unlike the conventional robust estimation methods.

Fig. 7. BER performances of the rgrt-LS, rbst-LS, and TLS estimators (equalizers) over 1 000 000 trials under various SNRs, when $m = 3$ and $n = 2$.

Finally, we illustrate the possible applications of our algorithm in different frameworks. Particularly, we consider the channel equalization problem and illustrate the bit error rate (BER) performance of our algorithm with respect to its well-known alternatives in the literature as follows.

In these simulations, we define the signal-to-noise ratio (SNR) as follows:

$$\mathrm{SNR} = 20 \log \frac{\|x\|}{\delta},$$

where $\|H\| = 1$ and $\log(\cdot)$ is the common (i.e., base 10) logarithm. For a given SNR, we randomly generate 1 000 000 symbol vectors $x$ (having length 2) from a binary alphabet and 1 000 000 estimates of the (MIMO) channel matrix $H$ (sized $3 \times 2$), both having unit norms. For every symbol vector and channel estimate couple, we randomly generate perturbations $\Delta H$ and $\Delta y$, calculate the corresponding perturbed output vector, and feed this information to the algorithms. We quantize the estimate of the symbol vector $\hat{x}$ and consider the number of incorrect bits as the BER (i.e., we consider the BER rather than the symbol error rate).
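The BER experiment can be sketched as follows for a plain LS equalizer; the rgrt-LS, rbst-LS, and TLS equalizers compared in the text are not reproduced. The sketch rests on several assumptions that the text leaves open: the SNR is read as $20 \log_{10}(\|x\| / \delta)$ with a common bound $\delta = \delta_H = \delta_Y$, the observation model is taken as $y = (H + \Delta H)x + \Delta y$, the perturbations are sampled with a Gaussian direction and a uniformly drawn norm, and the names `snr_to_delta`, `random_bounded`, and `ls_equalizer_ber` are hypothetical.

```python
import numpy as np

def snr_to_delta(snr_db, x_norm=1.0):
    """Invert SNR = 20 * log10(||x|| / delta) for the perturbation bound delta."""
    return x_norm / 10.0 ** (snr_db / 20.0)

def random_bounded(shape, bound, rng):
    """Random perturbation with norm drawn uniformly from [0, bound] (assumed sampling rule)."""
    a = rng.standard_normal(shape)
    return rng.uniform(0.0, bound) * a / np.linalg.norm(a)

def ls_equalizer_ber(snr_db, trials=10_000, m=3, n=2, seed=0):
    """Fraction of wrongly detected bits for an LS equalizer at the given SNR."""
    rng = np.random.default_rng(seed)
    delta = snr_to_delta(snr_db)           # symbol vectors are scaled to unit norm below
    wrong_bits = 0
    for _ in range(trials):
        bits = rng.choice([-1.0, 1.0], size=n)      # binary alphabet
        x = bits / np.linalg.norm(bits)             # unit-norm symbol vector
        H = rng.standard_normal((m, n))
        H /= np.linalg.norm(H)                      # unit-norm channel estimate
        # Assumed observation model: perturbed channel plus output perturbation.
        y = (H + random_bounded((m, n), delta, rng)) @ x + random_bounded(m, delta, rng)
        x_hat = np.linalg.pinv(H) @ y               # LS equalizer using only H and y
        wrong_bits += int(np.sum(np.sign(x_hat) != bits))  # quantize and count bit errors
    return wrong_bits / (trials * n)

print("LS BER at SNR = 20:", ls_equalizer_ber(20.0))
```

A more robust equalizer would replace only the line computing `x_hat`; the data generation and the bit counting stay the same, which is what makes the BER curves of the different estimators comparable.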

In Fig. 7, we provide the BERs for various SNRs. We observe that the proposed algorithm outperforms its competitors in terms of equalization performance and successfully reconstructs the transmitted bits. While Fig. 7 illustrates the BER of the proposed algorithms averaged over a huge number of channel uses, we also illustrate the robustness of our algorithm over a small number of channel uses in Fig. 8 and Fig. 9. In these experiments, we perform 100 independent trials, in each of which 10 000 symbol vectors and channel matrix estimates are generated and sent over the channel as in the previous experiment, for SNR $= 20$ and SNR $= 25$, respectively.

In Fig. 8 and Fig. 9, we observe that our algorithm not only provides a superior averaged performance with respect to its well-known alternatives but also provides a robust performance. The conventional robust LS estimators provide unsatisfactory results since these algorithms adapt themselves to the worst-case scenario. However, the rgrt-LS estimator has a significantly smaller BER compared to the rbst-LS and TLS estimators, since our algorithm does not tune itself to the worst possible perturbation, but considers the worst possible regret. Particularly, when the perturbations on the estimates are relatively small, our algorithm provides significant performance improvements compared to the conventional methods, as can be seen in Fig. 8 and Fig. 9.

Fig. 8. Sorted BERs for the rgrt-LS, rbst-LS, and TLS estimators (equalizers) over 100 trials, where in each trial 10 000 symbol vectors are sent for SNR $= 20$, $m = 3$, and $n = 2$.

Fig. 9. Sorted BERs for the rgrt-LS, rbst-LS, and TLS estimators (equalizers) over 100 trials, where in each trial 10 000 symbol vectors are sent for SNR $= 25$, $m = 3$, and $n = 2$.

5. Conclusion

In this paper, we introduce a robust approach to LS estimation problems under bounded data uncertainties based on a novel regret formulation. We study the robust LS estimation problems in the presence of unstructured and structured perturbations under residual and regularized residual error criteria. In all cases, the data vectors that minimize the worst-case regrets are found by solving certain SDP problems. In our simulations, we observed that the proposed estimation methods provide an efficient tradeoff between the performance and robustness. Owing to the regret-based formulation of the proposed method, we obtain significant improvements in terms of the average estimation performance with respect to the conventional robust minimax estimation methods, while maintaining the robustness, as shown in our experiments.