
Estimating network structure via random sampling: cognitive social structures and the adaptive threshold method


Contents lists available at SciVerse ScienceDirect

Social Networks

journal homepage: www.elsevier.com/locate/socnet


Michael D. Siciliano a,∗, Deniz Yenigun c, Gunes Ertan b

a University of Illinois at Chicago, Chicago, IL, United States
b University of Pittsburgh, Pittsburgh, PA, United States
c Bilkent University, Ankara, Turkey

Article info

Keywords: Network sampling; Cognitive social structures; Cross-network research; Social networks

Abstract

This paper introduces and tests a novel methodology for measuring networks. Rather than collecting data to observe a network or several networks in full, which is typically costly or impossible, we randomly sample a portion of individuals in the network and estimate the network based on the sampled individuals’ perceptions on all possible ties. We find the methodology produces accurate estimates of social structure and network level indices in five different datasets. In order to illustrate the performance of our approach we compare its results with the traditional roster and ego network methods of data collection. Across all five datasets, our methodology outperforms these standard social network data collection methods. We offer ideas on applications of our methodology, and find it especially promising in cross-network settings.

© 2012 Elsevier B.V. All rights reserved.

“For the last thirty years, empirical social research has been dominated by the sample survey. But as usually practiced, ..., the survey is a sociological meatgrinder, tearing the individual from his social context and guaranteeing that nobody in the study interacts with anyone else in it.”

Allen Barton, 1968 (quoted in Freeman, 2004)

1. Introduction

Most cross-group or cross-organizational research studies rely on random sampling for the collection of data on economic and organizational variables. Such an approach precludes the measurement of the network within each organization as complete or near complete participation rates are needed (Wasserman and Faust, 1994). We offer advances in data collection methods to enable researchers to maintain a random sample framework while also collecting network data on the relationships among the individuals in the organizations under study. Our method begins by randomly sampling a portion of individuals in the network and then estimates the complete network based on the sampled individuals’ perceptions of all possible ties, which are referred to as cognitive slices. Thus, rather than collecting data from each actor in the organization to observe the network in full, which is typically costly or impossible in a cross-organizational setting involving multiple networks, we provide a methodology to aggregate sampled individuals’ perceptions of the full network.

∗ Corresponding author at: 412 South Peoria St., 140 CUPPA Hall (M/C 278), Chicago, IL 60607-7064, United States. Tel.: +1 412 427 9621.

E-mail address: sicilian@uic.edu (M.D. Siciliano).

There are two interrelated areas of methodological research on networks associated with our current agenda: network sampling (Butts, 2003; Frank, 2005; Heckathorn, 1997) and network measurement under conditions of missing data (see for instance Butts, 2003; Costenbader and Valente, 2003; Borgatti et al., 2006). This study speaks to these ongoing research areas but addresses them from a distinctly different angle, as our goal is to recreate an accurate network representation from a small sample of network members.

The following section will offer a brief justification of our methods. Section 3 provides an overview of cognitive social structures and combination methods to deal with three-way data. Section 4 will discuss our estimation and aggregation methods for sampling and combining cognitive slices to produce accurate representations of the “true” network. Section 5 will introduce the datasets we analyze and provide the results of our analysis involving a comparison of our methodology’s performance against the standard roster approach and the ego network approach. Section 6 offers thoughts on implementation and potential limitations.

2. Rationale

Perhaps the most challenging step for researchers wishing to measure a network is data collection. The data collection phase is especially difficult in a cross-network study as one has to measure a number of networks. Typically network researchers employ one of two approaches to data collection. The first is to attempt to sample every individual in the organization and collect data on the direct network ties. The second is to sample a subset of individuals and collect ego network data (i.e. direct ties as well as the ties among the alters). Each method presents a different set of problems.

0378-8733/$ – see front matter © 2012 Elsevier B.V. All rights reserved.

The first method, utilizing a standard roster survey, requires the questionnaire to be distributed to each actor in each network. This is a time consuming process for the researcher and one reason why network studies with a comparative or cross-network framework tend to incorporate only a few networks. The traditional roster method also requires high participation rates in order to produce valid network data (Wasserman and Faust, 1994). Recent meta-analyses on survey response rates indicate organizational research achieves an average response rate of just over 50% (Baruch and Holtom, 2008; Anseel et al., 2010). Issues of missing data potentially create statistical power issues by reducing the total sample size as some organizations with low response rates will be unsuitable for analysis given concerns over the validity of the network structure. For instance, Sparrowe et al. (2001) had to eliminate nearly 20% of the groups in their analysis due to low response rates within those groups. In terms of the accuracy of the network for a single group, Stork and Richards (1992) note that in a network of 60 actors a 75% response rate provides complete data for only 55% of the relationships in the network. The remaining relationships have only partial data or no data at all and therefore determining the existence or nonexistence of a social tie becomes problematic. The accuracy of both the whole network and individual level measures in the roster method is completely dependent on the response rate. As the response rate increases the accuracy and validity of the measures increase. Therefore, researchers employing this method must expend additional time and resources in sending reminders, offering participation incentives, and following up with participants to improve response rates.
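The Stork and Richards figure can be approximated with a short calculation. The sketch below assumes that "complete data" for a relationship means both members of the dyad responded; the function name `complete_dyad_share` is introduced here for illustration only.

```python
from math import comb

def complete_dyad_share(n_actors: int, response_rate: float) -> float:
    """Share of dyads in which both members returned the survey."""
    respondents = round(n_actors * response_rate)
    return comb(respondents, 2) / comb(n_actors, 2)

# 60 actors, 75% response rate: 45 respondents.
share = complete_dyad_share(60, 0.75)
print(f"{share:.3f}")  # 990 of 1770 dyads have two respondents
```

Under this assumption the share is roughly 56%, in line with the figure of about 55% cited above.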

While the roster method is more suited to organizational research given the bounded network, a second approach, using ego network data, could be employed. For example, one could estimate a network variable, such as density, in each ego-net and average across the values of the egos sampled to produce an estimate of the global density measure in a particular network. The reliance on averaging across individuals alleviates some of the problems associated with time and response rates. If one is randomly sampling ego networks within an organization, then demand for a high response rate is reduced and therefore the need for multiple follow-ups or additional incentives for organizational actors to participate is lessened. However, the capability of ego network data to produce accurate whole network or global measures is uncertain, as a random sample of ego-networks may not result in information concerning all areas of the network under study. Consequently, an ego network approach does not allow for the estimation of the overall network structure as it does not attempt to gather information on all actors. That said, in an organizational research setting, one could employ a personal network strategy to capture global properties and therefore we include the ego network approach as another point of comparison with the novel methodology we present in this paper.

The method we propose requires only a small random sample to produce accurate estimates of both network structure and global network measures. The powerful combination of random sampling and network data collection provides a researcher with the opportunity to fruitfully explore not only economic and organizational factors related to an outcome of interest, but also the social structure of the human associations under study. Because our methodology relies, in part, on perceptions of ties in a network, we begin with a discussion of cognitive social structures.

Perceiver: A

      A  B  C  D  E
  A   0  0  1  0  1
  B   0  0  1  0  0
  C   1  1  0  0  0
  D   0  0  0  0  0
  E   1  0  0  0  0

Legend: Knowledge = row A and column A (shaded in the original figure); Perception = all other cells.

Fig. 1. Cognitive social structure for actor A, knowledge versus perception.

3. Network data and cognitive social structures

Our methods utilize cognitive social structures (defined below) as an alternative means of data collection to address issues of time, response rate, and accurate measurement that arise when collecting network data for a large number of groups or organizations. Specifically we ask, can we randomly sample a small subset of individual perceptions of the network to produce an accurate representation of the overall network those individuals are embedded in? If accurate network representations can be produced from only a few individuals, then data collection can be more easily conducted by interviewing only a handful of randomly selected actors within an organization, and, under any research setting, issues of response rate can be minimized.

3.1. Cognitive social structures

Cognitive social structures (CSS) are three dimensional network structures, which represent each individual actor’s perception of the entire network (Krackhardt, 1987). Thus, unlike traditional data collection methods asking an actor to indicate only his/her ties with other actors in the network, when collecting CSS data the actor needs to provide information on all of the possible ties in the network. These structures are represented as $R_{i,j,m}$, where i is the sender of the relation, j is the receiver of the relation, and m is the perceiver of the relation (Krackhardt, 1987). Here $R_{i,j,m}=1$ means that person m perceives a relation to exist from actor i to actor j. If $R_{i,j,m}=0$ then person m perceives the relation to not exist. Therefore, a full CSS network of size N would be an N×N×N array containing 0 or 1 entries representing the existence of possible ties.¹

In a directed network consisting of 5 actors, CSS data collection would require each actor to make a judgment about 20 possible ties. So in any CSS matrix, say for actor A, there are two types of data. The first type involves information concerning the ties that involve actor A, which we call knowledge. The second type involves actor A’s opinion of the ties between the other actors in the network, which we call perception. For example, Fig. 1 represents a cognitive social structure for actor A in a 5 actor network, also referred to as A’s cognitive slice. As noted, the knowledge data provided by A is in the shaded row and column and the perception data is in the non-shaded areas.
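The split between knowledge and perception can be made concrete with actor A’s slice from Fig. 1. The following is a minimal sketch; the helper name `split_slice` and the dictionary layout are illustrative choices, not part of the original method description.

```python
actors = ["A", "B", "C", "D", "E"]
# Actor A's cognitive slice from Fig. 1 (rows = senders, columns = receivers).
slice_a = [
    [0, 0, 1, 0, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]

def split_slice(cells, perceiver):
    """Split a slice's off-diagonal cells into knowledge and perception."""
    knowledge, perception = {}, {}
    n = len(cells)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-ties are not considered
            # A cell is knowledge when the perceiver is sender or receiver.
            target = knowledge if perceiver in (i, j) else perception
            target[(i, j)] = cells[i][j]
    return knowledge, perception

knowledge, perception = split_slice(slice_a, perceiver=0)
print(len(knowledge), len(perception))  # 8 knowledge cells, 12 perception cells
```

The counts add up to the 20 possible directed ties among 5 actors.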

To date, the study of CSS has focused primarily on the psychological aspects of perception and cognitive accuracy. Cognitive accuracy has been defined as “the degree of similarity between an individual’s perception of the structure of informal relationships in a given social context and the actual structure of those relationships” (Casciaro et al., 1999, p. 286). Research on CSS has explored the connections between cognitive accuracy and power in an organization (Krackhardt, 1990) as well as the structural and/or psychological reason for variations in cognitive accuracy (Pattison, 1994; Casciaro, 1998).

¹ It is possible to have valued CSS, but for our current purposes we are only interested in binary networks.


Of importance for the context of this paper is the fact that research on CSS has shown that individuals have large variations in their ability to accurately perceive the network. Individuals make errors of omission (claiming a tie does not exist when it does) and errors of commission (claiming a tie exists when it does not). Thus, any single individual’s reconstruction of the network will be flawed and will generally provide a poor representation of the true network. Our hypothesis guiding the research objective is that by combining a small number of flawed cognitive social structures the errors of omission and commission uniquely associated with a single individual can be washed away through aggregation with other individuals’ flawed perceptions.

3.2. CSS reduction/aggregation methods

In order to work with cognitive social structures, because they are three-dimensional datasets, one needs to engage in some form of data reduction or data aggregation. Krackhardt (1987) provided three methods of aggregating cognitive social structures in order to transform three-dimensional data into two-dimensional data. The three methods discussed were slices, locally aggregated structures, and consensus structures. Slices are defined as one individual’s perception of the network. Thus, a slice indicates all ties between i and j, holding the perceiver constant:

$$R_{i,j} = R_{i,j,m},$$

where m is a constant, indicating the perceiver. Locally aggregated structures (LAS) are traditional means of network data collection relying only on information provided by the receiver or sender of a particular tie. The rationale is that the individuals best suited to determine if a tie exists are the members of the dyad under question. Such structures are termed locally aggregated because “the resulting relation between i and j depends on information provided by the most local members in the network, namely i and j themselves” (Krackhardt, 1987, p. 116). Because the CSS data contains information on both i’s and j’s knowledge of their individual ties to others in the network, it is possible to combine their knowledge through an intersection rule

$$R_{i,j} = \{R_{i,j,i} \cap R_{i,j,j}\}$$

or a union rule

$$R_{i,j} = \{R_{i,j,i} \cup R_{i,j,j}\}.$$

Based on the preceding, for a tie from actor i to actor j to exist in the aggregated network under the LAS intersection rule, both actor i and actor j must agree on its existence. Alternatively, under the LAS union rule, a tie can exist in the aggregate network if either i or j claims the tie to exist. Thus, the LAS intersection rule is a more conservative rule requiring mutual agreement by both actors in a dyad. Note, however, that such methods use only knowledge data and ignore all of the perception data provided by the respondent.
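The two LAS rules can be sketched in a few lines. The example below is illustrative only: it stores the CSS as `css[m][i][j]` (perceiver first, whereas the notation above puts the perceiver last), and the 3-actor `toy_css` is hypothetical data, not taken from any dataset in this paper.

```python
def las(css, rule="intersection"):
    """Locally aggregated structure from a CSS stored as css[m][i][j],
    where m is the perceiver, i the sender and j the receiver."""
    n = len(css)
    out = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            sender_says, receiver_says = css[i][i][j], css[j][i][j]
            if rule == "intersection":
                out[i][j] = int(sender_says and receiver_says)
            else:  # any other value is treated as the union rule
                out[i][j] = int(sender_says or receiver_says)
    return out

# Hypothetical 3-actor CSS: one slice per perceiver.
toy_css = [
    [[0, 1, 1], [0, 0, 0], [0, 0, 0]],  # actor 0 claims 0->1 and 0->2
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # actor 1 confirms 0->1, claims 1->2
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],  # actor 2 reports no ties at all
]

las_intersection = las(toy_css, "intersection")
las_union = las(toy_css, "union")
print(las_intersection)
print(las_union)
```

In this toy case only the 0→1 tie survives the intersection rule, because it is the only tie confirmed by both its sender and its receiver, while the union rule also keeps 0→2 and 1→2.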

Consensus structures rely on information provided by every individual’s perception of the tie between i and j in the network. Thus, as noted by Krackhardt, a practical implementation of a consensus structure is to set a threshold value, where a tie is defined to exist once a certain percentage of network members claim a tie exists between i and j. The threshold function can be defined as:

$$R_{i,j} = \begin{cases} 1, & \text{if } \sum_{m} R_{i,j,m} \ge \text{Threshold} \\ 0, & \text{otherwise.} \end{cases}$$
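A consensus structure with a simple count threshold might look as follows. This is a sketch under two assumptions: the threshold is expressed as a number of perceivers rather than a percentage, and a tie is created when the count reaches the threshold, as the prose above describes. The `toy_css` data are hypothetical.

```python
def consensus(css, threshold):
    """Consensus structure: tie i->j exists when at least `threshold`
    perceivers report it. css[m][i][j] is perceiver m's report."""
    n = len(css)
    result = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                votes = sum(css[m][i][j] for m in range(n))
                result[i][j] = int(votes >= threshold)
    return result

# Hypothetical 3-actor CSS: one slice per perceiver.
toy_css = [
    [[0, 1, 1], [0, 0, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],
    [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
]

majority = consensus(toy_css, threshold=2)    # two of three must agree
any_report = consensus(toy_css, threshold=1)  # a single report suffices
print(majority)
print(any_report)
```

Unlike the LAS rules, every perceiver’s report counts equally here, whether or not that perceiver is a member of the dyad.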

Other methods of aggregation for three-way data have been used, see for instance Batchelder et al. (1997).

Perceiver A
      A  B  C  D  E
  A   0  0  1  0  1
  B   0  0  1  0  0
  C   1  1  0  0  0
  D   0  0  0  0  0
  E   1  0  0  0  0

Perceiver B
      A  B  C  D  E
  A   0  1  0  0  0
  B   1  0  1  0  1
  C   0  1  0  0  1
  D   0  0  0  0  0
  E   0  0  0  0  0

Perceiver C
      A  B  C  D  E
  A   0  0  0  0  0
  B   1  0  1  0  0
  C   1  1  0  0  0
  D   0  0  0  0  0
  E   0  0  0  0  0

Perceiver D
      A  B  C  D  E
  A   0  0  1  0  1
  B   0  0  1  1  0
  C   1  1  0  0  0
  D   0  1  0  0  1
  E   1  0  0  1  0

Perceiver E
      A  B  C  D  E
  A   0  0  0  1  0
  B   0  0  0  0  1
  C   0  0  0  0  0
  D   1  0  0  0  1
  E   0  1  0  1  0

Legend: Knowledge = the perceiver’s own row and column (shaded in the original figure); Perception = all other cells.

Fig. 2. Sample cognitive slices for a 5 actor network.

4. The adaptive threshold method

As indicated in the brief discussion on cognitive social structures, much of the current research in cognitive network theory focuses on perception as the phenomenon to be explained and compares individual perceptions to reality. This paper seeks to combine a small subset of individual perceptions to construct an approximation of reality by estimating the “true” network structure from a sample of cognitive slices. The methods to produce a single network from a random sample of cognitive slices will be discussed in detail below. There are two general steps: (i) sampling individuals to obtain cognitive slices and (ii) aggregating cognitive slices. Once the sampling and aggregation of slices have been discussed, details on the performance of our methods using actual datasets will be provided in Section 5.

4.1. Sampling and aggregating cognitive slices

Given our desire to maintain a random sampling framework, every actor in the network has an equal probability of being selected and upon being selected would be asked to provide his or her cognitive social structure. Once data has been gathered from a random sample of size n, aggregation of the cognitive slices relies on a two part procedure utilizing both types of data available in an actor’s cognitive social structure: knowledge and perception. As a simple example of aggregation, assume there are 5 actors in a network: A, B, C, D, and E. Each actor’s cognitive social structure is given in Fig. 2. Note that there is no requirement that the CSS be symmetric.

Following Krackhardt (1990), we derive the “true” network through LAS intersection.² For affective relations, which cannot be observed or verified objectively, the most informed actors concerning the existence of a tie are the members of the dyad under question. Thus, truth is based on an ideal situation, where we have information about each directional tie from both actors in the dyad. When determining whether the $R_{i,j}$ tie exists, we can assess the information provided by i and j. For example, i may claim to hold a friendship tie with j, but j may not confirm i’s claim. In this case $R_{i,j,i}=1$, $R_{i,j,j}=0$. There is no way to objectively determine who is correct. As Krackhardt (1990, 1996) claims, if both i and j agree on the i–j tie then it is more likely to be true than if they do not agree. This approach of defining a friendship tie only when both parties agree that it exists has obvious face validity (Krackhardt, 1990, p. 349). Furthermore, the use of LAS intersection offers another advantage. When comparing our method with the traditional roster method in Section 5, LAS intersection produces a criterion graph that can be reproduced by both methodologies. Because the LAS approach “mimics the typical form in which network data are collected” (Krackhardt, 1990, p. 349), if the informants in the traditional method are accurate, then the row-dominated roster method and the LAS intersection method will produce similar results. Thus, the LAS intersection approach to the true network provides an achievable criterion state for both methodologies and therefore offers a valid point of comparison.

² There are other ways to construct the “true” network given a set of informant reports. Butts (2003) and Romney et al. (1986) argue that truth can be approximated via consensus methods.

The goal of our methodology is to produce an accurate representation of this “true” network when only sampling a few of the individuals in the network. In other words, if we were only able to survey 3 actors of this 5 person network, how accurately could we combine the knowledge and perception data contained in each of their cognitive slices to approximate the “true” network as defined by LAS intersection?

When a sample is drawn, the primary issue is how to determine the proper means of aggregating the sampled cognitive slices. Because we are sampling actors we are forced to deal with both knowledge and perception as many of the dyads in the network contain actors who were not sampled and thus have no local knowledge to bear on their relations. Under our sampling conditions, three different scenarios for the determination of a potential tie in the aggregated network exist. The first is when knowledge data is present for both actors involved in the tie (i.e. both actors were sampled). The second is when no knowledge exists for a tie, and thus there is only perception about the tie’s existence (i.e. neither of the two actors involved in the tie under question was sampled). The third is when knowledge exists for only one of the actors involved in the dyad (i.e. only one of the two actors involved in the tie was sampled).

In order to illustrate these three scenarios, consider the case where A, D, and E are selected as a sample. Each actor, excluding the diagonal, provides eight pieces of knowledge and twelve pieces of perception data. In the first scenario, when both actors in the dyad under question are sampled, we combine the knowledge components of each of the sampled actors using the LAS intersection rule. Thus, for instance: (i) the D–E tie was claimed by both D and E, so the tie exists in the aggregate network, (ii) the A–E tie was claimed by A, but denied by E, so the tie does not exist in the aggregate network, and (iii) the A–D tie was denied by both A and D, so the tie does not exist in the aggregate network. According to our method, no perceptions can change the existence or non-existence of these ties.

In the second scenario, neither of the actors was sampled and so the existence of the tie can only be determined by the perception of others. Thus, in our sample of A, D, and E, the existence of a tie between B–C in the aggregated network relies on the sampled actors’ perception that such a tie exists because neither actor B nor actor C was sampled. The critical question in these instances is how much perception evidence must be brought to bear on a particular tie before we claim that it exists in the aggregate network. This requires us to set an evidence threshold, k, where if k or more sampled actors perceive a tie to exist, then the tie will be created in the aggregate network. Discussion about how k is determined is detailed in Section 4.2.

TRUE                        EST.
      A  B  C  D  E               A  B  C  D  E
  A   0  0  0  0  0           A   0  0  1  0  0
  B   0  0  1  0  1           B   0  0  1  0  0
  C   1  1  0  0  0           C   1  1  0  0  0
  D   0  0  0  0  1           D   0  0  0  0  1
  E   0  0  0  1  0           E   0  0  0  1  0

Fig. 3. True network derived via LAS intersection (TRUE) and the estimated network (EST).

In the third scenario, where only one individual of a possible tie is sampled, we must also rely on perception. For example, A claims a tie with C, but C was not sampled. We cannot treat this as the A–E case, since C was not sampled, so C did not have a chance to accept or deny this tie. Therefore, we treat this as perception, and the existence of the A–C tie has one piece of perception evidence indicating that the tie exists. In such a case, if our perception threshold is k, then k − 1 additional perceptions from the other sampled actors must be present to conclude that the A–C tie exists.

Following this approach, the network estimated from sample A, D, and E, and the “true” network obtained by the LAS intersection of all five slices are given in Fig. 3. The estimated network in Fig. 3 was determined by setting a simple evidence threshold, k, to a value of 2.
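The worked example can be checked mechanically. The sketch below encodes the five slices of Fig. 2 and applies the three scenarios with a static threshold k = 2; the names `says`, `las_intersection`, and `estimate` are illustrative, and the string encoding of the slices is only a compact way to type them in.

```python
ACTORS = "ABCDE"
# Cognitive slices from Fig. 2, stored as CSS[perceiver][sender][receiver].
CSS = {
    "A": ["00101", "00100", "11000", "00000", "10000"],
    "B": ["01000", "10101", "01001", "00000", "00000"],
    "C": ["00000", "10100", "11000", "00000", "00000"],
    "D": ["00101", "00110", "11000", "01001", "10010"],
    "E": ["00010", "00001", "00000", "10001", "01010"],
}

def says(perceiver, i, j):
    """1 if `perceiver` reports a tie from actor i to actor j."""
    return int(CSS[perceiver][ACTORS.index(i)][ACTORS.index(j)])

def las_intersection():
    """Criterion ("true") network: both dyad members must report the tie."""
    return [[int(i != j and says(i, i, j) and says(j, i, j)) for j in ACTORS]
            for i in ACTORS]

def estimate(sample, k):
    """Aggregate the sampled slices with a static evidence threshold k."""
    net = []
    for i in ACTORS:
        row = []
        for j in ACTORS:
            if i == j:
                row.append(0)
            elif i in sample and j in sample:
                # Scenario 1: knowledge from both members, LAS intersection.
                row.append(int(says(i, i, j) and says(j, i, j)))
            else:
                # Scenarios 2 and 3: every sampled report counts as perception.
                votes = sum(says(m, i, j) for m in sample)
                row.append(int(votes >= k))
        net.append(row)
    return net

true_net = las_intersection()
est_net = estimate(sample="ADE", k=2)
for name, true_row, est_row in zip(ACTORS, true_net, est_net):
    print(name, true_row, est_row)
```

The two matrices printed by this sketch agree with the TRUE and EST panels of Fig. 3, including the missing B–E tie and the spurious A–C tie in the estimate.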

In this simple example, the estimated network produced both an error of omission (B–E tie) and an error of commission (A–C tie). It is evident that determining the threshold k becomes the most important factor for our aggregation methods to produce an accurate representation of the “true” network. In this example, k was arbitrarily defined to be 2. This means that if two or more sampled individuals claim that a tie exists between any two unsampled individuals, then the tie will be established in the aggregate network. This is a naive way to determine k. As sample sizes drawn from different networks will vary, a static level of k will tend to perform poorly (see Section 5). In addition, each sample will contain actors with varying capacity to accurately perceive the network and thus varying propensities to commit errors. A more sophisticated approach to defining and adapting the threshold level for a particular sample should take into account both the sample size and the error rate.

4.2. Setting an adaptive threshold

From previous research on cognitive social structures it is known that individuals vary in the number of errors of omission and errors of commission they make. Because of this, any sample of cognitive slices may be more or less prone to error and hence more or less trustworthy as a whole. It would be helpful to provide a measure of the accuracy of the sample drawn and use the measure of accuracy to set the most appropriate threshold k. This would allow k to be adjusted as a means of controlling the amount of error that occurs when aggregating a sample of cognitive slices. Borrowing terminology from statistics, two error types are possible when determining the existence of ties: Type 1 errors or errors of commission and Type 2 errors or errors of omission. Then for a given threshold k, we have

$$P(\text{Type 1 error}) = P(\text{Perception says there is a tie} \mid \text{There is no tie}) = \alpha_k, \tag{1}$$

$$P(\text{Type 2 error}) = P(\text{Perception says there is no tie} \mid \text{There is a tie}) = \beta_k, \tag{2}$$

where P denotes probability. We will estimate these probabilities from the observed frequencies. Because each sample of cognitive slices contains both knowledge and perception, we are able to estimate the probability of error from the sample itself without the need for a full dataset. In our adaptive threshold method we focus on Type 1 errors due to the lack of knowledge about the origins of Type 2 errors. A Type 2 error can occur because either an actor does not believe that a tie exists between two individuals or because the actor is simply unaware of whether a tie exists and the default decision when unaware of a tie may be to claim its nonexistence. This creates problems when attempting to measure and reduce Type 2 errors during the aggregation of cognitive slices. Type 1 error is also more important for aggregating cognitive slices as the evidence threshold in the consensus methods for cognitive social structures is based on those who perceive a tie to exist, and thus errors of commission are potentially more costly.

For an example of a Type 1 error in the context of a single cognitive slice, return to Fig. 2 and view the sampled slices of A, D, and E. Note that A claims to not send a tie to D and D claims to not receive a tie from A. Thus, based only on the sampled slices we know that the directional tie A–D does not exist in the “true” network. An example of a Type 1 error occurs in the sampled cognitive slices because actor E perceives that a tie exists from A to D when in fact we know that it does not. Thus, actor E committed a Type 1 error and we can acknowledge this error based solely on the sampled cognitive slices. This is a crucial point. We can locate and count the number of Type 1 errors committed by a cognitive slice based only on information from individuals who were sampled. With larger networks, the opportunities to create Type 1 errors increase and thus it is possible to develop a reasonable estimate of the overall accuracy of the sampled actors’ perceptions. This allows k to be determined by setting a tolerable level of Type 1 error and then calculating the threshold value of k that is necessary to meet the pre-defined tolerance level.

Hence, k can adjust based on the measure of the accuracy of each sample drawn. Formally, our estimator of a Type 1 error rate, $\hat{\alpha}_k$, is simply³:

$$\hat{\alpha}_k = \frac{\text{number of Type 1 errors committed by the sample}}{\text{number of possible Type 1 errors in the sample}}, \tag{3}$$

where n is the sample size. Type 1 errors or the number of perception ties canceled by knowledge can be increased or decreased based on the value of k. For instance, in a sample of 8 actors (say person A through person H) from a larger network we would directly observe 56 interactions among those sampled individuals. These interactions are pieces of knowledge or ties that are known based on local interaction and thus known to exist or not exist in the “true” network. We can then observe how each of the sampled actors perceives those ties to be distributed among the other sampled actors. For instance, assume there is no tie from actor A to actor B. Under LAS intersection rules, this would mean that actor A denied sending a tie to B and actor B denied receiving a tie from A. We can look to see how many of the remaining six sampled actors (C through H) perceived a tie to exist. The six other sampled actors would collectively make a Type 1 error if k or more of them perceived the tie to exist. By adjusting k we can adjust the number of Type 1 errors committed by any particular sample of actors. Because we can determine the accuracy of each sample, we can determine how high or low our threshold level of k needs to be set to keep that sample’s Type 1 error below some pre-defined level.

³ If we assume that all known ties between sampled actors (i.e., the local knowledge of the ties in the sample) are all zero, then in a sample of n actors there are n(n−1)(n−2) opportunities to make a Type 1 error. This is because each of the n sampled actors is not involved in (n−1)(n−2) of the ties and therefore is capable of making a Type 1 error for these dyads only.

Given that the actors are randomly sampled, the error rate calculated from the sample of cognitive slices is assumed to be an accurate representation of the overall error rate the sampled actors would make for the entire network. Thus, for a given sample we can set a tolerable Type 1 error rate and let the accuracy or accountability of the sample determine the threshold level k necessary for the sample to not exceed the Type 1 error rate. The algorithm to determine the exact k for a given sample operates as follows:

Step 1. Set α, the tolerable error rate. Typical values are 0.05, 0.10, 0.15.

Step 2. Draw a random sample of size n.

Step 3. Find the smallest k such that $\hat{\alpha}_k < \alpha$, and denote this by k*.

Step 4. Compute the estimated network using the aggregation method with threshold k*.

In what follows we will refer to this methodology as the adaptive threshold method. For a formalized handling of the methodology see Appendix A.
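Steps 1 to 3 can be sketched on the Fig. 2 slices. The code below is one possible reading of Eq. (3), based on the eight-actor illustration above: an opportunity is a tie between two sampled actors that their own reports rule out, and a Type 1 error occurs when k or more of the other sampled actors still perceive that tie. This tie-level counting, and the names `type1_rate` and `adaptive_k`, are assumptions made for illustration; the formal definition in Appendix A is not reproduced here and may differ.

```python
ACTORS = "ABCDE"
# Cognitive slices from Fig. 2, stored as CSS[perceiver][sender][receiver].
CSS = {
    "A": ["00101", "00100", "11000", "00000", "10000"],
    "B": ["01000", "10101", "01001", "00000", "00000"],
    "C": ["00000", "10100", "11000", "00000", "00000"],
    "D": ["00101", "00110", "11000", "01001", "10010"],
    "E": ["00010", "00001", "00000", "10001", "01010"],
}

def says(perceiver, i, j):
    """1 if `perceiver` reports a tie from actor i to actor j."""
    return int(CSS[perceiver][ACTORS.index(i)][ACTORS.index(j)])

def type1_rate(sample, k):
    """Share of known-absent ties among sampled actors that k or more of the
    other sampled actors nevertheless perceive (assumed reading of Eq. (3))."""
    errors = chances = 0
    for i in sample:
        for j in sample:
            if i == j or (says(i, i, j) and says(j, i, j)):
                continue  # skip self-ties and ties known to exist
            chances += 1
            votes = sum(says(m, i, j) for m in sample if m not in (i, j))
            errors += votes >= k
    return errors / chances if chances else 0.0

def adaptive_k(sample, alpha):
    """Step 3: smallest k whose estimated Type 1 error rate is below alpha."""
    for k in range(1, len(sample) + 1):
        if type1_rate(sample, k) < alpha:
            return k
    return len(sample)

sample = "ADE"                            # Step 2, fixed here for repeatability
k_star = adaptive_k(sample, alpha=0.10)   # Steps 1 and 3
print(k_star, type1_rate(sample, 1), type1_rate(sample, k_star))
```

For the sample A, D, E, each of the four known-absent ties is perceived by exactly one other sampled actor, so k = 1 gives an estimated error rate of 1.0 and the procedure settles on k* = 2. Step 4 would then apply the aggregation rules of Section 4.1 with that threshold; in practice Step 2 would draw the sample at random, for instance with `random.sample`.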

5. Performance of the adaptive threshold method

In this section we illustrate the performance of the adaptive threshold method through an extended simulation study. The simulation study consists of repeated sampling from real datasets where the complete CSS is known for each individual. Section 5.1 describes the datasets, Section 5.2 gives a short review of the global network measures we assess, Section 5.3 provides some basic properties of our estimators, and Section 5.4 presents the main findings comparing the adaptive threshold method with the traditional roster and ego network methods.

5.1. Data

To test our method of aggregation using the algorithm to adaptively define k, we analyze five datasets. The five datasets are: (i) High Tech Managers – 21 managers of a machinery firm, (ii) Silicon Systems – 36 semiskilled production and service workers from a small entrepreneurial firm, (iii) Pacific Distributors – 33 key personnel from the headquarters of an electronics components distributor, (iv) Government Office – 36 government employees at the federal level, and (v) Italian University – 25 researchers across three interrelated research centers at a university. Each dataset contains a CSS for all or nearly all actors in the network for both friendship and advice relationships. In this study we focus on friendship ties.

Because the datasets we use contain cognitive slices for all individuals in the network, we are able to run multiple trials. Each trial pulls various random samples and creates an approximation of the network from the sampled individuals’ cognitive data. Every random sample produces a different estimation of the network and we evaluate the variability from sample to sample. More importantly, because the datasets contain cognitive slices for all actors, we are able to generate the “true” network, or criterion graph. By establishing the “true” network, we can dependably assess the performance of our methodology. Obviously, when employing our methodology in practice, as with any sampling methodology, one will never have data on the full population under study and thus never know the true value of the variable of interest. In our case, this means that a cognitive slice will not be available for every actor as the goal of the methodology is to rely on a small random sample of cognitive slices to estimate the overall network. The primary objective of this paper is to demonstrate the behavior of our methods under various sampling conditions when the “true” structure is known. This allows us to understand how such procedures will behave in the field.


5.2. Global network measures

We evaluate our performance through the correlation of the estimated network with the true network, and through our ability to estimate several global network measures, namely, density, clustering coefficient, and average path length. We give a short review of all these concepts, but refer the reader to Wasserman and Faust (1994) for a general treatment of network concepts.

Following Krackhardt (1990), we utilize a correspondence measure to assess the accuracy of the aggregated network with the true network. The correspondence measure, labeled as $S_{14}$ by Gower and Legendre (1986), calculates the accuracy of the aggregated network. Given that the matrices of relationships in the five datasets contain only ones and zeros, there are four possible states of relationship between the aggregated network and the true network:

• a – matching zeros, meaning the ij cell in the true network is zero and the corresponding ij cell in the aggregated network is zero.
• b – omission error, meaning the ij cell in the true network is one but the corresponding ij cell in the aggregated network is zero.
• c – commission error, meaning the ij cell in the true network is zero but the corresponding ij cell in the aggregated network is one.
• d – matching ones, meaning the ij cell in the true network is one and the corresponding ij cell in the aggregated network is one.

Given these definitions, correlation is calculated as follows⁴:

$$S_{14} = \frac{ad - bc}{\sqrt{(a+c)(b+d)(a+b)(c+d)}}.$$
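As an illustration, the correspondence measure can be computed for the small TRUE and EST networks of Fig. 3. This is a sketch assuming that diagonal cells are excluded from the counts; the function name `s14` is an illustrative choice.

```python
from math import sqrt

TRUE = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 1], [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1], [0, 0, 0, 1, 0]]
EST = [[0, 0, 1, 0, 0], [0, 0, 1, 0, 0], [1, 1, 0, 0, 0],
       [0, 0, 0, 0, 1], [0, 0, 0, 1, 0]]

def s14(true_net, est_net):
    """Correspondence S14 between two binary networks, off-diagonal only.
    Returns the coefficient and the cell counts (a, b, c, d)."""
    a = b = c = d = 0
    n = len(true_net)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            t, e = true_net[i][j], est_net[i][j]
            a += (t == 0 and e == 0)   # matching zeros
            b += (t == 1 and e == 0)   # omission error
            c += (t == 0 and e == 1)   # commission error
            d += (t == 1 and e == 1)   # matching ones
    # Undefined (division by zero) if either network is empty or complete.
    denom = sqrt((a + c) * (b + d) * (a + b) * (c + d))
    return (a * d - b * c) / denom, (a, b, c, d)

value, counts = s14(TRUE, EST)
print(counts, round(value, 4))
```

With 13 matching zeros, one omission, one commission, and 5 matching ones, the coefficient works out to 64/84, or about 0.76.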

Network density is a measure of the connectedness of the overall network. Density of a network is the ratio of existing ties to the number of possible ties. For directed graphs the formula for density is:

$$\frac{l}{n(n-1)},$$

where l is the number of existing ties, and n(n−1) is the number of possible directed ties among the n actors.

The clustering coefficient, also referred to as transitivity, is the probability that two neighbors of a randomly chosen node are themselves connected. In other words, looking at a particular node in the network, the clustering coefficient measures the interconnectedness among the alters of that node. The formula for the clustering coefficient of a single node i is defined as:

$$C_i = \frac{A_i}{m_i(m_i-1)},$$

where $A_i$ is the number of ties between node i’s $m_i$ adjacent nodes (Kilduff et al., 2008).⁵ The global clustering coefficient is simply the average of the individual clustering coefficients for all nodes.

The shortest path length between two nodes, known as the geodesic distance, is defined as the smallest number of ties needed to connect two nodes in a network. If two nodes can be connected via existing ties in the network they are said to be reachable. The average path length is simply the mean path length of all pairs of reachable nodes, which is defined as:

$$\frac{2}{n(n-1)} \sum_{i=1}^{n} \sum_{j=1}^{n} L_{\min}(i,j),$$

where $L_{\min}(i,j)$ is the geodesic distance between node i and node j (Kilduff et al., 2008).⁶

⁴ As noted by Krackhardt (1990), $S_{14}$ provides a value that is equal to what one would obtain if the matrices were vectorized and a simple Pearson’s correlation coefficient was calculated.

⁵ Note the formula for transitivity for a graph is often written differently. It is calculated as 3 × the number of triangles/number of connected triples of nodes. The numerator is the number of subgraphs of 3 nodes all of which are connected. The denominator is the number of connected and non-connected subgraphs of 3 nodes.
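The three global measures can be sketched for a small directed network such as the TRUE panel of Fig. 3. Several choices below are assumptions made for illustration: a node’s adjacent nodes are those tied to it in either direction, nodes with fewer than two adjacent nodes get a clustering coefficient of zero, and the average path length follows the verbal definition (the mean geodesic distance over ordered reachable pairs) rather than a particular normalizing constant.

```python
from collections import deque

TRUE = [[0, 0, 0, 0, 0], [0, 0, 1, 0, 1], [1, 1, 0, 0, 0],
        [0, 0, 0, 0, 1], [0, 0, 0, 1, 0]]

def density(net):
    """Existing directed ties divided by n(n-1) possible ties."""
    n = len(net)
    ties = sum(net[i][j] for i in range(n) for j in range(n) if i != j)
    return ties / (n * (n - 1))

def clustering(net):
    """Average of C_i = A_i / (m_i (m_i - 1)) over all nodes."""
    n = len(net)
    coeffs = []
    for i in range(n):
        nbrs = [j for j in range(n) if j != i and (net[i][j] or net[j][i])]
        m = len(nbrs)
        if m < 2:
            coeffs.append(0.0)  # assumption: undefined cases count as zero
            continue
        a_i = sum(net[u][v] for u in nbrs for v in nbrs if u != v)
        coeffs.append(a_i / (m * (m - 1)))
    return sum(coeffs) / n

def average_path_length(net):
    """Mean geodesic distance over ordered pairs of reachable nodes (BFS)."""
    n = len(net)
    total = pairs = 0
    for source in range(n):
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in range(n):
                if net[u][v] and v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("nan")

print(density(TRUE), clustering(TRUE), average_path_length(TRUE))
```

For this network the density is 6/20 = 0.3, no node has two adjacent nodes that are themselves tied, and the ten reachable ordered pairs have a mean geodesic distance of 1.5.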

5.3. Performance of adaptive threshold method and effect of α

We now illustrate the performance and characteristics of the proposed estimation method through graphs summarizing our simulation results. For each graph in this section and in Section 5.4, the horizontal axis represents the sample size taken from the network under study, and the vertical axis represents the value of an estimated network measure. A horizontal line in each graph displays the true network measure. The colored lines in the graphs correspond to the 95% empirical confidence intervals (CI) for every sample size considered. For a given sample size, say n, an empirical 95% confidence interval is obtained by randomly drawing 1000 samples of size n, estimating the network measure of interest using the indicated method for each sample, and reporting the 2.5th and 97.5th percentiles of the estimates.
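The empirical confidence interval procedure can be sketched as follows. Since the five datasets are not reproduced here, the example reuses the small Fig. 2 slices with a static threshold k = 2 and density as the measure of interest; the function names, the fixed seed, and the choice of the "inclusive" percentile method are all assumptions made for illustration.

```python
import random
import statistics

ACTORS = "ABCDE"
# Cognitive slices from Fig. 2, stored as CSS[perceiver][sender][receiver].
CSS = {
    "A": ["00101", "00100", "11000", "00000", "10000"],
    "B": ["01000", "10101", "01001", "00000", "00000"],
    "C": ["00000", "10100", "11000", "00000", "00000"],
    "D": ["00101", "00110", "11000", "01001", "10010"],
    "E": ["00010", "00001", "00000", "10001", "01010"],
}

def says(perceiver, i, j):
    """1 if `perceiver` reports a tie from actor i to actor j."""
    return int(CSS[perceiver][ACTORS.index(i)][ACTORS.index(j)])

def estimated_density(sample, k=2):
    """Density of the network aggregated from `sample` with static threshold k."""
    ties = 0
    for i in ACTORS:
        for j in ACTORS:
            if i == j:
                continue
            if i in sample and j in sample:
                ties += says(i, i, j) and says(j, i, j)   # LAS intersection
            else:
                ties += sum(says(m, i, j) for m in sample) >= k
    return ties / (len(ACTORS) * (len(ACTORS) - 1))

def empirical_ci(n, draws=1000, seed=0):
    """2.5th and 97.5th percentiles of the estimate over random samples of size n."""
    rng = random.Random(seed)
    estimates = [estimated_density(rng.sample(list(ACTORS), n))
                 for _ in range(draws)]
    cuts = statistics.quantiles(estimates, n=40, method="inclusive")
    return cuts[0], cuts[-1]

for size in (3, 4, 5):
    print(size, empirical_ci(size))
```

When the sample covers all five actors every tie is settled by LAS intersection, so the interval collapses onto the true density of 0.3; smaller samples give wider intervals.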

For an initial assessment, we begin by comparing a static threshold approach with our adaptive threshold method. We use the High Tech Managers data and consider all sample sizes between 4 and 21. In the simulation we draw 1000 random samples for each sample size, and estimate the network density for every sample. Fig. 4a displays the results of a simulation study for static k = 2, static k = 5, and static k = 8. Fig. 4b displays the results of the adaptive threshold method using 3 different levels of α.

We can see from Fig. 4a that selection of a static threshold k is crucial, as small k leads to overestimation and large k leads to underestimation of the network measure. The threshold of k = 2 is too low when sampling, say, 12 individuals. If just 2 of those 12 individuals make the same error of commission then an erroneous tie is placed in the network. The k = 2 curve begins to decrease after the halfway point because more and more ties are determined by local knowledge and thus there are fewer chances for errors of commission to occur. We find a very different effect for the k = 5 and the k = 8 lines, as these thresholds tend to be too high. This sensitivity is not present for the adaptive threshold method as seen in Fig. 4b, since that method specifies α as opposed to k, giving the researcher control of the probability of a Type 1 error. Accordingly, with a low α level of 0.05, the threshold values are the highest and therefore the distribution of density estimates is slightly lower when compared with the other α levels. As the α value moves to 0.10 and 0.15 the threshold values drop, resulting in density measures with slightly larger predicted values. Regardless of the α level, the true density is consistently covered within the confidence intervals, and as the sample size increases, the variability decreases and the estimated density converges to the true value. These results provide a strong motivation for utilizing the adaptive threshold method.

The choice of α is important in the adaptive threshold method. For each dataset and for each measure, different α levels will produce different results. As with setting a Type 1 error rate in traditional statistics, there are no hard rules to finding the optimal level. While the results are not shown here, our investigation of various α levels across the datasets and across network measures revealed that an α set between 0.08 and 0.12 produced the most reliable results. The final choice of an α rate is ultimately the researcher's and depends on the importance placed on errors of commission and errors of omission.

6 For two nodes that are unreachable, their values are ignored in the calculation of the shortest path length. Thus, the average path length calculation only considers distances between reachable nodes in the network.

Fig. 4. Static versus adaptive methods.

5.4. Comparing the adaptive threshold method to standard SNA methods

In this section we incorporate all five datasets and analyze how the adaptive threshold method performs at estimating various network level indices. For each of the five networks, simulation studies were carried out in order to produce the 95% empirical confidence intervals for each of the network measures. We report the 95% empirical confidence intervals for the graph correlation with the true network, density, clustering coefficient, and average path length. In order to assess how well the adaptive threshold method performs we compare its performance with two other standard approaches. These approaches are the traditional roster method, which collects data on each actor in the network about his or her direct ties, and the ego network approach, which collects data on the focal actor's direct ties as well as the ties among his/her alters.

For the traditional roster approach, the network at each sample size was generated by combining the self-reports of the sampled actors' ties. In cases where some individuals were not sampled, missing data was imputed through symmetrization. For example, if actor i was sampled but actor j was not, then initially the row for actor j would be empty. To estimate the structure of j's ties we used a simple strategy commonly employed by network researchers when dealing with (what are believed to be) logically symmetric relations: the missing j–i tie is set to the same value as the non-missing i–j tie.⁷ Because we intend to compare our method to the traditional roster method when the roster method achieves a moderate to high response rate, imputing missing data only affects a small proportion of ties in these ranges.
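The symmetrization step for the roster approach can be sketched as follows. This is a minimal illustration, assuming the self-reports are stored as an N×N 0/1 NumPy array whose row i holds actor i's reported ties, and that ties between two non-sampled actors are simply left at 0 because no report covers them. The function name `roster_network` is hypothetical.

```python
import numpy as np


def roster_network(reports, sampled):
    """Combine sampled actors' self-reports and impute missing rows by symmetrization."""
    reports = np.asarray(reports)
    n = reports.shape[0]
    observed = np.zeros(n, dtype=bool)
    observed[list(sampled)] = True

    net = np.zeros_like(reports)
    # Rows of sampled actors are taken directly from their self-reports.
    net[observed, :] = reports[observed, :]
    # For a non-sampled actor j, the missing j-i tie is set to the reported i-j tie.
    missing = ~observed
    net[missing, :] = net[:, missing].T.copy()
    return net
```

With three actors and only actor 0 sampled, the rows for actors 1 and 2 are filled in from column entries of actor 0's row, and the tie between actors 1 and 2 stays at 0.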

For the ego network approach, global measures were estimated by calculating the value of the measure in each sampled actor's ego-network and then averaging those values across all n sampled actors. Note that because the ego network approach does not produce an approximation of the network structure it is not displayed in the graphics comparing graph correlation. Additionally, because the ego network approach asks individuals to identify their alters and the connections among their alters, by definition, all individuals in the ego network must be a distance of no greater than 2 steps away from each other. Therefore, average path length measures were not calculated for ego network samples.

7 In results not shown here, we aggregated the roster approach data without using symmetrization to impute the missing relations. Such an approach leaves missing actors' rows blank and thus does a very poor job of estimating structure. With no other information to determine non-sampled actors' ties, symmetrizing provides the most informed approach for estimating structure. There are other ways to impute missing data, including the use of exponential random graph models, which are beyond the scope of this paper.

The results of our comparison between the adaptive threshold approach with α = 0.1, the traditional roster approach, and the ego network approach are displayed in Figs. 5 and 6. Fig. 5 presents the confidence intervals for correlation and Fig. 6 presents the confidence intervals and mean square error (MSE) plots for global network measures. Mean square error is a standard tool in evaluating the performance of an estimator in terms of bias and variability. For a given n, mean square error for a network measure is simply computed as MSE = variance + bias², where variance is the observed variance of the estimated network measure, and bias is the difference between the true value of the network measure and the average of the estimated network measure, based on 1000 repetitions. As stated before, the horizontal line in the confidence interval graphs indicates the true value. For the MSE graphs, the horizontal dashed yellow line is provided as a point of comparison, and indicates the MSE of the roster method when a 70% response rate is achieved.
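The MSE computation can be written out directly. The sketch below assumes the estimates from the repetitions are held in a one-dimensional array and uses the population variance (`ddof=0`), a choice the text does not specify; with that choice the sum of variance and squared bias equals the mean squared deviation of the estimates from the true value. The function name is hypothetical.

```python
import numpy as np


def mse_decomposition(estimates, true_value):
    """Return (MSE, variance, bias) with MSE = variance + bias**2."""
    estimates = np.asarray(estimates, dtype=float)
    variance = estimates.var()            # observed variance (assumption: ddof=0)
    bias = true_value - estimates.mean()  # true value minus the average estimate
    return variance + bias ** 2, variance, bias
```

For example, estimates of 0.2, 0.3 and 0.4 against a true density of 0.25 give a bias of −0.05, and the returned MSE matches the average of the squared errors.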

In Fig. 5, both methods of estimating network structure have an upwardly sloping trend as sample size increases, indicating a closer approximation to the true network. However, the traditional roster approach does not achieve 100% correlation in our trials, as it does not converge to the true network. The main reason for this lack of convergence is that under the traditional roster method, actor i is only asked about those individuals with whom he/she is friends. Therefore, if actor i claims to be tied to actor j then the i–j tie is said to exist in the network. This presents a significant opportunity for error as the existence of this tie does not take into account actor j's belief of the i–j tie. Hence, the existence of a tie is never verified; its existence solely relies on actor i's individual claim.⁸ While it is theoretically possible for the roster method to perfectly reproduce the true network, in practice, due to the roster method's reliance on row data, the resulting network tends to be biased due to inaccurate self reports.

8 The fact that i and j may differ in their perception of the i–j friendship tie may appear unintuitive. We refer the reader to Carley and Krackhardt (1996), which details the processes by which non-symmetry and non-reciprocation occur with interaction based behaviors. Because of this, we would advise researchers interested in employing the traditional roster approach to always ask about directed relationships in both directions. While this often takes place with instrumental ties, such as advice seeking/advice providing, such an approach is also necessary for affective ties that appear logically symmetric.

Fig. 5. Graph correlations.

In Fig. 6, one should view n as representing sample size for adaptive threshold and ego-net methods, and response rate for traditional roster method. Thus, one can compare the performance of the roster method with a 70% response rate to the adaptive threshold method and ego-net method using a 30% sample size. Fig. 6 clearly indicates that the adaptive threshold method provides better coverage of the true network measure than either of the two standard SNA approaches. Even with very small sample sizes, we find that the adaptive threshold method produces a 95% confidence interval which consistently captures the true global measure with a reasonable interval width. In fact, across all datasets and all global measures, except for the path length estimate in the Government Office, the true value is within the confidence bands at all sample sizes. We can also compare the different methods' performance through MSE plots. By comparing the MSE plots, one can determine the point at which the adaptive threshold method outperforms the other methods. For example, the MSE plot for High Tech Managers network density indicates that, no matter how small the sample size used with the adaptive threshold method, it performs better than the traditional method with 70% response rate, in terms of MSE. When we look at the MSE graphs for the five datasets and three network measures, we see that overall the adaptive threshold method outperforms the two traditional methods in terms of MSE, especially for density and average path length.

Fig. 6. (Continued)

The graphs reveal that the ego network and roster approaches tend to overestimate the density and clustering coefficients. One reason for the overestimation of density is that respondents tend to over-report the number of friends they have (Kumbasar et al., 1994). While this tendency is still present with the adaptive threshold method when determining ties based on knowledge, its influence is diminished by checking the validity of i's claim of the i–j tie with j's claim of the i–j tie. In instances where neither i nor j is sampled, respondents' inclination to claim more ties than are actually present is eliminated due to the reliance on the perception of others.

Increases in density tend to lead to increases in clustering simply due to the larger number of ties. Heider (1958) provides an additional, psychological reason why the clustering coefficient may be inflated for the ego network method. Based on Heider's (1958) balance theory, individuals tend to view relationships as being transitive. If actor J is friends with actor A and actor B, but unaware of the relationship between A and B, actor J will tend to assume a tie exists between A and B to form a balanced triad. Freeman (1992) discovered that a large number of the errors in a respondent's recall of a previously observed network could be attributed to his/her inclination to correct intransitivity. With regard to calculating path length, the overestimation of density results in an underestimation of the average distance between actors.

Overall, the comparisons strongly indicate that the adaptive threshold method outperforms the traditional SNA approaches to data collection. In addition to the improved accuracy in estimating the network structure, the adaptive threshold method also provides the researcher with the benefits of random sampling and a reduced need for high participation rates. These benefits can greatly enhance comparative research grounded in the network perspective. There are two cases in which the adaptive threshold method does not perform as well. Both the clustering coefficient and the average path length in the Government Office dataset are poorly estimated at small sample sizes relative to the other methods. A primary reason for the poor performance is the large confidence intervals arising from the variation in the actors' perception of the network. The incorporation of a significant number of additional datasets would be necessary to better understand the contextual conditions that may give rise to large variation in actor perception. However, it is also possible that the results are simply idiosyncratic to the Government Office. Regardless of the factors influencing perception, a researcher seeking to measure a network would, on average, be best served by using the adaptive threshold method.

6. Thoughts on implementation and limitations

Research utilizing a network perspective (Brass, 1995; Cross et al., 2003) is seen as an improvement upon traditional econometric or statistical models as it investigates social influence, and not solely individual attributes, as explanation of social phenomena (Burt, 1992, p. 4). Previous research has demonstrated that informal social structures facilitate communication, collaboration, knowledge transfer, and innovation within an organization (Kilduff and Tsai, 2003; Kilduff and Krackhardt, 2008). Experimental work dating back to the 1950s has demonstrated the importance of communication patterns for group performance (Bavelas, 1950; Guetzkow and Simon, 1955; Leavitt, 1951; Shaw, 1964). The effect of network structure on collective outcomes has been demonstrated for bank profitability (Krackhardt and Hanson, 1993), work groups of a telecommunications firm (Cummings and Cross, 2003), mental health networks (Provan and Milward, 1995; Provan and Sebastian, 1998), artistic groups (Uzzi and Spiro, 2005), and electronic product development projects (Hansen, 1999). Such cross-network research could be enhanced through the adaptive threshold method by facilitating the researcher's ability to gather data on a larger number of networks. Furthermore, there are several current research areas that could benefit by comparing structure and performance across multiple networks.

One example is the role of social networks in teacher and school performance. Daly et al. (2010, p. 363) state that “teachers working in collaboration tend to have a wider skill variety, be more informed about their colleagues' work and student performance, report increased instructional efficacy, and are more likely to express higher levels of satisfaction”. Schools with higher levels of social capital have been shown to have higher performance (Leana and Pil, 2006). However, the relationship between social capital and school performance has not been rigorously tested using network methods. A comparative study could test the relationship between the actual structure of a social network in a school that gives rise to social capital and the collective performance of the teachers in the school. Gathering data in the 50–100 schools necessary for statistical analysis may prove difficult for the researcher using a standard roster approach. With the adaptive threshold method, a small random sample of teachers in each school could provide all of the necessary structural information. In addition, because adaptive threshold uses a random sample, additional information concerning the school climate, organizational commitment, and other important school level factors can be estimated from the participating teachers.

When researchers look to employ the adaptive threshold method in a study such as the one described above, two key decisions must be made. One is determining the α rate and the other is determining the sample size needed from each organization or network under study. As noted above, α rates in the 0.08–0.12 range tended to perform the best for our measures and for our networks. Based on the results in Section 5, it appears that sampling percentages as low as 25–40% can produce accurate results and provide the researcher with a large enough sample to deal with several non-respondents. Clearly, in any given instance, the necessary sample size is dependent upon the accuracy of the actors in the network. If individuals in a particular network have higher quality perceptions about the relations around them, then even smaller sample sizes will produce an accurate picture of the network. Cases like the Government Office, where the clustering coefficient and average path length were poorly estimated at low sample sizes, provide a perfect example indicating that our methods are sensitive to the quality of perception slices. When the correlations between the slices are low, the variances of our estimators may be large for small sample sizes as the few ‘knowledge’ ties are not able to correct the false ‘perception’ ties yet. This is illustrated by the large MSE values for small sample sizes in the Government Office dataset. The researcher should be aware of this sensitivity issue when selecting a sample size and a threshold value α.

This paper represents only a first attempt at probing the applicability of these methods. Future work on this topic can attempt to improve upon the current performance of our estimator. A potential means of improvement would be to measure the accuracy for each individual in the sample rather than the accuracy of the overall sample. Given that individuals have varying propensities to commit Type 1 and Type 2 errors, such information could be used to weight the perception of each individual. One could identify the individuals who tend to make errors of commission and those who tend to make errors of omission and use this information to better determine the existence of ties based on perception data. These methods may also be adapted to aid researchers with needs that extend beyond a random sample or who wish to utilize other sampling techniques to improve estimate accuracy. For example, research has shown that people with higher level positions in companies tend to have poorer cognitive accuracy (Casciaro, 1998). Given this, a researcher may choose to oversample individuals in the lower ranks. An alternative approach would be to apply an adaptive sampling framework where once an initial actor's CSS is given, the researcher can determine the next individual in the network to sample by selecting an individual who is not friends with the initial respondent. Such a method would provide greater coverage in all areas of a network that may not be achieved through random sampling alone. The current research could also be broadened by looking beyond network level measures to track the accuracy of network representation on an individual level (i.e. how often is the most central person in the network correctly identified; see for instance the work by Borgatti et al., 2006).

Another promising avenue of potential use for our methodology is in the area of hard to reach networks. Given the inherent difficulty of accessing a significant number of actors in hidden or hard to reach networks, network data collection is often impossible. However, it may be feasible to locate a few individuals in the hard to reach network, work with them to bound the network, and then use our techniques to generate estimates of the actual structure of the network.

There are limitations to our approach as well. The size of the network under study controls the applicability of our sampling methodology as a data collection tool. As noted by Krackhardt (1987), in cases where the network is reasonably large, having a respondent provide his or her perception of every tie in the network would be a difficult task. However, for networks of small to moderate sizes, cognitive structures can and have been used effectively as a data collection method. Still, there is opportunity to expand the adaptive threshold method to larger networks. Burt and Ronchi (1994) attempted to measure the structure of a large intra-organizational network using a capture-recapture method by interviewing only a subset of the population under study. Relationships between individuals not directly interviewed were determined based on the informant's perception of the strength of connection. Burt and Ronchi (1994) found that strong relations tended to be “recaptured”, meaning they were perceived by multiple informants in the network as occurring. Therefore, it might be possible to map relationships among individuals in a large network using a partial cognitive social structure approach where not every individual is asked about every tie. In a network of 500 people, a 20% sample size provides 100 pieces of evidence for every tie. This may be more evidence than is necessary to accurately determine a relationship and clearly the demands on the sampled informants would be much too great to actually attempt to gather cognitive social structure data on a network that size.

One potential solution to the demands placed on a respondent to a cognitive social structure questionnaire in a larger network is the link sampling design discussed by Butts (2003). With a link sampling design, individuals are not required to provide information on all possible relationships in their network but rather on only a subset of them. Finding the proper balance between the size of the sample and the size of the subset of links an individual in the sample would be required to provide information on is an interesting next step for this approach. Locating the balance necessary to reduce respondent burden and maintain accurate network representations would greatly expand the potential range of application for our method.

7. Conclusion

The adaptive threshold method proposed in this study is well suited for network research where complete data collection is costly or impossible, particularly for cross-network studies. Large scale cross-network studies are rare due to the time and expense required to collect network data on a large number of networks for statistical analysis as well as concerns over the validity of network structure when missing data is present. Our sampling methods can drastically reduce researcher time and effort needed to uncover network structures, and, as demonstrated in the paper, are capable of producing accurate representations of the true network. More importantly, our methods proved to be more reliable than either of the two alternative measures to collect network data.

Acknowledgements

The authors wish to thank David Krackhardt and Tiziana Casciaro for graciously providing the datasets used in the analysis in Section 5. The authors would also like to thank Clayton Wukich and Annemie Maertens for their helpful comments on earlier drafts of this paper.

Appendix A. Formulation of the adaptive threshold methodology

In order to formalize the methodology described in the text, we introduce the following notation and summarize the methodology in a three-step procedure. Consider an unknown N×N network. Assume we randomly selected n N×N CSS slices from this network, say $X_1, X_2, \ldots, X_n$. For a given vector t, let $\mathbf{1}(t)$ denote an N×N matrix whose row and column entries corresponding to t are 1s and the rest are 0s. Similarly, let $\mathbf{1}(-t)$ denote an N×N matrix whose row and column entries corresponding to t are 0s and the rest are 1s. Let s denote a vector containing the index numbers of the sampled slices. For example, in the 5×5×5 example in Section 4, where A, D, and E are sampled, s = {1, 4, 5}. We denote the sampled and unsampled portions of the network by S and U respectively, where

$$S = \mathbf{1}(s), \qquad U = \mathbf{1}(-s).$$

Let $I_t(A)$ denote a function which assigns 1s to all the entries of a matrix A that are greater than a given constant t, and assigns 0s to all the remaining entries. Let “*” denote the element by element multiplication of two matrices. For a given threshold k, we would like to find an estimate of the unknown network based on the CSS slices $X_1, X_2, \ldots, X_n$.

Step 1: Find the exact entries of the estimated network

For all i = 1, ..., n, decompose $X_i$ such that

$$X_i = X_{K,i} + X_{p,i},$$

where $X_{K,i}$ denotes the knowledge portion of $X_i$, and $X_{p,i}$ denotes the perception portion of $X_i$. We have

$$X_{K,i} = \mathbf{1}(i) * X_i, \qquad X_{p,i} = \mathbf{1}(-i) * X_i.$$

The combined knowledge and combined perception in the observed sample, denoted by K and P respectively, are given as

$$K = \sum_{i=1}^{n} X_{K,i}, \qquad P = \sum_{i=1}^{n} X_{p,i}.$$

Then the exact entries of the estimated network, which are computed from the knowledge and will not be changed by other perceptions, are contained in the matrix E given by

$$E = I_1(S * K).$$

Step 2: Find the perception entries of the estimated network

As we discussed in Section 4, unverified (or undenied) knowledge ties will be treated as a perception. Then the perception contribution of knowledge from Step 1 is contained in the matrix C, where

$$C = U * K.$$

We will decompose perception P into two parts, the active part (denoted by $P_A$) and the inactive part (denoted by $P_I$). We will use $P_A$ to find the perception entries of the estimated network. $P_I$ will be used to find $\hat{\alpha}$. We have

$$P_A = U * P, \qquad P_I = I_1(S * P).$$

Combining C and $P_A$, we have the final perception matrix $P_F$, given by

$$P_F = I_k(P_A + C),$$

which contains the perception entries of the estimated network.

Step 3: Combine Steps 1 and 2 to find the estimated network

The estimated network is given by

$$E + P_F.$$

Recall that in the adaptive threshold method we estimate the Type 1 error probability for a given k, and denote this quantity by $\hat{\alpha}_k$. Using the above notation we have

$$\hat{\alpha}_k = \frac{\mathrm{sum}(P_I)}{\mathrm{sum}(S) - \mathrm{sum}(E)},$$

where sum(A) denotes the sum of all entries of a given matrix A. Note that this is equivalent to Eq. (3) given in Section 4.
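Steps 1 to 3 can be sketched in NumPy for a fixed threshold k. The sketch is only one possible reading of the notation and relies on assumptions that the appendix does not confirm: slices are binary arrays with a zero diagonal; the mask for a single actor i marks row i and column i; S marks the ties whose two endpoints are both sampled and U is its complement (chosen here so that C = U * K can be non-empty); and the indicator uses the strict "greater than" stated above. The function names are hypothetical, and the sketch takes k as given rather than computing $\hat{\alpha}_k$.

```python
import numpy as np


def knowledge_mask(i, n_nodes):
    """Mask for actor i: ones in row i and column i (assumed reading of 1(i))."""
    mask = np.zeros((n_nodes, n_nodes), dtype=int)
    mask[i, :] = 1
    mask[:, i] = 1
    return mask


def estimate_network(slices, k):
    """slices: dict mapping sampled actor index -> N x N 0/1 array; k: threshold."""
    sampled = sorted(slices)
    n_nodes = next(iter(slices.values())).shape[0]

    # S: ties with both endpoints sampled (assumption); U: everything else.
    S = np.zeros((n_nodes, n_nodes), dtype=int)
    S[np.ix_(sampled, sampled)] = 1
    U = 1 - S

    # Step 1: split each slice into knowledge and perception, then sum.
    K = np.zeros((n_nodes, n_nodes), dtype=int)
    P = np.zeros((n_nodes, n_nodes), dtype=int)
    for i, X in slices.items():
        mask = knowledge_mask(i, n_nodes)
        K += mask * np.asarray(X)
        P += (1 - mask) * np.asarray(X)
    E = ((S * K) > 1).astype(int)        # E = I_1(S * K)

    # Step 2: unverified knowledge counts as perception, then apply threshold k.
    C = U * K
    P_A = U * P
    P_F = ((P_A + C) > k).astype(int)    # P_F = I_k(P_A + C)

    # Step 3: combine exact and perception entries.
    return E + P_F
```

Under this reading E and P_F never overlap, since E lies inside S and P_F inside U, so the result is again a 0/1 matrix. If three of four actors are sampled and all perceive the network without error, a threshold of k = 1 recovers the true network in this sketch.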

References

Anseel, F., Lievens, F., Schollaert, E., Choragwicka, B., 2010. Response rates in organizational science, 1995–2008: a meta-analytic review and guidelines for survey researchers. Journal of Business and Psychology 25, 335–349.

Barton, A., 1968. Bringing society back in: survey research and macro-methodology. American Behavioral Scientist 12, 1–9.

Baruch, Y., Holtom, B.C., 2008. Survey response rate levels and trends in organizational research. Human Relations 61, 1139–1160.

Batchelder, W., Kumbasar, E., Boyd, J.P., 1997. Consensus analysis of three-way social network data. The Journal of Mathematical Sociology 22, 29–58.

Bavelas, A., 1950. Communication patterns in task-oriented groups. The Journal of Acoustical Society of America 22, 271–282.

Borgatti, S.P., Carley, K., Krackhardt, D., 2006. On the robustness of centrality measures under conditions of imperfect data. Social Networks 28, 124–136.

Brass, D.J., 1995. A Social Network Perspective on Human Resources Management, Research in Personnel and Human Resources Management. JAI Press, Greenwich, CT, pp. 39–79.

Burt, R.S., 1992. Structural Holes: The Social Structure of Competition. Harvard University Press, Cambridge, MA.

Burt, R.S., Ronchi, D., 1994. Measuring a large network quickly. Social Networks 16, 91–135.

Butts, C.T., 2003. Network inference, error, and informant (in)accuracy: a Bayesian approach. Social Networks 25, 103–140.

Carley, K.M., Krackhardt, D., 1996. Cognitive inconsistencies and non-symmetric friendship. Social Networks 18, 1–27.

Casciaro, T., 1998. Seeing things clearly: social structure, personality, and accuracy in social network perception. Social Networks 20, 331–351.

Casciaro, T., Carley, K., Krackhardt, D., 1999. Positive affectivity and accuracy in social network perception. Motivation and Emotion 23, 285–306.

Costenbader, E., Valente, T.W., 2003. The stability of centrality measures when networks are sampled. Social Networks 25, 283–307.

Cross, R.L., Parker, A., Sasson, L., 2003. Networks in the Knowledge Economy. Oxford University Press, Oxford.

Cummings, J.N., Cross, R.L., 2003. Structural properties of work groups and their consequences for performance. Social Networks 25, 197–210.

Daly, A.J., Moolenaar, N.M., Bolivar, J.M., Burke, P., 2010. Relationships in reform: the role of teachers' social networks. Journal of Education Administration 48, 359–391.

Frank, O., 2005. Network sampling and model fitting. In: Carrington, P.J., Scott, J., Wasserman, S. (Eds.), Models and Methods in Social Network Analysis. Cambridge University Press, New York, pp. 31–56.

Freeman, L.C., 1992. Filling in the blanks: a theory of cognitive categories and the structure of social affiliation. Social Psychology Quarterly 55, 118–127.

Freeman, L.C., 2004. The Development of Social Network Analysis: A Study in the Sociology of Science. Empirical Press, Vancouver, BC.

Gower, J.C., Legendre, P., 1986. Metric and Euclidean properties of dissimilarity coefficients. Journal of Classification 3, 5–48.

Guetzkow, H., Simon, H.A., 1955. The impact of certain communication nets upon organization and performance in task-oriented groups. Management Science 1, 233–250.

Hansen, M.T., 1999. The search-transfer problem: the role of weak ties in sharing knowledge across organization subunits. Administrative Science Quarterly 44, 82–111.

Heckathorn, D., 1997. Respondent-driven sampling: a new approach to the study of hidden populations. Social Problems 44, 174–199.

Heider, F., 1958. The Psychology of Interpersonal Relations. Wiley, New York.

Kilduff, M., Crossland, C., Tsai, W., Krackhardt, D., 2008. Organizational network perceptions versus reality: a small world after all? Organizational Behavior and Human Decision Processes 107, 15–28.

Kilduff, M., Krackhardt, D., 2008. Interpersonal Networks in Organizations: Cognition, Personality, Dynamics, and Culture. Cambridge University Press.

Kilduff, M., Tsai, W., 2003. Social Networks and Organizations. Sage, London.

Krackhardt, D., 1987. Cognitive social structures. Social Networks 9, 109–134.

Krackhardt, D., 1990. Assessing the political landscape: structure, cognition, and power in organizations. Administrative Science Quarterly 35, 342–369.

Krackhardt, D., 1996. Comment on Burt and Knez's third party effects on trust. Rationality and Society 8 (1), 111–120.

Krackhardt, D., Hanson, J.R., 1993. Informal networks: the company behind the chart. Harvard Business Review 71, 104–111.

Kumbasar, E., Romney, A.K., Batchelder, W.H., 1994. Systematic biases in social perception. American Journal of Sociology 100, 477–505.

Leana, C.R., Pil, F.K., 2006. Social capital and organizational performance: evidence from urban public schools. Organization Science 17, 353–366.

Leavitt, H.J., 1951. Some effects of certain communication patterns on group performance. Journal of Abnormal and Social Psychology 46, 38–50.

Pattison, P., 1994. Social cognition in context. In: Wasserman, S., Galaskiewicz, J. (Eds.), Advances in Social Network Analysis. Sage Publications, Thousand Oaks, pp. 79–109.

Provan, K.G., Milward, B.H., 1995. A preliminary theory of interorganizational network effectiveness: a comparative study of four community mental health systems. Administrative Science Quarterly 40, 1–33.

Provan, K.G., Sebastian, J.G., 1998. Networks within networks: service link overlap, organizational cliques, and network effectiveness. Academy of Management Journal 41, 453–563.

Romney, A.K., Weller, S.C., Batchelder, W.H., 1986. Culture as consensus: a theory of culture and informant accuracy. American Anthropologist 88, 313–338.

Shaw, M.E., 1964. Communication networks. In: Berkowitz, L. (Ed.), Advances in Experimental Social Psychology. Academic Press, New York, pp. 111–147.

Sparrowe, R.T., Liden, R.C., Wayne, S.J., Kraimer, M.L., 2001. Social networks and the performance of individuals and groups. Academy of Management Journal 44, 316–325.

Stork, D., Richards, W.D., 1992. Nonrespondents in communication network studies: problems and possibilities. Group and Organization Management 17, 193–209.

Uzzi, B., Spiro, J., 2005. Collaboration and creativity: the small world problem. American Journal of Sociology 111, 447–504.

Wasserman, S., Faust, K., 1994. Social Network Analysis: Methods and Applications. Cambridge University Press, Cambridge.

