AN IMPLEMENTATION FOR PERFORMING A COMPUTER BASED MUTATION ANALYSIS

Academic year: 2021


Brwa ABUBAKER1, Halgurd MOHAMMED2, Rıdvan SARAÇOĞLU1,*

1 Yüzüncü Yıl University, Engineering and Architecture Faculty, Electric-Electronics Engineering Department, Van, TURKEY
2 Yüzüncü Yıl University, Faculty of Medicine, Medical Biology Department, Van, TURKEY
brwa.pshdary@gmail.com, mhalgurd@ymail.com, *ridvansaracoglu@yyu.edu.tr

Abstract

The history of mutation analysis can be traced back to 1971 and the work of Richard Lipton. It is vital to identify the variations that occur in DNA due to mutation. The aim of this work is to develop new software that helps to predict the mutated sequence positions found between any two sequences, whether they are DNA, protein, or both. Moreover, this approach is effective and accurate for analyzing sequences. The software is developed to help the user provide the necessary input and obtain the desired output. The output file shows the positions where mutations occur: for the protein, a mutation occurs at position 1 for K and W, and for C a mutation occurs at position 40. Thus, the system improves the quality of testing and provides greater efficiency by means of various mutation operators. Computerized mutation analysis is performed without manual intervention.

Keywords: Mutation analysis, Computerized mutation analysis, DNA or protein

BİLGİSAYAR TABANLI MUTASYON ANALİZİ İÇİN BİR UYGULAMA (An Application for Computer-Based Mutation Analysis)

Özet (Turkish abstract, translated)

The history of mutation analysis goes back to the work carried out by Richard Lipton in 1971. Identifying the variations that arise within DNA due to mutation is of critical importance. The essence of this study is to develop new software that helps to predict the mutated positions found between any two sequences, which may be DNA, protein, or both. Moreover, it is an approach that produces efficient and accurate results for sequence analysis. The software was developed so that the required inputs can be provided easily and the desired outputs obtained. The output file shows the location of the K, W and C mutations occurring at position 1 within 40 positions. Thus, this system achieves a quality testing process and improves efficiency by means of various mutation operators. Computer-based mutation analysis is carried out without manual intervention.

Anahtar Kelimeler (Keywords): Mutation analysis, Computerized mutation analysis, DNA or protein

1. Introduction

The history of mutation testing can be traced back to 1971 and Richard Lipton [1]. The birth of the field can also be identified in other papers published in the late 1970s by DeMillo et al. [2] and Hamlet [3]. It is vital to identify the variations that occur in DNA due to mutation, and the genetic code plays a crucial role in doing so. DNA is a major controller of the ON/OFF mechanism of genes. Some parts of DNA have no functional properties, while others are translated into protein. When there is an error in the DNA sequence, such as a base that is deleted, added, or incorrectly incorporated, it is called a mutation.

The nucleic acid molecules in a living organism act as a genetic template that transfers genetic information from one generation to the next. Nucleic acid molecules are organized as genes, which code for a particular phenotype via specific proteins, and gene expression is regulated by both external and internal factors that aid the developmental process of an organism. This relation between genes and proteins forms the “central dogma of life”.

A protein is built from amino acids, and every protein has its amino acids arranged in a unique, specific sequence. The information needed to synthesize proteins with a unique amino acid sequence is provided by the nucleic acid present within the nucleus. In a preset sequence, the DNA in the nucleus gives rise to a specific RNA sequence, which in turn guides the cellular machinery to synthesize protein.

The genetic code is the set of conventions that translates the information encoded in genetic material into proteins in living cells. DNA is written with four letters: A, T, G, and C. The protein-coding units of DNA are called codons. Each codon is a group of three adjacent nucleotides that specifies a signal for protein synthesis. A stop codon signals the completion of the newly synthesized protein.

Mutation testing has been applied to many programming languages as a white-box unit testing method, for example to FORTRAN programs [4-6], Ada programs [7], [8], C programs [9-11], Java programs [12-14], C# programs [15-19], SQL code [20, 21] and aspect-oriented programs [22, 23]. C# is a simple, object-oriented programming language developed by Microsoft and approved as a standard by the European Computer Manufacturers Association (ECMA) and the International Organization for Standardization (ISO). It is based on the C and C++ programming languages [16]. It was developed by Anders Hejlsberg and his team as part of the .NET Framework. C# is intended for the Common Language Infrastructure (CLI), which consists of executable code and a runtime environment that permit various high-level languages to be used on different computer platforms and architectures.

C# is a widely used professional language because it is modern and well structured, object- and component-oriented, produces efficient programs, and can be compiled on a variety of platforms.

.NET Framework applications are multi-platform applications. The framework is designed so that languages such as C#, C++, Visual Basic, JScript, and COBOL can access the framework and interoperate with each other [18]. The .NET Framework contains a large library of code used by client languages such as C#. Some components of the .NET Framework are the Common Language Runtime, ASP.NET, and ASP.NET AJAX.

C# source code files can be written with a basic text editor, such as Notepad, and compiled into assemblies using the command-line compiler, which is also part of the .NET Framework. Mono is an open-source implementation of the .NET Framework that includes a C# compiler and runs on several operating systems, including various flavors of Linux and Mac OS.

(4)

74 Thepurposeofthisworkistodevelopanewsoftwarethathelpstopredictthemutatedsequenc epositionfoundbetweentheanytwosequencesofDNAandthosesequenceswillprocessedfortrans lationtoProteinsequences.Itispossibletotrackmutationinproteinsequencesaswell.Moreoverit mosteffectiveandaccuratetoanalysessequences.ThesoftwareisdevelopedbasedonC#Programl anguagethathelpstoprovidenecessaryinputandgetdesiredoutput. 2. MaterialsandMethods 2.1.DNAMatching DNAsequenceisfabricatedwithfourbases(A,C,T,andG),anwell-organizedfixed-lengthe ncodingsystem[24]canbeused.Inmolecularbiology,DNAsequencescarryvitalinformationfore achspeciesandacomparisonbetweenDNAsequencesisaninterestingandmorecomplicated.Ther earenumerouscomparisontoolstoprovideapproximatematching.OurDNAmatchingalgorithma refastmatchingalgorithmtomatchlengthysequencesinfastestapproach. 2.2.ImplementationofMutationAnalysisProgram FASTAformat:AsequencebookinaFASTAformatincluding(firstline)asingle-linedescri ption(sequencename),followedbyline(s)or(secondline)ofsequencedata.Thefirstcharacterofth edenotelineisagreater-than(">")symbol.likethat >HSBGPGHumangeneforboneglaprotein(BGP) GGCAGATTCCCCCTAGACCCGCCCGCACCATGGTCAGGCATGCCCCTCCTCATC GCTGGGCACAGCCCAGAGGGT FASTAcanbeutilizedtodeducefunctionalandevolutionarylinkagesamidstsequencesalso helpidentifymembersofgenefamilies[25]. “Protein”  ProteintoproteinFASTA.  ProteintoproteinSmith–Waterman(ssearch).  Globalproteintoprotein(Needleman–Wunsch)(ggsearch)

(5)

75  Global/localproteintoprotein(glsearch)  Proteintoproteinwithunorderedpeptides(fasts)  Proteintoproteinwithmixedpeptidesequences(fastf) “Nucleotide“  Nucleotidetonucleotide(DNA/RNAfasta)  Orderednucleotidesvsnucleotide(fastm)  Unorderednucleotidesvsnucleotide(fasts) InFASTAalgorithmNucleotideorproteinsequenceistakenasinput. Thehurryandsensitivityiscontrolledbytheparametercalledktup,whichspecifiesthegauge oftheword.Thisprogramusesthewordhitstoidentifypotentialmatchesbetweenthequerysequenc eanddatabasesequence (Fig. 2.1).Initiallyitreviewforsegment'scontainingseveralthereabouthits. Fig. 2.1.FASTAalgorithm(FASTAAlignments) FASTAalgorithmhasDotmatrixcomparisonsWordsmatchesin2sequencesI&Jcanberep resentedasadotmatrix(as shown Fig.2.2),thus
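As a rough illustration of the word-hit idea, the following sketch collects the positions at which words of length ktup match between two sequences, renders them as a small text dot matrix, and counts the hits on each diagonal. It is written in Python rather than the C# used for the software, it is not the FASTA program's actual implementation, and the function names `word_hits`, `dot_matrix` and `diagonal_counts` are hypothetical names chosen for this example.

```python
from collections import defaultdict


def word_hits(query, target, ktup=2):
    """Return (i, j) pairs where the ktup-long word at query[i] equals the one at target[j]."""
    # Index every word of length ktup in the target by its start position.
    index = defaultdict(list)
    for j in range(len(target) - ktup + 1):
        index[target[j:j + ktup]].append(j)
    # Look up each query word in the index to collect the hits.
    hits = []
    for i in range(len(query) - ktup + 1):
        for j in index.get(query[i:i + ktup], []):
            hits.append((i, j))
    return hits


def dot_matrix(query, target, ktup=2):
    """Render the hits as text rows: one row per query word, '*' marks a hit."""
    hits = set(word_hits(query, target, ktup))
    width = len(target) - ktup + 1
    return [
        "".join("*" if (i, j) in hits else "." for j in range(width))
        for i in range(len(query) - ktup + 1)
    ]


def diagonal_counts(hits):
    """Count hits per diagonal (j - i); a busy diagonal suggests a matching segment."""
    counts = defaultdict(int)
    for i, j in hits:
        counts[j - i] += 1
    return dict(counts)


if __name__ == "__main__":
    for row in dot_matrix("GATTACA", "GATTTACA", ktup=2):
        print(row)
    print(diagonal_counts(word_hits("GATTACA", "GATTTACA", ktup=2)))
```

For the sequences GATTACA and GATTTACA with ktup = 2, the hits fall on two diagonals (offsets 0 and 1), which reflects the single inserted T. A larger ktup gives fewer, more specific hits, which is the speed and sensitivity trade-off mentioned above.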

(6)

76 Fig. 2.2Dotmatrixcomparisons Theflowchartofprogram’salgorithmisshowninFigure2.3inthattheinputersequencesofD NAareintheformofFASTAformat.OncetheDNAisinFASTAformatthenthecomparisonbetwee nthetwosequenceshastobedonebasedoncolordifferences.Followedbytranscriptionandtranslati ontoRNAandProtein.Thencomparisonbetweenthesetwomutatedproteinsequenceshastobeana lysed.Theresulthastobeshownindatagridview.
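The steps of this flow (compare the DNA, transcribe, translate, compare the proteins) can be sketched as follows. This is a minimal Python sketch, not the authors' C# code, and it rests on several assumptions: the input is the coding strand in uppercase A, C, G, T; the standard genetic code applies; translation starts at the first base; and only same-position substitutions are reported, with no alignment or handling of insertions and deletions. The names `transcribe`, `translate` and `find_mutations` are hypothetical.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order; '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def transcribe(dna):
    """Coding-strand convention (an assumption): the mRNA is the DNA with T replaced by U."""
    return dna.replace("T", "U")


def translate(rna):
    """Translate codon by codon from the first base, stopping at the first stop codon."""
    protein = []
    for k in range(0, len(rna) - 2, 3):  # a trailing partial codon is ignored
        amino_acid = CODON_TABLE[rna[k:k + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def find_mutations(seq_a, seq_b):
    """Return (1-based position, symbol in seq_a, symbol in seq_b) for each mismatch."""
    return [
        (pos, a, b)
        for pos, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    dna_1 = "ATGAAATGGTGCTAA"
    dna_2 = "ATGAAATGGTGGTAA"
    protein_1 = translate(transcribe(dna_1))
    protein_2 = translate(transcribe(dna_2))
    print(find_mutations(dna_1, dna_2))          # [(12, 'C', 'G')]
    print(protein_1, protein_2)                  # MKWC MKWW
    print(find_mutations(protein_1, protein_2))  # [(4, 'C', 'W')]
```

In this made-up example a single base substitution at DNA position 12 changes the fourth amino acid from C to W. Reporting both lists side by side mirrors the position table that the program shows for DNA and protein.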

Fig. 2.3. Overview of the program

The UML diagram of our software is shown in Fig. 2.4.

2.3. Retrieve sequences from database

The sequence to be analyzed has to be retrieved from the specific protein database. An important point is that the sequences must be in FASTA format. Those FASTA sequences are imported into our software using suitable code.

3. Experimental Results

Fig. 3.1 shows the complete view of our software. The sequences to be analyzed are retrieved and pasted into the input box, and RUN is selected. The comparison then starts; once the process is complete, the result is shown on the right side of the dialog box (as shown in Fig. 3.1).

Fig. 3.1. The whole software
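Reading pasted FASTA text, as required by the import step of Section 2.3, could look like the sketch below. It is a Python illustration under stated assumptions, not the authors' C# import code: the function name `parse_fasta` is hypothetical, the sequence is upper-cased, blank lines are skipped, and any text before the first ">" line is ignored.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (description, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line.upper())
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


if __name__ == "__main__":
    pasted = (
        ">HSBGPG Human gene for bone gla protein (BGP)\n"
        "GGCAGATTCCCCCTAGACCCGCCCGCACCATGGTCAGGCATGCCCCTCCTCATC\n"
        "GCTGGGCACAGCCCAGAGGGT\n"
    )
    for description, sequence in parse_fasta(pasted):
        print(description, len(sequence))
```

Returning the description and the sequence separately makes it easy to check that exactly two records were supplied before a comparison is started.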

Fig. 3.2. Output file showing the separate mutations of the protein (panels: ListView in C#; DataGridView output file).

We select two sequences for analysis, retrieved in FASTA format, and the sequences undergo mutation analysis. Before that, the variation between the nucleotide sequences is displayed by means of a ListView control. Thymine residues are shown in orange, adenine residues in blue, guanine in rose, and cytosine in yellow. The output file gives the positions where the mutations occur: for the protein, a mutation occurs at position 1 for K and W, and for C a mutation occurs at position 40 (Fig. 3.2).

A comparison between our software and another tool (named Transcription and Translation Tool) is shown in Table 3.1.

BLAST and FASTA are two algorithms used to compare sequences of amino acids, DNA, proteins and nucleotides of diverse species and to look for similarities. These sequence-comparison algorithms were written with speed in mind: as the databanks of sequences swelled once DNA was isolated in the laboratory by scientists in the 1980s, there was an increased need to compare sequences and find corresponding genes at high speed for further research.

Table 3.1. Comparison of software

Our tool | Transcription and Translation Tool
Works without an internet connection | Needs an internet connection to work
Uses FASTA format | Uses plain sequence format
Can color DNA sequences | Cannot color DNA sequences
Reports the length of DNA and protein sequences | Does not report the length of sequences
Can load two sequences | Can load only one sequence
Can separate mutations in DNA sequences | Cannot separate mutations in DNA sequences
Does not display RNA; goes directly from DNA to protein | Shows RNA before protein
Can color protein sequences | Cannot color protein sequences
Shows mutation positions in DNA and protein sequences | Cannot

FASTA was the most widely used protein and DNA sequence database search program before the arrival of BLAST. It is similar to BLAST in many ways and is still frequently used. Like BLAST, it is a heuristic for approximating the Smith-Waterman algorithm, but it uses different heuristic methods to increase speed. BLAST and FASTA also use slightly different methods to calculate statistical significance. Our software uses the FASTA format. Other software based on the FASTA format could not separate the mutated parts of a DNA segment and of a protein segment; our software adds this capability by highlighting the mutated parts of proteins and nucleotides in color.

4. Conclusion

The purpose of this work is to perform a mutation analysis of each DNA sequence, followed by a comparison that also tracks the positions of the mutations. A DNA sequence is made up of four types of bases, symbolized by the four letters A, C, G and T. The software colors all the bases of the DNA sequences, with each color indicating a specific nucleotide: deep pink for G, gold for C, light sky blue for A, and coral for T. This property of the software gives the user details about the content of each type of nucleotide. The software then translates the DNA to protein and compares the protein sequences as well.

The coding sequence of a protein is symbolized by the four letters A, C, G and U, and each group of three letters corresponds to one amino acid, depending on the codon. This bioinformatics tool also gives each symbol its own color, so that the four different characters can be told apart in less time and the mutated regions are easy to predict. Thus, the system improves the quality of testing and provides greater efficiency by means of various mutation operators. Computerized mutation testing is performed without manual intervention.

In biological science, any change in the structure of a DNA sequence can lead to a change in the protein sequence, which may appear as an abnormality in the human body; this is called a mutation.

The results of this software are simple for the user to understand. In terms of speed and efficiency, the software is fast and efficient. In addition, the software works offline and is easy to install on Windows systems.

Acknowledgments

The authors are grateful for the support provided by Yüzüncü Yıl University.

References

[1] Mathur P. “Mutation Testing,” in Encyclopedia of Software Engineering, J. J. Marciniak, Ed., 1994, pp. 707–713.
[2] DeMillo RA, Lipton RJ, Sayward FG. “Hints on Test Data Selection: Help for the Practicing Programmer,” Computer, vol. 11, no. 4, pp. 34–41, April 1978.
[3] Hamlet RG. “Testing Programs with the Aid of a Compiler,” IEEE Transactions on Software Engineering, July 1977, 3(4):279–290.
[4] Acree AT, Budd TA, DeMillo RA, Lipton RJ, Sayward FG. “Mutation Analysis,” Georgia Institute of Technology, Atlanta, Georgia, Technical Report GIT-ICS-79/08, 1979.

[5] Budd TA, DeMillo RA, Lipton RJ, Sayward FG. “The Design of a Prototype Mutation System for Program Testing,” in Proceedings of the AFIPS National Computer Conference, vol. 74. Anaheim, New Jersey: ACM, 5-8 June 1978, pp. 623–627.
[6] Budd TA, Sayward FG. “Users Guide to the Pilot Mutation System,” Yale University, New Haven, Connecticut, Technical Report 114, 1977.
[7] Bowser JH. “Reference Manual for Ada Mutant Operators,” Georgia Institute of Technology, Atlanta, Georgia, Technical Report GITSERC-88/02, 1988.
[8] Offutt AJ, Voas J, Payn J. “Mutation Operators for Ada,” George Mason University, Fairfax, Virginia, Technical Report ISSE-TR-96-09, 1996.
[9] Agrawal H, DeMillo RA, Hathaway B, Hsu W, Hsu W, Krauser EW, Martin RJ, Mathur AP, Spafford E. “Design of Mutant Operators for the C Programming Language,” Purdue University, West Lafayette, Indiana, Technical Report SERC-TR-41-P, March 1989.
[10] Delamaro ME, Maldonado JC, Mathur AP. “Interface Mutation: An Approach for Integration Testing,” IEEE Transactions on Software Engineering, May 2001, 27(3):228–247.
[11] Vilela P, Machado M, Wong WE. “Testing for Security Vulnerabilities in Software,” in Software Engineering and Applications, 2002.
[12] Chevalley P. “Applying Mutation Analysis for Object-oriented Programs Using a Reflective Approach,” in Proceedings of the 8th Asia-Pacific Software Engineering Conference (APSEC 01), Macau, China, 4-7 December 2001, p. 267.
[13] Chevalley P, Thévenod-Fosse P. “A Mutation Analysis Tool for Java Programs,” International Journal on Software Tools for Technology Transfer, November 2002, 5(1):90–103.
[14] Ma YS, Offutt AJ, Kwon YR. “MuJava: An Automated Class Mutation System,” Software Testing, Verification & Reliability, vol. 15, no. 2, pp. 97–133, June 2005.
[15] Derezińska A. “Object-oriented Mutation to Assess the Quality of Tests,” in Proceedings of the 29th Euromicro Conference, Belek, Turkey, 1-6 September 2003, pp. 417–420.
[16] Derezińska A. “Advanced Mutation Operators Applicable in C# Programs,” Warsaw University of Technology, Warszawa, Poland, Technical Report, 2005.
[17] Derezińska A. “Quality Assessment of Mutation Operators Dedicated for C# Programs,” in Proceedings of the 6th International Conference on Quality Software (QSIC’06), Beijing, China, 27-28 October 2006.
[18] Derezińska A, Szustek A. “CREAM - A System for Object-Oriented Mutation of C# Programs,” Warsaw University of Technology, Warszawa, Poland, Technical Report, 2007.
[19] Derezińska A, Szustek A. “Tool-Supported Advanced Mutation Approach for Verification of C# Programs,” in Proceedings of the 3rd International Conference on Dependability of Computer Systems (DepCoS-RELCOMEX’08), Szklarska Poręba, Poland, 26-28 June 2008, pp. 261–268.
[20] Shahriar H, Zulkernine M. “MUSIC: Mutation-based SQL Injection Vulnerability Checking,” in Proceedings of the 8th International Conference on Quality Software (QSIC’08), Oxford, UK, 12-13 August 2008, pp. 77–86.
[21] Tuya J, Cabal MJS, de la Riva C. “SQLMutation: A Tool to Generate Mutants of SQL Database Queries,” in Proceedings of the 2nd Workshop on Mutation Analysis (MUTATION’06). Raleigh, North Carolina: IEEE Computer Society, November 2006, p. 1.
[22] Anbalagan P, Xie T. “Automated Generation of Pointcut Mutants for Testing Pointcuts in AspectJ Programs,” in Proceedings of the 19th International Symposium on Software Reliability Engineering (ISSRE’08). Redmond, Washington: IEEE Computer Society, 11-14 November 2008, pp. 239–248.
[23] Ferrari FC, Maldonado JC, Rashid A. “Mutation Testing for Aspect-Oriented Programs,” in Proceedings of the 1st International Conference on Software Testing, Verification, and Validation (ICST’08). Lillehammer, Norway: IEEE Computer Society, 9-11 April 2008, pp. 52–61.
[24] Kim JW, Kim E, Park K. “Fast Matching Method for DNA Sequences,” in Combinatorics, Algorithms, Probabilistic and Experimental Methodologies, volume 4614 of LNCS, pp. 271–281, 2007.
[25] Setubal & Meidanis. Introduction to Computational Molecular Biology, PWS Publishing Company, 1997. Chapter 3.
