• 1.06 MB
  • 2022-04-29 14:33:31 发布

数据库系统概念全套配套课件PPT ch12.ppt

  • 57页
  • 当前文档由用户上传发布,收益归属用户
  1. 1、本文档共5页,可阅读全部内容。
  2. 2、本文档内容版权归属内容提供方,所产生的收益全部归内容提供方所有。如果您对本文有版权争议,可选择认领,认领后既往收益都归您。
  3. 3、本文档由用户上传,本站不保证质量和数量令人满意,可能有诸多瑕疵,付费之前,请仔细先通过免费阅读内容等途径辨别内容交易风险。如存在严重挂羊头卖狗肉之情形,可联系本站下载客服投诉处理。
  4. 文档侵权举报电话:19940600175。
'Chapter12:QueryProcessing Chapter12:QueryProcessingOverviewMeasuresofQueryCostSelectionOperationSortingJoinOperationOtherOperationsEvaluationofExpressions BasicStepsinQueryProcessing1.Parsingandtranslation2.Optimization3.Evaluation BasicStepsinQueryProcessing(Cont.)Parsingandtranslationtranslatethequeryintoitsinternalform.Thisisthentranslatedintorelationalalgebra.Parsercheckssyntax,verifiesrelationsEvaluationThequery-executionenginetakesaquery-evaluationplan,executesthatplan,andreturnstheanswerstothequery. BasicStepsinQueryProcessing:OptimizationArelationalalgebraexpressionmayhavemanyequivalentexpressionsE.g.,salary75000(salary(instructor))isequivalenttosalary(salary75000(instructor))EachrelationalalgebraoperationcanbeevaluatedusingoneofseveraldifferentalgorithmsCorrespondingly,arelational-algebraexpressioncanbeevaluatedinmanyways.Annotatedexpressionspecifyingdetailedevaluationstrategyiscalledanevaluation-plan.E.g.,canuseanindexonsalarytofindinstructorswithsalary<75000,orcanperformcompleterelationscananddiscardinstructorswithsalary75000 BasicSteps:Optimization(Cont.)QueryOptimization:Amongstallequivalentevaluationplanschoosetheonewithlowestcost.Costisestimatedusingstatisticalinformationfromthe databasecataloge.g.numberoftuplesineachrelation,sizeoftuples,etc.InthischapterwestudyHowtomeasurequerycostsAlgorithmsforevaluatingrelationalalgebraoperationsHowtocombinealgorithmsforindividualoperationsinordertoevaluateacompleteexpressionInChapter14Westudyhowtooptimizequeries,thatis,howtofindanevaluationplanwithlowestestimatedcost MeasuresofQueryCostCostisgenerallymeasuredastotalelapsedtimeforansweringqueryManyfactorscontributetotimecostdiskaccesses,CPU,orevennetworkcommunicationTypicallydiskaccessisthepredominantcost,andisalsorelativelyeasytoestimate.MeasuredbytakingintoaccountNumberofseeks*average-seek-costNumberofblocksread*average-block-read-costNumberofblockswritten*average-block-write-costCosttowriteablockisgreaterthancosttoreadablockdataisreadbackafterbeingwrittentoensurethatthewritewassuccessful MeasuresofQueryCost(Cont.)ForsimplicitywejustusethenumberofblocktransfersfromdiskandthenumberofseeksasthecostmeasurestT–timetotransferoneblocktS–timeforoneseekCostforbblocktransfersplusSseeksb*tT+S*tSWeignoreCPUcostsforsimplicityRealsystemsdotakeCPUcostintoaccountWedonotincludecosttowritingoutputtodiskinourcostformulae MeasuresofQueryCost(Cont.)SeveralalgorithmscanreducediskIObyusingextrabufferspaceAmountofrealmemoryavailabletobufferdependsonotherconcurrentqueriesandOSprocesses,knownonlyduringexecutionWeoftenuseworstcaseestimates,assumingonlytheminimumamountofmemoryneededfortheoperationisavailableRequireddatamaybebufferresidentalready,avoidingdiskI/OButhardtotakeintoaccountforcostestimation SelectionOperationFilescanAlgorithmA1(linearsearch).Scaneachfileblockandtestallrecordstoseewhethertheysatisfytheselectioncondition.Costestimate=brblocktransfers+1seekbrdenotesnumberofblockscontainingrecordsfromrelationrIfselectionisonakeyattribute,canstoponfindingrecordcost=(br/2)blocktransfers+1seekLinearsearchcanbeappliedregardlessofselectionconditionororderingofrecordsinthefile,oravailabilityofindicesNote:binarysearchgenerallydoesnotmakesensesincedataisnotstoredconsecutivelyexceptwhenthereisanindexavailable,andbinarysearchrequiresmoreseeksthanindexsearch SelectionsUsingIndicesIndexscan–searchalgorithmsthatuseanindexselectionconditionmustbeonsearch-keyofindex.A2(primaryindex,equalityonkey).RetrieveasinglerecordthatsatisfiesthecorrespondingequalityconditionCost=(hi+1)*(tT+tS)A3(primaryindex,equalityonnonkey)Retrievemultiplerecords.RecordswillbeonconsecutiveblocksLetb=numberofblockscontainingmatchingrecordsCost=hi*(tT+tS)+tS+tT*b SelectionsUsingIndicesA4(secondaryindex,equalityonnonkey).Retrieveasinglerecordifthesearch-keyisacandidatekeyCost=(hi+1)*(tT+tS)Retrievemultiplerecordsifsearch-keyisnotacandidatekeyeachofnmatchingrecordsmaybeonadifferentblockCost=(hi+n)*(tT+tS)Canbeveryexpensive! SelectionsInvolvingComparisonsCanimplementselectionsoftheformAV(r)orAV(r)byusingalinearfilescan,orbyusingindicesinthefollowingways:A5(primaryindex,comparison).(RelationissortedonA)ForAV(r)useindextofindfirsttuplevandscanrelationsequentiallyfromthereForAV(r)justscanrelationsequentiallytillfirsttuple>v;donotuseindexA6(secondaryindex,comparison).ForAV(r)useindextofindfirstindexentryvandscanindexsequentiallyfromthere,tofindpointerstorecords.ForAV(r)justscanleafpagesofindexfindingpointerstorecords,tillfirstentry>vIneithercase,retrieverecordsthatarepointedtorequiresanI/OforeachrecordLinearfilescanmaybecheaper ImplementationofComplexSelectionsConjunction:12...n(r)A7(conjunctiveselectionusingoneindex).SelectacombinationofiandalgorithmsA1throughA7thatresultsintheleastcostfori(r).Testotherconditionsontupleafterfetchingitintomemorybuffer.A8(conjunctiveselectionusingcompositeindex).Useappropriatecomposite(multiple-key)indexifavailable.A9(conjunctiveselectionbyintersectionofidentifiers).Requiresindiceswithrecordpointers.Usecorrespondingindexforeachcondition,andtakeintersectionofalltheobtainedsetsofrecordpointers.ThenfetchrecordsfromfileIfsomeconditionsdonothaveappropriateindices,applytestinmemory. AlgorithmsforComplexSelectionsDisjunction:12...n(r).A10(disjunctiveselectionbyunionofidentifiers).Applicableifallconditionshaveavailableindices.Otherwiseuselinearscan.Usecorrespondingindexforeachcondition,andtakeunionofalltheobtainedsetsofrecordpointers.ThenfetchrecordsfromfileNegation:(r)UselinearscanonfileIfveryfewrecordssatisfy,andanindexisapplicabletoFindsatisfyingrecordsusingindexandfetchfromfile SortingWemaybuildanindexontherelation,andthenusetheindextoreadtherelationinsortedorder.Mayleadtoonediskblockaccessforeachtuple.Forrelationsthatfitinmemory,techniqueslikequicksortcanbeused.Forrelationsthatdon’tfitinmemory,external sort-mergeisagoodchoice. ExternalSort-MergeCreatesortedruns.Letibe0initially.Repeatedlydothefollowingtilltheendoftherelation: (a)ReadMblocksofrelationintomemory (b)Sortthein-memoryblocks (c)WritesorteddatatorunRi;incrementi.LetthefinalvalueofibeNMergetheruns(nextslide)…..LetMdenotememorysize(inpages). ExternalSort-Merge(Cont.)Mergetheruns(N-waymerge).Weassume(fornow)thatN> ComplexJoinsJoinwithaconjunctivecondition:r12...nsEitherusenestedloops/blocknestedloops,orComputetheresultofoneofthesimplerjoinsrisfinalresultcomprisesthosetuplesintheintermediateresultthatsatisfytheremainingconditions1...i–1i+1...nJoinwithadisjunctiveconditionr12...nsEitherusenestedloops/blocknestedloops,orComputeastheunionoftherecordsinindividualjoinsris:(r1s)(r2s)...(rns) OtherOperationsDuplicateeliminationcanbeimplementedviahashingorsorting.Onsortingduplicateswillcomeadjacenttoeachother,andallbutonesetofduplicatescanbedeleted.Optimization:duplicatescanbedeletedduringrungenerationaswellasatintermediatemergestepsinexternalsort-merge.Hashingissimilar–duplicateswillcomeintothesamebucket.Projection:performprojectiononeachtuplefollowedbyduplicateelimination. OtherOperations:AggregationAggregationcanbeimplementedinamannersimilartoduplicateelimination.Sortingorhashingcanbeusedtobringtuplesinthesamegrouptogether,andthentheaggregatefunctionscanbeappliedoneachgroup.Optimization:combinetuplesinthesamegroupduringrungenerationandintermediatemerges,bycomputingpartialaggregatevaluesForcount,min,max,sum:keepaggregatevaluesontuplesfoundsofarinthegroup.Whencombiningpartialaggregateforcount,adduptheaggregatesForavg,keepsumandcount,anddividesumbycountattheend OtherOperations:SetOperationsSetoperations(,and):caneitherusevariantofmerge-joinaftersorting,orvariantofhash-join.E.g.,Setoperationsusinghashing:PartitionbothrelationsusingthesamehashfunctionProcesseachpartitioniasfollows.Usingadifferenthashingfunction,buildanin-memoryhashindexonri.Processsiasfollowsrs:Addtuplesinsitothehashindexiftheyarenotalreadyinit.Atendofsiaddthetuplesinthehashindextotheresult. OtherOperations:SetOperationsE.g.,Setoperationsusinghashing:asbeforepartitionrands,asbefore,processeachpartitioniasfollowsbuildahashindexonriProcesssiasfollowsrs:outputtuplesinsitotheresultiftheyarealreadythereinthehashindexr–s:foreachtupleinsi,ifitisthereinthehashindex,deleteitfromtheindex.Atendofsiaddremainingtuplesinthehashindextotheresult. OtherOperations:OuterJoinOuterjoincanbecomputedeitherasAjoinfollowedbyadditionofnull-paddednon-participatingtuples.bymodifyingthejoinalgorithms.ModifyingmergejointocomputersInrs,nonparticipatingtuplesarethoseinr–R(rs)Modifymerge-jointocomputers:Duringmerging,foreverytupletrfromrthatdonotmatchanytupleins,outputtrpaddedwithnulls.Rightouter-joinandfullouter-joincanbecomputedsimilarly. OtherOperations:OuterJoinModifyinghashjointocomputersIfrisproberelation,outputnon-matchingrtuplespaddedwithnullsIfrisbuildrelation,whenprobingkeeptrackofwhichrtuplesmatchedstuples.Atendofsioutputnon-matchedrtuplespaddedwithnulls EvaluationofExpressionsSofar:wehaveseenalgorithmsforindividualoperationsAlternativesforevaluatinganentireexpressiontreeMaterialization:generateresultsofanexpressionwhoseinputsarerelationsorarealreadycomputed,materialize(store)itondisk.Repeat.Pipelining:passontuplestoparentoperationsevenasanoperationisbeingexecutedWestudyabovealternativesinmoredetail MaterializationMaterializedevaluation:evaluateoneoperationatatime,startingatthelowest-level.Useintermediateresultsmaterializedintotemporaryrelationstoevaluatenext-leveloperations.E.g.,infigurebelow,computeandstore thencomputethestoreitsjoinwithinstructor,andfinallycomputetheprojectiononname. Materialization(Cont.)MaterializedevaluationisalwaysapplicableCostofwritingresultstodiskandreadingthembackcanbequitehighOurcostformulasforoperationsignorecostofwritingresultstodisk,soOverallcost=Sumofcostsofindividualoperations+ costofwritingintermediateresultstodiskDoublebuffering:usetwooutputbuffersforeachoperation,whenoneisfullwriteittodiskwhiletheotherisgettingfilledAllowsoverlapofdiskwriteswithcomputationandreducesexecutiontime PipeliningPipelinedevaluation:evaluateseveraloperationssimultaneously,passingtheresultsofoneoperationontothenext.E.g.,inpreviousexpressiontree,don’tstoreresultofinstead,passtuplesdirectlytothejoin..Similarly,don’tstoreresultofjoin,passtuplesdirectlytoprojection.Muchcheaperthanmaterialization:noneedtostoreatemporaryrelationtodisk.Pipeliningmaynotalwaysbepossible–e.g.,sort,hash-join.Forpipeliningtobeeffective,useevaluationalgorithmsthatgenerateoutputtuplesevenastuplesarereceivedforinputstotheoperation.Pipelinescanbeexecutedintwoways:demanddrivenandproducerdriven Pipelining(Cont.)IndemanddrivenorlazyevaluationsystemrepeatedlyrequestsnexttuplefromtopleveloperationEachoperationrequestsnexttuplefromchildrenoperationsasrequired,inordertooutputitsnexttupleInbetweencalls,operationhastomaintain“state”soitknowswhattoreturnnextInproducer-drivenoreagerpipeliningOperatorsproducetupleseagerlyandpassthemuptotheirparentsBuffermaintainedbetweenoperators,childputstuplesinbuffer,parentremovestuplesfrombufferifbufferisfull,childwaitstillthereisspaceinthebuffer,andthengeneratesmoretuplesSystemschedulesoperationsthathavespaceinoutputbufferandcanprocessmoreinputtuplesAlternativename:pullandpushmodelsofpipelining Pipelining(Cont.)Implementationofdemand-drivenpipeliningEachoperationisimplementedasaniteratorimplementingthefollowingoperationsopen()E.g.filescan:initializefilescanstate:pointertobeginningoffileE.g.mergejoin:sortrelations;state:pointerstobeginningofsortedrelationsnext()E.g.forfilescan:Outputnexttuple,andadvanceandstorefilepointerE.g.formergejoin:continuewithmergefromearlierstatetill nextoutputtupleisfound.Savepointersasiteratorstate.close() EvaluationAlgorithmsforPipeliningSomealgorithmsarenotabletooutputresultsevenastheygetinputtuplesE.g.mergejoin,orhashjoinintermediateresultswrittentodiskandthenreadbackAlgorithmvariantstogenerate(atleastsome)resultsonthefly,asinputtuplesarereadinE.g.hybridhashjoingeneratesoutputtuplesevenasproberelationtuplesinthein-memorypartition(partition0)arereadinDouble-pipelinedjointechnique:Hybridhashjoin,modifiedtobufferpartition0tuplesofbothrelationsin-memory,readingthemastheybecomeavailable,andoutputresultsofanymatchesbetweenpartition0tuplesWhenanewr0tupleisfound,matchitwithexistings0tuples,outputmatches,andsaveitinr0Symmetricallyfors0tuples EndofChapter Figure12.02 SelectionOperation(Cont.)Old-A2(binarysearch).Applicableifselectionisanequalitycomparisonontheattributeonwhichfileisordered.AssumethattheblocksofarelationarestoredcontiguouslyCostestimate(numberofdiskblockstobescanned):costoflocatingthefirsttuplebyabinarysearchontheblockslog2(br)*(tT+tS)IftherearemultiplerecordssatisfyingselectionAddtransfercostofthenumberofblockscontainingrecordsthatsatisfyselectionconditionWillseehowtoestimatethiscostinChapter13'