- 1.06 MB
- 2022-04-29 14:33:31 发布
- 1、本文档共5页,可阅读全部内容。
- 2、本文档内容版权归属内容提供方,所产生的收益全部归内容提供方所有。如果您对本文有版权争议,可选择认领,认领后既往收益都归您。
- 3、本文档由用户上传,本站不保证质量和数量令人满意,可能有诸多瑕疵,付费之前,请仔细先通过免费阅读内容等途径辨别内容交易风险。如存在严重挂羊头卖狗肉之情形,可联系本站下载客服投诉处理。
- 文档侵权举报电话:19940600175。
'Chapter12:QueryProcessing
Chapter12:QueryProcessingOverviewMeasuresofQueryCostSelectionOperationSortingJoinOperationOtherOperationsEvaluationofExpressions
BasicStepsinQueryProcessing1.Parsingandtranslation2.Optimization3.Evaluation
BasicStepsinQueryProcessing(Cont.)Parsingandtranslationtranslatethequeryintoitsinternalform.Thisisthentranslatedintorelationalalgebra.Parsercheckssyntax,verifiesrelationsEvaluationThequery-executionenginetakesaquery-evaluationplan,executesthatplan,andreturnstheanswerstothequery.
BasicStepsinQueryProcessing:OptimizationArelationalalgebraexpressionmayhavemanyequivalentexpressionsE.g.,salary75000(salary(instructor))isequivalenttosalary(salary75000(instructor))EachrelationalalgebraoperationcanbeevaluatedusingoneofseveraldifferentalgorithmsCorrespondingly,arelational-algebraexpressioncanbeevaluatedinmanyways.Annotatedexpressionspecifyingdetailedevaluationstrategyiscalledanevaluation-plan.E.g.,canuseanindexonsalarytofindinstructorswithsalary<75000,orcanperformcompleterelationscananddiscardinstructorswithsalary75000
BasicSteps:Optimization(Cont.)QueryOptimization:Amongstallequivalentevaluationplanschoosetheonewithlowestcost.Costisestimatedusingstatisticalinformationfromthedatabasecataloge.g.numberoftuplesineachrelation,sizeoftuples,etc.InthischapterwestudyHowtomeasurequerycostsAlgorithmsforevaluatingrelationalalgebraoperationsHowtocombinealgorithmsforindividualoperationsinordertoevaluateacompleteexpressionInChapter14Westudyhowtooptimizequeries,thatis,howtofindanevaluationplanwithlowestestimatedcost
MeasuresofQueryCostCostisgenerallymeasuredastotalelapsedtimeforansweringqueryManyfactorscontributetotimecostdiskaccesses,CPU,orevennetworkcommunicationTypicallydiskaccessisthepredominantcost,andisalsorelativelyeasytoestimate.MeasuredbytakingintoaccountNumberofseeks*average-seek-costNumberofblocksread*average-block-read-costNumberofblockswritten*average-block-write-costCosttowriteablockisgreaterthancosttoreadablockdataisreadbackafterbeingwrittentoensurethatthewritewassuccessful
MeasuresofQueryCost(Cont.)ForsimplicitywejustusethenumberofblocktransfersfromdiskandthenumberofseeksasthecostmeasurestT–timetotransferoneblocktS–timeforoneseekCostforbblocktransfersplusSseeksb*tT+S*tSWeignoreCPUcostsforsimplicityRealsystemsdotakeCPUcostintoaccountWedonotincludecosttowritingoutputtodiskinourcostformulae
MeasuresofQueryCost(Cont.)SeveralalgorithmscanreducediskIObyusingextrabufferspaceAmountofrealmemoryavailabletobufferdependsonotherconcurrentqueriesandOSprocesses,knownonlyduringexecutionWeoftenuseworstcaseestimates,assumingonlytheminimumamountofmemoryneededfortheoperationisavailableRequireddatamaybebufferresidentalready,avoidingdiskI/OButhardtotakeintoaccountforcostestimation
SelectionOperationFilescanAlgorithmA1(linearsearch).Scaneachfileblockandtestallrecordstoseewhethertheysatisfytheselectioncondition.Costestimate=brblocktransfers+1seekbrdenotesnumberofblockscontainingrecordsfromrelationrIfselectionisonakeyattribute,canstoponfindingrecordcost=(br/2)blocktransfers+1seekLinearsearchcanbeappliedregardlessofselectionconditionororderingofrecordsinthefile,oravailabilityofindicesNote:binarysearchgenerallydoesnotmakesensesincedataisnotstoredconsecutivelyexceptwhenthereisanindexavailable,andbinarysearchrequiresmoreseeksthanindexsearch
SelectionsUsingIndicesIndexscan–searchalgorithmsthatuseanindexselectionconditionmustbeonsearch-keyofindex.A2(primaryindex,equalityonkey).RetrieveasinglerecordthatsatisfiesthecorrespondingequalityconditionCost=(hi+1)*(tT+tS)A3(primaryindex,equalityonnonkey)Retrievemultiplerecords.RecordswillbeonconsecutiveblocksLetb=numberofblockscontainingmatchingrecordsCost=hi*(tT+tS)+tS+tT*b
SelectionsUsingIndicesA4(secondaryindex,equalityonnonkey).Retrieveasinglerecordifthesearch-keyisacandidatekeyCost=(hi+1)*(tT+tS)Retrievemultiplerecordsifsearch-keyisnotacandidatekeyeachofnmatchingrecordsmaybeonadifferentblockCost=(hi+n)*(tT+tS)Canbeveryexpensive!
SelectionsInvolvingComparisonsCanimplementselectionsoftheformAV(r)orAV(r)byusingalinearfilescan,orbyusingindicesinthefollowingways:A5(primaryindex,comparison).(RelationissortedonA)ForAV(r)useindextofindfirsttuplevandscanrelationsequentiallyfromthereForAV(r)justscanrelationsequentiallytillfirsttuple>v;donotuseindexA6(secondaryindex,comparison).ForAV(r)useindextofindfirstindexentryvandscanindexsequentiallyfromthere,tofindpointerstorecords.ForAV(r)justscanleafpagesofindexfindingpointerstorecords,tillfirstentry>vIneithercase,retrieverecordsthatarepointedtorequiresanI/OforeachrecordLinearfilescanmaybecheaper
ImplementationofComplexSelectionsConjunction:12...n(r)A7(conjunctiveselectionusingoneindex).SelectacombinationofiandalgorithmsA1throughA7thatresultsintheleastcostfori(r).Testotherconditionsontupleafterfetchingitintomemorybuffer.A8(conjunctiveselectionusingcompositeindex).Useappropriatecomposite(multiple-key)indexifavailable.A9(conjunctiveselectionbyintersectionofidentifiers).Requiresindiceswithrecordpointers.Usecorrespondingindexforeachcondition,andtakeintersectionofalltheobtainedsetsofrecordpointers.ThenfetchrecordsfromfileIfsomeconditionsdonothaveappropriateindices,applytestinmemory.
AlgorithmsforComplexSelectionsDisjunction:12...n(r).A10(disjunctiveselectionbyunionofidentifiers).Applicableifallconditionshaveavailableindices.Otherwiseuselinearscan.Usecorrespondingindexforeachcondition,andtakeunionofalltheobtainedsetsofrecordpointers.ThenfetchrecordsfromfileNegation:(r)UselinearscanonfileIfveryfewrecordssatisfy,andanindexisapplicabletoFindsatisfyingrecordsusingindexandfetchfromfile
SortingWemaybuildanindexontherelation,andthenusetheindextoreadtherelationinsortedorder.Mayleadtoonediskblockaccessforeachtuple.Forrelationsthatfitinmemory,techniqueslikequicksortcanbeused.Forrelationsthatdon’tfitinmemory,externalsort-mergeisagoodchoice.
ExternalSort-MergeCreatesortedruns.Letibe0initially.Repeatedlydothefollowingtilltheendoftherelation:(a)ReadMblocksofrelationintomemory(b)Sortthein-memoryblocks(c)WritesorteddatatorunRi;incrementi.LetthefinalvalueofibeNMergetheruns(nextslide)…..LetMdenotememorysize(inpages).
ExternalSort-Merge(Cont.)Mergetheruns(N-waymerge).Weassume(fornow)thatN>
ComplexJoinsJoinwithaconjunctivecondition:r12...nsEitherusenestedloops/blocknestedloops,orComputetheresultofoneofthesimplerjoinsrisfinalresultcomprisesthosetuplesintheintermediateresultthatsatisfytheremainingconditions1...i–1i+1...nJoinwithadisjunctiveconditionr12...nsEitherusenestedloops/blocknestedloops,orComputeastheunionoftherecordsinindividualjoinsris:(r1s)(r2s)...(rns)
OtherOperationsDuplicateeliminationcanbeimplementedviahashingorsorting.Onsortingduplicateswillcomeadjacenttoeachother,andallbutonesetofduplicatescanbedeleted.Optimization:duplicatescanbedeletedduringrungenerationaswellasatintermediatemergestepsinexternalsort-merge.Hashingissimilar–duplicateswillcomeintothesamebucket.Projection:performprojectiononeachtuplefollowedbyduplicateelimination.
OtherOperations:AggregationAggregationcanbeimplementedinamannersimilartoduplicateelimination.Sortingorhashingcanbeusedtobringtuplesinthesamegrouptogether,andthentheaggregatefunctionscanbeappliedoneachgroup.Optimization:combinetuplesinthesamegroupduringrungenerationandintermediatemerges,bycomputingpartialaggregatevaluesForcount,min,max,sum:keepaggregatevaluesontuplesfoundsofarinthegroup.Whencombiningpartialaggregateforcount,adduptheaggregatesForavg,keepsumandcount,anddividesumbycountattheend
OtherOperations:SetOperationsSetoperations(,and):caneitherusevariantofmerge-joinaftersorting,orvariantofhash-join.E.g.,Setoperationsusinghashing:PartitionbothrelationsusingthesamehashfunctionProcesseachpartitioniasfollows.Usingadifferenthashingfunction,buildanin-memoryhashindexonri.Processsiasfollowsrs:Addtuplesinsitothehashindexiftheyarenotalreadyinit.Atendofsiaddthetuplesinthehashindextotheresult.
OtherOperations:SetOperationsE.g.,Setoperationsusinghashing:asbeforepartitionrands,asbefore,processeachpartitioniasfollowsbuildahashindexonriProcesssiasfollowsrs:outputtuplesinsitotheresultiftheyarealreadythereinthehashindexr–s:foreachtupleinsi,ifitisthereinthehashindex,deleteitfromtheindex.Atendofsiaddremainingtuplesinthehashindextotheresult.
OtherOperations:OuterJoinOuterjoincanbecomputedeitherasAjoinfollowedbyadditionofnull-paddednon-participatingtuples.bymodifyingthejoinalgorithms.ModifyingmergejointocomputersInrs,nonparticipatingtuplesarethoseinr–R(rs)Modifymerge-jointocomputers:Duringmerging,foreverytupletrfromrthatdonotmatchanytupleins,outputtrpaddedwithnulls.Rightouter-joinandfullouter-joincanbecomputedsimilarly.
OtherOperations:OuterJoinModifyinghashjointocomputersIfrisproberelation,outputnon-matchingrtuplespaddedwithnullsIfrisbuildrelation,whenprobingkeeptrackofwhichrtuplesmatchedstuples.Atendofsioutputnon-matchedrtuplespaddedwithnulls
EvaluationofExpressionsSofar:wehaveseenalgorithmsforindividualoperationsAlternativesforevaluatinganentireexpressiontreeMaterialization:generateresultsofanexpressionwhoseinputsarerelationsorarealreadycomputed,materialize(store)itondisk.Repeat.Pipelining:passontuplestoparentoperationsevenasanoperationisbeingexecutedWestudyabovealternativesinmoredetail
MaterializationMaterializedevaluation:evaluateoneoperationatatime,startingatthelowest-level.Useintermediateresultsmaterializedintotemporaryrelationstoevaluatenext-leveloperations.E.g.,infigurebelow,computeandstorethencomputethestoreitsjoinwithinstructor,andfinallycomputetheprojectiononname.
Materialization(Cont.)MaterializedevaluationisalwaysapplicableCostofwritingresultstodiskandreadingthembackcanbequitehighOurcostformulasforoperationsignorecostofwritingresultstodisk,soOverallcost=Sumofcostsofindividualoperations+costofwritingintermediateresultstodiskDoublebuffering:usetwooutputbuffersforeachoperation,whenoneisfullwriteittodiskwhiletheotherisgettingfilledAllowsoverlapofdiskwriteswithcomputationandreducesexecutiontime
PipeliningPipelinedevaluation:evaluateseveraloperationssimultaneously,passingtheresultsofoneoperationontothenext.E.g.,inpreviousexpressiontree,don’tstoreresultofinstead,passtuplesdirectlytothejoin..Similarly,don’tstoreresultofjoin,passtuplesdirectlytoprojection.Muchcheaperthanmaterialization:noneedtostoreatemporaryrelationtodisk.Pipeliningmaynotalwaysbepossible–e.g.,sort,hash-join.Forpipeliningtobeeffective,useevaluationalgorithmsthatgenerateoutputtuplesevenastuplesarereceivedforinputstotheoperation.Pipelinescanbeexecutedintwoways:demanddrivenandproducerdriven
Pipelining(Cont.)IndemanddrivenorlazyevaluationsystemrepeatedlyrequestsnexttuplefromtopleveloperationEachoperationrequestsnexttuplefromchildrenoperationsasrequired,inordertooutputitsnexttupleInbetweencalls,operationhastomaintain“state”soitknowswhattoreturnnextInproducer-drivenoreagerpipeliningOperatorsproducetupleseagerlyandpassthemuptotheirparentsBuffermaintainedbetweenoperators,childputstuplesinbuffer,parentremovestuplesfrombufferifbufferisfull,childwaitstillthereisspaceinthebuffer,andthengeneratesmoretuplesSystemschedulesoperationsthathavespaceinoutputbufferandcanprocessmoreinputtuplesAlternativename:pullandpushmodelsofpipelining
Pipelining(Cont.)Implementationofdemand-drivenpipeliningEachoperationisimplementedasaniteratorimplementingthefollowingoperationsopen()E.g.filescan:initializefilescanstate:pointertobeginningoffileE.g.mergejoin:sortrelations;state:pointerstobeginningofsortedrelationsnext()E.g.forfilescan:Outputnexttuple,andadvanceandstorefilepointerE.g.formergejoin:continuewithmergefromearlierstatetillnextoutputtupleisfound.Savepointersasiteratorstate.close()
EvaluationAlgorithmsforPipeliningSomealgorithmsarenotabletooutputresultsevenastheygetinputtuplesE.g.mergejoin,orhashjoinintermediateresultswrittentodiskandthenreadbackAlgorithmvariantstogenerate(atleastsome)resultsonthefly,asinputtuplesarereadinE.g.hybridhashjoingeneratesoutputtuplesevenasproberelationtuplesinthein-memorypartition(partition0)arereadinDouble-pipelinedjointechnique:Hybridhashjoin,modifiedtobufferpartition0tuplesofbothrelationsin-memory,readingthemastheybecomeavailable,andoutputresultsofanymatchesbetweenpartition0tuplesWhenanewr0tupleisfound,matchitwithexistings0tuples,outputmatches,andsaveitinr0Symmetricallyfors0tuples
EndofChapter
Figure12.02
SelectionOperation(Cont.)Old-A2(binarysearch).Applicableifselectionisanequalitycomparisonontheattributeonwhichfileisordered.AssumethattheblocksofarelationarestoredcontiguouslyCostestimate(numberofdiskblockstobescanned):costoflocatingthefirsttuplebyabinarysearchontheblockslog2(br)*(tT+tS)IftherearemultiplerecordssatisfyingselectionAddtransfercostofthenumberofblockscontainingrecordsthatsatisfyselectionconditionWillseehowtoestimatethiscostinChapter13'
您可能关注的文档
- 施工企业会计第二版辛艳红配套教学课件PPT 第9章 所有者权益.ppt
- 施工企业会计第二版辛艳红配套教学课件PPT 第12章 利润及利润分配.ppt
- 施工企业会计第二版辛艳红配套教学课件PPT 第11章_收入.ppt
- 施工企业会计第二版辛艳红配套教学课件PPT 第10章 工程成本和期间费用.ppt
- 施工企业会计第二版辛艳红配套教学课件PPT 第1章_总论.ppt
- 数据结构课件PPT110章全 第四章 串.ppt
- 数据结构课件PPT110章全 第五章 数组和广义表.ppt
- 数据结构课件PPT110章全 第六章 树和二叉树.ppt
- 数据结构课件PPT110章全 第七章 图.ppt
- 数据库系统概念全套配套课件PPT ch6.ppt
- 数据库系统概念全套配套课件PPT ch2.ppt
- 数据库系统概念全套配套课件PPT appB.ppt
- 数据库系统概念全套配套课件PPT ch16.ppt
- 数据库系统概念全套配套课件PPT ch7.ppt
- 数据库系统概念全套配套课件PPT ch26.ppt
- 数据库系统概念全套配套课件PPT ch22.ppt
- 数据库系统概念全套配套课件PPT ch23.ppt
- 数据库系统概念全套配套课件PPT ch21.ppt