AN EFFICIENT QUERY OPTIMIZATION STRATEGY FOR SPATIO-TEMPORAL
QUERIES IN VIDEO DATABASES
a thesis
submitted to the department of computer engineering
and the institute of engineering and science
of bilkent university
in partial fulfillment of the requirements
for the degree of
master of science
By
Gulay Unel
July, 2002
I certify that I have read this thesis and that in my opinion it is fully adequate,
in scope and in quality, as a thesis for the degree of Master of Science.
Assoc. Prof. Dr. Ozgur Ulusoy (Supervisor)
I certify that I have read this thesis and that in my opinion it is fully adequate,
in scope and in quality, as a thesis for the degree of Master of Science.
Assist. Prof. Dr. Attila Gursoy
I certify that I have read this thesis and that in my opinion it is fully adequate,
in scope and in quality, as a thesis for the degree of Master of Science.
Assist. Prof. Dr. Ibrahim Korpeoglu
Approved for the Institute of Engineering and Science:
Prof. Dr. Mehmet B. Baray
AN EFFICIENT QUERY OPTIMIZATION STRATEGY
FOR SPATIO-TEMPORAL QUERIES IN VIDEO
DATABASES
Gulay Unel
M.S. in Computer Engineering
Supervisors: Assoc. Prof. Dr. Ozgur Ulusoy and
Assist. Prof. Dr. Ugur Gudukbay
July, 2002
Interest in multimedia database management systems has grown rapidly
due to the need to store huge volumes of multimedia data in computer
systems. An important building block of a multimedia database system is the
query processor, and a query optimizer embedded in the query processor is
needed to answer user queries efficiently. The query optimization problem has
been widely studied for conventional database systems; however, it is a new
research area for multimedia database systems. Due to the differences in query
processing strategies, the query optimization techniques used in multimedia
database systems are different from those used in traditional databases. In
this thesis, the query optimization problem in video database systems is
outlined and a query optimization strategy is proposed as a solution to this
problem. Reordering algorithms, to be applied to the query execution tree, are
also described. Finally, the performance results obtained by testing the
proposed algorithms are presented.
Keywords: video databases, query optimization, query tree, querying of video
OZET
AN EFFICIENT QUERY OPTIMIZATION STRATEGY
FOR SPATIO-TEMPORAL QUERIES IN VIDEO
DATABASES
Gulay Unel
M.S. in Computer Engineering
Thesis Supervisors: Assoc. Prof. Dr. Ozgur Ulusoy and
Assist. Prof. Dr. Ugur Gudukbay
July, 2002
Interest in multimedia database management systems has increased rapidly
because of the need to store large volumes of multimedia data. The query
processor is one of the important building blocks of a multimedia database
system, and a query optimizer placed inside the query processor is needed to
answer queries efficiently. The query optimization problem has been studied
extensively for conventional databases, whereas it is a new research area for
multimedia database systems. Because of the differences in query processing
strategies, the query optimization techniques used in multimedia database
systems differ from those used in traditional databases. In this thesis, the
query optimization problem in video database systems is outlined and a query
optimization strategy is proposed as a solution to this problem. In addition,
ordering algorithms to be applied to the query execution tree are defined.
Finally, the performance results obtained by testing the proposed algorithms
are presented.
Keywords: video databases, query optimization, query tree, video
I would like to express my special thanks and gratitude to my supervisors
Assoc. Prof. Dr. Ozgur Ulusoy and Assist. Prof. Dr. Ugur Gudukbay for their
concern in the supervision of the thesis.
I would like to express my gratitude to Assist. Prof. Dr. Attila Gursoy and
Assist. Prof. Dr. Ibrahim Korpeoglu for their interest in the subject matter
and for spending their time reading and reviewing the thesis.
I would like to acknowledge the support of the Turkish Scientific and
Technical Research Council (TUBITAK).
I would like to express my special thanks to Mehmet Emin Donderler for his
support and patience in all stages of the thesis research.
I thank my spouse Cuneyt, my brother Semih, my mother and father for
their support.
Finally, I would like to express my special thanks and gratitude to my manager
1 Introduction
1.1 Organization of the Thesis
2 Related Work
3 BilVideo: A Video DBMS
3.1 Video Database System Architecture
3.2 Video Query Language
3.3 Query Types
3.4 Query Processing
4 Query Optimization
4.1 Structure of the Query Tree
4.2 Internal Node Reordering Algorithm
4.2.1 Examples
4.3 Leaf Node Reordering Algorithm
4.3.1 Examples
5 Performance Results
5.1 Fact Base Statistics
5.2 Performance Results
5.3 Examples
6 Conclusions and Future Work
References
Appendices
A Sample Fact Base for an Example Video
3.1 BilVideo database system architecture
3.2 Web client - query processor interaction
3.3 Query processing phases
4.1 Query optimization process
4.2 Internal node reordering algorithm
4.3 (a) Initial query tree for Query 1 and (b) Query tree for Query 1 after internal node reordering
4.4 (a) Initial query tree for Query 2 and (b) Query tree for Query 2 after internal node reordering
4.5 Leaf node reordering algorithm
4.6 The function that finds the subquery tree of a leaf node
4.7 The function that reorders the located subquery tree
4.8 The function that finds if there is a `NOT-OR' type node in a tree
4.9 The function that orders leaf nodes
4.11 The function that sorts leaf nodes
4.12 The function that puts the elements to the leaf nodes
4.13 (a) Initial subquery tree for Query 1 and (b) Subquery tree for Query 1 after leaf node reordering
5.1 (a) Initial query tree for Query 1 and (b) Query tree for Query 1 after optimization
5.2 (a) Initial query tree for Query 2 and (b) Query tree for Query 2
5.1 The statistics of the fact base
5.2 Leaf node reorder algorithm test results (msecs)
5.3 Query optimization algorithm test results (msecs)
5.4 Convergence to the optimal query tree; first test results (msecs)
5.5 Convergence to the optimal query tree; second test results (msecs)
Introduction
Interest in multimedia database systems has grown rapidly with the
advances in computer technology. Research on content-based image retrieval
by visual features (color, shape and texture) and keywords [4, 5] has progressed
over time towards video databases dealing with spatio-temporal and semantic
features of video data. At first, the techniques devised for image retrieval
were used to support content-based video retrieval. These techniques treated a
video as a consecutive sequence of images ordered in time. Some video database
systems such as VideoQ, KMED, QBIC and OVID [6, 7, 5, 8] were implemented.
Querying video objects by motion properties has also been studied [16, 17, 18, 19].
The building blocks of multimedia database systems are the multimedia data
model, multimedia storage management, the query interface, and query processing
and retrieval. Data models used in multimedia Database Management Systems
(DBMSs) are different from those used in conventional DBMSs, so new modeling
techniques are required to represent the semantics of multimedia data. Besides,
a multimedia storage manager is needed, and storage devices capable of storing
large volumes of data must be supported to achieve better performance. The
query interface in a multimedia database system must enable the user to
construct well-defined queries easily. Query processing and retrieval is also
important, since providing powerful querying facilities on multimedia data is
a crucial issue.
exact queries on conventional types of data but querying multimedia databases
requires additional techniques to support multimedia data types, like image,
audio and video. Extensions to the conventional query languages are required
that take into account the particular characteristics of multimedia data. In
addition to these, different query optimization techniques need to be
implemented and integrated into the system.
The success of a database system depends on the effectiveness of its query
optimization module. The input to this module is some internal
representation of a query given by the user. This representation is the query
tree in our case. The aim of query optimization is to select the most efficient
strategy to access the relevant data and answer the query. Let S be the set of
all strategies (query trees) that can be used to answer a given query. Each
member s of S has a cost c(s). The goal of any optimization algorithm is to
find a member of S that has the minimum cost.
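The goal stated above can be written compactly as a minimization over the set of strategies. The symbol s* for the selected strategy is introduced here only for illustration (it does not appear in the thesis), and the operator notation assumes the amsmath package:

```latex
% s^{*}: the chosen query tree (illustrative symbol, not from the thesis)
% S: set of all query trees that answer the query; c(s): cost of tree s
s^{*} = \operatorname*{arg\,min}_{s \in S} c(s),
\qquad \text{i.e.} \qquad
c(s^{*}) \le c(s) \quad \text{for all } s \in S .
```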
Query optimization has been a challenging research area since the
beginning of relational database management systems. A summary of the
research efforts on query optimization and other related concepts in database
systems can be found in [10].
In this thesis, we study the query optimization problem in multimedia
database systems. Our work concentrates on reordering query trees when
processing queries in a multimedia database system to achieve the minimum cost.
We propose algorithms for reordering query trees. The goal of the optimization
algorithms is to change the order of processing the subqueries contained in the
query tree so that the parts that are more selective (i.e., result in
fewer frames and/or objects) are executed first. The query optimization module
contains two types of reorderings for query trees to ensure more efficient
processing of queries. The first type is internal node reordering, which
reconstructs the query tree by reordering the children of internal nodes. The
second type is leaf node reordering, which restructures the query contents of
the leaf nodes of the query tree. The query optimization algorithms are
implemented as a part of the query processor
The work done in this thesis constitutes a part of a video database system,
BilVideo, developed by Donderler et al. [1, 2, 3]. In this system, a rule-based
spatio-temporal model for videos and a video query processor, which can answer
spatial, temporal, trajectory, motion and object queries for videos, are
proposed. The work done in this thesis is integrated into the query processor
of BilVideo.
1.1 Organization of the Thesis
The remainder of the thesis is organized as follows. In Chapter 2, related work
on multimedia query optimization is discussed. The video database system, into
which the query optimization module is integrated, is described in Chapter 3.
In Chapter 4, our query optimization algorithms are presented. Performance
results are discussed in Chapter 5. Conclusions of our work and future research
directions are given in Chapter 6. Fact base of the example database and the
query sets
Related Work
Basic principles of query optimization in database systems are explained by
Jarke and Koch [10]. In their paper, a wide variety of approaches are proposed
to improve the performance of query processing, including logic-based and
semantic transformations, fast implementation of basic operations, and
combinatorial or heuristic algorithms for generating alternative access plans
and choosing among them. These methods are presented in the framework of a
general query evaluation procedure using the relational calculus representation
of queries. In addition to these methods, nonstandard query optimization issues
are also discussed in the paper. According to Jarke and Koch, the goals of
query transformation are: (1) the construction of a standardized starting point
for query optimization (standardization), (2) the elimination of redundancy
(simplification), and (3) the construction of expressions that are improved
with respect to evaluation performance (amelioration). The transformation rules
for the general query expressions referenced in the paper are also valid for
our query expressions.
Chaudhuri [13] focuses primarily on the optimization of SQL queries in
relational database systems. According to the paper, the two key components of
the query evaluation component of an SQL database system are the query
optimizer and the query execution engine. The paper discusses the System-R
optimization framework, the search space that is considered by optimizers, cost
estimation and
uses statistical summaries of data that have been stored. It also determines the
statistical summary of the output data stream and the estimated cost of
executing the operation, given an operator and the statistical summary for each
of its input data streams. The idea of collecting statistical summaries for
cost estimation is also used in our query optimization module.
The survey of query evaluation techniques for large databases by Graefe [11]
describes query evaluation techniques for both relational and postrelational
database systems, including iterative execution of complex query execution
plans, the duality of sort- and hash-based set-matching algorithms, types of
parallel query execution and their implementation, and special operators for
emerging database application domains. According to the survey, query
optimization is a special form of planning that employs techniques from
artificial intelligence such as plan representation, search including directed
search and pruning, dynamic programming, and branch-and-bound algorithms.
Semantic query optimization for tree and chain queries by Sun and Yu [9]
provides an effective and systematic approach for optimizing queries by
appropriately choosing semantically equivalent transformations. Basically,
there are two different types of transformations: transformations by
eliminating unnecessary joins, and transformations by adding/eliminating
redundant beneficial/nonbeneficial selection operations (restrictions). An
algorithm is proposed by Sun and Yu to minimize the number of joins in tree
queries. They claim that the important operations in semantic query
optimization are the detection of a contradiction, the elimination of as many
unnecessary joins as possible, and the addition/elimination of
beneficial/nonbeneficial redundant restrictions.
Alternative plan generation methods for multiple query optimization by
Menekse et al. [12] focus on generating a number of alternative plans in such
a way that the sharing between queries is maximized and an optimal execution
plan with minimal cost is obtained. They state that a global execution plan can
be constructed by choosing one plan for each query and then merging these plans
together. Two algorithms for alternative plan generation have been implemented,
alternative plan generation is also proposed to eliminate useless alternative
plans by introducing a sharing factor concept.
The paper by Soffer and Samet [14] presents optimization methods for
processing of pictorial queries specified by pictorial query trees. Their
optimization strategy for computing the result of the pictorial query tree is
to change the order of processing individual query images in order to execute
the parts that are more selective first. The selectivity of a pictorial query
is based on matching selectivity, contextual selectivity, and spatial
selectivity. Matching and contextual selectivity are computed based on the
statistics stored as histograms in the database that indicate the distribution
of classifications and certainty levels in the images. These histograms are
constructed when populating the database. The selectivity of an individual
pictorial query (leaf) is computed by combining these three selectivity
factors. The query language used in their system has different characteristics
from the query language that we used. Their query language includes only spatial
relations in the pictorial query tree and they reorder the tree according to the
statistics stored for these spatial relations. Our query language has more
complex features, enabling the user to query spatio-temporal relations that
will be described in the next section. In the query optimization module of our
system, fact base statistics are used to reorder spatial relations. In addition
to this, reordering algorithms for other types of nodes, such as internal nodes
that contain operators, are added.
Mahalingam and Candan propose techniques for performing query optimization
in different types of databases, such as multimedia and Web databases, which
rely on top-k predicates [15]. Top-k predicates are the k predicates that return
the most relevant portion of all possible results. They propose an optimization
model that takes into account different binding patterns associated with query
predicates and considers the variations in the query result size, depending on
the execution order. Their optimization model assigns a value (to be minimized)
to all partial or complete plans in the search space. It also determines the
output size of the data stream for every operator and predicate in the plan.
So, the out-
our optimization algorithm. The major difference of their optimization
algorithm from ours is that the number of query results can also change
depending on the query execution order in their work, whereas it is independent
from the query
BilVideo: A Video DBMS
In this chapter, the video database system BilVideo [1, 2, 3], into which the
work in this thesis is integrated, is described. BilVideo is a video database
management system that supports spatio-temporal and semantic queries on video
data. A spatio-temporal query may contain any combination of spatial, temporal,
object-appearance, external-predicate, trajectory-projection and
similarity-based object-trajectory conditions. The system handles
spatio-temporal queries using a knowledge-base, which consists of a fact-base
and a comprehensive set of rules implemented in Prolog, while utilizing an
object-relational database to respond to semantic (keyword, event/activity, and
category-based), color, shape and texture video queries. The organization of
this chapter is as follows: The architecture of BilVideo is given in Section
3.1. The video query language of BilVideo is described in Section 3.2. The
query types are presented in Section 3.3. Query processing issues in BilVideo
are discussed in Section 3.4.
3.1 Video Database System Architecture
Figure 3.1 illustrates the overall architecture of BilVideo. The system is built
on a client-server architecture and the users access the video database on the
Figure 3.1: BilVideo database system architecture.
(Components shown: users, Web client, visual query interface, query processor,
knowledge-base, fact-extractor, video annotator, feature database, raw video
database (file system) and object-relational DBMS.)
The query processor lies at the heart of the system. It is responsible for
answering user queries in a multi-user environment. The query processor
communicates with the object-relational database Oracle and the knowledge-base.
Semantic data is stored in the Oracle database and fact-based metadata is
stored in the knowledge-base. Video data and raw video data are stored
separately. Semantic properties of videos used for keyword, activity/event and
category-based queries on video data are stored in the feature database. These
features are generated and maintained by a video annotator tool. The
knowledge-base is used to answer spatio-temporal queries. The fact-base is
generated by the fact-extractor tool.
The rules used for querying the video data, called query rules, have associated
frame numbers. A second set of rules, called extraction rules, was also created
to work with frame intervals to extract spatio-temporal relations from video
data. Extracted spatio-temporal relations are converted to facts with frame
numbers of the keyframes in the knowledge-base. These facts are used by the
query rules for query processing.
3.2 Video Query Language
The query language has four basic statements for retrieving information:
select video from all [where condition];
select video from videolist where condition;
select segment from range where condition;
select variable from range where condition;
The target of a query is specified in the select clause. A query may return
videos (video), segments of videos (segment), or values of variables (variable)
with/without the segments of videos where the values are obtained. Variables
might be used for object identifiers and trajectories. If the target of a query
is video (video), the users may also specify the maximum number of videos to be
returned as a result. The range of a query is specified in the from clause. The
range may be either the entire video collection or a list of specific videos.
Query conditions are given in the where clause.
Supported operators: The query language supports logical and temporal
operators to be used in query conditions. Logical operators are and, or and
not. Temporal operators are before, meets, overlaps, starts, during, finishes
and their inverse operators. In addition to these, the query language has a
trajectory-projection operator, project, which is used to extract
subtrajectories of video objects on a given spatial condition. The language
also has the operators `=' and `!=', which can be used for assignment and
comparison.
Aggregate functions: The aggregate functions of the query language are
average, sum and count. They take a set of intervals (segments) as input
and return a time value in minutes for each video clip satisfying the given
conditions.
External predicates: The query language has a condition type external,
pred-spatial predicates. If an external predicate is to be used, facts and/or
rules related to the predicate should be added to the knowledge-base.
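The temporal operators listed above carry the names of Allen's interval relations. The sketch below shows one way such operators could be evaluated on frame intervals. It assumes the standard Allen definitions and a hypothetical interval encoding as (first_frame, last_frame) pairs; the thesis excerpt does not spell out BilVideo's exact semantics, so this is an illustration rather than the system's implementation.

```python
from typing import Callable, Dict, Tuple

# Hypothetical encoding: (first_frame, last_frame) with first_frame <= last_frame.
Interval = Tuple[int, int]
Relation = Callable[[Interval, Interval], bool]


def before(a: Interval, b: Interval) -> bool:
    """a ends strictly before b starts."""
    return a[1] < b[0]


def meets(a: Interval, b: Interval) -> bool:
    """a ends exactly where b starts."""
    return a[1] == b[0]


def overlaps(a: Interval, b: Interval) -> bool:
    """a starts first, b starts inside a, and b ends after a."""
    return a[0] < b[0] < a[1] < b[1]


def starts(a: Interval, b: Interval) -> bool:
    """a and b start together, a ends first."""
    return a[0] == b[0] and a[1] < b[1]


def during(a: Interval, b: Interval) -> bool:
    """a lies strictly inside b."""
    return b[0] < a[0] and a[1] < b[1]


def finishes(a: Interval, b: Interval) -> bool:
    """a and b end together, a starts later."""
    return a[1] == b[1] and b[0] < a[0]


def inverse(relation: Relation) -> Relation:
    """Inverse operator: the same relation with its arguments swapped."""
    return lambda a, b: relation(b, a)


BASE: Dict[str, Relation] = {
    "before": before, "meets": meets, "overlaps": overlaps,
    "starts": starts, "during": during, "finishes": finishes,
}
# Inverse operators get an "i" prefix (ibefore, imeets, ...).
OPERATORS: Dict[str, Relation] = dict(BASE)
OPERATORS.update({"i" + name: inverse(rel) for name, rel in BASE.items()})

print(OPERATORS["before"]((1, 5), (8, 12)))
print(OPERATORS["ibefore"]((8, 12), (1, 5)))
```

Keeping the inverse operators as argument-swapped wrappers means only six predicates have to be written and checked by hand.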
3.3 Query Types
The query language supports spatio-temporal, semantic and low-level queries.
The different query types that can be specified by the query language are as
follows:
Object queries: This type of query may be used to retrieve objects, along
with the video segments where the objects appear.
Spatial queries: This type of query may be used to query videos by spatial
properties of objects defined with respect to each other. Supported
spatial properties for objects can be grouped into three main categories:
topological relations that describe neighborhood and incidence in 2D space,
directional relations that describe order in 2D space, and 3D relations that
describe object positions on the z-axis of the three-dimensional space. There
are eight distinct topological relations: equal, cover, covered-by, inside,
touch, disjoint, overlap and contains. Directional relations are west, south,
north, east, northwest, northeast, southwest and southeast. 3D relations are
infrontof, behind, strictlyinfrontof, strictlybehind, touchfrombehind,
touchedfrombehind and samelevel.
Similarity-based object-trajectory queries: This type of query may be used
to query videos to find out the object and/or time interval of an object
having a trajectory in the video to a given direction.
Temporal queries: This type of query is used to specify the order of
occurrence for conditions in time.
Aggregate queries: This type of query may be used to retrieve statistical
data about objects and events in video data. The three aggregate
Low-level queries: This type of query is used to query video data by visual
properties such as color, shape and texture.
Semantic queries: This type of query is used to query video data by
semantic features. In the system, videos are partitioned into semantic units,
which form a hierarchy. This semantic video hierarchy contains three
levels: video, sequence and scene. Videos consist of sequences, and sequences
consist of scenes that need not be consecutive in time. With this
semantic data model, three types of queries will be answered, which are video,
event/activity and object.
3.4 Query Processing
Figure 3.2 illustrates how the query processor communicates with Web clients
and the underlying system components to answer user queries. Figure 3.3 shows
the phases of query processing for spatio-temporal queries. Web clients make
a connection request to the query request handler, which creates a process for
each request, passing a new socket for communication between the process and
the Web client. Then, user queries are sent to the processes created by the
query request handler. If the queries are specified visually, they are
transformed into SQL-like textual query language expressions before being sent
to the server. After receiving the query from the client, each process calls
the query processor with a query string and waits for the query answer. When
the query processor returns, the process communicates the answer to the Web
client issuing the query and exits. The query processor first groups
spatio-temporal, semantic, color, shape and texture query conditions into
proper types of subqueries. Spatio-temporal subqueries are reconstructed as
Prolog-type knowledge-base queries. Semantic, color, shape and texture
subqueries are sent as SQL queries to an object-relational database. The query
processor integrates the intermediate results and returns them to the query
request handler, which communicates the final results to Web clients. The
phases of query processing for spatio-temporal queries
Figure 3.2: Web client - query processor interaction.
(The Web client (Java applet), the query request handler (C++) and the query
processor (C++) exchange user queries and query result sets.)
Figure 3.3: Query processing phases.
(Query recognition phase: the lexer turns the query into tokens and the parser
builds a parse tree. Query decomposition phase: the decomposer builds the query
tree. Query execution phase: the executor produces the result set.)
1. Query recognition: The lexical analyzer partitions a query into tokens,
which are passed to the parser with possible values for further
processing. The parser assigns structure to the resulting pieces and creates
a parse tree to be used as a starting point for query processing. This phase
is called the query recognition phase.
2. Query decomposition: The parse tree generated after the query
recognition phase is traversed in a second phase, which is called the query
decomposition phase, to construct a query tree. The query tree is constructed
from the parse tree by decomposing a query into three basic types of
subqueries: Prolog subqueries (directional, topological, 3D-relation, external
predicate and object-appearance) that can be directly sent to the inference
engine Prolog, trajectory-projection subqueries that are handled by the
trajectory projector, and similarity-based object-trajectory subqueries that
are processed by the trajectory processor. Maximal subqueries are subqueries
that are formed by grouping Prolog type predicates. A query is decomposed
in such a way that the minimum number of subqueries is formed.
3. Query execution: The input for the query execution phase is a query tree.
This query tree is traversed in postorder in the query execution phase,
executing
processed in this phase and final answers to user queries are formed after
Query Optimization
The aim of the query optimization algorithms designed and implemented for
BilVideo is to process more selective subqueries earlier than the others. The
algorithms restructure the initial query tree and construct an optimal query
tree in which the more selective subqueries are executed earlier by the query
processor. The query optimization process is outlined in Figure 4.1.
InternalNodeReorder(querytree);
ReadStatistics();
LeafNodeReorder(querytree);
Figure 4.1: Query optimization process
The query optimization process implemented during query execution has two
basic parts, which are internal node reordering and leaf node reordering. In
addition to these parts, the statistics collected for the video are read from a
file before executing the leaf node reordering algorithm. These statistics are
used to determine the selectivities of relations in the condition part of the
query. The selectivity of a relation is inversely proportional to the number of
facts stored for that relation. The internal node reordering algorithm reorders
the children of internal nodes by placing right children of `AND' nodes which
are more selective than left children to the left of their parents. The leaf
node reordering algorithm deals with the leaf nodes. Every leaf node in the
query tree has a content which stores the subquery to be executed. The leaf
node reordering algorithm restructures these contents. It uses the subquery
trees constructed for each of these contents in the construction of the initial
query tree. This algorithm sorts the relations in the contents of the leaf
nodes which are connected by `AND' operators according to their selectivity.
As a result of these reorderings, more selective operations are executed
earlier than the others.
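The selectivity-based ordering of `AND'-connected relations described above can be sketched as follows. This is a minimal illustration, not the thesis's leaf node reordering algorithm: the function names, the fact counts and the 1 / (1 + count) formula are assumptions chosen only to make selectivity inversely related to the number of stored facts.

```python
from typing import Dict, List


def selectivity(relation: str, fact_counts: Dict[str, int]) -> float:
    """Higher value = more selective (fewer facts stored for the relation).

    The exact formula is an assumption; adding 1 avoids division by zero
    for relations that have no stored facts.
    """
    return 1.0 / (1 + fact_counts.get(relation, 0))


def order_by_selectivity(relations: List[str],
                         fact_counts: Dict[str, int]) -> List[str]:
    """Order AND-connected relations so the most selective one runs first."""
    return sorted(relations,
                  key=lambda r: selectivity(r, fact_counts),
                  reverse=True)


# Hypothetical fact base statistics (number of facts stored per relation).
fact_counts = {"west": 120, "disjoint": 300, "touch": 15}

print(order_by_selectivity(["west", "disjoint", "touch"], fact_counts))
# prints ['touch', 'west', 'disjoint']
```

Because conjunction is commutative, reordering the operands of `AND' changes only the amount of intermediate work, not the final result set.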
This chapter is organized as follows: In Section 4.1, our query tree structure
is explained. In Section 4.2, the internal node reordering algorithm is
described. Finally, the leaf node reordering algorithm is presented in Section
4.3.
4.1 Structure of the Query Tree
In our multimedia database model, a query is represented by a query tree
containing the spatio-temporal relationships between the data that is to be
selected. The condition in the where clause of the query is kept in this query
tree. The condition part can contain spatial relationships. Other functions
that can take place in the condition part are object trajectory and project
type query functions. Trajectory queries find out the object and/or time
interval of an object having a trajectory in the video to a given direction.
Project queries are used to extract sub-trajectories of video objects on a
given spatial condition. The boolean (logical) operators of the query language
are and, or and not. The operators that can be included in a query are
categorized into three types:
1. AND: and
2. NOT-OR: not, or
3. TEMPORAL: before, during, meets, overlaps, starts, finishes, and
their inverse operators, ibefore, iduring, imeets, ioverlaps, istarts,
There are two types of nodes in the query tree: internal nodes that contain the
operators defined above and leaf nodes that contain spatio-temporal subqueries.
These subqueries have three types:
1. Plain Prolog Queries (PPQ): Spatial subqueries processed by Prolog,
2. Trajectory Queries (TRQ): Object-trajectory subqueries, and
3. Project Queries (PRQ): Project subqueries.
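The node categories above suggest a simple binary tree representation. The following sketch is an assumption about how such a tree could be modelled; the class name, field names and the `postorder` helper are hypothetical and not taken from BilVideo's C++ code. It also shows the postorder traversal that the query execution phase is described as using.

```python
from dataclasses import dataclass
from typing import List, Optional

INTERNAL_TYPES = {"AND", "NOT-OR", "TEMPORAL"}   # operator nodes
LEAF_TYPES = {"PPQ", "TRQ", "PRQ"}               # subquery nodes


@dataclass
class QueryNode:
    type: str                           # one of INTERNAL_TYPES or LEAF_TYPES
    content: str                        # operator name or subquery text
    left: Optional["QueryNode"] = None  # a `not' node would use only this child
    right: Optional["QueryNode"] = None

    def is_leaf(self) -> bool:
        return self.type in LEAF_TYPES


def postorder(node: Optional[QueryNode]) -> List[str]:
    """Visit left subtree, right subtree, then the node itself."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.content]


# A small tree: (west(X,Y) or Z=project(...)) and appear(Y)
tree = QueryNode(
    "AND", "and",
    left=QueryNode(
        "NOT-OR", "or",
        left=QueryNode("PPQ", "west(X,Y)"),
        right=QueryNode("PRQ", "Z=project(X, [west(X,a)])"),
    ),
    right=QueryNode("PPQ", "appear(Y)"),
)

print(postorder(tree))
```

In postorder, both children of an operator are evaluated before the operator itself, and the left child is always reached first, which is why the reordering algorithms move more selective subtrees to the left.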
4.2 Internal Node Reordering Algorithm
In the query tree, the internal nodes are reordered first. The internal node
reordering algorithm places the more selective nodes as left children of their
parents, since the left child of a parent is processed first. The proposed
algorithm iterates on the query tree and restructures the tree to get the
optimal internal node structured query tree. The internal node reordering
algorithm is given in Figure 4.2.
The internal node reordering algorithm iterates on the query tree and reorders
the children of `AND' typed nodes such that:
The `AND', `TEMPORAL', `PPQ', `PRQ', `TRQ' type child nodes must
be on left if the other child is `NOT-OR' type. Since `NOT-OR' type nodes
combine results from two different result sets, they are found to be the
least selective compared to the other nodes.
The `AND' type child nodes must be on left if the other child is
`TEMPORAL' type. This is because `AND' type nodes are processed
faster than the `TEMPORAL' type nodes.
The `PPQ' type child nodes with zero global variables must be on left if
the other child is `PRQ' or `TRQ' type. This is because of the fact that
InternalNodeReorder(QueryNode qnode)
  // Nothing to reorder in an empty subtree
  if (qnode == NULL)
    return
  // Process the nodes which have children both on left and right
  if (qnode->Left != NULL and qnode->Right != NULL)
  begin
    type = qnode->Type
    ltype = qnode->Left->Type
    rtype = qnode->Right->Type
    // Reorder the children of `AND' nodes
    if (type == AND)
    begin
      // `AND', `TEMPORAL', `PPQ', `PRQ', `TRQ' type child
      // nodes must be on left if the other child is
      // `NOT-OR' type
      if (ltype == NOT-OR and
          (rtype == AND or rtype == TEMPORAL or rtype == PPQ
           or rtype == PRQ or rtype == TRQ))
        exchange(qnode->Left, qnode->Right)
      // `AND' type child nodes must be on left
      // if the other child is `TEMPORAL' type
      else if (ltype == TEMPORAL and rtype == AND)
        exchange(qnode->Left, qnode->Right)
      // `PPQ' type child nodes with zero global variables
      // must be on left if the other child is
      // `PRQ' or `TRQ' type
      else if ((ltype == PRQ or ltype == TRQ) and
               ((rtype == PPQ) and (gvcount(qnode->Right) == 0)))
        exchange(qnode->Left, qnode->Right)
      // `PRQ', `TRQ' type child nodes must be on left if
      // other child is `PPQ' type with global variables
      else if (((ltype == PPQ) and (gvcount(qnode->Left) > 0))
               and (rtype == PRQ or rtype == TRQ))
        exchange(qnode->Left, qnode->Right)
      // `PRQ' type child nodes must be on left
      // if the other child is `TRQ' type
      else if (ltype == TRQ and rtype == PRQ)
        exchange(qnode->Left, qnode->Right)
      // `TRQ' type child nodes with zero global
      // variables must be on left if the other
      // child is `TRQ' type with global variables
      else if ((ltype == TRQ) and (gvcount(qnode->Left) > 0)
               and (rtype == TRQ) and (gvcount(qnode->Right) == 0))
        exchange(qnode->Left, qnode->Right)
    end
  end
  // Call the function recursively for left and right subtrees
  InternalNodeReorder(qnode->Left)
  InternalNodeReorder(qnode->Right)
Figure 4.2: Internal node reordering algorithm
The `PRQ', `TRQ' type child nodes must be on the left if the other child is
`PPQ' type with global variables. This is because `PRQ'
and `TRQ' type nodes are found to be more selective than `PPQ' type
nodes with global variables.
The `PRQ' type child nodes must be on the left if the other child is `TRQ' type.
This is because the subquery in the `PRQ' node can have a
variable to be used by the subquery contained in the `TRQ' node.
The `TRQ' type child nodes with zero global variables must be on the left if
the other child is `TRQ' type with global variables. This is because
`TRQ' type nodes with zero global variables are more selective than
`TRQ' type nodes with global variables.
The query tree is restructured using the above rules because the nodes that
are placed on the left were found to be more selective in the experiments. The
gvcount function in the algorithm returns the global variable count of a particular
node.
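The reordering rules can be read as a single test that decides whether the two children of an `AND' node should be exchanged. The following Python sketch illustrates this reading only; it is not the BilVideo implementation, which is written in C++. The Node class, its gv field (standing in for the result of gvcount) and the function names must_swap and internal_node_reorder are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

AND, NOT_OR, TEMPORAL = "AND", "NOT-OR", "TEMPORAL"
PPQ, PRQ, TRQ = "PPQ", "PRQ", "TRQ"


@dataclass
class Node:
    type: str
    gv: int = 0                      # hypothetical stand-in for gvcount(node)
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def must_swap(l: Node, r: Node) -> bool:
    """Return True if the right child should be moved to the left."""
    lt, rt = l.type, r.type
    if lt == NOT_OR and rt in (AND, TEMPORAL, PPQ, PRQ, TRQ):
        return True
    if lt == TEMPORAL and rt == AND:
        return True
    if lt in (PRQ, TRQ) and rt == PPQ and r.gv == 0:
        return True
    if lt == PPQ and l.gv > 0 and rt in (PRQ, TRQ):
        return True
    if lt == TRQ and rt == PRQ:
        return True
    if lt == TRQ and rt == TRQ and l.gv > 0 and r.gv == 0:
        return True
    return False


def internal_node_reorder(node: Optional[Node]) -> None:
    """Apply the rules to every `AND' node that has two children."""
    if node is None:
        return
    if node.type == AND and node.left is not None and node.right is not None:
        if must_swap(node.left, node.right):
            node.left, node.right = node.right, node.left
    internal_node_reorder(node.left)
    internal_node_reorder(node.right)
```

For a tree whose root `AND' node has a `TEMPORAL' left child and an `AND' right child, calling internal_node_reorder on the root moves the `AND' child to the left, and the same test is then applied further down the tree.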
4.2.1 Examples
Some query tree examples are given in this part. In each example, the initial
query tree and the query tree after internal node reordering are shown.
Query 1:
select segment, X, Y
from video
where ((west(X,Y) and disjoint(X,Y) and X != Y)
or Z=project(X, [west(X,a)])) and
(west(X,Y) and X=car1 and appear(Y) and south(Y,X))
In the query tree, the children of the root `AND' node are exchanged since
Figure 4.3: (a) Initial query tree for Query 1 and (b) query tree for Query 1 after
internal node reordering
Query 2:
select segment, X, Y
from video
where ((west(X,Y) before disjoint(X,Y)) and
((appear(Y) before touch(X,Y)) and
(X != car1 and Z=project(X, [west(X,a)])))
Figure 4.4: (a) Initial query tree for Query 2 and (b) query tree for Query 2 after
internal node reordering
In the query tree, the children of the root `AND' node are exchanged since the
type of the left child is `TEMPORAL' and the type of the right child is `AND' in
the initial query tree. The children of the `AND' node which is a child of the root
node are also exchanged since the type of the left child is `TEMPORAL' and the
4.3 Leaf Node Reordering Algorithm
After the internal node reordering, the leaf nodes are reordered for each
deepest internal node. Fact base statistics for each video are kept in a separate file.
The number of occurrences of each spatio-temporal relation in the video is stored in this file.
So the numbers of south, northwest, southwest, equal, cover, inside, touch,
disjoint, overlap, infrontof, behind, strictlyinfrontof, strictlybehind, touchfrombehind,
touchedfrombehind and samelevel facts are included in the file. These fact base
statistics are used in the leaf node reordering algorithm. In this algorithm, the facts
in the leaf nodes are sorted starting from the fact with the smallest count in the fact
base statistics file to the fact with the largest count. `PPQ' and `PRQ' type leaf
nodes are reordered according to these statistics. These leaf nodes contain
maximal subqueries that can be directly sent to the inference engine. So subquery
trees for these maximal subqueries must be constructed to reorder leaf nodes.
This construction is implemented within the query tree construction part. As a
result, subquery trees for each maximal subquery in the `PPQ' and `PRQ' type
leaf nodes are built and kept in a list data structure. The leaf node reordering
algorithm is given in Figure 4.5.
This algorithm iterates on the query tree. The steps of the algorithm are as follows:
1. Find the `PPQ' and `PRQ' type leaf nodes.
2. Find the subquery trees of these nodes in the subquery list.
3. Reorder these subquery trees.
4. Get the content of the reordered subqueries.
5. Replace the contents of the leaf nodes with this content.
As can be seen from the algorithm, only the condition parts of the `PRQ' type leaf
nodes are replaced. The functions used in the algorithm are explained in the
LeafNodeReorder(QueryNode qnode,QueryTree qtree)
// Iterate on the tree if node is not null
if(qnode != NULL)
begin
    type=qnode->Type()
    queryid=qnode->getQID(INORDER)
    // locate `PPQ' and `PRQ' leaf nodes
    if (type==PPQ or type==PRQ)
    begin
        // find the subquery tree of
        // the nodes in subquery list
        tmpppq=FindPPQinList(qtree, queryid)
        // reorder the subquery tree
        reorderAlg(tmpppq->ppqnode)
        // get the reordered subquery
        getSubquery(tmpppq->ppqnode)
        // set the content of the node
        if (type==PPQ)
            set content of qnode as subquery
        else if (type==PRQ)
            set content of the condition part of
            qnode as subquery
    end
end
// call the function recursively for left and right subtrees
if(qnode->Left != NULL)
    LeafNodeReorder(qnode->Left,qtree)
if(qnode->Right != NULL)
    LeafNodeReorder(qnode->Right,qtree)
Figure 4.5: Leaf node reordering algorithm
FindPPQinList(QueryTree qtree, int queryid)
// locate the subquery tree of the leaf node with
// id=queryid in the subquery list tmpppq
tmpppq=qtree->headppq
for (int i=1; i<qtree->ppqcount; i++)
    if (queryid != tmpppq->queryid)
        tmpppq=tmpppq->nextppq
// the caller uses the located list element
return tmpppq
The FindPPQinList function is used for locating the subquery tree of a particular
leaf node in the subquery list (see Figure 4.6).
The reorderAlg function iterates on the subquery tree which is located in
the subquery tree list and restructures this query tree (see Figure 4.7). This
algorithm first locates the highest `AND' type node in the subquery tree. If this
node has left and right children and the left child is `NOT-OR' typed and the
right one is `AND' typed, it exchanges the left and right nodes. If the children are
`PPQ' or `AND' typed and there is no `NOT-OR' type node below these children,
this subtree is called a maximal AND subtree and it is reordered according to the fact
base statistics. If the children are `PPQ' or `AND' typed and there is at least one
`NOT-OR' type node below these children, the algorithm finds out whether the right
child is a maximal AND subtree or not. If it is a maximal AND subtree, then it
exchanges the child with the left child. If the algorithm locates a maximal AND
subtree, it does not recurse because it has already reordered all the nodes in the
subtree; otherwise it recurses.
The ThereIsNoOrNot function returns 0 if there is a `NOT-OR' type node in a
tree and returns 1 if all the nodes are `AND' typed (see Figure 4.8).
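The test for a maximal AND subtree described above can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the Node class and both function names are hypothetical, node types are plain strings, and Boolean values replace the 0/1 return codes of the pseudocode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    type: str                        # "AND", "NOT-OR" or "PPQ" in this sketch
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def there_is_no_or_not(root: Node) -> bool:
    """Return True if no node below root is of `NOT-OR' type."""
    for child in (root.left, root.right):
        if child is None:
            continue
        if child.type == "NOT-OR" or not there_is_no_or_not(child):
            return False
    return True


def is_maximal_and_subtree(node: Node) -> bool:
    """An `AND' node whose children are `PPQ' or `AND' typed and
    which has no `NOT-OR' node anywhere below it."""
    if node.type != "AND" or node.left is None or node.right is None:
        return False
    if node.left.type not in ("AND", "PPQ"):
        return False
    if node.right.type not in ("AND", "PPQ"):
        return False
    return there_is_no_or_not(node)
```

A subtree that passes is_maximal_and_subtree can be handed to the leaf ordering step as a whole, which is why the algorithm does not recurse into it.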
The OrderLeafNodes function orders a maximal AND subtree. It first puts the leaf
nodes into an array structure, sorts the array according to the fact base statistics
and puts the leaf nodes back into the tree (see Figure 4.9).
The GetLeafNodes function gets the leaf nodes of a tree and puts the contents and
global variable counts of the nodes into an array structure to be used in the sorting
procedure (see Figure 4.10).
The SortLeafNodes function sorts the leaf nodes according to the fact base
statistics. It orders the relations by increasing number of facts (see Figure 4.11).
The getnum function gets the statistics of a particular relation from the statistics
file of the video. After sorting the relations according to the statistics, the
function puts the relations that query an inequality between any two objects in the
video at the end of the order.
reorderAlg(QueryNode qnode)
// Iterate on the subquery tree located
// in the subquery tree list
norecurse=0
if(qnode != NULL)
begin
    type=qnode->Type
    // locate the highest `AND' node on the subquery tree
    if (type==AND)
        if(qnode->Left != NULL and qnode->Right != NULL)
        begin
            ltype=qnode->Left->Type
            rtype=qnode->Right->Type
            // exchange left and right children
            // if the left child is `NOT-OR' type
            // and the right child is `AND' type
            if (ltype==NOT-OR and rtype==AND)
                exchange(qnode->Left, qnode->Right)
            // If children are `PPQ' and `AND' typed and
            // there is no `NOT-OR' type node below these
            // children order the leaf nodes of this subtree
            // else if there is no `NOT-OR' type node in the
            // right subtree put this subtree to left
            else if ( (ltype==AND and rtype==AND)
                      or (ltype==AND and rtype==PPQ)
                      or (ltype==PPQ and rtype==AND)
                      or (ltype==PPQ and rtype==PPQ) )
                if (ThereIsNoOrNot(qnode)==1)
                begin
                    OrderLeafNodes(qnode)
                    norecurse=1
                end
                else if (ThereIsNoOrNot(qnode->Right)==1)
                    exchange(qnode->Left, qnode->Right)
        end
    // call the function recursively for left and right
    // subtrees if a maximal `AND' subtree is not located
    if (norecurse != 1)
    begin
        reorderAlg(qnode->Left)
        reorderAlg(qnode->Right)
    end
end
ThereIsNoOrNot(QueryNode root)
// return 0 if there is at least one `NOT-OR'
// type node in the tree return 1 otherwise
if(root->Left != NULL)
begin
    if (root->Left->Type==NOT-OR)
        return 0
    if (ThereIsNoOrNot(root->Left)==0)
        return 0
end
if(root->Right != NULL)
begin
    if (root->Right->Type==NOT-OR)
        return 0
    if (ThereIsNoOrNot(root->Right)==0)
        return 0
end
return 1
Figure 4.8: The function that finds if there is a `NOT-OR' type node in a tree
OrderLeafNodes(QueryNode qnode)
// get the leaf nodes of the maximal AND subtree
// sort the leaf nodes according to the fact base statistics
// put the leaf nodes back to the tree
leafcounter=0
GetLeafNodes(qnode,nodesarr)
SortLeafNodes(nodesarr)
leafcounter=0
PutLeafNodes(qnode,nodesarr)
GetLeafNodes(QueryNode qnode,nodedata nodesarr[])
// get the leaf nodes of the tree and put their contents
// and global variable counts to the array nodesarr
if(qnode->Left != NULL)
    if (qnode->Left->Type==PPQ)
    begin
        nodesarr[leafcounter].ncontent=qnode->Left->Content
        nodesarr[leafcounter].ppqflag=gvcount(qnode->Left)
        leafcounter++
    end
if(qnode->Right != NULL)
    if (qnode->Right->Type==PPQ)
    begin
        nodesarr[leafcounter].ncontent=qnode->Right->Content
        nodesarr[leafcounter].ppqflag=gvcount(qnode->Right)
        leafcounter++
    end
// call the function recursively for left and right subtrees
if(qnode->Left != NULL)
    GetLeafNodes(qnode->Left, nodesarr)
if(qnode->Right != NULL)
    GetLeafNodes(qnode->Right, nodesarr)
SortLeafNodes(nodedata nodesarr[])
// sort the leaf nodes according to the fact base
// statistics
for (i=1; i<leafcounter; i++)
begin
    for (j=i; j>0 and getnum(nodesarr[j])
              <getnum(nodesarr[j-1]); j--)
        exchange(nodesarr[j],nodesarr[j-1])
end
// put the relations that query an inequality
// between any two objects in the video
// to the end of the order
for (i=0; i<leafcounter; i++)
    if ((nodesarr[i].ncontent.find("!=")) and
        (nodesarr[i].ppqflag>1))
    begin
        shift nodesarr left starting from i+1 to j
        put nodesarr[i] to the end of the array nodesarr
    end
Figure 4.11: The function that sorts leaf nodes
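The sorting step can be illustrated with a short Python sketch that uses the counts reported in Table 5.1. It is a simplified illustration, not the BilVideo code: the names getnum and sort_leaf_nodes mirror the pseudocode, but the tuple representation of a leaf, the global variable counts in the usage example, and the assumption that conditions without fact base statistics (such as X=car1) count as 0 are all hypothetical.

```python
from typing import Dict, List, Tuple

# Counts of some relations, taken from Table 5.1.
FACT_COUNTS: Dict[str, int] = {
    "south": 206, "samelevel": 487, "west": 1055,
    "overlap": 1235, "disjoint": 1682, "appear": 10234,
}

Leaf = Tuple[str, int]  # (condition text, global variable count)


def getnum(condition: str, counts: Dict[str, int]) -> int:
    """Number of facts for the relation named in the condition."""
    name = condition.split("(", 1)[0].strip()
    # Assumption: conditions with no statistics (X=car1, X != Y) count as 0.
    return counts.get(name, 0)


def sort_leaf_nodes(leaves: List[Leaf], counts: Dict[str, int]) -> List[Leaf]:
    """Sort by increasing fact count, then move inequalities to the end."""
    ordered = sorted(leaves, key=lambda leaf: getnum(leaf[0], counts))

    def is_inequality(leaf: Leaf) -> bool:
        return "!=" in leaf[0] and leaf[1] > 1

    head = [leaf for leaf in ordered if not is_inequality(leaf)]
    tail = [leaf for leaf in ordered if is_inequality(leaf)]
    return head + tail


leaves = [("disjoint(X,Y)", 2), ("X != Y", 2), ("west(X,Y)", 2),
          ("appear(Y)", 1), ("south(Y,X)", 2), ("X=car1", 1)]
print(" and ".join(cond for cond, _ in sort_leaf_nodes(leaves, FACT_COUNTS)))
```

Python's sorted is stable, like the insertion sort in the pseudocode, so conditions with equal counts keep their original relative order.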
The PutLeafNodes function puts the elements of an array structure into the leaf
nodes of a tree. So, the nodes of the unsorted tree are replaced with the sorted
nodes (see Figure 4.12).
4.3.1 Examples
Some query examples are given in this part. The initial queries and the queries
after leaf node reordering according to the fact base statistics are shown. The
relations in the query examples are reordered assuming that (south facts < samelevel
facts < west facts < overlap facts < disjoint facts < appear facts) in the fact
base.
Query 1:
select segment, X, Y
PutLeafNodes(QueryNode qnode,nodedata nodesarr[])
// put the elements of the array nodesarr to the
// leaf nodes of the tree with the root qnode
if(qnode->Left != NULL)
begin
    if (qnode->Left->Type==PPQ)
    begin
        qnode->Left->setContent(nodesarr[leafcounter].ncontent)
        leafcounter++
    end
    PutLeafNodes(qnode->Left,nodesarr)
end
if(qnode->Right != NULL)
begin
    if (qnode->Right->Type==PPQ)
    begin
        qnode->Right->setContent(nodesarr[leafcounter].ncontent)
        leafcounter++
    end
    PutLeafNodes(qnode->Right,nodesarr)
end
where (samelevel(X,Y) and appear(X) and overlap(X,Y))
or (appear(X) and west(X, Y) and disjoint(X,Y))
Query 1 after leaf node reordering:
select segment, X, Y
from video
where (samelevel(X,Y) and overlap(X,Y)
and appear(X)) or (west(X,Y) and
disjoint(X,Y) and appear(X))
The initial subquery tree for Query 1 and the subquery tree for Query 1 after leaf node
reordering, which are located in the subquery tree list, are shown in Figure 4.13.
Figure 4.13: (a) Initial subquery tree for Query 1 and (b) subquery tree for
Query 1 after leaf node reordering
The relations in Query 2 are reordered in the second query, since samelevel
facts < overlap facts < appear facts and west facts < disjoint facts < appear
facts.
Query 2:
select segment, X, Y
from video
where disjoint(X,Y) and X != Y and west(X,Y)
Query 2 after leaf node reordering:
select segment, X, Y
from video
where X=car1 and south(Y,X) and west(X,Y)
and disjoint(X,Y) and appear(Y) and X != Y
The relations in Query 1 are reordered as can be seen from the second
query, since south facts < west facts < disjoint facts < appear facts. The equality
relations are executed at the beginning of the condition part and the inequality
Performance Results
In this chapter, the performance results obtained for the query optimization
algorithm are presented. Performance tests have been conducted on an example
video that is extracted from television news. The tests have been carried
out in a Linux environment using the query processor of BilVideo implemented in
C++. The performance parameters that affect query optimization are as follows:
1. Size of the query: As the size of the query increases, the
performance gain obtained by our strategy also increases. For small queries,
there will be a small number of reorderings between the nodes, so the
performance gain will be less than that with large queries.
2. Size of the video: The size of the video is another parameter affecting query
optimization since the size of the fact base is directly related to the size
of the video. The performance gain will increase if the size of the video
increases.
The organization of this chapter is as follows: In Section 5.1, statistics of the
fact base used are provided. Example facts from this fact base can be found in
Appendix A. The performance test results are presented in Section 5.2. Example
5.1 Fact Base Statistics
The fact base of the example video is created using the fact extractor tool of
BilVideo. The statistics of the video are given in Table 5.1. These statistics are
used in the optimization algorithm to reorder the leaf nodes.
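As an illustration of how such per-relation counts could be derived from a fact base, the following Python sketch counts facts by relation name. The thesis does not describe the format of the statistics file or how the fact extractor computes it, so this is an assumption-based sketch: it assumes one Prolog fact per line in the style of Appendix A, and the function name count_relations is hypothetical.

```python
import re
from collections import Counter
from typing import Iterable

# A fact line looks like: west(tank1,car1,259).
FACT_RE = re.compile(r"^\s*([a-z]\w*)\(.*\)\.\s*$")


def count_relations(lines: Iterable[str]) -> Counter:
    """Count how many facts exist for each relation name."""
    counts: Counter = Counter()
    for line in lines:
        match = FACT_RE.match(line)
        if match:                    # comment lines and blanks are skipped
            counts[match.group(1)] += 1
    return counts


sample = [
    "// Directional Relations",
    "west(tank1,car1,259).",
    "west(car1,car2,259).",
    "south(palestinianofficer4,powell,527).",
    "disjoint(car2,car5,303).",
]
print(count_relations(sample))
```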
Table 5.1: The statistics of the fact base
Type of relation     Number
west                 1055
east                 1055
south                206
northwest            0
southwest            0
disjoint             1682
overlap              1235
inside               0
appear               10234
touch                9
touchfrombehind      37
strictlyinfrontof    184
infrontof            276
samelevel            487
5.2 Performance Results
Five query sets are used in the performance tests. The first query set is used for
testing the Leaf Node Reordering algorithm. The second set is used for testing
the whole optimization algorithm. The third and fourth sets are constructed for
testing the algorithm on different reorderings of the same query. Finally, the fifth
set is used for testing the same query on different sizes of fact bases and result
sets. The query sets can be found in Appendix B. Optimization overhead given in
gain is defined in Formula 5.1. The first set of results is given in Table 5.2.
\[
\text{performance gain} = \frac{\text{proc. time w/o opt.} - \text{proc. time with opt.}}{\text{proc. time w/o opt.}} \qquad (5.1)
\]
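As a worked check of Formula 5.1, the short Python sketch below recomputes the performance gain for three rows of Table 5.2 from their processing times. The function name performance_gain is hypothetical, and the optimization overhead is not part of the formula.

```python
def performance_gain(time_without_opt: float, time_with_opt: float) -> float:
    """Formula 5.1: relative reduction in processing time."""
    return (time_without_opt - time_with_opt) / time_without_opt


# (query, time without opt., time with opt.) in msecs, from Table 5.2
rows = [(1, 310, 263), (7, 2027, 259), (11, 225, 214)]
for query, without_opt, with_opt in rows:
    print(query, round(performance_gain(without_opt, with_opt), 2))
```

For example, query 7 gives (2027 - 259) / 2027, which rounds to the 0.87 reported in the table.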
Table 5.2: Leaf node reorder algorithm test results (msecs)
query   time without opt.   time with opt.   optimization overhead   performance gain
1       310                 263              1.0                     0.15
2       1002                609              1.0                     0.39
3       512                 264              1.0                     0.48
4       490                 291              1.0                     0.41
5       508                 217              1.0                     0.57
6       423                 261              1.0                     0.38
7       2027                259              1.0                     0.87
8       752                 708              2.0                     0.06
9       303                 258              1.0                     0.15
10      2030                1603             3.0                     0.21
11      225                 214              1.0                     0.05
12      270                 215              1.0                     0.20
These results show that the leaf node reordering algorithm increases the
performance of the query processor. The performance gain differs for each
query in the set. This is because the performance gain depends on the size of the
query and the difference between the initial query tree and the optimal query tree.
The sizes of the first, ninth, eleventh and twelfth queries are small, so their
performance gains are at most 0.21. If the size of the query is small, the performance
gain is also small compared to the larger queries.
The leaf node reordering algorithm reduces the processing cost because the
relations in the leaf nodes are ordered starting from the relation with the smallest
output to the relations with larger outputs. So the unbound variables in
the nodes are first bound with smaller sets of values and relations with constant
parameters are executed earlier. This results in an increase in performance. The
Table 5.3: Query optimization algorithm test results (msecs)
query   time without opt.   time with opt.   optimization overhead   performance gain
1       690                 212              1.0                     0.69
2       958                 530              2.0                     0.45
3       532                 270              1.0                     0.49
4       327                 267              1.0                     0.18
5       644                 283              2.0                     0.56
6       639                 344              1.0                     0.46
7       545                 337              1.0                     0.38
8       274                 214              1.0                     0.22
9       261                 211              1.0                     0.19
10      985                 286              1.0                     0.71
11      302                 213              2.0                     0.29
12      845                 283              2.0                     0.67
These results show that the overall query optimization algorithm increases the
query processing performance. The factors that affect the results obtained with
the leaf node reordering algorithm discussed above also affect those with the
whole optimization algorithm.
The query optimization algorithm reduces the processing cost because the
subqueries with larger selectivities are processed before the subqueries with
smaller selectivities. For example, if the children of an `and' node are `or' and `and'
type internal nodes, the `and' type child is processed before the other, which
results in a considerable gain in performance.
As mentioned previously, the performance gain depends on the size and
complexity of the query. Another factor affecting the performance is the difference
between the initial query tree and the optimal query tree. The third and fourth
performance tests are done using different reorderings of the same query. The
query tree converges to the optimal query tree starting from the first query. The
third result set that uses a simple Prolog query is given in Table 5.4. The fourth
Table 5.4: Convergence to the optimal query tree; first test results (msecs)
query   time without opt.   time with opt.   optimization overhead   performance gain
1       1327                256              2.0                     0.81
2       341                 256              2.0                     0.25
3       305                 255              1.0                     0.16
4       253                 253              1.0                     0.00
Table 5.5: Convergence to the optimal query tree; second test results (msecs)
query   time without opt.   time with opt.   optimization overhead   performance gain
1       1306                218              2.0                     0.83
2       1213                220              1.0                     0.82
3       663                 218              2.0                     0.67
4       647                 219              3.0                     0.66
5       563                 220              2.0                     0.61
6       345                 222              2.0                     0.36
7       324                 219              2.0                     0.32
8       219                 219              2.0                     0.00
These two result sets show that when the query tree converges to the optimal
query tree, the performance gain of the optimization algorithm decreases. This
also justifies the correctness of the optimization algorithm.
The last performance test is done for investigating the effect of the query
result set size on performance gain. A query is selected and its result set size is
decreased by decreasing the fact base size at each step. The results of this test
are given in Table 5.6.
As can be seen from the performance results, when the size of the query result
set decreases, the performance gain of the query does not change much, and it is
Table 5.6: Query result set size parameter test results (msecs)
size of result set   time without opt.   time with opt.   performance gain
133                  2533                786              0.69
120                  2259                713              0.68
105                  2067                665              0.68
94                   2013                632              0.69
85                   1960                616              0.69
74                   1673                538              0.68
65                   1399                449              0.68
45                   1275                379              0.70
34                   1209                353              0.71
27                   830                 281              0.66
20                   688                 251              0.64
11                   669                 231              0.65
2                    650                 208              0.68
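The observation that the gain stays nearly constant can be checked directly from the processing times in Table 5.6. The Python sketch below recomputes each gain with Formula 5.1 and reports the range; the variable and function names are hypothetical.

```python
def performance_gain(time_without_opt: float, time_with_opt: float) -> float:
    """Formula 5.1: relative reduction in processing time."""
    return (time_without_opt - time_with_opt) / time_without_opt


# (size of result set, time without opt., time with opt.) from Table 5.6
ROWS = [
    (133, 2533, 786), (120, 2259, 713), (105, 2067, 665), (94, 2013, 632),
    (85, 1960, 616), (74, 1673, 538), (65, 1399, 449), (45, 1275, 379),
    (34, 1209, 353), (27, 830, 281), (20, 688, 251), (11, 669, 231),
    (2, 650, 208),
]

gains = [round(performance_gain(t0, t1), 2) for _, t0, t1 in ROWS]
print("lowest gain:", min(gains), "highest gain:", max(gains))
```

While the result set shrinks from 133 to 2 tuples and the unoptimized processing time falls from 2533 to 650 msecs, the recomputed gains stay between 0.64 and 0.71.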
The performance test results show that the query optimization method
implemented for BilVideo improves the performance of the query processor. Since
the performance gain is observed to decrease when the query tree converges to
the optimal query tree, it can be said that the reordering heuristics used by the
algorithm are correct. In conclusion, it is shown that processing the more selective
subqueries contained in the internal nodes and leaf nodes of the query tree earlier
than the others is very useful in optimizing query processing times in multimedia
database systems.
5.3 Examples
Some queries selected from the set of queries used in the performance tests are
discussed in this part. The initial query trees and the query trees after optimization
Query 1:
select segment, X, Y
from video
where (west(X,Y) and disjoint(X,Y) and X != car1
or Z = project(X,[west(X, car1)])) and (west(X,Y)
and T = project(X,[west(X, car1)]))
Figure 5.1: (a) Initial query tree for Query 1 and (b) query tree for Query 1 after
optimization
The initial query tree of Query 1 (Figure 5.1 (a)) is processed in 985
milliseconds and the optimized query tree (Figure 5.1 (b)) is processed in 286
milliseconds. So, the performance gain is 71%.
Query 2:
select segment, X, Y
from video
where (samelevel(X,Y) before disjoint(X,Y)) and
(infrontof(X,Y) and X != car1 and tr(X, [[west], [1]]))
The initial query tree of Query 2 (Figure 5.2 (a)) is processed in 845
milliseconds and the optimized query tree (Figure 5.2 (b)) is processed in 283
milliseconds.
Figure 5.2: (a) Initial query tree for Query 2 and (b) query tree for Query 2 after
optimization
Conclusions and Future Work
Query processing is essential for retrieving data from database management
systems and has been explored in the last 30 years in the context of relational and
object-oriented database management systems. Query optimization constitutes
an important part of query processing, and it is a promising research area since
the amount of data that can be managed by database systems is growing rapidly
and new data types are becoming widely used. Also, new database management
systems such as multimedia databases require new techniques for query processing
and query optimization.
In this thesis, we have presented a query optimization method for video
database systems, which was implemented on a particular system, BilVideo. The
proposed optimization method has two parts: internal node reordering and leaf
node reordering. The children of the internal nodes of the query tree of a given
query are reordered using the internal node reordering algorithm, which places
more selective children to the left of their parents. The contents of the prolog
and project type leaf nodes are reordered using the leaf node reordering
algorithm, which makes use of statistical information to sort the relations forming
the contents of the leaf nodes. Therefore, our optimization method reorders the
query tree along two dimensions, which results in a considerable improvement in
performance. The performance tests conducted on the query processor justify
reordering and leaf node reordering.
Currently, the proposed optimization algorithms are used by a query
processor which uses linear processing methods. As future work, the algorithms can be adapted to
a parallel query processor, which can result in even better
performance. Another direction for future work is the use of genetic algorithms in the query
optimization of BilVideo, as they are becoming a widely used and accepted method
for new and difficult optimization problems. This method must propose a fitness
value function for the query trees in the solution space and adapt cross-over and
[1] M. E.Donderler,
O.Ulusoy,U. Gudukbay, A Rule-based Approach to
Rep-resent Spatio-Temporal Relations in Video Data, International Conference
onAdvances inInformation Systems (ADVIS'2000), Izmir, Turkey, Lecture
Notes in Computer Science (Springer Verlag), eds. T. Yakhno, vol. 1909,
pp. 409-418,October2000.
[2] M. E. Donderler,
O. Ulusoy, U. Gudukbay, A Rule-Based Video Database
SystemArchitecture, InformationSciences,vol.143, no.1-4,pp.13-45,2002.
[3] M. E. Donderler,
O. Ulusoy, U. Gudukbay, Rule-Based Spatio-Temporal
Query Processing for Video Databases (Submittedto the VLDB journal).
[4] N.S. Chang, K.S. Fu. Query by pictorial example. IEEE Transactions on
Software Engineering,SE6, vol. 6,pp. 519-524, 1980.
[5] M. Flickner, H. Sawhney, W. Niblack, J. Ashley, Q. Huang, B. Dom, M.
Gorkani, J. Hafner, D. Lee, D. Petkovic, D. Steele, and P. Yanker. Query
by image and video content: The QBIC System. IEEE Computer, vol. 28,
pp. 23-32, 1995.
[6] S. Chang,W. Chen, H. J. Meng, H. Sundaram, and D. Zhong. VideoQ:An
automated content-based video search system using visualcues. In Proc. of
ACM Multimedia, pp. 313-324,Seattle,Washington, USA, 1997.
[7] W.W.Chu,A.F.Cardenas,andR.K.Taira.Aknowledge-basedmultimedia
medicaldistributeddatabase system-KMED.InformationSystems,vol.20,
[8] E.OomotoandK.Tanaka.SemanticQueryOptimizationforTreeandChain
Queries. IEEE Transactions on Knowledge and Data Engineeering, vol. 6,
no. 1, February1994.
[9] W. Sun and C. T. Yu. OVID: Designand implementation of a videoobject
database system. IEEE Transactions on Knowledgeand Data Engineeering,
vol.5, pp. 629-643,1993.
[10] M.Jarke andJ.Koch.Query optimizationindatabasesystems. ACM
Com-puting Surveys, vol.16,no. 2,pp. 111-152, June 1984.
[11] G.Graefe.Queryevaluationtechniquesforlargedatabases.ACMComputing
Surveys, vol.25, no. 2,pp. 73-170,June 1993.
[12] G. Menekse, F. Polat, A. Cosar. Alternative Plan Generation Methods for
Multiple Query Optimization, ISCIS'98, Advances in Computer and
Infor-mation Sciences'98,eds. U. Gudukbay etal., pp. 246-253,1998.
[13] S. Chaudhuri. An Overview of Query Optimization in Relational Systems,
In Proc. of Principles of Database Systems'98, pp. 34-43, 1998.
[14] A.Soer,H.Samet.QueryProcesssingandOptimizationforPictorialQuery
Trees,Visual InformationandInformationSystems- VISUAL99(D.P.
Hui-jsmans and A.W. M.Smeulders, Eds.), LectureNotes inComputer Science
1614, Springer, Berlin,1999, pp. 60-67.
[15] L. P. Mahalingam, K. S. Candan. Query Optimization in the Presence of
Top-k Predicates, Multimedia Information Systems 2001,pp. 31-40.
[16] R. H. Guting, M. H. Bohlen, M. Ervig, C. S. Jensen, N. A. Lorentzos, M.
Schneider,M.Vazirgiannis.Afoundationforrepresentingandquerying
mov-ingobjects, ACM TransactionsonDatabaseSystems,vol.25,no.1,pp.1-42,
2000.
[17] J. Z. Li, M. T.
Ozsu, D. Szafron. Modeling of moving objects in a video
database, In Proc. of IEEE Multimedia Computing and Systems, pp.
[18] M. Nabil,A.H. Ngu, J.Shepherd. Modeligand retrievalof movingobjects,
Multimedia Tools and Applications, vol.13,pp. 35-71,2001.
[19] A. P. Sistla, O. Wolfson, S. Chamberlain, S. Dao. Modeling and querying
Sample Fact Base for an Example
Video
This is an example video containing 16,351 frames and 98 salient objects. Some
salient objects in the video are tank1, car1, bodyguard1 and powell. The video
is extracted from television news.
// Directional Relations
west(tank1,car1,259).
west(car1,car2,259).
west(tank1,car1,272).
west(car1,car2,272).
west(car2,car3,272).
west(tank1,car1,277).
west(car1,car2,277).
west(car2,car3,277).
west(car3,car4,277).
west(tank1,car1,280).
west(car1,car2,280).
west(car2,car3,280).
west(car3,car4,280).
west(car4,car5,280).
west(tank1,car1,282).
west(car1,car2,282).
west(car2,car3,282).
west(car3,car4,282).
west(car4,car5,282).
west(car1,car2,287).
west(car2,car3,287).
west(car3,car4,287).
west(car4,car5,287).
west(car1,car2,298).
west(car2,car3,298).
west(car3,car4,298).
west(car4,car5,298).
west(car1,car2,303).
west(car2,car3,303).
west(car3,car4,303).
west(car4,car5,303).
west(tank3,tank4,366).
west(tank3,tank4,386).
west(tank3,tank4,395).
west(tank3,tank4,408).
west(tank5,israelisoldier1,409).
west(tank5,israelisoldier1,435).
west(tank5,israelisoldier1,456).
west(palestinianofficer4,bodyguard2,484).
south(palestinianofficer1,officialcar,527).
south(palestinianofficer4,powell,527).
south(palestinianofficer3,officialcar,531).
south(palestinianofficer1,officialcar,531).
south(palestinianofficer4,powell,531).
south(palestinianofficer1,officialcar,535).
south(palestinianofficer4,powell,535).
south(palestinianofficer1,officialcar,542).
south(palestinianofficer4,powell,542).
south(palestinianofficer3,bodyguard1,542).
south(palestinianofficer1,officialcar,547).
south(palestinianofficer4,powell,547).
south(palestinianofficer3,bodyguard1,547).
south(palestinianofficer1,officialcar,550).
south(palestinianofficer4,powell,550).
south(palestinianofficer3,bodyguard1,550).
south(palestinianofficer4,officialcar,560).
south(palestinianofficer3,bodyguard1,560).
south(palestinianofficer4,palestinianofficer1,568).
south(palestinianofficer4,officialcar,568).
south(palestinianofficer3,bodyguard1,568).
south(palestinianofficer4,officialcar,572).
south(palestinianofficer3,bodyguard1,572).
south(palestinianofficer4,officialcar,578).
south(palestinianofficer3,bodyguard1,578).
south(palestinianofficer4,officialcar,579).
// Topological Relations
disjoint(car2,car5,303).
disjoint(car1,car5,303).
disjoint(tank5,israelisoldier1,409).
disjoint(tank5,israelisoldier1,435).
disjoint(tank5,israelisoldier1,456).
disjoint(palestinianofficer2,bodyguard1,484).
disjoint(palestinianofficer3,powell,484).
disjoint(powell,bodyguard2,484).
disjoint(palestinianofficer2,palestinianofficer4,484).
disjoint(palestinianofficer4,bodyguard2,484).
disjoint(palestinianofficer2,palestinianofficer1,484).
disjoint(bodyguard1,powell,484).
disjoint(bodyguard1,bodyguard2,484).
disjoint(palestinianofficer2,bodyguard2,484).
disjoint(palestinianofficer2,powell,484).
disjoint(palestinianofficer1,bodyguard2,484).
disjoint(palestinianofficer3,bodyguard2,484).
disjoint(palestinianofficer2,bodyguard1,492).
disjoint(palestinianofficer3,powell,492).
disjoint(palestinianofficer3,palestinianofficer4,492).
disjoint(bodyguard1,palestinianofficer4,492).
disjoint(powell,bodyguard2,492).
disjoint(palestinianofficer2,palestinianofficer4,492).
disjoint(palestinianofficer4,bodyguard2,492).
disjoint(palestinianofficer2,palestinianofficer1,492).
disjoint(bodyguard1,powell,492).
disjoint(bodyguard1,bodyguard2,492).
disjoint(palestinianofficer2,bodyguard2,492).
disjoint(palestinianofficer2,powell,492).
disjoint(palestinianofficer1,bodyguard2,492).
disjoint(palestinianofficer3,bodyguard2,492).
disjoint(palestinianofficer3,powell,498).
disjoint(palestinianofficer3,palestinianofficer4,498).
disjoint(bodyguard1,palestinianofficer4,498).
disjoint(powell,bodyguard2,498).
disjoint(palestinianofficer2,palestinianofficer4,498).
overlap(palestinianofficer4,powell,503).
overlap(palestinianofficer1,powell,503).
overlap(palestinianofficer4,palestinianofficer1,503).
overlap(palestinianofficer4,officialcar,503).
overlap(palestinianofficer3,palestinianofficer1,503).
overlap(palestinianofficer3,bodyguard1,503).
overlap(palestinianofficer3,officialcar,503).
overlap(palestinianofficer2,officialcar,503).
overlap(palestinianofficer1,officialcar,503).
overlap(officialcar,powell,503).
overlap(bodyguard1,officialcar,503).
overlap(palestinianofficer1,bodyguard1,503).
overlap(bodyguard2,officialcar,503).
overlap(palestinianofficer2,bodyguard1,512).
overlap(palestinianofficer3,powell,512).
overlap(palestinianofficer4,powell,512).