Simulation technology - the role of concepts and algorithms

(1)

Selcuk Journal of

Applied Mathematics

Sel¸cuk J. Appl. Math. Vol. 4, No. 2, pp. 33–52, 2003

Simulation technology – the role of concepts

and algorithms

Hans-Joachim Bungartz and Ralf-Peter Mundani

Institute of Parallel and Distributed Systems, Universit¨at Stuttgart Universit¨atsstr. 38, D - 70569 Stuttgart, Germany

e-mail:{bungartz|mundanrf}@ipvs.uni-stuttgart.de

Received: November 25, 2003

Summary. Since modern numerical simulation is a transdiscipli-nary field of research, the question often arises up to which extent this fact should be reflected by the content of single courses as well as by the structure of the whole respective study programs. Here, an issue that is disputed especially hotly is how and where to keep the balance between an application-oriented education on the one hand, trying to start from real-life problems, and a more concept-based one on the other hand, where the focus is on deriving and depicting general principles of algorithm development that may be fruitful for completely different problem classes.

In this paper, we give two examples of such a general algorithmic framework – octrees as an eﬃcient interface of geometric modelling and numerical simulation, and sparse grids as an a priori optimised grid setting for tasks such as numerical quadrature or the numeri-cal solution of PDE. Both are based upon a hierarchinumeri-cal approach, which turns out to be an advantageous strategy for a large variety of problems.

Key words: eﬃcient numerical algorithms, octrees, sparse grids, teaching computational science and engineering

2000 Mathematics Subject Classiﬁcation: 65D17, 65Y20, 68U07, 65N30,

(2)

1. Introduction

When we look at the development of scientific computing or compu-tational science and engineering as a discipline of its own, it is obvious that there have always been two driving forces of progress: concrete application scenarios, such as the so-called grand challenges, and gen-eral algorithmic concepts. The history of finite element methods pro-vides a nice example for that. On the one hand, civil and, especially, structural engineering needed a way to obtain reliable computations of large buildings (actually, the roof construction of the Olympic sta-dium in München was one of the early buildings with a construction designed by (nonlinear) computations). On the other hand, the mo-tivation of the mathematical community was based on topics such as a sound mathematical theory, error estimators, or adaptive mesh refinement – essential for, but independent of specific applications. This example shows that we need them both for successful research and education in scientific computing: solutions for concrete applica-tion scenarios and general concepts or simulaapplica-tion technology, where, typically, mathematics and computer science play a more dominant part.

In this paper, we want to focus on the second pillar, i.e. on more general concepts and, hence, on examples for mathematical and in-formatical contributions to scientific computing. Of course, such al-gorithmic concepts are not uniformly distributed over the whole sim-ulation pipeline illustrated in Fig. 1, but are concentrated in the numerical treatment, implementation, visualisation, and embedding phases. Concerning numerical methods, aspects like approximation accuracy or speed of convergence are crucial. During implementa-tion, runtime and memory efficiency, parallelization properties, or hardware awareness are some important issues. Finally, an efficient embedding basically means to provide interfaces to connect codes or tools, resp.

Since this issue of SJAM is dedicated to Christoph Zenger and his scientific work, we illustrate the role of general concepts and algo-rithms with the help of the hierarchical paradigm that has been for many years and still is one of his primary fields of research. Hence, the remainder of this paper is organized as follows. In Sect. 2, we discuss octrees as a very important and flexible way of representing 3 D geometries which can be used as an efficient interface between geometric modelling on the one hand and numerical simulation on the other hand. Afterwards, in Sect. 3, we present a concise survey

(3)

Discipline: Informatics Mathematics Natural phenomenon 1. Modelling 2. Numerical treatment 3. Implementation 4. Visualisation 5. Validation 6. Embedding visualise simulation results

provide interfaces discretise model, develop algorithms

develop software, produce (parallel) code

validate, compare with experiments build up mathematical model

or technical process

Application field of

Fig. 1. From phenomena and technical processes to reliable simulation results:

the important steps of a numerical simulation, and where the classical scientiﬁc disciplines enter the scene.

of the main algorithmic features and conceptual advantages of sparse grids, and we brieﬂy report the state of the art of sparse grid research. Finally, some concluding remarks will close the discussion.

2. Eﬃcient geometric modelling: octrees

Nowadays, applications that deal with geometric modelling focus ei-ther on surface-oriented or volume-oriented geometric models, always choosing the class which suits the respective requirements in a better way. While the former ones are primarily used within CAD [1, 13], vi-sualisation, or computer graphics, volume-oriented models dominate the ﬁeld of simulation tasks, where an easy access to the geometry is necessary [15, 11]. Of course, this causes problems if geometric models shall be directly used as the geometry input of some simulation code. Here, common data structures that are well-suited for both worlds would be helpful. Octrees, a volume-oriented hierarchical representa-tion scheme based upon the recursive subdivision of a cube containing the object of concern with respect to each coordinate direction until the resulting cells are completely inside or outside the object or

(4)

un-til a predeﬁned ﬁnest resolution dmax is reached, have turned out to

be a promising candidate. With an octree, the average case overall amount of necessary cells decreases to O(n2) cells, which has to be compared with theO(n3) cells of an equidistant discretisation [6].

Fig. 2. An equidistant discretisation (left) compared to a quadtree (the 2 D

coun-terpart of octrees, right) where the amount of necessary cells decreases fromO(n2)

toO(n).

2.1. Modelling with half-spaces

Every refinement step leads to eight new cells that have to be tested, themselves, for an intersection with the object geometry to check whether there is need for refinement and, if not, to determine whether they are lying inside or outside the object. Such intersection calcu-lations – computationally expensive floating point (FP) operations – are, obviously, a drawback with respect to efficiency. Therefore, this is a good starting point for improvements. Actually, for a given surface-oriented geometry model consisting of flat surface patches only, a corresponding (volume-oriented) octree representation can be derived without any intersection calculations. Regarding the single surface patches as infinite planes, each plane divides the entire space into two half-spaces labelled inside and outside. An intersection of all half-spaces labelled inside, then, results in a volume-oriented repre-sentation of the original surface-oriented geometry.

To generate the respective octrees for each of the half-spaces de-scribed above, the necessary reﬁnement decisions can be reduced to simple parameter comparisons, if we always normalise our local cell to be [−1, 1]3, and if we choose the following (locally normalised) plane representation for all half-spaces:

(1) p : a0 + 3 i=1 ai· xi = 0 with 3 i=1 |ai| = 1 .

(5)

Thus, the value of a0 just denotes the (directed) distance of p from

the origin lying in the cell’s centre. An intersection of the interior of the local cell tested and some plane p only exists for −1 < a0 < 1;

for|a₀| = 1, plane p either completely overlaps with one of the cell’s sides or completely contains one of the cell’s edges or just hits the cell in one of the cell’s corners. Anyway, in the latter cases, a reﬁnement is not necessary.

Before, after some reﬁnement step, the eight new cells can be tested in the same way again, plane p has to be transformed to the respective new coordinate system. Thus, a scaling and translation operation has to be applied to p, which can be described as

(2) ⎛ ⎝x_x1₂ x3 ⎞ ⎠_→ ⎛ ⎝γ_γ1₂ γ3 ⎞ ⎠= ⎛ ⎝1/2 · x_{1/2 · x}1₂+ ζ_{+ ζ}1₂ 1/2 · x3+ ζ3 ⎞ ⎠ _.

The plane’s normal vector, given by a1 to a3, stays invariant with respect to these coordinate transformations. Hence, only the distance

a0 to the new origin changes. Inserting (2) in (1), plane p can now be written as (3) p : a˜0 + 3 i=1 ai· xi = 0 , 3 i=1 |ai| = 1 , where (4) a˜0 = 2· a0+ 2· 3 i=1 ai· γi.

Reﬁned cells are always half as big as the original cell in each direc-tion. Thus, the centre of a reﬁned cell has the distance -1/2 or 1/2 from the original cell’s centre. Substituting -1/2 or 1/2 – depending on the relative position of the respective successor cell – for γ1, γ2,

and γ3 in (4), ˜a0 can be written as ˜

a0 = 2· a0± a1± a2± a3.

For further optimisations, the expression±a₁± a₂± a₃ can be calcu-lated in advance for all eight possibilities and stored to some variables

t(1) to t(8). Hence, all necessary calculations to determine a new cell’s value ˜a0 = 2· a₀+ t(i) can be reduced to one FP multiplication and one FP addition.

A cell’s attribute – inside or outside – can also be determined by variable a0 applying the rule

(6)

if the face normal of plane p in Eq. (1) is always chosen to be an outer one. As a0 is calculated in any case, no further FP operations are needed. Figure 3 shows a small example of successive reﬁnements in 2 D (left) and the resulting quadtree (right) according to Eqs. (3) and (5). -1/3 1/3 -1 -1/3 -5/3 -1 -5/3 5/3 1/3 1 -1/3 1/3 -1 -1/3 -5/3

Fig. 3. A successive reﬁnement for the line l : −1₃−2₃· x1+1₃· x2= 0. The cells

in pictures 1–3 from the left contain the respective values of a0 according to (3).

The cell’s attributes – here, white means outside and grey means inside – in the right picture were determined by (5).

2.2. Boolean operators

So far, we have several octrees, each representing one half-space dis-cretisation of the original object’s surface patches. To form the result-ing volume-oriented model, Boolean operators – intersection, union, and diﬀerence – have to be applied to these octrees. The union is necessary to represent non-convex objects, whereas the set diﬀerence does not have to be considered explicitly due to A \ B = A ∩ B. Hence, the volume-oriented model can be described as a union of convex components that are each built by an intersection of half-spaces. The respective Boolean expressions can be written in their disjunctive normal form (DNF)

(6) (h1₁∩ h1₂∩ . . . ∩ h1_n₁)∪ . . . ∪ (hm₁ ∩ hm₂ ∩ . . . ∩ hm_n_m) ,

where hj_i denotes half-space i of part j. The DNF not only satisﬁes the uniqueness of representation, it also allows for a processing of all half-spaces in one single pass, such that the entire octree generation is free of any redundant work.

It is obvious that the intersection of two octrees after their genera-tion, for instance, leads to unnecessary reﬁnements in both trees that won’t show up in the resulting tree (see Fig. 4). When combining the octree generation and the application of Boolean operators to pro-cess all half-spaces in a single pass, no redundant reﬁnements will be

(7)

done, and the complexity is directly proportional to the surface of the resulting volume-oriented model.

Fig. 4. An intersection of two quadtree reﬁnements done in one single pass. The

dashed lines show redundant reﬁnements that don’t have to be done.

2.3. Decomposition of non-convex objects

Note that, for processing non-convex objects, a decomposition into convex components is necessary. One fast and eﬃcient approach is to recursively calculate an object’s convex hull, marking all faces lying on the convex hull and processing the rest in the same way until all faces are labelled. The result is a Boolean expression where the dif-ferent labels have to be substituted by the corresponding half-spaces again. Applying the rules of DeMorgan, this Boolean expression eas-ily conforms to a DNF according to (6). At the moment, our convex decomposition works ﬁne for 2.5 D objects. For fully 3 D objects, the decomposition algorithm needs a sophisticated exception handling and is work in progress.

In pass i, after calculating the convex hull for a set of faces (in the beginning, this set contains all faces of an object), each face of this set is tested to be part of the convex hull. Those lying on the convex hull are all marked with the same label li, the rest is clustered in connected sets, for each of which the convex hull is recursively cal-culated, followed by an analogous processing until these sets become empty. For that, a Boolean expression of the type

(li)\ ((set₁)∪ . . . ∪ (set_n))

is set up, which is recursively ﬁlled with new expressions of the same type until a set is ﬁnally processed. Thus, the resulting expression is of the form

(8)

Substituting each label (li) by its corresponding intersection of half-spaces (hl₁i∩hl₂i∩. . .∩hl_ni_li) results in a Boolean expression describing a convex decomposition of the original object’s geometry. This ex-pression can be further simpliﬁed by removing unnecessary brackets in case of single half-spaces and by applying DeMorgan’s rules to con-vert it into a DNF according to (6). Figure 5 shows a simple example of a convex decomposition and its corresponding Boolean expressions.

0

1 2

0

2 1

Fig. 5. A convex decomposition of a simple object into its main body (label 0)

and two further parts (label 1 and 2). The resulting Boolean expression reads

(0\ (1 ∪ 2)) or, after converting it into a DNF, (0 ∩ 1 ∩ 2).

2.4. Linearisation and binary streams

Before an octree can be stored to a file, it has to be linearised. A common linearisation technique is a “depth-first” search – starting from the root node and always descending in the left-most branch of the tree until a leaf is reached, then going backwards to examine all visited nodes in the same manner successively – where the single nodes are written in prefix, infix, or suffix notation. Here, a prefix no-tation is used, encoding an inner node – a node that has sons – as “1” and a leaf as “0”. In the case of a leaf, the respective attribute inside or outside also has to be stored, encoding the attribute inside as “1” and the attribute outside as “0”. The encoded attribute is appended to the encoded value for the leaf. Hence, two different encodings “00” and “01” for a leaf are possible (see Fig. 6).

The linearised octree is now available as a stream consisting of zeros and ones. For further optimisations, this stream can be binary en-coded, spending only one bit for each value and packing always 32 bits together in a 32-bit unsigned integer or a 64-bit unsigned integer on a 64-bit architecture, resp. In most cases, the last unsigned inte-ger can’t be completely ﬁlled and, hence, some padding information is appended. Unfortunately, this always requires to check for the end of data whenever a binary encoded linearised octree is processed.

(9)

1

1 00 1 00

00 00 01 01 01 01 00 00

Fig. 6. For the linearisation of an octree (centre), all nodes and leaves are encoded

before they are written in preﬁx notation as visited by a “depth-ﬁrst” search (right). The resulting binary stream is 10010000010100101010000.

Boolean operators, called multiplexers in this context, can also be applied to two or more of such binary streams. It doesn’t matter whether these streams were generated in advance and stored to a ﬁle or whether these streams are generated on-the-ﬂy, i.e. a generation and linearisation in real-time. Besides for the classical Boolean op-erations, multiplexers can also be used for some more sophisticated tasks like collision detection, for instance.

2.5. Handling of binary streams

To process binary streams and to tackle the end of data problem described above, some formalisation concepts will help to ﬁnd an ef-ﬁcient handling of binary encoded octrees. A Chomsky-II-grammar

G = (N, T, P, S), consisting of a set of non-terminal symbols N = {S}, a set of terminal symbols T = {0, 1}, a start symbol S, and

a set of productions P → 0|1SSSSSSSS, can produce any possible octree – here without the additional bits for the attributes. For ev-ery context-free (Chomsky-II-) grammar, there exists a ﬁnite state automaton K that accepts the language LG deﬁned by G. Starting

with an empty stack and a word y ∈ LG as input, y is accepted by

K when K stops in a corresponding ﬁnal state with an empty stack.

Based on this formalisation, an algorithm for processing streams can be implemented by using stacks. Given a maximum depth dmax for

the octree generation, a binary stream is read bit by bit and pro-cessed with a stack of size dmax+ 1 until the stack pointer is set to

the ﬁrst element again, all data has been read, and the rest only contains padding information (see Alg. 1). When a binary stream is being processed, an obvious drawback is the missing possibility to set back within the stream to any point already read. This makes it necessary to start from the beginning, but with respect to our appli-cations, this necessity is extraneous. The advantage of stacks, on the other hand, lies in the very low memory usage and in the fact that

(10)

the stack, at any time, contains the position number of the node or leaf currently processed during the entire process. Position numbers within the context of octrees are explained in [7], e.g.

Algorithm 1 Processing a binary stream with a stack

depth: d ⇐ 0;

allocate stack of size dmax+ 1: stack[i]⇐ −1;

do increase stack[d] by 1; if (stream = “1” ) then increase d by 1; else while (stack[d] = 7 ) do stack[d]⇐ −1; decrease d by 1; end while end if until ( d = 0 )

To address runtime and memory usage of the algorithms presented above, a small CAD model consisting of 60 half-spaces (see Fig. 7) was generated at different resolutions on a standard PC (Pentium4 2.40 GHz). Assuming a width and length of 10m for this model, a depth of 14 already results in a resolution of 0.61mm for a cell on the finest level, for instance. The following Tab. 1 shows runtimes, amounts of nodes, and file sizes for the model from Fig. 7 at resolutions ranging from 8 to 14. As one can see, with depth increasing by one, the values for the runtime, amount of nodes, and file size increase by a factor of four. This was to be expected due to the fact that the work to be done is direct proportional to the model’s surface (and the resolution on the finest level is halved with every increasing depth).

Depth Runtime [s] Nodes File size [Byte]

8 0.130 342,929 80,379 9 0.510 1,395,985 327,187 10 2.010 5,601,137 1,312,772 11 8.060 22,439,785 5,259,332 12 32.400 79,717,870 21,052,260 13 132.950 359,162,713 84,178,768 14 508.810 1,436,440,585 336,665,768

Table 1. Runtimes, amounts of nodes of the tree, and ﬁle sizes at diﬀerent

(11)

Fig. 7. An octree-based geometry of a simple CAD model consisting of 60

half-spaces, here displayed with 342,929 nodes at a depth of 8.

2.6. Applications

In the previous subsections, we showed how an octree-based volume-oriented model can be derived from a surface-volume-oriented one in an effi-cient way and how, by the usage of operators or multiplexers, resp., these octrees can be easily processed. However, octrees should not only be understood as a mere interface between the surface-oriented and the volume-oriented world [2]. Their potential is that they can act as an integrating step between these two worlds to couple simu-lation tasks and control management from the field of mechanical or civil engineering, for instance, with design processes from the field of CAD applications [12]. To illustrate this, we consider an application scenario where an octree-based geometric model is the foundation for a network-based cooperative working environment in the field of civil engineering. Different tasks like collision detection, global consistency, or structural analysis are handled or organized by an octree to cover some first important steps on the way to the long-term objective of completely embedded simulation processes.

2.6.1. Collision detection Collision detection is necessary during the derivation of a volume-oriented model from a surface-oriented one representing some building to ensure the consistency of the model. In CAD applications, two components that should perfectly match may intersect each other or have a small gap in-between, either due to mistakes during the modelling or because of round-oﬀ errors when persistently storing the model to a ﬁle. Such inconsistencies compli-cate simulation tasks like structural analysis or simulation of the air

(12)

conditioning, for example, which both fail if there is a gap between the walls and the ceiling. Hence, all inconsistencies have to be identiﬁed, located, and eliminated in advance before any numerical simulation is based upon the volume-oriented model generated.

To locate inconsistencies, all parts of a model have to be checked, i.e. tested for collisions with all other parts of the model. Thus, a complexity of O(n2) apparently occurs. As most of these tests end after few steps due to a disjoint location of the parts, an effective complexity of O(n) remains. Testing two parts provided as octrees or binary streams, a collision can be easily detected by intersecting both input files – either generated in advance or on-the-fly. When the intersection doesn’t become empty before a certain resolution h given by a maximum depth dmax = log2(1/h) is reached, an overlap

between these two parts exists. On the other side, if the intersection becomes empty before a depth dmin is reached, no collisions exist; if

it becomes empty between dmin and dmax, a gap has been found. In

the latter case, some user interaction is necessary to determine if this gap was produced intentionally or just by mistake.

Fig. 8. Collision detection for a simple CAD model – the parts highlighted in a

diﬀerent colour (left picture) are an inconsistency found by our program. A zoom into the critical parts (right) reveals a gap between the two neighbouring walls.

2.6.2. Network-based cooperative work To achieve global consistency for a geometric model being processed in a network-based coopera-tive working environment by several experts, collision detection, as described in the previous section, helps to locate any modifications in parts of a local model that can’t be joined without conflicts with the global model. Thus, cooperative work can be supported by simpli-fying control and management tasks on both a horizontal and verti-cal layer. While on a horizontal layer conflicts between neighbouring

(13)

parts processed by different experts can be identified, on a vertical layer different tasks for certain parts of the model with respect to a chosen granularity (a single room or a complete floor, e.g.) can be planned and controlled.

The basis for cooperative work is a global geometric model stored in a Relational Database Management System (RDBMS). Hence, each part of the model is given a unique identificator or primary key, under which all necessary information of this part is stored in the RDBMS. Besides the geometric information, all half-spaces of this part as well as its corresponding Boolean expression and further attributes like object type or material, for instance, can be stored. All parts of the global model are accessed via a second octree, containing the primary keys to the RDBMS’s tables and, thus, to the actual data. This oc-tree interface to the RDBMS is a spatial decomposition of the overall model, refined as long as one part entirely fits into one cell. All com-munication between the clients (experts) and the database is handled via this octree – direct communication is not allowed (see Fig. 9).

Client n Client 2 Client 1 . . . 3) Read/Write 2) Locking 1) Read only RDBMS

Fig. 9. An octree-based cooperative working environment, where clients can

ac-cess a global geometric model stored in a Relational Database Management Sys-tem (RDBMS) via an octree in read-only, locking, or read-write mode, allows to detect and avoid inconsistencies due to parts of the model modiﬁed locally.

Clients are given three different ways to check out parts of the global geometric model: read-only, locking, and read-write. The first one only grants read access. Thus, no modifications can be applied to the local model. Locking grants exclusive read-write permission for parts of the model to one single client. As long as all parts of interest and their neighbouring parts are locked, any modifications to these parts

(14)

won’t harm the global consistency when attempting to store the mod-ifications to the database. Only in the last case – read-write access – inconsistencies can occur as the same or neighbouring parts can be altered by different clients in a non-compatible way. When trying to write back these modifications, the system checks for collisions and – if existent – notifies the client about the corresponding parts to resolve the collision.

With these remarks on an application scenario, we close the discus-sion of octrees and turn to another algorithmic concept based on the hierarchical paradigm, the discretization on sparse grids.

3. Eﬃcient discretisation: sparse grids

In the following, we restrict ourselves to a short overview of sparse grids, demonstrating their most important features, their present ﬁelds of application, and their potential as cost-beneﬁt optimal spaces for higher dimensional problems, in particular. For details, we refer to the detailed survey in [5] and the references therein.

3.1. Principles and main properties

The representation of functions and the discretisation of PDE by conventional methods is limited to problems with up to three or four dimensions due to storage requirements and computational cost. The reason is the so-called curse of dimensionality, saying that the cost to compute and represent an approximation with a prescribed ac-curacy ε depends exponentially on the problem’s dimensionality d. We encounter complexities of the order O(ε−αd) with α > 0 depend-ing on the respective approach, the functions’ smoothness, and de-tails of the implementation. If we consider simple uniform grids with piecewise d-polynomial functions over a bounded domain in a finite element or finite difference approach, e.g., this complexity estimate translates to O(Nd_{) degrees of freedom, for which an accuracy of the}

order O(N−α) is achieved, where α depends on the smoothness of the function under consideration and the polynomial degree of the ansatz functions.

In the sparse grid method, we circumvent this curse of dimensionality by restricting ourselves to spaces of functions with bounded mixed derivatives. Starting from some 1 D multilevel basis on the unit inter-val [0, 1], preferably a H1- and L2-stable one, we obtain a multilevel

(15)

basis for the higher dimensional case, i.e. for Ω := [0, 1]d, by a simple tensor product construction. If the resulting hierarchical basis func-tions are arranged by gathering those of a support of size 2· h_l for level l = (l1, . . . ld), hlj = 2−lj, j = 1, ..., d, in a hierarchical subspace

Wl, we get the multilevel subspace splitting illustrated for d = 2 in Fig. 10.

l2 W11

W12

W₂₁ l1

Fig. 10. Scheme of subspaces for d = 2: Each square represents one subspace Wl

with its associated grid points. The supports of the corresponding basis functions have the same mesh size hland cover the unit square.

Now, for each of the hierarchical subspaces Wl, its cost (in terms of degrees of freedom) and its beneﬁt (in terms of its contribution to the overall interpolant with respect to some suitable norm) can be calculated or estimated, resp. Then, a cost-beneﬁt optimisation leads us to grid patterns and corresponding approximation spaces that are known as sparse grids (see Fig. 11). It turns out that the number of degrees of freedom needed for some prescribed accuracy does (up to logarithmic factors) no longer depend on d exponentially. This allows to treat either problems with a moderate number of dimensions substantially faster or to deal with higher dimensional problems at all.

Of course, the sparse grid approach is not restricted to the piece-wise linear case. Extensions of the standard piecepiece-wise linear

(16)

hierar-21 21 31 31 31 31 12 11 12 22 13 13 13 13 22 22 22 W11 12 W 13 W n = 1 n = 2 n = 3

Fig. 11. The regular sparse grid of level 3, d = 2, and the assignment of grid

points to subspaces.

chical basis to general polynomial degree p as well as interpolets, prewavelets, wavelets have been successfully used as 1 D ingredient for the tensor product construction (see [3, 5]).

3.2. Applications

Since its introduction [16], the sparse grid concept has been applied to most of the relevant discretisation schemes for PDE. These are finite element methods and Galerkin techniques, finite differences, finite volumes, spectral methods, and splitting extrapolation which leads to the combination technique. In the finite element context, special attention has been paid to concepts for adaptive mesh refinement and to the development of fast solvers. Concerning treated fields of PDE applications, flow problems have been the first focus for the use of sparse grid PDE solvers. Meanwhile, however, sparse grids are also used for problems from quantum mechanics, for problems in the context of stochastic differential equations, or for the discretisation of differential forms in the context of the Maxwell equations.

Apart from the ﬁeld of PDE, sparse grids have been and presently are applied to a variety of problem classes that shall not be forgotten here. Among these problems are integral equations, general operator equations, eigenvalue problems, data mining, and several others. Due to the historic background of sparse grids (see the work of Smolyak [14], especially), numerical quadrature has always been a hot topic in sparse grid research. Apart from the explicit use of the piecewise linear basis functions to calculate integrals, Smolyak’s scheme has

(17)

been applied to several 1 D quadrature rules such as the midpoint rule, the rectangle rule, the trapezoidal rule, general Clenshaw-Curtis rules, classical Gauss quadrature, or the incremental Gauss-Patterson rules (see [8], e.g.). Recently, in [9] a generalization of the conventional sparse grid approach which is able to adaptively assess the dimensions according to their importance was developed.

Furthermore, the higher order extension of the piecewise linear hier-archical basis to arbitrary polynomial degrees (cf. [3]) was used for the derivation of an adaptive quadrature scheme called piecewise Gauss quadrature (see [4]) that can tackle higher dimensional problems with locally varying smoothness, too. To cope with high dimensionality and to increase the degree of accuracy (i.e. the maximum polynomial degree for which the exact integral can be guaranteed), slight modiﬁ-cations in the basis functions and in the grids are necessary, see Figs. 12 and 13.

Fig. 12. Hierarchical polynomial basis according to [4]. The grid points are no

longer equidistant. Due to the relations to Gauss quadrature, the approach is called piecewise Gauss quadrature.

As an example, we study a simple transport problem (see [10] for details) described by the integral equation

y(x) = x +

₁

x γy(z)dz ,

for which the exact solution (an exponential function) is known. Think of a particle travelling through a 1 D slab of length one. In each step, the particle covers a distance which is uniformly distributed on [0, 1]. This may cause it to exit the slab; if not, it may be absorbed with probability 1− γ. The function y(x) denotes the probability for

(18)

Fig. 13. Classical regular sparse grid (left) and the one used for piecewise Gauss

quadrature (right), 2 D case.

a particle at position x to leave the slab. This problem can also be represented as an inﬁnite dimensional integral,

(7) y(x) = [0,1]∞ ∞ n=0 Fn(x, z)dz ,

where the vector z contains the leap lengths and where Fndenotes the

probability for the particle to leave the slab after exactly n steps. Fn,

basically, is a product of two Heaviside functions, such that the overall integrand has a discontinuity along the diagonal (see Fig. 14), which is kind of a worst case for a regular sparse grid, of course. Concerning

0 _0.2 0.4 _0.6 0.8 ₁ z1 - axis ₀ 0.2 0.40.6 0.81 z2 - axis 0.1 0.2 0.3 0.4 0.5 -12 -10 -8 -6 -4 -2 10 11 12 13 14 15 log2 of error log2 N Monte Carlo quasi-Monte Carlo (Faure, Sobol, Halton) piecewise Gauss

Fig. 14. The integrand∞_n=0Fn(0, z) in the ﬁrst two dimensions (left) and

re-sults for the discontinuous integrand in twenty dimensions for standard Monte Carlo, quasi-Monte Carlo, and piecewise Gauss quadrature (right).

dimensionality, since higher dimensions do not contribute consider-ably to the integral, the sum can be truncated. Figure 14 (right) shows numerical results for y(0) with γ = 0.5 in twenty dimensions. The performance of the adaptive sparse grid lies within the range of the standard Monte Carlo method, but can not compete with quasi Monte Carlo. At least, our adaptivity rescues Monte Carlo perfor-mance in this worst case scenario.

(19)

Much better results can be obtained by changing the problem’s for-mulation once more. Fn(x, z) in (7) can be replaced by a polynomial

F_n∗(x, z) describing the contribution of each jump. Now, the sparse grid properties can be fully exploited. Results for twenty dimensions are shown in Fig. 15. While quasi Monte Carlo gains about two dig-its of accuracy, the sparse grid gains eight. With 30000 evaluated points, the adaptive sparse grid outperforms quasi Monte Carlo by about four digits. Figure 15 also gives the convergence behaviour if we

-30 -25 -20 -15 -10 -5 10 11 12 13 14 15 log2 of error log2 N Monte Carlo quasi-Monte Carlo (Faure, Sobol, Halton) piecewise Gauss 1e-8 1e-7 1e-6 1e-5 1e-4 0.001 0.001 10 100 1000 10000 error function calls quasi-Monte Carlo Gauss-Patterson piecewise Gauss

Fig. 15. Results for the smooth integrand in 20 (left) and 8 dimensions (right).

further reduce dimensionality up to eight. The non-adaptive Gauss-Patterson on sparse grids according to [8] beats quasi Monte Carlo, but not the adaptive sparse grid. Hence, in spite of the smooth in-tegrand, the diﬀerent importance of the involved directions requires some adaptive strategy.

The above results clearly show the potential of sparse grids, in partic-ular for problems of a higher dimensionality. Future work will focus on the solution of PDE in the higher dimensional case.

4. Concluding remarks

In this paper, we discussed two examples of a general algorithmic concept based on the hierarchical paradigm and showed how these can be successfully used in the context of simulation technology. The first example, octrees, has turned out to be ideally suited to connect the worlds of geometric modelling and numerical simulation, and the second one, sparse grids, has proven to be a cost-benefit optimised discretisation which is especially advantageous in the higher dimen-sional case – a scenario where classical approaches suffer from the curse of dimensionality. Hence, the conclusion is that, apart from

(20)

approaching the challenges in scientiﬁc computing from the applica-tions’ side, general concepts can make important contributions, too. This holds for research activities as well as for teaching.

References

1. Barnhill, R.,and Boehm, W. (1984): Surfaces in computer aided geometric

design. North-Holland.

2. Breitling, P., Bungartz, H.-J., and Frank, A. (1999): Hierarchical concepts for improved interfaces between modelling, simulation, and visualization, Proc.

Vision, Modelling, and Visualization. Inﬁx.

3. Bungartz, H.-J. (1998): Finite elements of higher order on sparse grids. Ha-bilitationsschrift, Institute f¨ur Informatik, TU M¨unchen, and Shaker Verlag, Aachen.

4. Bungartz, H.-J., and Dirnstorfer, S. (2003): Multivariate quadrature on adap-tive sparse grids. Computing 71, 1, 89–114.

5. Bungartz, H.-J., and Griebel, M. (2004): Sparse grids. Acta Numerica, to appear.

6. Bungartz, H.-J., Griebel, M., and Zenger, C. (2002): Einf¨uhrung in die com-putergraphik, 2nd edition, Vieweg.

7. Frank, A. (2000): Organisationsprinzipien zur integration von geometrischer

modellierung, numerischer simulation and visualisierung, Herbert Utz Verlag.

8. Gerstner, T., and Griebel, M. (1998): Numerical integration using sparse grids. Numer. Algorithms, 18, 209–232.

9. Gerstner, T., and Griebel, M. (2003): Dimension adaptive tenzor - product quadrature. Computing 71, 1, 65–87.

10. Morokoﬀ, W., and Caﬂisch, R. (1995): Quasi-Monte Carlo integration. J. Comput. Phys., 122, 218–230.

11. Mundani, R.- P.(2000): Strömungssimulation auf adaptiven datenstrukturen mitels finiter volume. Master’s thesis, Department of Computer Sciences, TU München,

12. Mundani, R.- P.,Bungartz, H.-J., Rank, E., Romberg, R., and Niggl, A. (2003): Eﬃcient algorithms for octree-based geometric modeling, in: Proceed-ings of the Night International Conference on Civil and Structural Engineering Computing, Topping, B. (Ed.), Civil-Comp Press.

13. Riesenfeld, R. F. (1993): Modeling with nurbs curves and surfaces, in: Funda-mental Developments of Computer-Aided Geometric Modeling, Piegl, L. A. (Ed.), Academic Press.

14. Smolyak, S. (1963): Quadrature and interpolation formulas for tenzor prod-ucts of certain classes of functions, Soviet Math. Dokl. 4, 240–243. Russian original in Dokl. Akad. Nauk SSSR 148, No. 5, 1042–1045.

15. Tchon, K.-F., Hirsch, C., and Schneiders, R., (1997): Octree-based hexahe-dral mesh generation for viscous ﬂow simulations, in: Proceedings 13th AIAA Computational Fluid Dynamics Conference.

16. Zenger, C. (1991): Sparse grids, in: Parallel algorithms for partial diﬀeren-tial equations, Hackbusch, W. (Ed.), 31 of NNFM, Vieweg, Braunschweig/ Wiesbaden.