• Sonuç bulunamadı

From a calculus to an execution environment for stream processing

N/A
N/A
Protected

Academic year: 2021

Share "From a calculus to an execution environment for stream processing"

Copied!
21
0
0

Yükleniyor.... (view fulltext now)

Tam metin

(1)

From a Calculus to an Execution

Environment for Stream Processing

DEBS 2012

Robert Soulé Martin Hirzel Buğra Gedik Robert Grimm

(2)

… to an Execution Environment

CQL (StreamSQL) StreamIt (SDF) Sawzall (MapReduce) River (execution environment) System S (platform) Fusion (merge ops) Fission (replicate ops) Placement (assign hosts) Benefits of execution environment: •  Language portability •  Optimization reuse Source languages O pt imi za tio ns

(3)

From a Calculus …

•  Calculus = formal language + semantics

–  Stream calculus, Soulé et al. [ESOP’10]

•  Graph language:

–  Stream operators

with functions (F)

–  Queues (

Q

)

–  Variables (

V

)

f q q' v

•  Semantics:

–  Small-step

–  Operational

–  Sequence of

“operator firings”

F <

Q

1

,

V

1

>

b

<

Q

2

,

V

2

>

b

* …

(4)

Benefits of Calculus:

Translation Correctness Proofs

Execute

Input

Output

T

ra

nsl

at

e

T

ra

nsl

at

e

(5)

From Abstractions to the Real World

Brooklet calculus River execution environment

Sequence of atomic steps Operators execute concurrently Pure functions, state threaded

through invocations

Stateful functions, protected with automatic locking

Non-deterministic execution Restricted execution: bounded queues and back-pressure Opaque functions Function implementations No physical platform,

independent from runtime

Abstract representation of platform, e.g. placement Finite execution Indefinite execution

(6)

Concurrent Execution

Case 1: No Shared State

•  Brooklet operators fire one at a time

•  River operators fire concurrently

•  For both, data must be available

o

1

v

o

2

o

3

w

x

Single-threaded operators Atomic queue operations

(7)

Concurrent Execution

Case 2: With Shared State

•  Locks form equivalence classes over shared variables

•  Every shared variable is protected by one lock

•  Shared variables in the same class protected by same lock

•  Locks acquired/released in standard order

o

1

v

o

2

o

3

w

w

(8)

Restricted Execution

Bounded Queues

o

1

v

o

2

o

3

w

w

•  Naïve approach:

block when output queue is full

o2 waits b/c output q is full o3 waits b/c o2 locked w

q

Deadlock!

(9)

Restricted Execution

Safe Back-Pressure

o

1

v

o

3

w

w

•  Our approach: only block on output queue

when not holding locks on variables

q

o

2

5. Move data to output queue 1. Acquire locks

2. Fire operator in local queue 3. Buffer data

(10)

Applications of an

Execution Environment

•  Easier to develop source languages

– Implementation language

– Language modules

– Operator templates

•  Possible to reuse optimizations

– Annotations provide additional information

between source and intermediate language

(11)

Function Implementations

and Translations

logs : {origin : string; target : string} stream; hits : {origin : string; count : int} stream = select istream(origin, count(origin)) from logs[range 300]

where origin != target

Bag.filter (fun x -> #expr)

Bag.filter (fun x ->

origin != target)

Select

Range

Aggr

IStream

Expose operators, communication, and state Pre-existing operator templates

(12)

Translation Support:

Pluggable Compiler Modules

select istream(*)

from quotes[now], history

where quotes.ask<=history.low and quotes.ticker=history.ticker

CQL =

SQL

+

Streaming

+

Expressions

Expression analyzer SQL analyzer CQL analyzer Symbol table is-a has-a has-a has-a

(13)

Optimization Support:

Extensible Annotations

Source language River (execution environment) System S (platform) Optimizer Establishes by construction, e.g., Sawzall reducers commute Needs to know: •  Safety •  Profitability Establishes, e.g., available resources

(14)

Optimization Support:

Current Annotations

Annotation Description Optimization

@Fuse(ID) Fuse operators with same ID in the same process Fusion

@Parallel() Perform fission on an

operator Fission @Commutative() An operator’s function is commutative Fission

@Keys(k1,…,kn) partitionable by fields kAn operator’s state is

1,…,kn

Fission

(15)

Evaluation

•  Four benchmark

applications

–  CQL linear road

–  StreamIt FM radio

–  Sawzall web log

analyzer (batch)

–  CQL web log

analyzer (continuous)

•  Three optimizations

–  Placement

–  Fission

–  Fusion

(16)

Distributed Linear Road

(simplified version from Arasu/Babu/Widom [VLDBJ’06])

now proj ect istre am dup split ran ge join istre am aggre gate join se lect join ran ge parti tion proj ect dis tinct dup-split now proj ect aggre gate pro ject pro ject rstre am

(17)

CQL: Placement, Fusion, Fission

•  Placement + Fusion

 4x speedup on 4 machines

•  Fission

 2x speedup on 16 machines •  Insufficient work per operator

(18)

StreamIt: Placement

(19)

Sawzall (MapReduce on River)

Fission + Fusion

•  Same fission optimizer for Sawzall as for CQL

•  8.92x speedup on 16 machines, 14.80x on 64 cores •  With fusion, 50.32x on 64 cores

(20)

Related Work

Stream processing Execution environment Translators from languages to IL CQL Arasu et al. [VLDB J.’06] SVM Labonte et al. [PACT’04] P-Code Nelson [CC’79] This paper

(21)

Conclusions

•  River, execution environment for streaming

•  Semantics specified by formal calculus

–  Brooklet, Soulé et al. [ESOP’10]

•  3 source languages, 3 optimizations

–  First distributed CQL

–  Language compiler module reuse

–  Optimization enabled by annotations

•  Encourages innovation in stream processing

Referanslar

Benzer Belgeler

Bu nedenle sunulan çalışmaya alınan profesyonel ses kullanıcıları ses bozukluğu şikayeti ile Kulak Burun Boğaz (KBB) kliniğine başvurmamış bireyler olup, bu bireylerin

Prenses'in babası Prens Abdülmünim son Mısır hidivinin oğluydu, 1952'deki Nasır darbesinden sonra kral naipliği yapmıştı ve o da kuzini Said Halim Paşa gibi fotoğrafa ve

weight, birth height, head circumference, weight gain during pregnancy, CANSCORE, and 25-hydroxyvitamin D [25(OH)Vit D] mean levels of the fetal malnourished group were found to

Başlangıç için en iyi yol, okuldaki engelli öğrencilerden biri ile, kütüphane personeline engellilere nasıl yardımcı olabilecekleri konusunda önerilerde bulun­

nutritional control of translation. Rapamycin, as a specific inhibitor of FRAP/TOR at the protein level, allows for characterization of the mechanisms involved in this pathway.

We presented results of accurate modeling of the lasing spectra and material gain thresholds for a 2-D kite-shaped microcavity laser.. In such a laser, WG-like modes display the

Although the QBD platform is quite general and can be used for many stochastic models that pertain to different areas, in this paper, examples of QBD models pertaining to

For the unpassivated detector, dark current density deviate from the Arrhenius type of behavior at the temperatures lower than 140 K indicating surface related currents start to