CS262A Reading Summary 25

Encapsulation of parallelism in the Volcano Query Processing System

G. Graefe
Summary by Feng Zhou
10/31/2002

3 key features:
  1. The "operator model" of parallelizing query processing is proposed. It differs from the conventional "bracket model" as follows: in the operator model, most data and control transfer between adjacent operators is done by ordinary procedure calls through the iterator interface, while in the bracket model all transfer is done by IPC or RPC. To exploit parallelism, "exchange operators", which are essentially drivers with queues, are inserted into the execution tree to assign subtrees to different processors.
  2. Several kinds of parallelism in Volcano are discussed. The first is vertical parallelism, which refers to overlapping computation along a path in the tree (pipelining). The second is "bushy parallelism", in which different subtrees are handled by different processors. The third is intra-operator parallelism, which is achieved by partitioning the data across several processors. The last two are forms of horizontal parallelism.
  3. "Flow control" is used by the exchange operator to prevent fast producers from overwhelming slow consumers. However, the mechanism is rather naive: the only control is a queue of static length.
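
To make features 1 and 3 concrete, here is a minimal sketch of the idea, not Volcano's actual code. It assumes a simplified open/next/close iterator interface. The class names (Scan, Filter, Exchange), the queue_len parameter, the _END sentinel and the run helper are all hypothetical. It also uses Python threads where the paper describes separate processes. Scan and Filter talk to each other by plain method calls; only Exchange crosses a thread boundary, and its bounded queue blocks the producer when the consumer falls behind.

```python
import queue
import threading

_END = object()  # hypothetical end-of-stream sentinel


class Scan:
    """Leaf operator: returns rows from an in-memory list."""

    def __init__(self, rows):
        self.rows = rows

    def open(self):
        self._it = iter(self.rows)

    def next(self):
        return next(self._it, _END)

    def close(self):
        self._it = None


class Filter:
    """Pulls from its child with an ordinary method call (no IPC)."""

    def __init__(self, child, pred):
        self.child = child
        self.pred = pred

    def open(self):
        self.child.open()

    def next(self):
        while True:
            row = self.child.next()
            if row is _END or self.pred(row):
                return row

    def close(self):
        self.child.close()


class Exchange:
    """Runs the child subtree in its own thread behind a bounded queue."""

    def __init__(self, child, queue_len=4):
        self.child = child
        self.q = queue.Queue(maxsize=queue_len)  # static length = flow control
        self._done = False

    def _produce(self):
        self.child.open()
        while True:
            row = self.child.next()
            self.q.put(row)  # blocks while the queue is full
            if row is _END:
                break
        self.child.close()

    def open(self):
        self._t = threading.Thread(target=self._produce, daemon=True)
        self._t.start()

    def next(self):
        if self._done:
            return _END
        row = self.q.get()  # blocks while the queue is empty
        if row is _END:
            self._done = True
        return row

    def close(self):
        while not self._done:  # drain so a blocked producer can finish
            self.next()
        self._t.join()


def run(plan):
    """Drive a plan to completion and collect its output rows."""
    plan.open()
    out = []
    while (row := plan.next()) is not _END:
        out.append(row)
    plan.close()
    return out


evens = run(Filter(Exchange(Scan(list(range(10))), queue_len=2),
                   lambda x: x % 2 == 0))
```

Note that Filter and Scan need no changes to run in parallel: the plan is parallelized only by inserting Exchange between them, which is the encapsulation the paper argues for. The producer and consumer overlap in time, which corresponds to the vertical parallelism in feature 2.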

1 flaw:

The only way to balance the work between different operators is flow control. It would be better if the system also controlled the concurrency level of each operator, so as to maximize the throughput of the whole system.