Friday, May 8, 2009

Energy Efficiency of MapReduce by UC Berkeley

Yanpei Chen and Tracy Xiaoxiao Wang of the RAD Lab, UC Berkeley, have evaluated the performance cost and power consumption of MapReduce while varying the cluster size from 1 node to 4 nodes.
Their performance metric is energy per workload per degree of data replication.
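
To make the metric a bit more concrete, here is a minimal Python sketch of how energy per unit of workload could be computed from power-meter samples and then reported for each replication factor. This is only my reading of the metric, not the authors' code: the function names, the evenly spaced sampling interval, and the sample numbers are all assumptions made up for illustration.

```python
def energy_joules(power_watts, interval_s):
    """Integrate evenly spaced power samples (W) with the trapezoidal rule."""
    if len(power_watts) < 2:
        return 0.0
    total = 0.0
    for a, b in zip(power_watts, power_watts[1:]):
        total += 0.5 * (a + b) * interval_s
    return total


def energy_per_workload(power_watts, interval_s, workload_units):
    """Energy (J) spent per unit of workload, e.g. per GB processed."""
    if workload_units <= 0:
        raise ValueError("workload_units must be positive")
    return energy_joules(power_watts, interval_s) / workload_units


# Hypothetical power traces, one per replication factor (made-up numbers).
traces = {
    1: [100.0, 100.0, 100.0],
    2: [100.0, 120.0, 100.0],
}
# Assumed: 1-second sampling and 2 workload units per run.
results = {r: energy_per_workload(p, 1.0, 2.0) for r, p in traces.items()}
```

Here `results` maps each replication factor to joules per workload unit, so runs with different replication settings can be compared side by side.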

Measurement results
1) CPU-Intensive Tasks
2) Reading/Writing from DFS
3) Writing/Sorting Intermediate pairs
4) Varying degree of DFS replication
5) Combined Task (overall job execution)

As a result, the two-node configuration shows the best performance cost compared with the others.

My opinion
1) The authors say several times in this paper that performance cost depends on workload design. However, I cannot understand what specific kind of design is required. Why did the authors evaluate only one workload?

2) The DFS replication result in the Reduce phase (Figure 8) is somewhat unusual. I have no idea how it would look if replication were varied in the Map phase; they did not evaluate this due to time constraints. However, I think the results should be reversed. In the Map phase, if the replication factor increases, the computation can cut down the network overhead of copying the required data, because the probability of data locality increases along with the replication factor. Hence, saving network overhead is directly connected to energy efficiency. In contrast, in the Reduce phase, if the replication factor increases, the task trackers have to make additional copies of the original data, even though the copying is pipelined. I think more disk access means more power consumption, but this paper shows exactly the opposite result. My reasoning may well be incorrect in a specific environment such as the RAD Lab cluster.

3) In conclusion, despite some oddities, this paper suggests the most important way to investigate the power consumption of parallel applications such as MapReduce, and it offers a viewpoint for extending the evaluation methodology in future work. Thanks. :D
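
The data-locality argument in point 2 can be sketched with a few lines of Python. This is a simplified model of my own, not something taken from the paper: it assumes each block's replicas are placed on distinct nodes chosen uniformly at random, and that a map task runs on an arbitrary node. Under those assumptions, the chance that the task finds a local replica is simply the replication factor divided by the number of nodes. The function name is hypothetical.

```python
from fractions import Fraction


def local_read_probability(num_nodes, replication):
    """Probability that a given node holds a replica of a block.

    Assumes replicas sit on distinct nodes chosen uniformly at random.
    """
    if not 1 <= replication <= num_nodes:
        raise ValueError("replication must be between 1 and num_nodes")
    return Fraction(replication, num_nodes)


# A 4-node cluster, matching the largest configuration in the paper.
for r in range(1, 5):
    print(r, local_read_probability(4, r))
```

In this model, locality grows linearly with the replication factor, which is why I expected less network traffic in the Map phase as replication increases. A real scheduler and a real placement policy would of course behave differently.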

Download paper

4 comments:

  1. Thank you for your help. ^^

  2. Please translate this.

  3. Many institutions limit access to their online information. Making this information available will be an asset to all.
