Thursday, May 5, 2016

MSST'16: It's never too fast: storage performance enhancements

Pfimbi: Accelerating Big Data Jobs Through Flow-Controlled Data Replication

In HDFS, synchronous (pipelined) replication is a performance bottleneck and seldom helps application performance
   - only 2% of data was read within 5 minutes of being written
 
So replicate asynchronously instead
Flow control is also needed to manage congestion
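
A minimal sketch of the idea in the two lines above, not Pfimbi's actual mechanism: the write is acknowledged as soon as the primary copy is queued, and replica copies drain later, only while the receiver has buffer credits. The class name, the credit scheme, and all method names are assumptions made for illustration.

```python
from collections import deque


class AsyncReplicator:
    """Toy model: primary writes are acknowledged at once; replica copies
    drain from a queue only while the receiver has free buffer credits."""

    def __init__(self, receiver_credits):
        self.credits = receiver_credits  # hypothetical: free buffer slots at the replica
        self.pending = deque()           # blocks written locally, not yet replicated
        self.replicated = []             # blocks the replica has received

    def write(self, block_id):
        # Acknowledge immediately; replication is deferred.
        self.pending.append(block_id)
        return "ack"

    def drain(self):
        # Send queued blocks while credits remain; stop rather than congest.
        sent = 0
        while self.pending and self.credits > 0:
            self.replicated.append(self.pending.popleft())
            self.credits -= 1
            sent += 1
        return sent

    def grant(self, n):
        # The receiver freed buffer space and returns n credits.
        self.credits += n
```

With two credits and five writes, the first `drain()` replicates two blocks and leaves three pending until the receiver calls `grant()`.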

ManyLogs: Improved CMR/SMR Disk Bandwidth and Faster Durability with Scattered Logs

Problem: small durable writes severely impact the bandwidth available to other users (e.g., a sequential reader)

in this case, data journaling outperforms ordered journaling!

Ordered journaling: efficient for large writes
Data journaling: efficient for small writes (fewer seeks)
previous work: adaptive journaling (ATC'05) 

Idea: keep many logs scattered across the disk; each small write goes to the log nearest the current head position
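
Picking the nearest log can be sketched as below. This is an illustration only: it assumes log locations and the head position are plain integers (e.g., track numbers), and `nearest_log` is a hypothetical helper, not something from the talk.

```python
import bisect


def nearest_log(log_positions, head):
    """Return the log position closest to the current head position.

    log_positions must be a non-empty list sorted in ascending order.
    On a tie, the lower position wins.
    """
    i = bisect.bisect_left(log_positions, head)
    # Only the neighbours on either side of the head can be the nearest.
    candidates = log_positions[max(i - 1, 0):i + 1]
    return min(candidates, key=lambda p: abs(p - head))
```

For logs at `[0, 100, 200, 300]` and the head at 130, the write would go to the log at 100.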

Where to put the logs on disk? Reserve 10 MB for every platter (?)

Checkpointing: ManyLogs checkpoints lazily instead of every 5 seconds
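
One way a lazy policy could look, as a sketch under stated assumptions: the notes do not say what triggers a checkpoint, so the idle-disk and high-water-mark conditions, the 0.8 threshold, and the function name are all my own guesses.

```python
def should_checkpoint(used_bytes, capacity_bytes, disk_idle, high_water=0.8):
    """Lazy policy (assumed): checkpoint only when the disk is idle
    or the log is nearly full, rather than on a fixed 5-second timer."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return disk_idle or used_bytes / capacity_bytes >= high_water
```

With a 10 MB log, a busy disk and 1 MB used would not checkpoint; the same log at 9 MB used would.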







