Sunday, October 21, 2012

How a little improvement in data center infrastructure enables you to do interesting things



Spanner: Google’s Globally-Distributed Database
OSDI 2012's best paper

They implemented a semi-relational database that is Bigtable-like, but with much stronger (external) consistency.

My takeaway:
Improving the data center's timing system and having a global clock makes consistency much easier.

The TrueTime API is really nice! Google is right that absolute time is much easier to manage than relative time (e.g., Lamport clocks).
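
To make the idea concrete, here is a minimal sketch of a TrueTime-style clock and the "commit wait" it allows. The paper's API exposes time as an interval [earliest, latest] rather than a single value; everything else here (the class names, the fixed `epsilon`, the injectable `clock` and `sleep`) is my own assumption for illustration, not Google's implementation.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    """Time as an interval: the true time lies in [earliest, latest]."""
    earliest: float
    latest: float


class TrueTimeSketch:
    """Toy stand-in for TrueTime with a fixed uncertainty bound (assumed)."""

    def __init__(self, epsilon=0.004, clock=time.time):
        self.epsilon = epsilon  # assumed clock uncertainty, in seconds
        self.clock = clock      # injectable so the sketch can be tested

    def now(self):
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        """True only if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t):
        """True only if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_wait(tt, sleep=time.sleep):
    """Pick a commit timestamp, then wait until it is definitely in the past."""
    s = tt.now().latest
    while not tt.after(s):
        sleep(tt.epsilon / 4)
    return s
```

The wait lasts roughly twice the uncertainty bound, which is why a tighter clock bound directly buys cheaper consistency.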


Flat Datacenter Storage
Microsoft Research
OSDI 2012

This is basically a re-implementation of GFS on a data center network with full bisection bandwidth, where network bandwidth matches disk bandwidth. (Thus flat and locality-oblivious.)

Data placement is deterministic (using a hash function) and doesn't consider locality.
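
A minimal sketch of what hash-based, locality-oblivious placement can look like. I am assuming the index is computed as (hash of the blob id + chunk number) mod table length; that formula, the SHA-1 choice, the round-robin table, and all function names are assumptions for illustration, not details confirmed by these notes.

```python
import hashlib


def tract_locator(blob_id: bytes, tract_number: int, table_length: int) -> int:
    """Deterministic table index for one chunk ("tract") of a blob."""
    digest = hashlib.sha1(blob_id).digest()
    h = int.from_bytes(digest[:8], "big")
    # Adding the tract number spreads consecutive tracts over consecutive rows.
    return (h + tract_number) % table_length


def build_table(servers, table_length):
    """Hypothetical locator table: servers assigned to rows round-robin."""
    return [servers[i % len(servers)] for i in range(table_length)]


def server_for(table, blob_id, tract_number):
    """Any client can compute the target server without asking a master."""
    return table[tract_locator(blob_id, tract_number, len(table))]


if __name__ == "__main__":
    table = build_table(["s0", "s1", "s2", "s3"], 8)
    for n in range(4):
        print(n, server_for(table, b"blob-1", n))
```

Since every client computes the same answer from the same table, no per-chunk lookup against a central server is needed, and nothing in the computation knows or cares where the client sits in the network.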

They push metadata management into each data node and keep only node information on a central server. Replication/recovery is a bit Chord-style, I think?

Other than that, it is not too different from GFS. This is the only work (that I know of) that talks about how a storage system uses the network, though: short, bursty flows and a bimodal distribution in packet size (small control messages, large data blocks). That is not good for TCP, so they use DCTCP?


