Monday, October 7, 2013

Reliability/corruption model and some key-value store building stuff


Zettabyte Reliability with Flexible End-to-end Data Integrity

Data corruption can go undetected, so high-level (end-to-end) integrity checking is needed
Checksums are used (a strong checksum is needed)
Drawbacks:
         Performance is bad (checksums must be computed) -- addressed by converting the checksum online
         Detection comes too late (we want to catch corruption before it reaches durable storage!) -- addressed by making every component aware of the checksum (sketch below)
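A minimal sketch of the block-checksum flow (my own illustration; CRC32 stands in for whatever checksum the system actually uses), showing the check happening before a block is committed to durable storage:

```python
import zlib

def checksum(block: bytes) -> int:
    # CRC32 is a stand-in for the (stronger) checksum the real system would use.
    return zlib.crc32(block)

def write_block(storage: dict, addr: int, block: bytes, csum: int) -> None:
    # Verify before the block reaches "durable storage", so corruption introduced
    # by an earlier component (page cache, network, ...) is caught here.
    if checksum(block) != csum:
        raise IOError(f"corruption detected before write at block {addr}")
    storage[addr] = (block, csum)

def read_block(storage: dict, addr: int) -> bytes:
    block, csum = storage[addr]
    if checksum(block) != csum:
        raise IOError(f"corruption detected on read at block {addr}")
    return block
```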

         They use a corruption model and a checksum model to estimate the probability of undetected corruption (for a b-bit block)
        
Zettabyte reliability?
         Undetected-corruption probability below ~3.46×10^-18 per 4KB block (a score of ~17.5; calculation below)
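A back-of-the-envelope check of where those numbers seem to come from (my own reading: at most one undetected corruption per zettabyte of 4KB blocks, with a zettabyte taken as 2^70 bytes, and the score being -log10 of the per-block probability):

```python
import math

ZETTABYTE = 2 ** 70                  # one (binary) zettabyte, in bytes
BLOCK = 4096                         # 4KB block
blocks_per_zb = ZETTABYTE // BLOCK   # = 2**58, about 2.9e17 blocks
p_max = 1 / blocks_per_zb            # ~3.47e-18 allowed per block
score = -math.log10(p_max)           # ~17.46, i.e. the "17.5 score"
print(p_max, score)
```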

Improving disk reliability doesn't improve overall reliability that much (maybe because the disk corruption probability is already small?)


How about adding compute overhead as a parameter?
        
PUE (power usage effectiveness): total power entering the facility divided by the power delivered to the IT equipment; the overhead goes to transformers, cooling, etc. A PUE of 1.1 means only ~10% of the power is spent outside the servers.
2005: 2-3; 2012: ~1.1

Proportionality: when idle, use (almost) no power; at full computational load, use full power

Energy efficiency: ~1GHz is the sweet spot

Increasing clock speed costs twice:
1.       Once for the faster switching speed itself
2.       Once for the memory wall (caching, prefetching, out-of-order execution)

So, the sweet-spot configuration (wimpy nodes):
1.6GHz dual core
32-160GB flash SSD
Only 1GB RAM

Design the key-value store from the very bottom (hardware) up
Fast front-end cache (cuckoo hashing?)
Backend: a log-structured data store plus a hash-table index (instead of going through a file system)
Partial-key caching (the complete key is stored along with the data to resolve collisions) for memory efficiency (sketch below)
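A minimal sketch of the partial-key idea as I understood it (names and sizes are my own assumptions, not the system's): the in-memory index keeps only a short key fragment plus a log offset, while the full key lives with the data in the on-flash log and is compared on lookup to rule out false matches.

```python
import hashlib

TAG_BITS = 16  # size of the in-memory "partial key"; an assumed value

def tag(key: bytes) -> int:
    # A few bits of a strong hash stand in for the partial key.
    return int.from_bytes(hashlib.sha1(key).digest()[:TAG_BITS // 8], "big")

class LogStore:
    def __init__(self):
        self.log = []        # append-only log on "flash": (full_key, value)
        self.index = {}      # in memory: bucket -> list of (tag, log offset)

    def put(self, key: bytes, value: bytes) -> None:
        offset = len(self.log)
        self.log.append((key, value))            # full key stored with the data
        self.index.setdefault(hash(key), []).append((tag(key), offset))

    def get(self, key: bytes):
        for t, offset in self.index.get(hash(key), []):
            if t != tag(key):
                continue                         # partial key filters most collisions
            full_key, value = self.log[offset]   # one "flash read" to confirm
            if full_key == key:
                return value
        return None
```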

Then hardware changed (CPU 6x, memory 8x, SSD 30-60x) --- so CPU and memory have to keep up
How do you minimize memory per entry?
This is the static external dictionary problem in theory
EPH -- 3.8 bits/entry
Entropy-coded tries: 2.5 bits/entry (quick calculation below)
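To see why bits-per-entry matters on a 1GB-RAM wimpy node, a quick calculation with the figures above:

```python
RAM_BITS = 8 * 2 ** 30                      # 1GB of RAM, in bits
for name, bits_per_entry in [("EPH", 3.8), ("entropy-coded trie", 2.5)]:
    entries = RAM_BITS / bits_per_entry
    print(f"{name}: ~{entries / 1e9:.1f} billion entries indexable in 1GB")
# EPH: ~2.3 billion entries; entropy-coded trie: ~3.4 billion entries
```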

Think of it as a pipeline!! (how?)
They shadow writes, then batch-copy them into the main index (rough sketch below)
So they can only support up to ~10% puts
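A rough sketch of how I read the shadow-then-batch-copy idea (entirely my own reconstruction, not their design): new puts land in a small mutable shadow store, and a background step periodically merges them into the large, compact main index.

```python
class TwoLevelStore:
    def __init__(self, merge_threshold: int = 1024):
        self.main = {}       # large, memory-efficient, effectively read-only index
        self.shadow = {}     # small write buffer for recent puts
        self.merge_threshold = merge_threshold

    def put(self, key, value):
        self.shadow[key] = value
        if len(self.shadow) >= self.merge_threshold:
            self._merge()    # batch-copy shadowed writes into the main index

    def get(self, key):
        # Reads check the shadow first so recent writes stay visible.
        if key in self.shadow:
            return self.shadow[key]
        return self.main.get(key)

    def _merge(self):
        # In the real system this would rebuild the compact on-flash index;
        # a dict update stands in for that bulk rebuild here.
        self.main.update(self.shadow)
        self.shadow.clear()
```

If the merge step is much more expensive than serving reads, it seems plausible that only a limited fraction of the workload (the ~10% puts mentioned above) can be absorbed this way.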

Problem: the Linux kernel I/O stack overhead is too high
Load balancing: add a cache at the front end to deal with hot spots
Proof: only O(n log n) cache entries are needed for n back-end nodes to achieve almost perfect load balancing (toy simulation below)
So the cache only needs to be L3-sized!
Intuition: (didn’t understand)
        
Their solution can’t deal with hash hacking (assume hash function invisible)


Some tradeoffs: do more reads to avoid some writes?
Is the bottleneck always in I/O? Also, flash performance is bursty

They want to manage raw flash and treat it as many sequentially-written devices (?)
They now have a SCSI command to exchange (remap) mappings on the SSD, and they can do cool things with that




