Friday, November 15, 2013

SyNS'13 Session 5: Bugs

Toward Efficient, Portable Application-Level Crash Consistency
Thanu, Remzi

Many techniques exist for file system consistency (for metadata), but what about application data consistency (for user data)?

How do applications achieve consistency? They rely on specific details of file system implementations.

Application-level invariants: thumbnails match pictures, etc. These are hard to maintain across system crashes.

Example:
Atomic file update: create a temp file, fsync() the temp file to disk, then rename() it over the original. The fsync() may be left out because of a wrong understanding of file system semantics, for performance's sake, or because on most file systems (e.g., ext4) omitting it is actually correct.
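As a concrete illustration (my own minimal Python sketch, not code from the talk; the temp-file suffix and the final directory fsync() are assumptions), the full, conservative version of this protocol looks like this:

    import os

    def atomic_update(path, data):
        """Replace the file at `path` with `data` so a crash never exposes a partial file."""
        tmp = path + ".tmp"                      # hypothetical temp-file name
        fd = os.open(tmp, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            os.write(fd, data)
            os.fsync(fd)                         # the step applications often skip
        finally:
            os.close(fd)
        os.rename(tmp, path)                     # atomically swap in the new contents
        # Conservatively, also fsync the parent directory so the rename itself is durable.
        dfd = os.open(os.path.dirname(path) or ".", os.O_RDONLY)
        try:
            os.fsync(dfd)
        finally:
            os.close(dfd)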

Goal: study application consistency behavior.
Methodology: case studies of SQLite and LevelDB.

Properties offered by file systems (inspired by the disk properties SQLite relies on, I think):
1. Post-crash property (true/false): does a system call sequence result only in a given, desirable set of post-crash states? ext3-ordered: yes, ext3-writeback: yes, ext4-ordered: no, btrfs: yes
2. Ordered appends (true/false): ext3-ordered: yes, ext3-writeback: no, ext4-ordered: no, btrfs: no
3. Ordered dir-ops: directory operations are persisted in the order they were issued.
4. Safe appends: when a file is appended, the appended portion never contains garbage after a crash.
5. Safe new file: after fsync() on a new file, another fsync() on the parent directory is not necessary.
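To make these properties concrete, here is a minimal sketch (my own, using a hypothetical write-ahead log, not an example from the talk) of an update whose crash safety depends on them: on a file system with safe appends and ordered appends, a durable COMMIT record implies the payload before it is complete and not garbage, so the application can skip an fsync() between the two writes; without those properties it cannot.

    import os

    def append_record(log_path, payload):
        fd = os.open(log_path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        try:
            os.write(fd, payload)        # safe appends => never garbage after a crash
            os.write(fd, b"COMMIT\n")    # ordered appends => never durable before the payload
        finally:
            os.close(fd)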

Bugs in SQLite and LevelDB: all of them are exposed only on certain file systems. Developers said they do not fully understand file system behavior, so they had to be conservative in their implementations, which hurts performance.

Performance:
Experiments show roughly 3x performance boosts if you can rely on certain file system properties to optimize application behavior!

Future work:
Solutions that do not require rewriting applications; maybe system call reordering? Tools to detect such bugs?


Efficient Concurrency-Bug Detection Across Inputs
Dongdong Deng, Shan Lu

Concurrency-bug detection is costly: it needs many inputs, and each input has a huge interleaving space to test. Software companies cannot afford exhaustive in-house testing.

Idea:
Remove redundancy across inputs: multiple inputs can trigger the same bug, so how can we remove such overlaps?

Solution:
Find the overlapping interleaving space of different inputs, given a set of inputs.
Which two inputs will give the same interleaving space? We cannot answer this question perfectly, but we can estimate; to this end, the concurrent function pairs (CFP) metric is proposed.

Characteristic study:
For each harmful data race, on average 7 inputs trigger it; for each benign data race, on average 4 inputs trigger it.

Approach:
1. Profile the interleaving space of each input: look at functions instead of instructions, since only a few functions share memory accesses; thus consider all function pairs that could execute concurrently. If two inputs have similar concurrent function pairs, they are likely to trigger the same bugs.
    Naive way to detect CFPs: look at every instruction in each function and check whether they can run concurrently.
    Efficient way: check whether one function's entry point can execute between the other's entry and exit, by looking at locks and barriers.
2. Build a test plan by selecting inputs and functions.
    Based on the CFP information, use a greedy algorithm to select the inputs that cover the most CFPs (see the sketch after this list).
    Also select which functions to test.
3. Conduct bug detection.
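Here is a minimal sketch (my own, not the authors' implementation) of the greedy input selection in step 2: each input's profile is treated as a set of CFPs, and we repeatedly pick the input covering the most not-yet-covered pairs.

    def select_inputs(cfp_sets, budget=None):
        """cfp_sets: dict mapping an input id to the set of CFPs profiled for it."""
        uncovered = set().union(*cfp_sets.values())
        chosen = []
        while uncovered and (budget is None or len(chosen) < budget):
            best = max(cfp_sets, key=lambda i: len(cfp_sets[i] & uncovered))
            gain = cfp_sets[best] & uncovered
            if not gain:          # remaining inputs add no new CFPs
                break
            chosen.append(best)
            uncovered -= gain
        return chosen

    # Example: inputs "A" and "B" overlap completely, so only one of them is selected.
    plan = select_inputs({
        "A": {("foo", "bar"), ("foo", "baz")},
        "B": {("foo", "bar"), ("foo", "baz")},
        "C": {("qux", "bar")},
    })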

Result:
About 4x redundancy reduction, and a corresponding speedup.


Limplock: Understanding the Impact of Limpware on Scale-out Cloud Systems
Thanh Do, Haryadi Gunawi

Hardware fails in different ways; what about performance degradation as a failure manifestation?
Facebook reported that one slow NIC on one machine (degraded from 1 Gbps to 1 Kbps) had a cascading impact on a 100-node cluster. And there are many other such stories.

Limpware: hardware whose performance is much lower than its specifications.
Study the impact of limpware (a single piece of limpware has global impact; why?).

Methodology:
Run a workload, inject limpware, then probe the system to understand the symptoms.
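As an illustration of the injection step (my own sketch, not the authors' harness; the device name, rate, and use of Linux tc are assumptions), a NIC can be made to "limp" by throttling it:

    import subprocess

    def limp_nic(dev="eth0", rate="1kbit"):
        # Throttle all egress traffic on `dev` to `rate` with a token bucket filter.
        subprocess.run(["tc", "qdisc", "add", "dev", dev, "root",
                        "tbf", "rate", rate, "burst", "1540", "latency", "50ms"],
                       check=True)

    def heal_nic(dev="eth0"):
        subprocess.run(["tc", "qdisc", "del", "dev", dev, "root"], check=True)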

Results:
1. Failures are handled pretty well (by retries etc.), but slowdowns are not!
2. Hadoop's speculative execution is not triggered. (E.g., with a degraded NIC on a map node, all reducers need to fetch data from that node; speculative execution is not triggered because all reducers are slow. In general, a degraded task leads to a degraded node, which then leads to many degraded nodes.)

Limplock:
The system progresses slowly due to limpware and is not capable of failing over to healthy components.
Level 1: Operation limplock
              caused by a single point of failure, etc.
Level 2: Node limplock
              caused by multi-purpose queues, unbounded queues, etc.
Level 3: Cluster limplock
Limplock happens in all the systems they analyzed.

Future work: Limpware-tolerant system
   1. Anticipation
   2. Detection
   3. Recovery
