Wednesday, February 15, 2012

FAST'12 Session 3: File System Design and Correctness

Recon: Verifying File System Consistency at Runtime
Daniel Fryer, Kuei Sun, Rahat Mahmood, TingHao Cheng, Shaun Benjamin, Ashvin Goel, and Angela Demke Brown, University of Toronto

My takeaway: checking consistency at runtime is interesting, and transforming global file system properties into local invariants is also interesting.

Bugs corrupt in-memory file metadata.

Current solution: assume file systems are correct. Offline consistency check (fsck): slow, requires the fs to be offline, and repair is error prone.

Their solution: Recon, a runtime consistency checker.

Key idea: every update results in a consistent fs image. (The disk can still corrupt data? Are checksums handling that below the fs?)

Transform global consistency properties into fast, local consistency invariants.

When to check? (Don’t want to check during an operation): right before the journal commit block is written.

System design: sits in the block device layer. Buffer metadata writes (write cache), interpret the metadata, then compare it to the old data (read cache) and check the invariants.

How to interpret: use the fs tree structure and just follow the pointers. (Need to understand the inode block structure.)
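A rough sketch of the commit-time check described above, in Python. Everything here is my own toy illustration and not Recon's actual code: the state layout (a set of allocated block numbers plus a map from inode to block pointers), the names `check_transaction` and `CommitTimeChecker`, and the specific invariant (a newly added block pointer must come with a newly set bitmap bit in the same transaction, and vice versa) are all assumptions.

```python
def added_pointers(old_inodes, new_inodes):
    """Block pointers that appear in the new inodes but not in the old ones."""
    old = {p for ptrs in old_inodes.values() for p in ptrs}
    new = {p for ptrs in new_inodes.values() for p in ptrs}
    return new - old


def check_transaction(old, new):
    """Compare old (read cache) and new (write cache) metadata.

    Local invariant (toy version): within one transaction, the set of
    newly referenced blocks must equal the set of newly allocated blocks.
    Returns a list of violation messages; empty means consistent.
    """
    referenced = added_pointers(old["inodes"], new["inodes"])
    allocated = new["bitmap"] - old["bitmap"]
    violations = []
    for b in sorted(referenced - allocated):
        violations.append(f"block {b} referenced but not newly allocated")
    for b in sorted(allocated - referenced):
        violations.append(f"block {b} allocated but not referenced")
    return violations


class CommitTimeChecker:
    """Hypothetical block-layer shim: buffers metadata, checks at commit."""

    def __init__(self, committed):
        self.committed = committed  # read cache: last consistent state
        self.pending = None         # write cache: buffered metadata writes

    def buffer(self, new_state):
        self.pending = new_state

    def on_commit_block(self):
        # Called right before the journal commit block would reach disk.
        violations = check_transaction(self.committed, self.pending)
        if violations:
            # Fail-stop: refuse to let the commit block through.
            raise RuntimeError("; ".join(violations))
        self.committed, self.pending = self.pending, None
```

The point of the sketch is that the check only looks at what changed in the transaction, not at the whole file system, which is what makes it cheap enough to run on every commit.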

Evaluation:

Inject corruption. Recon can detect corruptions not detected by fsck.

8% performance penalty (mainly due to cache misses)

Q&A

Q: What to do after detecting corruption?

A: Fail-stop. (Maybe retry if the failure is transient?)

Q: What happens if commits are delayed?

A: ext3 does that. (large transaction)

Q: How could future file systems make checking easier?

A: Back pointers (less data to keep track of). Maybe write the consistency properties in a declarative language.

Q: you are delaying writes. What about persistence requirement?

A: We only hold the commit block. It increases synchronous write latency. If you write synchronously, you are writing a commit block every time, so you are paying the cost all the time.

Q: apply this to other things? Distributed system?

A: Consistency in distributed systems is more complex. Maybe a DB which has fs-like structure to maintain, or some other transactional thing?

Q: why not inside FS?

A: We don’t depend on FS state correctness! We do rely on FS structure, but it changes slowly!

Understanding Performance Implications of Nested File Systems in a Virtualized Environment
Duy Le, The College of William and Mary; Hai Huang, IBM T.J. Watson Research Center; Haining Wang, The College of William and Mary

My takeaway: guest/host file system combination matters!

How to choose guest/host file system combination.

Macro level: Measure throughput and latency: combination choice matters!

Writes are more critical than reads.

Latency is more sensitive than throughput.

Micro level: random/sequential read/write

Reads are unaffected by nested fs, while writes are affected.

Readahead at the hypervisor when nesting FS

Long idle times for queuing

I/O scheduling not effective on nested fs

Effectiveness of guest fs’ block allocation is NOT guaranteed.

Advice:

Read-dominated workloads: doesn’t matter. Sequential read may even improve with nested fs (look ahead)

Write-dominated workloads: avoid nesting, since it causes extra metadata operations.

Latency sensitive workloads: latency increased.

Data allocation: better pass through.

Q&A

Q: did you put ext2 partition at different location of disk?

A: No. We didn’t.

Q: Then you didn’t isolate the effect of disk zone properties!

A again: we tried accessing different zones, performance difference was within 5%.

Q: Is the container file preallocated? Does the upper-level fs use direct I/O to bypass the host's page cache?

A: Yes. Yes.

Q: We don’t typically use i/o scheduler in guest.

A: default I/O scheduler in either guest/host.

Q: would your finding generalize to another layer of management?

A: didn’t think about it.

Q: something about cache flush. Didn’t understand….

Consistency Without Ordering

Work from our group.

Q: what’s the memory overhead?

A: only extra bitmap store. So only one bitmap block per 4096 blocks.

Q: How large is the file system? For a large fs, scan time increases. How about an almost full fs? Finding a free block costs! There are full file systems!

A: In the common case the fs is not so full. For a full file system, this is not the best approach.

Q: Where do you store the back pointers?

A: In the out-of-band (OOB) area of future disks.

Q: strong consistency guarantee? Not as strong as some file systems?

A: We provide data consistency: when you access data, the data belongs to this file, but it could be stale. Ext3 provides stronger consistency. The on-disk image may not be an image of a file system that ever existed.

Q: other problem could solve?

A: Any system where you have a hierarchy (parent and child).

Q: CPU overhead?

A: We looked at it. Not too different from ext2. We don’t know how many extra cycles though.

Q: Are back pointers removed when a file is deleted?

A: Currently lazy deletion. So in short, no! We rely on mutual pointer agreement.
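To make "mutual pointer agreement" concrete, here is a minimal Python sketch as I understand the idea: a forward pointer from a file is trusted only if the block's back pointer points back at that same file. The `Block` layout, the `back_offset` field, and `read_block` are hypothetical names I made up for illustration, not the actual on-disk format or code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Block:
    data: bytes
    back_inode: Optional[int] = None   # back pointer: inode that owns this block
    back_offset: Optional[int] = None  # assumed: block index within that file


def read_block(disk, inode_no, pointers, offset):
    """Return the block's data only if forward and back pointers agree.

    disk:     dict mapping block number -> Block
    pointers: the file's forward pointers (list of block numbers)
    Returns None when the block is missing or owned by someone else,
    e.g. a stale forward pointer left behind by a crash or lazy deletion.
    """
    blockno = pointers[offset]
    block = disk.get(blockno)
    if block is None:
        return None
    if block.back_inode != inode_no or block.back_offset != offset:
        return None  # no mutual agreement: do not trust this pointer
    return block.data
```

Note that agreement only says the block belongs to this file at this offset. It says nothing about whether the contents are the latest version, which matches the "could be stale" caveat in the answer above.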
