Thursday, February 21, 2013
Higgs might spell doom for the entire universe?
There seems to be some media heat around the claim that our universe could be swallowed by an alternate one, based on a calculation using the currently measured Higgs mass.
“At some point, billions of years from now, it’s all going to be wiped out…. The universe wants to be in a different state, so eventually to realise that, a little bubble of what you might think of as an alternate universe will appear somewhere, and it will spread out and destroy us,” Lykken said at AAAS.
This is based on a renormalization group calculation extrapolating the Higgs effective potential to its value at energies many many orders of magnitude above LHC energies. To believe the result you have to believe that there is no new physics and we completely understand everything exactly up to scales like the GUT or Planck scale. Fan of the SM that I am, that’s too much for even me to swallow as plausible.
Comment by Peter Woit
Friday, February 8, 2013
HARDFS (selective 2-versioning on HDFS)
Common solutions to deal with fail-silent failures:
1. using replicated state machines
2. N-version programming
Main idea of HARDFS (selective 2-versioning):
0. better to crash than to lie! Keep watching, and whenever some component is doing something wrong, either recover it or just kill it.
0. make use of the fact that the system is already robust and able to recover from a lot of failures (e.g., crashes)
1. selective (only replicate important state)
2. use Bloom filters to compactly encode state (i.e., all file-system state is encoded in terms of yes-or-no questions) -- they use a particular kind of Bloom filter which supports deletion (see the sketch after this list)
3. ask-then-check for non-boolean verification (?)
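A minimal sketch of the "state as yes-or-no questions" idea, assuming the deletable Bloom filter is a counting Bloom filter (the standard variant that supports removal); the fact encoding, hashing, and sizing below are illustrative guesses, not HARDFS's actual implementation:

```python
# Minimal counting Bloom filter: membership queries with deletion support.
# Facts like "block B belongs to file F" become strings that are inserted,
# queried, and removed as yes-or-no questions.
# (Illustrative only: hashing, sizing, and the fact format are assumptions.)
import hashlib

class CountingBloomFilter:
    def __init__(self, num_counters=1 << 16, num_hashes=4):
        self.counters = [0] * num_counters
        self.num_hashes = num_hashes

    def _indexes(self, fact: str):
        # Derive k counter indexes from k salted hashes of the fact.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{fact}".encode()).hexdigest()
            yield int(digest, 16) % len(self.counters)

    def add(self, fact: str):
        for idx in self._indexes(fact):
            self.counters[idx] += 1

    def remove(self, fact: str):
        # Deletion is what a plain Bloom filter cannot do.
        for idx in self._indexes(fact):
            if self.counters[idx] > 0:
                self.counters[idx] -= 1

    def might_contain(self, fact: str) -> bool:
        # "Yes" may be a false positive; "no" is always correct.
        return all(self.counters[idx] > 0 for idx in self._indexes(fact))

# Example: a piece of namespace state phrased as a yes-or-no question.
shadow = CountingBloomFilter()
shadow.add("file:/a/b.txt has-block blk_42")
assert shadow.might_contain("file:/a/b.txt has-block blk_42")
shadow.remove("file:/a/b.txt has-block blk_42")
```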
Evaluation of HARDFS:
1. detects bit-flip errors pretty well; more crashes because of the extra bookkeeping (better to crash than to lie!)
(how well it does on more realistic/correlated errors is still unknown -- but he did do experiments which show it protects against bugs from the Mozilla bug report?)
2. 3% space overhead, 8% performance overhead, 12% additional code -- because only part of the state is replicated and only part of the code is 2-versioned.
More details on HARDFS:
selected parts:
harden namespace management
harden replica management
harden read/write protocol
micro-recovery
second version watches input/output
node behavior model
state manager:
state manager replicates a subset of the state (needs to understand HDFS semantics)
use Bloom filters because they do boolean verification well
to update Bloom-filter state, ask the main version for values and check them against the 2nd version's Bloom filter (ask-then-check; see the sketch below)
keep actively modified state in concrete form to enable in-place updates -- to avoid CPU overhead
a false positive in the Bloom filter only results in unnecessary recovery (as long as faults are transient and non-deterministic)
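A minimal sketch of the ask-then-check idea, assuming the second version keeps only a compact membership summary (e.g., the counting Bloom filter above) rather than concrete values; the function names and fact format are illustrative:

```python
# Ask-then-check: the second version cannot answer a non-boolean question
# ("which blocks does file F have?") from a Bloom filter alone, so it asks
# the main version for the concrete answer and then checks every claimed
# fact against its own compact state. (Names and fact format are assumptions.)

def ask_then_check(question_key, ask_main_version, might_contain):
    """ask_main_version(key) -> list of concrete values claimed by the main
    version; might_contain(fact) -> bool from the second version's filter."""
    for value in ask_main_version(question_key):
        fact = f"{question_key} -> {value}"
        if not might_contain(fact):
            # Disagreement: the main version may be lying or have corrupted
            # state -- raise an alarm and fall back to recovery.
            return False, value
    return True, None

# Demo with a plain set standing in for the counting Bloom filter.
second_version_facts = {"file:/a/b.txt -> blk_42", "file:/a/b.txt -> blk_43"}
ok, suspect = ask_then_check(
    "file:/a/b.txt",
    ask_main_version=lambda key: ["blk_42", "blk_99"],  # main version's buggy answer
    might_contain=lambda fact: fact in second_version_facts,
)
assert not ok and suspect == "blk_99"
```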
action verifier:
four types of wrong actions:
corrupt
missing
orphan
out of order
Handling disagreement:
using domain knowledge to ignore false alarms
for true alarms, I think they just restart the system using on-disk state and other nodes' states
Recovery:
1. crash and reboot (expensive)
2. micro-recovery (pinpoint the corrupted state by comparing the states of the two versions, then reconstruct only the corrupted state from disk)
however, the corrupted state also needs to be removed from the Bloom filters
solution: start over with a new Bloom filter and add all correct state back into it (see the sketch after this list)
3. thwarting destructive instructions
(does the master just tell the node? still unclear to me)
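A minimal sketch of the "start over with a fresh filter" step of micro-recovery, assuming the second version can re-enumerate facts that have been re-verified against the authoritative on-disk state; the enumeration function and filter interface are illustrative assumptions, not the paper's code:

```python
# Micro-recovery, Bloom-filter side: individual entries cannot be trusted
# once some state is suspect, so build a fresh filter and re-insert only
# facts that were re-verified against the authoritative on-disk state.
# (The filter interface and the enumeration function are assumptions.)

def rebuild_shadow_filter(filter_factory, enumerate_verified_facts):
    """filter_factory() -> empty filter with .add(); enumerate_verified_facts()
    yields facts re-checked against on-disk state."""
    fresh = filter_factory()
    for fact in enumerate_verified_facts():
        fresh.add(fact)
    return fresh  # swap this in for the corrupted filter

# Demo with a set standing in for the counting Bloom filter.
verified = ["file:/a/b.txt -> blk_42", "file:/a/c.txt -> blk_7"]
new_filter = rebuild_shadow_filter(set, lambda: iter(verified))
assert "file:/a/b.txt -> blk_42" in new_filter
```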
Thursday, February 7, 2013
Some computer science career tips
http://cra.org/resources/bp-view/best_practices_memo_computer_science_postdocs_best_practices/?utm_source=Computing+Research+News&utm_campaign=6640b57cf7-February_2013_CRN
How to get a faculty job, Part 1: The application, interview and offer
Tuesday, February 5, 2013
Linux File System Evolution
pages.cs.wisc.edu/~ll/fs-patch
study Linux file-system patches and bug reports to understand:
1. what patches do?
2. what types of bugs?
3. what techniques are used to improve performance/reliability?
patches = 40% bug fixes + 9% reliability/performance improvements + 10% features + other maintenance
bug fixes = 60% semantic bugs + 10% concurrency bugs + 10% error-code bugs + 10% memory bugs
a stable file system doesn't mean bugs become fewer over time (feature additions, etc.)
bug consequences = 40% corruption + 20% crash + 10% error + 10% wrong behavior + others (leak, hang, etc.)
as opposed to application (user-space) bugs, where wrong behavior dominates
transaction code may introduce a large number of bugs; tree code is not really error-prone
40% of bugs happen on failure paths (people don't handle failure well)
performance improvements = 25% sync optimization + 25% access optimization + 25% scheduling optimization
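A structured restatement of the breakdown above (percentages are the rough numbers in these notes; whatever the notes don't name is lumped into "other"):

```python
# Rough breakdown of Linux file-system patches as recorded in these notes.
# Percentages are approximate; remainders are assigned to "other".
patch_types = {"bug fix": 40, "reliability/performance": 9,
               "feature": 10, "other maintenance": 41}
bug_fix_types = {"semantic": 60, "concurrency": 10, "error code": 10,
                 "memory": 10, "other": 10}
bug_consequences = {"corruption": 40, "crash": 20, "error": 10,
                    "wrong behavior": 10, "other (leak, hang, ...)": 20}
perf_patch_types = {"sync": 25, "access": 25, "scheduling": 25, "other": 25}

for name, dist in [("patches", patch_types), ("bug fixes", bug_fix_types),
                   ("bug consequences", bug_consequences),
                   ("performance patches", perf_patch_types)]:
    assert sum(dist.values()) == 100, name
```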
Monday, February 4, 2013
GPUfs: GPU-integrated file system
Mark from UT-Austin
argue for more suitable abstractions for accelerators
accelerators != co-processors (they shouldn't be handled as such at the software layer) -- even though in hardware the GPU is a co-processor, it cannot open files by itself (as it cannot interrupt into the host hardware)
thus: on-accelerator OS support + accelerator applications
In the current model the GPU is a co-processor, and you have to manually do double-buffering, pipelining, etc., i.e., too many low-level details are exposed (9 CPU LOC for every 1 GPU LOC).
The GPU has three levels of parallelism:
1. multiple cores on the GPU
2. multiple contexts on a core (to hide memory-access latency)
3. SIMD vector parallelism, i.e., multiple ALUs via data parallelism
The GPU can access its local memory roughly 20x faster (200 GB/s vs. 16 GB/s) than CPU memory; consistency is also compromised.
file system API design:
1. disallow threads in the same SIMD group from opening different files (all of them collaboratively execute the same API call) -- to avoid the divergence problem
2. gopen() is cached on the GPU (the same file descriptor is returned for the same file, so the offset is shared; thus read/write have to specify the offset explicitly): gread() = pread(), gwrite() = pwrite() (see the sketch after this list)
3. When to sync? It can't be done asynchronously because the GPU can't have preemptive threads, and CPU polling is too inefficient; so they require an explicit sync (otherwise data never gets to disk!)
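A sketch of the call semantics described above, written in Python purely for illustration (GPUfs itself is a CUDA/C library; the names gopen/gread/gwrite come from these notes, but the signatures and caching logic here are assumptions):

```python
# Sketch of the API semantics from the notes: open-by-name is cached (same
# descriptor for the same file, so there is no private implicit offset),
# and reads/writes always pass an explicit offset, pread/pwrite style.
# Purely illustrative -- GPUfs is CUDA/C; signatures here are assumptions.
import os

_open_cache = {}  # filename -> file descriptor, shared by all "threads"

def gopen(path, flags=os.O_RDWR | os.O_CREAT):
    """Return a cached descriptor; repeated opens of the same file share it."""
    if path not in _open_cache:
        _open_cache[path] = os.open(path, flags, 0o644)
    return _open_cache[path]

def gread(fd, size, offset):
    # Explicit offset, like pread(): no hidden per-descriptor file position.
    return os.pread(fd, size, offset)

def gwrite(fd, data, offset):
    # Explicit offset, like pwrite().
    return os.pwrite(fd, data, offset)

# Usage: two callers open the same file and get the same descriptor,
# so they must address the file by offset rather than by seeking.
fd1 = gopen("/tmp/gpufs_demo.dat")
fd2 = gopen("/tmp/gpufs_demo.dat")
assert fd1 == fd2
gwrite(fd1, b"hello", offset=0)
assert gread(fd2, 5, offset=0) == b"hello"
```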
GPUfs design:
1. system-wide buffer cache: AFS close(sync)-to-open consistency semantics, but per block instead of per file (see the sketch after this list)
2. buffer-page false sharing (when the GPU and CPU write to different offsets of the same page)
3. an RPC system based on a client (GPU) / server (CPU) architecture
4. L1 bypassing is needed to make the whole thing work
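A minimal sketch of close(sync)-to-open consistency at block granularity, assuming a local cache that only pushes dirty blocks to the shared copy on close and re-validates on open; this illustrates the semantics in the notes, not GPUfs's actual cache code:

```python
# Close-to-open consistency at block granularity: writes stay in the local
# cache until close(), and another opener sees them only after its next open().
# Illustrative only; block size, names, and the "shared store" are assumptions.
BLOCK = 4096

class LocalBlockCache:
    def __init__(self, shared_store):
        self.shared = shared_store      # dict: block index -> bytes (home copy)
        self.blocks = {}                # locally cached blocks
        self.dirty = set()              # block indexes modified locally

    def open(self):
        # Re-validate on open: drop clean cached blocks so fresh data is fetched.
        self.blocks = {i: b for i, b in self.blocks.items() if i in self.dirty}

    def write_block(self, idx, data):
        self.blocks[idx] = data
        self.dirty.add(idx)             # not visible to others until close()

    def read_block(self, idx):
        if idx not in self.blocks:
            self.blocks[idx] = self.shared.get(idx, b"\0" * BLOCK)
        return self.blocks[idx]

    def close(self):
        # Flush only dirty blocks (per block, not whole file) to the shared copy.
        for idx in self.dirty:
            self.shared[idx] = self.blocks[idx]
        self.dirty.clear()

# Two caches over one shared store: updates propagate on close -> open.
store = {}
writer, reader = LocalBlockCache(store), LocalBlockCache(store)
writer.open(); writer.write_block(0, b"x" * BLOCK)
reader.open()
assert reader.read_block(0) == b"\0" * BLOCK   # writer hasn't closed yet
writer.close(); reader.open()
assert reader.read_block(0) == b"x" * BLOCK    # visible after close-to-open
```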
Q&A:
1. Why a file-system abstraction instead of a shared-memory abstraction? You don't need durability on the GPU anyway.
If you have weak consistency on memory, it's hard to program; but weak consistency on a file system is no big deal.