Wednesday, November 27, 2013

Some Linux tricks

1. Error message "Invalid output format udev. Choose from value, device, list, or full" when installing a kernel
    This is because the system uses an old version of blkid, which doesn't recognize the newer udev format.
    To solve this, just install a new util-linux package. Sometimes yum screws up and reports the latest util-linux while blkid is still not updated; just reinstall in that case: yum reinstall util-linux

Monday, November 18, 2013

MIGRATORY COMPRESSION

EMC

Coarse-grained Data Reordering to Improve Compressibility

Compression is limited by its window size, and a larger window size leads to slower compression.

MC: first do coarse-grained reorganization to group similar blocks, so that you can improve compressibility with the same window size

Gzip: Huffman coding, 64KB window
Bzip2: 900KB window, Burrows-Wheeler Transform and Huffman coding
7z: Markov chain-based range codec, 1GB window
(However, there is no apples-to-apples comparison, as they don't know how to increase the window size of Gzip etc.)

Almost an inverse linear relation between throughput and compression factor

Idea: improve compression by grouping similar blocks

Burrows-Wheeler: reorders data within a block
MC: reorders the blocks themselves

Use cases:
1. compress a single, large file
2. Archival tier of deduplicating storage system
3. network transfers (maybe?)

Delta compression: just store how to transform A into A' in order to store A'. OK for a single file, but too complex for a deduplicating storage system, and not good for locality either

Workflow of MC:
To compress:
segmentation (partition into blocks, calculate super-fingerprints) ---> similarity detector ---> re-organizer
To decompress:
a header specifies how to reconstruct the file

Similarity detection: similar to "redundancy elimination at the block level", calculating features with a set of rolling hashes.

Reorganization: reorganize the input file according to a recipe
1. in-memory: read in the entire file
2. chunk-level: random access
3. multi-pass: in general better than chunk-level
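
A toy sketch of the pipeline described above (my own illustration, not the paper's code): partition into blocks, compute a similarity "super-fingerprint" per block, group similar blocks together, then compress the reordered stream. The block size, the shingle-based stand-in for the rolling-hash features, and zlib as the back-end compressor are all assumptions.

```python
# Toy MC pipeline sketch: reorder similar blocks before standard compression.
import zlib
from collections import defaultdict

BLOCK = 8 * 1024      # hypothetical block size
FEATURES = 4          # hypothetical number of features per super-fingerprint

def super_fingerprint(block: bytes) -> tuple:
    # Cheap stand-in for rolling-hash features: smallest hashes over 8-byte shingles.
    shingles = (block[i:i + 8] for i in range(max(len(block) - 7, 1)))
    return tuple(sorted(hash(s) for s in shingles)[:FEATURES])

def mc_compress(data: bytes):
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    groups = defaultdict(list)
    for idx, blk in enumerate(blocks):
        groups[super_fingerprint(blk)].append(idx)
    # Sorting by fingerprint places identical/similar blocks next to each other.
    order = [idx for fp in sorted(groups) for idx in groups[fp]]
    # The "recipe" (the header above) records the original index and length of every
    # reordered block so decompression can restore the original order.
    recipe = [(idx, len(blocks[idx])) for idx in order]
    reordered = b"".join(blocks[idx] for idx in order)
    return zlib.compress(reordered, 9), recipe

def mc_decompress(compressed: bytes, recipe) -> bytes:
    raw, pieces, off = zlib.decompress(compressed), {}, 0
    for orig_idx, length in recipe:
        pieces[orig_idx] = raw[off:off + length]
        off += length
    return b"".join(pieces[i] for i in sorted(pieces))
```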

Evaluation:
At least for an in-memory system, adding MC before compression never does worse than vanilla compression. If things don't fit in memory, the gains are not as large because of the random accesses incurred. SSDs help, though. --> Remzi: can you just limit the re-ordering scope so that it fits into memory?

Decompression has 30% to 70% overhead in an archival system though, due to random access


Q: How about workload-specific compression? If you know the workload better, you may be able to compress better? MC is a pretty generic technique though?
A: I need to think about it.
     But NetApp's video compression falls into that category.














Friday, November 15, 2013

SyNS'13 Session 6: Security

Resource-freeing Attacks: Improve Your Cloud Performance (at Your Neighbor's Expense)
Venkat Varadarajan, Mike Swift, Tom Ristenpart

Side channel stuff, taking advantage of CPU scheduling etc.
Solution: better isolation, or pack up the VM and move it (not all workloads are cheap to move).

This work: a greedy customer can get more resources via a resource-freeing attack

Victim:
Beneficiary:
Helper: mounts the attack; could sit inside the beneficiary

Example:
Cache contention: the victim webserver interrupts frequently and pollutes the cache for the beneficiary (because Xen gives higher priority to the VM consuming less CPU time). The helper sends CPU-intensive requests to the webserver, which reduces the interrupts the webserver generates, and the beneficiary improves.

General Resource-freeing attacks:
1. Send targeted requests to the victim
2. Shift resource usage away from the bottleneck

OpenNF: All your network functions are belong to us (almost)
Aaron Gember, Aditya Akella

NFs (or middleboxes) examine/modify packets at layers 3-7, and we see increasing deployment of software NFs.

But today NFs are treated as black boxes, which makes it hard to identify bottlenecks etc.

Example:
If you scale up, you need to move flows, otherwise the bottleneck may persist or you delay your scale-up. But simply moving a flow may lead to inconsistency. Move flows along with the internal state of the middleboxes.

Solution: OpenNF
APIs implemented by NFs so that applications can examine/update a middlebox's internal state.
Problem with VM replication: unneeded state may cause incorrect actions, and state cannot be merged, so it is incapable of complex load balancing

NF state taxonomy:
Per-flow state, multi-flow state, all-flow state (statistics)

Per-flow state we can just move around; multi-flow state we share (but synchronization is expensive, so clone and lazily merge only when scaling down); all-flow state is handled the same as multi-flow state.

State consistency: (because the network could update states while we are moving things)
Suspending flow traffic and then moving is one solution.
Packet-redirect events are a more efficient solution (it is critical that you move the states before you update the network routing).
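
Roughly what a "move flow with its state" control operation could look like. Every class and method name below is my own invention, not OpenNF's real API, and the controller is a trivial stub.

```python
# Sketch of moving a flow together with its per-flow state (hypothetical names).

class Controller:
    """Minimal stand-in for the SDN controller (hypothetical)."""
    def buffer_flow(self, flow_id): pass      # hold the flow's packets during the move
    def update_route(self, flow_id, dst): pass
    def release_flow(self, flow_id, dst): pass

class NFInstance:
    """A software NF that can export/import per-flow state (hypothetical interface)."""
    def __init__(self):
        self.per_flow_state = {}              # flow_id -> opaque state blob

    def export_state(self, flow_id):
        return self.per_flow_state.pop(flow_id, None)

    def import_state(self, flow_id, state):
        self.per_flow_state[flow_id] = state

def move_flow(ctl: Controller, src: NFInstance, dst: NFInstance, flow_id) -> None:
    # 1. Stop state-changing packets from reaching src while the move is in flight
    #    (the "suspend traffic" option; packet-redirect events are the faster option).
    ctl.buffer_flow(flow_id)
    # 2. Move the per-flow state. Multi-flow / all-flow state would be shared or
    #    cloned and lazily merged instead of moved.
    dst.import_state(flow_id, src.export_state(flow_id))
    # 3. Only after the state is installed do we update routing and release packets;
    #    otherwise packets could reach dst before its state exists.
    ctl.update_route(flow_id, dst)
    ctl.release_flow(flow_id, dst)
```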


RiskRoute: A Framework for Mitigating Network Outage Threats
Ramakrishnan Durairajan, Paul Barford

Adjust internet routing in real time, before outages (from natural disasters) happen.

Does internet routing currently take advantage of the predictability of natural disasters?
Yes, but it is currently done by hand, and is thus incomplete and inefficient

Bit-Risk Miles Metric:
Assesses sensitivity to network outages. Defined by combining the number of bits sent, the distance traveled, and the outage risk into a single Bit-Risk Miles value.

RiskRoute methodology:
An optimization problem which minimizes the bit-risk miles.
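
A toy sketch of route selection by this metric. My notes only list the metric's components, so combining them per link as a product (bits x miles x risk) is my assumption, as are all field and function names.

```python
# Pick the candidate route that minimizes a (hypothetical) bit-risk-miles total.

def bit_risk_miles(route: list[dict], bits_sent: float) -> float:
    # Assumed combination: bits * miles * outage risk, summed over the route's links.
    return sum(bits_sent * link["miles"] * link["outage_risk"] for link in route)

def pick_route(candidate_routes: list[list[dict]], bits_sent: float) -> list[dict]:
    return min(candidate_routes, key=lambda r: bit_risk_miles(r, bits_sent))

# Two hypothetical routes between the same endpoints: a short risky one and a
# longer, safer detour.
routes = [
    [{"miles": 300, "outage_risk": 0.02}, {"miles": 500, "outage_risk": 0.01}],
    [{"miles": 900, "outage_risk": 0.001}],
]
print(bit_risk_miles(pick_route(routes, bits_sent=1e9), 1e9))
```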

Evaluation:
1. Risk ratio: average reduction in bit-risk miles
2. Distance ratio: average increase in bit-miles.





SyNS'13 Session 5: Bugs

Toward Efficient, Portable Application-Level Crash Consistency
Thanu, Remzi

Many techniques exist for file system consistency (for metadata), but what about application data consistency (for user data)?

How do applications achieve consistency? They rely on specific details of file system implementations.

Application-level invariants: thumbnails match pictures, etc. But these are hard to maintain across system crashes.

Example:
Atomic file update: create a temp file, fsync the temp file to disk, then use rename(). The fsync() may be omitted because of a wrong understanding, for performance's sake, or because on most file systems (e.g., ext4) the update is actually correct without it.
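
For reference, the textbook temp-file-and-rename pattern described above looks roughly like this; the helper name and paths are mine, and the directory fsync at the end is the step applications most often skip.

```python
# Temp-file-and-rename atomic update (illustrative helper, not from the talk).
import os

def atomic_update(path: str, data: bytes) -> None:
    tmp = path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())          # persist the temp file's contents
    os.rename(tmp, path)              # atomically replace the old file
    dir_fd = os.open(os.path.dirname(path) or ".", os.O_DIRECTORY)
    try:
        os.fsync(dir_fd)              # persist the rename itself
    finally:
        os.close(dir_fd)
```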

Want to study application consistency behavior.
Methodology: case studies of SQLite and LevelDB

Properties offered by file systems: (inspired by the disk properties SQLite relies on, I think)
1. Post-crash property (true/false): does a system call sequence only result in a given, desirable set of post-crash states? ext3-ordered: yes, ext3-writeback: yes, ext4-ordered: no, btrfs: yes
2. Ordered appends (true/false): ext3-ordered: yes, ext3-writeback: no, ext4-ordered: no, btrfs: no
3. Ordered dir-ops: directory operations get persisted in issued order.
4. Safe appends: when a file is appended, the appended portion will never contain garbage
5. Safe new file: after fsync() on a new file, another fsync() on the parent directory is not necessary.

Bugs in SQLite and LevelDB: all of them are only exposed on certain file systems. The developers said that they don't understand file system behavior, so they had to be conservative in their implementations, which hurts performance.

Performance:
Experiments show a 3x performance boost if you can rely on certain file system properties to optimize application behavior!

Future work:
Solutions which don't require application re-writing; system call re-ordering, maybe? Tools to detect such bugs?


Efficient Concurrency-Bug Detection Across Inputs
Dongdong Deng, Shan Lu

Concurrency-bug detection is costly: you need to test many inputs, and each input has a huge interleaving space to test. Software companies cannot afford exhaustive in-house testing.

Idea:
Remove redundancy across inputs, because multiple inputs can trigger the same bug. How can we remove such overlaps?

Solution:
Find the overlapping interleaving space for different inputs, given a set of inputs.
Which two inputs will give the same interleaving space? We cannot answer this question perfectly, but we can estimate it. To this end, the concurrent function pair (CFP) metric is proposed.

Characteristic study:
For each harmful data race, on average 7 inputs trigger it; for each benign data race, on average 4 inputs trigger it.

Approach:
1. Profile the interleaving space of each input: look at functions instead of instructions, as only a few functions share memory accesses; thus look at all the function pairs which could be executed concurrently. If two inputs have similar concurrent function pairs, they are likely to trigger the same bugs.
    Naive way to detect CFPs: look at every instruction in each function and see if they can run concurrently.
    Efficient way: check whether one function's entry point can execute between the other's entry and exit, by looking at locks and barriers.
2. A test plan of selecting inputs and functions (a greedy-selection sketch follows this list).
    Based on the CFP information, use a greedy algorithm to select the inputs which cover the most CFPs.
    Also, select the functions to test.
3. Conduct bug detection.
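
A sketch of the greedy input selection from step 2, as I understand it; the data layout and names are my own illustration.

```python
# Greedy selection of test inputs by concurrent-function-pair (CFP) coverage.

def select_inputs(cfp_by_input: dict[str, set[tuple[str, str]]], budget: int) -> list[str]:
    covered: set[tuple[str, str]] = set()
    chosen: list[str] = []
    remaining = dict(cfp_by_input)
    while remaining and len(chosen) < budget:
        # Pick the input whose CFPs add the most not-yet-covered pairs.
        best = max(remaining, key=lambda inp: len(remaining[inp] - covered))
        if not remaining[best] - covered:
            break                      # nothing new left to cover
        covered |= remaining.pop(best)
        chosen.append(best)
    return chosen

# Example with three hypothetical inputs:
cfps = {
    "input_a": {("foo", "bar"), ("foo", "baz")},
    "input_b": {("foo", "bar")},                     # redundant with input_a
    "input_c": {("qux", "bar")},
}
print(select_inputs(cfps, budget=2))   # -> ['input_a', 'input_c']
```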

Result:
4x redundancy reduction, and thus a corresponding speedup.


Limplock: Understanding the Impact of Limpware on Scale-out Cloud Systems
Thanh Do, Haryadi Gunawi

Hardware fails in different ways; what about performance degradation as a failure manifestation?
Facebook reported that one slow NIC in one machine (1 Gbps degraded to 1 kbps) had a cascading impact on a 100-node cluster. And there are many other such stories.

Limpware: hardware whose performance is much lower than its specifications.
Study the impact of limpware: (a single piece of limpware has global impact; why?)

Methodology:
Run workload, inject limpware, then probe systems to understand symptoms.

Results:
1. Failures are handled pretty well (by retry etc.), but slowdowns are not!
2. Hadoop's speculative execution is not triggered. (E.g., a degraded NIC on a map node: all reducers need to fetch data from that node, and speculative execution is not triggered because all reducers are slow! In general, a degraded task leads to a degraded node, which then leads to many degraded nodes.)

Limplock:
The system progresses slowly due to limpware and is not capable of failing over to healthy resources.
Level 1: Operation limplock
              caused by a single point of failure, etc.
Level 2: Node limplock
              caused by multi-purpose queues, unbounded queues, etc.
Level 3: Cluster limplock
              
Limplock happens in all the systems they analyzed.

Future work: Limpware-tolerant system
   1. Anticipation
   2. Detection
   3. Recovery

SyNS'13 Session 4: Measure it!

Inaccurate Spectrum Databases?
Tan Zhang, Suman

TV whitespaces are available; how do we use them?

Things people are already doing:
1. Rural internet connections
2. Smart power-grid controller
3. Bus internet

But it shouldn't interfere with primary TV users, hence the need to determine whitespaces.

Wardriving on public vehicles (spectrum sensors deployed on Metro buses)

Accuracy of Commercial Databases:
1. Good for protecting used spaces
2. But too conservative, thus wasting white spaces.

V-Scope: opportunistic measurement.
Wireless gateways upload data to V-Scope servers, and the server then uses the internet to distribute the information to participating databases.

Challenges in measuring the primary signal:
Weak signals could be overwhelmed by strong signals ---> zoom-in peak detection (narrowing the capture bandwidth to reduce the noise floor).
Power ---> peak-based power estimation.
Model refinement ---> improve any propagation model by fitting its parameters with measurements ---> weighted regression fitting to allow good fits even for weak signals.
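
A sketch of what the weighted-regression model refinement could look like for a simple log-distance path-loss model; the model form, the weighting heuristic, and all names are my assumptions, not V-Scope's code.

```python
# Fit a log-distance path-loss model with weighted least squares, up-weighting
# weak (far) measurements so they are not drowned out by strong ones.
import numpy as np

def fit_path_loss(dist_m: np.ndarray, rssi_dbm: np.ndarray, weights: np.ndarray):
    # Model: rssi = p0 - 10 * n * log10(d); solve for (p0, n) by weighted LS.
    A = np.column_stack([np.ones_like(dist_m), -10.0 * np.log10(dist_m)])
    W = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * W[:, None], rssi_dbm * W, rcond=None)
    p0, n = coef
    return p0, n

# Toy data: farther, weaker measurements get larger weights.
d = np.array([50.0, 100.0, 400.0, 1600.0])
rssi = np.array([-60.0, -70.0, -82.0, -95.0])
w = 1.0 / np.maximum(rssi + 110.0, 1.0)      # heuristic: weaker signals weigh more
print(fit_path_loss(d, rssi, w))
```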


Enhancing the Quality of Electric Load Forecasting Methods
Scott Alfeld, Paul Barford

Predicting electricity: components, buildings, cities, countries, days, years.

This work looks at buildings and cities, not single homes.

Problem:
Many, many buildings make up cities
Easily available: Historical electricity usage
Less so: Everything else (what if you have a new building, a new event?)

Data: hourly electricity usage for 2-3 years, NAICS code ("business type")
Task: Predict tomorrow's (hourly) load

Solution:
Latent Feature Extraction
Anomaly Detection
Priority Encoding

Latent feature extraction:
Solid, but unknown, features: number of floors, surface area
Fuzzy and unknown features: sensitivity to humidity, behavior during holidays, ??? (something we didn't think of)

Anomaly Detection:
Individual Anomalies (for one building): boiler upgrade, 50%-off sale
Shared Anomalies (for multiple buildings): three-day weekends, snowstorms

Priority Encoding:
Accuracy may not be the most important thing. The peak is much more important than the total usage.
Peak prediction and discontinuous penalties: MSE = (1/n) * Σ_i (t_i - y_i)^2 doesn't work, because errors at the peak matter more for this prediction than errors made during hours that don't matter; you also don't want false-positive peaks.
Challenges: you need to align the predicted peak with the actual peak, but not just output a straight line at the peak value! The penalty also needs to be easy to optimize.
We don't have a solution for that yet...


Observing Home Wireless Experience through WiFi APs
Ashish Patro, Suman

A measurement infrastructure which can capture the "wireless experience" in home WLANs.
Questions asked:
1. What is the wireless performance at any given moment?
2. How often does poor performance occur?
3. What are the causes of poor performance?

Wise infrastructure (wireless infrastructure for in-line sensing)
Data collected:
Signal strength of devices, airtime utilization, latencies, WiFi packet statistics.
Time-correlation analysis
Characterizing wireless performance:
   metric: uses only passive and coarse local measurements, captures the impact of links, provides app-level analytics
    Input: airtime utilization, total contention, ????

Results:
Causes of poor performance:
In building 1, when performance is poor, 60% of the time we see high airtime utilization and high packet loss, which indicates channel congestion.
In building 2, when performance is poor, latency is high and packet loss is high, indicating a weak signal source.

Interference from microwave ovens:
The interference is short, but present in most buildings

















New Encryption Primitives for Uncertain Times

Encryption is one of the few things that you can rely on.

But existing encryption primitives aren't always helping:
DPI can block protocols (like Tor) --> encryption tools are easy to detect ---> format-transforming encryption
Dropbox has access to your data --> encryption doesn't allow de-duplication to save space --> message-locked encryption
Password-based encryption often gets bad passwords ---> decryption yields garbage for a wrong key but meaningful output for the correct key, which makes brute-force attacks easy --> honey encryption

1. Encryption doesn't try to hide its presence
    Steganography: embed bits into HTTP messages; but too slow
    Obfsproxy: encrypts all bits sent over the network, but DPI will flag the traffic as "????"
    Want: trick DPI into thinking it is HTTP traffic
     Can we build encryption which can fool regular-expression matching? ---> Format-transforming encryption: take clear text and a regular expression R, and the encrypted text is guaranteed to match R.
      Mechanism: ranking for DFAs, but now for regexes (exponential complexity, but it's OK here); a toy illustration of ranking follows this item.
      Can DPI adapt to detect FTE?
       1. Use R1, R2 against FTE --> false positives
       2. Find an R that matches FTE but not legitimate traffic --> fast to change FTE formats
       3. Find non-regular checks ---> speed?
       4. Custom fight-back? ----> ???

     FTE beats China's blocking of Tor, but the Tor folks are skeptical about deploying an "academic" product (as if they weren't an academic product themselves...)
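
A toy illustration of the "ranking" idea mentioned above: map the ciphertext, viewed as an integer, to the i-th string of a regular language and back. Real FTE ranks arbitrary regexes via their DFAs; here the "language" is just fixed-length hex strings, so ranking degenerates to base conversion. All names and constants are my own.

```python
# Toy rank/unrank over the language [0-9a-f]{32}: ciphertext bits -> a string that
# "matches the regex", and back again without loss.
ALPHABET = "0123456789abcdef"
LENGTH = 32

def unrank(index: int) -> str:            # integer -> format-conforming string
    chars = []
    for _ in range(LENGTH):
        index, rem = divmod(index, len(ALPHABET))
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))

def rank(s: str) -> int:                  # string -> integer (inverse of unrank)
    index = 0
    for ch in s:
        index = index * len(ALPHABET) + ALPHABET.index(ch)
    return index

ciphertext_bits = int.from_bytes(b"\x13\x37" * 8, "big")   # stand-in 128-bit ciphertext
wire_format = unrank(ciphertext_bits)     # looks like plain hex on the wire
assert rank(wire_format) == ciphertext_bits
```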

2. De-duplication doesn't work with conventional client-side encryption
    Prior work (in the distributed-systems literature, surprisingly): convergent encryption (CE); a toy sketch of CE follows below.
     Their solution:
        M --> KeyGen(M) ---> K (which is then used for encryption: Enc(K, M) ---> C)
     Big idea: the message itself determines the key, so identical messages are "shared encrypted"
      But it has a brute-force attack problem.
       DupLESS is built to overcome the brute-force attack issue of MLE
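
A toy version of convergent encryption: the key is derived from the message itself, so identical plaintexts encrypt to identical ciphertexts and the server can deduplicate them. The hash-counter keystream below is purely illustrative and not secure; this is my own sketch, not the talk's construction.

```python
# Toy convergent encryption: K = H(M), so equal files yield equal ciphertexts.
import hashlib

def keygen(message: bytes) -> bytes:
    return hashlib.sha256(message).digest()          # key derived from the message

def _keystream(key: bytes, length: int) -> bytes:
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, message: bytes) -> bytes:
    return bytes(m ^ k for m, k in zip(message, _keystream(key, len(message))))

decrypt = encrypt                                     # XOR stream is its own inverse

m = b"same file on two machines"
assert encrypt(keygen(m), m) == encrypt(keygen(m), m)  # identical ciphertexts dedup
```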

   Q: If we store the same data, then I know that you know (that I know...)?
   A: Yes, but this is an orthogonal (side-channel) issue to what we are looking at.


3. Password-based encryption -> people don't pick good passwords; you need to assume the password is drawn from a small set
  
message + secret key K ---> ciphertext C. A brute-force attack is possible by iterating over K (only one decrypted text is not garbage).
Prior solution: the PKCS#5 standard, which derives the key with lots of hashing, so every key guess is slow for the attacker; but it is slow for legitimate users too, and it only slows down the previous attack by a constant factor.
Idea: increase the difficulty of finding the right key by producing plausible output when the wrong key is used to decrypt.
Honey Encryption: same API as password-based encryption schemes, but uses special encodings to ensure that decrypting a ciphertext with a *wrong* key yields a fresh sample from some target distribution (e.g., something that looks like natural language).
Honey encryption for prime numbers: P (a prime number) + K ---> C. An attacker can just try decrypting with each key and look for prime numbers. Now you want an encoding so that every decryption of C yields a prime number.
           P --> (distribution-transforming encoder) ---> uniform bit-string ---> conventional encryption with K --> C
           Attacker: C --> decrypt with K' ---> fresh uniform bit-string ---> distribution-transforming decoder ---> P'
      No attacker A can recover the message with probability better than 1/q (q being the size of the set the message/password is drawn from), no matter how long the attack runs.
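
A toy honey-encryption sketch over a tiny, uniform message set: a distribution-transforming encoder (DTE) maps each message into its slot of the seed space, and decrypting with a wrong key lands in some other slot, yielding a plausible message rather than garbage. Entirely my own illustration of the idea, with made-up names and sizes.

```python
# Toy honey encryption: wrong keys decode to plausible messages, not garbage.
import hashlib, secrets

MESSAGES = ["alpha", "bravo", "charlie", "delta"]    # toy message space, uniform
SEED_BITS = 16
SLOT = (1 << SEED_BITS) // len(MESSAGES)

def dte_encode(msg: str) -> int:                      # message -> uniform-ish seed
    return MESSAGES.index(msg) * SLOT + secrets.randbelow(SLOT)

def dte_decode(seed: int) -> str:                     # seed -> message
    return MESSAGES[min(seed // SLOT, len(MESSAGES) - 1)]

def pad(key: str) -> int:                             # key-derived one-time pad
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:2], "big")

def encrypt(key: str, msg: str) -> int:
    return dte_encode(msg) ^ pad(key)

def decrypt(key: str, ct: int) -> str:
    return dte_decode(ct ^ pad(key))

ct = encrypt("correct horse", "charlie")
print(decrypt("correct horse", ct))   # 'charlie'
print(decrypt("wrong guess", ct))     # some message from MESSAGES, not garbage
```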

Hopefully next year we will have a general encryption scheme which covers all three of the above (because they are based on similar ideas anyway).

Q: Have you looked at other real-world security problems that you think could use an interesting encryption scheme, but you don't yet know what that encryption scheme is?
A: Yes, all the time. Like Gmail encryption: you lose spam detection, and we don't know how to solve that. In general, if you want your data to be safe, but you also want a third party to do something with your data (de-duplication, spam detection), you have a problem.

SyNS'13 session 3: Fast & Safe

How is My Driving: Sensing Driving Behaviors by Android Devices
Lei Kang, Suman

Snapshot: a hardware solution to count driving miles and hard brakes (speed decrease in mph per second). 30% discount for "good drivers".

No-special-hardware approach:
A rating system based on sensors (accelerometer and gyroscope for acceleration, braking, turning, lane changes).
Keep track of traffic and road conditions.

Challenges:
  1. People put their phones in different locations
      Solution: coordinate projection to translate phone coordinates in any orientation into earth coordinates (see the sketch after this list).

  2. Hard to define good or bad behavior
      Solution: compare the system's rating with passenger ratings
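
A sketch of the coordinate-projection step from challenge 1: rotate accelerometer readings from the phone's frame into an earth-aligned frame so braking/turning can be measured regardless of how the phone is mounted. Real systems use gravity plus the magnetometer (e.g., Android's rotation matrix); this toy uses gravity only, so the horizontal heading axes are arbitrary. All names are my own.

```python
# Project phone-frame acceleration into an earth-aligned frame using gravity only.
import numpy as np

def phone_to_earth(accel_phone: np.ndarray, gravity_phone: np.ndarray) -> np.ndarray:
    z = gravity_phone / np.linalg.norm(gravity_phone)        # earth vertical axis
    # Build any horizontal axis orthogonal to z (true heading needs a magnetometer,
    # so x/y here are arbitrary but consistent).
    ref = np.array([1.0, 0.0, 0.0]) if abs(z[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    x = np.cross(ref, z); x /= np.linalg.norm(x)
    y = np.cross(z, x)
    R = np.vstack([x, y, z])                                  # rows are earth axes
    return R @ accel_phone                                    # acceleration in earth frame

# Toy reading: phone lying flat, braking decelerates along the phone's y axis.
print(phone_to_earth(np.array([0.0, -2.5, 9.8]), np.array([0.0, 0.0, 9.8])))
```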

Related work:
Driver phone detection by car speakers (MobiCom '11)
Driver phone detection by different centripetal-force readings on the accelerometer (MobiSys '13)
Distracted-driver detection by the mobile phone camera (MobiSys '13)

Q: Would immediate feedback be useful?
A: That would be real-time detection of driving behavior. What we are doing is an offline study, but an online study wouldn't be difficult.


Rinnegan: Efficient Resource Use for Heterogeneous Processors
Sankar, Mike Swift

Task scheduling: a task can be run on different processing units (parallel work on the GPU or CPU, encryption on the CPU or an AES unit). It is not always good to use the GPU because of contention and data-transfer overhead.
Power is limited: not all units can be used at the highest performance. We cap power usage pretty well now, but we don't distribute power smartly across the different units.

Rinnegan:
 Adaptive task scheduling, power awareness.
1. Adaptivity is achieved by the OpenKernel, which exposes resource utilization and allows applications to make task-placement decisions. Don't do it transparently in the kernel, because the application knows best, and accelerators can be accessed directly by apps.
    Implemented by an accelerator stub, which predicts utilization; an accelerator agent, which implements scheduling decisions, enforces power limits, and exposes utilization information; and an accelerator monitor, which publishes the information exposed by agents to applications.
2. Power abstraction
  Power credits are maintained by a central power center; a power agent calculates a power model and asks for power credits.
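
Roughly the kind of placement decision an application could make with the utilization and power information exposed by the OpenKernel; the cost model, names, and numbers below are all my own illustration, not Rinnegan's code.

```python
# Pick the processing unit with the lowest estimated finish time, within a power budget.

def estimated_finish_time(unit: dict, work_units: float, transfer_bytes: float) -> float:
    queue_wait = unit["utilization"] * unit["avg_task_time"]   # contention on the unit
    transfer = transfer_bytes / unit["copy_bw"]                # data-transfer overhead
    compute = work_units / unit["throughput"]
    return queue_wait + transfer + compute

def place_task(units: list[dict], work_units: float, transfer_bytes: float,
               power_budget_watts: float) -> dict:
    eligible = [u for u in units if u["power_draw"] <= power_budget_watts]
    return min(eligible, key=lambda u: estimated_finish_time(u, work_units, transfer_bytes))

units = [
    {"name": "cpu", "utilization": 0.2, "avg_task_time": 5.0, "copy_bw": float("inf"),
     "throughput": 1.0, "power_draw": 60.0},
    {"name": "gpu", "utilization": 0.7, "avg_task_time": 8.0, "copy_bw": 6e9,
     "throughput": 10.0, "power_draw": 150.0},
]
print(place_task(units, work_units=50.0, transfer_bytes=2e9, power_budget_watts=200.0)["name"])
```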