Sanjam Garg, IBM Watson
make computer programs "unintelligible" without affecting their functionality
What are we trying to hide? (an example)
what kinds of secrets does obfuscation try to hide?
1. hiding a private key inside a program
2. patching software securely: we don't want hackers to use a patch to reverse engineer what the bug was
--use an obfuscated patch
Methods:
1. manual obscurity by programmer effort (not that difficult to decipher...)
2. automated obfuscation tools (just changing function names, adding dummy variables, etc.) -- these don't hide the secrets inside the program
Both are ad hoc: they make the hacker's mission harder, but not impossible
f[password, m](x) = m if x = password
null otherwise
if you know the password, you should know what m is
if you don't know the password, m should also be hidden from you
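Point functions are actually one of the few cases where a simple obfuscation is folklore: store a salted hash of the password, and mask m with a pad derived from the password. A minimal sketch of that idea (my own illustration, assuming the hash behaves like a random oracle; this is not a construction from the talk):

```python
# Sketch: obfuscate f[password, m] by keeping only a salted hash of the
# password plus m masked with a password-derived pad. m must be <= 32 bytes
# here, since the pad comes from a single SHA-256 digest.
import hashlib, secrets

def obfuscate(password: bytes, m: bytes):
    salt = secrets.token_bytes(16)
    check = hashlib.sha256(salt + b"check" + password).digest()
    pad = hashlib.sha256(salt + b"pad" + password).digest()[:len(m)]
    masked = bytes(a ^ b for a, b in zip(m, pad))
    return (salt, check, masked)      # reveals nothing about m without password

def run(obf, x: bytes):
    salt, check, masked = obf
    if hashlib.sha256(salt + b"check" + x).digest() != check:
        return None                   # wrong password -> null
    pad = hashlib.sha256(salt + b"pad" + x).digest()[:len(masked)]
    return bytes(a ^ b for a, b in zip(masked, pad))   # right password -> m
```

General programs have no such easy trick; that's where the tools below come in.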
New mathematical tools:
Multilinear Maps
Solved a few open problems:
1. Non-interactive Key Agreement
from the two-party setting (Diffie-Hellman, 1976) to the three-party setting (Joux, 2000, using bilinear pairings)
and now for an arbitrary number of parties (2013, using multilinear maps)
2. Functional Encryption for all Circuits
Public/private-key encryption, but now Alice wants Google to filter spam for her: give Google a specialized key SK' (instead of SK) that reveals only whether a message is spam --- with obfuscation, Alice could just give Google an obfuscated program that decrypts and runs the spam checker
Prior work: secret keys could be issued only for very limited classes of functions
Now: All functions!
3. Witness Encryption
encrypt to any recipient who holds a witness for an NP statement (a proof)
e.g.: the twin prime conjecture: put $1M for the solver in a safe box
whoever has a proof of the twin prime conjecture should be able to use it to open that safe box
Prior: it was not even known whether this is possible
Now: Solved!
And many others... (removing random oracles, new properties of zero-knowledge)
How do we do it?
high level idea:
1. for every input x, there is a unique set of puzzle pieces which together reveal the output on input x
every other combination of pieces reveals nothing
two steps: program -> unencrypted puzzle pieces -> encrypted puzzle pieces
e.g. a special function (witness encryption)
let G = (S, X) be a bipartite graph (S: sets, X: elements), and b ∈ {0, 1}
S = {A,B,C,D,E,F}
X = {1, 2, 3, 4, 5, 6, 7}
f[G,b](T) = b if T is an exact cover
0 otherwise
want: hide the bit b
Exact cover is NP-complete, and the obfuscator doesn't know an exact cover
security: if G has no exact cover --> b is hidden (a bit simpler/weaker than full obfuscation, but still not trivial: when you write the program, you don't know whether G has an exact cover or not)
Step 2: (their new encryption scheme)
an n-level encryption scheme for Zp, p prime
have Enc1, Enc2, Enc3...Encn
this (new) encryption scheme has the following property:
hiding: Enci(x) hides x
multiplication: Enci(x)* Encj(y) = Enc(i+j)(x*y)
equality: given Encn(x) and Encn(y), anyone can check whether x = y
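To make those three laws concrete, here is a toy (and totally insecure) Python model of the interface: an encoding is just a (level, value mod p) pair, so "hiding" is not modeled at all; it only demonstrates the multiplication and top-level equality rules.

```python
# Toy, insecure model of the n-level scheme: the value is stored in the clear,
# so this only illustrates the interface, not the security.
P = 2**61 - 1      # an arbitrary prime for the toy model
N = 7              # top level (7 elements in the example below)

class Enc:
    def __init__(self, level, value):
        self.level, self.value = level, value % P
    def __mul__(self, other):
        # multiplication law: Enc_i(x) * Enc_j(y) = Enc_{i+j}(x*y)
        return Enc(self.level + other.level, self.value * other.value)
    def eq(self, other):
        # the equality test is only available at the top level N
        assert self.level == other.level == N
        return self.value == other.value
```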
Step 1: Jigsaw Puzzle
for each element i in X, pick a random number ai
use the 7-level encryption scheme: each set's piece encrypts the product of the ai for the elements it connects to, at a level equal to how many elements that is
eA = Enc3(a1a4a7)
eB = Enc2(a1a4)
eC = Enc3(a4a5a7)
eD = Enc3(a3a5a6)
eE = Enc4(a2a3a6a7)
eF = Enc2(a2a7)
e = Enc7(a1a2a3a4a5a6a7) if b = 0
Enc7(r) for a random r if b = 1
given the three pieces eB, eD, eF (which correspond to an exact cover):
eB*eD*eF = Enc2(a1a4)*Enc3(a3a5a6)*Enc2(a2a7) = Enc7(a1a2a3a4a5a6a7)
compare this with e using the level-7 equality test: b recovered!
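Continuing the toy Enc model sketched above, the whole jigsaw puzzle for this graph fits in a few lines (again my illustration, not the real scheme):

```python
import random
a = {i: random.randrange(1, P) for i in range(1, 8)}   # a_1 ... a_7

def piece(elems):        # encode a set at level = number of elements it covers
    v = 1
    for i in elems:
        v = v * a[i] % P
    return Enc(len(elems), v)

eB, eD, eF = piece({1, 4}), piece({3, 5, 6}), piece({2, 7})
prod_all = 1
for i in range(1, 8):
    prod_all = prod_all * a[i] % P
e = Enc(7, prod_all)                                   # the b = 0 case
print((eB * eD * eF).eq(e))                            # True: b recovered
```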
what if G has no exact cover? (remove the edge from B to 4 in G)
now eB changes to Enc1(a1)
and now no product of pieces yields Enc7(a1a2a3a4a5a6a7), so the equality test with e never succeeds and b stays hidden
Q: why multiplication instead of addition?
A: If the scheme were additive, you could subtract encodings; with multiplication, division is hard
Q: why is this more general than previous scheme?
A: Now when I encrypt, I don't know the key! I just know the key has certain properties
Q2: YES, but how does this enable you to do interesting things?
A: That's complicated...
Future work:
1. What (minimal) computational assumptions are needed for obfuscation?
2. How to make obfuscation more efficient? (Now it is very inefficient). Might be hard, but we may be able to simplify for specific applications. Or maybe use trusted hardware.
Q: What are your computational assumptions?
A: It is complicated...=,=
Q: How inefficient?
A: Polynomial, but something like n^7 or n^8, and I don't even want to mention the constant factors...
Thursday, April 10, 2014
Friday, April 4, 2014
Strata: High-Performance Scalable Storage on Virtualized Non-volatile Memory
Dutch Meyer, Coho Data
(An extension to the FAST'14 talk)
Actually I think this is the most SDS-like (software-defined storage) work I have ever heard of. They deal with SDN-storage interaction too
Goal: take high-end flash memory, add all the enterprise features, and sustain the high-end performance as much as possible (sustaining all of it is not possible: the flash is so fast that everything you put on top of it becomes a problem)
PCI-based Flash: a first step toward a real use of non-volatile memory
Problem:
1. It is really fast, and if you put it in a system, something will break and you won't get the raw performance.
-- prioritize device utilization
2. Hardware evolves really quickly, so need to abstract it in some way
-- virtualize, scale out
3. Have to play well with others (dealing with enterprise, you have to support old protocols, various hardware, different application domains, etc...)
Ideas:
high-end flash now looks a lot like a CPU: fast, expensive, mostly idle. So we want to virtualize it the way we virtualize CPUs.
Actually flash is frequently bottlenecked on the CPU (if you add more cores, you might get an 80% performance boost). So you really need to balance the CPU, the network, and the flash
architecture 1. virtualization:
each flash device is organized as a log-structured file system of update records
records form objects, and objects are organized B+-tree style; eventually it is all exposed as one big dumb address space
architecture 2. flexibility
inspired by Click: requests pass through a pipeline of data paths; at each stage of the pipeline, you mutate the request (or pass it along...). That's how they handle replication, striping, etc. (This is kind of SDS-like, isn't it... see the sketch below.)
load balancing with data paths: just by chaining data paths
from data paths to protocols: a library which binds to different kinds of front ends (NFS, FUSE, MySQL). They also allow you to push functionality down into the data path...
They ship with an SDN switch... (and a lot of other stuff their box works with...)
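A tiny sketch of the Click-style data-path idea mentioned above (all names here are made up for illustration; this is not Strata's actual API):

```python
from typing import Callable, Dict, List

Request = Dict            # e.g. {"op": "write", "obj": 7, "data": b"..."}
Stage = Callable[[Request], Request]

def chain(stages: List[Stage]) -> Stage:
    """A data path is a chain of stages; each may mutate or pass the request."""
    def run(req: Request) -> Request:
        for stage in stages:
            req = stage(req)
        return req
    return run

def replicate(targets: List[str]) -> Stage:
    def stage(req: Request) -> Request:
        if req.get("op") == "write":
            req["replicas"] = list(targets)    # fan the write out to replicas
        return req
    return stage

data_path = chain([replicate(["nodeA", "nodeB"])])
print(data_path({"op": "write", "obj": 7}))    # replicas field added
```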
Some technical problems:
1. supporting NFSv3 in a scale-out architecture (they use SDN scaling: the switch acts as a load balancer by pushing rules into it; they are limited by the size of the rule tables though)
2. remote replication in mutable trees
Future work:
1. they focused on utilization, but QoS is important too
2. tiering and prefetch
3. next year's flash is 2x better
Q&A:
Q: How do you analyze where you bottleneck? Where do you bottleneck?
A: The B-tree, uncontested locks (because of cache coherence), and almost everything else. Doing the analysis is really hard; instrumentation and tracing help a lot. Then we come up with a theory and try to test it.
Wednesday, April 2, 2014
Google glasses, memory and computational neurology
FIXME:
write it...
(especially how brain and other facilities interact)
Thursday, March 27, 2014
A new class of bugs
classify bugs, then build an ecosystem (tools, systems, languages) for each bug class
see also:
Toward Optimization-Safe Systems: Analyzing the Impact of Undefined Behavior (SOSP'13 best paper)
http://yangsuli.blogspot.com/2013/11/sosp13-papers.html
e.g.:
buffer overflow:
tools: Purify, Valgrind
Systems: ASLR
Languages: Java
A new class: unstable code
state of the art: turn off optimizations when you see weird bugs
How prevalent?
How to think about it?
How to detect?
undefined behavior leads to unstable code
example: if (buf + off < buf) -- gcc compiles this to if (false), because pointer overflow is undefined behavior (a well-defined alternative is to compare as unsigned integers, e.g. (uintptr_t)buf + off < (uintptr_t)buf)
also: signed integer overflow, oversized shift, null pointer dereference, absolute value overflow
code may or may not work
How prevalent?
all major compilers, across major languages (they have a nice chart in the paper), silently discard unstable code (each in different ways), so changing or upgrading a compiler may break a working system
compilers optimize more aggressively over time (which makes the problem worse)
How to think about it?
first approach: treat unstable code as dead code
problem: tied to one particular compiler, and lots of false warnings, since compilers kill code all the time for legitimate reasons
second approach: treat it as bug patterns
problem: like antivirus signatures, inherently incomplete
third approach: treat it as undefined behavior
problem: too many false positives
Final approach:
root cause: the programmer and the compiler disagree on the spec (undefined behavior)
Formulation overview:
the disagreement delta: the compiler assumes the program never invokes undefined behavior
what can be done only with delta: killing unstable code
two-round simulation: query once without delta and once with delta, and see if the answers differ
i.e., if the SAT oracle answers No the first time (cannot eliminate) and Yes the second time (can eliminate), the code is unstable.
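In pseudo-Python, the two-round check looks something like this (solver.can_eliminate is a hypothetical interface I made up to show the structure; STACK's real implementation differs):

```python
def is_unstable(e, solver, delta):
    # Round 1: without assuming anything, can the optimizer eliminate e?
    eliminable_plain = solver.can_eliminate(e, assumptions=[])
    # Round 2: assuming delta ("the program never invokes undefined
    # behavior"), can the optimizer eliminate e now?
    eliminable_delta = solver.can_eliminate(e, assumptions=[delta])
    # Eliminable only under delta => programmer and compiler disagree,
    # i.e., e is unstable code.
    return (not eliminable_plain) and eliminable_delta
```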
Formulate delta:
reach(e, in)
undef(e, in)
delta(in) = for every e: reach(e, in) -> not undef(e, in)
SAT oracle: (the Boolector solver)
false negatives and positives:
1. phase 1 not powerful enough -> false errors (false positives)
2. phase 2 not powerful enough -> false negatives
How to detect:
STACK: unstable code checker
challenge: the formulation above inspects the entire program, which would lead to a gigantic boolean predicate
solution: per-function analysis and approximation
Summary:
unstable code leads to subtle bugs
language designers: be cautious about undefined behavior
compiler writers: use our techniques to generate better warnings
programmers: check your code using STACK...
Research approach: (I like this!)
identify systems problems
derive general solutions
build practical systems/tools
e.g. GUI not-responding (as graph reachability problem)
replay (as graph cut problem)
killing bugs: theorem proving (CompCert: formally verified C compiler, CACM'09; seL4: formally verified OS kernel, but that's only 8,700 LOC)
identify subcomponents in the Linux kernel to verify
Q&A:
Q: Is this just because the C language made certain decisions?
A: It is more a question of: could you design a standard which doesn't have this problem? C developers actually have incentives to keep signed integer overflow undefined. If you wanted to define buffer overflow behavior, you would have to instrument every memory write, and then you need a GC; you would have just turned C into Java...
Q: Is discarded code the only consequence of undefined behavior? Why did you choose to focus on code discarding?
A: An interesting direction to pursue...We focused on this because that's what got our attention.
Q: Do higher-level languages have less undefined behavior? Is that due to the nature of being high-level, or just because they are newer and thus better designed?
A: Yes. Also, unstable code is (kind of) tolerable in application-level code, much less so in systems-level code.
Q: Other undefined behavior like function argument evaluation order, sequence points?
A: I feel that is less interesting because it can be handled in the compiler front end. Also, you could statically check for it in your source code. There are also C variants (like Clight) without sequence-point problems.
Friday, March 7, 2014
Greening Datacenters Through Self-Generation of Renewable Energy
Thu Nguyen, Rutgers
(Software defined datacenter power management. And they build a real system: hardware + software on a small size datacenter)
Motivation:
Datacenters use lots of energy... and most of it is consumed by small datacenters.
Can we use renewables?
Approaches:
1. Buy renewable energy from off-site power plants (Google buys wind power from NextEra) --- not stable enough, and there are transmission losses.
2. Self-generation with co-location (Apple built a 40MW solar array in NC) --- one location may not be ideal for both the DC and the power plant.
Trends in solar energy:
Current PV efficiency (solar energy to electricity) is ~15%, expected to grow to ~40% in 10 years.
Capacity efficiency (this largely depends on location) is ~15-25%.
The cost of installing a solar energy system is decreasing over time, expected to go down by 40%-60% in 10 years (DOE data).
Main Challenge: power supply is variable, and it may not match your workload consumption
Idea: Match the workload to the energy supply, instead of matching the energy supply to workload demands.
(And you can use net metering to sell the power your battery can't store back to your grid electricity provider)
Solution:
Hardware (Parasol):
Steel structure on the roof + Backup power (battery + grid) + IT equipment + Cooling
(Actually they don't need that much cooling, even though cooling equipment is a big energy consumer)
Software:
Predict energy availability from weather forecasts, and schedule the Hadoop workload accordingly: run jobs on renewable energy first, then low-price grid electricity, then peak-price electricity, as long as job completion deadlines can still be met.
Also, make sure we don't draw a lot of grid power at any particular time (high peak usage adds to the bill!)
GreenSwitch: keeps track of jobs, servers on/off, battery lifetime management, peak-time power, and total power consumption, then outputs an energy-source schedule and a workload schedule (a toy sketch of the policy follows).
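A toy greedy sketch of that scheduling policy as I understood it (my own illustration, not GreenSwitch's actual algorithm):

```python
def schedule(jobs, slots):
    """jobs: (name, energy_needed, deadline_slot); slots[t][tier] = energy left."""
    TIERS = ("solar", "offpeak_grid", "peak_grid")     # cheapest source first
    plan = []
    for name, need, deadline in sorted(jobs, key=lambda j: j[2]):  # EDF-ish
        for tier in TIERS:
            for t in range(deadline + 1):
                take = min(need, slots[t][tier])
                if take > 0:
                    slots[t][tier] -= take
                    need -= take
                    plan.append((name, t, tier, take))
                if need == 0:
                    break
            if need == 0:
                break
        if need > 0:
            raise RuntimeError(f"{name} misses its deadline")
    return plan

# e.g. two time slots, one job needing 5 units of energy by slot 1:
print(schedule([("hadoop-job", 5, 1)],
               [{"solar": 3, "offpeak_grid": 2, "peak_grid": 9},
                {"solar": 0, "offpeak_grid": 4, "peak_grid": 9}]))
```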
Results:
Parasol (using solar energy) without the software: 60% cost savings, amortized in 7 years.
Parasol + GreenSwitch: 75% cost savings, but the batteries are too expensive to amortize...
Parasol + GreenSwitch + Deferrable Workloads: 96% cost saving, solar + battery amortized in 7.6 years.
(But they did the experiment in summer, so might be too optimistic.....)
Aside:
They have a year of datacenter energy-profile data (for every minute: the power at each inverter, the temperature, the humidity, etc.)
Wednesday, February 19, 2014
Argon: performance insulation for shared storage servers
Approach:
1. time slice based
2. Increase prefetch/write-back sizes to insulate streaming efficiency from the disk seeks introduced by competing workloads
3. Cache partition
Interference causes:
Disk interference, cache interference
Metric:
the fraction of its standalone throughput (i.e., what a workload achieves when it has the storage system to itself, scaled by its share of the service) that it actually achieves (the R value)
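My reading of that metric, as a worked check (illustration only, not Argon's code):

```python
# A workload with share `phi` of the server should achieve at least
# R * phi * its standalone throughput (R is a configurable target, e.g. 0.9).
def meets_r_value(t_shared, t_alone, phi, R=0.9):
    return t_shared >= R * phi * t_alone

print(meets_r_value(t_shared=18.0, t_alone=100.0, phi=0.2))  # True (18 >= 18)
```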
Wednesday, February 5, 2014
Some disk accounting related stuff
Trading Capacity for Performance in a Disk Array (OSDI'00)
Takeaway:
1. One can derive analytical models of disk performance (latency, throughput, etc.) from disk and workload characteristics (rotational time, seek time, busyness of the system/queue size, read/write ratio, seek locality) and from data placement (striping, replication, etc.); see the back-of-envelope version after this list.
2. These models could be used to guide scheduling, and maybe accounting, by actually predicting what's going to happen when serving these requests.
3. Current head position, physical track layout etc., could be found by issuing requests to the disk.
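A back-of-envelope version of such a model (my illustration of the idea, not the paper's model): per-request service time is roughly seek time plus expected rotational delay plus transfer time.

```python
# Rough per-request disk service time in milliseconds.
def service_time_ms(seek_ms, rpm, bytes_xfer, mb_per_s):
    rotational_delay = 0.5 * 60_000 / rpm        # expected half rotation, ms
    transfer = bytes_xfer / (mb_per_s * 1000)    # bytes per ms at MB/s rate
    return seek_ms + rotational_delay + transfer

# e.g. 4 ms seek, 7200 RPM, 64 KB transfer at 100 MB/s -> ~8.8 ms
print(service_time_ms(seek_ms=4.0, rpm=7200, bytes_xfer=64_000, mb_per_s=100))
```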
Robust, Portable I/O Scheduling with the Disk Mimic (ATC'03)
by Remzi et al.
Takeaway:
1. Online disk service-time prediction based on a more abstract model: instead of attempting to simulate the mechanisms or components internal to the disk, simply reproduce its output as a function of the observed inputs.
2. Need to reduce the input space: the most important inputs are just the logical distance and the request type (see the sketch after this list).
3. This method accounts for most of the disk's latency, but only in a probabilistic manner. And prediction accuracy doesn't necessarily translate into scheduling performance.
4. As with most papers from Remzi's group, it has a beautiful evaluation section where they really explore every possible configuration of their solution and understand what is going on.
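A toy sketch of that table-driven idea (my illustration of the approach, not the paper's code): key observed service times on (request type, logical distance), then predict by lookup.

```python
from collections import defaultdict

class DiskMimic:
    def __init__(self, bucket=1 << 16):
        self.bucket = bucket                 # coarsen distances into buckets
        self.table = defaultdict(list)
    def observe(self, op, distance, service_time):
        self.table[(op, distance // self.bucket)].append(service_time)
    def predict(self, op, distance):
        samples = self.table.get((op, distance // self.bucket))
        if not samples:
            return None                      # no data yet for this bucket
        return sum(samples) / len(samples)   # mean of past observations

dm = DiskMimic()
dm.observe("read", 120_000, 4.2)
dm.observe("read", 130_000, 4.6)
print(dm.predict("read", 125_000))           # 4.4 (same distance bucket)
```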
Argon: performance insulation of shared storage servers (FAST'07)
from CMU
Takeaway:
1. The primary interference in a shared storage system comes from disk head time and cache space, which cannot easily be insulated.
2. They use prefetching/write-back to ensure high disk-head efficiency (which is kind of orthogonal to insulation), then cache partitioning and time slicing for insulation, with the parameters of all three techniques dynamically adjusted by analytic models.
3. They say that trade-off between efficiency and latency is fundamental.
4. Using feedback for scheduling is difficult here because you don't know what to expect due to the variability of performance from different workload, so a theoretical prediction could help here.