Yang Suli的Blog: 四月 2014

2014年4月25日星期五

为什么女性缺少大师

女性争取到选举权有100多年的历史
女性外出参加工作有100多年的历史

中国女性普遍接受教育有60余年
中国男权社会有4000多年的历史
60年的时间内，专家呼吁男孩危机，号召高知女性回归家庭，鼓吹剩女舆论

是的，从古至今，艺术文学各各领域很少出现大师级别的女性。因为她们不是人，根本没有被当做一个人看待。她们只是附庸品，被侮辱被损害被踩到脚底被当做商品任意交换的物品。即使是妻子，也可以被丈夫自由转卖。

基督教曾经郑重讨论女性是否有灵魂，佛教认为女性是污秽的，没有进入极乐世界的资格，必须投胎转世成男性才能得到救赎。

这些年来女性在社会中占的份量确实比以前高了很多，但是相应而来的也是男性对此的反应：各种制约女性地位、控制女性发展的或成文或不成文的措施。

女性真正成为“人”只有一百年历史，100年女性崛起的历史对抗5000年的男权社会，她们还没有积蓄足够的力量，所以处于顶尖层次的女性大师极其稀少，这是现实，也是历史。

2014年4月10日星期四

How to Obfuscate Software

Sanjam Garg IBM Watson

make computer programs "unintelligible" without affecting their functionality

What are we trying to hide (using an example)
what kind of sequence do obfuscation try to hide?
1. try to hide private key in a program?
2. patching software securely: don't want hackers to use patches to reverse engineer what the bug was
--use obfuscated patch

Methods:
1. manual obscurity by programmer effort (not that difficult to decipher...)
2. Crypto Obfuscation (by just changing function names, add dummy variables etc...) -- doesn't hide the sequence inside the program
All ad-hoc. So they makes the hacker's mission hard, but not impossible

f[password, m](x) = m if x = password
null otherwise
if you know the password, you should know what m is
if you don't know the password, m should also be hidden from you

New mathematical tools:
Multilinear Maps

Solved a few open problems:
1. Non-interactive Key Agreement
from two party setting (1976) to a three party setting (2000)
now for arbitrary number of parties (2013)
2. Functional Encryption for all Circuits
Public-Private key, but now want Google to check spam (give specialized SK' to Google instead of SK) --- Alice could just give the obfuscated program of spam checker to Google)
Prior Work: very limited functions for which secret keys could be issued
Now: All functions!
3. Witness Encryption
encrypt to arbitrary recipient whoever with a proof (or, a NP Statement)
e.g.: twin prime conjecture, $1M to the solver put in a safe box
whoever has a proof to twin prime conjecture should be able to use it to open that safe box
Prior: don't know if possible or not
Now: Solved!
And many others...(Remove random oracles, properties of zero-knowledge)

How do we do it?
high level idea:
1. for every input x, there is a unique set of puzzle piece which reveals the output on input x
every other combination reveals nothing
two steps: program->unencrypted pieces->encrptyed pieces

e.g. a special function (witness encryption)
G = (S, X) be a bipartite graph, b is {0, 1}
S = {A,B,C,D,E,F}
X = {1, 2, 3, 4, 5, 6, 7}
f[G,b](T) = b (T is an exact cover)
o otherwise
want: hide the bit b
Exact cover is NP-complete, obfuscator doesn't know the exact cover
security: G has no exact cover --> b is hidden (a bit simpler/weaker, but still no trivial as when you write the program, you don't know if G has exact cover or not)

Step 2: (their new encryption scheme)
an a level encryption scheme for Zp, prime p
have Enc1, Enc2, Enc3...Encn
this (new) encryption scheme has the following property:
hiding: Enci(x) hides x
multiplication: Enci(x)* Encj(y) = Enc(i+j)(x*y)
equality: Encn(x), Encn(y) can check if x = y

Step 1: Jigsaw Puzzle
for each vertex X, pick a random number ai
7-level Encryption scheme (based on what A, B, C, D...connect to)
eA = Enc3(a1a4a7)
eB = Enc2(a1a4)
eC = Enc3(a4a5a7)
eD = Enc3(a3a5a6)
eE = Enc4(a2a3a6a7)
eF = Enc2(a2a7)

e = Enc7(a1a2a3a4a5a6a7) if b = 0
Enc7(r) if b = 1

given three pieces eB, eD, eF (which is an exact cover)
eB*eD*eF = Enc2(a1a4)*Enc3(a3a5a6)*Enc2(a2a7) = Enc7(a1a2a3a4a5a6a7)
recovered!

what if G has no exact cover (remove edge B->a4 in G)
now eB chafes to Enc1(a1)
now you cannot recover based on multiplication.

Q: why multiplication instead of addition?
A. If addition, you can subtract. But division would be hard

Q: why is this more general than previous scheme?
A: Now when I encrypt, I don't know the key! I just know the key has certain properties
Q2: YES, but how does this enable you to do interesting things?
A: That's complicated...

Future work:
1. What (minimal) computational assumptions are needed for obfuscation?
2. How to make obfuscation more efficient? (Now it is very inefficient). Might be hard, but we may be able to simplify for specific applications. Or maybe use trusted hardware.

Q: What are your computational assumptions?
A: It is complicated...=,=
Q: How inefficient?
A: Polynomial. power 7 or 8, I don't even want to mention the constant factors...

2014年4月4日星期五

Strata: High-Performance Scalable Storage on Virtualized Non-volatile Memory

Dutch Mayer, Coho Data

(An extension to the FAST'14 talk)
Actually I think it is the most SDS like work I ever heard about. They deal with SDN-storage interaction too

Goal: take high end flash memory, add all the enterprise features and sustain the high end performance as much as possible (not possible to sustain all because the flash is so fast, everything you put on top is problematic)

PCI-based Flash: a first step toward a real use of non-volatile memory
Problem:
1. It is really fast, and if you put it in a system, something will break and you won't get the raw performance.
-- prioritize device utilization
2. Hardware evolves really quickly, so need to abstract it in some way
-- virtualize, scale out
3. Have to play well with others (dealing with enterprise, you have to support old protocols, various hardware, different application domains, etc...)

Ideas:
high end flash now look a lot like CPUs: fast, expensive, mostly idle. So we want to virtualize it like we virtualize CPU.
Actually flash is frequently bottlenecked on CPU (if you add more cores, you might get 80% performance boost). So you really need to balance the cpu, the network and the flash

architecture 1. virtualization:
each flash organized as a log structure file system with updated records
records form objects, and objects are organized in a B+ tree style, eventually a big dumb address space

architecture 2. flexibility
inspired by Click: packets (requests) pass through pipeline, (data pahs) at each stage of the pipeline, you mutate it (or pass it...). That's how they handle replication, tripping, etc. (This is kind of SDS like, isn't it...)

load balancing with data paths: just by chaining data paths
from data paths to protocols: a library which binds to different kind of front end (NFS, fuse, mysql). They also allow you to push things down into the data path...

They ship with an SDN switch...(and a lot of other stuff which their box work with...)

Some technical problems:
1. support NFSv3 as a scale-out architecture (they use SDN Scaling, use the switch as a load balancer by pushing some rules. they are limited by the size of rule tables though)
2. remote replication in mutable trees

Future work:
1. they focused on utilization, bu QoS is important too
2. tiering and prefetch
3. next year's flash is 2x better

Q&A:
Q:How do you analyze where you bottleneck. Where do you bottleneck?
A: Btree, uncontested locks (because of cache coherence) and almost everything. Doing analysis is really hard, instrumentation and tracing help a lot. Then we come up with a theory and try test it.

2014年4月2日星期三

Google glasses, memory and computational neurology

FIXME:
write it...
(especially how brain and other facilities interact)

Yang Suli的Blog