2014年4月10日星期四

How to Obfuscate Software

Sanjam Garg   IBM Watson

make computer programs "unintelligible" without affecting their functionality

What are we trying to hide (using an example)
what kind of sequence do obfuscation try to hide?
1. try to hide private key in a program?
2. patching software securely: don't want hackers to use patches to reverse engineer what the bug was
    --use obfuscated patch

Methods:
1. manual obscurity by programmer effort (not that difficult to decipher...)
2. Crypto Obfuscation (by just changing function names, add dummy variables etc...) -- doesn't hide the sequence inside the program
All ad-hoc. So they makes the hacker's mission hard, but not impossible

f[password, m](x) = m if x = password
                                 null otherwise
if you know the password, you should know what m is
if you don't know the password, m should also be hidden from you

New mathematical tools:
Multilinear Maps

Solved a few open problems:
1. Non-interactive Key Agreement
    from two party setting (1976) to a three party setting (2000)
    now for arbitrary number of parties (2013)
2. Functional Encryption for all Circuits
    Public-Private key,  but now want Google to check spam (give specialized SK' to Google instead of SK) --- Alice could just give the obfuscated program of spam checker to Google)
     Prior Work: very limited functions for which secret keys could be issued
     Now: All functions!
3. Witness Encryption
     encrypt to arbitrary recipient whoever with  a proof (or, a NP Statement)
     e.g.: twin prime conjecture, $1M to the solver put in a safe box
             whoever has a proof to twin prime conjecture should be able to use it to open that safe box
     Prior: don't know if possible or not
     Now: Solved!
And many others...(Remove random oracles,  properties of zero-knowledge)


How do we do it?
high level idea:
1. for every input x, there is a unique set of puzzle piece which reveals the output on input x
    every other combination reveals nothing
    two steps: program->unencrypted pieces->encrptyed pieces

e.g. a special function (witness encryption)
       G = (S, X) be a bipartite graph, b is {0, 1}
       S = {A,B,C,D,E,F}
       X = {1, 2, 3, 4, 5, 6, 7}
f[G,b](T) = b (T is an exact cover)
                   o otherwise
want: hide the bit b
Exact cover is NP-complete, obfuscator doesn't know the exact cover
security: G has no exact cover --> b is hidden (a bit simpler/weaker, but still no trivial as when you write the program, you don't know if G has exact cover or not)

Step 2: (their new encryption scheme)
an a level encryption scheme for Zp, prime p
have Enc1, Enc2, Enc3...Encn
this (new) encryption scheme has the following property:
hiding: Enci(x) hides x
multiplication: Enci(x)* Encj(y) = Enc(i+j)(x*y)
equality: Encn(x), Encn(y) can check if x = y

Step 1: Jigsaw Puzzle
for each vertex X, pick a random number ai
7-level Encryption scheme (based on what A, B, C, D...connect to)
  eA = Enc3(a1a4a7)
  eB = Enc2(a1a4)
  eC = Enc3(a4a5a7)
  eD = Enc3(a3a5a6)
  eE = Enc4(a2a3a6a7)
  eF = Enc2(a2a7)

e = Enc7(a1a2a3a4a5a6a7) if b = 0
      Enc7(r)  if b = 1

given three pieces eB, eD, eF (which is an exact cover)
eB*eD*eF = Enc2(a1a4)*Enc3(a3a5a6)*Enc2(a2a7) = Enc7(a1a2a3a4a5a6a7)
recovered!

what if G has no exact cover (remove edge B->a4 in G)
now eB chafes to Enc1(a1)
now you cannot recover based on multiplication.

Q: why multiplication instead of addition?
A. If addition, you can subtract. But division would be hard

Q: why is this more general than previous scheme?
A: Now when I encrypt, I don't know the key! I just know the key has certain properties
Q2: YES, but how does this enable you to do interesting things?
A: That's complicated...

Future work:
1. What (minimal) computational assumptions are needed for obfuscation?
2. How to make obfuscation more efficient? (Now it is very inefficient). Might be hard, but we may be able to simplify for specific applications.  Or maybe use trusted hardware.

Q: What are your computational assumptions?
A: It is complicated...=,=
Q: How inefficient?
A: Polynomial. power 7 or 8, I don't even want to mention the constant factors...



   

没有评论:

发表评论