Friday, November 1, 2013

External systems research at Google

Bradley Chen
Principal Engineer, Google

He had an amazing paper at SOSP'93 on the impact of OS structure on memory system performance, which really showed how you could do interesting, detailed measurements on systems.

Google refuses to put researchers in a lab; it puts them in the field instead.

A Ph.D. teaches you a systematic way to approach creating new science and to advance the state of the art.

Google's support for university research:
1. Faculty research program (for faculty)
2. Research fellow program (for Ph.D.s)
3. Own publications
4. Open source projects
5. Conference sponsorship

Some examples:
1. Fighting fraud on the internet (Stefan Savage, UCSD)
    Computer security is not just a technical problem; it is really about people (adversaries, victims, defenders). Thus socio-economic factors come in.
    E-mail spam: a complex value chain of relationships (email programs, DNS, web servers, shipping of goods/services, financial transactions, etc.). They measured all the parts and found that most resources are cheap and profitable, but merchant banks are very few (actually, 3). Result: targeted payment intervention. (Really cool!!!) Remzi said: don't they worry about the Russian guy showing up in their labs?!
    Q by Remzi: maybe it's a bigger problem if drugs are cheaper overseas? Then you cannot really stop the brokers.
    I need to read more about their methodology for doing this kind of measurement. They do collaborate with banks to track money flow, though.

2. Integrating Circuit Switching into the Datacenter (UCSD)
    Networks don't scale very well to datacenter size: as per-port bandwidth grows, each switch comes with fewer ports, so you need a deeper stack of switches to connect the same cluster.
    Key idea: hybrid circuit/packet networks. (Actually it's just an optical/packet switched network.)
    Mordia: circuit switching down to the ToR (top-of-rack) switch, with a fast control plane (from ~100 ms down to ~100 us) and a fast OCS (optical circuit switch).
    Mordia network model: Traffic Matrix (time). Advancing beyond the original hotspot scheduling, they make reconfiguration much faster through (1) faster hardware and (2) a faster way to observe the network using a matrix. A TM represents the traffic needs ---> TM' represents the bandwidth allocation, and TM' is then decomposed into a schedule (I didn't really understand how though....)
    Faster hardware: moving from a 3D mirror setup to a 2D mirror setup reduces the reconfiguration time to ~100 us.
    Remzi: Google has stopped saying in its papers that the network is the bottleneck (as earlier papers such as GFS did); doesn't that mean Google has new ways to do networking?
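To make the scaling argument in the circuit-switching notes concrete, here is a small back-of-the-envelope sketch. It assumes a fat-tree (folded Clos) built from identical switches, where n tiers of k-port switches can connect at most 2*(k/2)^n hosts, and a fixed per-switch capacity of 1280 Gb/s. Both the topology and the capacity figure are my own illustrative assumptions, not numbers from the talk.

```python
def max_hosts(ports, tiers):
    """Upper bound on hosts for a fat-tree with `tiers` levels of `ports`-port switches."""
    return 2 * (ports // 2) ** tiers


def tiers_needed(hosts, ports):
    """Smallest number of fat-tree tiers that can connect `hosts` hosts."""
    if ports < 4:
        raise ValueError("need at least 4 ports per switch to build a multi-tier tree")
    tiers = 1
    while max_hosts(ports, tiers) < hosts:
        tiers += 1
    return tiers


# Hypothetical fixed switching capacity per chip (assumption, in Gb/s).
SWITCH_CAPACITY_GBPS = 1280

if __name__ == "__main__":
    cluster_size = 10_000  # hypothetical number of hosts
    for port_speed in (10, 40, 100):
        ports = SWITCH_CAPACITY_GBPS // port_speed
        print(port_speed, "Gb/s ports:", ports, "ports per switch,",
              tiers_needed(cluster_size, ports), "tiers")
```

Under these assumptions, faster ports mean fewer ports per switch and therefore more tiers for the same cluster, which is the motivation for adding circuits.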
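On the step I didn't follow (decomposing TM' into a schedule): one standard way to do this kind of decomposition is a Birkhoff-von Neumann decomposition, which splits a matrix with equal row and column sums into weighted permutation matrices. Each permutation is one circuit configuration (every input port wired to exactly one output port) and its weight is how long to hold it. The sketch below is my own illustration under that assumption, using small integer matrices so the arithmetic is exact; the function names are hypothetical and it is not necessarily what Mordia implements.

```python
def find_perfect_matching(m):
    """Match every row to a distinct column using only positive entries.

    Returns match_col where match_col[c] is the row assigned to column c,
    or None if no perfect matching exists. Uses simple augmenting paths.
    """
    n = len(m)
    match_col = [-1] * n

    def try_row(r, seen):
        for c in range(n):
            if m[r][c] > 0 and not seen[c]:
                seen[c] = True
                if match_col[c] == -1 or try_row(match_col[c], seen):
                    match_col[c] = r
                    return True
        return False

    for r in range(n):
        if not try_row(r, [False] * n):
            return None
    return match_col


def decompose(allocation):
    """Split an allocation matrix into (duration, circuit) pairs.

    `allocation` must be a square matrix of non-negative integers whose rows
    and columns all have the same sum. circuit[r] is the output port that
    input port r is connected to for `duration` time slots.
    """
    n = len(allocation)
    total = sum(allocation[0])
    for i in range(n):
        if sum(allocation[i]) != total or sum(row[i] for row in allocation) != total:
            raise ValueError("rows and columns must all have the same sum")
    remaining = [row[:] for row in allocation]
    schedule = []
    while any(v > 0 for row in remaining for v in row):
        match_col = find_perfect_matching(remaining)
        if match_col is None:
            raise ValueError("no perfect matching; input is not decomposable")
        circuit = [0] * n
        for c, r in enumerate(match_col):
            circuit[r] = c
        # Hold this configuration until its smallest demand is fully served.
        duration = min(remaining[r][circuit[r]] for r in range(n))
        for r in range(n):
            remaining[r][circuit[r]] -= duration
        schedule.append((duration, circuit))
    return schedule


if __name__ == "__main__":
    tm_prime = [[2, 1, 0],
                [0, 2, 1],
                [1, 0, 2]]
    for duration, circuit in decompose(tm_prime):
        print(duration, circuit)
```

Each round zeroes at least one entry, so the loop finishes in at most n*n rounds. A raw traffic matrix usually does not have equal row and column sums, so it would first have to be scaled into one (the TM to TM' step); that scaling is left out of this sketch.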

3. Disciplined Approximate Computing.
    When you convert analog to digital, it is just an approximation. But you need to be aware of when approximation happens, instead of doing it recklessly.
    "Disciplined" approximate programming: decide when approximation can be used. They have language and architecture support, which allows you to specify when approximation can happen. A digital NPU: computes as a neuron.
    Results: 2x speedup, 3x energy reduction, and less quality loss.
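To illustrate what "specify when approximation can happen" might look like from the programmer's side, here is a toy sketch of my own. It marks values as approximate, lets approximation spread through arithmetic, and refuses to let an approximate value steer control flow unless the programmer explicitly signs off on it. The names `Approx` and `endorse` are illustrative assumptions, and the sketch only tracks where approximation is allowed; it does not actually run anything on approximate hardware or perturb any values.

```python
class Approx:
    """Wrapper marking a value as safe to compute approximately."""

    def __init__(self, value):
        self.value = value

    @staticmethod
    def _unwrap(other):
        return other.value if isinstance(other, Approx) else other

    # Any arithmetic that touches an approximate value yields an approximate value.
    def __add__(self, other):
        return Approx(self.value + Approx._unwrap(other))

    __radd__ = __add__

    def __mul__(self, other):
        return Approx(self.value * Approx._unwrap(other))

    __rmul__ = __mul__

    def __bool__(self):
        # Approximate data must not silently drive precise control flow.
        raise TypeError("approximate value used as a condition; call endorse() first")


def endorse(x):
    """Explicitly accept an approximate value into precise code."""
    return x.value if isinstance(x, Approx) else x


def mean_brightness(pixels):
    """Average pixel brightness; the accumulation is marked approximable."""
    total = Approx(0.0)
    for p in pixels:
        total = total + p
    return total * (1.0 / len(pixels))


if __name__ == "__main__":
    avg = mean_brightness([0.2, 0.4, 0.6])
    # The decision below is precise, so the programmer must endorse the input.
    if endorse(avg) > 0.5:
        print("bright image")
    else:
        print("dark image")
```

The point of the design is that the boundary between approximate and precise code is visible in the program: forgetting `endorse` is an error rather than a silent loss of quality.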

Q: When Google funds a project, is it based just on merit, or also on the other funding sources that might be available to the project?
A: Virtually all the projects Google likes have some other source of funding, like NSF.
Q: Does Google do a lot of lobbying on research?
A: Google is doing more and more of that, for better or for worse. Actually, the UK and Switzerland are a lot more robust in terms of supporting research, but Russia and France are much worse than the US.

