Thursday, March 10, 2011

Followup on academic publishing in systems

Reposted from http://www.thegibson.org/blog/archives/305

My personal take away from this blog:

1. Getting a faculty job at a decent university is reallllllllly hard. 2 OSDI, 3 SOSP, plus other publications at venues like FAST?!?! It is hard to imagine myself achieving that in 5 years, even though I am lucky enough to work with his former advisor. (Would it be easier at research labs affiliated with companies? But you wouldn't have grad students for most of the year, and that's a whole different ecosystem.)

2. Publishing in top systems conferences is kind of subjective (at least to me). You have to guess what will interest the PC, and that is still a bit mysterious to me :(
My advisor seems to have quite a different perspective than I do on what counts as interesting work in the systems area, and I guess I should try to develop that kind of perspective too if I am serious about research as a career. (Still can't believe some Linux debugging stuff makes its way into OSDI, though...)

3. Congrats to Haryadi! This is like the first faculty member from our group? Those guys at Microsoft Research or the national labs are also active in publishing papers (EuroSys, SIGOPS, etc.), but I guess being at a university is different.

===========================================================================

I follow Ed Felten’s blog, Freedom To Tinker (which is actually now a blog for many people at Princeton’s Center for Information Technology Policy); it has good coverage of issues like electronic voting and intellectual property. Dan Wallach, a well-known security researcher (among other things) at Rice, published an interesting post titled “Acceptance rates at security conferences” assessing the state of academic CS conferences in the area of security. He points out that the conferences are getting increasingly competitive, with an ever-growing field of researchers and a relatively fixed number of conference venues; he notes that this will lead to certain “structural problems” in the research community and discusses potential options.

He also points to Matt Welsh’s thoughts on similar issues in the systems community:

“Scaling up conferences”
“Scaling up program committees”
This is of particular interest to me since, as a PhD student, I am an academic systems researcher. Dan Wallach summarizes Matt’s first post as follows:

He argues that there’s a real disparity between the top programs / labs and everybody else and that it’s worthwhile to take steps to fix this. (I’ll argue that security conferences don’t seem to have this particular problem.) He also points out what I think is the deeper problem, which is that hotshot grad students must get themselves a long list of publications to have a crack at a decent faculty job. This was emphatically not the case ten years ago.

I definitely see what Matt is talking about in the systems community. For example, for a large subset of lower-level systems work*, SOSP and OSDI are a sort of gold standard in publication venues. Each conference is held every two years (alternating years between the two venues), so each year 20-30 papers will be accepted total (for reference, OSDI ’08 accepted 26/193 and SOSP ’07 accepted 25/131). Given the size of the systems community, that doesn’t give much leeway for up-and-coming researchers, but a publication in such a venue is virtually required to be competitive academically — as Matt describes it, a publication in these venues is “a highly prized commodity, and one that is becoming increasingly valued over time.” Matt says:

Several of us on the hiring committee were amazed at how many freshly-minted Ph.D.s were coming out with multiple papers in places like SOSP, OSDI, and NSDI. Clearly there is a lot more weight placed on getting those papers accepted than there used to be. … Somewhere along the way the ante has been upped considerably.

I notice this too. For example, Georgia Tech’s College of Computing (where I am finishing my PhD) was ranked in the top 10 graduate programs in CS (#9) by US News and World Report in 2008. For systems research specifically, we were also ranked in US News and World Report’s top 10 (#10) in 2008. Now, of course US News and World Report’s rankings are contentious and reducing the work of a whole bunch of different researchers in CS to a single dimensional ordinal representing the whole program is very subjective, but the point is only to say that our program can be considered competitive in the universe of CS graduate programs.

But if you look at our publishing track record in these two prized venues, we’re virtually unrepresented. If you look at the OSDI proceedings, you will see that a paper from Georgia Tech has never been accepted there (my 2008 submission was rejected, although it did get decently positive reviews), and we have two SOSP papers: one in ’97, which was collaborative with Microsoft Research and involved only students and no Georgia Tech faculty (which makes me wonder if it was related to an internship), and one in 2007, which was a collaboration between a student in Electrical and Computer Engineering and a professor in the College of Computing. Compare this college-wide record with that of Haryadi Gunawi, an excellent faculty candidate interviewing at Georgia Tech this year. In his career as a PhD student, he had 2 OSDI and 3 SOSP publications (plus publications in top venues in other areas, like PLDI, ISCA and FAST). As a student, he has amassed significantly more publications in these prized venues than our whole College of Computing can claim**. And other students of his advisor(s) have similarly impressive CVs. Look at the students of many other “rockstar” systems researchers and you’ll see the same pattern; we had a parade of great faculty candidates with similarly strong records.

So what am I supposed to make of this? I get a deep sense of cynicism when I see trends like this over many years. Matt says, “I don’t have hard data to back this up, but it feels that it is increasingly rare to see papers from anything other than ‘top ten’ universities and labs in the top venues.” I would go a step further and say that there’s a certain “clique” (or “cabal” if you want something sinister) of key researchers who facilitate virtually all of the publications in these venues. If you are a student of one of these researchers, or an nth-generation student (e.g. a student of a faculty member who once was a student of…), you know how to do work that appeals to the program committee and present it in the proper way; if you don’t have the right perspective on these fine points of taste, your chances are grim.*** As a student, if your advisor is a big name, you can have a paper in these top venues every year. If your advisor isn’t, you have very bleak academic job prospects. Now I’m definitely not trying to diminish Haryadi’s impressive accomplishments, and his research is very exciting. But I get the sense that there’s a very strong disproportionality in academic publishing in systems that is a lot worse than in most other areas of computer science.

A comment to Matt’s first post and the end of Dan’s post also pointed me to another relevant article. In May’s CACM Viewpoints, Ken Birman and Fred Schneider wrote an interesting critique of the state of systems conferences titled “Program Committee Overload in Systems” (here’s a free pdf from Fred Schneider — the same content but without the fancy formatting of the CACM hardcopy). This CACM article seems like a follow-up and expansion of an earlier work of Ken’s I’ve blogged about (titled “Overcoming Challenges of Maturity”).

Anyway, I’m glad that some well-respected systems researchers are being vocal about these issues. It’s definitely good to know I’m not the only one with gripes; I’ve been somewhat cynical about this for a while, but since I have very little clout it helps to find a few senior systems researchers with some common concerns.

* Yes, I understand that “lower-level” is a matter of perspective. To my electrical and computer engineering colleagues, things like hypervisors and operating systems count as “high-level” “end-user” programming.

** If you look at DBLP, you will find a good bit more from current College of Computing faculty, but I’m counting publications where the author is at Georgia Tech when the publication is made (i.e. the author’s affiliation at the time of the publication).

*** Even presented well, good work on certain kinds of systems topics just doesn’t seem to be interesting to the PCs of these top conferences (the Europeans have been irked by this for years — leading to the establishment of EuroSys).