Learning networking by reproducing research results

  • On a similar note, the Student Cluster Competition at SC'17 includes a reproducibility challenge, where teams are expected to reproduce the results of a selected paper [0]. This year it's the vectorization of the Tersoff multi-body potential [1]. Overall, teams are expected to read the paper, figure out what its claims are, try to reproduce the results on a cluster their team built, and then analyze the final results.

    As a student participating in this challenge, I can say it has been a really fun and illuminating experience. It's given me the chance to dive deep into some computer architecture (vectorization, intrinsics, etc.) and to get some basic exposure to molecular dynamics simulations.

    > An unexpected outcome of this project is an increased role of students in the networking research community.

    This is in line with my experience as well. One challenge we faced was that the paper experimented exclusively on Intel clusters, whereas we were running a Power machine. Since the authors hosted their optimizations on GitHub, we were able to open issues whenever we ran into porting problems on our machine, and they were quickly resolved. (A rough sketch of the kind of x86-specific intrinsics that cause this sort of porting trouble appears after the links below.)

    All in all, a really fun experience. I think this "reproducing canonical papers" approach would be a fun and hands-on way to learn about a new field.

    That said, I do question the feasibility of this as a general pedagogical approach. We were able to "luck out" because (1) the paper's author, Markus, had to compete to have his paper published, and (2) he was very helpful (porting the code to Power, answering follow-up questions, etc.). In a sense, this was "easy" mode. However, there are a number of cases where simply building a paper's code is nontrivial. For example, a few researchers at Arizona tried to simply obtain and build the code from 613 papers appearing in ASPLOS'12, CCS'12, OOPSLA'12, OSDI'12, PLDI'12, SIGMOD'12, SOSP'11, VLDB'12, TACO'9, TISSEC'15, TOCS'30, TODS'37, and TOPLAS'34 [2]. Of these 613 papers, they had a ~25% success rate, where success means they were able to build the code and get a basic run.

    [0] http://www.studentclustercompetition.us/2017/applications.ht...

    [1] https://dl.acm.org/citation.cfm?id=3014914

    [2] http://reproducibility.cs.arizona.edu/tr.pdf
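
    To make the porting problem concrete, here is a minimal, hypothetical sketch (not taken from the paper's actual code) of why x86-only vectorization breaks on Power: AVX intrinsics such as _mm256_add_pd simply do not exist there, so the same loop has to be rewritten against VSX intrinsics or left to a scalar fallback.

        /* Hypothetical example, not from the paper: adding two double arrays.
           The x86 path uses AVX intrinsics that are unavailable on Power, which
           is exactly the kind of code that must be rewritten when porting. */
        #include <stddef.h>

        #if defined(__AVX__)
        #include <immintrin.h>
        static void add_arrays(double *a, const double *b, size_t n)
        {
            size_t i = 0;
            for (; i + 4 <= n; i += 4) {   /* 4 doubles per 256-bit AVX register */
                __m256d va = _mm256_loadu_pd(a + i);
                __m256d vb = _mm256_loadu_pd(b + i);
                _mm256_storeu_pd(a + i, _mm256_add_pd(va, vb));
            }
            for (; i < n; ++i)             /* scalar tail */
                a[i] += b[i];
        }
        #elif defined(__VSX__)
        #include <altivec.h>
        static void add_arrays(double *a, const double *b, size_t n)
        {
            size_t i = 0;
            for (; i + 2 <= n; i += 2) {   /* 2 doubles per 128-bit VSX register */
                vector double va = vec_xl(0, a + i);
                vector double vb = vec_xl(0, b + i);
                vec_xst(vec_add(va, vb), 0, a + i);
            }
            for (; i < n; ++i)
                a[i] += b[i];
        }
        #else
        static void add_arrays(double *a, const double *b, size_t n)
        {
            for (size_t i = 0; i < n; ++i) /* portable scalar fallback */
                a[i] += b[i];
        }
        #endif

    Compiler auto-vectorization or generic vector extensions can sidestep this kind of per-architecture split, but hand-tuned intrinsics are common in HPC papers, which is why a straight rebuild on a different architecture often needs the author's help.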

  • On that same topic, I suggest people look into the ICLR Reproducibility Challenge for machine learning: http://www.cs.mcgill.ca/~jpineau/ICLR2018-ReproducibilityCha...