An Algorithm for a Good Research Paper (as Experienced by Eyal Amir)

Why is Research Difficult for Beginning Ph.D. Students?

Part of the difficulty with research at the beginning is that (a) it is never a burning thing - you can always do it tomorrow, and (b) it is not clear (at least before you complete a couple of research papers) what needs to be done and where you're supposed to start. This is why it is not very surprising that many students do not come up with revolutionary ideas and get quite frustrated in their first one-to-three years in the Ph.D. program -- very few such ideas come in a vacuum.

In that respect, you, the reader, should definitely not feel that you're not up for research or a Ph.D. just because you have not come up with any bright idea yet. It just takes determination and some guidance, especially in the earlier stages, and especially if you have not had experience with individually led research. Determination is the name of the game, which is why it is very important that you know and understand why you're doing the Ph.D. The answer could be as simple as "I enjoy research", or "I want to be a Professor at UIUC", or "I want to get higher in industry", or "I want something more interesting in life", or "I'm afraid to be a mediocre small screw in a large industry". If you know why you're doing it, it is much easier to be determined and focused in what you do.

So you are determined. Guidance is what you're missing more acutely now. So, you should go to a random professor (not so random - after all, you chose this school for your Ph.D., so you should have wanted to work with some of those professors, and had a gut feeling about some of them) and ask her/him whether s/he would be interested in working with you. Many would say "work with me during my class, and we'll see", unless you have exactly what it takes to work with them (knowledge, experience) and have proven yourself in research.

How long does it take to do a Ph.D.? How much time before ideas just pop into my head?

Students differ, and it is typically not a problem of background knowledge (although lack of background knowledge in your goal-expertise area does not help). As I said above, what is missing is a little guidance (as a beginning Ph.D. student you don't yet know how to answer a scientific question, not to mention how to ask one). Part of the guidance initially is the advisor giving you a question and some direction on how to go about solving it.

About thinking up new papers...

You need determination and stamina here. You see, each paper is a work of "depth", i.e., asking the questions as you go instead of accepting the answers that other people supply. For example, think about your typical class project, one of those that you've done in the past year or two or three. Many of those projects include at least one good question. Typically that good question is not the topic of the project itself (although some of those are also interesting and cute questions). The good questions are hidden in the assumptions that you made in order to solve that semester-long project.

For example, consider the game of Poker. Trying to write a Poker-playing program is an interesting and cute problem. This is a nice term project, one of those things that you could do in a term or less. The interesting question, however, is not the game itself but rather the assumptions that you make in solving the problem. For example, the question of how you reason about someone's beliefs and knowledge. Now, you say, how would I even start to think about reasoning about knowledge? Ah, you now see why you did not see the interesting question in the original Poker project. You assumed that nobody had worked on this before you, and you were a little impatient and just wanted to be done with that term project.

Why is knowledge about knowledge more important than Poker (I mean, you could make a lot of money if your Poker algorithm works well)? Well, from a scientific perspective you can build on answers to the first and not the second. "Build what?", you say. Well, first, you can build your Poker agent, but you could also build agents for other games, advance economic theories on the stock market and its dependence on information, and build programs that understand humans' requests better (e.g., in a computerized customer-service agent over the phone). Also, you will find out (see the next section) that the first guides you better in your research.

Thus, doing research requires more stamina than is typically required in a semester-long class. With that stamina you say "OK, we noticed this question of knowledge about knowledge; what next?" And, you can either sit down and think, or you can ask your friends and faculty, or you can search the WWW. One of those (provided you do not get persuaded to do other things) would point at Modal Logics and some books on reasoning about knowledge. You read those, and start working on a formal system that can represent knowledge about knowledge, where each side's knowledge is probabilistic.

Remember, you are never in a vacuum, even when you think that you came up with a question that is so different that there's nothing relevant to it in the literature.

An Algorithm for Good Research Papers

What I usually do is
  1. Start working on the question.
  2. Develop it in the direction I think it should go; namely, I play with the question and sketch, in outline, the answer the way I think it should look.
  3. By then, I have a better understanding of the problem.
  4. I go and search for topics that could mention facets of my problem.
  5. Normally, I find something that is not too close but gives me a thread of literature to follow for a while.
  6. Then, I go and revise what I have done, and make much more progress than before.
  7. Then, I again go back and look for more relevant topics. Normally, at this stage (third iteration of going to literature) I find things that are really close, so close it is a little scary at first - was I scooped 3 years ago??
  8. Then, normally, I look closer and see that they really did something slightly different, that my results have better performance in some ways, and that I can also answer questions that the previous guys did not. I summarize these differences. NOTE: Many researchers run out of steam at this point, and are satisfied with their little advance. This is where most people are when a conference paper is due, and they have to decide if their progress is good enough.
  9. At that point (depending on how much better my work is and how different) I usually revise the work completely and from a different perspective. Here comes the importance of choosing a question that is good enough from a scientific perspective. Just ask yourself what the next step is after this work is complete. If your competitors did a good job in answering the question, then you're in an even better position than before because you can now build on their work. If they did not do such a good job, then it is your challenge to do the work such that others (and you) can build on it. Build what? Whatever you had in mind that helped you choose the research question in the first place (see the previous section on choosing a research question).
  10. At this point you have gained enough insight into the technical problem that you can come up with much more mature results, and rewrite your work in a form that is close to the final version. Perhaps do one more iteration of "find related work", "explain my findings and my question clearly", and "motivate by telling people the greater story". Remember, you are writing this paper for people other than yourself.
All of this process normally takes around half a year (normally distributed around that 1/2 year mean).

Farewell

This is enough of the story for now. I think it gives you some perspective on why it is hard to make much progress quickly and without initial guidance. You see, it is like writing a real software application. It is not something you do in a day or even a month, unless it is very simple. Think about creating a Doom-style game from scratch by yourself (not using any software packages created by others).

