How to write a paper

How should I write a paper or thesis?

 

Typical structure of a scientific paper

Here is a suggestion of the components and sections that commonly go into a scientific paper. Note that none of the sections between Introduction and References is strictly necessary: here you should adapt the scheme to your own subject and approach, and choose different headings for your sections. For example, theoretical papers will normally not have a "methodology" or "results" section, but are likely to have a more extensive review of the literature and a longer development of the arguments.

Title

  • should, as much as possible, make clear what the paper is about, and what its main thesis is
  • imagine yourself seeing this title somewhere in a bibliography or table of contents, without further information: would you make the effort to get hold of the paper just on the basis of the title?
  • it is better to avoid vague or possibly misleading titles (teasers) that just use fashionable terms to create interest
  • only well-known authors can afford vague titles, such as "Investigations" (Stuart Kauffman), or "Considerations on ...": would you look up a paper by an unknown author entitled "Some notes on the theory of ..."?
  • if you still like the idea of a short, sexy, albeit vague, title to attract attention, you can complement it with a more precise subtitle: e.g. "The essence of evolution: an analysis of the different components of natural selection".

Author(s)

  • these are normally the people who substantially contributed to the content of the paper, not just to the grammar and style, or to securing the funding and infrastructure that made the research possible; such other contributors can be mentioned in the Acknowledgments section
  • not all authors need to have actually written sections of the text, as long as they contributed to the ideas written down
  • however, all authors are supposed to agree with everything written down in the paper, which implies that they should at least have read the final version
  • authors are listed either in alphabetical order or according to the importance of their contribution
  • authors' names are normally accompanied by their affiliation: department and university, sometimes also their email or full address

Abstract

  • A short, one-paragraph summary of the whole paper and its different sections, with an emphasis on the new ideas proposed
  • the abstract should not just be a short introduction: take into account that some people will only read the abstract, and decide on the basis of that whether the paper contains some results they are looking for; if they know the domain, they won't need an introduction, but will want to know which novel results the paper proposes
  • Should be written only after the full paper is finished, so that you have a complete overview of everything that is in there
  • should include all the core keywords or concepts, so that people interested in the topic can find the paper on the basis of the abstract alone, as often happens in paper databases
  • Sometimes followed on a separate line by a list of keywords

Introduction

  • sketch the core problem to be addressed
  • Why is it important to tackle this problem?
    • What benefits, applications, clarifications ... might be achieved by investigating this topic?
  • Background: what is already known about this problem/its solution? (This may constitute one or more separate sections of the paper instead of just being part of the introduction)
    • review of the literature
    • comparing different schools of thought on this issue
    • Why are these existing approaches insufficient?
      • What has been overlooked?
  • How is this paper going to approach the problem?
    • In what way is it different from earlier approaches?
    • Why can we expect this approach to produce novel or better results?
  • The introduction typically concludes with a short outline of how the rest of the paper is organized. This should not be another abstract or detailed table of contents, but just quickly sketch the order of the subsequent steps

Conceptual framework

  • Definition of core concepts
    • resolving possible confusions/ambiguities
    • presenting clear examples or illustrations of the concepts
  • Specification of their relationships

Objectives/research questions

  • Which hypotheses does our approach suggest?
  • How do we operationalize our concepts?
  • How can we test these hypotheses?
  • What additional questions are worth investigating?

Methodology and Design

  • How do we gather the necessary data to test our hypotheses?
    • E.g. Psychological experiments, computer simulation, collection of data on the web, survey, literature review, thought experiments, case studies...
    • Refer to similar experiments, simulations ... used by other authors
  • How are the data processed?
    • E.g. Statistical analysis, visualization, scoring by readers, conceptual analysis...
  • Specify strengths and shortcomings of this methodology
    • what possible confounding factors/errors/noise/ambiguities should we try to avoid?
    • Why didn't we use larger sets of data, more complex processing methods, more sensitive measurements etc.?
      • Point out intrinsic limitations because of time, space, resources, ...

Results

  • Description of the collected data
    • number of respondents, survey questions, simulation runs, etc.
  • Description of the processed data
    • correlation coefficients, trends, graphs, tables, ...
  • Basic conclusions for each of the experiments/simulations/surveys ...

Discussion

  • To what extent do the results confirm/disconfirm the hypotheses?
  • What does this mean for the overall problem?
  • How do they answer the initial questions?
  • What do the processed data reveal about the phenomenon under investigation?
  • What do they still lack to be fully convincing/complete?
  • What new questions do they suggest?
  • What additional tests do they suggest?

Conclusion

  • Summary of what has been achieved, why this is important, and what needs to be done next
  • Sometimes combined with the previous discussion section, especially in terms of listing issues for further research
  • A concluding section is not strictly necessary and can be left out in a short paper. For longer papers, however, it is strongly advised to end with a conclusion, so as to remind the reader of the key achievements, and to impose a clear structure on what by now may have become a very extensive discussion. 

Acknowledgments

  • Mention of official sponsors of the research, and possibly of the people who contributed in a significant way to the ideas/research/paper but are not listed as co-authors

References / Bibliography

  • Full bibliographical data for all other research/publications referred to in the text
  • Depending on the conventions of the publisher these can be in alphabetical order, or in order of first occurrence in the text
  • if you have the choice of the convention, I would advise alphabetical order and the use of (Smith, 1998) type references rather than numerical references [13] to a numbered entry in the bibliography. The reason is that it is easier for the reader to remember a name than to remember a number when looking up references, and easier to judge whether the reference is worth checking (if the name sounds familiar, you already have an idea what kind of reference it may be).
  • I would also advise that you include the full title of paper and journal in the bibliography. The only advantage of the numerical system (which is usually accompanied by abbreviations) is that it saves a little bit of space. But now that most papers are available electronically, you are hardly going to save any trees by leaving out names or titles of papers, or by abbreviating the "Journal of Memetics" to "J Memet". By including them, on the other hand, you will make life much easier for your readers who are wondering whether a particular reference is worth getting hold of...
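The two ordering conventions and the two citation styles mentioned above can be made concrete with a minimal Python sketch. The entries are hypothetical placeholders (only the "Smith, 1998" label comes from the text), and the function names are assumptions chosen for illustration, not part of any bibliography tool.

```python
from typing import List, NamedTuple


class Reference(NamedTuple):
    author: str
    year: int
    title: str


# Hypothetical placeholder entries, listed in order of first citation in the text.
cited: List[Reference] = [
    Reference("Smith", 1998, "Placeholder title one"),
    Reference("Jones", 2001, "Placeholder title two"),
    Reference("Adams", 1995, "Placeholder title three"),
]


def author_year_label(ref: Reference) -> str:
    """In-text label of the (Smith, 1998) type."""
    return f"({ref.author}, {ref.year})"


def numeric_label(ref: Reference, refs: List[Reference]) -> str:
    """In-text label of the [13] type: position in order of first occurrence."""
    return f"[{refs.index(ref) + 1}]"


def alphabetical(refs: List[Reference]) -> List[Reference]:
    """Bibliography sorted by author name, then by year."""
    return sorted(refs, key=lambda r: (r.author.lower(), r.year))


# Print the bibliography in alphabetical order with full titles.
for ref in alphabetical(cited):
    print(f"{ref.author} ({ref.year}). {ref.title}.")
```

With the author-year label, the in-text citation itself tells the reader who is being cited; with the numeric label, the reader has to look up the number in the list first.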

 


Common mistakes in the organization of a paper/thesis

After having given a lot of advice to students in the role of their supervisor and to peers in the role of referee, I have developed extensive experience with weaknesses in the presentation of a paper. While such papers may formally contain all the information that needs to be there, it is presented in such a way that the reader fails to fully understand, to become interested in the subject, to be convinced of the author's thesis, or simply to remember what the paper really had to say. The above outline for the organization of a paper will already provide some hints on how to avoid these problems, as will subsequent advice on how to make ideas stick. Still, it is worth pointing out the most common mistakes that inexperienced writers make in their papers or dissertations.

Assuming that the reader knows what you are talking about

A scientific paper is by definition highly technical and abstract, discussing complex and esoteric concepts and theories. If you are writing about a well-delineated subject for an audience of domain specialists, you may assume that your readers will know the main concepts and theories of the domain. However, as soon as your research is even slightly interdisciplinary, meaning that it uses concepts from more than one domain, this assumption has to be abandoned. While your readers may be experts in one of the domains, they may have little more than superficial knowledge of the other domain.

Therefore, you should always start by defining and situating the main elements of your approach, preferably with references to publications in which these elements are developed in detail. Failing that, your readers will encounter terms and concepts that they do not understand or, worse, that they misinterpret, because the term reminds them of a similar term in their domain that has a different meaning. So, always start with a clear definition of the core concepts of your approach. Whenever you use another, perhaps less central, concept, provide at least a short explanation in brackets and a reference if you are not sure that all your readers will know this concept.

Assuming that the reader cares about your subject

While the subject you are writing about will appear very interesting to you (otherwise, you would not have done research on it), you should not assume the same about your readers. While they may have a general interest in the domain, there are so many papers on any subject that are potentially relevant to their interests that they need a good reason to read your paper rather than one of the others. So, in your introduction you should start by convincing the reader that the issue you are discussing is really worth investigating, e.g. because it has many practical applications, is particularly relevant to contemporary problems, or is part of a long-standing controversy in the field.

Needlessly repeating yourself

When discussing complex and abstract ideas, it is useful to repeat the core message, so as to make sure that the reader has understood and remembered it. However, such repetition should not bore the reader or make the paper too long. The best way to drive a message home is to repeat it with different words, different examples or different applications. Each new formulation of the message will increase the understanding, because the reader will now see the same idea in a different context, getting the occasion to create some additional associations between the new concept and knowledge the reader already has.

However, repeating the message with (virtually) the same words and the same contexts is merely redundant. For the reader it appears like a waste of time having to read the same thing again and again. It also contributes to the next mistake: making the paper too long. A good way to avoid such needless repetition is outlining or Idea Processing: creating an efficient structure for your ideas and arguments before you actually start writing the text. A good outline will ensure that you say precisely what you need to say, and nothing more.

Making the paper too long

Readers, and especially publishers, prefer shorter papers, since these require less investment of time, attention, paper, and other resources. By avoiding repetition or redundancy, your paper will already become shorter. However, the temptation can still be great to include a lot of additional ideas that further support or extend your main thesis, or simply to list every single result of your research that seems interesting. In order to maximize your chances of publication, you should actively combat this tendency towards accumulation.

If you have a lot of material, it is better to split it up into several papers. Assuming that this material concerns the same broad issue, the papers will necessarily overlap to some degree, as you will need to repeat common assumptions, or summarize results published in earlier papers. This is not a serious problem. The readers who are interested in the whole of your results will not mind rereading the core ideas of earlier papers if these are formulated in a new context so that they contribute to a better understanding. Even if the formulation is essentially the same, readers can simply skip these parts and jump straight to the new results.

Remember that in the present system of evaluating research (see Scientific impact), only the number of publications counts, not the length! So, it is strategically better to spread the same results over several papers. On the other hand, you should not overdo this "dilution" of material: papers that contain too few new results may not get published, and are unlikely to get many citations.

Not stating what is new here

In many papers, it is not clear which ideas are the author's own contribution, and which are merely a review of the literature. This is a common problem during PhD defenses, where the jury would like to know exactly which are the original "theses" put forward. The PhD student, on the other hand, is often inclined to refer frequently to the literature in order to give more authority to the presented ideas, while being too shy to put his or her personal ideas in the spotlight. As a result, the student's contribution to the literature remains unclear.

In scientific publications too it is expected that you make clear what is new in your paper. Even if there are no original results, the paper may still be novel because it summarizes, clarifies or reviews an extensive, but scattered literature. But this too should be pointed out clearly! Spelling out the novel contribution is best done in the introduction, by contrasting your approach with the work done previously on this subject. It is worth repeating the core contributions in the conclusion and abstract.

Being too categorical

Researchers should not forget that scientific knowledge is always provisional: open to scrutiny, criticism, and eventual replacement by a better theory. This means that you should be careful with statements of the form "A is B" without further qualification, which imply that the statement is an absolute truth that should be obvious to everyone. The only statements that can be put in this way are either a matter of definition, such as 1 + 1 = 2, or generally accepted "laws" or properties, such as "a dog is a mammal" or "gravitation is an attractive force". In complex domains such as psychology, sociology, or even biology and medicine, there are very few such straightforward truths.

Here it is better to add qualifiers or hedges to your statements, like "it is commonly accepted that A is B", or, better, "according to theory X (Smith, 1999), A is considered to be B". Other common hedges in scientific style are "A may be viewed as...", "it is likely that...", "it is possible that...", "relative to ...". While you should of course not overuse such phrases, in order to keep your sentences readable, it is good policy to include them regularly, in order to remind the reader that you are not dogmatically positing a truth, but making an assumption that others may like to question.

Being categorical is even worse when you are not describing, but prescribing, i.e. when you are using propositions of the form "A ought to be...", or "you must do ...". Such lecturing or finger-pointing can appear quite offensive to the reader, who may have very different opinions on the matter. Here a better approach is "given our understanding of ..., it seems advisable to do...", i.e. to motivate on the basis of a detailed argumentation why you think a certain approach is better than another one.

Formalizing too much

This is a mistake often made by authors with a background in the "hard" sciences, such as physics, mathematics or computing. These authors tend to assume that the only way to make a description truly scientific is to include a lot of symbols and formulas. Mathematical symbols are of course immensely useful, and necessary in cases where concepts need to be strictly defined or the rules to be expressed are used to calculate, compute, simulate, or make complex logical deductions.

In other cases, however, symbols merely make reading and understanding more difficult. The reason is that our memory is not made to remember lists of meaningless symbols: we remember most easily when the terms used already have a lot of associations with concepts we know.

Therefore, symbols should not be used just for their own sake: if they merely attach a formal label to a word that is perfectly clear on its own, they will only increase the burden on the reader. An example of such an (ab)use could be the following: "define a city C as a tuple C={P,L,S}, where P is the collection of people that live in C, L is the geographical location of C, and S is the size of P, i.e. the number of people living in C". Unless you are planning to write a program that will teach a computer to reason about cities, you can do perfectly well without this formalization of common notions: your readers will understand you much better if you later mention "the people of the city" than if you write "P(C)".
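For contrast, here is a minimal sketch of the one case the paragraph above exempts: a program that actually has to compute with cities. The class, its field names, and the example values are hypothetical, chosen only to mirror P, L and S; the point is that the formal structure is justified here because something is calculated from it, and even then descriptive names serve better than single letters.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class City:
    """A city described by its location and its inhabitants."""
    name: str
    location: Tuple[float, float]  # L: assumed to be (latitude, longitude)
    people: Set[str] = field(default_factory=set)  # P: names of the inhabitants

    @property
    def size(self) -> int:
        # S: derived from P instead of being stored as a separate element
        return len(self.people)


# Hypothetical example values.
city = City(name="Exampleville", location=(50.0, 4.0), people={"Ann", "Bob"})
city.people.add("Carol")
print(city.size)  # prints 3
```

Note that `city.people` reads much like "the people of the city", whereas "P(C)" forces the reader to look up two definitions.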

Formalizing too little

This complementary error is often made by authors with a background in the humanities: they will develop grand theories and conceptual systems in a narrative style, but without clearly defining the essential elements of their system or the relations between them. For the theory to be in the least testable or refutable, there should be an unambiguous understanding of what the theory states or implies, and what it does not. This implies a more formal definition of the concepts and relationships, e.g. in the form "if A, then B", or "the category A is subdivided into the categories A1, A2 and A3".

It is not necessary to use abstract symbols to express these relationships. They could simply take the form of clearly stated propositions, such as "intelligent people on average live longer than less intelligent people", assuming that you have defined intelligence previously in the paper. Another good method to make distinctions and relations more explicit is the use of tables, diagrams, and flow charts, which allow you to graphically distinguish different types of entities, and perhaps represent their relationships by arrows pointing from one to the other.

 

Further advice

 There are several guidelines available for writing on the web, e.g.