Mary Dyson, David Březina

Exploring disfluency: Are designers too sensitive to harder-to-read typefaces?

This is a report of a study which we conducted online in early 2019 and presented at the ICTVC conference in Patras, Greece, in June 2019. At the conference we presented some of the results; here we provide a fuller account. We have structured the report so that readers with different interests, and different appetites for detail, can find what they want to know. Please get in touch if you have any feedback.

Why did we do this research?

There are two main reasons why we were compelled to conduct the study. The first reason was the media attention on harder-to-read typefaces. This began in 2010 when a BBC news item reported on an academic paper about to be published by Diemand-Yauman et al. (2011): Making things hard to read ‘can boost learning’ (Hebblethwaite, 2010). Another article, this time in the Guardian newspaper in 2018, reported on the development of a new font: Font of all knowledge? Researchers develop typeface they say can boost memory (Martin, 2018). These two articles hint at a problem for design practice, as the principles of human-centred typography generally suggest that easier-to-read documents are better for readers.

The second reason was our shared interest in what happens to people’s perceptions after they receive design training. Designers, type designers, and typographers in particular, deal with fine details in letter shapes in their practice. This might influence their responses to harder-to-read typefaces.

Existing academic research on disfluency

The initial research (Diemand-Yauman et al., 2011), which suggests that hard-to-read fonts can improve memory or learning, uses disfluency theory as an explanation. The theory states that when we perceive that something is harder to read, we put in more effort, and the greater effort helps us learn the material by processing it more deeply.

Disfluency theory draws on principles of psychology first written about by James (1890/1950), who introduced two processing systems: one is quick, effortless, and intuitive; the second is slow, effortful, analytic, and deliberate, and can lead to better learning. According to this theory, even if the content is simple, we may be tricked into using the second processing system when the text is set in a hard-to-read font (Alter et al., 2007; Rummer et al., 2016).

Existing academic research on differences due to training

Comparisons of designers and non-designers have shown that the two groups differ in their judgments of the semantic qualities of typefaces (Bartram, 1982). Designers appear to develop qualitative differences in perceptual abilities due to training (Dyson, 2011; Dyson & Stott, 2012), as do musicians (Burns & Ward, 1978). For example, typographers appear to attend to configural properties of characters such as the relative position of thick and thin strokes, and do not just abstract properties required for letter identification (Dyson & Stott, 2012). The level of expertise with reading an alphabet has also been shown to affect the processing of visual features of characters (Wiley et al., 2016).

Our research questions

Constraints on our study

For practical reasons, we chose to do our study online, which introduced various constraints. We needed tasks where the responses were simple and involved selecting from alternatives rather than typing in words. We aimed for tasks that would take around 10 minutes at most, in the hope that participants would keep their attention on the task at hand.

Previous studies of disfluency have measured difficulty in two ways:

What we did

We have made the study website and data available on GitHub and licensed them under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).

To address our two research questions, we obtained participants’ judgments of legibility and of memory, and measured the speed and accuracy of identifying and remembering words.

We accept that our study does not reflect ‘normal reading’ but is a first step towards exploring potential differences between designers and non-designers in the context of disfluency theory.

We used two fonts: Arial and Sans Forgetica. Participants were invited through posts on Twitter and we also used Twitter ads to increase the number of people who would see the call.


Before they started, we asked participants whether they were fluent English speakers and what their professional background was. For the latter we provided the categories:

Task 1

This was a lexical decision task (see Figure 1) consisting of a sequence of 20 screens. On each screen the participant was shown an item, a word-like group of letters, and asked to say whether it was a word or a non-word in the English language. Participants were asked to respond as quickly and accurately as possible. Their confidence in each response was measured by offering four alternatives:

Figure 1: Example screen of the lexical decision task using Arial (Task 1).

Intervening questionnaire

In between the first and second task, participants were asked to make two judgments (see Figure 2). The first was how well they remembered the items, on a scale from 0 (I do not remember any) to 100 (I can remember everything). We refer to this as a Judgment of Memory (JoM). The second was a Judgment of Legibility (JoL) of the font used in the lexical decision task, using the scale:

Figure 2: Two judgments following the lexical decision task.

Task 2

This was a recognition task consisting of a sequence of 16 screens. On each screen the participant was shown an item in the same font as the one used in the lexical decision task and asked to say whether they had seen that item in the previous task. In this case, participants were invited to take as much time as they liked to respond. Their confidence was measured through four alternative responses (see Figure 3):

We timed responses and recorded accuracy for both tasks. The two tasks were then repeated, with a different font used for the second pair of tasks.

Figure 3: Example screen of the recognition task using Sans Forgetica (Task 2).

Details of the words and non-words

We generated a large pool of items, words and non-words, using MCWord: An Orthographic Wordform Database. The parameters we selected were:

From this pool we removed proper names and very similar words, e.g. we did not keep both the singular and the plural of a word. In the lexical decision task, each participant received a unique set of 10 words and 10 non-words which were randomly picked from the pool. For the recognition task, we matched each item (word or non-word) with a foil (a similar item which had not been seen) starting with the same letter. We measured recognition by using four words and four non-words from the lexical decision task, i.e. items participants would have seen, and eight foils for other seen items, i.e. items that were similar to seen items but had not themselves been seen.
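The selection procedure can be sketched in a few lines of Python. This is an illustrative sketch, not the code of the study website: the function name `build_item_sets`, the `foil_of` mapping and the placeholder pool are hypothetical, and the sketch assumes that the eight foils are drawn at random from the twelve seen items that are not shown again, which the description above does not specify.

```python
import random

def build_item_sets(words, nonwords, foil_of, rng=None):
    """Pick the items for one participant and one font (hypothetical helper)."""
    rng = rng or random.Random()

    # Lexical decision: 10 words and 10 non-words picked at random from the pool.
    seen_words = rng.sample(words, 10)
    seen_nonwords = rng.sample(nonwords, 10)
    lexical_decision = seen_words + seen_nonwords
    rng.shuffle(lexical_decision)

    # Recognition: 4 seen words and 4 seen non-words are shown again ...
    old_items = rng.sample(seen_words, 4) + rng.sample(seen_nonwords, 4)

    # ... plus 8 foils matched to *other* seen items. Assumption: the 8 are
    # drawn at random from the 12 seen items that are not shown again.
    others = [item for item in lexical_decision if item not in old_items]
    foils = [foil_of[item] for item in rng.sample(others, 8)]

    # Each recognition trial is (item, was_seen_before).
    recognition = [(item, True) for item in old_items]
    recognition += [(foil, False) for foil in foils]
    rng.shuffle(recognition)
    return lexical_decision, recognition

# Placeholder pool for illustration only; the real items came from MCWord.
words = [f"word{i:02d}" for i in range(30)]
nonwords = [f"nonw{i:02d}" for i in range(30)]
# Each foil starts with the same letter as the item it is matched with.
foil_of = {item: item[0] + "_foil_" + item[1:] for item in words + nonwords}

lexical_decision, recognition = build_item_sets(words, nonwords, foil_of)
```

Each call yields 20 lexical decision items and 16 recognition items, half of which were seen before.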

Our participants

We had 97 participants in this study. We grouped the various categories of designers together to form two groups: 53 designers and 44 non-designers.

What we found


In terms of judgments, all participants judged Sans Forgetica as less legible than Arial. We felt it was important to include this judgment as not all previous studies have checked that their participants make the same judgments of legibility as the researchers. Also, all participants thought they would be better at remembering items in Arial than in Sans Forgetica.

In the two tasks, all participants responded more slowly to items in Sans Forgetica than items in Arial. However, across all participants, there were no differences in accuracy of responses to Arial and Sans Forgetica in either task.

Designers and non-designers

There was a greater difference between judgments of memory for Arial and Sans Forgetica in designers than in non-designers (see Figure 4), supporting our hypothesis that designers are more sensitive to differences in fonts. We also found an unexpected general difference between the two groups: in both lexical decision and recognition, designers responded more quickly than non-designers.

Figure 4: Judgments of Memory for the two fonts with greater difference in judgments by designers.

Words vs non-words

Although the study was not specifically aimed at comparing words and non-words, some interesting differences emerged. Responses to words were faster than responses to non-words in both tasks, which is known as a lexicality effect. However, there was also a difference between the two fonts in the lexical decision task: non-words slowed participants down even more in Sans Forgetica than in Arial (see Figure 5). A result that seems initially to be counter-intuitive is that non-words were remembered better than words in both fonts.

Figure 5: Responses to non-words are slower than responses to words, particularly in Sans Forgetica. Response times in milliseconds have been transformed using the natural logarithm to be suitable for statistical analysis.
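As a small illustration of the transformation mentioned in the caption, the sketch below applies the natural logarithm to a handful of response times. The values are invented for illustration and are not data from the study, and the function name `log_transform` is hypothetical.

```python
import math

def log_transform(response_times_ms):
    """Return the natural logarithm of each response time (in milliseconds)."""
    return [math.log(rt) for rt in response_times_ms]

# Invented response times: response times are typically right-skewed, with a
# few very slow responses far from the rest.
rts = [520, 610, 700, 950, 2400]
log_rts = log_transform(rts)

# The transformation keeps the order of the values but pulls in the long
# right tail, so the slowest response is less extreme on the log scale.
for rt, log_rt in zip(rts, log_rts):
    print(f"{rt:5d} ms -> {log_rt:.2f}")
```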

What do these results tell us?

Sans Forgetica is considered harder to read and slows down reading. Non-words in Sans Forgetica need more time to decipher than words because we cannot use our knowledge of words to fill in the letters that are harder to identify. These results enable us to confirm that Sans Forgetica is a less legible typeface than Arial. Participants consider items set in Sans Forgetica to be less memorable, probably because the font is perceived as less legible. This is described in the academic literature as a metacognitive effect: an awareness of thought processes. In this case, an awareness that the items are difficult to read leads to a belief that they are more difficult to remember. This effect is stronger in designers. We therefore have some support for our hypothesis that designers’ sensitivity to typographic presentation, and specifically legibility, is influencing their judgments of memory.

Despite the differences in legibility between typefaces, which are recognised by all participants, accuracy of responses was not affected. Sans Forgetica items were not easier to remember in the recognition task, which provides no support for disfluency theory, i.e. the idea that greater effort helps us learn the material by processing it more deeply. However, non-words were better remembered. One explanation for this is that non-words stand out more in the lexical decision task because they are not confused with the many words we already know. This focus of our attention helps in subsequent recognition.

Designers’ approach to the tasks may have been slightly different from non-designers’ given their faster responses throughout the study. The method used to recruit participants, initially through invitation directed at designers, may have contributed to their greater commitment.

Some implications for practice

A less legible font, such as Sans Forgetica, may slow down deciphering a message. This may not be a problem when designing for genres where ease of reading is not a primary concern, e.g. graphic posters. However, in circumstances where the text is difficult to understand, e.g. non-native or complicated language, Sans Forgetica would be problematic and the use of a more conventionally legible font is recommended.


References

Alter, A.L., Oppenheimer, D.M., Epley, N., & Eyre, R.N. (2007). Overcoming intuition: Metacognitive difficulty activates analytic reasoning. Journal of Experimental Psychology: General, 136, 569–576.

Bartram, D. (1982). The perception of semantic quality in type: Differences between designers and non-designers. Information Design Journal, 3(1), 38–50.

Burns, E.M., & Ward, W.D. (1978). Categorical perception—phenomenon or epiphenomenon: Evidence from experiments in the perception of melodic intervals. Journal of the Acoustical Society of America, 63(2), 456–468.

Diemand-Yauman, C., Oppenheimer, D.M., & Vaughan, E.B. (2011). Fortune favors the bold (and the italicized): Effects of disfluency on educational outcomes. Cognition, 118(1), 111–115.

Dyson, M.C. (2011). Do designers show categorical perception of typefaces? Visible Language, 45(3), 193–220.

Dyson, M.C., & Stott, C.A. (2012). Characterizing typographic expertise: Do we process typefaces like faces? Visual Cognition, 20(9), 1082–1094.

Hebblethwaite, C. (2010, 22 October). Making things hard to read ‘can boost learning’. BBC News. Accessed 29 April 2020.

James, W. (1950). The principles of psychology. Dover. First published 1890.

Martin, L. (2018, 4 October). Font of all knowledge? Researchers develop typeface they say can boost memory. The Guardian. Accessed 29 April 2020.

Rummer, R., Schweppe, J., & Schwede, A. (2016). Fortune is fickle: Null-effects of disfluency on learning outcomes. Metacognition and Learning, 11(1), 57–70.

Wiley, R.W., Wilson, C., & Rapp, B. (2016). The effects of alphabet and expertise on letter perception. Journal of Experimental Psychology: Human Perception and Performance, 42(8), 1186–1203.
