Page 1

Introduction
For some time now, there has been, in Western culture, a distinct preference for
monomodality. The most highly valued genres of writing (literary novels, academic
treatises, official documents and reports, etc.) came entirely without illustration, and
had graphically uniform, dense pages of print. Paintings nearly all used the same
support (canvas) and the same medium (oils), whatever their style or subject. In
concert performances all musicians dressed identically and only conductor and
soloists were allowed a modicum of bodily expression. The specialised theoretical
and critical disciplines which developed to speak of these arts became equally
monomodal: one language to speak about language (linguistics), another to speak
about art (art history), yet another to speak about music (musicology), and so on,
each with its own methods, its own assumptions, its own technical vocabulary, its
own strengths and its own blind spots.

More recently this dominance of monomodality has begun to reverse. Not only
the mass media, the pages of magazines and comic strips for example, but also the
documents produced by corporations, universities, government departments etc.,
have acquired colour illustrations and sophisticated layout and typography. And not
only the cinema and the semiotically exuberant performances and videos of popular
music, but also the avant-gardes of the 'high culture' arts have begun to use an
increasing variety of materials and to cross the boundaries between the various art,
design and performance disciplines, towards multimodal Gesamtkunstwerke, multi-
media events, and so on.

The desire for crossing boundaries inspired twentieth-century semiotics. The
main schools of semiotics all sought to develop a theoretical framework applicable to
all semiotic modes, from folk costume to poetry, from traffic signs to classical music,
from fashion to the theatre. Yet there was also a paradox. In our own work on visual
semiotics (Kress and Van Leeuwen, 1996), we, too, were in a sense 'specialists' of the
image, still standing with one foot in the world of monomodal disciplines. But at the
same time we aimed at a common terminology for all semiotic modes, and stressed
that, within a given social-cultural domain, the 'same' meanings can often be
expressed in different semiotic modes.

In this book we make this move our primary aim; and so we explore the common


2 Multimodal discourse

principles behind multimodal communication. We move away from the idea that
the different modes in multimodal texts have strictly bounded and framed specialist
tasks, as in a film where images may provide the action, sync sounds a sense of
realism, music a layer of emotion, and so on, with the editing process supplying the
'integration code', the means for synchronising the elements through a common
rhythm (Van Leeuwen, 1985). Instead we move towards a view of multimodality in
which common semiotic principles operate in and across different modes, and in
which it is therefore quite possible for music to encode action, or images to encode
emotion. This move comes, on our part, not because we think we had it all wrong
before and have now suddenly seen the light. It is because we want to create a
theory of semiotics appropriate to contemporary semiotic practice. In the past, and
in many contexts still today, multimodal texts (such as films or newspapers) were
organised as hierarchies of specialist modes integrated by an editing process.
Moreover, they were produced in this way, with different, hierarchically organised
specialists in charge of the different modes, and an editing process bringing their
work together.

Today, however, in the age of digitisation, the different modes have technically
become the same at some level of representation, and they can be operated by one
multi-skilled person, using one interface, one mode of physical manipulation, so that
he or she can ask, at every point: 'Shall I express this with sound or music?', 'Shall I say
this visually or verbally?', and so on. Our approach takes its point of departure from
this new development, and seeks to provide the element that has so far been missing
from the equation: the semiotic rather than the technical element, the question of
how this technical possibility can be made to work semiotically, of how we might
have, not only a unified and unifying technology, but also a unified and unifying
semiotics.

Let us give one specific example. In Reading Images (1996) we discussed 'framing'
as specific to visual communication. By 'framing' we meant, in that context, the way
elements of a visual composition may be disconnected, marked off from each other,
for instance by framelines, pictorial framing devices (boundaries formed by the edge
of a building, a tree, etc.), empty space between elements, discontinuities of colour,
and so on. The concept also included the ways in which elements of a composition
may be connected to each other, through the absence of disconnection devices,
through vectors, and through continuities and similarities of colour, visual shape and
so on. The significance is that disconnected elements will be read as, in some sense,
separate and independent, perhaps even as contrasting units of meaning, whereas
connected elements will be read as belonging together in some sense, as continuous
or complementary. Arnheim's discussion of Titian's Noli Me Tangere (1982: 112)
provides an example: '[Christ's] staff acts as a visual boundary between the figures',
he comments, and 'Magdalen breaks the visual separation ... by the aggressive act of
her right arm' (see Fig. 1.1).


Introduction 3

Figure 1.1 Noli Me Tangere

But clearly framing is a multimodal principle. There can be framing, not only
between the elements of a visual composition, but also between the bits of writing in
a newspaper or magazine layout (Kress and Van Leeuwen, 1998), between the people
in an office, the seats in a train or restaurant (e.g. private compartments versus
sharing tables), the dwellings in a suburb, etc., and such instances of framing will also
be realised by 'framelines', empty space, discontinuities of all kinds, and so on. In
time-based modes, moreover, 'framing' becomes 'phrasing' and is realised by the
short pauses and discontinuities of various kinds (rhythmic, dynamic, etc.) which
separate the phrases of speech, of music and of actors' movements. We have here a
common semiotic principle, though differently realised in different semiotic modes.

The search for such common principles can be undertaken in different ways. It is
possible to work out detailed grammars for each and every semiotic mode, detailed
accounts of what can be 'said' with that mode and how, using for each of the
grammars as much as possible (as much as the materiality of the mode makes that
plausible) the same approach and the same terminology. At the end of this process it
would then become possible to overlay these different grammars and to see where
they overlap and where they do not, which areas are common to which of the modes,
and in which respects the modes are specialised. There have by now been a number
of attempts at devising such grammars, all based to a greater or lesser degree on the
semiotic theories of Halliday (Halliday 1978, 1985) and Hodge and Kress (1988), and
hence sharing a common approach - for instance the semiotics of action of Martinec
(1996, 1998), the semiotics of images of O'Toole (1994) and Kress and Van Leeuwen
(1996), the semiotics of sound of Van Leeuwen (1999), the semiotics of theatre of
Martin (1997) and McInnes (1998), and so on.

Page 11


experience a variety of identities, duties and pleasures realised in a spatial mass
medium, a globally distributable language of interior design.


In this chapter we have sketched the outline of a theory of multimodal communica-
tion. We have defined multimodality as the use of several semiotic modes in the
design of a semiotic product or event, together with the particular way in which these
modes are combined - they may for instance reinforce each other ('say the same
thing in different ways'), fulfil complementary roles, as in the House Beautiful article
about Stephanie's bedroom, or be hierarchically ordered, as in action films, where
action is dominant, with music adding a touch of emotive colour and sync sound a
touch of realistic 'presence'. We defined communication as a process in which a
semiotic product or event is both articulated or produced and interpreted or used. It
follows from this definition that we consider the production and use of designed
objects and environments as a form of communication: we used the example of a
room, but could also have used a designed object as our example.

The main concepts we have introduced in the chapter are recapitulated in the
discussion of terms below.


Strata: The basis of stratification is the distinction between the content and the
expression of communication, which includes that between the signifieds and the
signifiers of the signs used. As a result of the invention of writing, the content stratum
could be further stratified into discourse and design. As a result of the invention of
modern communication technologies, the expression stratum could be further
stratified into production and distribution.

The stratification of semiotic resources has its counterpart in the social stratifica-
tion of semiotic production, certainly in the early stages of the use of new communi-
cation technologies. In later stages it may become possible for one person to produce
the product or event from start to finish, as is beginning to happen today with
interactive multimedia.

In this book we argue that production and distribution produce their own layers
of signification. Indeed, we have argued that semiotic modes and design ideas
usually flow out of production, using principles of semiosis typical for production,
such as provenance and experiential meaning potential.

Discourse: Discourses are socially situated forms of knowledge about (aspects of)
reality. This includes knowledge of the events constituting that reality (who is
involved, what takes place, where and when it takes place, and so on) as well as a set
of related evaluations, purposes, interpretations and legitimations.

People often have several alternative discourses available with respect to a
particular aspect of reality. They will then use the one that is most appropriate to the
interests of the communication situation in which they find themselves.

Design: Designs are conceptualisations of the form of semiotic products and events.
Three things are designed simultaneously: (1) a formulation of a discourse or combination
of discourses, (2) a particular (inter)action, in which the discourse is
embedded, and (3) a particular way of combining semiotic modes.

Design is separate from the actual material production of the semiotic product or
event, and uses (abstract) semiotic modes as its resources. It may involve inter-
mediate productions (musical scores, play scripts, blueprints, etc.) but the form these
take is not the form in which the design is eventually to reach the public, and they
tend to be produced in as abstract a modality as possible, using austere methods of
realisation that do not involve any form of realistic detail, texture, colour and so on.

Production: Production is the articulation in material form of semiotic products or
events, whether in the form of a prototype that is still to be 'transcoded' into another
form for purposes of distribution (e.g. a 35 mm telemovie) or in its final form (e.g. a
videotape packaged for commercial distribution).

Production not only gives perceivable form to designs but adds meanings which
flow directly from the physical process of articulation and the physical qualities of the
materials used, for instance from the articulatory gestures involved in speech
production, or from the weight, colour and texture of the material used by a sculptor.

Distribution: Distribution refers to the technical 're-coding' of semiotic products
and events, for purposes of recording (e.g. tape recording, digital recording) and/or
distribution (e.g. radio and television transmission, telephony).

Distribution technologies are generally not intended as production technologies,
but as re-production technologies, and are therefore not meant to produce meaning
themselves. However, they soon begin to acquire a semiotic potential of their own,
and even unwanted 'noise' sources such as the scratches and discolorations of old
film prints may become signifiers in their own right. In the age of digital media,
however, the functions of production and distribution become technically integrated
to a much greater extent.

Another key distinction in this chapter is the distinction between mode, which is
on the 'content' side, and medium, which is on the 'expression' side.

Mode: Modes are semiotic resources which allow the simultaneous realisation of
discourses and types of (inter)action. Designs then use these resources, combining
semiotic modes, and selecting from the options which they make available according
to the interests of a particular communication situation.

Modes can be realised in more than one production medium. Narrative is a mode
because it allows discourses to be formulated in particular ways (ways which
'personify' and 'dramatise' discourses, among other things), because it constitutes a
particular kind of interaction, and because it can be realised in a range of different
media.

It follows that media become modes once their principles of semiosis begin to
be conceived of in more abstract ways (as 'grammars' of some kind). This in turn
will make it possible to realise them in a range of media. They lose their tie to a
specific form of material realisation.

Medium: Media are the material resources used in the production of semiotic
products and events, including both the tools and the materials used (e.g. the musical
instrument and air; the chisel and the block of wood). They usually are specially pro-
duced for this purpose, not only in culture (ink, paint, cameras, computers), but also
in nature (our vocal apparatus).

Recording and distribution media have been developed specifically for the
recording and/or distribution of semiotic products and events which have already
been materially realised by production media, and as such are not supposed to function
semiotically. But in the course of their development, they usually start functioning
as production media - just as production media may become design modes.

Lastly, we discussed the specific ways in which meaning is produced 'in production'.
This is not always a matter of 'realising designs', in the way that a speech may realise
what the speaker has prepared, or a building what the architect has designed, and it
certainly does not usually happen in the 'arbitrary' ways which have been fore-
grounded by linguists. In fact, signification starts on the side of production, using
semiotic principles which have not yet sedimented into conventions, traditions,
grammars, or laws of design. Only eventually, as the particular medium gains in
social importance, will more abstract modes of regulation ('grammars') develop, and
the medium will become a mode. The opposite, modes becoming media again, is also
possible. The science of physiognomy, for instance, lost its status as a result of its
racist excesses, and now semiotic practices like casting are 'media' again, operating
on the basis of primary semiotic principles such as 'provenance' and 'experiential
meaning potential'.

Experiential meaning potential: This refers to the idea that material signifiers have a
meaning potential that derives from what it is we do when we articulate them, and
from our ability to extend our practical experience metaphorically and turn action
into knowledge. This happens, for instance, with the textural characteristics of sound
qualities (as when singers adopt a soft, breathy voice to signify sensuality), in the
absence of a conventionalised 'system' of sound qualities (such as the symphony

Provenance: This refers to the idea that signs may be 'imported' from one context
(another era, social group, culture) into another, in order to signify the ideas and
values associated with that other context by those who do the importing. This
happens, for instance, in giving names to people, places or things (e.g. in naming a
perfume 'Paris') when there is no 'code', no sedimented set of rules for naming

Page 22


history would be a rich history of discursive practices over the period of existence of
the house - from the layer on layer of paint and wallpaper on doors and walls, to
doors blocked up and broken through.

Hierarchies of practices of articulation and interpretation

The fact that different kinds of knowledge are implied in practices of articulation and
interpretation finds its clearest expression in the semiotics of work and of professional
practices. 'Home' magazines (unlike do-it-yourself magazines) are not meant
as a text for articulation - except for certain kinds of do-it-yourself work - precisely
because the reader of the magazine is not meant to have the requisite knowledge, and
perhaps predominantly because she or he is meant to engage the services of someone
who has that knowledge, whether the interior designer, the landscape gardener
or the owner of the furniture store, all of whom advertise in or support the magazine
precisely for that reason.

There is a further reason for the solidification and reification of complexes of practices,
and that is that all action, all practice, rests on an understanding, a knowledge
of modes. Speaking a language rests on a knowledge of that language; playing a game
of soccer similarly rests on a knowledge of (the 'rules' of) the 'language' of that game;
making a chair rests on knowledge about the mode (design principles) of chairs.
Whether I choose to accept such a distinction and division depends on a number of
factors. I might never employ a landscape gardener, in the absolutely confident
knowledge that I - and only I - can design and 'articulate' the garden of my desire; but
my neighbour might feel absolutely daunted by the magnitude and the complexity of
the task of design, or by that of articulation. Other factors are less individual and more
socially regulated. Perhaps I belong to a social group where it is just not 'done' not to
have an interior designer, never mind someone to do my hair. I might be in a profes-
sion which has absolutely clearly delineated framings of what I can and what I cannot
do. 'Demarcation lines', whether in professions, trades or 'private lives', rest on such

At any one period certain of these couplings of practices can come undone, and
new couplings can come into being; certain aggregations are unmade and others are
newly made. Where practices are tightly framed, hierarchies of practices are likely to
develop. Production of a film in the 'classical' Hollywood fashion is an example par
excellence, with clearly delineated practices and roles in which, for instance, the
director instructs the cinematographer, who in turn instructs the gaffer (the chief
lighting technician), while both cinematographer and gaffer also lead their own
teams which can be quite large and have a strict division of labour, in which, for
instance, the cinematographer will leave the actual operation of the camera to the
camera operator, who, in turn, will leave adjusting the focus to the 'focus-puller',

Discourse 43

loading the film in the camera's magazine to the 'clapper-loader', and so on. Whether
the producer or the director is at the top of this hierarchy depends on the context,
with traditional Hollywood practice favouring the producer, and European art film
the director. What was never in question, however, was the fact of hierarchy.

Multimedia production, by contrast, is unmaking this particular aggregation of
discrete practices, and favours multi-skilling, complex practice, which is now not
seen as an 'aggregation' but as one integrated practice. If film-making demonstrates
specificity of skilling, the totally different arena of pedagogy, contemporary institutionalised
education, provides an example of de-skilling, at least in some Anglophone
countries. Formerly, in the English tradition, the teacher was in control of curriculum
and of its shape to a very large degree; he or she was in control of pedagogic practice
in the classroom, as well as being in charge of assessment and evaluation. This aggre-
gation of practices in one person is now being unmade by currently potent ideologi-
cal and political forces, and teachers are seen as 'delivering' (the newly fashionable
metaphor is significant) a curriculum designed elsewhere without the teacher's
input, and increasingly tight control is exercised over the mode of 'delivery', the
pedagogic practice in the classroom, as well as over assessment and its forms. This
de-skilling is, in our terms, taking from teachers the task of design, and is limiting
their potential for action to the field of 'delivery' only, a circumscribed form of pro-
duction, where before he or she was in control of design, of discursive arrangements
in the form of curricular content, of production as pedagogic practice, and of
practices of evaluation.

Such processes of multi-skilling and de-skilling are the effects in social and economic
life of larger-scale economic and ideological shifts. They have semiotic consequences
at every level, whether in the shaping of what counts as forms of knowledge
in 'disciplines' or 'subjects', or in the existence or decline of professions and trades,
or in the appearance, development and availability of clearly understood and articulated
semiotic means - the representational modes. They have effects beyond this, in
terms of the shaping of social subjects, and the possibilities of being social actors.

The field of discursive practice is social and therefore historical, and cannot be
understood without a sense of the historical/social contingencies of the arrangement
and configuration of practices and modes. Nor can we hope to understand fully the
shaping and the availability of modes and discourses without a clear sense of the
embeddedness of semiosis in the social, and of its historical shaping. In short, what
we are describing in this chapter and in those which follow is both the principles of
social 'semiosis as such', and at the same time of semiosis as it is at this time, in this
place, on this occasion. We describe the principles of human social semiosis, but we
stress that what are common principles have very different articulation at different
times and in different places.

At the same time this is not an attempt to suggest that all is fluid and that nothing
is (ever) fixed in the field of social semiosis, that we cannot, either as humans in the
social practices of interpretation and production, or as academic analysts in the
process of description, point to semiotic arrangements of known possibilities and
limitations. It is to say that we are talking about configurations at this time, in a field
which is subject to constant human social action.

Design in the contemporary period

The term 'design' is currently hugely fashionable. Whenever an idea becomes so
ubiquitous that it has entered common parlance to such an extent it is time to ask
why. Why is this idea everywhere? Why does it pop up in the most unlikely places?
And in particular, why am I using this word, this idea? Am I simply caught up in a

Fashions always speak of something real, which may not, however, be quite on
the surface of the debate, there for all to see. In the context of a book on multi-
modality, one answer to the questions we have just posed seems ready to hand: it is
the fact of multimodality itself which needs the notion of design. If the awareness of
multimodality, and of its move into the centre of theoretical attention in communi-
cation and representation, is a recent phenomenon, as we suggest it is, then the
emphasis on and interest in the concept of design is, we think, at least in part a
consequence of that.

To explain. In an era when monomodality was an unquestioned assumption (or
rather, when there simply was no such question, because it could not yet arise), all
the issues clustering around the idea of design - a deliberateness about choosing the
modes for representation, and the framing for that representation - were not only not
in the foreground, they were not even about. Language was (seen as) the central and
only full means for representation and communication, and the resources of
language were available for such representation. Where now we might ask 'Do you
mean language as speech or as writing?', there was then simply 'language'. Of course
there was attention to 'style', to the manner in which the resources of 'language' were
to be used on particular occasions. And of course there were other modes of
representation, though they were usually seen as ancillary to the central mode of
communication and also dealt with in a monomodal fashion. Music was the domain
of the composer; photography was the domain of the photographer, etc. Even though
a multiplicity of modes of representation were recognised, in each instance
representation was treated as monomodal: discrete, bounded, autonomous, with its
own practices, traditions, professions, habits.

By contrast, in an age where the multiplicity of semiotic resources is in focus,
where multimodality is moving into the centre of practical communicative action

