Document Type : Review Article


Department of Foreign Languages, Literature and Humanities Faculty, Kharazmi University, Tehran, Iran


This article explores how social semiotic analysis advances critical discourse analysis and visual studies in EFL/ESL teaching materials, media, educational software packages, and other language-related fields of study. First, the article introduces social semiotics as a systematic methodology and discipline by summarizing its theoretical background and methodological foundations. Second, it elaborates on the close relationships among language, signs, and cultural values as meaning-making resources and on how these interrelated concepts have been discussed in various studies in past years. Finally, the article reviews a variety of studies in this field and emphasizes that social semiotics is key to revealing the hidden meanings usually imposed and dictated by higher institutions and cultural backgrounds. The article may be useful for researchers who need a quick understanding of social semiotics and critical studies in language-related fields in order to conduct research in these areas.




Visual social semiotics is a framework which can assist professional communicators and media researchers who need practical tools for image analysis and may not have the time or desire to delve into a new field of study.

The social semiotic approach, particularly as grounded in the systemic functional linguistics of Michael Halliday (1978), emphasizes the importance of interpersonal meaning as a doorway to meaning formation practices and tends to focus on semiotic resources such as camera angle, modality, and gaze, which negotiate the relationship between an image and its audience [22].

Some scholars have claimed that the purpose of language is not only the exchange of information and knowledge; it can also be used by social and political communities to impose their own values, ideologies, and social behaviors and to sustain their dominance over others. Power, discrimination, control, and dominance are either explicitly or implicitly manifested in all kinds of discourses and languages. One way of controlling language and dominating people is through the media, in which various modes of manipulation shape people’s minds through incomplete visual and verbal images. These manipulated verbal and non-verbal messages greatly influence people’s ideas and can even alter their attitudes toward groups of people whom they have never met or have had little contact with; they can further change people’s stereotypical values and ideas about gender, race, society, and place. Relatedly, Nelson and Kern (2012) proposed that, just as we are now in a “post method” era [9], we are also in a post-linguistic era, owing to technological advancements that have introduced new platforms for non-linguistic and multimodal communication and expression for meaning making. Kress (2003) states that written language and visual images are governed by two different logics: the former is governed by time and temporal sequence, while the latter is governed by spatiality, simultaneity, and organized arrangement.

According to Kress (2003), language in itself cannot provide access to the meaning of multimodal texts and messages. As he later explains [25], we should look at meaning-making resources from a “satellite view” in which language, like Earth in space, is seen as “only one small part of a much bigger whole” (p. 15). In line with these statements, according to Kress and Van Leeuwen (1996, 2006), language is moving away from its former function as the medium of all kinds of communication toward a new function as only one medium of communication among others.

Social Semiotics

Different approaches have long been used to analyze the meaning of visual images. One of them is formalism, or universalism, which relies on a set of criteria to analyze all artworks. It deals only with the visible or intrinsic aspects of artworks and does not require any external knowledge. In formalism, comprehending the visible aspects alone leads to the meaning of an artwork [43].

Another approach is contextualism, which is concerned with the functions of culture, history, society, and politics. Here the meaning of an artwork is considered and analyzed in the context in which it was produced [15].

Semiotics offers another approach, which combines formalism and contextualism by focusing on the meaning of the message being transmitted. Put simply, semiotics is the study of signs and their meaning-making resources. A sign is anything that represents something else; for example, smooth flowing lines may represent love. Interpreting signs brings the elements that are behind closed doors to the foreground [15]. According to Danesi (2004) and Fiske (1990), semiotics, as a field of communication study, mostly deals with how messages, signs, and images are read, interpreted, and translated by readers through interaction with other images. Meanwhile, according to Kress (2003), semiotics is the scientific study of signs, each of which is an amalgamation of form (signifier) and meaning (signified). In Barthes’ visual semiotics, the key idea is the layering of meaning in visual images. In his view, there are two layers of meaning: the layer of denotation, which refers to what and who is depicted in the image, and the layer of connotation, which refers to the values and ideas expressed through the image [62].

Social semiotics, which is mostly concerned with visual analysis, is a branch of critical discourse analysis that examines the ways in which meaning is constructed by the “combination of different semiotic resources surrounding us including both obvious modes of communication such as language, gesture, images, and music, and less obvious ones such as food, dress, and everyday objects all of which carry cultural value and significance” [64].

According to Jewitt and Oyama (2001), social semiotics, as a subfield of semiotics, provides practical tools with which researchers can analyze and study visual texts in a systematic way. In other words, social semiotics is not an end in itself but a means for critical research. It seeks to explain meaning making as a social practice. The origin of social semiotics dates back to the combination of Halliday’s systemic functional linguistics and Paris structuralist semiotics [21].

Since visual social semiotics regards visual resources as doing specific kinds of semiotic work, it is considered functionalist [13]. Accordingly, the main aim of social semiotics is to examine how people use semiotic resources in the specific contexts of institutions and social practices.

Social semiotics follows the tradition of Halliday (1978) in recognizing three major kinds of semiotic meaning, which are performed simultaneously. Social semiotic analysis opens our eyes and other senses to a wide variety of semiotic resources at the levels of both production and interpretation. It helps researchers discover new semiotic resources as well as novel ways of applying signs that are already in use. Social semiotics draws on a wide variety of texts, including photographs, advertisements, magazine pages, and films [64].

Van Leeuwen (2005) holds that social semiotics does not offer ready-made answers; rather, it suggests ideas and new ways of framing questions and finding plausible answers to them. In his view, cultural, social, and semiotic meanings can be articulated differently if we behave differently.

All the meanings that exist in the world are both subjective and objective. As Van Leeuwen (2005) states, this phenomenon is very much like what Halliday has referred to as meaning potential: words and sentences do not have a determined and fixed meaning, and their meanings can differ across social contexts. Hence, every meaning potential, whether visual or textual, should be examined in its own social context.

There are three schools of social semiotics which have shifted their attention from linguistics to the domains of non-linguistic modes of communication:

  1. The Prague school, whose members focused on art, theater, cinema, and costume;
  2. The Paris school, whose members concentrated on photography, painting, cinema, fashion, music, and comic strips; and
  3. The Halliday or Sydney school, in which Australian scholars study visual semiotics and semiotic literature. This school has two main sources: the first developed out of critical linguistics and was extended to include other semiotic modes, and the second is an expansion of Hallidayan systemic-functional linguistics [22].

Social semiotics, which is strongly influenced by the Paris school of semiotics, has shifted its attention away from structures and systems, just as linguistics changed its focus from the sentence to the text and then to the context. Likewise, the focus is no longer on grammar and signs but on discourse and semiotic resources, respectively [22].

One of the most notable characteristics of the social semiotic approach is its interdisciplinary nature, which makes it a new and distinctive approach to the theory and practice of semiotics. Social semiotics is not a pure or self-contained theory or domain [64].

As mentioned previously, social semiotics follows Halliday’s (1978) tradition in recognizing three major kinds of semiotic meaning, which are performed simultaneously. According to Halliday (1978), these three metafunctions are the ideational, which creates representations; the interpersonal, which creates and establishes a relation between the writer and the reader; and the textual, which brings the representations and interactions together into a meaningful whole. The recognition of these three major functions in language has largely influenced Kress and Van Leeuwen’s social semiotic framework of visual communication grammar [13]. Kress and Van Leeuwen (1996) extended Halliday’s metafunctions to visual images and other modes of communication under different terms: representational, interactive, and compositional meanings replace the ideational, interpersonal, and textual metafunctions, respectively.

Representational metafunctions

The model proposed by Kress and Van Leeuwen (1996) suggests that representational meaning is realized through two kinds of processes: narrative and conceptual. This kind of meaning deals with how the people, settings, and things within an image interact with the readers and viewers of that image. Every semiotic act involves represented participants (those who are depicted and shown in the images) and interactive participants (those who look at, read, or listen to the represented participants). As Van Leeuwen (2008) notes, the represented participants, or social actors, are involved in either narrative or conceptual processes.

Narrative processes

The narrative meaning comprises four kinds of processes (action, reactional, speech and mental, and conversion processes) and represents social actors as doing something to or for each other. Narrative meanings focus on social actors’ actions, reactions, and transactions, which are realized by vectors, that is, lines of direction formed by bodies, eyes, and gestures. In action processes, the Actor is the participant from whom the vector emanates or who forms the vector. Action processes can be either transactional or non-transactional. A transactional process has two participants, the Actor and the Goal, whereas a non-transactional process has only one participant, the Actor. In reactional processes, the vector is formed by a gaze or eye-line, and the terms Reactor and Phenomenon are used instead of Actor and Goal, respectively. Like action processes, reactional processes can be either transactional or non-transactional. In speech and mental processes, the represented participants are connected to their speech or thoughts through dialogue balloons or thought bubbles. In conversion processes, there is a third kind of participant who is the Goal in relation to one participant and the Actor in relation to another; Kress and Van Leeuwen (2006) call this participant the Relay. Besides the participants in narrative processes, there are secondary participants that are not related to the main social actors by means of a vector; these are called circumstances. Circumstances of means are the tools with which social actors perform their actions. Locative circumstances relate the participants to a specific setting through the contrast between the foreground and background of the picture. Finally, the circumstance of accompaniment is a participant that has no vector relating it to other participants.

Conceptual processes

The conceptual meaning seeks to define, classify, and analyze the participants by classification as well as symbolic and analytical structures within the visual images.

Classificational Processes

Classificational processes relate participants to one another through a taxonomy. Three types of taxonomy are found in conceptual processes: covert, single-leveled overt, and multi-leveled overt.

Analytical Process

Analytical processes relate the represented participants in a part-whole structure so that they fit together. The parts of the structure are called possessive attributes, and the whole is the carrier.

Symbolic Process

Symbolic meanings take two forms: attributive and suggestive. In the attributive symbolic process, one of the represented participants is highlighted through its color, size, lighting, or placement, while the suggestive symbolic process represents only one participant, who carries the meaning [22].


Table 1: Representational visual structures (ideational) (redesigned from The Grammar of Visual Design by Kress and Van Leeuwen (1996))

Representational structures

Narrative representations

Processes

  • Action (Actor + Goal)
  • Reactional (Reactor + Phenomenon)
  • Speech & Mental
  • Conversion
  • Geometrical Symbolism

Circumstances

  • Setting
  • Means
  • Accompaniment

Conceptual representations

Classificational processes

  • Covert
  • Overt (single or multi-leveled)

Analytical processes

  • Unstructured
  • Structured
  • Temporal
  • Exhaustive & Inclusive
  • Conjoined & Compounded Exhaustive structures
  • Topographical & topological processes
  • Dimensional & quantitative topography
  • Spatio-temporal

Symbolic processes

  • Attributive
  • Suggestive



Interpersonal metafunctions

The second metafunction is the interpersonal one, in which there is an interaction between the represented and interactive participants. According to Kress and Van Leeuwen (2006 [1996]), the interactive or interpersonal meanings of images can be examined through the semiotic resources of contact (demand or offer), social distance (intimate, social, or impersonal), and attitude (subjectivity versus objectivity, including involvement, detachment, viewer power, equality, representation of power, action orientation, and knowledge orientation). With respect to the presence or absence of gaze, images fall into two categories: demand images, in which represented participants look directly at their viewers, and offer images, in which they do not. The size of the frame, ranging from extreme close shot to very long shot, is usually used to indicate various degrees of social distance between the interactive and represented participants. Moreover, attitude, which is realized through perspective and angle, divides images into objective and subjective ones; in subjective images, the horizontal angle indicates involvement or detachment between the social actors and viewers, and the vertical angle indicates power differences among the participants.


Table 2: Interactive visual meanings (interpersonal) (redesigned from Kress and Van Leeuwen’s The Grammar of Visual Design (1996))

Interactive meanings

Contact

Image act

  • Offer
  • Demand

Gaze

  • Direct (degrees of engagement)
  • Indirect (degrees of disengagement)

Social distance

Size of frame

  • Close (Intimate/Personal)
  • Medium (Social)
  • Long (Impersonal)

Attitude

Subjective image

  • Horizontal angle (involvement and detachment)
  • Vertical angle (Power and equality)

Objective image

  • Action orientation (frontal angle)
  • Knowledge orientation (top-down angle)

Modality

Color

  • Color saturation
  • Color differentiation
  • Color modulation

Contextualization

  • Absence of background
  • Full detail

Representation

  • Maximum abstraction
  • Maximum representation

Depth

  • Absence of depth
  • Maximally deep perspective

Illumination

  • Full representation of light and shade
  • Absence of light and shade

Brightness

  • Maximum brightness
  • Black and white or shades of light grey and dark grey

Coding orientation

  • Technological
  • Sensory
  • Abstract
  • Naturalistic



As mentioned before, gaze as a semiotic resource creates interaction between the represented participants and the interactive participants, that is, the viewers. If the represented social actors do not look directly at the viewers, the image is an offer image, which depicts the represented participants as objects of the viewers’ scrutiny. If they look directly at the viewers, the image is a demand image, which addresses the viewers directly in order to say something or transmit a message [22, 63]. Understanding gaze is essential to semiotic analysis, because gaze constructs the interaction and relationship between the represented and interactive participants. In a demand picture, image producers attempt to draw the viewers into an imaginary relation with the represented participant, for example, one that leads to purchasing an expensive luxury product. In offer images, in which the viewer is addressed indirectly, interactive participants can instead contemplate the depicted participants and the products being advertised [22].


Vectors connect the interactive and represented participants. A vector can be a gaze (eye-line direction), a pointing finger, an extended arm, or an object held in the hand in a specific direction [22, 39, 40, 63, 65]. Recognizing vectors is crucial for identifying which participant is the Actor, the Goal, the Reactor, the Phenomenon, the Speaker, or the Senser. The Actor is the participant from whom a vector emanates, and the Goal is the one at whom the vector is directed. The Reactor, in a reactional process, is the active participant whose gaze forms the eye-line vector. The Phenomenon, in a transactional process, is the passive participant at whom the gaze vector is directed. The Senser is the participant from whom a thought bubble emanates, and the Speaker is the participant from whom a dialogue balloon emanates [22].


Distance creates different relations and social differences between the social actors and the viewers of visual resources. As an indication of how interpersonal or intimate the relationship is, distance varies from the extreme close shot to the very long shot. The extreme close shot, which shows less than the head and shoulders, creates an intimate relationship between the viewer and the depicted participants; this close distance shows that the participant is ‘one of us’. The close-up, which shows the participants’ head and shoulders, indicates close personal distance. The medium close shot, which cuts off the participants at the waist, indicates far personal distance. The medium shot, which depicts them from the knees up, represents social distance. The medium long shot, which shows the full figure of the social actors, indicates close social distance. The long shot, in which the participants occupy about half of the frame, implies far social distance. Finally, the very long shot, in which there are groups of people around the social actors, implies public distance; in this case, the participants are shown as strangers to the viewers [22].
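For researchers who code images by hand, the scale described above can be kept as a simple lookup table. The following Python sketch is only an illustration: the label strings and the function name `social_distance` are assumptions introduced here, not part of Kress and Van Leeuwen’s framework, and the mapping merely restates the paragraph above.

```python
# Hypothetical coding scheme: the shot labels and distance labels below
# paraphrase the scale described in the text; they are not an official taxonomy.
SHOT_TO_DISTANCE = {
    "extreme close shot": "intimate distance",
    "close-up": "close personal distance",
    "medium close shot": "far personal distance",
    "medium shot": "social distance",
    "medium long shot": "close social distance",
    "long shot": "far social distance",
    "very long shot": "public distance",
}


def social_distance(shot: str) -> str:
    """Return the social distance associated with a shot label."""
    key = shot.strip().lower()  # tolerate case and surrounding whitespace
    if key not in SHOT_TO_DISTANCE:
        raise ValueError(f"unknown shot label: {shot!r}")
    return SHOT_TO_DISTANCE[key]


if __name__ == "__main__":
    for shot in ("Close-up", "long shot"):
        print(shot, "->", social_distance(shot))
```

A table of this kind only records the analyst’s judgment about frame size; deciding which shot category an image belongs to remains an interpretive step.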


Angle falls into two main categories, horizontal and vertical, which imply different degrees of attitude. The horizontal angle represents social actors from the front or from the side and indicates the involvement or detachment of the represented participants with respect to the viewers. In other words, if the social actors are shown frontally, they are involved with the viewers, but if they are shown from an oblique horizontal angle, they are detached from the viewers.

The second type of angle, the vertical angle, shows the participants from above, from below, or at eye level. If the participants are shown from a low vertical angle, they have power over the viewers; if they are shown from above, the viewers are in a position of power and superiority over the participants. Representing social actors from an eye-level (straight-on) vertical angle indicates equality, meaning that there is no power difference between them and the viewers [10, 63, 65].
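The reading of horizontal and vertical angle described above can likewise be sketched as a small coding aid. This Python example is a minimal sketch under stated assumptions: the labels "frontal", "oblique", "low", "eye-level", and "high", and the function name `code_attitude`, are hypothetical choices made for illustration, with "low" meaning that the participant is seen from below.

```python
from typing import Dict

# Hypothetical labels; the interpretations restate the two paragraphs above.
HORIZONTAL = {
    "frontal": "involvement",
    "oblique": "detachment",
}
VERTICAL = {
    "low": "represented participant has power over the viewer",
    "eye-level": "equality between participant and viewer",
    "high": "viewer has power over the represented participant",
}


def code_attitude(horizontal: str, vertical: str) -> Dict[str, str]:
    """Map a pair of angle labels to the attitudes they are said to signal."""
    h = horizontal.strip().lower()
    v = vertical.strip().lower()
    if h not in HORIZONTAL or v not in VERTICAL:
        raise ValueError(f"unknown angle labels: {horizontal!r}, {vertical!r}")
    return {"horizontal": HORIZONTAL[h], "vertical": VERTICAL[v]}


if __name__ == "__main__":
    print(code_attitude("frontal", "low"))
```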

Light and color

The use of different colors in visual discourse may evoke different feelings. Particular symbolic meanings are attached to particular colors, and these meanings may change according to context. Colors are described in terms of purity, darkness or lightness, tone, and degree of saturation. For example, light colors, and lightness in general, indicate hope, brightness, and happiness. In contrast, darkness or dark colors may evoke feelings of misery, hardship, and sadness in viewers. The presence of a shadow may mean that something is concealed or unclear to the reader. Image producers therefore have to pay full attention to all these factors, which influence the meaning of their pictures.


The last crucial factor at the interactive (interpersonal) level is modality, which describes the degree of truthfulness and credibility of a visual image.

It is crucial for interactive participants in visual communication to understand whether a visual image depicts something real or something fictional. Because a picture can have various degrees of truthfulness, modality plays an important role in viewers’ decisions and feelings in visual semiotic analysis. Image producers may represent people, things, and places as though they really exist in that way or as though they do not, that is, as imaginings, fictions, and fantasies (Kress & Van Leeuwen, 2006). Modality and its markers therefore help us distinguish between what is real and what is not. As stated by Kress & Van Leeuwen (2006), the modality markers are as follows:

  • Color saturation, which is a scale from full color saturation to the total absence of color (black and white).
  • Color differentiation, which is a scale from a maximum diversified range of colors to the minimum colors (monochrome).
  • Color modulation, which is a scale from fully modulated color to plain and unmodulated color.
  • Contextualization, which is a scale from the absence of background to the most fully articulated and detailed background.
  • Representation, which is a scale from the maximum abstraction to the maximum representation of pictorial detail.
  • Depth, which is a scale from the absence of depth to the maximum depth of perspective.
  • Illumination, which is a scale from the fullest representation of light and shade to its absence.
  • Brightness, which is a scale from the maximum brightness to just two degrees: black and white, or dark grey and lighter grey, or two brightness values of the same color (pp. 160-162).

According to Royce (2002), visual images generally portray various degrees of modality along a continuum ranging from the highest to the lowest. That is, the highest modality is not reached at either end of the continuum; rather, modality increases as one moves along the scale toward the middle, where it is highest, and then decreases again toward the other end of the scale [22]. Among all the factors and visual resources discussed, gaze, distance, and angle are the most important semiotic resources for deciding who is depicted as ‘others’ [63]. The same source mentions three important strategies for representing participants as others and strangers to the interactive participants. These three strategies are as follows:

  • Disempowerment, which represents participants as below us and subservient.
  • Distanciation, which shows, through medium and long shots, that social actors are not close or intimate to us.
  • Objectivation, which depicts social actors as objects of the viewers’ scrutiny rather than as subjects addressing the viewers with their gaze [63].

Compositional metafunctions

According to Kress & Van Leeuwen (2006 [1996]), compositional meaning, which relates representational and interpersonal meanings to each other, is realized through three interrelated semiotic resources:

  1. Information value: The placement of elements in the various zones of an image may create specific meanings for the viewers. If an image is placed on the left of a page, it indicates old or given information with which the viewers are familiar. This kind of information is treated as already known, agreed upon, reasonable, practical, and understandable. If the picture is located on the right of the page, it represents novelty and new information with which the viewer is not familiar. This information value implies that viewers need to pay special attention to the element and that it is doubtful and disputable. If the image is at the top of the page, it presents idealized and most salient information, and if it is located at the bottom, it represents real and factual information dealing with practical, down-to-earth matters [22, 39, 40, 65].
  2. Salience: This semiotic resource deals with how represented participants are depicted so as to catch the viewers’ attention. Image producers usually try to make some elements more salient than others. This can be done through the element’s relative size, color, tonal contrast, sharpness, saturation, distance, and placement (information value) in a given picture. This ‘visual weight’ makes the viewers pay much more attention to that particular element. In general, elements and pictures located at the top and left of a page become more prominent and salient than others [22].
  3. Framing: This semiotic resource refers to the presence of dividing lines, actual frame lines, and fractions that connect or disconnect the elements in a picture. According to Van Leeuwen (2005), framing is the disconnection of the elements of a composition, realized by frame lines, pictorial framing devices, empty spaces, or discontinuities of color. In contrast, the absence of frame lines between two pictures connects them and means that the elements within the image belong together. It is assumed that disconnected elements in pictures often have independent, separate, and even contrasting meanings, whereas connected elements and visual images usually belong together and are complementary [22, 64].
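The left/right and top/bottom zones described under information value can be illustrated with a short Python sketch. It rests on several assumptions that are not in the source: a rectangular page with its origin at the top-left corner, a hard split at the midlines (actual layouts are read far less mechanically), and the hypothetical function name `information_value`.

```python
from typing import Tuple


def information_value(x: float, y: float, width: float, height: float) -> Tuple[str, str]:
    """Classify a point on a page by zone.

    Assumes image-style coordinates: (0, 0) is the top-left corner,
    x grows to the right and y grows downward.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not (0 <= x <= width and 0 <= y <= height):
        raise ValueError("point lies outside the page")
    # Left half is read as given information, right half as new information.
    horizontal = "given" if x < width / 2 else "new"
    # Upper half is read as ideal information, lower half as real information.
    vertical = "ideal" if y < height / 2 else "real"
    return horizontal, vertical


if __name__ == "__main__":
    # An element centered near the top-left corner of a 100 x 100 page.
    print(information_value(10, 10, 100, 100))
```

Salience and framing are not modeled here, since the text describes them in terms of perceptual judgments (relative size, contrast, dividing lines) rather than position alone.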

Visual Grammar

Kress and Van Leeuwen’s (1996, 2006) visual grammar treats visual images like the words of a language, which have meaning only when they are combined and integrated with each other to form a meaningful whole. In other words, images are made up of smaller elements (semiotic resources), just as language is composed of smaller units such as words. Kress and Van Leeuwen (2006 [1996]), in their book “Reading Images: The Grammar of Visual Design”, and later Van Leeuwen (2008), in his book “Discourse and Practice: New Tools for Critical Discourse Analysis”, attempted to offer a method for interpreting visual images on the basis of their context and formal aspects. The grammar of visual design makes it possible for researchers to compare and contrast the grammars of verbal and visual forms of communication. As Callow (1999) states, the process of critical reading begins when readers or students look at an image and form some initial thoughts and reactions. Although Kress and Van Leeuwen were largely influenced by Roland Barthes’ theory, they hold a different view of the relation between images and texts. Barthes (1977) believed that the meanings of images and texts are interdependent, whereas Kress and Van Leeuwen regard text and image as two independent modes of communication. In their visual grammar, the meaning of an image is related to a given text but not dependent on it [22, 65].

Visual grammar, which is built on Michael Halliday’s systemic functional grammar, uses the design elements of color, perspective, mood, social distance, framing, and composition to demonstrate how the grammar of visual design leads to meaning making [39, 62]. Visual grammar acts as a descriptive tool and “a methodology for research in areas such as media representation, film studies, children’s literature, and the use of illustrations and layout in school textbooks” [22].

The main difference between visual content analysis and social semiotic analysis is that the former quantifies variables and tests hypotheses, whereas the latter deals with the text in a qualitative manner. In other words, social semiotic analysis does not stop at the level of description; it goes beyond description and critically reads and interprets visual images to explore their hidden meanings and meaning potentials, as a form of critical visual discourse analysis [50].

Review of the Related Literature

Many researchers have attempted to uncover how groups of people of different race, gender, age, and social class are represented to the public in multimodal texts. They have drawn upon multimodal critical discourse analysis to analyze various linguistic, non-linguistic, and semiotic resources, for instance, photographs and other graphic representations [55], children's toys [17], political cartoons [11], music [32], TV commercials [5, 7, 20, 53], websites and online communication [2-3, 29, 33, 56, 57], newspapers [16, 31, 67], and video games [12].

Among all these studies, printed books, software packages, and gender bias have received much attention among applied linguists. Following Erving Goffman’s (1979) Gender Advertisements, many scholars, such as Hartman and Judd (1978), Hellinger (1980), Porreca (1984), Graci (1989), Peterson & Kroner (1992), Reese (1994), Poulou (1997), [26], Gharbavi & Mousavi (2012), and [1], to cite a few, have conducted multimodal textbook analyses from a gendered perspective.

In an attempt to uncover the hidden meanings and the stereotypical and biased portrayals of various groups of people in multimodal educational texts, many researchers have drawn upon multimodal critical discourse analysis and conducted visual analyses of linguistic or non-linguistic semiotic resources [3, 5-7, 11-13, 17, 19-20, 26, 29-33, 36, 38, 42, 52-53, 55-59, 67].

Through a survey of the visual images in several English learning materials and textbooks, Hartman and Judd (1978) found that women were stereotypically associated with traditional roles such as housework and baby care, while men were shown doing more masculine jobs such as fixing the car and mowing the lawn. In 1989, Patricia Kaye conducted a multimodal analysis of the Collins Cobuild Dictionary to check the neutrality of its visual images and illustrations in portraying men and women. She concluded that, despite the neutrality of the verbal mode and written texts, the visual images and illustrations suffered from sexism and stereotypical portrayals of women. She supported her findings with examples, including the use of singular pronouns to refer to people of unknown gender and the attribution of negative adjectives and nouns to women.

Peterson and Kroner (1992) studied the contents of 27 introductory psychology texts and 12 development texts to evaluate gender bias. They concluded that male participants were represented significantly more than their female counterparts in the construction and evaluation of texts and in the frequency and type of representation within texts.

To investigate gender bias in English learning textbooks, Ansary and Babaii (2003) conducted a multimodal discourse analysis of two ESL textbooks (Right Path to English I and II). They concluded that only a very limited range of occupations, such as student and nurse, was attributed to women, while a more diversified range of jobs and professional occupations, such as teacher, policeman, doctor, soldier, farmer, and dentist, was attributed to men.

In addition, Machin and Van Leeuwen (2009) conducted a critical multimodal analysis of children’s war toys over the last 100 years to provide a detailed account of contemporary war toys. They concluded that manufacturing war toys and sending them to children across the world is in complete harmony with American industry, global economic ambitions, and the American military forces, especially since the end of World War II.

Martinez Lirola and Chovanec (2012) conducted a critical multimodal discourse analysis to reveal how rhetorical and multimodal strategies in cosmetic surgery advertisements persuade women to pursue the male-defined ideal body shape. The results of this study showed that body-oriented ideologies and male-defined femininity can be identified in the advertisements’ multimodal and non-verbal representations.

Salbego, Heberle, and Soares da Silva Balen (2015) investigated the extent to which the visual mode in multimodal texts can enhance students’ understanding by applying Kress and van Leeuwen’s visual grammar. The results of their study revealed that visual resources can scaffold and enhance students’ comprehension, especially at beginner level, and also help them understand verbal and textual discourse more effectively.

Beyond educational texts and L2 learning contexts, a few studies have been conducted on inflight magazines as a particular form of tourism discourse in international settings. One of the most important is Thurlow and Jaworski (2003), who examined 72 international inflight magazines to see how a micro-level phenomenon such as the inflight magazine as a tourism genre is connected with the macro-level structures and processes of globalization. The analysis indicated that inflight magazines reinforce globalization through various discursive and visual strategies, such as representing global destinations, cities, and celebrities and displaying global route maps. Inflight magazines reinforce globalization because they are both a global medium (genre) and carriers of global messages. Their study generated six dominant content categories for inflight magazines: (1) travel and destination, (2) lifestyle and culture, (3) games, (4) passenger/inflight information, (5) airline news, and (6) business information.

In 2008, Small, Harris, and Wilson conducted a critical discourse analysis of advertisements in inflight magazines to examine how these advertisements create, mediate, and recreate discourses around air travel. Based on a selection of Qantas and Air New Zealand inflight magazines, the results of the content analysis demonstrated that the majority of these advertisements are designed for a particular “elite” minority of air passengers who can afford a luxury lifestyle and travel and are able to purchase the expensive products being advertised. The study further indicated that these wealthy airline travelers are mostly white people. The authors therefore concluded that the visual and discursive elements in inflight magazine advertisements function as a means of socially sorting air travelers.

To investigate how inflight magazines recreate themselves as a global product, Maci (2012) examined 10 American and European inflight magazines using semiotic and discursive analysis. The findings of the study suggested that the global layout, format, pictures, and content of inflight magazines lead readers to perceive them as a global and international product, and that the airline industry, like other international corporations, tends to apply marketing strategies to promote and differentiate “national interests in an international context” via its inflight magazines (p. 213).

Conradie (2013) conducted a critical discourse analysis of race and gender in advertisements in the South African domestic in-flight magazine Indwe. The content analysis in this study was done at two levels: product categories and presence of people. The classification of advertised products in Conradie’s (2013) study was based on Small et al.’s (2008, p. 24) criteria, which are subdivided into 10 product categories: airlines services, business and finance cars, clothes and body products, food and alcohol, homewares/white goods, jewelry and watches, property and real estate, shopping, technology, travel and tourism, etc. The results indicated that, regardless of the gender or group type of the represented participants, there is a clear tendency for people aged 20-40 to appear in the advertisements, which is consistent with the findings for Qantas and Air New Zealand. As in Small et al. (2008), the images contain more white people than non-whites, including Asian and colored groups. In other words, white people are still the dominant race. However, this trend is less marked in Conradie’s (2013) study, and it reveals the advertisers’ assumptions about the demographics of this elite minority. The results of this study therefore suggest that portrayals of race in Indwe are more balanced, but gender depictions still differ significantly from those in Small et al.’s (2008) research.


In the 1980s and 1990s, when multimodal discourse analysis was gaining importance among applied linguists and other scholars, the verbal mode was emphasized and other important modes of communication, such as visual images, gestures, sounds, and music, were left out of consideration [34]. With the introduction of the Grammar of Visual Design by Kress and Van Leeuwen (2006 [1996]), these other modes of communication, including visual images, were also taken into account. As a consequence, many applied linguists started to examine the role of visual images in multimodal discourse analysis and language learning processes. A look at most of these studies shows that multimodal discourse analysis of EFL/ESL textbooks, educational software packages, media including newspapers and television, and inflight magazines has been the most favored kind of research in this field. Moreover, most of these studies focused on gender bias, and other kinds of stereotypes were rarely investigated in multimodal discourse analysis. The significance of the study lies partly in the nature of social semiotics, which is a branch of critical studies in applied linguistics. The findings of such studies can help designers, researchers, advertisers, ESL/EFL learners, and material developers become visually literate and aware of the hidden messages that visual images can communicate.

Citation: S. Sadat Ghasemi*, A Mini Review of Social Semiotic and Critical Visual Studies in Language-Related Fields of Study. Int. J. Adv. Stu. Hum. Soc. Sci. 2023, 12 (4):268-281.

Copyright © 2023 by SPC (Sami Publishing Company). This is an open access article distributed under the Creative Commons Attribution License (CC BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

  1. Abbas-Nejad-Konjin, A gender analysis of Iranian middle school textbooks (Doctoral dissertation, University of British Columbia), 2012.
  2. Appadurai, Theory, culture and society, 1990, 7, 295–310.
  3. Chiew, Multisemiotic mediation in hypertext. In K. L. O’Halloran (Ed.) Multimodal discourse analysis: Systemic-Functional perspectives (pp. 131-162). London: Continuum, 2004.
  4. Gharbavi, S.A. Mousavi, English Language and Literature Studies, 2012, 2, 85-93.
  5. P. Baldry, Multimodality and Multimediality in the Distance Learning Age, Campobasso, Italy: Palladino Editore, 2000.
  6. P. Baldry, Phase and transition type and instance: patterns in media texts as seen through a multimodal concordancer. In K. L. O’Halloran (Ed.), Multimodal Discourse Analysis (pp. 83-108). London: Continuum, 2004.
  7. P. Baldry, P.J. Thibault, Multimodal Transcription and Text Analysis, London: Equinox, 2006.
  8. Kumaravadivelu, Beyond methods: Macrostrategies for language teaching. Yale University Press, 2003.
  9. Kumaravadivelu, TESOL Quarterly, 1994a, 28, 27–48.
  10. C. Camiciottoli, The language of business studies lectures. Amsterdam / Philadelphia: John Benjamins Publishing Company, 2007.
  11. E.M. Mazid, Discourse & Communication, 2008, 2, 433-457.
  12. Jewitt, Discourse: studies in the cultural politics of education, 2005, 26, 315-331.
  13. Jewitt, In P. LeVine & R. Scollon (Eds.) Discourse and technology: Multimodal discourse analysis (pp. 184-195). Washington: Georgetown University Press, 2004.
  14. Thurlow, A. Jaworski, Journal of sociolinguistics, 2003, 7, 579-606.
  15. S. Jeffers, Art Education, 2000, 11, 40-45.
  16. Machin, A. Mayr, Discourse & Society, 2007, 18, 453–477.
  17. Machin, T. Van Leeuwen, Critical Discourse Studies, 2009, 6, 51-63.
  18. Wyse, R. Andrews, J.V. Hoffman, The Routledge international handbook of English, language and literacy teaching. London: Routledge, 2010.
  19. Adami, London: National Center for Research Methods, 2013.
  20. Babaii, H. Ansary, The structure of and stricture on TV Commercials in Iran. Paper presented at the 5th Conference on Theoretical and Applied Linguistics. Tehran: Allame Tabataba'ii University, (2001, March).
  21. Aiello, Journal of Visual Literacy, 2006, 26, 89-102.
  22. Kress, T. Van Leeuwen, Front pages: (The critical) analysis of newspaper layout, Approaches to media discourse, 1998.
  23. Kress, TESOL Quarterly, 2000, 34, 337-340.
  24. R. Kress, Literacy in the new media age. Psychology Press, 2003.
  25. R. Kress, Multimodality: A social semiotic approach to contemporary communication. Taylor & Francis, 2010.
  26. Ansary, E. Babaii, Iranian Journal of Applied Linguistics, 2003, 6, 57-69.
  27. Callow, Image Matters: Visual Texts in the Classroom. Newtown: Primary English Teaching Association, 1999.
  28. Small, C. Harris, E. Wilson, Journal of tourism and cultural change, 2008, 6, 17-38.
  29. M. Guijarro, J.M.P. Sanz, Journal of Pragmatics, 2008, 40, 1601–1619.
  30. P. Graci, Foreign Language Annals, 1989, 22, 477-486.
  31. S. Knox, Discourse & Communication, 2009, 3, 145–172.
  32. Thompson, Research Studies in Music Education, 2002, 19, 14-21.
  33. Young, In Proceedings of the 17th International Conference on Computers in Education. Hong Kong: Asia-Pacific Society for Computers in Education, 2009.
  34. L. O’Halloran, Interdisciplinary Perspectives on Multimodality: Theory and Practice, Proceedings of the Third International Conference on Multimodality, Palladino, Campobasso, 2009.
  35. L. O’Halloran, Linguistics and Education, 2000, 10, 359–388.
  36. L. Porreca, TESOL Quarterly, 1984, 18, 704-707.
  37. S. Stolley, A.E. Hill, Teaching Sociology, 1996, 24, 34-45.
  38. Reese, Social Studies Reviews, 1994, 33, 12-15.
  39. Unsworth, J. Wheeler, Reading: Language and Literacy, 2002, 36, 68–74.
  40. Unsworth, Multiliteracies and Metalanguage: Describing Image/Text Relations as a Resource for Negotiating Multimodal Texts. In Handbook of research on new literacies (pp. 377-406). Routledge, 2008.
  41. Danesi, Messages, signs, and meanings: A basic semiotic textbook in semiotics and communication theory. Toronto: Canadian Scholars’ Press, 2004.
  42. Hellinger, Women's studies international quarterly, 1980, 3, 267-275.
  43. Prater, Art Education, 2002, 9, 12-17.
  44. A. Conradie, African identities, 2013, 11, 3-18.
  45. A.K. Halliday, Language as Social Semiotic: The Social Interpretation of Language and Meaning. London: Edward Arnold, 1978.
  46. E. Nelson, R. Kern, In Principles and practices for teaching English as an international language (pp. 47-66). Routledge, 2012.
  47. H. Tahririan, E. Sadri, Iranian Journal of Applied Linguistics (IJAL), 2013, 16, 137-160.
  48. M. Lirola, J. Chovanec, Discourse & Society, 2012, 23, 487–507.
  49. Salbego, V.M. Heberle, M.G.S. da Silva Balen, Calidoscópio, 2015, 13, 5-13.
  50. Bell, Content Analysis of Visual Images. In: Van Leeuwen, T. and Jewitt, C., Eds., The Handbook of Visual Analysis, Sage, London, 2001, 10-34.
  51. Kaye, ELT Journal, 1989, 43, 192–195.
  52. L. Hartman, E.L. Judd, TESOL Quarterly, 1978, 12, 383-393.
  53. J. Thibault, Multimodality and multimediality in the distance learning age, 2000, 31, 1-384.
  54. Barthes, Image, Music, Text. London: Fontana/Collins, 1977, 67.
  55. Iedema, Visual Communication, 2003, 2, 29–57.
  56. H. Jones, 14 Sites of engagement as sites of attention: Time, space and culture in electronic discourse, 2005.
  57. Norris, Discourse and technology: Multimodal discourse analysis, 2004, 101.
  58. Peterson, T. Kroner, Psychology of Women Quarterly, 1992, 16, 17-37.
  59. Poulou, Language Learning, 1997, 15, 68-73.
  60. M. Maci, Altre modernità, 2012, 196-218.
  61. Royce, TESOL Quarterly, 2002, 36, 191–205.
  62. Van Leeuwen, A multimodal perspective on composition, Ensink. T., & Sauer, C.(eds.) Framing and Perspectivising in Discourse. Amsterdam: John Benjamins, 2003.
  63. Van Leeuwen, Discourse and practice: New tools for critical discourse analysis, Oxford university press, 2008.
  64. Van Leeuwen, Introducing Social Semiotics. New York: Routledge, 2005.
  65. D. Royce, T.D. Royce, W.L. Bowcher, New directions in the analysis of multimodal discourse, 2007, 63, 109.
  66. P. Coelho, Estudos em Comunicação, 2008, 4, 1-14.