Google Corpus

A new corpus tool has been announced by Google, as reported in this NYTimes article. This represents a 500 billion-word corpus (yes, you read it right) of written English taken from a selection of their scanned books published in the 200 years between 1800 and 2000. The corpus allows for varieties of British and American English, or all varieties, as well as a selection of publications in other languages.

The researchers that developed the resource have enabled online searching of words & n-grams on a dedicated Google website. The website provides handy graphical comparisons of relative frequency over time, which can include combinations of words or phrases (n-grams). The example here compares common quantifying expressions. Links are then provided to 'search for' your word/phrase in Google books, making an instant web-concordance.
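To make "relative frequency over time" concrete, here is a minimal sketch in Python. It is not how the Google site computes its graphs; the function names and the toy texts are my own assumptions for illustration. It simply counts how often a phrase occurs among all the n-grams of the same length in each year's text.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return every run of n consecutive tokens, joined into a string."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequency(texts_by_year, phrase):
    """Share of same-length n-grams in each year's text that match phrase."""
    n = len(phrase.split())
    result = {}
    for year, text in sorted(texts_by_year.items()):
        grams = ngrams(text.lower().split(), n)
        counts = Counter(grams)
        result[year] = counts[phrase] / len(grams) if grams else 0.0
    return result


# Hypothetical toy "corpus": one short text per year.
toy = {
    1900: "plenty of rain and plenty of sun",
    2000: "a lot of rain and lots of sun",
}
print(relative_frequency(toy, "plenty of"))
```

With real data you would tokenise more carefully (punctuation, capitalisation), but the principle of dividing a count by the size of that year's sample stays the same.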

If that is not enough for you, you can also download the datasets from Googlelabs. These provide the already analysed relative frequency and distribution of strings, although both OpenOffice & MSOffice are unable to open the enormous files.
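If an office package cannot open the files, a few lines of script that read one row at a time will cope with any size. The sketch below is only an illustration: it assumes tab-separated rows with the n-gram, the year and a count in the first three columns, which you should check against the actual download, and the function name and sample rows are hypothetical.

```python
import csv
import io
from collections import defaultdict


def yearly_counts(lines, target):
    """Sum the counts per year for one n-gram, reading one row at a time."""
    totals = defaultdict(int)
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        # Assumed column layout: n-gram, year, count (any further columns ignored).
        ngram, year, count = row[0], int(row[1]), int(row[2])
        if ngram == target:
            totals[year] += count
    return dict(totals)


# Tiny in-memory stand-in for a downloaded file (hypothetical rows).
sample = io.StringIO(
    "a lot of\t1900\t120\n"
    "plenty of\t1900\t80\n"
    "a lot of\t2000\t300\n"
)
print(yearly_counts(sample, "a lot of"))
```

For a real file you would pass an open file object instead of the in-memory sample; because rows are processed one by one, the whole file never has to fit in memory.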

Two things that have struck me as surprising about the announcement of this resource are the reaction by some linguists and the approach taken by some of the principal researchers. In the NYTimes article, "Alan Brinkley, the former provost at Columbia and a professor of American history, said it was too early to tell what the impact of word and phrase searches would be. “I could imagine lots of interesting uses, I just don’t know enough about what they’re trying to do statistically,” he said." Admittedly this is a historian rather than a linguist talking, but the project should not be a surprise to anyone who has been involved in corpus linguistics for the last 20 years. The ever-increasing size of corpora, and access to the internet as a corpus in itself, have inspired projects such as webcorp (and see this special edition of Computational Linguistics from 2003). Also, the article reporting the project (available from Science with a free subscription) describes what n-grams are and what they do. The statistical background to n-grams is not hard to find. Also surprising is the inclusion of Steven Pinker, not just in the NYTimes story but also on the list of authors. He claims an interest in language change, but neglects to point out that access to vast amounts of language data is dramatically eating away at his many claims for an innate language. Real evidence of language use and acquisition suggests ever more strongly that language emerges as a result of constant meaningful interaction with the environment (more of that in another post).

The other surprising response is that the principal researchers make fairly exaggerated claims about culture based on the change in patterns of frequency of use that the data reveals. Clearly patterns of use will change over time (and 200 years is still only a snapshot for many words), but claiming to have invented a new subject - "culturomics" - is probably taking things too far.

Tuesday, October 26, 2010

ISFC Plenary - Christian Matthiessen

Here is the next in the series of Plenary talks from the 37th ISFC at UBC, Vancouver. The conference theme, Language Evolving, is explored in phylogenetic terms, here, by Christian Matthiessen.

Rather than me attempting to explain, let Christian do so. This is the abstract for the talk.

Language evolving: Notes towards a semiotic history of humanity

The theme of this International Systemic Functional Congress is "language evolving". This can be interpreted either very generally or more technically.

(1) Taken very generally, this could mean language changing in any of the three time-frames that have been explored in systemic functional linguistics (see e.g. Halliday & Matthiessen, 1999) - phylogenetic change (language changing in the human species, or in human societies, over a long period of time ranging from generations to history of the human species), ontogenetic change (language changing in human individuals [seen as organisms or as persons] in the course of a lifetime), or logogenetic change (language changing in the course of the unfolding of text).

(2) Taken more technically (i.e. with "evolution" in the technical sense introduced by Darwin), this means language changing phylogenetically (cf. Halliday, 1995; Matthiessen, 2004) - language evolving as part of the evolution of the human species (in biological terms) and as part of the evolution of human groups (in social terms), these two being complementary aspects of human evolution.

Here I will focus on the narrower, technical sense of "language evolving". More specifically, I will explore the "big history" of humans (cf. Christian, 2004) - a deep time view of human evolution in linguistic, or more generally in semiotic terms, starting with the emergence of the human line and moving up to the present.

This will mean combining accounts of different time frames in the evolution of language (cf. Figure 1) that have tended to be treated in isolation from one another by different groups of scholars - e.g. the evolution of modern language is explored by linguists, anthropologists, palaeontologists and neuroscientists, but the much more recent evolution of our current language families is studied by historical linguists using comparative methods and the even more recent evolution of particular languages by historical linguists using the methods of philology and (nowadays) of corpus linguistics, although studies based on texts selected to show the emergence of new registers (e.g. Halliday, 1988; Gunnarson, 1993; Nanri, 1993) are still fairly rare.

At the same time, it will also mean supplementing historical accounts that are based on social considerations (including economic factors) but which background human semo-history. A sweeping history such as Christian's (2004) account is a very important contribution but while he recognizes the significance of the emergence of language, he does not build an account of the evolution of language into his history. There have of course been linguistic histories of important periods of human evolution. One recent valuable contribution is Ostler (2005), but he focuses on "major" languages such as Latin, Ancient Greek, Sanskrit, English and Spanish - major in the sense of languages seen as important in world history (languages of "empires"), and "minor" languages are a crucial part of the picture even though (or especially because) many of them are now in danger of disappearing together with their speech communities (see e.g. Nettle & Romaine, 2002; Harrison, 2007). In other words, we have to focus not only on language growth but also on language shrinkage; both are key aspects of human history.


Figure 1: Time frames in the evolution of language and humans

In trying to work towards a holistic history of human evolution, I can obviously only make certain observations that will guide the development of a more detailed account. One key principle is that human evolution must be investigated multi-systemically, using an ordered typology of systems - physically (1st order systems), biologically (2nd order systems), socially (3rd order systems) and semiotically (4th order systems; cf. Halliday & Matthiessen, 1999). In the course of human evolution, the emergence of Homo sapiens sapiens was almost surely a key transition. Homo sapiens sapiens has been called Anatomically Modern Humans (AMHs), but I believe that they were also Linguistically (or Semiotically) Modern Humans (LMHs, or SMHs). Some time - up to 100 K years before the present - after the emergence of AMHs, we see clear evidence of the acceleration of human evolution (commonly associated with the Upper Paleolithic). This acceleration can be interpreted as a shift from lower orders of evolution to higher orders - from primarily biological evolution to primarily socio-semiotic evolution. This is obviously a matter of degree; but the point is that once AMHs / LMHs had emerged, the scene was set for social and semiotic evolution to take over as the primary levels of evolution, and socio-semiotic evolution is much faster than biological evolution, so the evolution gradually accelerated to today's dizzying pace (cf. Delsemme, 1998).

Drawing on Halliday's (e.g. 1975) account of ontogenesis, we can postulate a model of the evolution of language in three phases, as shown in Figure 1:

  • Phase I - evolution of protolanguage: this phase must have started many millions of years ago, long before the emergence of the hominid line;
  • Phase II - evolution of pre-modern language: this phase will have started with the first "burst" in the evolution of the human brain, probably around 2.2 million years ago (Homo habilis);
  • Phase III - evolution of modern language: this phase will have begun with the emergence of modern humans, Homo sapiens sapiens.

Since around 200 to 150 K years before the present, all linguistic evolution has been Phase III evolution. During this period, there is thus a huge gap between the emergence of modern language (metafunctional and multistratal in organization) and the time around 8 K to 10 K years ago when the methods of historical linguistics enable us to identify the protolanguages of the language families currently accepted in historical linguistics (thus excluding putative more ancient groupings such as Nostratic; cf. Ruhlen, 1994). Phase III evolution can be interpreted in registerial terms as an ongoing evolution of the registerial make-up of particular languages together with the ongoing evolution of the contexts in which registers operate (cf. Halliday, 1988; Rose, 2005; Nanri, 1993).


Thursday, October 14, 2010

Constructivist Foundations

I have to recommend the journal Constructivist Foundations, edited and maintained by Alex Riegler.

You will find contributions from von Glasersfeld, Varela and many others discussing (radical) constructivism, practical realism, pedagogical constructivism, and similarities and differences with other philosophical perspectives, among many other topics.

These approaches offer an alternative philosophical foundation to the Cartesian representationist approaches that underpin many cognitivist models of consciousness. A lot of the contributors have a deep interest in the role that language plays in construing our understanding of the world.

The journal is peer-reviewed and of a very high standard, but FREE! All you need to do is register with an email address to gain access to pdf versions of individual papers as well as whole issues.

Click on the cover (Vol.5 No.3) to go to the homepage, and subscribe for free to access current & archived issues and sign up for email alerts for new issues.

ISFC Plenary - Michael Halliday

The 37th International Systemic Functional Congress was hosted by Geoff Williams at The University of British Columbia, Vancouver, BC, Canada in July 2010.

Thanks to Geoff and the team at UBC, we can all see the videos of the plenary sessions, as well as download the presentation slides. (During the conference, they also set up chat rooms for remote participation in the sessions.)

This talk is by the 'father' of systemic functional linguistics (SFL) - Michael Halliday (see de Beaugrande's extensive description). Halliday focuses on the way that SFL includes, quite naturally, an evolutionary perspective.

Michael Halliday - Language Evolving: Some systemic functional reflections on the history of meaning
Talk given at 37th ISFC, UBC, Vancouver, 19 July 2010

This talk is fascinating in that it covers not only major aspects of SFL theory and application to evolutionary theory, including language viewed from ontogenetic, phylogenetic and logogenetic perspectives, but also anecdotes, asides and insights from Halliday himself.

Tuesday, October 5, 2010

ISFC Plenary - Terrence Deacon

The 37th International Systemic Functional Congress was hosted by Geoff Williams at The University of British Columbia, Vancouver, BC, Canada in July 2010.

Unfortunately I was not able to attend the conference, as I was working. However, thanks to Geoff and the team at UBC, we can all see the videos of the plenary sessions, as well as download the presentation slides. (During the conference, they also set up chat rooms for remote participation in the sessions.)

I will add the plenary videos to this blog, one by one. The first is by a non-linguist (he's an anthropologist), and so a non-systemicist. However, what Terrence Deacon had to say about the evolution of language was music to the ears of people involved in systemic functional linguistics.

Terrence Deacon - Language and complexity: Evolution inside out
Talk given at 37th ISFC, UBC, Vancouver, 20 July 2010

In this fascinating talk, Deacon develops his thesis that humans are a Symbolic Species by noting that all species are affected genetically as they become domesticated. Domestication produces genetic 'degradation' in that many of the functions previously carried out instinctively become transmitted through our social ecosystems rather than through genetic transmission - what was instinctive becomes learned. He concludes by saying that even if God had come down and given Homo erectus an innate language gene - or Universal Grammar - by now, as a result of domestication, that gene would have degraded.


Who likes to read - for fun, or pleasure? Who likes to learn a foreign language while reading - for fun or pleasure - in that language? It seems not many of us, and the numbers are dwindling. That is why the Extensive Reading Foundation has been set up.

Headed by ESL luminaries such as Richard R Day, Rob Waring, Anne Burns, Averil Coxhead, Alan Maley, Paul Nation, David R Hill and many others, "The Extensive Reading Foundation is a not-for-profit, charitable organization whose purpose is to support and promote extensive reading. One Foundation initiative is the annual Language Learner Literature Award for the best new works in English. Another is maintaining a bibliography of research on extensive reading. The Foundation is also interested in helping educational institutions set up extensive reading programs through grants that fund the purchase of books and other reading material." (from the ERF homepage)

ERF has just announced its First World Congress for September 2011, and you can follow their blog for all the latest developments in helping the world to read.

Tuesday, April 13, 2010

Construing Experience

Can you experience without language? Certainly. But can you make any sense of your experience without language? Language enables us to construe experience, to make meaning of the stream of experience that is our journey through life. Language helps to divide experience into categories which, through experience and socialisation, become recognisable to the extent that they seem natural. It must be pointed out that these categories do not exist 'out there'. The categories that we use to make meaning are those that have developed within our communities. They seem natural because they are the categories we use to make meaning in every interaction with our environment, including interactions with everyone else who is doing the same thing.
To learn more about this great book through carefully selected and categorized quotations look at Chris Cleirigh's "sys-fun" pages. Here is a very good example.
"Language is set apart, however, as the prototypical semiotic system, on a variety of different grounds: it is the only one that evolved specifically as a semiotic system; it is the one semiotic into which all others can be “translated”; and it is the one whereby the human species as a whole, and each individual member of that species, construes experience and constructs a social order. In this last respect, all other semiotic systems are derivative: they have meaning potential only by reference to models of experience, and forms of social relationship, that have already been established in language. It is this that justifies us in taking language as the prototype of systems of meaning." Halliday & Matthiessen (1999: 509-10).
That is, we can make meaning through a variety of modes (visual, including photography, graphic design and visual art; aural, including music; and others) and as a result of register variation (either through natural language - academic text, scientific jargon or air traffic controllers' instructions - or through semiotic systems derived from natural language, particularly mathematics), but only because we can mediate their meaning through language.

Wednesday, April 7, 2010

Bonobo 101 - TED Talk by SSR

In this 'trailer' for her work Sue Savage-Rumbaugh explores some of the most significant findings of the pan-species experiments that she has been involved in. She begins by debunking the belief that language is biologically 'hard-wired' into our brains, and offers compelling evidence for the importance of culture in ensuring the development and continuation of art, tool-use and language.

Savage-Rumbaugh's work on developing a human-bonobo culture that encourages both species to learn from each other has been documented in an amazing variety of sources. She has also made significant contributions to the Great Ape Trust. I found one book in particular to be a fascinating study of how man and ape may be considered as sharing a great many communicative skills.

Wednesday, March 17, 2010

Black Cat Readers

The place I work at (Khalifa University) hosted a presentation by Rob Hill, a leading author at Black Cat Publishers.

The presentation was a very 'live' version of the advice Rob gives in his useful "The Black Cat Guide to Graded Readers" available on the Black Cat Website

I am very impressed by the Black Cat catalogue of graded readers. When you look at the Black Cat website you will find a range of titles which you can preview. All of the titles that I checked included "dossier" pages which provide factual contextual background on the stories which range from Shakespeare and traditional fairy tales to modern high-tech thrillers. Young boys in particular like to read non-fiction, and so adding factual information that contextualises the stories should help to keep their interest. The Teacher's Corner also includes lots of extra material to help introduce readers into the classroom.

Sunday, March 14, 2010


It's that time of the year again. In this neck of the woods, in the UAE, March spells the return not only of the annual TESOL Arabia conference, but all the hoopla that goes with it. As a range of authors, experts and assorted academics are in town for the weekend, the world and his wife want a piece of the action.

The conference this year was just like most years, except more so. That is, disappointing, although this year the venue was the biggest culprit.

This year, however, I am glad to say that the READ Sig was launched, and seems to have generated a lot of interest. The sessions provided by Tom Le Seelleur were mainly well attended and provided a good springboard for the campaign to get the UAE reading. In addition, we have both been very busy preparing the inaugural issue of the magazine for the organisation, called "READ" (big surprise, there!!).

READ Magazine Issue 1 contains articles by Sheikha Bodour Al Qasimi, publisher and daughter of the leader of the Sharjah emirate, who describes how she reads to her children; Isobel Abdulhoul, previously head of Marudy's bookstores and now curator of the Emirates International Festival of Literature, commenting on the importance of literature; and authors Peter Viney, Philip Prowse and Caroline Brandt, along with a host of talented and committed teachers, librarians and educators who all explain how they are working to raise the profile of reading across the country.

Sunday, February 28, 2010

Interactive IPA

Hands up who loves the IPA (International Phonetic Alphabet)? Maybe not many outside the phonetics and phonology specialisations. It probably isn't my favourite part of linguistics either. All those difficult names to remember related to teeth and lips, tongue and throat.

I am very jealous of people who are learning the IPA in the computer age. No more are linguistics students required to sit alone making absurd noises and imagining they can see clearly the difference between a close-mid front and an open-mid front vowel sound. It isn't that we can't all hear the difference, it is just that it is really hard to feel it. Newcomers to the IPA can now have interactive charts with clearly pronounced sounds that show how the sounds are made and the differences between them.

I came across the first one while looking at extra pages from O'Grady et al.'s "Contemporary Linguistics". I think they took the idea from Paul Meier and Eric Armstrong. The chart offers explanations when you point the mouse at the different terms; then click on a phonetic symbol to hear how it sounds.

I found another version prepared by the good people at the University of Victoria in Canada (but I couldn't get the sounds to work).

Saturday, February 27, 2010

Software for Phonetics

Perhaps the best software available for examining speech is the PRAAT freeware developed by Paul Boersma and David Weenink at the University of Amsterdam. (Also available from here.) You can now download version 5.1.26. This programme was used extensively in Halliday and Greaves' Intonation in the Grammar of English. If that isn't a top rate recommendation, I do not know what is.

PRAAT offers
Speech analysis:
•spectral analysis (spectrograms)
•pitch analysis
•formant analysis
•intensity analysis
•jitter, shimmer, voice breaks
•excitation pattern

Speech synthesis:
•from pitch, formant, and intensity
•articulatory synthesis

Labelling and segmentation:
•label intervals and time points on multiple tiers
•use phonetic alphabet
•use sound files up to 2 gigabytes (3 hours)

Speech manipulation:
•change pitch and duration contours

Statistics:
•multidimensional scaling
•principal component analysis
•discriminant analysis

and much, much more.

What else can it do? Take a look at the screenshot:

The question should be: What can't it do?

I expect Paul and David would love to hear it if you have further challenges for them.

Wednesday, February 24, 2010

The Lowest form of Punctuation?

If sarcasm is the lowest form of wit, what does that make the SarcMark?

A bargain at just $1.99, you can make sure that your ironic and sarcastic messages in e-mails or other forms of written English do not get misunderstood. Just add the SarcMark© to the end of the sentence and your Uncle is Bob. Apparently, the correct selection of words is just not enough to ensure a giggle at the other end, and now you have to tell people when you are being sarcastic.

But, maybe I'm being a little unfair. Maybe we understand that someone is being sarcastic because of the look on their face, their tone of voice or the incongruity of the situation and what they say - all those things that are lost when writing.

Perhaps one day the SarcMark will be as frequent as the exclamation mark and we will all wish we had bought shares in it. I know this is not especially new, so you can review other people's comments in the Guardian.

Sunday, February 21, 2010

Literature, Language Teaching and Hype

I was reading a paper today - very similar to another conference paper I sat through about a year ago - which makes a wide range of claims about asking language learners to study "Literature" (the big L is intentional).

For instance, the paper quoted research that claims there is value in discussing the unusual use of language typical of much of the greatest literature because it stands in contrast to the typical patterns of non-Literature. This seems to me to be an irresponsible empirically-unverified assertion that could easily lead to wasting the time of a large number of language learners. It may be true that part of the value of Literature is its contrast to typical linguistic patterns, but this can only be understood if one is in the position of comparing Literature to the typical patterns. Clearly, a major aim of second language learning is to guide students to the typical patterns of a language. Without the typical patterns in place, how can one "appreciate" the use of atypical patterns?

Only after intensive and extensive reading has helped students reach an advanced level will a study of literature - for those that have an interest - benefit some students.

THE OTHER Linguistics Joke

Q: Two linguists were walking down the street. Which one was the specialist in contextually indicated deixis and anaphoric reference resolution strategies?

A: The other one.

[Unashamedly stolen from Geoffrey K. Pullum's blog. Unlike the excellent apology offered by Professor Pullum for posting a joke on his blog, I feel no such compunction. Unashamed, unrepentant and unapologetic.]

Wednesday, February 17, 2010

Stefanie Posavec

Stefanie Posavec provides us with one more way of visualising language - well, actually quite a few ways.
The picture you see here, which has been exhibited at a number of art galleries, is an analysis of Jack Kerouac's 'On The Road'. The "Literary Organism" represents simultaneously the basic structure of the book (as a tree structure), the length of sentences (as 'thickness' of strands), and the main themes and protagonists (as colours). And, IMHO, it looks lovely, too.
Posavec uses other methods and produces similarly striking work by representing other aspects of a literary piece. She has also worked on Darwin's "Origin of Species" and the lyrics of various albums.
You can find out much more on her home page.

Tuesday, February 16, 2010

Tag Cloud

Tag Clouds have been all the rage since the New York Times published 'Dubya's' State of the Union Speech as one, because it demonstrated visually just how paranoid a speech it was.
So, what is a Tag Cloud? Well, if I were to do one of this blog it would look like this.
This image was produced using
Here's another example:



This one was created by copying one of my articles into the website at Tagcrowd which produces a tag cloud with the top 50 items. Can you guess what the article might be about?
As you can see, the words are sized according to their frequency. The colouring can also be semanticised so in the State of the Union versions, newer words are in darker colours than older fading words.

And if that isn't enough for you the TagCloud Generator offers a 3-d dynamic all-singing and dancing version (ok, it doesn't actually sing). It also offers the chance to download and keep the swirling dynamic cloud you have made.

You can use web-based versions, and get fairly good results, or you can download your own Tag Cloud generator for use with any text from Chirag Mehta's website - look for the Download Tagline Generator link, but you will need some IT technical background to get it working.
Another good thing about this website is that Chirag shares some of the methods for creating Tag Clouds - without having to "reverse engineer" the software. Basically, a list of each unique word in a text is created and each word is counted. Then some words, like common grammatical words (the, this, is etc.) and words for organising text (thus, notwithstanding), are removed. From the frequency list, words are selected and then displayed so that font size increases with frequency. It's that easy!!
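The steps just described can be sketched in a few lines of Python. This is my own minimal illustration, not Chirag's code: the stop-word list is deliberately tiny, and the function name, the point sizes and the linear scaling are all assumptions.

```python
import re
from collections import Counter

# A very small, hypothetical stop-word list; real tools use far longer ones.
STOP_WORDS = {"the", "this", "is", "a", "an", "and", "of", "to", "in",
              "thus", "notwithstanding"}


def tag_cloud(text, top_n=50, min_pt=10, max_pt=40):
    """Return (word, font size) pairs for the most frequent content words."""
    # 1. List the words and count each unique one, skipping stop words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    # 2. Select the most frequent words.
    top = counts.most_common(top_n)
    if not top:
        return []
    # 3. Scale font size linearly between the lowest and highest frequency.
    highest, lowest = top[0][1], top[-1][1]
    span = (highest - lowest) or 1  # avoid dividing by zero when all counts match
    return [(word, round(min_pt + (count - lowest) * (max_pt - min_pt) / span))
            for word, count in top]


for word, size in tag_cloud("the cat and the cat and the dog"):
    print(f"{word}: {size}pt")
```

Turning the sizes into an actual picture (layout, colour, fading) is the harder part, which is where the web tools earn their keep.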

Visual Thesaurus

I have been a fan of Thinkmap's Visual Thesaurus for a long time (since about 2001 by my reckoning which, in software years is, like, forever, man!!).
I have used their outputs a number of times in my teaching because I think they are a fabulous combination of software know-how, visualisation and knowledge of language.
Revisiting the Visual Thesaurus homepage after a long absence (sorry, guys) I find that they have added a lot of ideas and links for teachers.
The basic deal is still the same: you can try it out for a few words, but if you want all the options - and there are more than enough to put a smile on any lexicographer's face - then you need to subscribe to a full version.
Basically, just like a thesaurus, you start with a word for which you want a replacement. The Visual Thesaurus (VT) does not just provide you with semantically related alternatives, it colour codes them for part of speech, and visually represents 'closeness of fit' by varying the distance between the search word and its alternatives. It provides definitions, will pronounce the word for you and gives the type of semantic relationship between words. From here you can find out how it works - or at least how you work it, cos they probably won't let you know their trade secrets.
The really fun part is to then just surf through the thesaurus, by clicking on a new word which becomes the central node in a new network. As you see your original word pushed to the boundary and then fade into the distance you are taken on a semantic journey of related ideas and connections. VT will even do this for you in "Autopilot" mode. Sit and watch the connections fly.
A few examples from the limited sampler will give you an idea of what they offer:

Friday, February 5, 2010

The Language of Blogs Blog

Here is a link to a blog about the language of blogs by a great linguist - Greg Myers.

The blog has developed into a new book which, if his previous work is anything to go by, will be an interesting read about how meaning is negotiated online and how traditional relations of power are challenged by new modes of publication.

I first noticed Greg Myers' work from his article in Applied Linguistics called 'The Pragmatics of Politeness in Scientific Articles', in which he discussed how scientific knowledge is transformed at least as much as a result of pragmatic, social factors as of so-called scientific factors.

Also on Greg's blog is a list of linguistics and language-related blogs. I will try to find some time to see what other linguists have to say on their blogs.

Wednesday, February 3, 2010

Go Tom Go

"Big up" to friend, colleague & nutty boy Tom.

Tom was interviewed by the local rag in their "Me, Myself, I" feature, and spent most of the interview trying to move the subject away from himself on to the issue of making more people read.
Tom has spent a lot of his time trying to initiate coherent action towards improving reading habits. This does not apply to the rates of literacy in the UAE (which have grown miraculously over the last 3 generations), but to the love of reading, which does not seem to be very prevalent in the region as a whole. Tom plans to apply the lessons learned from literacy work in the UK by organisations such as the National Literacy Trust and campaigns such as Reading Champions.

In the Reading Champions campaign, role models and heroes of young people are portrayed engaging with their favourite books and interviewed about their reading experiences. The rationale is that by seeing reading as an important part of their heroes' lives, children will be more encouraged to read, and to see reading as 'cool'.

I wish Tom success in this attempt and in his work towards setting up a SIG (Special Interest Group) with TESOL Arabia to promote good practices in teaching reading - but more of that another day.

Language Analysis Software for Free

I do not know where I would be without UAM CorpusTools (currently on version 2.4.2).

This free software allows you to create categories which you then apply to texts (or images). The results can then be searched, sorted, analysed and exported. Great for discourse analysis of any kind, but ideal for systemic functional linguistics.

Thanks to Mick O'Donnell for his hard work in constantly updating the package so that it consistently performs better and offers more functions.

Wednesday, January 27, 2010

Teachers are Cool

They are cool when they are as good as Taylor Mali.

Slam poet, stand-up comic and experienced teacher Taylor Mali puts a smile on your face and reminds us why good teachers are so important.

Here's a sample of his wit

entitled "What does a teacher make?"

And there's much more on his homepage

Tuesday, January 26, 2010

Daniel Everett: Don't Sleep, There are Snakes

What's not to like? A passionate description of an exotic culture. Wild jungle, wild animals and wild times. Stories of learning by an 'educated' white guy from 'uneducated' natives. And as if that wasn't enough - there's enough fieldwork data to severely rock Chomskian linguistics to the core.

Dan Everett spent 20 years or more living with the Pirahã on the banks of the Madeira river in the Amazonian rainforest. His mission – in more ways than one – was to learn the language of these ‘primitive’ people in order to translate the bible into their language so they could be converted – presumably into ‘civilised’ people who fear and worship God. As Dan learned more about the Pirahã, anthropology and linguistics it became clear to him why his mission was pointless. The ‘primitive’ Pirahã, it seems, have a core value which dictates a large part of their lives: they are pure empiricists. They do not believe anything that they have not witnessed themselves – or, at a stretch, what is reported by a reliable witness. So, when Dan the missionary is asked how he knows Jesus, the good book just doesn’t cut it with these people. Apparently, these primitive people do not believe it when someone tells them about what someone wrote down thousands of years ago about someone else they had never met in another language in another country. Funny that – well, funny that we should believe it. In fact, they do not believe in any mythical metaphysical explanation for why we were put on earth. They just get on with it.

So, why is that important for linguistics? Bear with me for a moment. According to the generative school of linguistics, created by Noam Chomsky over 50 years ago, language is pre-wired into human brains. Chomsky and his followers, including Pinker, have spent enormous time, energy and research dollars trying to prove exactly what it is that unites ALL human languages – if language is innate, as they claim, one would expect to find a wide range of common features across all languages past and present. Well, it seems that they have only been able to find one common feature – it’s called recursion. Recursion is generally agreed to cover aspects of language which repeat themselves inside themselves. A great example is (This is the cat that chased the mouse that ate the malt that stood in) ‘The house that Jack built’. We see here how a structure in language can repeat itself inside itself – in English this can be accomplished using relative clauses – presumably ad infinitum. We can also see how, using conjunctions for example, clauses can be added to each other, and added to each other, and added to each other, and added to each other… etc. So, let’s get back to Everett. What Everett found that has upset generative grammarians is that the Pirahã language appears to have no recursion at all. There is no embedding of ideas inside other ideas. There is no joining of sentences to make longer sentences. Each idea is separate and self-complete. This claim, and the claim that the Pirahã cultural value of empiricism affects their language as well as their culture, has been tested by other anthropologists and linguists. In general, they confirm Everett’s analysis.
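For readers who like to see the idea spelled out, here is a minimal sketch in Python of the 'structure inside itself' idea, using the Jack example above. The function name `embed` and the list of noun/verb pairs are my own illustration, not anything taken from Everett or Chomsky. Each noun phrase is built by placing another noun phrase of the same kind inside a relative clause.

```python
def embed(clauses, head="the house that Jack built"):
    """Recursively wrap the head noun phrase in relative clauses.

    `clauses` is a list of (noun, verb) pairs; the names are illustrative only.
    """
    if not clauses:
        # Base case: nothing left to embed, so return the innermost phrase.
        return head
    noun, verb = clauses[0]
    # Recursive case: this noun phrase contains another noun phrase
    # built by exactly the same rule.
    return f"the {noun} that {verb} {embed(clauses[1:], head)}"


clauses = [("cat", "chased"), ("mouse", "ate"), ("malt", "stood in")]
sentence = "This is " + embed(clauses) + "."
print(sentence)
```

Adding another pair to the list adds another level of embedding, which is the 'ad infinitum' point: the rule itself never has to change.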

Just as it has in medicine, biology and horticulture, the Amazon rainforest has provided us with an exotic species that turns our understanding on its head. In this case, an Amazonian culture and language has demanded that the inductive-driven linguistics of the late 20th Century rethink its fundamental principles. If recursion is the only factor common to all human languages, and recursion is absent from even one language, then there is no common factor and so the assumption that humans come ‘pre-loaded’ with language can no longer stand the weight of evidence.

There is one thing that I do not like about the stories that Everett provides. It seems tragic that anyone wanting to become a missionary, even in the 20th-21st centuries, is provided with all the necessary resources to live with an indigenous group of people. Meanwhile, anthropologists and linguistic anthropologists can barely find the money to study language groups in their own countries. The happy ending for us (but not for Dan, who is now divorced from his missionary wife) is that Everett makes the transition from missionary to linguist and anthropologist as a consequence of his encounters with the Pirahã.

Thanks to Phyllis & Alistair Burns who gave me the book for Christmas.

Noam Chomsky & His Dog Predicate

This is a great comic strip based on the fact that Noam Chomsky is probably the most well-known 'dissident' US academic (he actually disagrees with the government and isn't afraid to say so) through the publication of books such as 'Manufacturing Consent' & 'Hegemony or Survival'*. There is almost no mention of linguistics in the comic (the one below being a notable exception).

The link takes you to the full list of Chomsky's adventures. Hope you enjoy them. I hope Noam enjoys them too.

*I've always thought it is no coincidence that the father of generative linguistics makes no reference at all to linguistics in his version of critical discourse analysis. I am forced to conclude that generative linguistics is unable to explain exactly the language that Chomsky identifies that is used by the media and the US government to manipulate power relations. Meanwhile, functional schools of linguistics have made great progress in identifying exactly how language works to perpetuate unequal power relations.

Wednesday, January 20, 2010

Real Phrasal Verbs


The following extract from the local rag offers a range of phrasal verbs in context. There's such a range of verb and particle here it is almost impossible to classify each one.

Let's start with 'make up' on its own:
Make up: K by Beverley Knight

I was into make-up at such a young age. Mum never wore much make-up so I think I made up for it.

Make-up, a noun probably derived from the phrasal verb make up, is used here in the same breath as make up for something.
The individual meanings of these two phrasal verbs are very different, but they each offer a chance for us to try and understand the use(s) of the particle "up".
Kiss and make up

As if that wasn't enough, Beverley Knight (for it is she) reportedly continues:

From about 12 or 13 I wanted to have a full face of make-up on and it never went down too well with her. I just found I had a knack for applying it and I ended up the girl who did everyone's make-up before we went out.

We have, in context and in order, the following verbs and particles:
have something on
go down well with someone
end up
go out

Oh, that would go down well with some tortilla chips
As in all cases, no matter how you may wish to categorise them, the key is to see what meaning the particle contributes, before looking at how that meaning is modified by the verb.

Acknowledgment: K. Crane. 2010. A Knight in Beverley Hills. Tabloid, Gulf News, Jan 10, 2010, p.9

Tuesday, January 19, 2010

THE Linguistics Joke

From a linguistics lecture:

Lecturer: In general, when it comes to negation, languages, or more precisely dialects and registers of languages, tend to fall into two groups. On the one hand we find those languages which employ more than one negative to emphasise negativity. And on the other hand we find those languages, dialects and registers that follow the rules of logic so that a second negative cancels out the first. So we have languages where two positives make a positive, two negatives make a negative, and languages where two negatives make a positive. But we have never come across a language where two positives make a negative. Ha, ha, ha.
From the back of the room comes a voice: Yeah, right.

Monday, January 18, 2010

Real Passive

It's difficult to find real examples of very low frequency passive forms - by definition. So here's one I heard recently that was produced perfectly naturally, in context, in a Sky News report:

it was highlighted to police that in this house dangerous dogs had been being bred.

Sky News 30 Nov 09

Say it aloud, as a news reporter, to get the right sound.
The corpus linguist's best friend, Google, offers 5 more examples, using bred, and the WebCorp Web Concordancer and Collocation extractor really only offers "for" to the right as a significant collocation.
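If you have your own text files rather than a web search to hand, a rough pattern search will turn up the same form. The sketch below is only an illustration: the function name and the regular expression are my own, and it simply assumes that the word following "been being" is the participle, so expect some false hits on real data.

```python
import re

# "had/has/have been being" followed by one word, which we ASSUME is the participle.
PATTERN = re.compile(r"\b(ha[ds]|have)\s+been\s+being\s+(\w+)", re.IGNORECASE)


def find_perfect_progressive_passives(text):
    """Return (auxiliary, following word) pairs for each match in the text."""
    return [(m.group(1).lower(), m.group(2).lower()) for m in PATTERN.finditer(text)]


example = ("it was highlighted to police that in this house "
           "dangerous dogs had been being bred.")
print(find_perfect_progressive_passives(example))
```

Run over a large enough collection of texts, the list of following words gives a quick feel for which verbs turn up in this rare form.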

Saturday, January 16, 2010


What is the definition of a drill?

A device for boring

Monday, January 11, 2010


Moodle is an open-source application designed to facilitate online learning. You can download everything you need in one go from the main Moodle site, but it's always better if you can get some technical assistance so you do not have to make your own computer double up as the server for your site.

Originally based on WebCT, Moodle has been further developed by the online community. A wide range of tools are available for use in education in 45,000+ sites in 209 countries.
Moodle LMS (Learning Management System) offers tools for: Delivery, Reporting, Communications, Collaboration & Assessment. (All SCORM compatible - see this guide).
Moodle 2.0 due out soon.
Roles: Administrator, Teacher, Student, Guest, Parent with different permissions.
Reports on classes, students (e.g. for activities carried out), activities (e.g. who has completed) & others.

Within a few hours I was able to set up a course with a wide range of resources, but I have tried it out before and I used to use WebCT a lot. I look forward to learning more about the productivity tools on offer.

Thanks to for today's presentation.

Sunday, January 10, 2010

Dave Willis: Rules, Patterns and Words

Dave Willis (2003) Rules, Patterns and Words - Grammar and Lexis in English Language Teaching (Cambridge University Press)

In case you missed it, the last 20 years have seen a debate raging about the relative value of grammar and vocabulary in language teaching. Some say you can make sense with the right vocabulary and no grammar in a foreign language better than you can make sense with the right grammar and no vocabulary. To assume that you can have one without the other is, of course, nonsense, but to assume that patterns can only be found in grammar is also denying learners the chance to start creating their own meanings.

What we need is someone who can keep the new clean baby, whilst throwing out the dirty old bathwater. If an over-emphasis on verb-based grammar (derived from 40-year-old morphology studies) is the bathwater and the generative power of lexical patterns is the baby, then Dave Willis is our man. Like any good teacher, Willis is not really interested in the debate itself. What he wants to know is: How will it help us as language teachers, and how will it help language learners?

So what right does Dave Willis have to tell us language teachers that we need to look at language more from the lexical side of things than we have in the past? Dave Willis was a language teacher for many years, he worked with the COBUILD team from very early on, had access to most of the new ideas that were being produced from that research and then produced (with Jane Willis) the first language course to be based on a lexical rather than a grammatical description - the COBUILD English Language Course (which, incidentally, was also the first to use a task-based methodology throughout). Pretty good credentials, if you ask me.

But does this book rest on what was? No. This is an entirely new approach - a new thesis that moves us beyond Willis' 'The Lexical Syllabus' (click to access downloadable version) and into a much more classroom-friendly approach, but does not ignore the traditional calls for some grammar teaching. The book not only offers a fresh description of language for language teachers, it is also full of sensible ways to pass on these ideas to students and help them with the task of language learning.

It is extremely rare for me to read an EFL book from cover to cover, but I could not put this one down. It reads well, and should ring true to most language teachers. If you want to find out how best to help language learners deal with grammar and lexis, look no further than this book.

You can learn more about Dave (and Jane) Willis at

Saturday, January 9, 2010

Gerald Edelman: Wider than the Sky

G. Edelman (2004) Wider than the Sky - The Phenomenal Gift of Consciousness (Yale University Press)
Google Books

This is probably one of the most important books that I have read in the last few years. It is an easy read and very thought-provoking. Highly recommended.

This is a synopsis and some notes and quotes from 'Wider than the Sky'.
Edelman's work on both researching and describing neuroanatomy has significantly changed the way we see how the brain works. It is not too difficult to follow and should be enough to rock subjects like psychology to the core as they seem happy to proceed on the delusion that there is some kind of metaphysical (i.e. non-physical) mind that bears no resemblance to the brain. With people such as Edelman and Maturana and Varela on the case, metaphysical approaches to the mind should soon be a thing of the past (wishful thinking!!)

Re-entry within the dynamic core of the brain allows for primary consciousness: mediation of value-category memory (originating in bodily experiences, and thru re-entry can be re-enacted with or without motor function at any time) and perceptual categorisation (the here and now of sorting perception into different objects).
Higher-Order consciousness = re-entrant circuits mediating between primary consciousness and semantic capability. Symbolic nature of semantic dissociation between symbol and meaning combined with the flexibility of manipulating these symbols thru syntax releases the consciousness from the “remembered present” and thru these re-entrant circuits enables remembered past, imagined past and future, and planned future.

“although the conscious process involves representation, the neural substrate of consciousness is non-representational” (104)

“mental images arise in a primary-consciousness scene largely by the same neural processes by which direct perceptual images arise. One relies on memory, the other on signals from without.” (105)
[it is thru re-entry that these processes are so similar]

This view rejects the notion of computation and the idea that there is a “language of thought.” Meaning is not identical to mental representation. Instead it arises as a result of the play between value systems, varying environmental cues, learning, and non-representational memory. (105)
[also Thibault Jnl of Prag.]

“…much of cognitive psychology is ill-founded. There are no functional states that can be uniquely equated with defined or coded computational states in individual brains and no process that can be equated with the execution of algorithms. Instead, there is an enormously rich set of selectional repertoires of neuronal groups whose degenerate responses can, by selection, accommodate the open-ended richness of environmental input, individual history and individual variation. Intentionality and will, in this view, both depend on local contexts in the environment, the body and the brain, but they can selectively arise only through such interactions and not as precisely defined computations.” (111)
[embodied and grounded!!]

Constructivist brain:
“Filling in of the blind spot, the phenomena of apparent motion, and gestalt phenomena can all be explained in terms of temporal synchrony in re-entrant circuits. The same is true of time, of succession and of duration. The re-entrant brain combines concepts and percepts with memory and new input to make a coherent picture at all costs.” (124)
e.g. saccades: eye movements are erratic, with the eye ‘jumping’ to a new point of focus, often as a result of peripheral vision, and then resting. Our experience of vision, however, is one of a smooth transition from one scene to the next.

“Given the continual sensorimotor signals arising from the body, subjectivity is a baseline event that is never extinguished in the normal life of conscious individuals. But there is no need for an inner observer or a “central I” – in James’s words, “the thoughts themselves are the thinker”.” (134)

Higher order consciousness may be considered as a trade-off of absolute precision for rich imaginative possibilities. (135)

The pervasive presence of degeneracy in biological systems is particularly noticeable in neural systems, and it exists to a high degree in the reentrant selective circuits of the conscious brain. In certain circumstances, natural languages gain as much strength from ambiguity as they do under other circumstances through the power of logical definition. Association and metaphor are powerful accompaniments of (135) conscious experience even at early stages, and they flower with linguistic experience. (136)

…the study of consciousness must recognize the first-person, or subjective, point of view. (140)

Consciousness is a property of neural processes and cannot itself act causally in the world. (141)

Whether in the dreams of REM sleep, or in imagery, or even in perceptual categorization, a variety of sensory, motor, and higher-order conceptual processes are constantly in play… in visual imagery, the same reentrant circuits used in direct perception are reengaged but without the more precise constraints of signals from without. In REM sleep, the brain truly speaks to itself in a special conscious state – one constrained neither by outside sensory input nor by the tasks of motor output. (144)

When you speak, my brain speaks your speech for me

Okay, so what is a functional linguist doing reading Current Biology?
Not a lot, of course, but you find inspiration in the strangest of places. In a recent article, "The Motor Somatotopy of Speech Perception", D’Ausilio, Pulvermüller, Salmas, Bufalari, Begliomini, and Fadiga point out that our concept of perception of speech sounds - often considered a fairly passive process of understanding the inputs we all receive from the environment - needs revising. According to their very careful research, when we listen to someone speaking to us, we do much more than try to work out what they are saying. It seems that we are very active in the process. This may not come as a surprise to a lot of people - especially language teachers - but just how active has perhaps not been realised before. (Some people of course are not particularly pleased to hear these results, though). What D'Ausilio and colleagues have discovered is that, more than likely, as we listen, our brains implement the motor programs that are required to produce the speech we are listening to. That is, to put it simply, as you speak my brain is doing everything except physically articulate the words you are speaking. The time saved by not actually articulating and physically producing the sounds is roughly equivalent to the time necessary to comprehend speech. This suggests that a major part of the listening skill is speaking. Although this development is based on a very cognitive and neurological part of the comprehension process, it is similar to theories of understanding developed by the 'mirror neuron' theorists, such as Arbib and Rizzolatti. That is, the process of communication is not a simple one of Sender-Medium-Receiver. We are constantly in the process of making meaning, as we listen, as we speak, as we write and as we read. Listeners and readers do not play the role of trying to make their thoughts match those of the speaker or writer. 
They make their own meanings that, because of the conventions of the socially accepted rules of a language, bear resemblances to the meanings that the language producer was attempting to make.

The full article is available to subscribers at


You can lead a whore to culture, but you can't make her think.

Dorothy Parker

He who laughs last

...laughs lastest

(Thanks to Ritchie Stevenson)

Say it out loud and it gets better!!

Friday, January 8, 2010

Top Tech Tip

I have recently been converted to Google Wave.

I am a beta user, and have been very impressed.
If you have used Microsoft's OneNote, you will be familiar with multi-media documents that contain text, pictures, video, other documents, sound clips etc. Google Wave is like that, with drag-and-drop ease, except it is online, so you can access it from anywhere. And not just you. Wave is designed to be a multi-author tool. When you start a Wave you can invite others to join in (and they can invite others, too). So it is ideal for collaboration, either synchronously or asynchronously. Seriously, don't knock it til you've tried it. Just like when gmail started, though, you need to get invited to start a Wave. Or you can be joined on a wave, and then you're in.
As with most technological innovations, there are a range of pedagogical applications that can be imagined. Clearly, there are dangers of plagiarism. But you only need to be concerned about this if the individual is the only concept that your pedagogical system can tolerate. Google Wave was made for group work. (It should also provide a major boost for research groups - particularly as many of these are geographically dispersed these days.) When the product is the result of collaboration, the Wave offers ease and power. Try it.
Ride the Wave!

There's a lot more info here - a 1 hour-plus video
and here - a much shorter video.

Thursday, January 7, 2010

Go for the jugular

Start as you mean to go on...
The first topic of discussion for this blog about LLL starts with the sentence.
The sentence was mainly created in the minds of language theorists for the benefit of teachers and linguists.
To be more specific, the sentence can only operate in the context of modern writing, especially writing with punctuation marks. Sentences are literally defined by punctuation: full stops and commas. However, we never speak full stops or commas. That means that before we wrote language down, before we used punctuation, we never had full stops or commas because we never say them. This means we did not have the concept of a sentence before we wrote language down.

Of course, it may be that we did not discover sentences until later, but when we look at how we speak and how we write we find that language is structured quite differently. Write down a natural conversation and try to work out where to put the full stops. In a lot of cases, it is almost impossible to decide what a sentence is, in the traditional sense in which we are taught to write. Some linguists like to imagine that the written form of the language is the true or 'pure' form and that spoken language is generally a degenerate form. However, the opposite must be true as (phylogenetically) all human cultures have developed some form of spoken language, but not all have written forms, and not all forms of written language construct sentences defined by punctuation. Further, (ontogenetically) we all speak before we write - and not all of us manage to write.
In short, linguists are obsessed with the written form - see Per Linell's book "The Written Language Bias in Linguistics" for more detail.
If we look at the history of the written form of the English language (and most western European scripts) we can trace the beginnings of punctuation to a corresponding shift to silent reading.
That is, punctuation is designed to help readers so that they can read silently rather than having to read aloud. In earlier centuries, western European written scripts were a string of continuous letters, known from antiquity as scriptura continua.
The only way that people knew how to read at that time was aloud. Our image of "murmuring monks" is derived from the fact that, without voicing the script, the scribes in ancient monasteries could not make any sense of what they were copying. Paul Saenger has looked at this in great detail in his book published by Stanford University Press.
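To get a feel for what reading such a script is like, here is a small sketch in Python that turns a modern sentence into an unbroken string of letters. The function name is my own, and rendering everything in capitals is simply an assumption made for effect; it is not a claim about how any particular manuscript looked.

```python
def to_scriptura_continua(text):
    """Drop spaces and punctuation, keeping only letters and digits.

    Uppercasing is an illustrative choice, not a historical reconstruction.
    """
    return "".join(ch for ch in text if ch.isalnum()).upper()


print(to_scriptura_continua("We never speak full stops or commas."))
```

Try reading the result silently, then aloud, and see which is easier.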
To be continued...