2.1.1. Learning objectives of the module:
This module provides an introduction to corpus linguistics for LSP teaching. It offers an overview of what corpora are and to what extent they can be a reliable source of naturally occurring language in different specialised discourse communities (academic and professional). You will become familiar with a variety of corpus linguistics tools (corpus building tools, concordancers) and learn how to perform corpus data mining. You will also consider what to do with the results of corpus data mining and, more generally, the benefits of corpus linguistics in LSP teaching. The module also provides opportunities for you to build your own specialised corpus and to reflect on its use in lesson planning and classroom practice.
At the end of this module, you will
2.1.2. Activities to complete in order to validate the module:
There are three possible levels of engagement in the course, and therefore three different validation levels with specific requirements:
2.1.3. Introductory activity
Write a post (in the “comment” box below) and/or comment on someone else’s post to share your thoughts on this quote:
“[W]e are, all of us, corpus users, because we use the internet.”
(McCarthy 2008: 566)
To add to my (contrarian) comment, I suppose it is the generalization of the search function that makes using the internet more akin to using a corpus.
I might sound like a contrarian, but in my opinion all readers of written texts are corpus users, and I am not quite sure how using the internet might make us more so. I suppose I need to read the article this quote is from to understand the context.
To me, the quote at hand raises the question of whether the medium is a transmitter of content or a body of content itself. Linguists (Daniel Bougnoux, Edgar Morin) traditionally differentiate between medium and content, but the internet tends to blur the lines because it has properties that other media don’t:
– it’s immersive and overwhelming (the experience of browsing, writing or exchanging online leaves the user totally disconnected from reality, unlike the radio or even TV, which can be listened to or watched while doing something else like cooking, running, working…);
– to some extent it constitutes a fully-fledged realm that exists independently from reality (unlike at the food market or a family dinner, on the internet discourses, which include words but also all manner of other media, are not only exchanged but also recorded, so the internet is a place of authority, in the sense that a user can become an author, even unwillingly, and it’s a place that is ever-expanding, to the point that it is taking over reality in terms of screen time versus non-screen time in many people’s lives, scary!).
To me, using the internet doesn’t turn a person into a corpus user any more than using a sewing machine does: for the former you have to get used to specific vocabulary and procedures, and for the latter too (if you take the time to read the instruction manual!). What’s more, you can probably get the sewing machine manual on the internet, and you can also get printed manuals on how to use the internet in bookshops. The internet is a medium that may include corpora, but that doesn’t make it a corpus, just a medium.
I believe the quote refers to the internet as a body of content, so we are all “corpus users”.
Absolutely! The internet has become an integral part of our lives, and by interacting with it, whether through search engines, social media, or browsing websites, we are constantly contributing to the vast corpus of digital information. Each click, search query, or post adds to this ever-expanding repository of data, shaping and reflecting the collective knowledge and culture of our digital society.
I may be alone here but I don’t like this quote – as arguably most internet time is either roaming time or passively receiving information pushed to us through social media, or even following ‘given’ links and relying on the ‘objective’ search powers of Google – we may fall into the trap of believing that this language, as part of an ‘independent internet’, is truly representative. I think we would need to identify how individuals use the internet before we can make such a claim. We could say a similar thing about the mainstream media – when we have too much choice, we really have no choice – when we are overwhelmed with information it is easier to find a ‘tool’ that feeds us the information we want, but I think it’s quite dangerous to suggest this could be ‘objective’. Having said that, this argument could arguably be applied to ‘real’ corpora, in that someone (with their own ideas and viewpoints) has compiled such corpora according to their own culture / belief system – but if we approach corpora with an awareness of their limitations (date of material, sources…) we can reduce some of the potential for bias.
To me, this quote suggests that the Internet may be, in itself, a broad corpus of texts that each of us uses and engages with when we search, share, comment, etc. Therefore, we are, in a way, already familiar with corpora, although not fully aware of it.
That’s exactly it, Isabelle! 😉
After what I saw and read in this section, and also as an ICT enthusiast, I am looking forward to learning more about creating a corpus, especially in this age of AI!
AI also came to mind when reading this quote, as AI models are trained on a corpus… In this case, the larger and the more diverse the corpus, the better the AI model.
I like this quote because it shows that building and using a corpus is not something only linguists know how to achieve.
However, I think that the Internet is too broad to be the only tool used in a classroom. Being trained in corpus-driven approaches is more than useful for a teacher, especially one who wants to focus on authentic content.
Yes, good points, Marion! 😉
This perspective highlights the collective nature of the internet. We all shape and are shaped by the information it contains.
It raises questions about privacy and ownership of data within this ever-expanding corpus, prompting us to consider the potential biases and limitations inherent in a corpus built on user-generated content.
We are all living people, exchanging ideas with others, using the internet daily. We can’t refuse it any longer.