
Using Corpus Analysis Software to Analyse Specialized Texts







1. What is a corpus?
In corpus linguistics, a corpus can be generally defined as ‘a collection of naturally-occurring texts in a computer-readable format which can be retrieved and analyzed using corpus analysis software’ (Kennedy, 1998; McEnery & Wilson, 2001; O'Keeffe, McCarthy, & Carter, 2007; Teubert & Cermakova, 2007).

2. Sources of language corpora
- ParaConc (http://www.athel.com/para.html)

3. Designing a specialized corpus
Corpus size
-          There are no fixed rules; the size depends on the research purposes, the availability of data, and the time available.
-          Large, general corpora may be less useful than small, focused corpora if searches are made on context-specific terms.
-          A corpus that is too small has limitations, e.g. too few occurrences of the concepts, terms, or patterns under investigation.
-          It is preferable to create a monitor (open) corpus because specialized words and usage are dynamic.
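The point about corpus size can be checked in practice by counting how many tokens a corpus holds and how often a target term occurs. The sketch below is only an illustration: the tokenizing rule, the function names, and the two sample sentences are assumptions made for this example and do not come from any particular corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())


def corpus_profile(texts):
    """Return the token count, type count, and word frequencies for a list of texts."""
    freq = Counter()
    for text in texts:
        freq.update(tokenize(text))
    return {"tokens": sum(freq.values()), "types": len(freq), "freq": freq}


# Hypothetical two-text mini corpus, used only to show the output shape.
sample_texts = [
    "The enzyme binds the substrate. The substrate is then cleaved.",
    "Each enzyme has an active site.",
]
profile = corpus_profile(sample_texts)
print("tokens:", profile["tokens"])
print("types:", profile["types"])
print("occurrences of 'enzyme':", profile["freq"]["enzyme"])
```

If the term under investigation appears only a handful of times, the corpus is probably too small for that research question.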
Text extracts vs. full texts
-          Depends on the aim of corpus compilation.
-          Whole texts offer more coverage because the words or terms of interest may be distributed anywhere in the text.
-          Specific sections may be helpful if we are looking for words or phrases in particular content areas or want to create purposeful sub-corpora.
Number of texts
-          A choice can be made between collecting a few large texts and collecting many smaller texts.
-          A choice can also be made between selecting texts written by one or two key writers or sources and selecting texts retrieved from different sources or written by different authors.
-          This depends on your research focus, e.g. studying overall language use versus studying the idiosyncrasies or linguistic choices of particular writers.
Medium
-          Can be spoken or written texts or mixed.
-          Depends on research questions.
-          Some practical factors should also be considered, e.g. compiling spoken corpora can be time-consuming and needs special types of tagging.
Subject and text type
-          Should mainly focus on the specialized text under investigation, although this is less clear-cut in multidisciplinary subjects.
-          Texts may come from different subjects if the research focus is on the study of particular language features rather than term extraction.
-          Text types within a specialized subject field may vary from expert-to-expert texts to expert-to-non-expert texts, or in other words, from technical to popular texts.
Other considerations
-          Authorship: Texts written by experts in a field tend to present more reliable and authentic examples of specialized language.
-          Language: Specialized texts can be stored and retrieved in the form of monolingual, comparable, or parallel corpora.
-          Publication date: Texts should come from recent publications unless queries are made in relation to particular periods of time.
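The design considerations above are easier to apply if each text is stored with a small metadata record. The following is a minimal sketch under that assumption: the `TextRecord` fields, the `select_texts` function, and the sample entries are all hypothetical and only illustrate filtering by authorship, language, and publication date.

```python
from dataclasses import dataclass


@dataclass
class TextRecord:
    """Hypothetical metadata kept for each text in the corpus."""
    title: str
    author_is_expert: bool
    language: str
    year: int


def select_texts(records, language, min_year, experts_only=True):
    """Keep texts in the given language, published in or after min_year."""
    return [
        r for r in records
        if r.language == language
        and r.year >= min_year
        and (r.author_is_expert or not experts_only)
    ]


# Invented sample records, for illustration only.
records = [
    TextRecord("Journal article A", True, "en", 2016),
    TextRecord("Popular science blog B", False, "en", 2017),
    TextRecord("Textbook chapter C", True, "en", 1998),
    TextRecord("Journal article D", True, "th", 2015),
]
recent_expert_english = select_texts(records, language="en", min_year=2010)
print([r.title for r in recent_expert_english])
```

Setting `experts_only=False` would also keep expert-to-non-expert (popular) texts, which suits a study that compares technical and popular writing.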

4. Sources of specialized texts
-          Printed materials
-          Word documents
-          CD-ROMs
-          Texts on the Web
-          Online databases

5. Getting started with AntConc
Download the latest version of AntConc and watch the YouTube tutorials at http://www.antlab.sci.waseda.ac.jp/antconc_index.html



               1. Run the program.
               2. Use Open Files (to browse and select the target files) or Open Dir (to select a target folder).
               3. Choose the function.
               4. Use Clear All Tools and Files before opening new files.
               5. Use Save Output to Text File to save output, e.g. concordance lines.
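To show what concordance lines look like, here is a minimal keyword-in-context (KWIC) sketch in Python. It is not how AntConc works internally; the `kwic` function, its `width` parameter, and the sample sentence are assumptions made for this example.

```python
import re


def kwic(text, keyword, width=30):
    """Return keyword-in-context lines: left context, [keyword], right context."""
    text = " ".join(text.split())  # collapse line breaks and repeated spaces
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Pad both contexts so the keyword lines up in one column.
        lines.append(f"{left:>{width}} [{match.group()}] {right:<{width}}")
    return lines


# Hypothetical sample text, for illustration only.
sample = "A corpus is a collection of texts. The corpus can be searched with software."
lines = kwic(sample, "corpus", width=20)
for line in lines:
    print(line)
```

Each printed line centres one occurrence of the search word, which is the same layout a concordancer shows and saves to a text file.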


