Slovak Terminological Activities Present and Future

Slovak Terminological Activities – Present and Future

For more than 17 years, the Slovak language has seen a spontaneous and uncontrolled accumulation of terms referring to new concepts, driven by sweeping political and economic changes and by the arrival of information technologies. The result has been the coinage of excessive and often unnecessary terminological variants of both domestic and foreign provenance.

In spite of a rich history of terminological activities, Slovak society has faced a double vacuum in its post-war history: on the one hand, a lack of analysis and development of foreign and domestic terminological theories, technologies and methodologies, and on the other, a lack of studies comparing foreign and Slovak terminological systems. Moreover, the new political and economic situation has caused a massive brain drain in the academic and scientific sphere, so Slovakia lacks skilled full-time terminologists.

At the same time, there has been external demand for consistent and unified Slovak terminologies for drafting and translating European legislation into Slovak, as the existing terminologies proved unsatisfactory because of the growing number of polysemes, synonyms and variants.

Although terminological activity in Slovakia has suffered its greatest setback of the last 50 years, it has not died out. Small-scale quality terminology work has been carried out by teams and individuals and has appeared in the form of theses, glossaries and specialised dictionaries, which have, however, received only modest feedback.

Some institutions, aware of the urgent need, have started scarce but key terminological activities, namely setting up terminology databases, e.g. the National Bank of Slovakia and the Slovak Institute of Technical Normalisation. These, however, are domain-limited databases with specific, very narrowly defined aims: the former is an in-house tool for employees of the bank, while the latter is used by the creators and translators of technical standards. In fact, neither term bank provides any access for the lay public.

The current state of affairs in Slovakia does not contribute to coherent and intelligible science, either for specialists, producers and lawmakers or for teachers, translators and interpreters. It is therefore clear that the challenges Slovakia faces in terminology work are the following: centralisation, coordination, communication and consensus.

Importance of a Corpus in the Context of Terminological Activities

The Ľudovít Štúr Institute of Linguistics, Slovak Academy of Sciences, considers terminology monitoring, coordination, analysis of special vocabularies and unification of the pertinent results in the form of glossaries, dictionaries, terminological standards or terminology databases to be a priority.

Centralisation of various terminologies under one administration, continuous modification and updating of term records, and close collaboration of terminological boards, translators and specialists are nowadays regarded as the only way to achieve terminological harmonisation and, consequently, standardisation. Contemporary terminological tendencies stress the text and corpus approach as a sine qua non of every terminological project. The systematic gathering of terms is based exclusively on representative corpora and is supervised and validated by specialists and terminologists. As Sager (1990:131) puts it, information extracted from a text represents a reliable indicator of changes and ensures the only plausible data for building and revising terminological records.

It was thus a logical decision for the terminology database project to be started by the Corpus Department of the Institute, which has all the necessary resources and tools at its disposal: the textual base of the Slovak National Corpus (SNC) itself, software for automatic annotation of Slovak texts, and the know-how for developing new tools, all of which are essential for the automatic extraction of terminological data.

Project of the Slovak National Terminology Database

The SNC term bank project aims to set up a monolingual database provided with both conceptual and linguistic information, inspired by foreign examples, mostly Canadian ones. The team expects to cooperate and exchange data with the leading European database IATE, which is why the EUROVOC 4.2 Thesaurus was chosen as the classification system.

The Slovak Terminology Database Project dates back to the autumn of 2005, when the SNC team began analysing existing terminology banks and subsequently proceeded to design the term record layout, based partly on a survey of translators' needs. Along the way, the team chose and adapted the WikiWikiWeb editing system for the purposes of the database. The SNC text acquisition policy also had to be modified: the focus shifted towards economic and legal texts in order to create specialised subcorpora and to enable automatic extraction of terms and other terminological data from them.

As already mentioned, the project methodology has adopted a textual approach to terminology: lexical units that are potential terminological units are extracted from running specialised texts, and the concepts they refer to are then identified.

As far as the term record design is concerned, the team drew inspiration especially from ISO 10241:1992, International terminology standards – Preparation and layout. The resulting term record comprises 11 data categories, 7 of which are obligatory. In order to satisfy the needs of professionals, the lay public and, last but not least, translators and interpreters, the obligatory categories include definition, domain, context, related terms and the sources of both definition and context. The remaining 4 optional fields of the term record are synonym, foreign language equivalent, comment and links to relatively reliable web pages.
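The record layout described above can be sketched as a simple data structure. This is only an illustrative sketch, not the project's actual schema: the field names are invented for the example, and since the text names six obligatory categories explicitly, the head term itself is assumed here to be the seventh.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class TermRecord:
    # Obligatory categories (7). The head term is ASSUMED to be the seventh;
    # the other six are the ones listed in the text.
    term: str
    definition: str
    definition_source: str
    domain: str
    context: str
    context_source: str
    related_terms: List[str]
    # Optional categories (4), as listed in the text.
    synonyms: List[str] = field(default_factory=list)
    foreign_equivalents: List[str] = field(default_factory=list)
    comment: Optional[str] = None
    links: List[str] = field(default_factory=list)


# Hypothetical helper names; they are not part of any existing tool.
OBLIGATORY = (
    "term", "definition", "definition_source", "domain",
    "context", "context_source", "related_terms",
)


def missing_obligatory(record: TermRecord) -> List[str]:
    """Return the names of obligatory fields that are still empty."""
    return [name for name in OBLIGATORY if not getattr(record, name)]


if __name__ == "__main__":
    # Placeholder values only; this is not a real record from the term bank.
    record = TermRecord(
        term="example term",
        definition="placeholder definition",
        definition_source="",
        domain="Corpus Linguistics",
        context="",
        context_source="",
        related_terms=[],
    )
    print(len(fields(TermRecord)))     # number of data categories
    print(missing_obligatory(record))  # obligatory fields not yet filled in
```

A check of this kind would make it possible to keep partly completed records in the database while still reporting which obligatory fields remain to be filled in.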

Table 1. Model term record (concrete)

Changes in the project strategy

Nevertheless, for various reasons the project started with a less ambitious form of terminological work. Instead of creating new records by filling in the obligatory fields mentioned above, the emphasis shifted towards reusing and adapting existing quality terminology resources, in particular those published in the journal Kultúra slova, for which the team received a copyright licence for non-commercial use, as well as some resources elaborated by our close collaborators.

At present, the database, available online in its pilot version since May 2007, offers almost 3000 terminological records covering more than 20 domains (e.g. Astronomy, Security and Law, Migration Policy, Construction, Corpus Linguistics, Phraseology, Phonetics and Phonology, Bilingualism, Civil Security, Historical Linguistics, Fire Protection, etc.). The records are only partly completed: they contain the term and usually the definition and source, less often a synonym, and sometimes related terms and a comment.

Reusing terminology collections designed and compiled for different aims and users raised a number of issues, since different editorial practices had to be harmonised to fit one common form of the term record.

The terminology collections entered into the term bank differ greatly and have to be harmonised on two levels.

On the formal level, proofreading revealed that term records most frequently present four groups of flaws: incorrect Slovak, such as substandard words, orthographical errors and syntax mistakes; abbreviations accompanying head terms; excessive punctuation and brackets within the definition (which should be one sentence with no initial capital letter and no full stop); and, last but not least, the formal treatment of polysemous terms. Where a term is polysemous within the same domain, every concept it refers to is treated in a separate term record, distinguished by an Arabic numeral following the head term.
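Some of these formal checks lend themselves to automation. The following is a minimal sketch under stated assumptions, not the project's actual tooling: the function names are hypothetical, the sentence-count test is a rough heuristic that will misfire on abbreviations, and the capital-letter test would also flag definitions that legitimately begin with a proper noun.

```python
import re
from collections import defaultdict
from typing import List, Tuple


def definition_format_issues(definition: str) -> List[str]:
    """Flag departures from the expected form of a definition:
    one sentence, no initial capital letter, no full stop, no brackets."""
    text = definition.strip()
    if not text:
        return ["empty definition"]
    issues = []
    if text[0].isupper():
        issues.append("starts with a capital letter")
    if text.endswith("."):
        issues.append("ends with a full stop")
    if re.search(r"[()\[\]]", text):
        issues.append("contains brackets")
    # Rough heuristic: sentence-final punctuation followed by more text.
    if re.search(r"[.!?]\s+\S", text):
        issues.append("looks like more than one sentence")
    return issues


def number_polysemous(
    records: List[Tuple[str, str, str]]
) -> List[Tuple[str, str, str]]:
    """Given (head term, domain, definition) triples, append an Arabic
    numeral to head terms that occur more than once in the same domain."""
    counts = defaultdict(int)
    for term, domain, _ in records:
        counts[(term, domain)] += 1
    seen = defaultdict(int)
    numbered = []
    for term, domain, definition in records:
        key = (term, domain)
        if counts[key] > 1:
            seen[key] += 1
            label = f"{term} {seen[key]}"
        else:
            label = term
        numbered.append((label, domain, definition))
    return numbered


if __name__ == "__main__":
    # Placeholder data for illustration only.
    print(definition_format_issues("A device (instrument) for measuring."))
    print(number_polysemous([
        ("root", "Phonetics", "first sense"),
        ("root", "Phonetics", "second sense"),
        ("stem", "Phonetics", "only sense"),
    ]))
```

Checks of this kind can only pre-sort records for the proofreader; incorrect Slovak and the question of whether two records really describe different concepts still require human judgement.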

As far as the content of the term record is concerned, the first step after the automatic incorporation of terminological data into the term bank is to identify and classify these data. In the case of encyclopaedically drafted term records in particular, this means splitting the original entries into definition, context and comment fields. The next step is the evaluation of the data: records for non-terms are deleted, and the relevance of terms belonging to the terminology of a specific author or school is assessed.

Editing activity mostly concerns:

1. definition field

2. domain field

3. acceptability field

4. related terms field

1. DEFINITION FIELD

In spite of all rules and efforts, definition-related discrepancies include incompleteness, inconsistency, amateurishness and subjectivity. Varying definitional practices have resulted in definitions in some cases and descriptions in others, i.e. instead of a sentence that would describe, delimit and distinguish the concept, these texts only describe it. This applies in particular to institutions, treaties and other proper names, for which the team plans to create a set of descriptors (a metalanguage) so that they can be properly described rather than defined; the treatment of nomenclatures and accompanying data raises similar questions. Some of the so-called definitions are in fact definitional contexts[1] but can be kept in this field. Many definitions contain unknown, ambiguous or vague words or terms, or auxiliary verbs (for example byť 'to be', znamenať 'to mean'). Sometimes one can choose from two different definitions for one term or concept.

As the number of terms and domains in the term bank increases, term records for so-called border concepts, defined from different points of view, will multiply and will have to be harmonised or classified.

Communicating with the authors in order to have definitions rephrased also poses problems.

2. DOMAIN FIELD

Using EUROVOC descriptors and non-descriptors for existing term records creates problems when one looks for an equivalent of the original classifications. The original classifications are usually fine-grained, whereas EUROVOC covers the fields in which the European Union is active. The team is therefore considering using both classification systems, the original one and EUROVOC. For example, some of the records in the terminology collection Security and Law are classified into the subdomain protection of buildings, which EUROVOC does not contain, so one has to use public safety, which is a higher level of classification and abstraction.
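The idea of keeping both classifications side by side can be illustrated with a short sketch. It is a sketch under assumptions only: the names DomainLabel and classify are hypothetical, and the descriptor set and mapping below are toy data built from the single example given in the text, not the real EUROVOC thesaurus.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class DomainLabel:
    original: str           # fine-grained label from the source collection
    eurovoc: Optional[str]  # EUROVOC descriptor used, if one was found
    exact: bool             # False when a broader descriptor had to be used


def classify(
    original: str,
    eurovoc_descriptors: Set[str],
    broader_map: Dict[str, str],
) -> DomainLabel:
    """Keep the original label and attach a EUROVOC descriptor to it.

    If the original label is itself a descriptor, it is used directly;
    otherwise a manually maintained mapping to a broader descriptor is
    consulted, and the result is marked as inexact.
    """
    if original in eurovoc_descriptors:
        return DomainLabel(original, original, True)
    return DomainLabel(original, broader_map.get(original), False)


if __name__ == "__main__":
    # Toy data taken from the example in the text.
    descriptors = {"public safety"}
    broader = {"protection of buildings": "public safety"}
    print(classify("protection of buildings", descriptors, broader))
    print(classify("public safety", descriptors, broader))
```

Storing both labels would preserve the finer granularity of the original collection while still allowing records to be grouped or exchanged by the broader EUROVOC descriptor.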

3. ACCEPTABILITY FIELD

Ascribing a status to terms is exclusively the job of experts and expert domain commissions. If a terminology collection was compiled by such a commission, its term records can receive the qualifier recommended or, in contrast, deprecated. Other qualifiers include legal, standardised, obsolete, proposed term and neologism; in future the list might be extended by eurolegal.

4. RELATED TERMS FIELD

The field for related terms is filled in from scratch, since none of the original records included it. This can be done by lay people, provided the result is validated by experts afterwards. The task involves identifying and distinguishing different types of related terms; the open question is whether and how the term bank will reflect hierarchical and non-hierarchical conceptual relations.

Future plans and Conclusion

Obtaining further collections of quality terminology sources through cooperation with authorities and university research groups (library of the economic university)

Editing and completing the incorporated term records

Creating new term records for social protection, construction, history and information technologies

References:

BÉJOINT, Henri: La définition en terminographie. In: Aspects du vocabulaire. Ed. P. J. L. Arnaud and Ph. Thoiron. Travaux du CRTT. Lyon: Presses Universitaires de Lyon, pp. 19 – 25.

CABRÉ, Maria Teresa: La terminologie – théorie, méthode et applications. Ottawa: Armand Colin/PUO 1998. 322 pp.

CONDAMINES, Anne: Linguistique de corpus et terminologie. In: Langages – La terminologie: nature et enjeux, 2005, March, No. 157, pp. 36 – 47.

ČERMÁK, F. et al.: Manuál lexikografie. Praha: H&H 1995. 283 pp.

GAUDIN, François: Socioterminologie. Une approche sociolinguistique de la terminologie. Bruxelles: De Boeck et Larcier 2003. 288 pp.

GESCHÉ, Véronique: Évaluation des définitions d’ouvrages. In: Meta, 1997, Vol. 42, No. 2, pp. 374 – 390.

HORECKÝ, Ján: Základy slovenskej terminológie. Bratislava: Vydavateľstvo SAV 1956. 146 pp.

HORECKÝ, Ján: Vývin slovenskej terminológie. In: Studia Academica Slovaca. 14. Prednášky XXI. Letného seminára slovenského jazyka a kultúry. Ed. J. Mistrík. Bratislava: Alfa 1985, pp. 265 – 277.

ISO 704 (2000): Travail terminologique – principes et méthodes. International Organization for Standardization.

ISO 860 (1996): Travaux terminologiques – Harmonisation des termes. International Organization for Standardization.

ISO 1087-1 (2000): Travaux terminologiques – Vocabulaire. International Organization for Standardization.

ISO 12620 (1999): Aides informatiques en terminologie. International Organization for Standardization.

KOCOUREK, Rostislav: La langue française de la technique et de la science. Wiesbaden: Brandstetter 1991. XVIII + 327 pp.

LERAT, Pierre: Les langues spécialisées. Paris: PUF 1995. 208 pp.

MASÁR, Ivan: Príručka slovenskej terminológie. Bratislava: VEDA 1991.

NAZARENKO, Adeline, HABERT, Benoît, SALEM, André: Les linguistiques de corpus. Paris: Armand Colin 1997. 240 pp.

OTMAN, Gabriel: Les bases de connaissances terminologiques: les banques de terminologie de seconde génération. In: Meta, 1997, Vol. 42, No. 2, pp. 244 – 256.

PAVEL, Silvia, NOLET, Diane: Handbook of Terminology. Translation Bureau 2001.

SAGER, Juan Carlos: A Practical Course in Terminology Processing. Amsterdam/Philadelphia: John Benjamins 1990. 254 pp.

SEPPÄLÄ, Selja: Composition et formalisation conceptuelles de la définition terminographique. Mémoire pour l'obtention du DEA. Genève: Université de Genève 2004. 185 pp.

SCHWARZ, Jozef: Vybrané teoretické a metodologické problémy terminografie: poznatky z tvorby České terminologické databáze knihovnictví a informační vědy. In: Národní knihovna, 2003, Vol. 13, No. 1, pp. 21 – 41.

Mgr. Jana Levická, PhD

[1] "defining context contains definitive information that may look very much like a definition, but is incomplete or doesn't have the right form for a definition"