Controlled vocabularies provide a way to organize knowledge for subsequent retrieval, particularly in metadata. They are used in subject indexing schemes, subject headings, thesauri and taxonomies. Controlled vocabulary schemes mandate the use of predefined, authorised terms that have been preselected by the designer of the vocabulary.
ASD-STE100 Simplified Technical English (formerly AECMA Simplified English) is a specification for writing aircraft documentation. The principles can be applied to all industry sectors. ASD-STE100 provides a set of writing rules and a dictionary of words and their meanings. It has a limited number of words; a limited number of clearly defined meanings for each word; a limited number of parts of speech for each word; a set of rules for writing text. This article outlines the standard, and shows how it helps to prevent ambiguity in text.
Search engine accuracy is important, but convenience may be more important than squeezing the last few ounces of performance out of your system. Peter Van Dijck demonstrates simple but effective query analysis, best bets, and controlled vocabularies -- tools to make your search engines more effective.
A clearinghouse of web sites that have applied or adopted standard classification schemes or controlled vocabularies to organize or provide enhanced access to Internet resources.
Plain English is good for increasing the quality of written documents. Unfortunately, it has limits in many technical situations. We need a special form of language, known as a controlled language, to overcome those limits. One particular controlled language is ASD Simplified Technical English.
Your readers will get confused if you aren’t consistent in the terminology you use in your documents. Does phrase A mean the same as phrase B? If the words are similar, but not quite the same, are they different things or the same thing? If your reader has to hesitate to figure it out, invariably that means that you’ve confused them. If the words mean the same thing, then you must use the same term for it throughout the document. And always use your organization’s official version of the term, if one exists.
Having consistent terminology, and using that terminology consistently, is crucial. Terminology that isn’t consistent, and which isn’t used consistently, can cause more than just a little confusion. And documentation that doesn’t use that terminology consistently can cause more problems than it clears up. Not only with customers, but within your company and project as well.
The documentation used in manuals and other technical writing worldwide is predominantly created in English. Though much discussion has been devoted to it in academia and elsewhere for years, technical English continues to be written in a way that is difficult for many people to understand.
A controlled vocabulary makes a database easier to search. Since we have many different ways of describing concepts, drawing all of these terms together under a single word or phrase in a database makes searching more efficient because it eliminates guesswork. However, achieving this efficiency requires consistency on the part of the person indexing the database and the use of predetermined terms.
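As a minimal sketch of the idea described above, the following Python maps variant terms to one preferred term both when records are indexed and when a query is run. The vocabulary entries and the function names (`normalize`, `build_index`, `search`) are hypothetical and chosen only for illustration.

```python
# Hypothetical controlled vocabulary: every variant points to one preferred term.
PREFERRED_TERM = {
    "car": "automobile",
    "cars": "automobile",
    "auto": "automobile",
    "motorcar": "automobile",
    "automobile": "automobile",
}


def normalize(term):
    """Return the preferred term for a variant, or None if it is not in the vocabulary."""
    return PREFERRED_TERM.get(term.strip().lower())


def build_index(records):
    """Index record ids under preferred terms only, so variants end up together."""
    index = {}
    for record_id, terms in records.items():
        for term in terms:
            preferred = normalize(term)
            if preferred is not None:
                index.setdefault(preferred, set()).add(record_id)
    return index


def search(index, query):
    """Normalize the query with the same vocabulary before looking it up."""
    preferred = normalize(query)
    if preferred is None:
        return set()
    return index.get(preferred, set())


# Illustrative records: record id -> terms assigned by the indexer.
records = {1: ["Car"], 2: ["motorcar"], 3: ["bicycle"]}
index = build_index(records)
print(sorted(search(index, "auto")))  # [1, 2]
```

Record 3 is not indexed at all because "bicycle" is not in the vocabulary, which illustrates why the indexer has to apply the predetermined terms consistently.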
The prevalence of digital information has raised issues regarding the suitability of conventional library tools for organizing information. The multi-dimensionality of digital resources requires a more versatile and flexible representation to accommodate intelligent information representation and retrieval. Ontologies are used as a solution to such issues in many application domains, mainly because they can specify semantics and relations explicitly and express them in a computer-understandable language. Conventional knowledge organization tools such as classifications and thesauri resemble ontologies in that they define concepts and relationships in a systematic manner, but they are less expressive than ontologies when it comes to machine-readable representation. This paper uses the controlled vocabulary at the Gateway to Educational Materials (GEM) as an example to address the issues in representing digital resources. The theoretical and methodological framework in this paper serves as the rationale and guideline for converting the GEM controlled vocabulary into an ontology. The major difference between the original semantic model of the GEM controlled vocabulary and the ontology lies in the values added through deeper semantics in describing digital objects, both conceptually and relationally.
You have probably heard information architects discussing the benefits of their latest taxonomy project and how you should be implementing one. But how, you might wonder, can you get started? In the next installment about Controlled Vocabularies, our authors go into detail about one methodology.
This paper outlines the assumptions, process and results of a pilot study of issues of interoperability among a set of seven existing controlled vocabulary schemes that make statements about the audience of an educational resource.
Firms that export to the USA face the challenge of delivering accompanying technical documentation that meets the requirements of that country. This is true not only in legal or safety-relevant terms, but also in terms of the language used. Production and translation of multilingual documentation are part of an overall process. Even while creating the source text, the technical writer must keep in mind the translation into the target language. Unambiguous rendering, consistent terminology, wording that is appropriate for the target group, and reader-friendliness are some of the most important criteria that justify the use of a controlled language.
The DTT e.V. is a forum for everyone who is involved with terminology and terminology work. Its goal is to help solve problems of specialized communication through advice and coordination, and by organizing symposia and workshops.
We need a word for the class of comparisons that assumes that the status quo is cost-free, so that all new work, when it can be shown to have disadvantages to the status quo, is also assumed to be inferior to the status quo.
In projects like 'Wikipedia', collaborative work also necessitates a common language. This was one of the reasons why a 'Wiktionary' or a 'Wikiwoerterbuch' came into being. Thus, the open source community has already set out to develop ideas for the management of terminology and its implementation.
In this research, the development of a 'concept-clumping algorithm' designed to improve the clustering of technical concepts is demonstrated. The algorithm developed first identifies a list of technically relevant noun phrases from a cleaned extracted list and then applies a rule-based algorithm for identifying synonymous terms based on shared words in each term. An assessment of the algorithm found that the algorithm has an 89-91% precision rate, was successful in moving technically important terms higher in the term frequency list, and improved the technical specificity of term clusters.
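The abstract does not give the paper's actual rules, so the following is only an illustrative Python sketch of grouping terms by shared words. The overlap measure (shared words divided by all distinct words), the 0.5 threshold, the greedy single pass, and the sample terms are all assumptions and are not taken from the paper.

```python
def shared_word_ratio(a, b):
    """Fraction of distinct words that two terms have in common (Jaccard overlap)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    return len(words_a & words_b) / len(words_a | words_b)


def clump(terms, threshold=0.5):
    """Greedily place each term in the first clump containing a similar enough member.

    Assumed rule, not the paper's: two terms are treated as candidates for the
    same concept when their shared-word ratio reaches the threshold.
    """
    clumps = []
    for term in terms:
        for group in clumps:
            if any(shared_word_ratio(term, member) >= threshold for member in group):
                group.append(term)
                break
        else:
            # No existing clump was similar enough, so start a new one.
            clumps.append([term])
    return clumps


# Hypothetical noun phrases standing in for a cleaned extracted list.
terms = ["fuel cell", "fuel cell stack", "solar cell", "polymer fuel cell", "wind turbine"]
for group in clump(terms):
    print(group)
# ['fuel cell', 'fuel cell stack', 'polymer fuel cell']
# ['solar cell']
# ['wind turbine']
```

Because the pass is greedy, the result depends on the order of the input list; a production version would likely need a more careful grouping step.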
Mining association rules from large databases of business data is an important topic in data mining. In many applications, there are explicit or implicit taxonomies (hierarchies) for items, so it may be useful to find associations at levels of the taxonomy other than the primitive concept level. Previous work on the mining of generalized association rules, however, assumed that the taxonomy of items remained unchanged, disregarding the fact that the taxonomy might be updated as new transactions are added to the database over time. If this happens, effectively updating the generalized association rules to reflect the database change and related taxonomy evolution is a crucial task. In this paper, we examine this problem and propose two novel algorithms, called IDTE and IDTE2, which can incrementally update the generalized association rules when the taxonomy of items evolves as a result of new transactions. Empirical evaluations show that our algorithms can maintain their performance even for large numbers of incremental transactions and high degrees of taxonomy evolution, and are faster than applying contemporary generalized association mining algorithms to the whole updated database.
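To illustrate what finding associations above the primitive level means, here is a small Python sketch that extends each transaction with the taxonomy ancestors of its items and then counts how often pairs of items occur together. It is not IDTE or IDTE2 and does no incremental updating; the taxonomy, the transactions, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical taxonomy: child -> parent.
PARENT = {
    "milk": "dairy",
    "cheese": "dairy",
    "dairy": "food",
    "bread": "bakery",
    "bakery": "food",
}


def ancestors(item):
    """Return all ancestors of an item, nearest first."""
    result = []
    while item in PARENT:
        item = PARENT[item]
        result.append(item)
    return result


def extend(transaction):
    """Add every ancestor of every item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item))
    return extended


def pair_support(transactions):
    """Count co-occurring pairs at all taxonomy levels, skipping item/ancestor pairs."""
    counts = Counter()
    for transaction in transactions:
        for a, b in combinations(sorted(extend(transaction)), 2):
            if a in ancestors(b) or b in ancestors(a):
                continue  # a pair such as (dairy, milk) carries no information
            counts[(a, b)] += 1
    return counts


transactions = [{"milk", "bread"}, {"cheese", "bread"}, {"milk"}]
support = pair_support(transactions)
print(support[("bread", "milk")])    # 1
print(support[("bread", "cheese")])  # 1
print(support[("bread", "dairy")])   # 2
```

The pair of bread and dairy is supported by two transactions even though each primitive pair appears only once. If the taxonomy changed, for example by moving cheese under a new category, these counts would have to be revised, which is the update problem the paper addresses.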
Many moons ago I waited tables. One day our manager came down to tell us that from now on we were to refer to our customers as 'guests.' We also were to refer to courses as 'first course' and 'second course.' Our chef was French, and found the American use of 'entrée' for the main course annoying, since in French 'entrée' means appetizer. This was my first experience with a controlled vocabulary. A controlled vocabulary is simply what it sounds like: a way to control the meaning of the vocabulary used and to keep track of the related terms.
This paper describes the process of creating a controlled vocabulary which can be used to systematically analyse the copyright transfer agreements (CTAs) of journal publishers with regard to self-archiving. The analysis formed the basis of the newly created Copyright Knowledge Bank of publishers' self-archiving policies. Self-archiving terms appearing in publishers' CTAs were identified and classified, then simplified, merged, and discarded to form a definitive list. The controlled vocabulary consists of three categories describing 'what' can be self-archived, the 'conditions' and the 'restrictions' of self-archiving. Condition terms include specifications such as 'where' an article can be self-archived; restriction terms include specifications such as 'when' the article can be self-archived. Additional information on any of these terms appears in 'free-text' fields. Although this controlled vocabulary provides an effective way of analysing CTAs, it will need continual review and updating in light of any major new additions to the terms used in publishers' copyright and self-archiving policies.
The development of TermWiki provides organizations with an open-source, easy-to-use environment for managing terminology. Uwe Muegge explains the benefits of this system and how it works.
Choosing the right word is a deliberate decision. Making sure everyone in your company uses the same term for the same concept requires discipline. All of this becomes even more complex as you attempt to provide this same information in multiple languages. If a word has multiple meanings, translating terminology from one language to another can be extremely complex, time consuming, and expensive. That’s why approximately 15 percent of all translation project costs arise from rework, and the primary cause of rework is inconsistent terminology.
Personal experience shows that all localization clients are interested in terminology--without exception. Only very large organizations, however, actually seem to maintain terminology databases.