More Information

Identifying and assigning subject terms for digital collections requires thinking about the substance of individual objects and how objects relate to other materials in and beyond the digital collection. Controlled vocabularies are a great tool for this work, yet not all controlled vocabularies will fit with every project.

In general, MEAP recommends that teams consult recognized vocabularies from the Library of Congress, the Getty Research Institute, OCLC, and other organizations. However, there are many instances in which established vocabularies are neither the most relevant nor the most useful for describing objects in a collection. Their outdated terminology, inaccessibility, and unfamiliarity can make them less relevant and less inclusive for local community users. In these instances, MEAP encourages teams to consider thematic and linguistically relevant vocabularies (when available) and to develop their own vocabularies when external vocabularies aren’t available or won’t work. Below we outline how to identify these other vocabularies and when to use them.

The MEAP Metadata Handbook introduces three types of controlled vocabularies: recognized vocabularies, thematic or linguistically relevant vocabularies, and project-specific vocabularies. Each type represents a different approach to subject terms, and each has different benefits and drawbacks.

Understanding the pros and cons of different controlled vocabularies can help teams find and use the right resource for their collections. If you have specific questions about how to make the best use of controlled vocabularies for describing objects in your collection, don’t hesitate to write to us at meap@library.ucla.edu.

Recognized vocabularies

Recognized vocabularies are publicly accessible and internationally recognized lists of terms and concepts. They include vocabularies published and maintained by large organizations like the Library of Congress, the Getty Research Institute, and OCLC. MEAP recommends them for subjects (e.g., Library of Congress Subject Headings, or LCSH), names (e.g., the Getty Union List of Artist Names, or ULAN, and the Virtual International Authority File, or VIAF), and genres (e.g., the Getty Art and Architecture Thesaurus, or AAT).

  • Pros: These vocabularies are widely accepted and extensively used, so they provide opportunities to find broad connections by linking collections and objects to scholarship and publications on similar topics and themes. Recognized subject headings can be especially useful for connecting digital objects to published items in library catalogs and academic bibliographies.
  • Cons: These vocabularies provide terms mostly in English, so they may be less useful for making collections accessible to local communities that speak or use other languages. Also, because these vocabularies reflect long-standing cataloging practices, their terms, spellings, and usages may be outdated and harmful (a problem many libraries acknowledge). Additionally, because terms are organized hierarchically, with broader terms linked to narrower terms, they may not be appropriate for digital collections where users search by individual keywords.

Recommendations: If your collection includes objects with well-known figures or widely studied topics, then using one or two broad terms from these vocabularies may help to link your collection to other resources on the same topic. Employing recognized cataloging terms at the collection level can also help new digital collections connect with established scholarly narratives.
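To picture the broader-to-narrower structure described above, and why it matters for keyword searching, here is a minimal Python sketch. It assumes a tiny made-up hierarchy; the terms, the `BROADER` mapping, and the function names are all hypothetical and are not actual headings or tools from LCSH or any other published vocabulary.

```python
# Illustrative hierarchy only: these are NOT actual headings from LCSH
# or any other published vocabulary. Each term maps to its broader term.
BROADER = {
    "Studio portraits": "Portraits",
    "Portraits": "Photographs",
    "Photographs": None,
}


def broader_chain(term, broader=BROADER):
    """Return the term followed by each successively broader term."""
    chain = []
    seen = set()  # guards against accidental loops in the mapping
    while term is not None and term not in seen:
        chain.append(term)
        seen.add(term)
        term = broader.get(term)
    return chain


def keyword_index(record_terms, broader=BROADER):
    """Flatten hierarchical terms into a flat keyword set for simple search."""
    keywords = set()
    for term in record_terms:
        keywords.update(broader_chain(term, broader))
    return keywords


print(broader_chain("Studio portraits"))
print(sorted(keyword_index(["Studio portraits", "Weddings"])))
```

A record described only with a narrow term would be missed by a keyword search on the broad term unless the broader terms are also recorded or indexed, which is one reason to add one or two broad recognized terms alongside more specific ones.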

Thematic or linguistically relevant vocabularies

Thematic or linguistically relevant vocabularies are structured lists like those mentioned above, but they tend to revolve around a specific language, topic, or region. Examples of thematic or linguistically relevant controlled vocabularies include the following:

  • Homosaurus: An International LGBTQ+ Linked Data Vocabulary is designed to function as a “companion” to other broad vocabularies like those of the Library of Congress but was developed specifically “to support LGBTQ research by enhancing the discoverability of [a collection's] LGBTQ resources.” With its focus on LGBTQ+ terms, this resource is especially useful for collections relating to gender and sexuality studies and may offer terms that are more specific and more relevant to members of those communities.
  • Tesauro de derechos humanos is a Spanish-language database of terms related to human rights. Terms are grouped into categories like “teoría de los derechos humanos” (theory of human rights) and “derechos civiles y políticos” (civil and political rights) and then organized hierarchically, with broader terms leading to more specific terms. This resource is useful for Spanish-language collections, for collections looking to reach Spanish-speaking users, and for collections related to human rights.
  • The African Studies Thesaurus is a structured list of terms relevant to the field of African Studies. This vocabulary, which contains over thirteen thousand terms, is hosted and maintained by the African Studies Center at Leiden University. Users can search for specific terms, browse the terms alphabetically, or browse by subject category. With its focus on African Studies, this resource may be useful for project teams based in Africa or working with materials from African diaspora communities.

These are just three examples. Additional resources are available in this document compiled by the American Library Association. When deciding whether to use a thematic or linguistically relevant vocabulary, here are some of the pros and cons to consider:

  • Pros: Thematic and linguistically relevant controlled vocabularies provide more specialized, precise, and meaningful terms for describing objects in a collection. Because many of them focus on specific topics and themes, they are more easily revised and updated to reflect changes in terminology, thereby reducing the risk of using outdated or harmful terminology. Furthermore, these vocabularies use familiar language, which can make collections more accessible to local communities.
  • Cons: The specialized nature of these vocabularies may make it difficult to reach broader audiences. They may also encourage narrower descriptions that can make it more difficult to grasp the collection’s overall themes.

Recommendations: Thematic and linguistically relevant vocabularies can be a powerful tool for describing objects in unique and specialized collections. Using a relevant resource can make it easier to create consistent metadata that is also accessible to the community.

Project-specific vocabularies

In many cases, a project team may also need to develop its own vocabulary. A custom vocabulary (list of terms) can support a team’s efforts to create inclusive and relevant metadata, though this approach can also be time- and labor-intensive. Some of the pros and cons to consider are as follows:

  • Pros: A custom approach allows teams and communities to identify and define the terms that are most relevant to their collections and to employ those terms in a way that is meaningful to them. Custom vocabularies allow teams to provide subject terms in whatever language is relevant to the collection or to the community.
  • Cons: Custom vocabularies provide a lot of flexibility, but they can be difficult to create in the middle of a digitization project. Developing structured lists requires deep knowledge about the topics and ideas represented in a collection and how they relate. Such knowledge may not be available until after the entire collection is digitized and described, when it is likely too late to redo the descriptions. What is more, using a vocabulary that is unique to an individual project can make it more difficult to identify connections between collections.

Recommendations: Most teams use custom vocabularies to some extent. To make these custom vocabularies useful and consistent, MEAP recommends recording and maintaining a running list of the terms and periodically updating the list to add and revise terms. (Look for a future post outlining specific workflows for developing a custom vocabulary for your project.)
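As one way to picture a running list of terms, here is a minimal Python sketch. It is not a MEAP tool or a recommended workflow: the `RunningVocabulary` class, its methods, and the sample terms are hypothetical, and a shared spreadsheet would serve the same purpose. The sketch also assumes that ignoring case and accents when comparing terms is acceptable, which may not hold for every language.

```python
import unicodedata


def normalize(term: str) -> str:
    """Fold case, accents, and extra whitespace so near-duplicates compare equal.

    Assumption: accent-insensitive matching suits the collection's language.
    """
    decomposed = unicodedata.normalize("NFKD", term)
    without_accents = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    )
    return " ".join(without_accents.casefold().split())


class RunningVocabulary:
    """A hypothetical in-memory term list kept by a project team."""

    def __init__(self):
        self._preferred = {}  # normalized form -> preferred label

    def add(self, label, variants=()):
        """Record a preferred label and any variant spellings that map to it."""
        for form in (label, *variants):
            key = normalize(form)
            existing = self._preferred.get(key)
            if existing is not None and existing != label:
                raise ValueError(f"{form!r} already maps to {existing!r}")
            self._preferred[key] = label

    def resolve(self, term):
        """Return the preferred label for a term, or None if it is not listed."""
        return self._preferred.get(normalize(term))

    def unlisted(self, subjects):
        """Return the subject terms that are not yet in the running list."""
        return [s for s in subjects if self.resolve(s) is None]


vocab = RunningVocabulary()
vocab.add("Derechos civiles y políticos", variants=["derechos civiles"])
print(vocab.resolve("DERECHOS CIVILES Y POLITICOS"))
print(vocab.unlisted(["derechos civiles", "Land rights"]))
```

Checking each new record's subject terms with something like `unlisted` during description, rather than after the whole collection is finished, gives the team a regular prompt to decide whether a new term belongs in the list or should be replaced by an existing one.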