
Clinical Bioinformatics Ontology (CBO®)

  • Molecular diagnostic and cytogenetic techniques generate findings that require a vocabulary with rich terms and consistent representation.

    The ideal vocabulary is controlled in scope, curated with a consistent methodology, and relevant to current clinical practice. The Clinical Bioinformatics Ontology (CBO®) was initiated with these goals in mind and covers the clinical areas of molecular genetics, molecular pathology, cytogenetics and infectious disease.

    Key attributes of the CBO include:

    • Controlled vocabulary – uniquely identified concepts
    • Machine readable – multiple formats
    • Semantic network / ontology – provides biological context to clinical findings
    • Curated resource – consistent content creation methodology and quality control process
    • Focused on current clinical practice – controlled scope

    This site provides guest users with access to a subset of the CBO that includes all of the high level navigational concepts and a subset of the concepts associated with the cystic fibrosis gene (CFTR). Registered users can download the full ontology in multiple formats.

    Content included in the CBO is generated through a content curation process which includes the following steps:

    Identify new concept or relationship between existing concepts

    • Survey and interview molecular diagnostics laboratories
    • Proactive identification of concepts through:
        • GeneTests
        • Literature review
    • Requests from stakeholders and public

    Generate new concept

    • Apply CBO methodology (as described in the CBO white paper available to registered users)
    • Utilize accepted naming conventions as appropriate
    • Create relationships between new concept and existing concepts
    • Create facets (external links) and terms (alternative names)

    Review new content

    • Clinical significance
    • Compliance with CBO methodology
    • Accuracy
    • Completeness

    Technical processing

    • Assign global unique identifier (GUID)
    • Import into nomenclature modeling application
    • Export CSV and RDF files

    Post for public
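The technical processing step above can be sketched in a few lines of Python. This is a minimal illustration only: the `Concept` record, the two-column CSV layout, the `EXAMPLE_NS` namespace, and the use of a random version-4 UUID as the GUID are all assumptions made for the example, not the actual CBO file formats or identifier scheme.

```python
import csv
import io
import uuid
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical concept record; field names are illustrative only."""
    name: str
    # Assumption: a random (version 4) UUID stands in for the CBO GUID.
    guid: str = field(default_factory=lambda: str(uuid.uuid4()))


def export_csv(concepts):
    """Serialize concepts to CSV text using an assumed guid,name layout."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["guid", "name"])
    for concept in concepts:
        writer.writerow([concept.guid, concept.name])
    return buffer.getvalue()


# Placeholder namespace, not a real CBO URI.
EXAMPLE_NS = "http://example.org/cbo/"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def export_ntriples(concepts):
    """Serialize each concept as one rdfs:label statement in N-Triples."""
    lines = []
    for concept in concepts:
        # Escape backslashes and double quotes inside the literal.
        label = concept.name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'<{EXAMPLE_NS}{concept.guid}> <{RDFS_LABEL}> "{label}" .')
    return "\n".join(lines) + "\n"


# Illustrative concept names only.
concepts = [Concept("CFTR gene"), Concept("Human Chromosomal Deletion")]
print(export_csv(concepts))
print(export_ntriples(concepts))
```

N-Triples is used here only because it can be written without a third-party library; the RDF serialization actually distributed with the CBO may differ.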

    Hoffman, M; Arnoldi, C; Chuang, I "The Clinical Bioinformatics Ontology: A curated semantic network utilizing RefSeq information" Pac. Biocomput. Symp. 2005

    Natalya F. Noy, Daniel L. Rubin, Mark A. Musen. "Making Biomedical Ontologies and Ontology Repositories Work," IEEE Intelligent Systems, vol. 19, no. 6, pp. 78-81, November/December 2004

    The Clinical Bioinformatics Ontology™ is funded by Cerner Corporation.

    Sponsorship opportunities are available, please contact for additional information.

    1. What is the CBO?

    The CBO is a semantically structured controlled vocabulary for clinical molecular diagnostics. It addresses the need for consistent representation of clinically relevant molecular biological and cytogenetic entities in a standardized, machine-readable format. It combines the attributes of a controlled vocabulary (standardized naming conventions and coded values) with those of an ontology. Structuring the CBO as an ontology provides a platform for advanced queries, inference logic, correlation, making assertions, and managing the complex data of current clinical practice.
    Content that meets the criteria for inclusion in the CBO is curated using public domain resources. The CBO naming conventions utilize existing standards when possible and alternative methods when necessary to ensure consistently reproducible and traceable content. Quality assurance processes are used to verify new concepts prior to inclusion in the CBO. The standardized content is represented in codified values and structured in the CBO ontology.

    2. What is an ontology?

    An ontology is a "specification of a conceptualization" or "a set of definitions of a relational and formal vocabulary." Ontologies are useful for representing and sharing knowledge of a specified domain among people and computers. In the case of the CBO, the formal vocabulary describes clinically relevant molecular biological entities, in which these definitions are semantically structured or networked. For more information about ontologies, visit

    3. Why is there a need for the CBO?

    Existing medical terminologies have significant gaps and inaccuracies and therefore do not sufficiently support the description of molecular diagnostic and cytogenetic findings. The CBO addresses the need for consistent descriptions and representations of the molecular entities analyzed in clinical diagnostics, in a standardized and machine-readable format. Other notable molecular biology ontologies are oriented toward research and focus primarily on functional attributes, so they do not meet the requirements for clinical use. The CBO is scoped to fill this gap for clinical molecular diagnostics and cytogenetics.

    4. What is the scope of the CBO?

    Targets of current diagnostic analysis are considered clinically significant and are therefore included in the CBO. However, the CBO also includes placeholders for some domains primarily of research interest that are not currently a part of clinical diagnostic practice.

    5. How can I use the CBO?

    The CBO can be used for:
    • Codification of clinical reports
    • Data mining
    • Interfaces between clinical information systems

    6. How do I gain access to the CBO?

    A set of demonstration files is available to any visitor to the Web site. Registered users can download the entire CBO and detailed documentation, including the white paper describing the methodology used to add new content.

    7. Why is the CBO publicly available?

    The goal of making the CBO publicly available is to encourage its widespread use and consideration. We hope to receive feedback on the design and growth of the ontology from the larger scientific community.

    8. What are future plans for the CBO?

    Continued growth of the CBO will include new concepts for cytogenetics and infectious disease, along with expanded content for clinical genetics and molecular pathology. As proteomic and transcriptomic technologies become more widely adopted, concepts needed to describe their findings will be added.

    9. What naming conventions are used in the CBO?

    The CBO applies existing standards when possible. However, alternative naming conventions are applied when necessary. A complete guide to the CBO naming conventions can be found in the CBO white paper.

    10. Will there be additional CBO content available in the future?

    Yes, continuous content additions will be made to the CBO. All content additions will be publicly available through the CBO web site. For this reason, it is important to utilize the most current version of the CBO.

    11. Is the scope of the CBO limited to human concepts?

    Human and nonhuman molecular biological entities are represented in the CBO. The CBO includes nonhuman species that are clinically significant to humans, such as pathogenic viruses and bacteria. Of these, only concepts needed to accurately and clearly document diagnostic results are included.

    12. How much information is in the CBO?

    There are currently 13,520 Concepts, 28,962 Relationships, 6,133 Facets and 7,703 Terms in the CBO.

    Alpha Globin Design Elements

    The representation of alpha thalassemia variants presented a CBO modeling challenge. The schematic [1] below shows the complexity of the alpha-globin cluster and the associated variants. In many cases, the variants are defined by which genes and/or pseudogenes are included in or excluded from the cluster, which makes positional variation recommendations difficult to apply. Other new conceptual challenges included representations for gene clusters and pseudogenes.

    For more detail, download a schematic representation of the alpha-globin gene cluster.

    Terminology axes to represent locus variants, pseudogenes, and gene clusters were created to characterize alpha-globin variation, as were new defining relationships to provide context:

    Has Constituent Element [HAS_CONS_ELE] - Relates a gene cluster to the elements that comprise it
    Involves [INVOLVES] - Relates a human multi-locus variant to the genes and/or pseudogenes involved in the variation
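As a rough illustration of how these two relationships provide context, the sketch below stores them as subject/relationship/object triples and asks which variants involve any element of a given cluster. Only the relationship codes come from the text above. The concept labels, the choice of HBA1, HBA2 and the pseudogene HBAP1 as cluster elements, and the helper functions are assumptions for the example and do not reproduce actual CBO content.

```python
from collections import defaultdict

# Relationship codes as given in the text.
HAS_CONS_ELE = "HAS_CONS_ELE"
INVOLVES = "INVOLVES"

# Illustrative triples; concept labels are hypothetical, not CBO names.
triples = [
    ("alpha-globin gene cluster", HAS_CONS_ELE, "HBA1"),
    ("alpha-globin gene cluster", HAS_CONS_ELE, "HBA2"),
    ("alpha-globin gene cluster", HAS_CONS_ELE, "HBAP1"),
    ("example multi-locus deletion", INVOLVES, "HBA1"),
    ("example multi-locus deletion", INVOLVES, "HBA2"),
]


def build_index(triples):
    """Map (subject, relationship) to the set of related objects."""
    index = defaultdict(set)
    for subject, relation, obj in triples:
        index[(subject, relation)].add(obj)
    return index


def variants_touching_cluster(triples, cluster):
    """Return variants that involve at least one element of the cluster."""
    elements = build_index(triples)[(cluster, HAS_CONS_ELE)]
    variants = {
        subject
        for subject, relation, obj in triples
        if relation == INVOLVES and obj in elements
    }
    return sorted(variants)


print(variants_touching_cluster(triples, "alpha-globin gene cluster"))
```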

    The modeling schema illustrates the new relationships. Other defining relationships also exist for genes and pseudogenes, such as chromosome location and mode of inheritance, but they are omitted here for clarity. The white paper includes a thorough discussion of all concepts and their relationships and can be downloaded from the site.

    1. Tan, et al. A rapid and reliable 7-deletion multiplex polymerase chain reaction assay for alpha-thalassemia. Blood 2001;98:250-251

    Cytogenetics Design Elements

    Cytogenetics presents a modeling challenge due to the number of possible observed chromosome abnormalities. The modeling approach was chosen so that the parent “is a” (subsumption) relationship remains accurate. For instance, Human Chromosomal Deletion “is a” Human Chromosomal Variation. The CBO Diagram illustrates the hierarchical structure for variation represented within the CBO.

    Each of these terminology axes shares lateral, defining relationships both within the cytogenetic-specific hierarchies and to other CBO hierarchies when appropriate. The deletion example here demonstrates the hierarchical structures and definitional relationships that provide context for the abnormality described. The International System for Human Cytogenetic Nomenclature (ISCN 2005) is utilized whenever appropriate to ensure adherence to standards.
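The “is a” hierarchy described above lends itself to a simple subsumption check. The sketch below is one possible way to walk such a hierarchy, assuming each concept has a single parent. Only the Human Chromosomal Deletion link is taken from the text; the other entry and both function names are hypothetical.

```python
# Child -> parent "is a" links. Only the first link is stated in the text;
# "Example Terminal Deletion" is a hypothetical placeholder concept.
IS_A = {
    "Human Chromosomal Deletion": "Human Chromosomal Variation",
    "Example Terminal Deletion": "Human Chromosomal Deletion",
}


def ancestors(concept, is_a=IS_A):
    """Return the parents of a concept, nearest first (single-parent assumption)."""
    chain = []
    seen = {concept}
    while concept in is_a:
        concept = is_a[concept]
        if concept in seen:
            raise ValueError("cycle detected in is-a hierarchy")
        seen.add(concept)
        chain.append(concept)
    return chain


def is_subsumed_by(concept, candidate, is_a=IS_A):
    """True if candidate is a (possibly indirect) parent of concept."""
    return candidate in ancestors(concept, is_a)


print(ancestors("Example Terminal Deletion"))
print(is_subsumed_by("Human Chromosomal Deletion", "Human Chromosomal Variation"))
```

A hierarchy in which a concept can have several parents would need a graph traversal instead of this single chain.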

    You can view the CBO cytogenetics concepts in the CSV files in the Downloads section of the website.