Clinical Trial Search Using Lucene and UMLS

  • Yanqing Ji (Author)
  • Yun Tian (Author)
  • Hao Ying (Author)
  • John Tran (Author)
Research Output: Contribution to conference Paper Peer-review

Abstract

We approached the clinical trial search task of the 2021 TREC Clinical Trials Track as a query problem. A query (called a topic in TREC 2021) is the free-text description of a patient record, while the corpus is a large set of clinical trial descriptions. Lucene, an open-source search engine library, was used for the clinical trial matching process: given a query, the system searches the corpus and returns a subset of clinical trials that meet specific requirements. In this study, the Unified Medical Language System (UMLS) was employed to convert the free text of both topics and clinical trials into more meaningful biomedical concepts, each of which is represented by a Concept Unique Identifier (CUI). An expansion technique based on Medical Subject Headings (MeSH) was used to expand all the condition terms of each clinical trial to include their child terms.
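
The MeSH-based expansion step can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `expand_conditions`, the `recursive` flag, and the toy parent-to-children map are assumptions made for illustration, and the map's entries are not taken from a MeSH release. The abstract does not say whether "child terms" means direct children only or all descendants, so the sketch defaults to direct children and offers the other reading as an option.

```python
from collections import deque

# Hypothetical toy hierarchy (parent term -> child terms), for illustration only.
# A real system would load this mapping from a MeSH release.
TOY_MESH_CHILDREN = {
    "Diabetes Mellitus": ["Diabetes Mellitus, Type 1", "Diabetes Mellitus, Type 2"],
    "Diabetes Mellitus, Type 2": ["Diabetes Mellitus, Lipoatrophic"],
}


def expand_conditions(conditions, children=TOY_MESH_CHILDREN, recursive=False):
    """Return the given condition terms plus their child terms, without duplicates.

    With recursive=False only direct children of the input terms are added;
    with recursive=True all descendants are added.
    """
    expanded = []
    seen = set()
    # Each queue entry is (term, depth); depth 0 marks an original condition term.
    queue = deque((term, 0) for term in conditions)
    while queue:
        term, depth = queue.popleft()
        if term in seen:
            continue
        seen.add(term)
        expanded.append(term)
        # Only original terms are expanded unless recursive expansion is requested.
        if depth == 0 or recursive:
            for child in children.get(term, []):
                queue.append((child, depth + 1))
    return expanded


if __name__ == "__main__":
    print(expand_conditions(["Diabetes Mellitus"]))
    print(expand_conditions(["Diabetes Mellitus"], recursive=True))
```

Under this sketch, a trial whose condition is listed only as a parent term would also be indexed under that term's children, so a topic that mentions a more specific condition can still match the trial.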