Background: During the COVID-19 pandemic, clinical trial recruitment could not be carried out because of travel restrictions, transmission risks, and other factors, which stalled a large number of ongoing and upcoming clinical trials.

Objective: To develop an intelligent screening app that uses artificial intelligence technology to rapidly pre-screen potential patients for phase I solid tumor drug clinical trials.

Methods: A total of 429 screening process records were collected from 27 phase I solid tumor drug clinical trials at the First Affiliated Hospital of Bengbu Medical College from April 2018 to May 2021. Features of the experimental data were analyzed, and collinearity (assessed by principal component analysis) and strong correlation (assessed by χ2 test) among features were eliminated. XGBoost, Random Forest, and Naive Bayes were used to rank the features by importance. Finally, pre-screening models were constructed using classification machine learning algorithms, and the optimal model was selected.
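The χ2 check for strong correlation between two categorical features can be illustrated with a minimal sketch. This is not the authors' code: the function name `chi_square_statistic` and the toy inputs are assumptions made for illustration, and the abstract does not state which cutoff or software was used. The sketch only computes the Pearson χ2 statistic and its degrees of freedom; in practice the statistic would be compared with a critical value or converted to a p-value (for example with `scipy.stats.chi2_contingency`).

```python
from collections import Counter


def chi_square_statistic(xs, ys):
    """Pearson chi-square statistic for two equal-length categorical sequences.

    Hypothetical helper for illustration; returns (statistic, degrees_of_freedom).
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(xs)
    observed = Counter(zip(xs, ys))   # joint counts for each (x, y) pair
    row_totals = Counter(xs)          # marginal counts of the first feature
    col_totals = Counter(ys)          # marginal counts of the second feature
    stat = 0.0
    for r, r_total in row_totals.items():
        for c, c_total in col_totals.items():
            # Expected count if the two features were independent
            expected = r_total * c_total / n
            stat += (observed.get((r, c), 0) - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, dof
```

A large statistic relative to the degrees of freedom suggests the two features carry overlapping information, so one of them could be dropped before modeling.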

Results: Among the 429 screening records, 33 came from subjects who participated repeatedly in different clinical trials; of the remaining 396 screening records, 246 (62.12%) were screened successfully. The gold standard for subject screening success was the final judgment made by the principal investigator (PI) based on the clinical trial protocol. A Venn diagram was used to identify the important features shared across the machine learning algorithms. Intersecting the top 15 variables from the different feature screening models yielded 9 common variables: age, sex, distance from residence to the central institution, tumor histology, tumor stage, tumorectomy, the interval from diagnosis/surgery to screening, chemotherapy, and Eastern Cooperative Oncology Group (ECOG) score. To select the optimal subset, the 9 important feature variables were expanded to 12-feature and 15-feature subsets, and the performance of each feature subset under different machine learning models was validated. The results showed that optimal performance, accuracy, and practicability were achieved using XGBoost with the 12-feature subset. The final model accurately predicted screening success in both internal (AUC = 0.895) and external (AUC = 0.796) validation, and it has been turned into a convenient tool to facilitate its application in clinical settings. Subjects whose predicted probability in the final model was equal to or above the threshold were more likely to be screened successfully.
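The Venn-diagram step, intersecting the top-ranked variables from several feature screening models, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's implementation: the helper names `top_k` and `common_top_features`, the importance scores, and the `smoking` feature are all hypothetical, and the example uses k = 3 rather than the top 15 used in the study.

```python
def top_k(importances, k):
    """Return the names of the k features with the highest importance score."""
    ranked = sorted(importances, key=importances.get, reverse=True)
    return set(ranked[:k])


def common_top_features(rankings, k):
    """Intersect the top-k feature sets from several importance rankings."""
    if not rankings:
        return set()
    sets = [top_k(scores, k) for scores in rankings.values()]
    return set.intersection(*sets)


# Hypothetical importance scores, NOT values reported by the study.
scores = {
    "xgboost": {"age": 0.30, "ecog": 0.25, "stage": 0.20, "sex": 0.15, "smoking": 0.10},
    "random_forest": {"age": 0.28, "stage": 0.26, "ecog": 0.22, "smoking": 0.14, "sex": 0.10},
    "naive_bayes": {"ecog": 0.35, "age": 0.25, "sex": 0.20, "stage": 0.12, "smoking": 0.08},
}

# Features that appear in every model's top 3
shared = common_top_features(scores, k=3)
```

Keeping only features that every ranking agrees on gives a conservative core set, which can then be widened (as the study did when expanding from 9 to 12 and 15 features) and compared by validation performance.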

Conclusion: Based on the optimal model, we created an online prediction calculator and visualization app, ISSP (Intelligent Screening Service Platform), which can rapidly screen patients for phase I solid tumor drug clinical trials. ISSP can effectively overcome constraints of distance and time. On mobile devices, it matches clinical trial projects with patients and completes rapid screening of clinical trial subjects, helping to enroll more subjects. As an auxiliary tool, ISSP optimizes the screening process of clinical trials and provides more convenient services for clinical investigators and patients.

This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. For a full description of the license, please visit
