CerviAssistAI is an intelligent system for assisted detection of cervical cancer from Papanicolaou smear images. It builds a complete pipeline from slide preparation and digitization to automatic cell detection, segmentation, classification according to the Bethesda system and, in the next phase, estimation of a global risk score per microscopic field.
Clinical workflow and data pipeline
In the first project phase, the priority has been to operationalize the clinical and digitization workflow and to build a robust, multi-level database:
- Around 100 patients with Pap smear slides have been processed according to standardized clinical protocols (collection, fixation, Papanicolaou staining).
- Slides are selected mainly from patients in the 30–65 age range, matching the target screening population and excluding inconclusive or poor-quality samples.
- Digitization is carried out using two complementary systems (e.g. a Nikon Eclipse microscope and a VENTANA iScan Coreo scanner) at an equivalent magnification of 40×, producing high-resolution colour images. Differences between the devices are documented and handled later in pre-processing.
From these slides the project has constructed:
- ≈1500–2000 microscopic fields of view (referred to in the project by the Romanian term “tablouri”) used for detection, segmentation and classification;
- 10,000–15,000 individual cells, separated into squamous and glandular types and labelled with Bethesda risk classes (normal and various grades of lesion, including atypical glandular cells (AGC) when present), with double reading on a subset of cases.
All data are organized in a multi-level structure (slide, field, cell), aligned with the CerviAssistAI work-packages (WP1–WP4).
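The slide, field and cell levels described above could be modelled roughly as follows. This is only an illustrative sketch: the class names, attribute names, label strings and the bounding-box convention are assumptions made for the example, not the project's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Cell:
    """One annotated cell inside a field of view (hypothetical schema)."""
    cell_id: str
    bbox: Tuple[int, int, int, int]       # assumed (x, y, width, height) in pixels
    cell_type: str                        # "squamous" or "glandular"
    bethesda_class: str                   # e.g. "normal" or "AGC"
    second_reading: Optional[str] = None  # label from the double reading, if any


@dataclass
class FieldOfView:
    """One microscopic field of view extracted from a slide."""
    field_id: str
    image_path: str
    cells: List[Cell] = field(default_factory=list)


@dataclass
class Slide:
    """One digitized Pap smear slide."""
    slide_id: str
    device: str                           # acquisition system used for digitization
    fields: List[FieldOfView] = field(default_factory=list)


def count_cells(slide: Slide) -> int:
    """Total number of annotated cells across all fields of a slide."""
    return sum(len(fov.cells) for fov in slide.fields)


def disagreements(slide: Slide) -> List[Cell]:
    """Cells whose second reading exists and differs from the first label."""
    return [
        cell
        for fov in slide.fields
        for cell in fov.cells
        if cell.second_reading is not None
        and cell.second_reading != cell.bethesda_class
    ]
```

Keeping the second reading on the cell record makes it straightforward to list reader disagreements for the double-read subset.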
Cell detection, segmentation and classification
On top of this database, the project has implemented and validated the core AI components:
- Automatic cell detection (WP2): a CNN-based detector combined with classical image-processing stages running on sliding-window patches. On an extended test set of 7773 cell samples, the detector achieves a true-positive rate (TPR) above 99% (7754 correctly identified cells, i.e. ≈99.76%). High TPR, TNR and TSR values were also confirmed on a validation set of 95 slides and 1021 real cells.
- Cell segmentation prototype (WP2): binary reference masks have been generated for a subset of fields and used to train U-Net-type and related biomedical segmentation architectures, with a strong focus on accurate nuclear contours, which are critical for risk scoring. The models are currently in an advanced prototype stage, with ongoing refinement for crowded regions and background artefacts.
- Binary cell classification (WP3): a first CNN-based classifier (healthy vs. at-risk cells) has been trained on the proprietary database and evaluated on a held-out test set, reaching ≈90% accuracy. This serves as an initial benchmark towards the project’s ambitious performance targets (up to 99.5% in specific scenarios, as stated in the proposal).
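The detection figure quoted above can be checked with a few lines of arithmetic. The sketch below uses the standard definitions of true-positive rate, true-negative rate and accuracy; the function names are chosen for this example only, and only the counts 7754 and 7773 come from the text.

```python
def true_positive_rate(tp: int, fn: int) -> float:
    """TPR (sensitivity) = TP / (TP + FN)."""
    if tp + fn == 0:
        raise ValueError("no positive samples")
    return tp / (tp + fn)


def true_negative_rate(tn: int, fp: int) -> float:
    """TNR (specificity) = TN / (TN + FP)."""
    if tn + fp == 0:
        raise ValueError("no negative samples")
    return tn / (tn + fp)


def accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Accuracy = (TP + TN) / all samples."""
    total = tp + fn + tn + fp
    if total == 0:
        raise ValueError("no samples")
    return (tp + tn) / total


# Counts from the detection test set: 7754 of 7773 cells correctly identified.
detected, total_cells = 7754, 7773
missed = total_cells - detected  # 19 cells not detected

tpr = true_positive_rate(detected, missed)
print(f"TPR = {tpr:.2%}")  # prints "TPR = 99.76%"
```

The same helpers apply to the binary classifier once its confusion-matrix counts are available; the text reports only its overall accuracy (≈90%), so no counts are assumed here.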
Clinical integration, training and education
Beyond algorithms, CerviAssistAI is already being used in real workflows:
- A marking application is used by pathologists to visualise fields and annotate relevant cells directly in the interface, enabling real-world use of the tools and providing high-quality labels for training.
- A dedicated ingest tool automatically integrates newly marked images (fields, bounding boxes, labels) into the database structure used by the ML models, reducing manual effort and ensuring traceability.
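As an illustration of what such an ingest step might involve, the sketch below validates a batch of annotation records and merges them into an in-memory index keyed by field. Everything here is hypothetical: the record keys (`field_id`, `bbox`, `label`), the `source` tag used for traceability and the dictionary-based store are assumptions for the example, not the actual tool or its database format.

```python
from typing import Dict, List

REQUIRED_KEYS = {"field_id", "bbox", "label"}  # assumed annotation format


def validate(annotation: dict) -> dict:
    """Return a normalized copy of one annotation or raise ValueError."""
    missing = REQUIRED_KEYS - set(annotation)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    x, y, w, h = annotation["bbox"]  # assumed (x, y, width, height)
    if w <= 0 or h <= 0:
        raise ValueError("bounding box must have positive width and height")
    return {
        "field_id": annotation["field_id"],
        "bbox": (x, y, w, h),
        "label": annotation["label"],
    }


def ingest_annotations(
    annotations: List[dict],
    database: Dict[str, List[dict]],
    source: str,
) -> int:
    """Merge annotations into the database and return how many were added.

    The whole batch is validated before anything is written, so a bad
    record leaves the database untouched. Exact duplicates are skipped.
    """
    cleaned = [validate(a) for a in annotations]
    added = 0
    for ann in cleaned:
        # Tag each record with its origin so it can be traced back later.
        record = {"bbox": ann["bbox"], "label": ann["label"], "source": source}
        cells = database.setdefault(ann["field_id"], [])
        if record not in cells:
            cells.append(record)
            added += 1
    return added
```

Validating the full batch before writing is one simple way to keep the store consistent when a marking session contains a malformed entry.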
Overall, 2025 marks the transition from separate algorithmic components and raw data towards an integrated, clinically informed pipeline that can eventually deliver reliable, explainable assistance in cervical cancer screening.