Keynote lecture

Jeffrey Fine, UPMC

21st Century Digital Pathology: Computer-Assisted Diagnosis for Pathologists (pCAD)

Computer-assisted diagnosis for pathologists (pCAD) is a conceptual framework intended to guide the development of automated diagnostic pathology. It is a hypothetical, intelligent computer assistant that would automate easy tasks and thus focus pathologists’ attention on the hardest decisions, which only they can make. It would also support those decisions and provide advanced analytics that could augment the prognostic power of pathology data. pCAD comprises three parts: 1) advanced image analysis; 2) total integration with laboratory information systems (LIS); and 3) ongoing expert adaptation. Formerly hypothetical, advanced image analysis is developing rapidly within the computational pathology community, using techniques such as spatial image statistics and deep learning. LIS integration is crucial because it is how the pCAD system is tasked with work and how it returns organized diagnoses and other report data to the electronic health record. Finally, human experts (e.g. pathologists, scientists and engineers) are necessary for pCAD systems to improve over time and adapt to changing medical practice. Although hypothetical, pCAD may play an important role in the creation of computational pathology as a discipline. It not only helps pathologists and researchers articulate how IT could be applied to real clinical challenges, but may also help convince the wider computational biology community that digital pathology’s challenges are novel ones that merit widespread effort.

High-throughput Analysis of Biomarkers Using Deep Learning

Niels Grabe, Dept. of Medical Oncology, National Center for Tumor Diseases (NCT), Heidelberg, Germany; Hamamatsu Tissue Imaging and Analysis Center, University Heidelberg; Steinbeis Center for Medical Systems Biology, Heidelberg, Germany; Institute of Pathology, University Hospital Heidelberg

In his presentation, Dr. Grabe discusses strategies for the high-throughput analysis of biomarkers in clinical trials. He shows that conventional “off-the-shelf” software is less suitable for this purpose and advocates the development of custom imaging assays based on machine learning. Such assays fully automatically measure a limited set of biomarkers at high throughput in clinically defined cohorts and are characterized by an objectively measured accuracy.

In collaboration with the US National Cancer Institute (NCI), his group developed deep learning and machine learning algorithms that allow fully automatic liquid-based cytological screening for cervical cancer using the biomarkers p16 and Ki-67. In a first cohort of 341 patients, the algorithms showed superior accuracy compared to manual reading. First results for a second cohort of more than 1500 patients are shown.

Secondly, the biological, technical and operational challenges of analyzing immunohistological biomarkers of cell populations in immuno-oncology are discussed, along with how to address them using deep learning based image analysis. Lastly, application cases are presented in connection with automatic slide registration. Using prostate adenocarcinoma as an example, it is shown how prostate carcinoma can be detected automatically and reliably on the basis of multiplex IHC stains. To support immunotherapy of lung cancer, deep learning is applied to deal with anthracotic pigment and to distinguish different growth patterns in lung adenocarcinoma.

Shedding a Different Light on Disease: An Introduction to Infrared Based Spectral Pathology

Alex Henderson [1], Peter Gardner [1]

Given that the eye is an excellent photon detector and the human brain is one of the most advanced image processing systems known to man, it is not surprising that visible light microscopy has been the mainstay of pathological analysis. However, significant advances in detector technology and computer processing power make other regions of the electromagnetic spectrum attractive for tissue analysis. In this presentation I will introduce the revolutionary new techniques of infrared based technology that can facilitate detailed tissue analysis. Hyperspectral imaging coupled with sophisticated computer algorithms enables cancerous tissue to be identified and graded and, in favourable cases, an indication of prognosis to be obtained. This technique lends itself to automation and would be particularly useful for screening large numbers of biopsy samples for the common types of cancer.

Cytomine for collaborative and semantic analysis of digital pathology images

Raphaël Marée [1], Loic Rollus [1], Renaud Hoyoux [1], Benjamin Stévens [1], Gilles Louppe [1], Rémy Vandaele [1], Jean-Michel Begon [1], Philipp Kainz [2], Pierre Geurts [1], Louis Wehenkel [1]

Cytomine is an open-source, rich internet application for remote visualization of whole-slide images (à la Google Maps), collaborative and semantic annotation of regions of interest using user-defined ontologies, and semi-automated image analysis using machine learning.

Here we will describe the design choices that allow data scientists and image analysis software developers to use and extend the platform in various ways. In particular, we will describe our vocabulary-driven annotation of images, our HTTP-based RESTful API for importing and exporting data through web services, and our supervised learning workflows, including our semantic proofreading tools, for object classification, image segmentation, content-based image retrieval, and landmark detection.

We will then briefly present our latest applications of the software, which is now actively used by many research groups working on large sets of images in lung/breast cancer research, renal pathology, toxicology and developmental studies, ... (see publication list:

Comparison of vascular networks from high resolution 3D whole organ microscopic analysis

Michael J. Pesavento [1], Pranathi V. N. Vemuri [1], Caroline Miller [1], Jenny Folkesson [1], Megan Klimen [1]

Understanding hemodynamics in circulatory systems is a critical component of identifying pathophysiologic states in tissue. Significant progress has been made in vascular network imaging; resolution has increased for high volume methods (e.g. microCT and MRI), and volume has increased for high resolution methods (e.g. multi-photon and confocal microscopy). 3Scan’s Knife Edge Scanning Microscope (KESM) spans the gap between high volume and high resolution imaging modalities [1].

Bright field images of resin-embedded whole organs (brain and pancreas) were obtained from mice following systemic perfusion with India ink. Images were acquired at a resolution of 0.7 µm per pixel in XY and a typical slice depth of 5 µm in Z, enabling large-scale analysis and comparison of vascular networks of whole organs consisting of up to 5 TB of imaging data in 3D and a maximum physical volume of 50 x 50 x 20 mm. Vascular features are identified via parallelized vessel segmentation and vectorization methods.

Comparison of vascular features within a single organ reveals significant differences between the areas analyzed within the target tissue, largely as a result of the fractal dimension of the vascular network. Comparison of vascular network features between organs yields significant differences that are commensurate with the function of the vascular network in each organ.

Rapid throughput analysis of high volume vascular data provides an unprecedented ability to compare vascular features between different vascular networks, as well as identify pathological states within those networks.

Recognizing the BRAF mutant-like tumors from whole-slide pathology images

Vlad Popovici [1]

Introduction: Tumor heterogeneity plays a central role in the observed variability of treatment responses and survival of cancer patients. At the same time, it represents a major hurdle on the path towards personalized medicine, and a plethora of molecular biomarkers have recently been proposed to partially resolve this heterogeneity. This is the case for the BRAF mutant-like (BL) gene expression signature [1], which identifies a high risk subpopulation of colorectal cancers (CRCs). These tumors, while not harboring the BRAF V600E mutation, display a pattern of gene activation (for a selected set of genes) similar to that of the mutants and, more importantly, share the same dismal outcome. It is thus of great importance that BL tumors are identified early on. Currently the only method relies on a 64-gene signature [1], which is not yet implemented in clinical practice. We propose to build a tissue-based proxy biomarker which would indicate whether molecular testing should be performed and which could be integrated into daily practice without disturbing any protocol, since it would rely on routine H&E-stained slides. We will restrict, for the moment, this tissue biomarker to stage III, microsatellite-stable (MSS) CRCs, which form a more homogeneous subpopulation.

Methods and Results: The data collection consisted of n=113 samples for which both histopathology whole-slide images and clinical data were available, along with the corresponding BL status (a real-valued score, with positive values indicating BRAF mutant-like cases). All samples are stage III, MSS CRCs. The collection was divided into disjoint training (n=40) and testing (n=73) sets. The images were scanned at 40x magnification and later down-scaled to an equivalent of 2.5x. Tumor regions were extracted based on expert annotation, the color images were denoised (Gaussian filtering), and the hematoxylin intensity was estimated via color deconvolution [2]. All later processing was performed using only these grayscale (hematoxylin intensity) images. Local feature descriptors (vectors of d=64 values) were generated using the SURF method [3], and a bag-of-features [4] representation was generated for each image, based on a dictionary of size k=50 (optimized on the training set). A DLDA (diagonal LDA) classifier was trained to predict the BL status (binary classification). The dictionary consisted of k=50 image feature vectors corresponding to patches of sizes varying between 14x14 and 54x54, from highly variable (high content) regions of the images. Of these, 9 feature vectors were significantly associated with BL status (t-test and correlation test p < 0.05) and also with the mucinous status of the tumors. The DLDA classifier was built on 30 variables (image features; the number was optimized via cross-validation on the training set) and achieved an accuracy on the test set of 91.78% (95% CI=82.89-96.49), corresponding to a sensitivity of 93.75% and a specificity of 90.25% (6 misclassified samples out of 73). The stratification induced by the classifier was marginally significant in survival analysis (survival after relapse): HR=1.62, p=0.06. For the same set of patients, the molecular biomarker has HR=2.22, p=0.02.
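The reported test-set figures can be cross-checked with a few lines of arithmetic. The sketch below assumes a confusion matrix that is not given in the abstract: 32 BL-positive and 41 BL-negative test samples, with 2 false negatives and 4 false positives. These counts are inferred only because they give 6 errors out of 73 and come close to the reported rates; the function name is illustrative.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity as percentages."""
    total = tp + fn + tn + fp
    accuracy = 100.0 * (tp + tn) / total
    sensitivity = 100.0 * tp / (tp + fn)   # true positive rate
    specificity = 100.0 * tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity


# Assumed counts (inferred, not stated in the abstract).
TP, FN, TN, FP = 30, 2, 37, 4

acc, sens, spec = classification_metrics(TP, FN, TN, FP)
print(f"misclassified: {FN + FP} of {TP + FN + TN + FP}")
print(f"accuracy:    {acc:.2f}%")
print(f"sensitivity: {sens:.2f}%")
print(f"specificity: {spec:.2f}%")
```

Under these assumed counts the accuracy (91.78%) and sensitivity (93.75%) match the reported values, and the specificity (90.24%) agrees with the reported 90.25% to within 0.01 percentage points.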

Conclusions: On a relatively small data set we were able to build an image-based proxy biomarker for BL CRCs that achieved good test performance. This biomarker may provide a starting point for a screening test (e.g. by adjusting its threshold) for identifying additional high risk patients. Because it uses standard histopathology images and can be combined with other automatic image analysis tools (e.g. tumor region identification), the proposed method can be integrated into daily clinical practice without disturbing the protocols in place and can work autonomously to provide complementary diagnostic and prognostic information.

Nobody likes the chubby peewee

Jennifer Scheidel [1], Hendrik Schäfer [1], Jörg Ackermann [1], Marie Hebel [1], Tim Schäfer [1], Claudia Döring [2], Sylvia Hartmann [2], Martin-Leo Hansmann [2], Ina Koch [1]

We present an analysis of the spatial distribution of Hodgkin and Reed-Sternberg cells in classical Hodgkin lymphoma. Hodgkin lymphoma is a tumor of the lymphatic system. Large tumor cells called Hodgkin/Reed-Sternberg (HRS) cells characterize classical Hodgkin lymphoma (cHL). Typically, HRS cells make up only about 1% of the lymph node. Clinical diagnosis generates a large number of histological images in which HRS cells are immunohistochemically stained for CD30. Such images are snapshots of the disease, available for a broad variety of medical cases, and offer the opportunity to systematically study the morphology and spatial distribution of HRS cells in the tissue. The automated analysis of images of histological tissues may enable valuable conclusions about the co-operative migration behavior of malignant cells within a lymph node. We analyzed a total of 35 images of tissue sections of the cHL subtypes nodular sclerosis (NScHL) and mixed cellularity (MCcHL), as well as images of an inflammation of the lymph node called lymphadenitis (LA) [1]. Our imaging pipeline identified the profiles of approximately 400,000 CD30-positive cells in the tissue sections [2]. The distribution of cell diameters had its maximum in the range of 20 to 22.5 μm for cHL and 15 to 17.5 μm for LA. The estimated mean diameter of HRS cell profiles in NScHL was 30.6±10.2 μm, whereas the mean diameter for MCcHL was slightly smaller, i.e., 28.6±9.3 μm. Further, we assigned each individual cell to one of eight predefined classes according to the morphological features eccentricity, solidity, and area of its profile, and analyzed the neighbor relations of the cells belonging to different profile classes. In the choice of their nearest neighbors, the cells had statistically significant preferences and aversions for distinct classes.
Each class of cell exhibited specific patterns of preferences and aversions, e.g., round and small cells tended to stay in the neighborhood of their own kind and were shunned by cells of other classes. The patterns of preferences and aversions were differently pronounced depending on the medical diagnosis. We analyzed the distribution of distances to the nearest neighbor to check whether attractive or repulsive forces between cells of specific classes were the source of the patterns of preferences and aversions. The distribution of distances confirmed a clustering of the cells in the tissue but, e.g., the comparatively large mean distance between small and round cells contradicted the hypothesis of an attraction that forces small and round cell profiles to stay among their own kind. The influence of the complex structure of the lymph node and specific cell interactions, e.g., by chemokines and cytokines, are possible explanations for the overall clustering of the cell profiles. The patterns of preferences and aversions of the specific profile classes were more likely an effect of the different motilities of the cells in the tissue.
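The nearest-neighbor analysis described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the brute-force search, the function name, the two class labels and the toy coordinates are all assumptions made for illustration, and the statistical tests for preferences and aversions are not reproduced.

```python
import math
from collections import Counter, defaultdict


def nearest_neighbor_stats(cells):
    """cells: list of (x, y, class_label) tuples, one per cell profile.

    Returns (pair_counts, mean_distance):
      pair_counts[(a, b)] = number of class-a cells whose nearest neighbor
                            belongs to class b
      mean_distance[a]    = mean nearest-neighbor distance of class-a cells
    Needs at least two cells; brute force, so O(n^2) in the number of cells.
    """
    pair_counts = Counter()
    distances = defaultdict(list)
    for i, (xi, yi, ci) in enumerate(cells):
        best_d, best_c = math.inf, None
        for j, (xj, yj, cj) in enumerate(cells):
            if i == j:
                continue
            d = math.hypot(xi - xj, yi - yj)
            if d < best_d:
                best_d, best_c = d, cj
        pair_counts[(ci, best_c)] += 1
        distances[ci].append(best_d)
    mean_distance = {c: sum(ds) / len(ds) for c, ds in distances.items()}
    return pair_counts, mean_distance


# Hypothetical toy data: positions in micrometers, labels made up.
cells = [
    (0.0, 0.0, "small_round"),
    (1.0, 0.0, "small_round"),
    (0.0, 1.0, "small_round"),
    (10.0, 10.0, "large_irregular"),
    (12.0, 10.0, "large_irregular"),
]
pairs, mean_d = nearest_neighbor_stats(cells)
for (a, b), count in sorted(pairs.items()):
    print(f"{a} -> {b}: {count}")
print(mean_d)
```

To judge whether a class pair is preferred or avoided, the observed counts would still have to be compared with the counts expected when class labels are assigned at random. For hundreds of thousands of cells, a spatial index such as a k-d tree would replace the quadratic search.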

Analysis of mass spectrometry imaging data with autoencoders

Spencer A. Thomas [1], Alan M. Race [1], Rasmus Havelund [1], Melissa Passarelli [1], Alasdair Rae [1], Rory T. Steven [1], Josephine Bunch [1] [2], Ian S. Gilmore [1]

Mass spectrometry imaging (MSI) techniques, such as secondary ion mass spectrometry (SIMS) and matrix assisted laser desorption ionization (MALDI), are important in biomedicine and pharmacology due to their ability to image tissue samples. In particular, MSI can provide insight into the differences between healthy and diseased tissues or the properties of a drug within a sample. Improvements to instrument design and acquisition have resulted in the generation of enormous amounts of data. Frequently this data is not fully utilised during analysis due to its size and complexity.

Machine learning can enhance our understanding of these systems and provide insight through automated mining and analysis techniques. One such technique is the autoencoder, a type of neural network, which can perform nonlinear multivariate analysis through unsupervised dimensionality reduction. The ability of the autoencoder to capture nonlinearities in the data makes it favourable over linear dimensionality reduction techniques such as principal component analysis (PCA) or non-negative matrix factorisation (NMF).
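As a rough illustration of the idea, the following sketch trains a very small autoencoder (tanh encoder, linear decoder) with plain gradient descent, using only the Python standard library. Everything specific here is an assumption for illustration: the network size, the learning rate, the function name and the three-channel toy "spectra". It is not the authors' model, and real MSI data with thousands of channels would call for a dedicated deep learning framework.

```python
import math
import random


def train_autoencoder(data, hidden=1, lr=0.05, epochs=500, seed=0):
    """Train a one-hidden-layer autoencoder by full-batch gradient descent.

    The loss is the squared reconstruction error summed over channels and
    averaged over samples. Returns (encode, decode, losses).
    """
    rng = random.Random(seed)
    n, d = len(data), len(data[0])
    W1 = [[rng.uniform(-0.5, 0.5) for _ in range(d)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    W2 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(d)]
    b2 = [0.0] * d

    def encode(x):  # nonlinear compression to `hidden` values
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(W1, b1)]

    def decode(h):  # linear reconstruction of all d channels
        return [sum(w * hi for w, hi in zip(row, h)) + b
                for row, b in zip(W2, b2)]

    losses = []
    for _ in range(epochs):
        gW1 = [[0.0] * d for _ in range(hidden)]
        gb1 = [0.0] * hidden
        gW2 = [[0.0] * hidden for _ in range(d)]
        gb2 = [0.0] * d
        loss = 0.0
        for x in data:
            h = encode(x)
            err = [yi - xi for yi, xi in zip(decode(h), x)]
            loss += sum(e * e for e in err) / n
            dy = [2.0 * e / n for e in err]          # dLoss/dOutput
            for k in range(d):                       # decoder gradients
                gb2[k] += dy[k]
                for j in range(hidden):
                    gW2[k][j] += dy[k] * h[j]
            for j in range(hidden):                  # back through tanh
                dh = sum(W2[k][j] * dy[k] for k in range(d))
                dz = dh * (1.0 - h[j] ** 2)
                gb1[j] += dz
                for k in range(d):
                    gW1[j][k] += dz * x[k]
        losses.append(loss)                          # loss before this update
        for j in range(hidden):                      # in-place parameter update
            b1[j] -= lr * gb1[j]
            for k in range(d):
                W1[j][k] -= lr * gW1[j][k]
        for k in range(d):
            b2[k] -= lr * gb2[k]
            for j in range(hidden):
                W2[k][j] -= lr * gW2[k][j]
    return encode, decode, losses


# Hypothetical toy "spectra": three channels driven by one latent variable t.
data = [[t, t * t, t ** 3] for t in (i / 10.0 - 1.0 for i in range(21))]
encode, decode, losses = train_autoencoder(data)
print(f"reconstruction error: {losses[0]:.3f} -> {losses[-1]:.3f}")
print("code for first sample:", encode(data[0]))
```

The single hidden unit plays the role of the low dimensional representation. A comparison with PCA or NMF, as described in the abstract, would apply the same reconstruction error to those methods at the same number of components.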

Here we demonstrate the use of autoencoders on large MSI data for dimensionality reduction and compare them to standard methods such as PCA and NMF. We show the effectiveness of the low dimensional representation of data using autoencoders in terms of reconstruction accuracy and ease of interpretation. Moreover, we demonstrate how autoencoders can maintain high resolution data during compression, where file sizes can cause PCA and NMF to become computationally prohibitive and require lower resolution (down-binned) data. We compare data from different modalities for the same tissue samples, highlighting the features captured with different techniques.

Whole slide scanning to speed up and improve pancreatic beta cell volume quantification

Willem Staels [1], Gunter Leuckx [1], Yves Sucaet [1], Yves Heremans [1], Nico De Leu [1], Peter In't Veld [1], Harry Heimberg [1]

Quantifying the beta cell volume is an essential tool for the study of diabetes. For this, researchers currently rely on laborious immunohistochemical and bioinformatical analyses of pancreatic tissues. Most time-consuming is the image acquisition step from stained paraffin embedded tissue slides to digital images of entire tissue sections. Whole slide scanning is revolutionizing clinical pathology and research as it allows for fast and effortless acquisition of high resolution images. We compared current methods relying on inverted microscopy with whole slide scanning technology in combination with the Pathomation software platform for quantification of the pancreatic beta cell volume. In both nonpregnant and pregnant mice beta cell volumes measured via both methods are highly similar, while the scanning technique proves to be up to 4 times faster. Whole slide scanning offers a reliable and faster alternative for quantification of the pancreatic beta cell volume.

Learning based detection of early neoplastic changes in histological images

Mira Valkonen [1] [2], Matti Nykter [1], Leena Latonen [1], Pekka Ruusuvuori [3]

Digital pathology has been rapidly expanding into routine practice, which has enabled the development of image analysis tools for quantification of histological images. Prostatic intraepithelial neoplasia (PIN) represents premalignant tissue involving epithelial growth confined to the lumen of prostatic acini. In an attempt to understand oncogenesis in the human prostate, we studied early neoplastic changes in mouse prostatic intraepithelial neoplasia (mPIN) confined to the prostate. We implemented an image analysis pipeline for describing early morphological changes in hematoxylin and eosin (H&E) stained histological images. The model is based on manually engineered features and supervised learning with a random forest model. For training, we used a set of mPIN lesions of abnormal epithelial cell growth and glands of normal tissue segmented by an expert. The extracted features include 102 local descriptors related to tissue texture and to the spatial arrangement and distribution of nuclei. These extracted features provide a numerical representation of a tissue sample and were used to computationally learn a discriminative model using machine learning. The implemented random forest model is an ensemble of 50 classification trees and uses bootstrap aggregation to improve stability and accuracy. Leave-one-out cross-validation (LOOCV) was used to evaluate the performance of our random forest model. The classification model was able to discriminate normal tissue segments from early mPIN lesions and also to describe the spatial heterogeneity of the tissue samples. The model can be easily interpreted and used to assess the contribution of individual features. This feature significance provides information about differences in the histology between normal glands and early neoplastic lesions.
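The evaluation scheme can be sketched as leave-one-out cross-validation wrapped around a bootstrap-aggregated ensemble of 50 learners. To stay dependency-free, the ensemble below uses one-feature decision stumps instead of full classification trees, so it is a simplified stand-in for a random forest and not the authors' model. The function names and the two-feature toy data are hypothetical.

```python
import math
import random
from collections import Counter


def fit_stump(X, y):
    """Return (feature, threshold, left_label, right_label) with fewest training errors."""
    majority = Counter(y).most_common(1)[0][0]
    # Fallback: no split at all, always predict the majority label.
    best = (sum(lab != majority for lab in y), 0, math.inf, majority, majority)
    for f in range(len(X[0])):
        values = sorted({row[f] for row in X})
        for thr in ((a + b) / 2 for a, b in zip(values, values[1:])):
            left = [lab for row, lab in zip(X, y) if row[f] <= thr]
            right = [lab for row, lab in zip(X, y) if row[f] > thr]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if errors < best[0]:
                best = (errors, f, thr, l_lab, r_lab)
    return best[1:]


def predict_stump(stump, row):
    f, thr, l_lab, r_lab = stump
    return l_lab if row[f] <= thr else r_lab


def fit_bagged(X, y, n_estimators=50, seed=0):
    """Bootstrap aggregation: each stump sees a resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    ensemble = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble


def predict_bagged(ensemble, row):
    """Majority vote over all ensemble members."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, test on the held-out one."""
    correct = 0
    for i in range(len(X)):
        model = fit_bagged(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        correct += predict_bagged(model, X[i]) == y[i]
    return correct / len(X)


# Hypothetical toy features (e.g. a texture score and a nuclear density score).
X = [[0.10, 1.0], [0.20, 1.2], [0.15, 0.9], [0.30, 1.1],
     [0.90, 2.0], [1.00, 2.2], [0.80, 1.9], [1.10, 2.1]]
y = ["normal"] * 4 + ["mPIN"] * 4
print(f"LOOCV accuracy: {loocv_accuracy(X, y):.2f}")
```

With 102 features and real tissue segments, a library implementation of random forests would be the practical choice. The cross-validation loop stays the same: every sample is predicted by a model that never saw it during training.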

Keynote lecture

Jeroen Van der Laak, UMC Radboud

Computer aided diagnosis will change the way we practice anatomical pathology

Although a fully digital workflow is currently feasible, not many pathology laboratories have made the transition. High costs and regulatory issues hamper the wide-scale introduction of whole slide image scanners. The availability of validated computer aided diagnosis (CAD) algorithms may dramatically change this situation. These algorithms have the potential to extract clinically meaningful information from scanned tissue sections in a reproducible manner. In this talk I will show the state of the art in this exciting field of research, demonstrating that the implementation of CAD in the routine pathology workflow is closer than many pathologists think. We have already started using CAD to support pathologists in tedious and poorly reproducible tasks. Ongoing research may also lead to entirely new ways of assessing the information present in tissue sections. Taken together, these developments may significantly change the way we currently practice pathology.
