Related Work
CNNs have proved effective and are widely applied for mpMRI-based PCa classification with promising performance. Wang and Wang [13a] explored optimal mpMRI sequence combinations as the CNN input; their model achieved an AUC of 0.95, reported to outperform all models in the PROSTATEx Challenge. Going beyond PCa classification alone, Kiraly, Abi Nader, Tuysuzoglu, Grimm, Kiefer, El-Zehiry and Kamen [27] developed an encoder-decoder model that detects prostate lesions and simultaneously classifies their malignancy. However, these studies required manually cropped prostate regions, which is time-consuming and expensive [22a, 28].
End-to-end PLDC frameworks have also been investigated to avoid the need for manual prostate segmentation. Yang, Liu, Wang, Yang, Le Min, Wang and Cheng [2] incorporated a CNN for automatic segmentation prior to PLDC; however, the shallow (five-layer) network extracted insufficient prostate image features, which considerably degraded the overall segmentation accuracy. Later, Wang, Liu, Cheng, Wang, Yang and Cheng [29] proposed a deeper prostate segmentation model capable of capturing more complex features. Apart from improving segmentation performance, fusing spatial features with 3D CNNs is another means of enhancing PCa classification accuracy. Mehta, Antonelli, Ahmed, Emberton, Punwani and Ourselin [30] employed a patient-level 3D model for binary classification using volumetric mpMRI, achieving AUCs of 0.79 and 0.86 on their local cohort and PROSTATEx, respectively. However, the model was evaluated on single-cohort datasets only, and domain shift would occur if it were directly applied to an unseen cohort [17, 18]. The few studies with mpMRI sequences from multiple cohorts (e.g., Mehta, Antonelli, Ahmed, Emberton, Punwani and Ourselin [30]) simply combined the heterogeneous images, which provides sufficient samples for model training but inevitably ignores data-source heterogeneity. Such models are therefore prone to severe domain shift, and their predictions can be biased toward particular cohorts.
Very recently, many studies have investigated DA approaches to alleviate inter-site distributional variability, among which UDA methods demonstrated advantages in exploiting unlabeled target samples [20]. UDA methods can be categorized into two groups: (1) image translation and (2) feature alignment approaches. The former aligns image appearance across domains [17, 22], typically by translating images with GAN-based networks [23]. However, texture similarity between the synthesized target image and the source image is crucial for the PLDC problem; the DA process can fail when texture similarity is insufficient, particularly in the generated lesion area [22c]. Moreover, lesions could be missed during translation because image regions vary in transferability, further compromising DA [31], and the GAN models can distort the appearance of non-lesion regions, leading to unreliable lesion assessment [24].
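For illustration, such GAN-based translation methods commonly optimize an adversarial objective together with a cycle-consistency term. The following is a generic sketch, with generators $G_{S\to T}$, $G_{T\to S}$ and discriminators $D_S$, $D_T$ introduced here purely for exposition, and is not the exact formulation used in [17, 22, 23]:

$$\mathcal{L} = \mathcal{L}_{\mathrm{adv}}(G_{S\to T}, D_T) + \mathcal{L}_{\mathrm{adv}}(G_{T\to S}, D_S) + \lambda\,\mathbb{E}_{x\sim\mathcal{D}_S}\!\left[\lVert G_{T\to S}(G_{S\to T}(x)) - x\rVert_1\right] + \lambda\,\mathbb{E}_{y\sim\mathcal{D}_T}\!\left[\lVert G_{S\to T}(G_{T\to S}(y)) - y\rVert_1\right].$$

Because the cycle-consistency term is computed over the whole image, it gives no explicit guarantee that fine lesion textures are preserved, which underlies the failure mode discussed above.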
Feature alignment approaches instead extract domain-invariant features to reduce domain shift [26]. A common strategy is to minimize a measure of distribution discrepancy between domains (e.g., the difference in second-order statistics [25]) using a Siamese network architecture. Adversarial learning [26a] can also align features by making cross-domain features indistinguishable to a domain classifier. For instance, Wang, Feng, Zhang, Wang, Lv and Yi [14] developed a GAN-based method to learn domain-invariant features from mammographic images acquired for breast cancer screening. However, these models were usually trained on entire images, treating all voxels equally [26b, 28]. Previous works [24, 26b] revealed that not all image regions facilitate knowledge transfer across domains; roughly aligning features over the whole image would introduce irrelevant knowledge and render the DA ineffective. We hypothesize that background regions on mpMRI sequences, such as regions outside the prostate gland, would contribute little to DA in our PLDC problem. To our knowledge, only a few works have reported PCa classification using multi-site ultrasound images [32], histopathology images [33], or T2 image slices only [13b].
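As an illustrative sketch of these two strategies (not necessarily the exact losses used in [25] or [26a]), second-order alignment can penalize the distance between source and target feature covariances, while adversarial alignment trains the feature extractor against a domain classifier:

$$\mathcal{L}_{\mathrm{2nd}} = \frac{1}{4d^2}\left\lVert C_S - C_T \right\rVert_F^2, \qquad \min_{F}\max_{D}\;\mathbb{E}_{x\sim\mathcal{D}_S}\!\left[\log D(F(x))\right] + \mathbb{E}_{x\sim\mathcal{D}_T}\!\left[\log\left(1 - D(F(x))\right)\right],$$

where $C_S$ and $C_T$ denote the covariance matrices of the $d$-dimensional source and target features produced by a shared extractor $F$, and $D$ is a domain classifier. Both terms are typically computed over features of the entire image, which is precisely why irrelevant background regions can dominate the alignment.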

2. Results and Discussion

2.1. Datasets

Table 1. Characteristics of the five MRI datasets for prostate segmentation and PLDC.