DMSeqNet-mBART: A State-of-the-Art Adaptive-DropMessage Enhanced mBART Architecture for Superior Chinese Short News Text Summarization
  • Kangjie Cao,
  • Yiya Hao,
  • Jueqiao Huang,
  • Yichao Gan,
  • Ruihuan Gao,
  • Junxu Zhu,
  • Jinyao Wu,
  • Weijun Cheng
Corresponding author: Kangjie Cao ([email protected])

Abstract

Mandarin Chinese, one of the most widely spoken languages in the world, has abundant, regularly updated short news texts online. Generating precise summaries of these texts is vital for effective information transmission and comprehension. This article introduces DMSeqNet-mBART, an enhanced mBART-based model, as a state-of-the-art solution for Chinese short news text summarization. The model incorporates Adaptive-DropMessage, a novel mechanism that selectively discards or retains information based on the attention mechanism's output. This paper demonstrates that DMSeqNet-mBART excels on several evaluation metrics, including BERTScore, BLEU, and ROUGE, surpassing other advanced models such as GPT-4, T5, and MLC. The paper describes the Adaptive-DropMessage mechanism, enhanced dynamic convolutional layers, gated residual connections, custom feed-forward networks with batch normalization, and improvements to self-attention and cross-attention. Results from comparative experiments on six recognized Chinese short news text summarization datasets indicate that the model significantly outperforms leading industry models in fluency, completeness, robustness, and accuracy. DMSeqNet-mBART's success is attributed to its unique combination of architectural and methodological enhancements, suggesting its suitability for a range of complex text-processing tasks and providing novel insights and methods for handling similar complex text data.
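
The abstract does not spell out how Adaptive-DropMessage decides what to discard. As a rough illustration only, the sketch below shows one plausible reading in PyTorch, in which per-token drop probabilities are scaled by how little attention each position receives; the class name, the base_rate hyperparameter, and the specific weighting rule are assumptions for illustration, not the authors' implementation.

    import torch
    import torch.nn as nn

    class AdaptiveDropMessage(nn.Module):
        # Illustrative sketch only: drop hidden-state "messages" with a probability
        # derived from the attention distribution, so weakly attended positions are
        # more likely to be discarded during training.

        def __init__(self, base_rate: float = 0.1):
            super().__init__()
            self.base_rate = base_rate  # assumed base drop rate, not from the paper

        def forward(self, hidden: torch.Tensor, attn: torch.Tensor) -> torch.Tensor:
            # hidden: (batch, seq_len, d_model); attn: (batch, heads, seq_len, seq_len)
            if not self.training:
                return hidden
            # Average attention each position receives, over heads and query positions.
            importance = attn.mean(dim=1).mean(dim=1)  # (batch, seq_len)
            importance = importance / importance.sum(-1, keepdim=True).clamp_min(1e-9)
            # Positions attended less than average get a higher drop probability.
            drop_prob = (self.base_rate
                         * (1.0 - importance * importance.size(-1)).clamp(0.0, 1.0))
            keep = (torch.rand_like(drop_prob) >= drop_prob).float().unsqueeze(-1)
            # Rescale kept messages so the expected activation magnitude is preserved.
            return hidden * keep / (1.0 - drop_prob).clamp_min(1e-9).unsqueeze(-1)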
01 May 2024: Submitted to TechRxiv
03 May 2024: Published in TechRxiv