A Novel Neural Network Architecture Utilizing Parametric-Logarithmic-modulus-based Activation Function: Theory, Algorithm, and Applications
Knowledge-Based Systems (2024)
Nantong Univ | Shandong Univ Sci & Technol | Hohai Univ | Brunel Univ London
- Pretraining has recently greatly promoted the development of natural language processing (NLP)
- We show that M6 outperforms the baselines in multimodal downstream tasks, and that the large M6 with 10 billion parameters achieves better performance
- We propose a method called M6 that is able to process information of multiple modalities and perform both single-modal and cross-modal understanding and generation
- The model is scaled to a large model with 10 billion parameters through sophisticated deployment, and the 10-billion-parameter M6-large is the largest pretrained model in Chinese
- Experimental results show that our proposed M6 outperforms the baseline in a number of downstream tasks concerning both single and multiple modalities. We will continue pretraining extremely large models on more data to explore the limits of their performance
Critical-Depth Raman Spectroscopy Enables Home-Use Non-Invasive Glucose Monitoring
Cited by 62
Uncovering the Hidden Diversity of Litter-Decomposition Mechanisms in Mushroom-Forming Fungi
Cited by 48
Auto-Regressive Time Delayed Jump Neural Network for Blood Glucose Levels Forecasting
Cited by 21
PFLU and FPFLU: Two novel non-monotonic activation functions in convolutional neural networks
Cited by 22
Feature Selection Using Bare-Bones Particle Swarm Optimization with Mutual Information
Cited by 160
RSigELU: A Nonlinear Activation Function for Deep Neural Networks.
Cited by 54
Cited by 19
Fractional-order Convolutional Neural Networks with Population Extremal Optimization
Cited by 9
Cited by 8
Cited by 17
Enhancement of Neural Networks with an Alternative Activation Function Tanhlu
Cited by 46
Reconstruction of Central Arterial Pressure Waveform Based on CNN-BILSTM
Cited by 17
Cited by 14
Cited by 33
Cited by 4
Review—Electrochemistry and Other Emerging Technologies for Continuous Glucose Monitoring Devices
Cited by 79
Cited by 32
Cited by 4
Diagnosis of Arrhythmias with Few Abnormal ECG Samples Using Metric-Based Meta Learning
Cited by 21
Empirical Study of the Modulus As Activation Function in Computer Vision Applications
Cited by 7
Cited by 9
Application of Improved Multi-Strategy MPA-VMD in Pipeline Leakage Detection
Cited by 10
Cited by 4
Cited by 14
Cited by 20
Switching Triple-Weight-SMOTE in Empirical Feature Space for Imbalanced and Incomplete Data
Cited by 5
Probabilistic Orthogonal-Signal-corrected Principal Component Analysis
Cited by 2
Cited by 18
Role of Insulin in Health and Disease: an Update.
Cited by 77
Optimal Evolutionary Framework-based Activation Function for Image Classification
Cited by 1