A Novel Neural Network Architecture Utilizing Parametric-Logarithmic-modulus-based Activation Function: Theory, Algorithm, and Applications
Knowledge-Based Systems (2024)
Nantong Univ | Shandong Univ Sci & Technol | Hohai Univ | Brunel Univ London
Abstract
This paper introduces a novel parametric-logarithmic-modulus-based activation function (PLM-AF) designed to significantly enhance the nonlinear expression capability of models for high-dimensional spectroscopy data. A one-dimensional CNN-BiLSTM (1D-CNN-BiLSTM) model is then developed to capture long-term dependencies within glucose Raman spectra. To the best of our knowledge, this is the first work to optimize the model's predictive performance from the perspectives of both network architecture and activation function simultaneously. The effectiveness of the model is comprehensively evaluated against state-of-the-art methods using a public Raman spectroscopy dataset. Compared with the sub-optimal glucose prediction models, the proposed model improves the training root mean square error (RMSE) by 41.89%. The improved prediction accuracy demonstrates that the proposed regression model with the novel PLM-AF can significantly facilitate non-invasive glucose concentration prediction, thereby advancing auxiliary diagnosis and the healthcare industry.
Key words
Activation functions, Convolutional neural networks (CNN), Bidirectional long short-term memory (BiLSTM), Healthcare, Raman spectroscopy, Glucose concentration prediction
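The 41.89% figure in the abstract is a relative change in RMSE. The abstract does not state the exact formula, so the sketch below assumes the common definition: the reduction in RMSE divided by the baseline RMSE. The function names and all numeric values are illustrative and are not taken from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(sum(squared_errors) / len(y_true))


def relative_improvement(baseline_rmse, proposed_rmse):
    """Percentage by which proposed_rmse is lower than baseline_rmse.

    Assumed definition (not confirmed by the abstract):
        100 * (baseline - proposed) / baseline
    """
    if baseline_rmse <= 0:
        raise ValueError("baseline_rmse must be positive")
    return 100.0 * (baseline_rmse - proposed_rmse) / baseline_rmse


# Made-up glucose concentrations and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
baseline_pred = [1.5, 2.5, 2.5, 4.5]      # every error is 0.5
proposed_pred = [1.25, 2.25, 2.75, 4.25]  # every error is 0.25

baseline_rmse = rmse(y_true, baseline_pred)
proposed_rmse = rmse(y_true, proposed_pred)
improvement = relative_improvement(baseline_rmse, proposed_rmse)
print(f"baseline={baseline_rmse}, proposed={proposed_rmse}, improvement={improvement}%")
```

Under this definition, a 41.89% improvement would mean the proposed model's RMSE is about 58% of the comparison model's RMSE.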