
A new formula to determine the optimal dataset size for training neural networks

Semantic Scholar (2019)
Abstract
In neural networks, training with a large dataset imposes a heavy computational load and does not guarantee accuracy, since a dataset may contain outliers or missing values that distort its overall shape during training. A dataset with too few or too many data points is not an optimal size for training a neural network. A suitable size is therefore required so that the network is trained on an optimal dataset, reducing computation time without significantly affecting accuracy. This paper presents a dataset size reduction formula that provides a suitable training dataset size for neural networks without a significant loss of accuracy. The formula is derived from the Fibonacci retracement, whose use has been reported in many studies. Experiments were performed on four benchmark functions from the literature and four real-world datasets to validate its efficiency, testing groups of datasets with their data reduced from 0 to 95 percent in steps of 5 percent. The proposed method is evaluated on root mean square error (RMSE) and time usage in a radial basis function network (RBFN). It yielded promising results, with an average reduction of over 50 percent in time usage and 20 percent in RMSE.
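The abstract does not state the authors' exact formula, but the idea of choosing a reduced training-set size from Fibonacci retracement levels can be sketched as follows. This is a minimal illustration under the assumption that the standard retracement ratios (23.6%, 38.2%, 50%, 61.8%) are applied as reduction fractions; the function name and ratio choice are hypothetical, not taken from the paper.

```python
# Standard Fibonacci retracement levels, expressed as fractions of the
# original dataset to remove (an assumption for illustration only).
FIB_RETRACEMENTS = [0.236, 0.382, 0.5, 0.618]

def reduced_sizes(n_samples: int) -> dict:
    """Candidate reduced training-set sizes: each retracement ratio r
    maps to keeping (1 - r) of the original samples."""
    return {r: round(n_samples * (1 - r)) for r in FIB_RETRACEMENTS}

# For a 1000-sample dataset, a 61.8% retracement would keep 382 samples.
print(reduced_sizes(1000))
```

A downstream experiment like the one described would then train the RBFN on each candidate subset and compare RMSE and training time against the full dataset.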