Feedback Control Systems Stabilization Using a Bio-inspired Neural Network

The proportional–integral–derivative (PID) control systems, which have become a standard for technical and industrial applications, are the fundamental building blocks of classical and modern control systems. In this paper, a three-layer feed-forward neural network (NN) model trained to replicate the behavior of a PID controller is employed to stabilize control systems through a NN feedback controller. A novel bio-inspired weights-and-structure-determination (BIWASD) algorithm, which incorporates a meta-heuristic optimization algorithm dubbed beetle antennae search (BAS), is used to train the NN model. More precisely, the BIWASD algorithm identifies the ideal weights and structure of the BIWASD-based NN (BIWASDNN) model utilizing a power sigmoid activation function while handling model fitting and validation. The results of three simulated trials on stabilizing feedback control systems validate and demonstrate the BIWASDNN model's exceptional learning and prediction capabilities, while achieving similar or better performance than the corresponding PID controller. The BIWASDNN model is compared to five other high-performing NN models, and a MATLAB repository is publicly accessible on GitHub to encourage and enhance this work.

Received on 22 December 2021; accepted on 02 February 2022; published on 04 February 2022


Introduction
The proportional-integral-derivative (PID) controllers have been used successfully in process-controlled fields of industry such as machinery, metallurgy, power, and light industry since they first emerged decades ago [1]. The PID controller is a feedback-based control loop technique, and two of the main reasons for its continuous use are its simplicity of design and analysis, as well as its simplicity of implementation. However, PID controllers have downsides despite having a simple structure and being simple to comprehend and apply to many control systems [2]. Tuning the PID parameters, notably K_p, K_i, and K_d, determines the performance of the PID control system. Improper tuning will result in inferior or even unstable performance of the controlled system [3]. A neural network (NN) approach is presented in this research for substituting the PID controller with a NN feedback controller in control systems, resulting in equivalent or even better performance of the feedback-controlled system. It is worth mentioning that the main advantage of a NN feedback controller over a PID controller is that it requires less computing during the feedback process because integration and derivatives are not used.

* Corresponding author. Email: vaskatsikis@econ.uoa.gr
The rapid advancement of artificial intelligence, as well as modern electronics and information technologies, has resulted in a plethora of excellent theoretical research findings on artificial NNs. Artificial NNs may be used to model and anticipate complicated problems and patterns, such as predicting summary statistics [4], diagnosing breast cancer [5], financial time-series forecasting [6], optimizing a financial portfolio [7], computing specific matrices in mathematics [8], tracking robotic motion and mobile objects [9,10], and solving perturbed time-varying underdetermined linear systems [11]. In general, determining the best structure of the NN is important and useful. Obtaining the ideal linking weights and number of hidden-layer neurons (HLNs), particularly in the generic multi-input NN, may significantly reduce computational complexity, facilitate hardware realization, and therefore improve the NN's efficiency [12]. One of the most significant and common feed-forward NN training methods is the error back-propagation (BP) algorithm or its variations, which have extensive theoretical studies and real-world applications. BP algorithms are gradient-based iterative approaches that alter the artificial NN weights in a gradient-descent direction to bring the input/output behavior into a desired mapping. BP-type NNs, in particular, appear to have the following flaws: 1. the probability of being trapped in some local minima; 2. difficulty selecting suitable learning rates (or, say, speed of training); 3. inability to design the optimal or smallest NN structure in a deterministic manner (or, say, high computational complexity). As a result of the aforementioned inherent flaws, many improved BP-type algorithms have been developed and investigated. It is worth noting that many studies focus on the learning algorithm itself in order to improve the performance of BP-type NNs [13].
Nevertheless, the vast majority of improved BP-type algorithms have yet to overcome the aforementioned fundamental flaws [14]. As a result, NNs determined from BP methods have a high computational complexity for obtaining the ideal connecting weights [13], and establishing the optimal NN structure is still a difficult process [14].
A number of weights-and-structure-determination (WASD) algorithms are presented in [15] as superior alternatives to prevail over the problems arising from BP algorithms and to define the appropriate NN structure for finer implementations. The WASD algorithm uses the weights-direct-determination (WDD) method to directly specify the optimal linking weights between the hidden and output layers while also acquiring the optimal number of HLNs. Note that three major issues must be resolved during the design of a NN model for any application: 1. the activation function; 2. the number of HLNs (or, say, the structure); 3. the computation of connecting weights between two separate layers. According to the previous analysis, obtaining the optimal connecting weights and the optimal number of HLNs is useful and important, especially in the general multi-input NN, because they can considerably reduce the computational complexity and promote hardware realization. That is, they improve the efficiency of the NNs [15].
Another approach for improving the performance of artificial NNs is to use meta-heuristics. In this context, the beetle antennae search (BAS) algorithm has been used to optimize an Elman NN [16], a feed-forward high-dimensional NN [17], fog computing networks [18], and a back-propagation NN [19]. In this paper, we also employ BAS to strengthen the proficiency of a WASD algorithm. It is worth mentioning that BAS is capable of effective global optimization and has been widely used in a variety of scientific domains in recent years, such as robotics [20], engineering [21], and finance [22][23][24]. In this research, a novel bio-inspired WASD (BIWASD) algorithm for training NNs is developed by integrating the BAS and WASD techniques, and a three-layer feed-forward BIWASD-based NN (BIWASDNN) model is presented. The BIWASD algorithm identifies the ideal weights and structure of the BIWASDNN model utilizing a power sigmoid activation function (AF), while employing cross-validation to address bias and prevent being stuck in local optima during the training process. More particularly, the BIWASD algorithm discovers the ideal number of HLNs, as well as the best power of the AF at each HLN, to minimize the model's error throughout validation. The NN structure is optimized in this way, while the BIWASD method discovers the ideal weights. As a consequence, the computational cost is reduced even further than with conventional WASD approaches, which might require a large number of HLNs, and the cross-validation throughout the training phase improves the accuracy of the predicted results even further. Both the BIWASD algorithm and the other WASD algorithms in [15] use the WDD method and are responsible for training the NN model; the differences are the incorporation of the BAS algorithm, the power sigmoid AF, and the cross-validation throughout the training procedure.
The results of three simulated trials on stabilizing feedback control systems validate and demonstrate the BIWASDNN model's exceptional learning and prediction capabilities, while achieving similar or better performance than the corresponding PID controller.
The following are the work's highlights:
• A NN approach to feedback control systems stabilization is proposed and investigated;
• A three-layer feed-forward BIWASDNN model is presented and studied, and a new BIWASD algorithm for training the WASD-based NN model is proposed by merging the algorithms of BAS and WASD;
• Three simulated trials on stabilizing feedback control systems through a NN feedback controller are presented. In these trials, the BIWASDNN model is compared against five other high-performance NN models, and the numerical stability of the BIWASD is investigated.
The following is the layout of the paper. Section 2 provides preliminaries on feedback control systems based on PID, and the NN approach is described. Section 3 introduces the BIWASDNN model, and its theoretical basis is analysed. Section 4 includes three simulated trials on stabilizing feedback control systems, which examine the prediction ability of the BIWASDNN model on the PID controller's output and the performance of the NN feedback controller on feedback control systems stabilization. It also includes a comparison of the BIWASDNN model to five other high-performing NN models, as well as a concise overview and relevant information about the MATLAB repository, which is available on GitHub. Finally, in Section 5, the final remarks are stated.

PID Feedback Control and NN Approach
Open loop control and closed loop (feedback) control are the two fundamental types of control loops. In open loop control, the controller's control action is independent of the process variable (PV), whereas in closed loop control, the controller's control action is dependent on process feedback in the form of the PV's value. A feedback loop assures that the controller exerts a control action to manage the PV to be identical to the reference input in a closed loop controller. Closed loop controllers are also known as feedback controllers because of this [25]. Control theory introduces feedback to prevail over the shortcomings of the open-loop controller. That is, a closed-loop controller employs feedback to control a dynamical system's states or outputs.
The PID controller is a typical feedback controller (or closed-loop controller) architecture. A PID controller computes an error value e(t) as the difference between a desired setpoint and a measured PV on a continuous basis and makes a correction using proportional, integral, and derivative components. Note that these three components operate on the error signal to generate a control signal. PID controllers have been employed in almost all analogue control systems since the 1920s, and their theoretical understanding and application date back to that time. Assuming that u(t) is the control signal sent to the system, r(t) is the desired output, y(t) is the measured output and e(t) = r(t) − y(t) is the tracking error, the general form of a PID controller is as follows:
$u(t) = K_p\, e(t) + K_i \int_{0}^{t} e(\tau)\, \mathrm{d}\tau + K_d\, \frac{\mathrm{d}e(t)}{\mathrm{d}t}$, (2.1)
where $K_p, K_i, K_d \in \mathbb{R}_0^{+}$ denote the coefficients for the proportional, integral, and derivative terms, respectively. Adjusting these three factors, frequently iteratively by tuning and without specialized knowledge of a plant model, yields the desired closed-loop dynamics.
A NN feedback controller is presented in this section. To train the NN feedback controller, we must first discretize the entire feedback control system procedure. The purpose is to generate data that can be used to train the NN feedback controller. As a result, we construct a discrete-time PID controller ready for programming in Prop. 2.1, based on the continuous-time PID controller (2.1), where each of the terms is a discretized version of its equivalent continuous-time terms.
Proposition 2.1. The discrete version of (2.1) can be formulated as follows:
$u(t_k) = u(t_{k-1}) + K_p \left(e(t_k) - e(t_{k-1})\right) + K_i\, \Delta t\, e(t_k) + \frac{K_d}{\Delta t} \left(e(t_k) - 2e(t_{k-1}) + e(t_{k-2})\right)$, (2.2)
where ∆t denotes the sampling period and k denotes the sample index.
Proof. Differentiating both sides of (2.1) employing Newton's notation yields:
$\dot{u}(t) = K_p\, \dot{e}(t) + K_i\, e(t) + K_d\, \ddot{e}(t)$;
then, approximating the derivative terms, we have the following:
$\dot{u}(t_k) \approx \frac{u(t_k) - u(t_{k-1})}{\Delta t}, \qquad \dot{e}(t_k) \approx \frac{e(t_k) - e(t_{k-1})}{\Delta t}.$
Approximating the remaining derivative term as $\ddot{e}(t_k) \approx \frac{e(t_k) - 2e(t_{k-1}) + e(t_{k-2})}{\Delta t^{2}}$ and then solving in terms of u(t_k), it is easily observable that we obtain (2.2), which completes the proof.
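As an illustration, one step of the discrete PID of Prop. 2.1 can be sketched in Python (the paper's repository is in MATLAB; the function and argument names here are illustrative):

```python
def pid_discrete(e_k, e_k1, e_k2, u_k1, Kp, Ki, Kd, dt):
    """One step of the discrete PID (2.2) in velocity form.

    e_k, e_k1, e_k2 : tracking error at samples k, k-1, k-2
    u_k1            : previous control signal u(t_{k-1})
    """
    return (u_k1
            + Kp * (e_k - e_k1)                      # proportional difference
            + Ki * dt * e_k                          # integral term (rectangle rule)
            + Kd * (e_k - 2.0 * e_k1 + e_k2) / dt)   # second-difference derivative
```

With K_i = K_d = 0 the step reduces to u(t_{k−1}) + K_p(e(t_k) − e(t_{k−1})), matching the proportional part of (2.2).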
Consider the single-input-single-output (SISO) control system of Fig. 1. The plant shown therein typically operates in continuous time and can be described by the following s-transfer function:
$G(s) = \frac{H(s)}{U(s)} = \frac{b_m s^{m} + \cdots + b_1 s + b_0}{a_n s^{n} + \cdots + a_1 s + a_0}$,
where U(s), H(s) are the Laplace transforms of the input and output signals, respectively, and G(s) is the transfer function of the plant with coefficients a_i, b_i ∈ R.
To convert the discrete-time signal u(t_k) produced by the controller in (2.2) to a continuous-time piecewise constant signal, the zero-order-hold (ZOH) method is used [26]. In this way, the transfer function H(s) is converted from the s-domain to the z-domain. Then, the difference equation of the z-transfer function H(z) can be obtained as shown in Prop. 2.2.

Proposition 2.2. Considering the following z-transfer function:
$H(z) = \frac{Y(z)}{U(z)} = \frac{b_0 z^{2} + b_1 z + b_2}{z^{2} + a_1 z + a_2}$, (2.5)
then its difference equation can be formulated as follows:
$y(t_k) = -a_1\, y(t_{k-1}) - a_2\, y(t_{k-2}) + b_0\, u(t_k) + b_1\, u(t_{k-1}) + b_2\, u(t_{k-2})$. (2.6)
Proof. By cross multiplying (2.5), it can be reformulated as follows:
$\left(z^{2} + a_1 z + a_2\right) Y(z) = \left(b_0 z^{2} + b_1 z + b_2\right) U(z)$,
or equivalently
$z^{2} Y(z) + a_1 z\, Y(z) + a_2 Y(z) = b_0 z^{2} U(z) + b_1 z\, U(z) + b_2 U(z)$.
Taking the inverse transform of the above equation, we have that each term $z^{j}Y(z)$ and $z^{j}U(z)$ corresponds to $y(t_{k+j})$ and $u(t_{k+j})$, respectively, and thus we have the following difference equation:
$y(t_{k+2}) + a_1\, y(t_{k+1}) + a_2\, y(t_k) = b_0\, u(t_{k+2}) + b_1\, u(t_{k+1}) + b_2\, u(t_k)$.
Reducing each time index by 2 and then solving in terms of y(t_k), it is easily observable that we obtain (2.6), which completes the proof.
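A minimal Python sketch of the difference equation of Prop. 2.2, assuming the second-order form above (the function and coefficient names are illustrative, not taken from the paper):

```python
def plant_step(u0, u1, u2, y1, y2, a1, a2, b0, b1, b2):
    """y(t_k) of (2.6) from the current/past inputs u(t_k), u(t_{k-1}),
    u(t_{k-2}) and the past outputs y(t_{k-1}), y(t_{k-2})."""
    return -a1 * y1 - a2 * y2 + b0 * u0 + b1 * u1 + b2 * u2
```

Each call advances the plant by one sampling period ∆t; the caller shifts the stored u and y histories between calls.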
By acquiring the values of the signal u(t) of a PID controller during the control system stabilization process, the NN feedback controller in Fig. 2 can be trained to predict the signal u(t) produced by the PID using the error e(t) as input.
According to the aforementioned, the following Alg. 1 describes the whole discretized process, both in the case of a PID control system and in that of a NN feedback control system. In the case of a PID controller, set u(t) according to (2.2); in the case of the NN feedback controller, set u(t) to the NN prediction based on e(t).

5: Set y(t) according to (2.6).
6: Return the error e(t), the signal u(t) produced by the PID or the NN feedback controller, and the system output y(t).
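The discretized loop can be sketched end-to-end in Python as follows, a hedged sketch assuming a second-order plant as in Prop. 2.2 (the repository implements this loop in MATLAB; `simulate` and its coefficient lists `a`, `b` are illustrative names):

```python
def simulate(T, dt, r, Kp, Ki, Kd, a, b, controller=None):
    """Discretized closed-loop run in the spirit of Alg. 1.

    Combines the discrete PID (2.2) with a second-order plant difference
    equation as in Prop. 2.2; a = [a1, a2], b = [b0, b1, b2]. Passing
    `controller` (a function e -> u, e.g. a trained NN) replaces the PID.
    """
    n = round(T / dt)
    e1 = e2 = 0.0   # e(t_{k-1}), e(t_{k-2})
    u1 = u2 = 0.0   # u(t_{k-1}), u(t_{k-2})
    y1 = y2 = 0.0   # y(t_{k-1}), y(t_{k-2})
    out = []
    for _ in range(n):
        e0 = r - y1                                    # tracking error
        if controller is None:                         # discrete PID (2.2)
            u0 = u1 + Kp * (e0 - e1) + Ki * dt * e0 + Kd * (e0 - 2 * e1 + e2) / dt
        else:
            u0 = controller(e0)                        # NN feedback controller
        y0 = -a[0] * y1 - a[1] * y2 + b[0] * u0 + b[1] * u1 + b[2] * u2  # (2.6)
        out.append((e0, u0, y0))
        e2, e1 = e1, e0
        u2, u1 = u1, u0
        y2, y1 = y1, y0
    return out
```

Running the loop with a PID (controller=None) produces the (e(t), u(t)) pairs that later serve as the NN training data.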

The BIWASDNN Model
This section introduces a three-layer feed-forward NN model with one input and n hidden-layer neurons, as shown in Fig. 3. The first layer, more specifically, is the input layer, which receives and distributes X, i.e. the input, to the associated equal-weighted neurons in the second layer. The second layer, which includes no more than n power-activated neurons, receives the value X as input from the first layer, and the corresponding AFs are power sigmoid functions with varying powers. The output layer, which contains a nonactivated neuron, is the final layer. The weight vector W is composed of the weights w_v in the neuron-to-neuron links between the neurons of the second and third layers, and is produced using the WDD method. BIWASDNN is the name of the NN model, which is trained using a novel BIWASD algorithm. Note that the BIWASD algorithm is responsible for finding the ideal weights W and the structure of the NN, i.e. finding the varying powers v of the corresponding AFs F_v(X). We will go through all of the details of the model's construction and structure in this section.

WDD Method and AF
Machine learning is a computationally intensive process, and calculating iterative training errors is a difficult task. The WASD algorithm for NN training is implemented to reduce the computational cost of this process and simplify the network composition [15].
Comprehensive interpretations of important key theoretical underpinnings and analyses are provided here for the construction of the BIWASDNN. To start with, the Taylor polynomial (TP) approximation theorem [27] is stated below.
Theorem 3.1. Assume that K is a nonnegative integer and that, on the interval [a, b], a target function f(·) has a continuous (K+1)-order derivative. Then, with a fixed value h ∈ [a, b], f(x) may be approximated as below:
$f(x) \approx P_K(x) = \sum_{i=0}^{K} \frac{f^{(i)}(h)}{i!}\,(x - h)^{i}$,
where f^{(i)}(h) signifies the value of the i-order derivative of f(x) at the point h and i! signifies the factorial of i. It is worth noting that P_K(x) is the K-order TP of the function f(x).
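A quick numerical illustration of Theorem 3.1 in Python, taking f = exp and h = 0 for concreteness (names are illustrative):

```python
from math import exp, factorial

def taylor_poly(K, h, derivs, x):
    """K-order TP P_K(x) of Theorem 3.1, where derivs[i] = f^(i)(h)."""
    return sum(derivs[i] / factorial(i) * (x - h) ** i for i in range(K + 1))

# f = exp: every derivative at h = 0 equals 1, so P_8(1) should be close to e
approx = taylor_poly(8, 0.0, [1.0] * 9, 1.0)
```

The truncation error shrinks rapidly with K, which is what lets a modest number of activated neurons approximate a smooth target function.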
Proposition 3.1. The TP approximation theorem may be used to approximate multivariable functions [27].
For a target function f(x_1, x_2, . . . , x_g) with (K+1)-order continuous partial derivatives in a neighborhood of the origin (0, . . . , 0) and g variables, the K-order TP P_K(x_1, x_2, . . . , x_g) about the origin is:
$P_K(x_1, \ldots, x_g) = \sum_{i=0}^{K}\ \sum_{i_1+\cdots+i_g=i} \frac{1}{i_1!\cdots i_g!}\, \frac{\partial^{i} f(0, \ldots, 0)}{\partial x_1^{i_1}\cdots \partial x_g^{i_g}}\, x_1^{i_1}\cdots x_g^{i_g}.$
The following nonlinear function may be used to represent the relationship between the NN's input X and the output target Y:
$Y = f(X)$. (3.4)
Note that our approach is similar to the power activated NN in [15], which is in line with the K-order TP. As a result, considering the variable X and, in a neighborhood of the origin, the (K+1)-order continuous partial derivatives, the K-order TP P_K(X) can map (3.4) as follows:
$Y \approx P_K(X) = \sum_{v=0}^{n-1} w_v\, q_v$, (3.5)
where q_v = F_v(X) ∈ R signifies a power function of all inputs and w_v ∈ R signifies the coefficient (or weight) for q_v. The sigmoid AF is one of the most often used functions, and it is most commonly used in models that call on us to predict a probability as a result. Due to its range, the sigmoid is the optimal choice because probability only occurs in the range of 0 to 1. Assuming that X ∈ R^{1×N}, the following power sigmoid AF is proposed and employed:
$F_v(X) = \left(1 + e^{-X^{\odot v}}\right)^{\odot(-1)}$, (3.6)
where ⊙ and the superscript ()^⊙ imply the Hadamard (or element-wise) product and the Hadamard exponential, respectively, X implies the function input, the power value v ∈ Z^+, and the range of (3.6) is (1/2, 1). As a result, the HLNs of the K-order TP NN employ the AF of (3.6) to generate sigmoidal activation. It is worth mentioning that several AFs, such as Chebyshev and Euler polynomials, sine, square wave, and power, are employed on WASD-based NNs in [15,28,29].
Moreover, for a given number of samples S ∈ N with X ∈ R^S, we set q_{S,v} = F_v(X) ∈ R^S and, as a consequence, the input-activation matrix becomes $Q = [q_{S,0}, q_{S,1}, \ldots, q_{S,n-1}] \in \mathbb{R}^{S\times n}$, with the weight vector $W = [w_0, w_1, \ldots, w_{n-1}]^{T} \in \mathbb{R}^{n}$ and the desired-output vector $Y \in \mathbb{R}^{S}$. The weights of the BIWASDNN in Fig. 3 are then calculated using the WDD approach, rather than through iterative weight training as in traditional NNs [27]. From [15], the following lemma can be simply deduced.
Lemma 3.1. The steady-state weights of the K-order TP NN may be acquired forthrightly as below:
$W = \left(Q^{T} Q\right)^{-1} Q^{T} Y = Q^{\dagger} Y$,
where the superscripts ()^T, ()^{−1} and ()^† signify the transpose operator, the inverse operator and the pseudo-inverse operator, respectively.
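The WDD step of Lemma 3.1 can be sketched in Python; the element-wise power sigmoid here is one reading of (3.6), and the helper names are illustrative (the paper's repository is MATLAB):

```python
import numpy as np

def power_sigmoid(X, v):
    """Power sigmoid AF in the spirit of (3.6), element-wise on samples X."""
    return 1.0 / (1.0 + np.exp(-(X ** v)))

def wdd_weights(X, Y, powers):
    """WDD of Lemma 3.1: W = pinv(Q) @ Y, where column v of the
    input-activation matrix Q is F_v(X) for each hidden-layer power v."""
    Q = np.column_stack([power_sigmoid(X, v) for v in powers])
    return np.linalg.pinv(Q) @ Y, Q
```

A single pseudo-inverse solve replaces the iterative weight updates of BP-type training, which is the source of the WASD family's low computational cost.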
According to that, the matrix Q can be computed as proposed in Alg. 2.

BIWASD Algorithm
The BIWASD algorithm, as well as the entire process of modelling and predicting with the BIWASDNN model, are discussed in depth in this subsection. The BAS algorithm has been introduced in [30] and mimics the searching behavior of a beetle, as presented in Fig. 4. Because of its simplicity, BAS permits novel optimization algorithms to be developed (see [31][32][33][34][35]). As a result, the BIWASD algorithm uses the beetle searching behavior for finding the global minimum to specify the optimal weights of the NN.
The BIWASD algorithm is in charge of training the NN model. First, the input data is separated into two sets of samples, one for fitting and one for validation. Note that this is the well-known cross-validation method, which is employed to evaluate the consistency of a machine learning model on the validation data as part of training. Validation tries to ensure that the model's performance generalizes beyond the training set, because the validation set is separate from the fitting set. The parameter p ∈ (0, 1] ⊆ R allows the user to specify the precise percentage split between the fitting and validation sets. Supposing that H is the sample size of X, the first H_1 = pH samples of X are used for fitting the model and the last H_2 = H − H_1 samples for validation. Second, the beetle searching behavior in Fig. 4 is adapted. With x denoting the beetle's position at the t-th time step and f(x) the smell strength, expressed as a fitness function, the beetle follows a random route along the normalized direction
$\vec{b} = \mathrm{rand}(n, 1)/\lVert \mathrm{rand}(n, 1)\rVert$, (3.9)
where n signifies the length of the vector x and rand(·) signifies a random function. We set x_l to represent the left antenna and x_r to represent the right antenna, and their seeking behaviors are formulated as follows:
$x_l = \mathrm{rnd}(x_t - d_t\, \vec{b})$, (3.10)
$x_r = \mathrm{rnd}(x_t + d_t\, \vec{b})$, (3.11)
where rnd(·) denotes a function that outputs a number's value rounded up to the next integer, while the capacity to exploit is correlated to the sensing diameter d_t of the antennae. Moreover, the detecting behavior can be formulated as follows:
$x_t = \mathrm{rnd}(x_{t-1} + \delta_t\, \vec{b}\, \mathrm{sign}(f(x_r) - f(x_l)))$, (3.12)
where sign(·) denotes the sign function, and δ signifies a step size that determines the convergence speed as t increases throughout the search. Last, the d and δ update rules are the following:
$d_{t+1} = 0.95\, d_t + 0.01$, (3.13)
$\delta_{t+1} = 0.95\, \delta_t$. (3.14)
It is worth noting that the aforementioned process is a converted BAS algorithm that returns only integer solutions.
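An integer-valued BAS iteration in the spirit of (3.9)-(3.14) might look as follows in Python; the decay constants and the minimization sign convention are assumptions for illustration, not taken from the paper:

```python
import numpy as np

def bas_integer(fitness, x0, d0=8.0, delta0=5.0, t_max=30, seed=0):
    """Minimize `fitness` over integer vectors with a BAS-style search.

    d0, delta0 : initial antennae sensing length and step size.
    The decay rules d <- 0.95*d + 0.01 and delta <- 0.95*delta are common
    BAS defaults, assumed here.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    best_x, best_f = x.copy(), fitness(x)
    d, delta = d0, delta0
    for _ in range(t_max):
        b = rng.standard_normal(x.size)
        b /= np.linalg.norm(b) + 1e-12                 # random unit direction, cf. (3.9)
        xl = np.ceil(x - d * b)                        # left antenna, cf. (3.10)
        xr = np.ceil(x + d * b)                        # right antenna, cf. (3.11)
        # step away from the worse-smelling antenna (minimization), cf. (3.12)
        x = np.ceil(x - delta * b * np.sign(fitness(xr) - fitness(xl)))
        f = fitness(x)
        if f < best_f:
            best_x, best_f = x.copy(), f
        d = 0.95 * d + 0.01                            # antennae decay
        delta = 0.95 * delta                           # step-size decay
    return best_x, best_f
```

Rounding every candidate with ceil keeps the search on integer vectors, mirroring the converted BAS that BIWASD uses to encode HLN powers.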
Moreover, the fitness function is adjusted to identify the NN's ideal structure. Based on the result of (3.12), we do this by removing the HLNs that correspond to a negative entry. The fitness function then computes the matrix Q as shown in Alg. 2, and the corresponding W of this Q matrix is created, as suggested in Lemma 3.1. More precisely, we set W = Q^†(H_1)Y(H_1), where Q(H_1) and Y(H_1) denote the Q matrix and the targets on the fitting set H_1. Then, the validation set H_2 predictions are calculated, and the root-mean-square error (RMSE) between these predictions and their target values is measured, as shown in Alg. 3. That is, we set Ŷ(H_2) = Q(H_2)W, where Q(H_2) and Ŷ(H_2) denote the Q matrix and the predictions on the validation set H_2. The RMSE is an evaluation of accuracy used in statistics to compare the forecasting errors of different models and is commonly used in machine learning as a cost function for regression problems. RMSE values closer to zero are finer, and the RMSE is calculated as follows:

[Fig. 4. Beetle searching behavior: measure the intensity of odor at each step t and compare the odor intensity f(x) at the two antennae.]
$\mathrm{RMSE} = \sqrt{\frac{1}{H}\sum_{i=1}^{H}\left(\hat{Y}_i - Y_i\right)^{2}}$, (3.15)
where H denotes the number of samples, and Ŷ and Y denote the predicted and the target values, respectively. Provided a maximum number of HLNs n, the adapted BAS method identifies the ideal number of HLNs as well as the optimal AF power at each HLN, lowering the model's error throughout validation. Last, the initial x_t at t = 0 in the aforementioned adapted BAS method must be a random vector x_0 ∈ Z^n. However, we highly suggest setting as the initial x_t the following vector:
$x_0 = [1, 2, \ldots, \mathrm{rnd}(n/2), -1, -1, \ldots, -1]^{T} \in \mathbb{Z}^{n}$. (3.16)
Based on [15], the starting structure of the NN employs the suggested powers v for rnd(n/2) HLNs, while the process of adding/removing HLNs in the structure throughout the adjusted BAS iterations is made easier. BAS returns x* at the end of the iterations, and we only keep the values of x* ≥ 0 in N*. Notice that N* denotes the AF's optimal powers v at each NN HLN, and its length denotes the ideal number of HLNs. According to that N*, the BIWASD algorithm determines and returns the optimal W on the entire training data set, as well as the RMSE of the validation set.
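Putting these pieces together, the validation-RMSE fitness that the adapted BAS minimizes can be sketched as follows (a Python sketch assuming the power sigmoid reading of (3.6); negative entries of x mark removed HLNs, and the function names are illustrative):

```python
import numpy as np

def rmse(y_pred, y):
    """RMSE of (3.15)."""
    return float(np.sqrt(np.mean((y_pred - y) ** 2)))

def biwasd_fitness(x, X, Y, p=0.7):
    """Fitness of a candidate structure x: keep the HLNs with non-negative
    powers, fit W on the first H1 = p*H samples via WDD, and score the
    RMSE on the remaining validation samples."""
    powers = [int(v) for v in x if v >= 0]
    if not powers:
        return float("inf")                  # no neurons left: invalid structure
    H1 = int(p * len(X))
    Q = np.column_stack([1.0 / (1.0 + np.exp(-(X ** v))) for v in powers])
    W = np.linalg.pinv(Q[:H1]) @ Y[:H1]      # WDD on the fitting set
    return rmse(Q[H1:] @ W, Y[H1:])          # error on the validation set
```

BAS then searches over integer vectors x, so each fitness evaluation is one direct WDD solve plus one validation pass, which keeps the structure search cheap.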
In conclusion, the BIWASD algorithm is used to train the NN model. It splits the data into fitting and validation sets and, given a maximum HLN number n, sets the x_0 of (3.16) as the starting powers v of the AF at each HLN of the NN model and finds the optimal N* by estimating the RMSE of the validation set. It is worth noting that the power sigmoid AF of (3.6) is not obligatory; it can be replaced with any of the AFs listed in [15]. Fig. 5a shows the whole procedure of the BIWASD algorithm, whereas Fig. 5b shows the entire process of modelling and predicting with the BIWASDNN model.

Experiments and MATLAB Repository
This section investigates the proficiency and the prediction ability of the BIWASDNN model in three simulated trials on stabilizing feedback control systems. Because the BIWASDNN model is specifically built to handle only regression problems, it is equitable to compare it to certain other well-performing NN models. The Hermite polynomial NN (HPNN) from [15], which is one of the best-performing WASD-based NNs in solving regression problems, is one of the models, while the others include a MATLAB long short-term memory (LSTM) model, a MATLAB feed-forward (FF) model, and two models from MathWorks' Regression Learner App. As a consequence, the fine tree (FTree) and squared exponential Gaussian process regression (SEGPR) models are used to tackle the regression problems which arise when stabilizing feedback control systems are considered. On the one hand, the MATLAB LSTM model has the following specs: optimizer = Adam; numHiddenUnits = 200; maxEpochs = 100; miniBatchSize = 20; LearnRate = 0.01. On the other hand, the MATLAB FF model has the following specs: hiddenSizes = 10. Furthermore, a short overview of the BIWASDNN's MATLAB repository is provided, along with some valuable information. The simulation trials employ Alg. 1 to generate three datasets containing samples of the tracking error e(t) and the corresponding PID controller's signal u(t). Each dataset corresponds to the results of a PID feedback control system in which the plant's s-transfer function and the PID parameters are declared in Tab. 1. Following the creation of each dataset, the BIWASDNN, HPNN, FTree, and SEGPR models are trained to produce the signal u(t) using the error e(t) as input. Setting the desired output r(t) = 1, the sampling period ∆t = 0.01 and the period end T = 20, Alg. 1 outputs 2000 samples. Splitting the data 50-50% for training and testing, 1000 samples are used to train the models and 1000 samples to test them.

Simulated Trials
In the case of the BIWASDNN model, we set p = 0.7, splitting the training data into 70-30% for fitting and validating the model. It is interesting to note that the closer the p value gets to 1, the fewer samples remain for validating the model, and the closer the p value gets to 0, the opposite occurs, lowering the model's prediction accuracy in both circumstances. As a consequence, for the model's optimal prediction accuracy, the value of p should be set in [0.6, 0.9]. In addition, the BIWASD's parameters are set to d_0 = 8 and δ_0 = 5, the maximum number of iterations to t_max = 30, and the maximum number of HLNs to n = 30. Assuming that the parameter p ∈ [0.6, 0.9], and since the parameter t_max specifies the maximum number of iterations, raising t_max will commonly lead to a higher prediction accuracy for the model. In Figs. 6a-6c, the training procedure of the model is depicted through the RMSE for the datasets with plant 1, 2 and 3, respectively, where we observe that the BIWASD algorithm demands fewer than 30 iterations to converge to the optimal NN structure. Furthermore, the BIWASD algorithm returned N* ∈ Z^16 and, as a result, the optimal structure of the NN has 16 HLNs for the specific runs.
In Figs. 6d-6f, the results of the NN models on the test data are depicted for the datasets with plant 1, 2 and 3, respectively. The NN models' prediction capability on 1000 samples is great and practically identical in Figs. 6d and 6e, but the findings are not as good in Fig. 6f, where the NN models' predictions have a tiny divergence from the target in some samples. The BIWASDNN, HPNN, FTree and SEGPR models' statistics are presented in Tab. 2, which includes the coefficient of determination (R^2), the mean absolute percentage error (MAPE), the mean absolute error (MAE), the root-mean-square error (RMSE), the average number of training iterations (NTI) and the average time consumption (TC) on the train and test sets for the datasets with plant 1, 2 and 3. Note that, following the notation used in the RMSE formula (3.15),
$\mathrm{MAPE} = \frac{100}{H}\sum_{i=1}^{H}\left|\frac{\hat{Y}_i - Y_i}{Y_i}\right|, \qquad \mathrm{MAE} = \frac{1}{H}\sum_{i=1}^{H}\left|\hat{Y}_i - Y_i\right|.$
Checking the results of Tab. 2, it is clear that BIWASDNN has the lowest TC in all the datasets while the LSTM has the highest. In all datasets, all of the models have nearly identical statistics, with the exception of the LSTM model, which has poorer statistics. In addition, the FTree model's train data statistics are the best in the dataset with plant 3, while the BIWASDNN model's are the third best. However, the SEGPR model's test data

Because Tab. 2 also contains the NTI, some observations concerning algorithm convergence can be made. The BIWASDNN model requires fewer than 30 iterations to attain the highest prediction accuracy for all the datasets, as also recorded in Figs. 6a-6c, whereas the HPNN model requires 13 iterations. In the case of the LSTM, since the parameter maxEpochs has been set to 100, the model requires 100 iterations to attain the highest prediction accuracy for all the datasets, while the FF model requires 20, 50 and 130 iterations to attain the highest prediction accuracy for the datasets with plant 1, 2 and 3, respectively. Note that the FTree and SEGPR models are not optimizable, hence the number of iterations is not available through MathWorks' Regression Learner App. According to the aforementioned observations, the HPNN model has the lowest NTI, followed by the BIWASDNN model, with the LSTM model having the highest. However, it is worth mentioning that the HPNN model has an average NTI/TC ratio of 1/0.6923, while the BIWASDNN model has 1/0.0033. That is, the BIWASDNN model is about 90 times faster to train than the HPNN model. Now that the BIWASDNN model has been trained for each plant 1, 2 and 3, Alg. 1 is used with the NN feedback controller to stabilize the input signal of each control system. The results of each simulation trial are depicted in Figs. 6g-6i. We can see in Fig. 6g (plant 1 case) that the feedback control system with the NN feedback controller converges to the desired output in a similar fashion to the PID feedback control system. In contrast to the PID feedback control system, in Figs. 6h (plant 2 case) and 6i (plant 3 case) the feedback control system with the NN feedback controller converges faster to the desired output. That is, the NN feedback controller has better performance than the PID controller in the cases of plants 2 and 3.
In general, according to Tab. 2 and the results depicted in Figs. 6d-6f, the BIWASDNN model performed flawlessly in creating a model that can predict the PID controller's signal u(t) taking as input the corresponding tracking error e(t), while its prediction capability is almost the same as that of the HPNN, FTree, and SEGPR models on the specific datasets with plant 1, 2 and 3, and its TC is the lowest. Moreover, the NN feedback controller achieves similar or better performance than the corresponding PID controller.

BIWASD's Stability and MATLAB Repository
Numerical findings are presented in this subsection to prove the practicality of the suggested BIWASDNN model and the BIWASD algorithm on regression problems that involve datasets for training NN feedback controllers. The generalization capabilities of the suggested BIWASDNN equipped with the BIWASD algorithm are also investigated for completeness.
Because the input-layer weights and hidden-layer biases are produced at random, the final/optimal number of HLNs differs when the BIWASD algorithm is run multiple times. Each time, the BIWASDNN structure, with its determined number of HLNs, will not be the same. Each BIWASDNN model in Exp. 4.1 is therefore trained and tested 100 times to assess the stability of the suggested BIWASDNN equipped with the BIWASD method. The average length of N (Avg. len(N)) and the standard deviation of len(N) (σ(len(N))) are then computed over the 100 runs, along with the average values of R^2, MAPE, MAE and RMSE. Note that the average len(N) relates to the average optimal number of HLNs. The findings are presented in Tab. 3. Therein, comparing the average len(N) and σ(len(N)) for each BIWASDNN model in Exp. 4.1, we observe that the final/optimal number of HLNs specified by the BIWASD algorithm is considerably stable, since the maximum σ(len(N)) is less than 0.56.
In general, when Tab. 3 and Tab. 2 are compared, the average R^2, MAPE, MAE and RMSE of BIWASDNN in Exp. 4.1 are almost identical. Furthermore, the BIWASD's stability over 100 runs is excellent, which makes the BIWASDNN's performance competitive with, or even better than, the HPNN, FTree, and SEGPR performances. Notice that the BIWASD algorithm not only determines the best BIWASDNN structure automatically and effectively, but also acquires the optimal hidden-layer weights directly. These simulation trials demonstrated that the proposed BIWASDNN, which uses the BIWASD algorithm to approximate target functions, is efficient and effective. Lastly, the whole creation and implementation of the computational processes described in this work can be found on GitHub at https://github.com/SDMourtas/BIWASDNN, where we created a MATLAB repository for stabilizing feedback control systems in line with Algs. 1-3 and the algorithms presented in the diagrams of Figs. 5a and 5b. For the simulation trials conducted in this section, the MATLAB repository contains detailed installation instructions and a complete implementation. Additionally, anyone can draw their own conclusions from their own tests by modifying the BIWASDNN model parameter values in the repository's main MATLAB function.

Conclusion
In this paper, the BIWASD algorithm is utilized to train a three-layer feed-forward NN model. The BAS and WASD techniques are combined to create the BIWASD algorithm for training NNs, and the BIWASDNN model is introduced. Using a power sigmoid AF, the BIWASD algorithm determines the proper weights and structure of the BIWASDNN while handling model fitting and validation. Three simulated trials on stabilizing feedback control systems have revealed the BIWASDNN model's learning and predicting performance. The results of the trials demonstrate the excellent precision and efficiency of the BIWASDNN model.
There are certain limitations and suggestions that can be made about this work.
1. One disadvantage of this work is that the BIWASDNN model can only solve regression problems in machine learning, which narrows the scope of our investigation.
2. A different AF in the WDD process is proposed as a way to improve this research.
3. Applying akin methodologies to analogue filters for low-voltage operation and electronic adjustment of their frequency characteristics [36,37], where the NN structure and training algorithms should be properly built for efficiency and prediction precision, may be an intriguing future research path.
4. A potential future work might be to investigate the utilization of a BIWASD-based NN in industrial applications [38,39].