Journal of Fuzzy Systems and Control, Vol. 2, No. 2, 2024
ANFIS-based LQR Control for Rotary Double Parallel Inverted Pendulum
Chi-Hung Nguyen 1, Van-Si Tran 2,*, Xuan-Hoang Nguyen 3, Quang-Bao Truong 4, Minh-Tuan Nguyen 5,
Nguyen-Phat Luong 6, Kha-Vy Ngo 7, Duc-Huy Nguyen 8, Thanh-Trung Nguyen 9, Thi-Thanh-Hoang Le 10
1, 2, 3, 4, 5, 6, 7, 8, 9, 10 Ho Chi Minh City University of Technology and Education (HCMUTE), Ho Chi Minh City, Vietnam
Email: 1 20151487@student.hcmute.edu.vn, 2 20151554@student.hcmute.edu.vn, 3 21151459@student.hcmute.edu.vn,
4 21151443@student.hcmute.edu.vn, 5 21151406@student.hcmute.edu.vn, 6 21151146@student.hcmute.edu.vn,
7 21151189@student.hcmute.edu.vn, 8 21151461@student.hcmute.edu.vn, 9 20145643@student.hcmute.edu.vn,
*Corresponding author
Abstract—This article explores two control methods on the Rotary Double Inverted Pendulum in Parallel Type (PRDIP) model: the Linear Quadratic Regulator (LQR) and the Adaptive Neuro-Fuzzy Inference System (ANFIS). The model belongs to the class of underactuated robots: its mechanical configuration is simple, yet its dynamics are strongly nonlinear. ANFIS is therefore used to learn the input-output data of the LQR controller, that is, its responses and feedback signals. The system's output responses under LQR and under ANFIS are compared to demonstrate how effectively ANFIS learns from LQR. This demonstration is supported by three cases: one simulation case and two experimental cases. Both control strategies are applied to the PRDIP system at the zero and -π positions, where one pendulum remains upright and the other hangs downward to counteract oscillations. Simulation and experimental results are presented to evaluate these points.
Keywords—ANFIS; LQR; Rotary Double Inverted Pendulum in Parallel Type
ANFIS integrates fuzzy logic and artificial neural networks. It combines the power of fuzzy inference in representing uncertain knowledge with the ability of neural networks to learn from data and self-adjust [1]-[6]. ANFIS has been applied effectively to learn various control methods, including Proportional-Integral-Derivative (PID), Linear Quadratic Regulator (LQR), and nonlinear control. Examples include the Rotary Inverted Pendulum (RIP) [7][8], other pendulum systems [9], PID-based ANFIS control of the inverted double pendulum [10], and adaptive fuzzy control of a two-wheeled balancing vehicle [11]. ANFIS controllers have also been implemented for the double inverted pendulum [12]-[15], and the outcomes of ANFIS and LQR on these models have been compared [16]. ANFIS is widely applied in automatic control systems such as robots, self-driving vehicles, and automated production systems, where it enhances accuracy and performance [17]. It is also used in predictive applications such as weather forecasting, price prediction, image classification, and signal analysis, leveraging its capability to learn from data and adapt flexibly [18]. In addition, ANFIS processes and analyzes signals and images effectively, with applications in medicine, remote sensing, and information technology [19].
This article applies LQR-based ANFIS to the PRDIP [20], with LQR as the optimal control method, at the zero-π position, where one pendulum is upright while the other points downward. The PRDIP is considered an advance over earlier models such as the RIP or Furuta pendulum, with improvements in mechanical design that include the addition of a parallel link to the existing arm [20]. Despite its increased flexibility, the PRDIP still poses challenges because of its nonlinear and unstable dynamics, particularly due to the difference in pendulum lengths. This study uses the results of the previously tested LQR controller [20] to train ANFIS to control the PRDIP.
ANFIS is an effective algorithm for system control that can outperform LQR on nonlinear systems through its ability to learn from real-world data. Unlike LQR, ANFIS does not require a linear system model, which makes it applicable to more complex systems. It learns and adjusts its parameters autonomously from actual data, improving the system's adaptability to changes and environmental noise. The design and tuning of ANFIS can be more intuitive thanks to this automatic learning, which reduces the burden on designers. By integrating fuzzy logic and neural network theory, ANFIS manages systems with uncertainty and high noise levels. Its scalability, based on neural network and fuzzy system structures, allows application to large and intricate systems. Because it learns directly from real data, ANFIS can make informed control decisions without a precise system model.
In summary, using ANFIS to learn from LQR outcomes and potentially replace LQR offers significant advantages in scenarios involving highly nonlinear, time-varying systems, or situations where accurate system modeling is challenging. ANFIS's ability to learn from real-world data and its flexible adjustment capabilities can substantially improve control performance across diverse conditions.
The physical structure includes two pendulums, an arm, two encoders, a DC motor, and an iron frame, as shown in Fig. 1. The system's physical parameters are detailed in Table 1 [20], and Table 2 provides an overview of the parameters relevant to the PRDIP [20].
Parameters | Definition | Unit |
ml | Mass of the i-th pendulum | kg |
lhi | Length of the i-th pendulum | m |
| Angular position of the i-th pendulum | rad |
| Inertia of the i-th pendulum | kg·m² |
| Effective moment of inertia of the pendulum | kg·m² |
| Angular position of the arm | rad |
| Length of the arm | m |
| Inertia of the arm | kg·m² |
| Motor torque | N·m |
| Gravitational acceleration | m/s² |
| Viscous friction coefficient of the i-th pendulum | N·m·s |
| Viscous friction coefficient of the arm | N·m·s |
Parameter | Pendulum 1 | Pendulum 2 | The Arm |
| 0.059 | 0.038 | na |
| 0.127 | 0.082 | na |
| na | na | 0.51 |
| 0.0001526 | 0.0004693 | na |
| na | na | 0.75 |
|
|
| na |
| na | na | 4.978 |
The equations of motion of the system are derived from the Lagrange function, defined as
| (1) |
The following is the Lagrange equation.
| (2) |
The kinetic energy of the system is
| (3) |
Velocities of the first and second pendulums are denoted as
respectively. Kinetic energy is re-written as
| (4) |
The system's potential energy is
| (5) |
Energy dissipation of PRDIP is contingent upon the frictional force
| (6) |
The resulting Lagrange equation is
| (7) |
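For orientation, the derivation above follows the standard Lagrangian recipe with a viscous dissipation term. The generic form below is a sketch in assumed notation: the symbols $T$, $V$, $D$, $q_k$, $b_k$, and $Q_k$ are placeholders chosen here and are not necessarily the symbols used in equations (1)-(7).

```latex
\begin{aligned}
L &= T - V, \qquad D = \tfrac{1}{2}\sum_{k} b_k\,\dot q_k^{\,2}, \\
\frac{d}{dt}\!\left(\frac{\partial L}{\partial \dot q_k}\right)
 - \frac{\partial L}{\partial q_k}
 + \frac{\partial D}{\partial \dot q_k} &= Q_k ,
\end{aligned}
```

Here $q_k$ would range over the arm angle and the two pendulum angles, $b_k$ over the viscous coefficients listed in the parameter table, and $Q_k$ is the motor torque for the arm coordinate and zero for the unactuated pendulum coordinates.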
PRDIP employs a DC servo motor. The relationship between torque and voltage is described as
| (8) |
Values of the DC motor are shown in Table 3 [20].
Parameter | Unit | Value |
| V/(rad/sec) | 0.064944 |
| V/(rad/sec) | 0.064944 |
|
| 6.835271 |
Dynamical equations of PRDIP are formulated in the form of state equations, presented as follows:
| (9) |
where
| (10) |
| (11) |
| (12) |
| (13) |
| (14) |
| (15) |
| (16) |
| (17) |
| (18) |
| (19) |
| (20) |
| (21) |
|
Consequently, linearization around the operational point (the upright position) becomes necessary for analysis and control.
| (22) |
| (23) |
The PRDIP's linearized state equation is as follows:
| (24) |
The matrices A and B are computed according to the following expressions:
| (25) |
where
| (26) |
| (27) |
| (28) |
The positions of the eigenvalues are used to assess the stability of the system. The characteristic equation can be expressed as $\det(\lambda I - A) = 0$, where $\lambda$ are the eigenvalues of A. Matrices A and B are computed by substituting the parameters of this system.
| (26) |
| (27) |
The open-loop poles of this system are determined in MATLAB using the eig command:
| (28) |
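The same pole check can be sketched outside MATLAB. The example below is illustrative only: the 2-state matrix is a hypothetical single pendulum linearized about its upright position, with assumed values for g, l, and b, and it is not the PRDIP matrix A from (25). `np.linalg.eigvals` plays the role of `eig`.

```python
import numpy as np

# Hypothetical 2-state example (NOT the PRDIP matrices): a single pendulum
# linearized about upright, with state x = [angle, angular velocity].
g, l, b = 9.81, 0.5, 0.1          # assumed gravity, length, damping
A = np.array([[0.0, 1.0],
              [g / l, -b]])

poles = np.linalg.eigvals(A)      # counterpart of MATLAB's eig(A)

# Any pole with a positive real part means the open-loop system is unstable.
unstable = bool(np.any(poles.real > 0))

print("open-loop poles:", poles)
print("unstable:", unstable)
```

A pole in the right half-plane is the reason a stabilizing feedback law is needed around the upright position.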
|
Consider a nonlinear system of the following form
| (29) |
where $x$ is the system's state vector and u is the control signal of the system.
The working point of the system is:
| (30) |
If u = 0 at this working point, the system is in equilibrium, and the system in (29) can be approximated by the linear form
| (31) |
where
|
When the system works around this equilibrium position, we consider the system to be approximately a linear system so that designing a linear control algorithm is feasible. LQR control structure at the static working point of a linear system in (31) is shown in Fig. 2.
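The linearization step can be illustrated numerically. The sketch below approximates the Jacobians A = ∂f/∂x and B = ∂f/∂u by central differences at an equilibrium point. The function `f` is a hypothetical single-pendulum model with assumed parameters, used only as a stand-in for the PRDIP dynamics; `linearize` is a helper name introduced here.

```python
import numpy as np

def f(x, u):
    """Hypothetical nonlinear pendulum about upright, x = [angle, rate]."""
    g, l, b, m = 9.81, 0.5, 0.1, 0.2      # assumed values, not from Table 1
    angle, rate = x
    return np.array([rate,
                     (g / l) * np.sin(angle) - b * rate + u / (m * l**2)])

def linearize(f, x0, u0, eps=1e-6):
    """Central-difference Jacobians A = df/dx and B = df/du at (x0, u0)."""
    n = len(x0)
    A = np.zeros((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        A[:, j] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    B = ((f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)).reshape(n, 1)
    return A, B

x0, u0 = np.zeros(2), 0.0                 # equilibrium: f(x0, u0) = 0
A, B = linearize(f, x0, u0)
print(A)
print(B)
```

For a model with closed-form dynamics, the analytical Jacobians are preferable; the numerical version is mainly useful as a cross-check.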
The selected control signal is
| (32) |
Computing the control matrix K normally requires solving the Riccati equation. MATLAB simplifies this process by providing the lqr() command, and K is computed as follows:
| (33) |
where A and B are given in (31), and Q and R are the weight matrices, selected as follows.
| (34) |
With
and
are both positive constants
The LQR algorithm flow chart is shown in Fig. 3.
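As a rough counterpart to the lqr() call, the gain can also be obtained with NumPy alone. The sketch below solves the continuous-time Riccati equation through the stable invariant subspace of the Hamiltonian matrix, which is a substitute technique and not necessarily what MATLAB does internally. The matrices A, B, Q, and R are hypothetical 2-state values, not those of the PRDIP, and `lqr_gain` is a helper name introduced here.

```python
import numpy as np

def lqr_gain(A, B, Q, R):
    """Continuous-time LQR gain K for the control law u = -K x.

    Solves A'P + PA - P B R^-1 B' P + Q = 0 using the eigenvectors of the
    Hamiltonian matrix that belong to eigenvalues with negative real part.
    """
    n = A.shape[0]
    Rinv = np.linalg.inv(R)
    H = np.block([[A, -B @ Rinv @ B.T],
                  [-Q, -A.T]])
    w, V = np.linalg.eig(H)
    stable = V[:, w.real < 0]              # n stable eigenvectors
    X1, X2 = stable[:n, :], stable[n:, :]
    P = np.real(X2 @ np.linalg.inv(X1))    # Riccati solution
    return Rinv @ B.T @ P, P

# Hypothetical linearized pendulum (assumed values, for illustration only).
A = np.array([[0.0, 1.0], [19.62, -0.1]])
B = np.array([[0.0], [20.0]])
Q = np.diag([10.0, 1.0])                   # assumed state weights
R = np.array([[1.0]])                      # assumed input weight

K, P = lqr_gain(A, B, Q, R)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
print("K =", K)
print("closed-loop poles:", closed_loop_poles)
```

Larger entries in Q penalize state deviations more strongly, while a larger R penalizes control effort, so the two weights trade response speed against voltage demand.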
Select an input and output data set for training consisting of
samples:
| (35) |
Step 1: Choose the learning rate
, choose the maximum error
.
Step 2: Initialization: set the error E = 0 and set initial values for the nonlinear parameters.
Step 3: Estimating linear parameters using the least squares algorithm:
| (36) |
| (37) |
|
Step 4: Update nonlinear weights using gradient descent algorithm: With k=1:K
Calculate error:
| (38) |
Update error:
| (39) |
Calculate cumulative error:
| (40) |
Step 5: End of a training cycle. If E <
, the learning process ends.
If E >=
, assign E=0, and return to step 3 to start a new training cycle.
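The training steps above can be sketched for a small case. The code below is a minimal, single-input, first-order Sugeno model with Gaussian membership functions. It is written under assumptions (batch gradient updates, RMSE as the stopping error, a saturated linear law as a stand-in target) and is not the MATLAB toolbox implementation. All function names and numeric settings are hypothetical.

```python
import numpy as np

def firing(x, c, s):
    """Normalized firing strengths of M Gaussian membership functions."""
    w = np.exp(-0.5 * ((x[:, None] - c) / s) ** 2)        # shape (N, M)
    return w / (w.sum(axis=1, keepdims=True) + 1e-12)

def fit_consequents(x, y, wn):
    """Step 3: least-squares estimate of the linear parameters [p_i, r_i]."""
    Phi = np.hstack([wn * x[:, None], wn])                # shape (N, 2M)
    theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return theta, Phi

def train_anfis(x, y, M=3, lr=0.05, eps=1e-3, max_epochs=50):
    c = np.linspace(x.min(), x.max(), M)                  # Step 2: centers
    s = np.full(M, (x.max() - x.min()) / M)               # Step 2: widths
    history = []
    for _ in range(max_epochs):
        wn = firing(x, c, s)
        theta, Phi = fit_consequents(x, y, wn)            # Step 3
        y_hat = Phi @ theta
        e = y_hat - y                                     # Step 4: error
        rmse = float(np.sqrt(np.mean(e ** 2)))
        history.append(rmse)
        if rmse < eps:                                    # Step 5: stop test
            break
        p, r = theta[:M], theta[M:]
        f = p * x[:, None] + r                            # rule outputs (N, M)
        common = e[:, None] * (f - y_hat[:, None]) * wn   # shared chain term
        d = x[:, None] - c
        grad_c = np.mean(common * d / s**2, axis=0)
        grad_s = np.mean(common * d**2 / s**3, axis=0)
        c = c - lr * grad_c                               # gradient descent
        s = np.maximum(s - lr * grad_s, 1e-2)             # keep widths positive
    theta, _ = fit_consequents(x, y, firing(x, c, s))     # final refit
    return c, s, theta, history

# Stand-in training data: a saturated linear feedback law (assumed).
x = np.linspace(-2.0, 2.0, 200)
y = np.clip(-3.0 * x, -2.0, 2.0)
c, s, theta, history = train_anfis(x, y)
print("RMSE first/last epoch:", history[0], history[-1])
```

The split mirrors the hybrid scheme in the steps: consequent (linear) parameters are solved in one shot by least squares, and only the membership-function (nonlinear) parameters are moved by gradient descent.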
The design of the LQR-based ANFIS follows the algorithms described in parts A and B. The ANFIS controller aims to emulate a well-tuned LQR controller with a given control matrix.
| (41) |
This matrix K is calculated according to (33), using the weight matrices Q and R found by genetic algorithm (GA) optimization for the PRDIP with the parameters in Table 1. The matrix K then has the following values.
| (42) |
The dataset construction for ANFIS involves multiple inputs and outputs. It entails designing the optimal LQR map based on system parameters. The minimization of the cost function is treated as an optimization problem. Fig. 4 shows a program to create input and output files of the LQR algorithm. Fig. 5 shows the program used to collect input and output data of LQR to use for ANFIS learning.
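The data collection idea can be sketched as a closed-loop simulation that logs the state and the LQR control signal at every step. Everything in the example is assumed for illustration: the plant is a hypothetical single pendulum, the gain K is a hand-picked stabilizing value and not the paper's K from (42), and forward-Euler integration stands in for the simulation programs of Fig. 4 and Fig. 5.

```python
import numpy as np

def plant(x, u):
    """Hypothetical nonlinear pendulum about upright, x = [angle, rate]."""
    g, l, b, m = 9.81, 0.5, 0.1, 0.2        # assumed values
    return np.array([x[1],
                     (g / l) * np.sin(x[0]) - b * x[1] + u / (m * l**2)])

K = np.array([3.0, 1.0])                    # assumed stabilizing gain

def collect(x_init, dt=1e-3, steps=3000):
    """Run the closed loop u = -K x and log rows [angle, rate, u]."""
    x = np.array(x_init, dtype=float)
    rows = []
    for _ in range(steps):
        u = -K @ x                          # LQR state feedback
        rows.append([x[0], x[1], u])
        x = x + dt * plant(x, u)            # forward-Euler step
    return np.array(rows)

# Several initial angles give the training set some coverage of the state space.
data = np.vstack([collect([a, 0.0]) for a in (-0.2, -0.1, 0.1, 0.2)])
inputs, target = data[:, :-1], data[:, -1]  # last column u is the ANFIS output
print(data.shape)
```

Keeping u as the last column matches the layout in which the "u" value is the output of the fuzzy inference system and the remaining columns are its inputs.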
The control parameters of ANFIS, including the number and type of membership functions, error tolerance, number of epochs, and learning method, are specified as follows:
These parameters were chosen according to the complexity of the system, which is difficult to control.
In ANFIS training, a structural block is formed from the parameters outlined in Part C and fitted to the dataset collected in Part D. Here, the dataset's "u" column serves as the output of the fuzzy inference system, while the remaining columns are treated as inputs. ANFIS training can be performed conveniently in MATLAB using the anfisedit command. The resulting .fis file serves as a state feedback controller for the designed system. Fig. 6 shows the interface of the ANFIS toolbox. We load the data collected by the data collection program, set the corresponding learning parameters, and start the learning process. Fig. 7 shows the interface of the ANFIS toolbox after learning is complete; the error of the learning process is shown.
Fig. 8 shows the ANFIS program after learning the LQR data. We use the Fuzzy Logic Controller block to implement ANFIS. The results of ANFIS after learning from LQR are then examined.
Fig. 9 shows the angular response of the arm. Initially, the arm deflection angle fluctuates strongly for the first 8 (s), with a largest amplitude of 0.22 (rad); after 10 (s), the arm stabilizes around the working position, closely following the LQR response.
Fig. 10 shows the angular response of the first pendulum. Initially, the deflection angle of the first pendulum fluctuates strongly for the first 8 (s), with a largest amplitude of 0.08 (rad); after 10 (s), the pendulum stabilizes around the working position, closely following the LQR response.
Fig. 11 shows the angular response of the second pendulum. Initially, the deflection angle of the second pendulum fluctuates strongly for the first 10 (s), with a largest amplitude of 3.43 (rad); after 13 (s), the pendulum stabilizes around the working position, closely following the LQR response.
Fig. 12 shows the voltage response. The system needs to supply a large voltage for the first 15 (s) to help the two pendulums maintain balance, closely following the LQR response. The output response of ANFIS closely matches that of LQR, which indicates that ANFIS learns from LQR very effectively.
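The statement that one response closely follows another can be made quantitative with simple error metrics. The sketch below computes the root-mean-square and maximum absolute deviation between two sampled responses. The two signals are synthetic placeholders generated for the example, not the data behind Fig. 9 to Fig. 12, and `tracking_metrics` is a helper name introduced here.

```python
import numpy as np

def tracking_metrics(y_ref, y_test):
    """RMSE and maximum absolute deviation between two sampled responses."""
    err = np.asarray(y_test, dtype=float) - np.asarray(y_ref, dtype=float)
    return {"rmse": float(np.sqrt(np.mean(err ** 2))),
            "max_abs": float(np.max(np.abs(err)))}

# Synthetic placeholder signals (assumed shapes, for illustration only).
t = np.linspace(0.0, 15.0, 1501)
y_lqr = 0.2 * np.exp(-0.4 * t) * np.cos(2.0 * t)
y_anfis = y_lqr + 0.005 * np.exp(-0.4 * t) * np.sin(2.0 * t)

m = tracking_metrics(y_lqr, y_anfis)
print(m)
```

Applied to logged arm, pendulum, and voltage signals sampled on a common time grid, the same two numbers would give a compact measure of how well the trained ANFIS reproduces the LQR behavior.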
Fig. 13 is the actual model of the PRDIP system that we study in this article.
| (48) |
| (47) |
Fig. 14 shows the angular response of the arm. The orange line is the LQR response, whose oscillation amplitude is not large. In the range [40; 50] (s), the amplitude oscillates strongly but then stabilizes at around 3 (rad). The blue line is the ANFIS response, whose oscillation amplitude is very small, oscillating around 1 (rad), and the response is good. Therefore, the ANFIS controller learns well from the LQR controller.
Fig. 15 shows the angular response of the first pendulum. The orange line is the LQR response, whose oscillation amplitude is small, in the range [-0.01; 0.02] (rad). Pendulum 1 therefore responds well, balancing at position 0. The blue line is the ANFIS response, whose oscillation amplitude is almost identical to that of the LQR controller, with a very good response. Thus, the ANFIS controller learns well from the LQR controller and can control pendulum 1 to balance at position 0.
Fig. 16 shows the angular response of the second pendulum. The orange line is the LQR response, whose oscillation amplitude is very small around -3.14 (rad). At around 30 (s) and 60 (s), we deliberately apply a downward force to pendulum 2 to push it away from the equilibrium position at -3.14 (rad), which causes the response to deviate from equilibrium. Pendulum 2 then performs its function of reducing motion and returns immediately to the equilibrium position. The blue line is the ANFIS response, whose oscillation amplitude is almost identical to that of the LQR controller, with a very good response. At around 45 (s), similar to the 30 (s) and 60 (s) instants of the LQR controller, it performs well in reducing motion and returns immediately to the equilibrium position. Therefore, the ANFIS controller learns well from the LQR controller and can control pendulum 2 to balance downward at the equilibrium position of -3.14 (rad). The output response of ANFIS closely matches that of LQR, which indicates that ANFIS learns from LQR effectively.
radians mark, countering oscillation
| (43) |
| (44) |
Fig. 17 shows the angular response of the arm. Under LQR control, the arm deflection angle fluctuates with an amplitude of 0.18 (rad) within the range [-0.2; 0.2]. Under ANFIS control, the amplitude is 0.18 (rad) in the range [0.2; 0.6]. Therefore, the ANFIS controller imitates the operation of the LQR controller very well.
Fig. 18 shows the angular response of the first pendulum. Under LQR control, the deflection angle of the first pendulum fluctuates within the range [-0.04; 0.03]. Under ANFIS control, it fluctuates within the range [-0.01; 0.04]. Therefore, the ANFIS controller imitates the LQR controller well, within a slightly narrower range.
Fig. 19 shows the angular response of the second pendulum. Under LQR control, the deflection angle of the second pendulum fluctuates within the range [-0.37; 2.8]. Under ANFIS control, it fluctuates within the range [-3.2; -3.1]. For the angle of the second pendulum, the ANFIS controller shows much better self-learning and adaptability than the LQR controller, which leads to better performance and greater stability in controlling complex systems, especially those requiring high precision. The output response of ANFIS closely matches that of LQR, which indicates that ANFIS learns from LQR very effectively.
In this study, we propose using MATLAB's ANFIS toolbox to emulate a previously successful controller, the LQR controller. This LQR-based ANFIS method is shown to be successful in both simulation and experiment. Furthermore, the algorithm is applied to a high-order SIMO system, the PRDIP. The study thus provides a reference for further research on ANFIS control of this type of model. In addition, the real-time model can serve as a hardware platform for training students on control algorithms in the laboratory. Compared with simple and popular controllers such as LQR and PID, ANFIS is likely to be used more in the future because of its strengths: it often gives high performance in complex SIMO control applications, especially for nonlinear and uncertain systems. This makes ANFIS an attractive option in many fields such as robotics, energy systems, and resource management. We also found that the ANFIS-based LQR performs very well, being not only as responsive as LQR but in some respects better. These results contribute to subsequent research on the PRDIP system and can serve as a reference for future system control studies.
We thank Dr. Van-Dong-Hai Nguyen (HCMUTE) for his support in operating the hardware. The operation of the system is shown at: https://www.youtube.com/watch?v=7A4pR9wB3dQ