Journal of Fuzzy Systems and Control, Vol. 2, No. 3, 2024
Model Predictive Control for Rotary Inverted Pendulum: Simulation and Experiment
Phuc-Hoang Huynh 1, Minh-Hanh Nguyen 2, Nguyen-Phat Pham 3, Hoang-Viet-Phuc Duong 4, Huy-Ha Nguyen 5,
Duc-Chung Le 6, Minh-Khoa Nguyen 7, Ngoc-Liem Bui 8, Nguyen-Phi-Long Le 9, Van-Dong-Hai Nguyen 10,*
1, 2, 3, 4, 5, 6, 7, 8, 10 Faculty of Electrical and Electronics Engineering, Ho Chi Minh City University of Technology and Education (HCMUTE), Vietnam
9 Faculty of International Education, Ho Chi Minh City University of Technology and Education (HCMUTE), Vietnam
Email: 1 21151503@student.hcmute.edu.vn, 2 21142525@student.hcmute.edu.vn, 3 21161346@student.hcmute.edu.vn,
4 21161348@student.hcmute.edu.vn, 5 20151357@student.hcmute.edu.vn, 6 20151043@student.hcmute.edu.vn,
7 20142350@student.hcmute.edu.vn, 8 20161038@student.hcmute.edu.vn, 9 20151291@student.hcmute.edu.vn,
*Corresponding Author
Abstract—Rotary Inverted Pendulum (RIP) is one of the simplest nonlinear systems commonly used for validating control algorithms. In this study, two controllers, Model Predictive Control (MPC) and Linear Quadratic Regulation (LQR), are simulated and experimentally validated. These controllers are executed in real-time on a PC, while the STM32F407 chip handles control and data acquisition from the pendulum using a high-speed USB interface. Due to the custom-built nature of this model, there are inaccuracies in the model and parameter identification. However, results show that the MPC controller is better at trajectory tracking and maintaining balance near the set point compared to the LQR controller. On the other hand, the LQR controller responds more robustly to disturbances and external forces, highlighting distinct differences between MPC’s optimization over each prediction horizon and LQR’s single-solution approach for the entire prediction horizon.
Keywords—LQR; MPC; Rotary Inverted Pendulum; STM32F4
Among nonlinear systems, the RIP is easy to construct and has a simple mechanical structure, yet it is highly nonlinear [1]. It is therefore commonly used in identification and control experiments. Various control algorithms have been applied successfully to the inverted pendulum (IP) model, including PID control [2], back-stepping [3], fuzzy control [4], reinforcement learning [5], and optimal control such as LQR [6]. MPC, however, is mostly applied to SISO or MIMO systems and is less commonly used for SIMO systems such as the RIP. The main difficulty on real hardware is that MPC requires accurately identified system parameters. In [7], a Quanser model is used to test MPC; the experiment succeeds thanks to that company's standardized model. However, that experimental model is expensive, and the processor used in that research is a professional board that is not widely accessible. An MPC controller that works on a self-made platform based on the STM32F4 board can therefore be a practical alternative. In this paper, we propose applying MPC, a controller used to manage overall processes in industries such as processing plants and oil refineries, and in real-time applications [8][9]. Unlike LQR, MPC is an optimal control technique in which control actions are calculated to minimize a cost function for a dynamically constrained system over a finite, receding horizon [10]. We compare the MPC and LQR controllers on the RIP to clarify the strengths and weaknesses of the two control methods.
RIP consists of an arm and a pendulum, with a DC motor mounted at the end of the arm. The pendulum is normally stable in a downward position but unstable in an upright position. Therefore, a controller must be designed to keep the pendulum in an upright position and move it along a predefined trajectory. The specific structure is shown in Fig. 1 [1].
We use the Parameter Estimator toolbox in MATLAB to estimate system parameters [11]. The parameters of the pendulum are shown in Table 1.
Symbol | Description | Value | Unit |
 | Mass of pendulum | 0.24297 | |
 | Half-length of pendulum | 0.20147 | |
 | Length of pendulum arm | 0.14902 | |
 | Moment of inertia of arm | 0.0045556 | |
 | Moment of inertia of pendulum | 0.0017725 | |
 | Friction coefficient of arm | 0.0063986 | |
 | Friction coefficient of pendulum | 0.0065929 | |
 | Gravitational acceleration constant | 9.81 | |
 | Pendulum arm angle | | |
 | Pendulum angle | | |
 | Armature voltage | | |
 | Torque constant | 0.053344 | |
 | Back-EMF constant | 0.28834 | |
 | Armature resistance | 0.72921 | |
 | Moment of inertia of rotor | 0.012818 | |
 | Viscous friction constant | 0.0033158 | |
According to [1], the mathematical equations describing the RIP are as follows:
(1) |
The control signal here is the motor torque. It must be converted into a voltage to suit the real system. The torque produced by a DC motor is given by [12]:
(2) |
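As a minimal sketch of the torque-to-voltage conversion, the snippet below assumes the common steady-state armature model with inductance neglected, tau = Kt (V - Kb w) / Rm. This form is an assumption and may differ in detail from (2); the function names are hypothetical, and the three constants are the torque constant, back-EMF constant, and armature resistance listed in Table 1.

```python
# Constants taken from Table 1.
K_T = 0.053344   # torque constant
K_B = 0.28834    # back-EMF constant
R_M = 0.72921    # armature resistance


def torque_from_voltage(voltage, arm_speed):
    """Assumed model: tau = K_T * (V - K_B * omega) / R_M (inductance neglected)."""
    return K_T * (voltage - K_B * arm_speed) / R_M


def voltage_from_torque(torque, arm_speed):
    """Inverse of the assumed model: voltage needed to produce a given torque."""
    return R_M * torque / K_T + K_B * arm_speed


if __name__ == "__main__":
    # Round trip: the voltage computed for a torque reproduces that torque.
    v = voltage_from_torque(0.1, 2.0)
    print(v, torque_from_voltage(v, 2.0))
```

The inverse function is the one a torque-based controller would call before writing a voltage command to the motor driver.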
Combining equations (1) and (2), we obtain the mathematical equation for RIP in (3)
(3) |
Defining state variables as in (4)
(4) |
the nonlinear state equations of the RIP are given in (5)
(5) |
The control objective is to drive the arm so that the pendulum stays balanced in the upright position. At the operating point, both the pendulum and the arm are stationary and no voltage is applied to the motor, as described in (6):
(6) |
By linearizing the RIP around this upright equilibrium point (valid while the deviation angle β is less than 10°), we obtain the linearized state equations of the pendulum system in the following form:
(7) |
where
(1) | |
Substituting the parameters from Table 1 into the equations in (7), we obtain the matrices in (8) below.
(8) |
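The linearization step can also be checked numerically. The sketch below computes the Jacobians A = df/dx and B = df/du by central differences at an equilibrium. The plant `toy_pendulum` is a hypothetical single-pendulum stand-in with made-up parameters, not the RIP model (5); the same `jacobians` helper could be applied to any implementation of the nonlinear state equations.

```python
import numpy as np


def jacobians(f, x_eq, u_eq, eps=1e-6):
    """Central-difference Jacobians A = df/dx and B = df/du at (x_eq, u_eq)."""
    x_eq = np.asarray(x_eq, dtype=float)
    u_eq = np.asarray(u_eq, dtype=float)
    n, m = x_eq.size, u_eq.size
    A = np.zeros((n, n))
    B = np.zeros((n, m))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x_eq + dx, u_eq) - f(x_eq - dx, u_eq)) / (2 * eps)
    for j in range(m):
        du = np.zeros(m)
        du[j] = eps
        B[:, j] = (f(x_eq, u_eq + du) - f(x_eq, u_eq - du)) / (2 * eps)
    return A, B


def toy_pendulum(x, u, g=9.81, length=0.2, damping=0.1):
    """Hypothetical stand-in plant (NOT the RIP model): x = [angle, rate]."""
    angle, rate = x
    return np.array([rate, (g / length) * np.sin(angle) - damping * rate + u[0]])


# Linearize about the upright equilibrium with zero input.
A, B = jacobians(toy_pendulum, [0.0, 0.0], [0.0])
print(A)
print(B)
```

Comparing such a numerical Jacobian against the analytically derived matrices is a cheap way to catch sign or substitution errors.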
In this section, we configure parameters for two controllers: LQR and MPC. Results are simulated in MATLAB Simulink and experimentally tested on a real model.
According to the LQR control method, to control the pendulum system, we need to design a state feedback controller [1]:
(9) |
where K is the feedback gain matrix and x(t) is the state vector.
The gain K must be optimized, that is, we must find the value of K that minimizes the performance index J.
The performance index is chosen to be quadratic, with an infinite final time:
(10) |
Optimal control theory shows that the gain K that minimizes the performance index (10) is given by:
(11) |
where P is the solution of the algebraic Riccati equation:
(12) |
where Q and R are positive definite square weighting matrices used to tune the LQR controller. Here, the diagonal entries of Q are the weights corresponding to the respective state variables.
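The LQR design steps can be sketched numerically as follows, assuming the standard textbook forms: feedback u = -K x, gain K = R^-1 B^T P, and P solving A^T P + P A - P B R^-1 B^T P + Q = 0. The double-integrator plant and the weights below are hypothetical placeholders, not the RIP matrices in (8); `lqr_gain` is an illustrative helper name.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def lqr_gain(A, B, Q, R):
    """Solve the continuous-time algebraic Riccati equation and return (K, P)."""
    P = solve_continuous_are(A, B, Q, R)
    K = np.linalg.solve(R, B.T @ P)  # K = R^-1 B^T P
    return K, P


# Hypothetical double-integrator plant (placeholder, not the RIP model).
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)            # state weights
R = np.array([[1.0]])    # input weight

K, P = lqr_gain(A, B, Q, R)
closed_loop_poles = np.linalg.eigvals(A - B @ K)
print(K, closed_loop_poles)
```

Increasing a diagonal entry of Q penalizes the corresponding state more heavily, while increasing R makes the controller use less input effort.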
MPC is an optimal control algorithm that considers system constraints, such as its physical limits. As shown in Fig. 3, MPC uses a discrete-time linear model to predict future outputs of the system [13][14]:
(13) |
where x(k) is the state vector, y(k) is the vector of outputs measured on the RIP, and u(k), v(k), and nd(k) are the dimensionless manipulated variables, measured disturbances, and unmeasured input disturbances, respectively. The discrete-time state-space matrices A, B, C, Bu, Bv, Bd, Dv, Dd are computed from the continuous-time linear model using the sampling time Ts.
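A minimal sketch of the discretization step, assuming a zero-order hold on the input (a common choice when a microcontroller holds the command between samples; the hold method is an assumption here). The continuous-time double integrator is a hypothetical placeholder for the linearized RIP model, and Ts = 0.01 s matches the sampling time used later in the paper.

```python
import numpy as np
from scipy.signal import cont2discrete

Ts = 0.01  # sampling time in seconds

# Hypothetical continuous-time model (placeholder, not the RIP matrices).
A_c = np.array([[0.0, 1.0], [0.0, 0.0]])
B_c = np.array([[0.0], [1.0]])
C_c = np.eye(2)
D_c = np.zeros((2, 1))

# Zero-order-hold discretization: x(k+1) = A_d x(k) + B_d u(k).
A_d, B_d, C_d, D_d, _ = cont2discrete((A_c, B_c, C_c, D_c), Ts, method="zoh")
print(A_d)
print(B_d)
```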
Consider predicting the future trajectories of the model at time k=0. Setting nd(i)=0 for all prediction instants i, we obtain [15]:
(14) |
The solution to the equation (14) is:
(15) |
Where:
Let m be the number of free control moves, and let z = [z0; ...; zm–1]. Then, it yields
(16) |
where JM depends on the choice of blocking moves
Based on predicted states, MPC calculates optimal control sequences to minimize cost function:
(17) |
Where:
LSy and LSu are diagonal matrices of output and MV scaling factors, respectively.
: Tuning weight for the jth plant output at the ith prediction horizon step (dimensionless).
: Tuning weight for the jth MV at the ith prediction horizon step (dimensionless)
: Tuning weight for the jth MV movement at the ith prediction horizon step (dimensionless)
εk : Slack variable at the control interval k (dimensionless).
ρε : Constraint violation penalty weight (dimensionless).
When a system contains physical limitations, such as motor voltage V, MPC accounts for these limitations in optimization problems using hard constraints.
(18) |
Let m be the number of free control moves, and let z = [z0; ...; zm–1]. Then
(19) |
where JM depends on the choice of blocking moves. Together with the slack variable ɛ, the vectors z0, ..., zm–1 constitute the free optimization variables of the optimization problem.
The MPC QP solver converts the linear MPC optimization problem into the general-form QP problem [16]:
(20) |
Subject to linear inequality constraints
(21) |
where
In this section, we simulate the RIP tracking a set-point signal at the arm angle. The experiment uses Simulink Real-Time Target with an STM32F407VE control chip. The sampling time of the system is Ts = 0.01 s, with control parameters as follows:
Because the system is controlled by a microcontroller with a sampling time Ts = 0.01 s, the discrete-time matrices A and B and the weighting matrices of the LQR controller have the following values:
(22) | |
(23) | |
(24) | |
(25) |
From (22) to (25) we obtain the matrix K:
(26) |
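For a discrete-time design like this one, the gain can be sketched with the discrete algebraic Riccati equation, assuming the standard form K = (R + B^T P B)^-1 B^T P A with u(k) = -K x(k). The matrices below are a hypothetical discretized double integrator at Ts = 0.01 s and placeholder weights, not the values in (22) to (25); `dlqr_gain` is an illustrative helper name.

```python
import numpy as np
from scipy.linalg import solve_discrete_are


def dlqr_gain(A, B, Q, R):
    """Discrete-time LQR gain from the discrete algebraic Riccati equation."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


# Hypothetical discrete-time plant at Ts = 0.01 s (placeholder values).
A = np.array([[1.0, 0.01], [0.0, 1.0]])
B = np.array([[5e-5], [0.01]])
Q = np.eye(2)
R = np.array([[1.0]])

K = dlqr_gain(A, B, Q, R)
poles = np.linalg.eigvals(A - B @ K)
print(K, np.abs(poles))
```

A quick sanity check on any computed K is that all closed-loop poles of A - B K lie strictly inside the unit circle.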
The configuration of the system's input and output measurements is shown in Fig. 4, along with the controller parameters as follows:
We use MATLAB/Simulink to simulate the output responses of two controllers. The simulation diagram is shown in Fig. 5. Simulation results of MPC and LQR controllers with white noise (Noise power = [0.000001]) and arm angle tracking set-point signal f=1 (rad) are shown in Fig. 6.
Simulation results show that the arm and pendulum responses under the LQR and MPC controllers are similar. However, the LQR control signal oscillates with a larger amplitude, within [−40, 40] (V), whereas the MPC control signal stays within [−12, 12] (V).
Simulation results of two controllers with white noise (Noise power = [0.0000001]) and arm angle tracking set-point signal f=sin(0.2πt) (rad) are shown in Fig. 7.
Simulation results show that the output response at the pendulum angle is similar for both controllers. However, the response at the arm angle is better with the MPC controller compared to the LQR controller.
The RIP system model is shown in Fig. 8. We step the arm-angle reference of the MPC and LQR controllers from f=0 (rad) to f=1 (rad) at the 20th second. Experimental results for the LQR and MPC controllers are shown in Figs. 9 to 11.
The performance of two controllers, when the arm angle tracks set-point signal f=1 (rad), is shown in Table 2 and Table 3.
Average pendulum angle (rad) | Average arm angle (rad) | Control voltage (V) | (s) (5%) |
| |||
Range | |||||||
LQR | 1.15 | 0.15 | [1; 1.2] | 0.002 | [1.6; -1.5] | 38.7 | 23.5 |
MPC | 0.87 | 0.13 | [0.9; 0.84] | 0.002 | [0.9; -0.7] | 7.5 | 47.7 |
Controller | Arm angle (rad) | Pendulum angle (rad) | Control voltage (V) |
LQR | 0.2020 | 0.8795 | 0.9950 |
MPC | 0.1560 | 0.8634 | 0.9214 |
These three reasons cause the controller to attempt to reduce reference signal error, leading to curve oscillation.
After evaluating the simulation and experimental results of MPC and LQR controllers under different operating conditions on RIP, we find that MPC frequently computes new solutions, whereas LQR uses the same single (optimal) solution for the entire time horizon [17]. For this reason, in terms of control quality, MPC performs well for trajectory tracking and handling system constraints, while LQR provides strong responses to system disturbances, external forces, and unforeseen system changes. Additionally, MPC is characterized by smooth changes in the control signal, whereas LQR produces rapid changes in the control signal, which is a significant drawback due to its substantial impact on actuator wear.
Regarding computational cost, the size of the control matrix for the LQR controller depends only on the number of internal states of the system. In contrast, the MPC controller sets up a control matrix for the entire prediction horizon. Therefore, the size of the MPC control matrix depends not only on the number of internal states but also increases proportionally as the prediction horizon is extended and the sampling time is reduced [18]. This results in more calculations per control signal, which limits the potential applications of the MPC algorithm and makes it dependent on the capabilities of the controller hardware.
At each step, an MPC controller receives or estimates the current state of the plant. It then calculates a sequence of control actions that minimize cost over the horizon by solving a constrained optimization problem that relies on an internal plant model and depends on the current system state. The controller then applies only the first computed control action to the plant, disregarding subsequent ones. The process repeats in the following time step [13]. Therefore, an accurate mathematical model is necessary, considering the uncertainties [19][20] and disturbance rejection [21].
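The receding-horizon loop described above can be sketched as follows. This is a simplified illustration, not the controller used in the experiments: it assumes a hypothetical double-integrator plant, scalar state and input weights, a hard box constraint on the input only, and no slack variable or move blocking. The bounded quadratic cost is rewritten as a bounded least-squares problem and solved with `scipy.optimize.lsq_linear`, which stands in for the QP solver.

```python
import numpy as np
from scipy.optimize import lsq_linear


def prediction_matrices(A, B, horizon):
    """Return Phi, Gamma with [x(1);...;x(N)] = Phi x(0) + Gamma [u(0);...;u(N-1)]."""
    n, m = B.shape
    Phi = np.vstack([np.linalg.matrix_power(A, i + 1) for i in range(horizon)])
    Gamma = np.zeros((horizon * n, horizon * m))
    for i in range(horizon):
        for j in range(i + 1):
            block = np.linalg.matrix_power(A, i - j) @ B
            Gamma[i * n:(i + 1) * n, j * m:(j + 1) * m] = block
    return Phi, Gamma


def mpc_sequence(Phi, Gamma, x, q_weight, r_weight, u_max):
    """Minimize q*||Phi x + Gamma U||^2 + r*||U||^2 subject to |U| <= u_max."""
    n_u = Gamma.shape[1]
    lhs = np.vstack([np.sqrt(q_weight) * Gamma, np.sqrt(r_weight) * np.eye(n_u)])
    rhs = np.concatenate([-np.sqrt(q_weight) * (Phi @ x), np.zeros(n_u)])
    return lsq_linear(lhs, rhs, bounds=(-u_max, u_max)).x


def simulate(steps=150, horizon=20, u_max=1.0):
    # Hypothetical discrete-time double integrator (Ts = 0.1 s), not the RIP.
    A = np.array([[1.0, 0.1], [0.0, 1.0]])
    B = np.array([[0.005], [0.1]])
    m = B.shape[1]
    Phi, Gamma = prediction_matrices(A, B, horizon)
    x = np.array([1.0, 0.0])
    applied = []
    for _ in range(steps):
        sequence = mpc_sequence(Phi, Gamma, x, 1.0, 0.1, u_max)
        u = sequence[:m]        # apply only the first move, discard the rest
        x = A @ x + B @ u       # plant update (here the model itself)
        applied.append(u[0])
    return x, np.array(applied)


if __name__ == "__main__":
    final_state, inputs = simulate()
    print(final_state, np.abs(inputs).max())
```

Because the optimization is repeated from the measured state at every step, the input constraint is respected throughout, at the cost of solving one optimization problem per sample.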
This paper belongs to the project for students in HCMUTE for the year 2025. It is funded by HCMUTE. We, the authors, are grateful for this support. The operation of the system is shown in the link: https://www.youtube.com/watch?v=woHq5wdWdEM