The ocean is a complex system. Ocean temperature is an important physical property of seawater, so studying its variation is of great significance. Two kinds of network structures for predicting thermocline time series data are proposed in this paper: one is an LSTM-GRU hybrid neural network model, and the other is a temporal convolutional network (TCN) model. The two networks have obvious advantages over other models in accuracy, stability, and adaptability. Compared with the traditional autoregressive integrated moving average model, the proposed methods consider the influence of temperature history, salinity, depth, and other information. The experimental results show that TCN performs better in prediction accuracy, while LSTM-GRU better predicts abnormal data and has higher robustness.
As global warming continues to intensify, the living environment of humankind is facing increasingly severe challenges, and ocean-related research has also received growing attention. The ocean plays an important role in regulating global temperature. However, the ocean is an open and complex system. Ocean research always includes seawater, dissolved and suspended substances, organisms, submarine sediments, the lithosphere, the atmospheric boundary layer, and estuarine and coastal zones. Seawater temperature is an important parameter in ocean research. Affected by latitude and the geographic environment, ocean temperature is highly unstable in time and space. Ocean temperature affects rainfall and seawater evaporation around the world and affects sea-air heat exchange^{[1]}. Thus, the study of ocean temperature prediction methods can provide more reference data for predicting climate and meteorological changes, thereby promoting atmospheric and ocean sciences, expanding the scope and lead time of natural disaster forecasting, and reducing the losses caused by natural disasters to humans as much as possible. Changes in ocean temperature also affect the abundance of marine species and the production of marine organisms. Therefore, predicting ocean temperature is also conducive to fishery production scheduling and promotes the fishery economy.
Argo is an international program that uses profiling floats to observe temperature, salinity, currents, and, recently, bio-optical properties in the oceans; it has been operational since the early 2000s^{[2]}. The collected data are used in climate and oceanographic research. Argo originally planned to deploy 3,000 floats in international waters and establish a global ocean observation network with an average spacing between floats of 3° × 3°. There are three main tasks: core Argo, measuring temperature, salinity, and pressure in the upper 2000 m; deep Argo, measuring temperature, salinity, and pressure down to 6000 m; and BGC-Argo, measuring temperature, salinity, pressure, pH, nitrate, chlorophyll, backscatter, oxygen content, and irradiance in the upper 2000 m.
The Argo dataset provides the basis for marine research. Researchers use traditional physical models and machine learning methods to predict ocean temperature. In previous work, we proposed an SVR-based method to predict ocean temperature^{[3]}. We redefined the thermocline using the information entropy method^{[4,5]} and analyzed the association between the temperature and salinity data of seawater^{[6]}. Some scholars have also put forward the long short-term memory (LSTM) network, the gated recurrent unit (GRU), and their improved variants, making breakthroughs in temperature prediction. Based on those studies, we continue to discuss temperature prediction in this paper. We propose two temperature time-series prediction methods, and our contributions are as follows:
(1) We propose an LSTM-GRU hybrid neural network model and a model based on temporal convolutional networks (TCNs). We compare them with traditional LSTM, GRU, and TCN, and both surpass them in our experiments; the explained variance score of each exceeds 0.98.
(2) We evaluated the LSTM-GRU model and the TCN-based model on inputs containing abnormal data. On normal data, the TCN-based model works best; however, it tends to predict abnormal temperatures when the input data are insufficient. The explained variance score of LSTM-GRU can still reach 0.85, indicating high robustness.
This paper is arranged as follows. Section 2 introduces the application of Argo buoys and the current research status of ocean temperature prediction. Section 3 gives a detailed introduction to the LSTM model, the GRU model, the TCN, and the models proposed in this paper. Then, in Section 4, a comparison is made among several models, and we also evaluate the LSTM-GRU and TCN-based methods with abnormal data. Section 5 summarizes the work in this paper and discusses future directions.
Sea surface temperature (SST) is an important factor affecting water vapor exchange and heat flow. Therefore, ocean temperature prediction has attracted the attention of many scholars. SST prediction methods can be divided into two categories: numerical models based on physics, and data-driven models based on data analysis. In the traditional method, Xue
For deep learning methods, Tangang
The marine time series models include the Gaussian, autoregressive moving average, autoregressive integrated moving average, Markov chain, and hidden Markov models^{[16]}.
The autoregressive integrated moving average (ARIMA) model is one of the most common statistical models for time series prediction. ARIMA(p, d, q) is composed of three parts: AR is the autoregressive model, which expresses the current value as a combination of the values at several specific time points in the past; I (integrated) calculates the difference between consecutive observations d times to make the series stationary; and MA (moving average) models the current value as a linear combination of past forecast errors.
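As a minimal illustration of the ARIMA idea (differencing followed by an autoregression; the MA term is omitted for brevity), a one-step forecast can be sketched in numpy. The order p = 2, d = 1 and the toy series below are arbitrary choices, not the settings used in the paper:

```python
import numpy as np

def fit_ar(series, p):
    """Fit an AR(p) model y_t = c + a_1*y_{t-1} + ... + a_p*y_{t-p} by least squares."""
    y = np.asarray(series, dtype=float)
    X = np.column_stack([y[p - k - 1 : len(y) - k - 1] for k in range(p)])
    X = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X, y[p:], rcond=None)
    return coef  # [c, a_1, ..., a_p]

def arima_one_step(series, p=2, d=1):
    """One-step ARIMA(p, d, 0) forecast: difference d times, fit AR(p), undo the differencing."""
    y = np.asarray(series, dtype=float)
    diffed = np.diff(y, n=d) if d > 0 else y
    c, *a = fit_ar(diffed, p)
    next_diff = c + sum(ai * diffed[-i - 1] for i, ai in enumerate(a))
    # invert first-order differencing: add the forecast increment to the last level
    return y[-1] + next_diff if d == 1 else next_diff
```

For a series with a steady trend, differencing removes the trend and the AR part models what remains, which is why d > 0 is needed for non-stationary temperature records.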
LSTM is an artificial recurrent neural network (RNN) architecture^{[18]} used in the field of deep learning. Unlike standard feedforward neural networks, LSTM has feedback connections, so it can process not only single data points but also entire sequences of data. GRUs, introduced in 2014 by Cho
When predicting time-series data, traditional fully connected neural networks are weak, whereas RNNs can process time-series data. The essential characteristics of RNNs are an internal feedback connection and a feedforward connection between the processing units^{[19]}. The LSTM network is a special RNN model whose structural design solves long-term dependence issues. The key of LSTM is the cell state, which saves the current state and passes it to the next time step. The LSTM model is shown in
The LSTM model diagram. LSTM: Long short-term memory.
The forget gate in LSTM is used to determine which parts of the input information will be forgotten, as shown in
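The gating mechanism can be sketched as a single numpy time step. The stacked-weight layout and gate ordering below are illustrative conventions, not the paper's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b hold the stacked parameters of the
    forget (f), input (i), candidate (g), and output (o) gates."""
    z = W @ x + U @ h_prev + b           # stacked pre-activations, shape (4*n,)
    n = h_prev.shape[0]
    f = sigmoid(z[0:n])                  # forget gate: what to drop from the cell state
    i = sigmoid(z[n:2*n])                # input gate: what new information to write
    g = np.tanh(z[2*n:3*n])              # candidate cell content
    o = sigmoid(z[3*n:4*n])              # output gate: what to expose as hidden state
    c = f * c_prev + i * g               # the cell state carries long-term memory
    h = o * np.tanh(c)                   # new hidden state
    return h, c
```

The forget gate f multiplies the previous cell state element-wise, so a value near 0 erases that component of memory and a value near 1 preserves it.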
The TCN was first proposed by Lea
Structure of TCN. TCN: Temporal convolutional network.
Several structures in the TCN model, e.g., causal convolution, dilated convolutions, and the residual block, are adapted from image-oriented convolutional neural networks. Causal convolution builds the network according to Equation (2). For the sequence modeling problem,
Based on causal convolution, dilated convolutions skip part of the input (the isolated circles in
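A minimal numpy sketch of a causal convolution with a dilation rate (single channel, fixed kernel; a real TCN learns many such kernels per layer) shows how the receptive field grows without using future inputs:

```python
import numpy as np

def causal_dilated_conv(x, kernel, dilation=1):
    """1-D causal convolution with dilation: the output at time t depends only on
    x[t], x[t-d], x[t-2d], ... (left-padded with zeros, so len(out) == len(x))."""
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), np.asarray(x, dtype=float)])
    out = np.zeros(len(x))
    for t in range(len(x)):
        # taps reach back in steps of `dilation`; nothing from the future is used
        taps = xp[t : t + pad + 1 : dilation]
        out[t] = np.dot(taps, kernel)
    return out
```

Stacking layers with dilation rates 1, 2, 4, ... makes the receptive field grow exponentially with depth, which is how a TCN covers long histories with few layers.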
First, we propose a hybrid model based on LSTM and GRU; second, we establish a TCN-based network model.
The LSTM unit controls the amount of new memory content added to the memory cell independently of the forget gate. The GRU controls the information flow from the previous activation when computing the new candidate activation, but it does not independently control the amount of candidate activation added (the control is tied via the update gate). LSTM is good at capturing new data, but deep LSTM networks often lead to overfitting. GRU concentrates on short-term historical data and runs a lower risk of overfitting. Thus, we combine these advantages and propose a hybrid network.
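The tied control described above can be seen in a single-step GRU sketch (plain numpy; the parameter layout is illustrative): the same update gate z both discards old state and admits the candidate.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h_prev, P):
    """One GRU time step. P maps gate name -> (W, U, b)."""
    Wz, Uz, bz = P["update"]
    Wr, Ur, br = P["reset"]
    Wh, Uh, bh = P["candidate"]
    z = sigmoid(Wz @ x + Uz @ h_prev + bz)               # update gate
    r = sigmoid(Wr @ x + Ur @ h_prev + br)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev) + bh)   # candidate activation
    # one gate does both jobs: (1 - z) keeps old state, z admits the candidate
    return (1.0 - z) * h_prev + z * h_tilde
```

Compared with the LSTM step, there is no separate cell state and no independent input/forget pair, which is why the GRU has fewer parameters per unit.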
The LSTM-GRU hybrid neural network constructed in this paper has five layers. The first layer is the LSTM neural network layer, which contains 30 hidden neurons; the second to fourth layers are GRU neural network layers, with 24, 6, and 3 neurons, respectively; and the fifth layer is a fully connected layer.
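Using the standard per-layer parameter formulas (4·(n(n+m)+n) for LSTM and 3·(n(n+m)+n) for a GRU without separate reset-after biases, where n is the layer width and m its input dimension), the size of this stack can be estimated. The input feature dimension of 3 (temperature, salinity, depth) and the single-output dense layer are assumptions; the exact total also depends on the implementation's bias convention:

```python
def lstm_params(units, input_dim):
    # four gate blocks (forget, input, candidate, output), each with
    # input weights, recurrent weights, and a bias vector
    return 4 * (units * (units + input_dim) + units)

def gru_params(units, input_dim):
    # three gate blocks (update, reset, candidate)
    return 3 * (units * (units + input_dim) + units)

def dense_params(units, input_dim):
    return units * input_dim + units

# the five-layer stack: LSTM(30) -> GRU(24) -> GRU(6) -> GRU(3) -> Dense(1)
total = (lstm_params(30, 3)
         + gru_params(24, 30)
         + gru_params(6, 24)
         + gru_params(3, 6)
         + dense_params(1, 3))
```

Most of the capacity sits in the first two layers, which is consistent with the design intent: a wide LSTM front end to absorb new data, tapering GRU layers to compress it.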
Referring to the model structure of the residual network, we design the TCN network structure shown in
Structure of the proposed model.
The network includes two improved ResNet units. Each unit contains two branches: the first branch contains two convolution layers, and the second branch contains one convolution layer. Each convolution layer of the first unit contains 16 kernels, and each convolution layer of the second unit contains 32 kernels. In the second unit, the dilation parameter is set to 2.
Moreover, we also add two networks for comparison. The first contains one unit with 16 convolution kernels. The second contains three units (16, 32, and 64 convolution kernels) with dilation rates of 1, 2, and 4, respectively. We also compare these networks with the traditional TCN, and the experimental results are shown in Section 4.
The data used in this article were derived from “Argo (V3.0)” and include over 2.15 million temperature, salinity, and depth profiles obtained from more than 15,000 automatic profiling buoys worldwide from July 1997 to March 2021. The files in the original dataset are stored according to the buoy number and contain longitude, latitude, pressure, temperature, and salinity; each number corresponds to a unique buoy, and the position of each buoy is uniquely determined. This paper selects ocean data from April to December 2020 for ocean temperature prediction; after preprocessing, there are 94,075 data items. Since this paper only predicts the temperature of the ocean surface, we selected only the ocean data within 10 m of sea level, organized the extracted effective data, and stored them again according to the buoy number. We use 90% of the dataset as the training set and 10% as the validation set. The experiments used the following environment: Intel Core i7 7700, 16 GB RAM, RTX 2080TI, Ubuntu 16.04, Python 3.7.4, and TensorFlow 1.13.1. The linear normalization method was adopted to normalize the original data, and the batch size was set to 100.
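The preprocessing steps (linear normalization and the 90/10 split) can be sketched as follows. Whether the split is chronological or shuffled is not stated in the paper, so the chronological split below is an assumption:

```python
import numpy as np

def minmax_normalize(x):
    """Linear (min-max) normalization to [0, 1]."""
    x = np.asarray(x, dtype=float)
    lo, hi = x.min(), x.max()
    return (x - lo) / (hi - lo)

def train_val_split(data, train_frac=0.9):
    """90/10 split without shuffling, so the validation set is the latest data."""
    n = int(len(data) * train_frac)
    return data[:n], data[n:]
```

Normalizing temperature, salinity, and depth to a common [0, 1] range keeps the gradient scales of the different input features comparable during training.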
In this paper, the explained variance score (EVS) and mean square error (MSE), commonly used in regression models, are selected as the evaluation indicators of the model. The formula for EVS is as follows:
MSE is a loss function in linear regression. Under the same conditions, the smaller the value, the higher the accuracy of the prediction model.
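Both metrics can be written in a few lines of numpy. Note the difference between them: EVS ignores a constant bias in the predictions (only the variance of the residual matters), while MSE penalizes it:

```python
import numpy as np

def evs(y_true, y_pred):
    """Explained variance score: 1 - Var(y_true - y_pred) / Var(y_true)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 1.0 - np.var(y_true - y_pred) / np.var(y_true)

def mse(y_true, y_pred):
    """Mean square error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return np.mean((y_true - y_pred) ** 2)
```

A prediction offset by a constant scores a perfect EVS of 1 but a nonzero MSE, which is why the two indicators are reported together.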
The ARIMA model can only predict ocean temperature for a single buoy: it predicts along the time dimension and cannot generalize across different spatial locations. When the model parameters are the same, the prediction accuracy of the ARIMA model differs considerably between buoys. The prediction of the ARIMA model is shown in
ARIMA model prediction effect diagram. ARIMA: Autoregressive integrated moving average.
The prediction effect of the ARIMA model in
The neural network models can predict the ocean temperature of all buoys at the same time. We compared LSTM^{[14]}, GRU^{[25]}, LSTM-GRU, and TCN. The compared TCN network is presented in^{[26]}. The parameters of the networks are shown in
Networks’ structure and parameters

LSTM    25    25    —    —    8626
LSTM: Long short-term memory; GRU: gated recurrent unit; TCN: temporal convolutional network.
LSTM, GRU, and LSTM-GRU represent the three RNN networks mentioned in Section 3. TCNR is the TCN network with improved ResNet modules. Moreover, we label the three traditional TCN causal networks as TCNC and the two TCN causal networks with dilated convolution as TCNC. The number on the left is the number of convolution kernels, the number on the right is the kernel size, and the number in brackets is the dilation rate. The EVS and MSE of each prediction result are shown in
Predictions of all models. LSTM: Long short-term memory; GRU: gated recurrent unit; TCN: temporal convolutional network.
The EVS and MSE of prediction results on the models

Model    EVS       MSE
LSTM     0.9564    3.5275
LSTM: Long short-term memory; GRU: gated recurrent unit; TCN: temporal convolutional network; EVS: explained variance score; MSE: mean square error.
In
In
Predictions of several models. LSTM: Long short-term memory; GRU: gated recurrent unit; TCN: temporal convolutional network.
Next, we replaced two randomly selected temperature values with abnormal values to test the robustness of LSTM-GRU and TCNR2. The test results are shown in
The result of abnormal data on LSTM-GRU and TCNR2. LSTM: Long short-term memory; GRU: gated recurrent unit; TCN: temporal convolutional network.
The EVS and MSE of the two models on abnormal data

Model       EVS       MSE
LSTM-GRU    0.8514    16.1151
LSTM: Long short-term memory; GRU: gated recurrent unit; TCN: temporal convolutional network; MSE: mean square error; EVS: explained variance score.
The panel on the left is LSTM-GRU, and the panel on the right is TCNR2. As shown in
In this paper, two kinds of network structures for predicting thermocline time series data are proposed: one is the LSTM-GRU hybrid neural network model, and the other is the TCN-based neural network model. The two networks have obvious advantages over other models in accuracy, stability, and adaptability. TCN has higher prediction accuracy, while LSTM-GRU better predicts abnormal data and has higher robustness. For future work, other variables, such as location blocks and atmospheric parameters, can be added to the model to improve the accuracy of ocean temperature prediction. This paper only concerns ocean surface temperature; deep seawater is also a vital topic, and it would be more practical to predict the temperature of shallow and deep seawater simultaneously. Forecasting data at multiple depths adds one dimension to the data, and this will be another challenge, as seawater exhibits nonlinear vertical variations.
Made substantial contributions to conception and design of the study: Jiang Y, Zhao M
Performed data analysis and interpretation: Zhao W, Qin H, Qi H
Performed data acquisition, as well as providing technical and material support: Wang K, Wang C
The Argo data were collected and made freely available by the International Argo Program and the national programs that contribute to it (
This work was supported by the National Natural Science Foundation of China under Grant 62072211, Grant 51939003, and Grant 51809112.
All authors declared that there are no conflicts of interest.
Not applicable.
Not applicable.
© The Author(s) 2021.