^"we refer as a gated recurrent unit (GRU), was proposed by Cho et al. [2014]" Junyoung Chung, et al. (2014). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. Arxiv 1412.3555
^Cho, Kyunghyun; van Merrienboer, Bart; Gulcehre, Caglar; Bahdanau, Dzmitry; Bougares, Fethi; Schwenk, Holger; Bengio, Yoshua (2014). "Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation". arXiv:1406.1078 [cs.CL].
^ a b Weiss, Gail; Goldberg, Yoav; Yahav, Eran (2018). "On the Practical Computational Power of Finite Precision RNNs for Language Recognition". arXiv:1805.04908 [cs.NE].
^Britz, Denny; Goldie, Anna; Luong, Minh-Thang; Le, Quoc (2017). "Massive Exploration of Neural Machine Translation Architectures". arXiv:1703.03906 [cs.NE].
^"when the reset gate is close to 0, the hidden state is forced to ignore the previous hidden state and reset with the current input only." Cho, et al. (2014).
^"the update gate controls how much information from the previous hidden state will carry over to the current hidden state" Cho, et al. (2014).
^"acts similarly to the memory cell in the LSTM network and helps the RNN to remember longterm information" Cho, et al. (2014).
^"allows the hidden state to drop any information that is found to be irrelevant later in the future, thus, allowing a more compact representation" Cho, et al. (2014).
^"allowing a more compact representation" Cho, et al. (2014).
^"As each hidden unit has separate reset and update gates, each hidden unit will learn to capture dependencies over different time scales" Cho, et al. (2014).
^Dey, Rahul; Salem, Fathi M. (20 January 2017). "Gate-Variants of Gated Recurrent Unit (GRU) Neural Networks". arXiv:1701.05923 [cs.NE].
^Heck, Joel; Salem, Fathi M. (12 January 2017). "Simplified Minimal Gated Unit Variations for Recurrent Neural Networks". arXiv:1701.03452 [cs.NE].