
SCIENCE CHINA Information Sciences, Volume 63, Issue 6: 160305 (2020) https://doi.org/10.1007/s11432-020-2873-x

Overfitting effect of artificial neural network based nonlinear equalizer: from mathematical origin to transmission evolution

  • Received Feb 26, 2020
  • Accepted Apr 13, 2020
  • Published May 13, 2020

Abstract

The overfitting effect of an artificial neural network (ANN) based nonlinear equalizer (NLE) leads to a bit error ratio (BER) overestimation trap in optical fiber communication systems, especially when performance is evaluated with the commonly used pseudo-random binary sequence (PRBS). First, we mathematically investigate the PRBS generation and Gray code mapping rules, in comparison with a Mersenne Twister random sequence (MTRS). Under a symbol erasure channel, we find that the ANN can recognize both the PRBS generation and symbol mapping rules by increasing the NLE weights at specific positions, whereas the MTRS is currently safe owing to the limited input length of current ANN-based NLEs. Then, we design four channel models of optical fiber transmission to experimentally examine how various impairments affect the evolution of the overfitting effect. For both the additive white Gaussian noise (AWGN) channel and the bandwidth-limited channel, overfitting can be mitigated by using a pruned PRBS (P-PRBS) training set, in which the input symbols determined by the generation and mapping rules are removed. However, for both the chromatic dispersion (CD) uncompensated channel and the CD-managed channel, the overfitting effect becomes serious, because the inter-symbol interference (ISI) induced by CD and fiber nonlinearity helps the ANN identify the PRBS symbol rules. Finally, possible solutions to mitigate the overfitting effect are summarized.
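
As a minimal illustration of why a PRBS can be learned from a short input window while an MTRS cannot, the sketch below generates a PRBS-20 bit stream and checks a two-tap XOR rule on both sequences. The generator polynomial $x^{20}+x^{17}+1$ is an assumption: the abstract does not state it, and it is chosen here only because it agrees with the offsets 17 and 20 listed in Table 1. The helper names `prbs20`, `mtrs`, and `rule_hit_rate` are hypothetical, and Python's `random` module serves as the Mersenne Twister source.

```python
import random


def prbs20(n_bits, seed=0xACE1):
    """Return n_bits bits obeying x[n] = x[n-17] XOR x[n-20].

    Assumption: generator polynomial x^20 + x^17 + 1. The source does not
    state which PRBS-20 polynomial was used.
    """
    if not 0 < seed < 2**20:
        raise ValueError("seed must be a nonzero 20-bit integer")
    bits = [(seed >> i) & 1 for i in range(20)]  # initial register contents
    for n in range(20, n_bits):
        bits.append(bits[n - 17] ^ bits[n - 20])
    return bits[:n_bits]


def mtrs(n_bits, seed=0):
    """Bits drawn from Python's random module (a Mersenne Twister generator)."""
    rng = random.Random(seed)
    return [rng.getrandbits(1) for _ in range(n_bits)]


def rule_hit_rate(bits, offsets):
    """Fraction of positions n where bits[n] equals the XOR of bits[n + d]."""
    lo = max(0, -min(offsets))             # first n with all taps in range
    hi = len(bits) - max(0, max(offsets))  # one past the last such n
    hits = 0
    for n in range(lo, hi):
        acc = 0
        for d in offsets:
            acc ^= bits[n + d]
        hits += acc == bits[n]
    return hits / (hi - lo)


if __name__ == "__main__":
    # Rule X(n) = X(n+17) XOR X(n-3), visible once the window half-width is 17.
    for name, seq in (("PRBS-20", prbs20(20000)), ("MTRS", mtrs(20000))):
        print(name, rule_hit_rate(seq, (17, -3)))
```

On the PRBS the rule holds at every position, so a predictor that sees $X(n+17)$ and $X(n-3)$ can recover $X(n)$ without using the received sample at all. On the MTRS the same rule is right only about half the time, which is chance level for equiprobable bits.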


Acknowledgment

This work was supported by the National Key R&D Program of China (Grant No. 2018YFB1801301), the National Natural Science Foundation of China (Grant No. 61875061), and the Key Project of R&D Program of Hubei Province (Grant No. 2018AAA041).


  • Figure 1

    (Color online) ANN learning process under the symbol erasure channel.

  • Figure 2

    (Color online) BER results under the symbol erasure channel with (a) OOK symbols, (b) PAM-4 symbols and (c) PRBS-20 PAM-4 symbol generation rules.

  • Figure 3

    (Color online) L-$\infty$ weight distributions for 100 independent trainings for (a) PRBS OOK symbols, (b) MTRS OOK symbols, (c) PRBS PAM-4 symbols, and (d) MTRS PAM-4 symbols.

  • Figure 4

    (Color online) (a) Structure of OOK symbol sequences to be transmitted; (b) AWGN channel model.

  • Figure 5

    (Color online) (a) BER results of the AWGN channel; (b) L-$\infty$ weight distributions of the ANN with an input length of 61.

  • Figure 6

    (Color online) BER results under the AWGN channel with the help of P-PRBS training set.

  • Figure 7

    (Color online) Experimental setup of a typical IM-DD transmission system.

  • Figure 8

    (Color online) BER results of the B2B transmission channel under the conditions of (a) different ROPs with a baud rate of 25 GBaud and (b) different baud rates with an ROP of $-3$ dBm.

  • Figure 9

    (Color online) BER results of the 20 km SSMF channel under the conditions of (a) different baud rates with an ROP of $-1$ dBm and (b) different ROPs with a baud rate of 40 GBaud.

  • Figure 10

    (Color online) L-$\infty$ weight distributions of ANNs for the 20 km SSMF channel at 40 GBaud and an ROP of $-1$ dBm. (a) Using the PRBS training set, and (b) using the P-PRBS training set.

  • Figure 11

    (Color online) BER results of the 100 km SSMF channel with CD pre-compensation.

  • Figure 12

    (Color online) L-$\infty$ weight distributions of ANNs for the 100 km SSMF channel at a launch power of 18 dBm. (a) Using the PRBS training set, and (b) using the P-PRBS training set.

  • Table 1

    Table 1  The ruleset of PRBS-20 OOK symbols

    $k^{\rm a)}$    Ruleset
    0–16            Null
    17–19           $X(n+17)$
    20–22           ans$^{\rm b)}$, $X(n-20)$, $X(n+20)$
    23–25           ans, $X(n23)$
    26–28           ans, $X(n26)$
    29–31           ans, $X(n29)$
    32–33           ans, $X(n32)$
    34              ans, $X(n+34)$
    35–37           ans, $X(n35)$
    38–39           ans, $X(n+38)$
    40              ans, $X(n-40)$, $X(n+40)$
    41–43           ans, $X(n41)$
    44–45           ans, $X(n44)$
    46              ans, $X(n46)$
    47–49           ans, $X(n47)$
    50              ans, $X(n50)$
    51              ans, $X(n+51)$
    52              ans, $X(n52)$
    53              ans, $X(n53)$
    54–55           ans, $X(n+54)$
    56              ans, $X(n56)$
    57              ans, $X(n+57)$
    58              ans, $X(n58)$
    59              ans, $X(n59)$
    60              ans, $X(n-60)$, $X(n+60)$
    $\ldots$        $\ldots$

    a) $k$ is the position relative to the current symbol in the input vector; the input length is $2k+1$. b) "ans" means that the ruleset for the current $k$ includes the rulesets for all smaller $k$ above.
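
The three-term entries of Table 1 can be reproduced with a short sketch, under the assumption that the PRBS-20 obeys $X(m) \oplus X(m-17) \oplus X(m-20) = 0$ (the source does not state the polynomial). Squaring a polynomial over GF(2) doubles every exponent, so the same relation also holds with delays 34 and 40. Letting each tap in turn play the role of the current symbol gives the smallest $k$ at which a window of length $2k+1$ contains the whole rule. The function names below are hypothetical, and rules with more than three terms, which the remaining rows of the table presumably rely on, are not covered.

```python
def three_term_rules(delays=(0, 17, 20), max_k=60):
    """Map k -> list of offset pairs (a, b) with x[n] = x[n+a] XOR x[n+b].

    Assumption: the sequence satisfies x[m] ^ x[m-17] ^ x[m-20] = 0.
    Doubling all delays (p(x)^2 = p(x^2) over GF(2)) yields further rules.
    """
    rules = {}
    d = tuple(delays)
    while max(d) <= 2 * max_k:
        for d0 in d:  # the tap that plays the role of the current symbol
            offs = tuple(d0 - t for t in d if t != d0)
            k = max(abs(o) for o in offs)  # half-width needed to see both taps
            if k <= max_k:
                rules.setdefault(k, []).append(offs)
        d = tuple(2 * t for t in d)
    return dict(sorted(rules.items()))


def prbs20(n_bits, seed=1):
    """Bits obeying x[n] = x[n-17] XOR x[n-20] (assumed polynomial)."""
    bits = [(seed >> i) & 1 for i in range(20)]
    for n in range(20, n_bits):
        bits.append(bits[n - 17] ^ bits[n - 20])
    return bits[:n_bits]


def holds(bits, offs):
    """True if bits[n] == bits[n+a] XOR bits[n+b] wherever both taps exist."""
    a, b = offs
    lo = max(0, -min(offs))
    hi = len(bits) - max(0, max(offs))
    return all(bits[n] == bits[n + a] ^ bits[n + b] for n in range(lo, hi))


if __name__ == "__main__":
    bits = prbs20(5000)
    for k, rule_list in three_term_rules().items():
        for offs in rule_list:
            print(k, offs, holds(bits, offs))
```

Under this assumption the sketch finds rules at $k = 17$, $20$, $34$, and $40$, with farthest offsets $+17$, $\pm 20$, $+34$, and $\pm 40$, which coincide with the corresponding rows of Table 1.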
