SCIENTIA SINICA Informationis, Volume 48, Issue 11: 1497-1509 (2018) https://doi.org/10.1360/N112018-00158

Towards creative language generation: exploring Chinese humor generation

  • Received: Jun 20, 2018
  • Accepted: Sep 27, 2018
  • Published: Nov 14, 2018

Abstract

Humor generation is a task in computational creativity: it can make a computer appear creative and give it a personality, and it can also improve the user experience. This paper explores the generation of Chinese jokes, the main form of humor. Specifically, it considers the following task: given the setup of a joke, generate the corresponding punchline. Two approaches built on current natural language generation technologies are examined. The first is based on the encoder-decoder framework and does not model the characteristics of humor. The second is based on generative adversarial networks (GANs), in which four humor characteristics (ambiguity, incongruity, phonetic similarity, and universality) are introduced into the reward function to evaluate the generated jokes and supervise the generator. Experimental results show that the GANs approach with joke-characteristic rewards improves on the encoder-decoder framework by about six percentage points in the ratio of jokes. Although the performance is still insufficient, this work is a first step towards creative language generation, and the insights obtained in the exploration will inform future research.
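As a rough illustration of how characteristic-based rewards might supervise a generator, the sketch below blends a discriminator score with four humor-characteristic scores into a single scalar reward. The names `CharacterScores` and `joke_reward`, the mixing weight `lam`, the equal weighting of the four scores, and the assumption that every score lies in [0, 1] are all hypothetical; the abstract does not specify how the reward is actually composed.

```python
from dataclasses import dataclass


@dataclass
class CharacterScores:
    """Hypothetical per-punchline scores, each assumed to lie in [0, 1]."""
    ambiguity: float
    incongruity: float
    phonetic_similarity: float
    universality: float

    def mean(self) -> float:
        # Equal weighting is an assumption made for this sketch only.
        return (self.ambiguity + self.incongruity
                + self.phonetic_similarity + self.universality) / 4.0


def joke_reward(disc_prob: float, chars: CharacterScores, lam: float = 0.5) -> float:
    """Blend a discriminator probability with humor-characteristic scores.

    lam is a hypothetical mixing weight: lam=1 uses only the discriminator,
    lam=0 uses only the humor-characteristic scores.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    return lam * disc_prob + (1.0 - lam) * chars.mean()


# Made-up scores for one generated punchline.
scores = CharacterScores(ambiguity=0.8, incongruity=0.6,
                         phonetic_similarity=0.2, universality=0.4)
reward = joke_reward(disc_prob=0.3, chars=scores, lam=0.5)
```

With these made-up inputs the blended reward comes out at approximately 0.4. In a policy-gradient setup such a scalar would typically weight the log-likelihood of the sampled punchline, but that training loop is outside the scope of this sketch.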


Funded by

National Natural Science Foundation of China (61673248, 61772324)

Shanxi Province "1331 Project" Key Discipline Construction Program


  • Figure 1. Punchline generation based on the encoder-decoder framework

  • Figure 2. Punchline generation based on GANs

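  • Table 1. Examples of joke construction
    No. | Setup | Punchline
    1 | 同样是又馋又懒, (Both are gluttonous and lazy,) | 熊猫和猪告诉我们长相有多么重要. (the panda and the pig show us how much looks matter.)
    2 | 长得丑怎么了, 我自己又看不到, (So what if I am ugly? I cannot see myself;) | 恶心的是你们. (you are the ones who get disgusted.)
    3 | A: 你觉着我长这样儿是不是应该去整整容了? (A: Looking the way I do, do you think I should get plastic surgery?) | B: 我觉着可以直接人道毁灭了. (B: I think you could just be humanely put down.)
    4 | A: 为什么深海鱼都很丑? (A: Why are deep-sea fish all so ugly?) | B: 因为太黑谁也看不见谁, 大家就随便长长喽. (B: Because it is so dark that nobody can see anybody, so they all just grow however they like.)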
  • Table 2. Datasets used in our models
    Dataset | Size | Sources
    Training data | 500000 | 480000 conversation pairs from Sina Weibo and 15000 one-liner jokes or single-turn jokes
    Validation data | 5000 | One-liner jokes or single-turn conversation jokes
    Test data | 1000 | One-liner jokes or single-turn conversation jokes
  • Table 3. Examples of human evaluation
    Setup | Punchline | Coherence | Funniness
    A: 单身是无罪的. (A: Being single is not a crime.) | B: 你是不是说我是无罪的? (B: Are you saying that I am not guilty?) | 1 | 1
  • Table 4. Impact of the Weibo dataset on the systems
    Model | BLEU-4
    SYS$_\text{encoder-decoder}$ (with Weibo dataset) | 11.79
    SYS$_\text{encoder-decoder}$ (no Weibo dataset) | 10.04
  • Table 5. Results of the different models on test-set$_{1000}$
    Model | BLEU-4
    SYS$_\text{encoder-decoder}$ | 11.79
    Adver-REGS | 12.07
    SYS$_\text{GANs-joke}$ (single) | 14.67
    SYS$_\text{GANs-joke}$ (ensemble) | 14.8
  • Table 6. Results of the different models on test-set$_{500}$
    Model | Ratio of jokes (%)
    SYS$_\text{encoder-decoder}$ | 10.2
    Adver-REGS | 13.4
    SYS$_\text{GANs-joke}$ (single) | 16.6
    SYS$_\text{GANs-joke}$ (ensemble) | 16.8
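  • Table 7. Examples of system generation
    No. | Setup | Source | Punchline
    1 | 你的数学成绩超过你的体重了吗? (Has your math score exceeded your weight yet?) | Human | 这真是个血腥的问题. (That is a truly brutal question.)
    1 | | SYS$_\text{encoder-decoder}$ | 没唉. (Nope.)
    1 | | SYS$_\text{GANs-joke}$ (single) | 再胖下去也不对吧, 我认了. (Getting any fatter would not be right either; I admit it.)
    2 | 你觉着我长这样儿是不是应该去整整容了? (Looking the way I do, do you think I should get plastic surgery?) | Human | 我觉着可以直接人道毁灭了. (I think you could just be humanely put down.)
    2 | | SYS$_\text{encoder-decoder}$ | 我知道了. (Got it.)
    2 | | SYS$_\text{GANs-joke}$ (single) | 我去, 你这辈子不要去医院了. (Wow, you should never go to a hospital for the rest of your life.)
    3 | 为什么深海鱼都很丑? (Why are deep-sea fish all so ugly?) | Human | 因为太黑谁也看不见谁, 大家就随便长长喽. (Because it is so dark that nobody can see anybody, so they all just grow however they like.)
    3 | | SYS$_\text{encoder-decoder}$ | 因为你丑. (Because you are ugly.)
    3 | | SYS$_\text{GANs-joke}$ (single) | 因为他们没结婚. (Because they are not married.)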
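
The two metrics reported in Tables 4 to 6 can be sketched as follows. This is a minimal illustration, not the evaluation script used for the reported numbers: it assumes character-level tokens, a single reference per punchline, corpus-level BLEU-4 without smoothing, and a binary joke / not-joke judgment for each output. The function names `corpus_bleu4` and `joke_ratio` are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu4(hypotheses, references):
    """Corpus-level BLEU-4 (0-100), one reference per hypothesis, no smoothing."""
    matched = [0] * 4
    total = [0] * 4
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_len += len(hyp)
        ref_len += len(ref)
        for n in range(1, 5):
            h, r = ngrams(hyp, n), ngrams(ref, n)
            # Clipped counts: an n-gram is credited at most as often as it
            # occurs in the reference.
            matched[n - 1] += sum(min(c, r[g]) for g, c in h.items())
            total[n - 1] += sum(h.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / 4
    # Brevity penalty for outputs shorter than the reference.
    brevity = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return 100.0 * brevity * math.exp(log_precision)


def joke_ratio(labels):
    """Percentage of outputs judged to be jokes (labels are booleans)."""
    return 100.0 * sum(labels) / len(labels)


# Character-level tokens are an assumption made for this sketch.
reference = list("因为太黑谁也看不见谁")
perfect = corpus_bleu4([reference], [reference])
partial = corpus_bleu4([list("因为太黑谁也看不到")], [reference])
ratio = joke_ratio([True, False, False, True, False])
gain = round(16.6 - 10.2, 1)  # Table 6: GANs-joke (single) minus encoder-decoder
```

Here `gain` evaluates to 6.4, which matches the roughly six-percentage-point improvement in the ratio of jokes stated in the abstract.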
