Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, Rui Hou, Hakan Inan, Marcin Kardas, Viktor Kerkez, Madian Khabsa, Isabel Kloumann, Artem Korenev, Punit Singh Koura, Marie-Anne Lachaux, Thibaut Lavril, Jenya Lee, Diana Liskovich, Yinghai Lu, Yuning Mao, Xavier Martinet, Todor Mihaylov, Pushkar Mishra, Igor Molybog, Yixin Nie, Andrew Poulton, Jeremy Reizenstein, Rashi Rungta, Kalyan Saladi, Alan Schelten, Ruan Silva, Eric Michael Smith, Ranjan Subramanian, Xiaoqing Ellen Tan, Binh Tang, Ross Taylor, Adina Williams, Jian Xiang Kuan, Puxin Xu, Zheng Yan, Iliyan Zarov, Yuchen Zhang, Angela Fan, Melanie Kambadur, Sharan Narang, Aurelien Rodriguez, Robert Stojnic, Sergey Edunov, and Thomas Scialom. 2023. Llama 2: Open foundation and fine-tuned chat models.
Kees van Deemter, Emiel Krahmer, and Mariët Theune. 2005. Squibs and discussions: Real versus template-based natural language generation: A false opposition? Computational Linguistics, 31(1):15–24.
Emiel van Miltenburg, Desmond Elliott, and Piek Vossen. 2018. Measuring the diversity of automatic image descriptions. In Proceedings of the 27th International Conference on Computational Linguistics, pages 1730–1741, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Tony Veale and Rafael Pérez y Pérez. 2020. Leaps and bounds: An introduction to the field of computational creativity. New Generation Computing, 38:551–563.
Gian Wiher, Clara Meister, and Ryan Cotterell. 2022. On decoding strategies for neural text generators. Transactions of the Association for Computational Linguistics, 10:997–1012.
Jörg Wöckener, Thomas Haider, Tristan Miller, The-Khang Nguyen, Thanh Tung Linh Nguyen, Minh Vu Pham, Jonas Belouadi, and Steffen Eger. 2021. End-to-end style-conditioned poetry generation: What does it take to learn from examples alone? In Proceedings of the 5th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 57–66, Punta Cana, Dominican Republic (online). Association for Computational Linguistics.
Xiaoyuan Yi, Ruoyu Li, Cheng Yang, Wenhao Li, and Maosong Sun. 2020. MixPoet: Diverse poetry generation via learning controllable mixed latent space. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 9450–9457.
Weizhe Yuan, Graham Neubig, and Pengfei Liu. 2021. BARTScore: Evaluating generated text as text generation. In Advances in Neural Information Processing Systems, volume 34, pages 27263–27277. Curran Associates, Inc.
Sina Zarrieß, Hendrik Buschmeier, Ting Han, and Simeon Schüz. 2021. Decoding, fast and slow: A case study on balancing trade-offs in incremental, character-level pragmatic reasoning. In Proceedings of the 14th International Conference on Natural Language Generation, pages 371–376, Aberdeen, Scotland, UK. Association for Computational Linguistics.
Tianyi Zhang, Varsha Kishore, Felix Wu, Kilian Q. Weinberger, and Yoav Artzi. 2020. BERTScore: Evaluating text generation with BERT. In International Conference on Learning Representations.
Yizhe Zhang, Michel Galley, Jianfeng Gao, Zhe Gan, Xiujun Li, Chris Brockett, and Bill Dolan. 2018. Generating informative and diverse conversational responses via adversarial information maximization. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS'18, pages 1815–1825, Red Hook, NY, USA. Curran Associates Inc.
Wei Zhao, Maxime Peyrard, Fei Liu, Yang Gao, Christian M. Meyer, and Steffen Eger. 2019. MoverScore: Text generation evaluating with contextualized embeddings and earth mover distance. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 563–578, Hong Kong, China. Association for Computational Linguistics.
A Appendix
A.1 DeepSpeare and SA
Deepspeare (Lau et al., 2018) is specifically designed for poetry generation. Its core architecture consists of an LSTM language model, a pentameter model (designed to learn iambic meter), and a rhyme model. During training, it takes sonnets as input (three quatrains followed by a couplet) but splits each sonnet into its quatrains and processes those. The rhyme model operates on the ending words of quatrain verses and uses a margin-based loss to discriminate between rhyming and non-rhyming words. It is not limited to specific rhyme patterns, but it assumes that rhymes exist in the data. At inference time, Deepspeare generates quatrains.
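The margin-based rhyme loss can be sketched as a ranking objective over embedding similarities: a rhyming pair of line endings should score higher than a non-rhyming pair by at least a fixed margin. The following is a minimal illustration of that idea, not Deepspeare's actual implementation; the function names and the margin value are ours, and the real model learns the word embeddings and samples negatives from the training data.

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def rhyme_margin_loss(anchor, positive, negative, margin=0.5):
    """Hinge-style margin loss: the rhyming pair (anchor, positive)
    should be more similar than the non-rhyming pair (anchor, negative)
    by at least `margin`; the loss is zero once that holds."""
    return max(0.0, margin - cosine(anchor, positive) + cosine(anchor, negative))

# Toy embeddings for three line-ending words.
a = np.array([1.0, 0.0])   # anchor ending
p = np.array([1.0, 0.0])   # rhyming ending (identical embedding)
n = np.array([0.0, 1.0])   # non-rhyming ending (orthogonal embedding)

print(rhyme_margin_loss(a, p, n))  # satisfied constraint -> 0.0
print(rhyme_margin_loss(a, n, p))  # violated constraint  -> positive loss
```

Minimizing this quantity over many (anchor, positive, negative) triples pushes embeddings of rhyming endings together without committing to any particular rhyme scheme, which matches the description above: the model only assumes that rhymes exist in the data.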
Structured Adversary. Like Deepspeare, Structured Adversary (SA) (Jhamtani et al., 2019) incorporates different components: an LSTM language model and a discriminator that decides whether line endings are typical of poetry. The two components are organized in an adversarial setup, in which the language model acts as a generator, trying to