Background

The foundations for the Language Creator were laid a long time ago. At thirteen, I taught myself Esperanto and soon afterwards discovered John C. Wells’ book Esperanto lingvistisk set (the Danish translation of Lingvistikaj aspektoj de Esperanto). Wells treated Esperanto not as a curiosity but as a system to be analysed typologically. That perspective fascinated me. For a time, I half-imagined that the study of language was essentially the scientific search for the perfect auxiliary tongue. I later became interested in many other aspects of the field, but this early idea led me to Aarhus a few years later to study linguistics formally. There, I also studied computer science – a combination that quietly laid the groundwork for everything that followed.

A decisive stimulus came in 1995, when the Russian émigré artists Vitaly Komar and Alex Melamid visited Denmark as part of their Most Wanted Paintings project. Invited by the Danish newspaper Politiken, they had conducted a national survey asking people what they wanted in a painting. Based on the statistical results, they produced Denmark’s “most wanted” painting and its “least wanted” counterpart. Here they are:

Images by Vitaly Komar and Alex Melamid, from the Most Wanted Paintings project. Reproduced here with permission. Higher-resolution versions are available on the official project website.

Komar and Melamid treated artistic taste as surveyable data: ask people what they prefer, then construct the statistical result. For someone already thinking in terms of typological parameters, the parallel was irresistible. Soon afterwards, a group of students – Anette Nielsen, Sebastian Adorján Dyhr, Søren Harder and I – decided to apply the same method to language. We called the experiment Poll Language. We designed a questionnaire and distributed it within the department, then constructed two languages from the responses.

Those languages have since been lost – like tears in rain. I remember, however, that the “most wanted” language turned out to be highly exotic, brimming with typologically rare and intricate features, while the “least wanted” language resembled a pared-down form of Danish.

Not long afterwards, I began wondering whether the entire process – the parameterisation of structural features and their realisation as coherent systems – could be automated. An early version involved receiving completed questionnaires by email and processing them with Perl and LaTeX to produce PDF grammars. It quickly became clear that the scope was far greater than I could realistically complete in spare evenings. Over the years, I returned to the idea repeatedly, each time hoping that new technology would make it feasible, and each time finding that it did not.

In late 2022, however, something genuinely changed. The public release of ChatGPT, soon followed by other large language models (LLMs), finally made the project viable. These models can assist with everything from theoretical questions (“Can the polysynthesis of Greenlandic and Mohawk be modelled in the same way?”) to computational implementation and even mundane technical obstacles (“Why is my .htaccess redirecting me to the wrong file?”).

The result is this website, which I hope will be of interest to typologists and conlangers alike. If you would like to contribute, please visit the Codeberg page.

Thomas Widmann