02/03/2025
Research Article
Human languages trade off complexity against efficiency
The authors trained a range of language models, from simple statistical models to advanced neural networks, on a database of 41 multilingual text collections spanning a wide variety of text types. Together these collections comprise nearly 3 billion words across more than 6,500 documents in over 2,000 languages. The trained models were then used to estimate entropy rates, a complexity measure derived from information theory.
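To make the entropy-rate idea concrete, here is a toy sketch of the simplest end of the modeling spectrum: a character bigram model with add-one smoothing, scored on its own training text. This is only an illustration of the information-theoretic quantity involved; the study itself used a range of models up to neural networks, and the function name and smoothing choice here are assumptions, not the authors' method.

```python
import math
from collections import Counter

def entropy_rate(text, n=2):
    """Estimate entropy rate (bits per character) of `text` with a
    character n-gram model using add-one (Laplace) smoothing.
    Toy illustration: it scores the training text itself, so it
    understates the entropy a held-out evaluation would report."""
    ngram_counts = Counter()
    ctx_counts = Counter()
    vocab_size = len(set(text))
    # Count n-grams and their (n-1)-character contexts.
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        ngram_counts[gram] += 1
        ctx_counts[gram[:-1]] += 1
    # Average negative log-probability of each next character.
    bits = 0.0
    total = 0
    for i in range(len(text) - n + 1):
        gram = text[i:i + n]
        p = (ngram_counts[gram] + 1) / (ctx_counts[gram[:-1]] + vocab_size)
        bits += -math.log2(p)
        total += 1
    return bits / total

# Highly repetitive text is very predictable, so its estimated
# entropy rate is far lower than that of ordinary running text.
print(entropy_rate("ab" * 500))
print(entropy_rate("the quick brown fox jumps over the lazy dog " * 25))
```

Lower numbers mean each character is easier to predict from its context; the study's complexity comparison across languages rests on exactly this kind of per-symbol predictability estimate, just with far stronger models.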