Benchmarking Open-Source Language Models for Efficient Question Answering in Industrial Applications


Discover the first version of our scientific publication "Benchmarking Open-Source Language Models for Efficient Question Answering in Industrial Applications", published on arXiv and submitted to the journal Engineering Applications of Artificial Intelligence. The article is already publicly available.

Thanks to the Novelis research team for their know-how and expertise.


In the rapidly evolving landscape of Natural Language Processing (NLP), Large Language Models (LLMs) have demonstrated remarkable capabilities in tasks such as question answering (QA). However, the accessibility and practicality of utilizing these models for industrial applications pose significant challenges, particularly concerning cost-effectiveness, inference speed, and resource efficiency. This paper presents a comprehensive benchmarking study comparing open-source LLMs with their non-open-source counterparts on the task of question answering. Our objective is to identify open-source alternatives capable of delivering comparable performance to proprietary models while being lightweight in terms of resource requirements and suitable for Central Processing Unit (CPU)-based inference. Through rigorous evaluation across various metrics including accuracy, inference speed, and resource consumption, we aim to provide insights into selecting efficient LLMs for real-world applications. Our findings shed light on viable open-source alternatives that offer acceptable performance and efficiency, addressing the pressing need for accessible and efficient NLP solutions in industry settings.
