Introduction
Mainframe operating systems, originating in the 1940s, remain essential to critical sectors such as finance and government. However, the vast legacy of COBOL code running on them, estimated by IBM at 200 to 220 billion lines, needs to be migrated to modern platforms and rewritten in contemporary programming languages. The task is monumental: rewriting COBOL by hand is estimated to cost 32 to 50 cents per line, which makes this roughly a $100 billion challenge, and the time a complete manual rewrite would take is still uncertain. These systems are often perceived as outdated and require significant maintenance and modernization. Addressing the problem demands tools that can understand and interact with legacy codebases, a long-standing obstacle for the industry. The advent of Large Language Models (LLMs) offers a potential solution, but there are several concerns when applying LLMs to mainframe modernization.
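As a quick sanity check on the cost figure quoted above, the short Python sketch below multiplies the article's line-count range by its per-line cost range. The function name and constants are illustrative only; the inputs are the estimates cited in this article, not independent data.

```python
# Back-of-the-envelope check of the rewrite cost quoted in the article.
# All inputs are the article's own estimates; the helper name is illustrative.

def rewrite_cost_usd(lines: int, cents_per_line: int) -> int:
    """Return the cost in whole US dollars of rewriting `lines` lines of code."""
    return lines * cents_per_line // 100

LINES_LOW, LINES_HIGH = 200_000_000_000, 220_000_000_000  # 200 to 220 billion lines
CENTS_LOW, CENTS_HIGH = 32, 50                            # 32 to 50 cents per line

low = rewrite_cost_usd(LINES_LOW, CENTS_LOW)
high = rewrite_cost_usd(LINES_HIGH, CENTS_HIGH)
print(f"Estimated cost: ${low / 1e9:.0f}B to ${high / 1e9:.0f}B")
# prints "Estimated cost: $64B to $110B"
```

The resulting range of about $64 billion to $110 billion is consistent with describing the problem as a challenge on the order of $100 billion.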
Challenges in Using LLMs for Mainframe Modernization:
1. Limited Training on Mainframe Languages: While existing LLMs are trained on a wide range of natural and programming languages, they lack sufficient training on the languages used in mainframes, such as COBOL. The relatively small amount of COBOL code available online leaves these models with an inadequate understanding of the language and limits their ability to reason about it. Additionally, organizations tend to keep their mainframe codebases private because of the high security demands of finance and other critical sectors, which further limits the available training data.
2. Lack of Proper Benchmarks: The absence of comprehensive documentation and clear business goals for mainframe systems makes it difficult to develop benchmarks to evaluate the quality of LLMs in this domain. This hinders the ability to measure their effectiveness and reliability in mainframe modernization tasks.
3. Complexity Beyond Code Generation: LLMs for coding are primarily trained for code generation, the most common use case in software engineering tasks. However, mainframe modernization involves more than generating COBOL code, because organizations aim to migrate their systems to other languages. LLMs therefore need knowledge beyond code generation to modernize these systems effectively.
XMainframe
To address these challenges, researchers at FPT Software AI Center have developed XMainframe, an LLM designed specifically with expertise in mainframe legacy systems and COBOL codebases. The work includes an extensive data collection pipeline that produces high-quality training datasets, significantly enhancing XMainframe’s performance in this specialized domain. The researchers also introduce MainframeBench, a comprehensive benchmark that evaluates mainframe knowledge through three tasks: multiple-choice questions, question answering, and COBOL code summarization. Empirical evaluations show that XMainframe consistently outperforms existing state-of-the-art LLMs on these tasks, achieving 30% higher accuracy than DeepSeek-Coder on multiple-choice questions, doubling the BLEU score of Mixtral-Instruct 8x7B on question answering, and scoring six times higher than GPT-3.5 on COBOL summarization. This work underscores XMainframe’s potential to drive significant advancements in managing and modernizing legacy systems, ultimately enhancing productivity and saving time for software developers.
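To make the two metrics mentioned above more concrete, here is a minimal Python sketch of how multiple-choice accuracy and a BLEU-style score could be computed. It is not the evaluation code used for MainframeBench: the function names and sample strings are hypothetical, and `bleu1` is a deliberately simplified unigram variant, whereas standard BLEU combines clipped precisions for 1-grams through 4-grams.

```python
import math
from collections import Counter

def mcq_accuracy(predictions, answers):
    """Fraction of multiple-choice questions answered correctly."""
    if not answers or len(predictions) != len(answers):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == a for p, a in zip(predictions, answers))
    return correct / len(answers)

def bleu1(candidate: str, reference: str) -> float:
    """Simplified BLEU: clipped unigram precision times a brevity penalty."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand:
        return 0.0
    ref_counts = Counter(ref)
    # Clip each candidate token's count at its count in the reference.
    overlap = sum(min(n, ref_counts[tok]) for tok, n in Counter(cand).items())
    precision = overlap / len(cand)
    # Penalize candidates that are shorter than the reference.
    bp = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return bp * precision

# Hypothetical model outputs, for illustration only.
print(mcq_accuracy(["A", "C", "B", "D"], ["A", "B", "B", "D"]))  # 0.75
print(bleu1("the program reads a file",
            "the program reads a customer file"))
```

In the second call every candidate word appears in the reference, so the precision is 1.0, and the score is reduced only by the brevity penalty because the candidate is one word shorter than the reference.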
[Figure: Illustration of the steps used to collect data to build XMainframe]
[Figure: Results on multiple-choice questions (MCQ)]
[Figure: Results on question answering (Q&A)]
[Figure: Results on code summarization]
Check out the Paper and GitHub. All credit for this research goes to the researchers of this project.
Thanks to FPT Software AI Center for the thought leadership and resources for this article. FPT Software AI Center has supported us in this content.
Asif Razzaq is the CEO of Marktechpost Media Inc. As a visionary entrepreneur and engineer, Asif is committed to harnessing the potential of Artificial Intelligence for social good. His most recent endeavor is the launch of an Artificial Intelligence Media Platform, Marktechpost, which stands out for its in-depth coverage of machine learning and deep learning news that is both technically sound and easily understandable by a wide audience. The platform has over 2 million monthly views, illustrating its popularity among audiences.