News report
Ukraine Transfers 10TB for National Language Model Training
The State Archival Service of Ukraine has transferred 10 terabytes of data for the training of the national language model 'Syaivo'. This initiative was reported by the Ministry of Digital Transformation.
At a glance
- 10TB of data transferred for training the 'Syaivo' language model
- Contains materials equivalent to 70,000 books
- Digital copies in archives to exceed 200 million by year-end
- Over 50 partners contributing data for model training
- Beta testing of the model scheduled for spring 2026
Why it matters
The transfer of 10TB of archival data is crucial for advancing Ukraine's capabilities in AI. This effort also aims to enhance the preservation and accessibility of historical documents.
https://www.ukrinform.ua/rubric-society/4111375-nacionalnu-veliku-movnu-model-sajvo-trenuvatimut-na-tekstah-ukrderzarhivu.html
What Happened
The State Archival Service of Ukraine has provided 10 terabytes of historical texts and documents for training the national large language model, 'Syaivo'. The Ministry of Digital Transformation announced the transfer and said it is the first time State Archives data has been shared for digital development in Ukraine.
Key Details
The transferred data is roughly equivalent to 70,000 books. The materials consist of unique historical documents, government records, and scientific texts, marking a notable increase in the country's digital resources.
The initiative aims to enhance the training of the 'Syaivo' model, which will analyze a variety of texts including historical sources, manuscripts, laws, court rulings, media materials, and dictionaries. By the end of the year, the number of digital copies available in state archives is expected to grow from 150 million to over 200 million.
This pace of digitization is among the fastest globally and reflects Ukraine's commitment to preserving its archival heritage. Currently, over 50 partners, including media organizations, universities, and libraries, are contributing materials for the model's training.
A beta test of the 'Syaivo' model is planned for spring 2026. It follows a memorandum signed in summer 2025 between the Ministry and the telecommunications company Kyivstar, which sets out their collaboration on building a large language model intended to further integrate AI into the public sector, defense, and business.
Why It Matters
This development is a pivotal step in advancing Ukraine's technological capabilities and preserving its cultural heritage. By leveraging archival data, the 'Syaivo' model aims to foster innovation in digital services and artificial intelligence applications throughout the country.
Background
The initiative to develop a national language model aligns with ongoing efforts to boost the use and understanding of the Ukrainian language in digital spaces. The collaboration between the Digital Transformation Ministry and various stakeholders marks a significant move towards modernizing state services and making information more accessible through advanced technologies.
Source: Ukrinform UA