Edited from Zaporizhzhia, Ukraine

newukrainedaily.com

New Ukraine Daily

Ukraine reporting, explainers, and practical support coverage.

News report

Ukraine Transfers 10TB for National Language Model Training

The State Archival Service of Ukraine has transferred 10 terabytes of data for the training of the national language model 'Syaivo'. This initiative was reported by the Ministry of Digital Transformation.

Ukrinform UA | Report | 2 min read | Updated 4/11/2026

Photo: Ukrinform UA

At a glance

  • 10TB of data transferred for training the 'Syaivo' language model
  • Contains materials equivalent to 70,000 books
  • Digital copies in archives to exceed 200 million by year-end
  • Over 50 partners contributing data for model training
  • Beta testing of the model scheduled for spring 2026

Why it matters

The transfer of 10TB of archival data is crucial for advancing Ukraine's capabilities in AI. This effort also aims to enhance the preservation and accessibility of historical documents.

https://www.ukrinform.ua/rubric-society/4111375-nacionalnu-veliku-movnu-model-sajvo-trenuvatimut-na-tekstah-ukrderzarhivu.html

What Happened

Illustration for this report. Created by the editorial desk using AI.

The State Archival Service of Ukraine has contributed 10 terabytes of historical texts and documents to the training of the national large language model, 'Syaivo'. The Ministry of Digital Transformation, which announced the transfer, said it is the first time State Archives data has been shared for digital development in Ukraine.

Key Details

The transferred data is roughly equivalent to 70,000 books. The materials consist of unique historical documents, government records, and scientific texts, marking a notable increase in the country's digital resources.

The initiative aims to enhance the training of the 'Syaivo' model, which will analyze a variety of texts including historical sources, manuscripts, laws, court rulings, media materials, and dictionaries. By the end of the year, the number of digital copies available in state archives is expected to grow from 150 million to over 200 million.

This pace of digitization is among the fastest globally and reflects Ukraine's commitment to preserving its archival heritage. Currently, over 50 partners, including media organizations, universities, and libraries, are contributing materials for the model's training.

A beta test of the 'Syaivo' model is planned for spring 2026, following a memorandum signed between the Ministry and the telecommunications company Kyivstar in summer 2025. The agreement sets out their collaboration on building a large language model aimed at further integrating AI into the public sector, defense, and business.

Why It Matters

This development is a pivotal step in advancing Ukraine's technological capabilities and preserving its cultural heritage. By leveraging archival data, the 'Syaivo' model aims to foster innovation in digital services and artificial intelligence applications throughout the country.

Background

The initiative to develop a national language model aligns with ongoing efforts to boost the use and understanding of the Ukrainian language in digital spaces. The collaboration between the Digital Transformation Ministry and various stakeholders marks a significant move towards modernizing state services and making information more accessible through advanced technologies.

Source: Ukrinform UA

