Blog

Conversion of PDF to SCORM Using AI: Challenges and Achievements

23/12/2024
Juan Prieto

The challenge of converting PDF documents to SCORM has been elevated to the next level through the implementation of artificial intelligence (AI). This advancement presents itself as an efficient and versatile solution, ideal for those needing to transform documents into interactive content, while preserving the original design or adapting it to a distinct, customized layout.

Global Vision: An Intelligent and Flexible Process

The synthesis of documents into electronic formats such as EPUB, SCORM, or HTML from PDFs is now performed by various transformation systems that preserve the original visual appearance.

However, adapting a PDF document with a specific design to a document with a different structure and a new layout is challenging due to the lack of information about the structure in the source document.

This aspect is particularly important for interactive content such as HTML and SCORM, where proper organization enhances both the visual experience and interactivity. Additionally, it is crucial that this content is accessible and responsive, allowing for seamless adaptation to different screen sizes and devices.

The conversion of a PDF to another format with exact positions is a straightforward process. However, transforming a PDF by adjusting the positions of elements and reading flows typically requires a human typesetter or designer. Ximdex accomplishes this using artificial intelligence techniques.

How Was It Done?

The problem to solve is how to layout the source content to adapt it to a new electronic format with a reading flow, design, and structure different from the original.

To solve this, we use computer vision to identify key elements on the page, separate them from the main content, and place them in the appropriate spaces within the new format.

At the same time, we apply technologies based on LLMs (Large Language Models) to analyze the text, identify patterns, and adjust it to a coherent reading structure.

Training the Vision Model:

A dataset of sample PDFs was prepared, where key elements such as accordions, pop-ups, activities, and others were manually identified and labeled.

The following image illustrates how this manual labeling is performed, which is a necessary step before training the visual recognition system.

During training, key metrics are evaluated to measure how well the model detects elements and adjusts them correctly to the new design. Once the results exceed the expected threshold, the model is ready to be used from the Ximdex Platform.

Once the system is trained, it can process a page from a PDF document and separate secondary elements from the main text. The illustration below shows a page from an educational book, highlighting the elements recognized by the computer vision system and the confidence level assigned to each.

Can I Use XPUBLISH Already?

Absolutely yes! XPUBLISH is a major leap in automating complex conversion tasks, such as adapting print designs to the digital world. Available on the Ximdex Platform, this service transforms PDF documents using computer vision techniques to identify secondary elements such as activities, annotations, or diagrams. Upon completion, the content is automatically converted into formats like SCORM or EPUB, featuring an accessible and adaptable design for seamless viewing on any device.