Transforming LaTeX to Accessible and Inclusive Formats: A Guide for Open Educational Resources
Authors
Sandeep Singh Sangha
(Virginia Tech)
Anita R Walz
(Virginia Tech)
Abstract
This article presents a process for using TeX4ht to convert LaTeX documents into accessible Hypertext Markup Language (HTML) and Electronic Publication (EPUB) formats, aiming to reduce document processing barriers for open educational resource (OER) practitioners. These practitioners, including instructor-authors, librarians, and instructional designers, are often responsible for making mathematically intensive and other openly licensed documents accessible (e.g., compliant with the Americans with Disabilities Act (ADA) and the Web Content Accessibility Guidelines (WCAG) 2.1 Level AA standard) for individuals with print disabilities and those who rely on screen reader technologies. Despite this responsibility, many practitioners are either unfamiliar with or under-supported in using existing technical conversion tools such as Pandoc, AMS TexML, the LaTeX axessibility package, and LaTeXML. These tools are often perceived as overly technical and time-intensive to learn. Other options, such as ADA-compliant LaTeX-to-PDF tagging tools, are not yet operational. Rekeying entire manuscripts is labor-intensive, and the common fallback, manually remediating (e.g., tagging) PDFs, fails to meet sustainability standards. Manual remediation bolts accessibility features onto the PDF after compilation, rather than embedding them into the source files. As a result, accessibility tagging must be redone each time the LaTeX source files are edited and recompiled, undermining long-term accessibility and maintainability. In this article, we offer both a rationale and a step-by-step guide for converting LaTeX source files into HTML and EPUB formats for book-length, mathematics-intensive OER. We introduce command-line prompts suitable for users new to the command line.
Having faced these challenges ourselves, we aim for this documentation to support staff in library-based OER initiatives, under-resourced OER publishing programs, authors in mathematically intensive STEM disciplines, and LaTeX users seeking sustainable accessibility solutions.
Keywords: Open Educational Resources (OER), LaTeX, TeX4ht, HTML, EPUB, Higher Education, Digital Accessibility
How to Cite:
Sangha, S. S. & Walz, A. R. (2025) “Transforming LaTeX to Accessible and Inclusive Formats: A Guide for Open Educational Resources”, Journal of Open Educational Resources in Higher Education 3(3), 297-318. doi: https://doi.org/10.31274/joerhe.19769
Sangha, S. & Walz, A. (2025) 'Transforming LaTeX to Accessible and Inclusive Formats: A Guide for Open Educational Resources', Journal of Open Educational Resources in Higher Education. 3(3):297-318. doi: 10.31274/joerhe.19769
Sangha, S. & Walz, A. Transforming LaTeX to Accessible and Inclusive Formats: A Guide for Open Educational Resources. Journal of Open Educational Resources in Higher Education. 2025 Oct;3(3):297-318. doi: 10.31274/joerhe.19769
Sangha, S. & Walz, A. (2025, October 27). Transforming LaTeX to Accessible and Inclusive Formats: A Guide for Open Educational Resources. Journal of Open Educational Resources in Higher Education 3(3):297-318. doi: 10.31274/joerhe.19769
The content of this article is highly relevant to the journal, discussing practical applications for a tool which can support enhanced development processes for open educational resources. As a contribution to the journal’s Innovative Practices section, which publishes “short-form, practical, applicable, case-study type articles that are not research-based,” the article is well-aligned.
The paper shares step-by-step methods for implementing a specific process to convert LaTeX content into other formats for more accessible publications, which would be incredibly useful for those supporting this work at their institutions. As such, this paper is both well-aligned and important for the field.
Organization
The article progresses well through its introduction of the topic and the tools to be discussed. Its step-by-step instructions for implementing TeX4ht are also well-organized and relatively easy to follow, though technical in nature.
The one area which could be improved in this sense is the paragraph just before the TeX4ht section. This paragraph would benefit from the addition of a subheader to highlight its contents, something along the lines of “How to Read this Article.” If someone were scanning through this article section-by-section, they might skip the introduction (or the second half of it) to see the methods and main content of the paper. However, this specific paragraph, providing an overview of how the paper is organized and what it contains, is incredibly important and should not be glossed over. For this reason, I would recommend making it a top-level heading or assigning it a subheading within the introduction. This would make its importance and purpose clearer to readers without changing the structure of the article in any substantial way.
Methodology, Approach, Conclusions
This paper provides a valuable and clear process for OER creators, publishers, and support staff to consider when supporting the development of content using LaTeX, a popular markup language for researchers in mathematics, physics, and engineering. The methods for utilizing TeX4ht make sense and the reasoning for the tool’s use is also clear.
However, I do have some concerns with the Introduction section within the paper. It would have been nice to see more research put into the Introduction section, to back up the relevance of the topic for readers and show a stronger base of knowledge on the topic. In some cases, statements are made that appear to be conjecture or based on personal experience, which could be taken negatively by readers with a lot of experience in this area.
Specific points of concern are noted below:
At the top of p. 3, the statement is made that “OER practitioners, especially those without a technical background, typically use one of three approaches for accessible document production with LaTeX source files.” I would recommend rephrasing this sentence either to make it clear that it is an observation based on your own experience and prior knowledge, or to cite a specific piece of research showing that these three methods are the ones most commonly used.
In the middle of p. 3, several statements are made about a specific approach to publishing LaTeX content with "OER platforms," most of which are in this sentence: “The third approach discards LaTeX’s benefits and involves scraping content from a PDF into OER platforms like Pressbooks, Manifold, or Overleaf.” I would argue that this sentence is misleading in a couple of areas.
1. This process only discards LaTeX’s benefits “as a tool for typesetting and compiling documents into a single package or outputting graphs and figures.” These methods keep other benefits intact, such as formatting for mathematical equations (at least in Pressbooks’ case, which includes a plugin for MathJax).
2. I do not know many OER publishers who “scrape content from a PDF into OER platforms.” More often, the base text is imported into platforms and the figures and math must be compiled separately as LaTeX for MathJax (for Pressbooks) or as images/plain text (for Manifold). Our team recently began taking the LaTeX source code for documents and converting them to HTML or EPUB using pandoc, but I understand that may not be common.
Footnote 3 is an incredibly useful aside and provides excellent context for how tools like pandoc have been used as workarounds for this in the past, and I'd love to see that context brought into the main text of the article to flesh out the introduction's point.
3. I would not recommend placing Overleaf next to Pressbooks or Manifold here. Overleaf is a tool specifically intended for ingesting and creating documents entirely written in LaTeX. It is not commonly used by general OER practitioners, but it is commonly used by authors who are well-versed in LaTeX. LibreTexts and PreTeXt are more commonly used for OER development than Overleaf, in my experience.
Besides these few small points, I did not have any concerns with the accuracy or methods presented in the paper. Everything is presented with purpose, and the specific methods discussed in the step-by-step instruction sections show a strong background of knowledge in this topic.
Writing Style, References
The writing style is strong, clear, and not overly bloated, something I especially appreciated since this is a very technical set of instructions.
There was only one section where I would recommend changing the general style and formatting.
The use of bold in the paragraph before the “TeX4ht” section is distracting, especially when coupled with the use of hyperlinks throughout the same section. If the bold is necessary, this could also be improved by not listing the full title of the appendices. For example, the sentence “Finally, we summarize our findings from extensive experimentation with make4ht and tex4ebook in appendices Appendix A. Additional Experiments & Findings - TeX4ht for HTML and Appendix B. Additional Experiments & Findings - TeX4ht for EPUB respectively” could be simplified to: “Finally, we summarize our findings from extensive experimentation with make4ht and tex4ebook in Appendix A and Appendix B, respectively.”
Application
Yes! This paper does an excellent job making the case for the relevance of the process it introduces, making the process relatively easy to follow, and providing links to external guides to support practitioners as they explore more accessible ways to delve into formatting for documents developed in LaTeX. The inclusion of code examples for both Windows and Mac was also an excellent touch, and I applaud the authors for making their process easy to follow for those with limited technical experience. It is still likely beyond those with no technical experience, but those willing to learn will gain valuable experience by working through this well-documented process.
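For readers who have not seen the kind of command under discussion, the following is a minimal sketch of how a TeX4ht conversion might be driven from a short, cross-platform Python script. It is not the authors' exact procedure: the file name `main.tex`, the helper names `build_commands` and `convert`, and the option strings passed to `make4ht` and `tex4ebook` are assumptions chosen for illustration, and the article's own step-by-step instructions should take precedence.

```python
import shlex
import shutil
import subprocess


def build_commands(tex_file: str = "main.tex") -> dict[str, list[str]]:
    """Return illustrative command lines for HTML and EPUB conversion.

    The option strings are assumptions for this sketch, not the
    settings recommended in the article.
    """
    return {
        # make4ht is the TeX4ht build tool; "mathml,mathjax" requests
        # MathML output that is rendered in the browser by MathJax.
        "html": ["make4ht", "-f", "html5", tex_file, "mathml,mathjax"],
        # tex4ebook builds e-book formats on top of TeX4ht.
        "epub": ["tex4ebook", "-f", "epub3", tex_file, "mathml"],
    }


def convert(tex_file: str = "main.tex") -> dict[str, bool]:
    """Run each command whose tool is installed; report success per target."""
    results = {}
    for target, argv in build_commands(tex_file).items():
        executable = shutil.which(argv[0])
        if executable is None:
            results[target] = False  # tool not found on PATH
            continue
        completed = subprocess.run([executable, *argv[1:]], check=False)
        results[target] = completed.returncode == 0
    return results


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is
    # safe to execute on a machine without a TeX installation.
    for target, argv in build_commands().items():
        print(f"{target}: {shlex.join(argv)}")
```

Because the commands are passed as argument lists rather than as a shell string, the same script should behave the same way on Windows and macOS, provided a TeX distribution that includes TeX4ht is installed and on the PATH.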
What are the stronger points/qualities of the article?
The straightforward and clear writing style, step-by-step instructions, and (as previously stated) inclusion of both Windows and Mac commands for implementing the described practices were all handled well by the authors. The rationale for this process and the use of this tool specifically are well described, and I look forward to testing the process against our press' current workflows for handling LaTeX documents in the future.
What are the weaker points/qualities of the article? How could they be strengthened?
As previously noted, the introduction of the article could be further clarified with some small tweaks, and a subheading for the paragraph describing the structure of the paper would be useful for readers. These are small concerns, though, and I could see the authors easily addressing them.
The only additional note I would add is that the article might benefit from the addition of 1–2 sentences about what screen readers are and why they are important, after the sentence: "LaTeX source code is typically compiled into Portable Document Format (PDF); however, without additional intervention, the mathematical equations, figures, tables, navigation structures, etc., within LaTeX-generated PDFs are inaccessible to readers who use screen readers to access digital materials."
While I would hope that most readers would understand the purpose and importance of having screen reader accessible documents, not all readers will, and 1–2 sentences explaining who uses screen readers and why would further support the importance of this article for a general audience of OER creators.