If you are responsible for your organization’s website and online documents, you may find yourself answering questions such as “What is Accessibility?” and “What is PDF remediation?” and “What are tags?” Below is a brief overview of what PDF remediation entails.
What does accessible mean?
“Accessible” means “able to be reached or entered.” When referring to digital content, it means able to be used by a person using assistive technology, or by someone who has a disability. Accessible documents are readable by assistive technology such as screen readers or connected Braille displays, but “accessible” also means understandable by all people, including those with cognitive disorders or brain injuries, and usable on a variety of technology and platforms. For more information about accessible digital content, check out this article, The Four Pillars of WCAG.
What are “Tags?”
Tags are digital labels that provide information to assistive technology about what elements the document contains. These can include headings, images, tables, lists, links, etc. Tags also tell assistive technology where these various elements belong in the order of the document. Tags provide structure: a hierarchy (or “outline”) of how a document should be read. They inform assistive technology users about what they are reading and help them navigate the content more easily.
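To make the idea of a tag hierarchy concrete, here is a minimal sketch in Python. It is not how any PDF library stores tags. The `Tag` class and the sample document are assumptions made up for illustration; only the tag names (Document, H1, H2, P) are standard PDF structure types. Walking the tree from top to bottom gives the order in which assistive technology would read the content.

```python
from dataclasses import dataclass, field

@dataclass
class Tag:
    """Hypothetical model of one tag: a type, optional text, and child tags."""
    kind: str                      # e.g. "Document", "H1", "P", "Figure"
    text: str = ""
    children: list = field(default_factory=list)

def reading_order(tag, depth=0):
    """Depth-first walk that yields (depth, kind, text) in reading order."""
    yield depth, tag.kind, tag.text
    for child in tag.children:
        yield from reading_order(child, depth + 1)

# Illustrative document, not taken from a real PDF.
doc = Tag("Document", children=[
    Tag("H1", "Handbook"),
    Tag("P", "Welcome to the handbook."),
    Tag("H2", "Replacing batteries"),
    Tag("P", "Open the cover on the back."),
])

for depth, kind, text in reading_order(doc):
    print("  " * depth + f"<{kind}> {text}")
```

If the tags were stored in a different order from the visual layout, this walk would read them in the stored order, which is why reading order is part of remediation.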
What is PDF Remediation?
PDF Remediation is the process of “tagging” the digital elements of PDF documents so that they can be read using assistive technology. Again, these “tags” identify the elements and inform the assistive technology about the order in which they are meant to be read. Many organizations use the PDF file format because, for visual users, it is stable and looks the same no matter what platform or device is used to open it.
Ideally, all documents would be created accessibly, and stay that way even if “saved as PDF.” In truth, that is often not the case. Even completely accessible documents created in MS Word, Google Docs or other authoring tools may not be accessible when saved to PDF format. Not all existing tags are preserved when content is converted to PDF, and some elements may still require remediation in order to remain accessible. Similarly, a document that begins as inaccessible or only partially accessible will not be more accessible when saved as a PDF. Remediation is required.
The benefits of adding correct PDF tagging go beyond accessibility. Correct tags also improve the SEO of online documents and make those documents more usable for everyone reading them.
Common PDF elements requiring remediation to be made accessible are images, headings, links, lists, tables, and reading order. Other elements may require remediation as well, but these are the most commonly occurring elements needing to be tagged.
Headings are a navigation tool that helps organize a document and inform the reader of what it contains. Just like newspaper headlines, document headings tell the user what type of content follows. For an assistive technology user, headings are essential in dividing content into easily understood sections. A person using assistive technology can choose to move through a document reading only the headings to learn what it contains. Without headings, a person reading a document cannot find specific information without reading every single line of text it contains.
If, for example, the document is a handbook, and the person reading it needs help with a specific aspect, they don’t want to have to read the entire 50-page document to find out how to replace batteries. Instead, they will want to navigate to that section by using headings. Users can skim through headings to find what they need, and properly tagged digital headings help assistive technology users do this.
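The handbook example above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the tagged content is available as a simple list of (tag type, text) pairs in reading order. The sample content and the function names `heading_outline` and `section_under` are hypothetical, not part of any screen reader or PDF tool.

```python
import re

# Hypothetical tagged content in reading order: (tag type, text).
content = [
    ("H1", "Handbook"),
    ("P", "Welcome."),
    ("H2", "Setup"),
    ("P", "Plug in."),
    ("H2", "Replacing batteries"),
    ("P", "Open the cover on the back."),
]

def heading_level(kind):
    """Return 1..6 for tag types H1..H6, or None for anything else."""
    match = re.fullmatch(r"H([1-6])", kind)
    return int(match.group(1)) if match else None

def heading_outline(items):
    """Collect only the headings, which is what 'skimming' relies on."""
    return [(heading_level(kind), text)
            for kind, text in items if heading_level(kind) is not None]

def section_under(items, heading_text):
    """Return the text that follows a heading, up to the next heading
    of the same or a higher level."""
    start_level = None
    collected = []
    for kind, text in items:
        level = heading_level(kind)
        if start_level is None:
            if level is not None and text == heading_text:
                start_level = level
            continue
        if level is not None and level <= start_level:
            break
        collected.append(text)
    return collected

print(heading_outline(content))
print(section_under(content, "Replacing batteries"))
```

If the headings were tagged as ordinary paragraphs, `heading_outline` would return an empty list and the reader would be left to go through every line.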
Images that convey information must have alt text in order to be understood by assistive technology. Without alt text, any image found in a PDF (or any other format) will simply read as “image” or “graphic.” This means that whatever information was meant to be conveyed by that image or graphic is unavailable to an assistive technology user. Images that are purely decorative do not require alt text. These can be tagged as an “artifact” and will be ignored by assistive technology. Examples include background images, boxes, text shadows, or repetitive logos.
Alt text should be short and describe the image as it pertains to the content. The best way to do this is an article unto itself, but an example might be a picture of George Washington Crossing the Delaware. How the alt text is written depends on the context. Is it a document about Presidents? The Revolutionary War? Boating in the 1700s? Art? Readers need to know how the image forwards the information being conveyed in the rest of the document.
Charts, graphs, flow charts, and infographics need to be clearly and completely described. Sometimes the best way to do this is to include the data table from which the chart or graph is derived.
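The rule for images can be expressed as a simple check. The sketch below assumes each element is represented as a dictionary with a `kind` and an optional `alt` value. That representation, the ids, and the sample alt text are invented for illustration. The check flags every Figure that has no alt text and skips anything marked as an artifact.

```python
def figures_missing_alt(items):
    """Return the ids of Figure elements whose alt text is absent or blank.
    Elements of any other kind, including artifacts, are ignored."""
    return [
        item["id"]
        for item in items
        if item["kind"] == "Figure" and not (item.get("alt") or "").strip()
    ]

# Hypothetical elements, not extracted from a real PDF.
elements = [
    {"id": "fig-1", "kind": "Figure",
     "alt": "Washington standing in a rowboat crossing the icy Delaware"},
    {"id": "fig-2", "kind": "Figure", "alt": ""},
    {"id": "bg-1", "kind": "Artifact"},   # decorative, needs no alt text
    {"id": "fig-3", "kind": "Figure"},    # alt text never provided
]

print(figures_missing_alt(elements))
```

A check like this can only tell whether alt text exists. Whether the text suits the context of the document still needs a human reviewer.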
Links within PDF documents need to be tagged as links. If the text of the document doesn’t indicate where the link leads, some explanation is necessary so the reader knows where the link is going. Otherwise, they may not realize they are leaving your website, or where they will end up. It’s the digital equivalent of jumping off a cliff. This kind of information is useful for all users. Most people prefer to know where a link is leading, and most people would rather see a link attached to descriptive text than a long raw web address.
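One rough way to spot links that need better text is shown below. This is a heuristic sketch, not a standard test. The list of vague phrases and the function name `is_descriptive` are assumptions chosen for the example.

```python
# Hypothetical list of phrases that say nothing about the destination.
VAGUE_PHRASES = {"click here", "here", "read more", "more", "link"}

def is_descriptive(link_text):
    """Return False for empty text, vague phrases, and raw web addresses."""
    text = link_text.strip().lower()
    if not text or text in VAGUE_PHRASES:
        return False
    if text.startswith(("http://", "https://", "www.")):
        return False
    return True

for sample in ["Click here", "https://example.com/a/b?id=42", "The Four Pillars of WCAG"]:
    print(sample, "->", is_descriptive(sample))
```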
Lists need to be tagged as lists. If there is no indication that text is part of a list, it will simply appear to be a wall of unrelated text or a bunch of words with no context.
Properly tagged lists allow assistive technology to announce each item’s position, such as “item 1 of 12,” so the reader knows the items are part of a list. This can be quite complex in the case of a nested list (or “outline”). A table of contents is an example of a list.
Without proper list tags, the table of contents pictured here will simply read as a wall of items. The reader will be unable to understand, for example, that “Research Design,” “Participants,” “Measures” and “Procedure and Analytic Plan” fall under the section “Methods.”
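The sketch below shows how list structure makes the “item 1 of 12” style of announcement possible, using the “Methods” entries mentioned above. The nested data layout and the `announce` function are assumptions for illustration, and the wording real screen readers use varies from product to product.

```python
def announce(items, level=1):
    """Return one announcement per list item, including nested items.
    Each item is a (label, children) pair, where children is another list."""
    lines = []
    total = len(items)
    for position, (label, children) in enumerate(items, start=1):
        lines.append(f"level {level}, item {position} of {total}: {label}")
        lines.extend(announce(children, level + 1))
    return lines

# Only the "Methods" branch is named in the text, so only that is modeled here.
table_of_contents = [
    ("Methods", [
        ("Research Design", []),
        ("Participants", []),
        ("Measures", []),
        ("Procedure and Analytic Plan", []),
    ]),
]

for line in announce(table_of_contents):
    print(line)
```

Without the nesting, all five labels would sit at the same level and nothing would indicate that four of them belong to “Methods.”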
Tables can be difficult for assistive technology users to parse. Each data cell usually gets its meaning from both a row header and a column header, so that relationship has to be made explicit in the tags. Row and column headers must be identified in order for the data to be understood and to facilitate navigation.
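Here is a small sketch of why header tags matter. It assumes the first row holds the column headers and the first cell of each later row holds the row header. The sample table contains placeholder labels and values invented for the example, not real data, and `describe_cells` is a hypothetical helper.

```python
def describe_cells(table):
    """Pair every data cell with its row header and column header."""
    header_row, *data_rows = table
    column_headers = header_row[1:]      # skip the top-left corner cell
    descriptions = []
    for row in data_rows:
        row_header, *cells = row
        for column_header, value in zip(column_headers, cells):
            descriptions.append(f"{row_header}, {column_header}: {value}")
    return descriptions

# Placeholder table for illustration only.
sample_table = [
    ["Region", "Q1", "Q2"],
    ["North", 10, 12],
    ["South", 7, 9],
]

for line in describe_cells(sample_table):
    print(line)
```

When the headers are not identified, a reader hears only the bare values, such as “10, 12, 7, 9,” with no way to tell which row or column each one belongs to.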
Table 9, Rainfall by continent 2009