A few problems with the concept of accessible PDFs, Part Two

Firstly, apologies for such a long delay between posts – especially on a topic that garnered such interest!

This is part two of a review of the Accessibility Support Documentation for PDF. Reading through the document is quite reassuring, with every single success criterion (even the AAA ones) either supported by Adobe, or the responsibility of the document author.

It’s only when one reads the Appendices that it becomes apparent that all is not as it seems. Adobe PDF does fail in some serious ways; it just seems to have escaped the author of the Accessibility Support document.

Firstly, in part one I looked at the lack of testing (and make sure you read the comments on that post, because there is one from Adobe!).

Secondly, let’s look at the document in more detail.

Correctly tagged headings

In response to WCAG2 Success Criterion 1.3.1: Info and Relationships (“Information, structure, and relationships conveyed through presentation can be programmatically determined or are available in text”, Level A), Adobe says:

“PDF provides a variety of ways to convey information and relationships with semantic elements such as headings, lists, tables, and paragraphs. The ISO 32000-1 PDF Specification details structure types in section 14.8.4.

Testing results

  • Test id 4 (Correctly tagged paragraphs)
  • Test id 5 (Correctly tagged headings)
  • Test id 6 (Correctly tagged controls and input elements)”

However, if one reads Test id 4, 5 and 6 in the Appendices, it turns out that Window-Eyes does not read headings. If you don’t believe me, read the Appendices, from which I quote:

“Headings are identified. (WE [Window-Eyes] does not identify headers but does read text)”

Headings are extremely important to screen reader users. They allow users to scan a document to determine which section they need to read. And this is especially important in PDF documents, because PDF documents tend to be much longer than your average web page – that’s a lot of text to trawl through if you can’t jump from heading to heading.

You know, headings are so important that they are mentioned at WCAG2 Level A twice. Success Criterion 2.4.1: Bypass Blocks (“A mechanism is available to bypass blocks of content that are repeated on multiple Web pages”) also requires headings, so that screen reader users can jump from one place to another. Adobe says:

“PDF allows documents to be tagged with headings which can be used in conjunction with assistive technologies to bypass sections of content. PDF also provides bookmarking functionality that allows keyboard users to accomplish similar bypassing.

Testing results

  • Test id 9 (Correctly tagged headings)
  • Test id 10 (PDF bookmarks)”

Once again, if one reads Test id 9 and 10 in the Appendices, it turns out that Window-Eyes does not read headings, but it does read bookmarks. So, instead of using headings to break up your PDF you could use bookmarks, but then a review of the Appendices shows that JAWS doesn’t read bookmarks. Now this is a real problem. It means you need to mark up every heading in your PDF twice: once as a heading (so JAWS will read it), and once as a bookmark (so Window-Eyes will read it). And remember, I talked about the scarcity of testing previously, so there could be assistive technology out there that reads neither.
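To make the double-markup problem concrete, here is a minimal sketch in Python. The support table only encodes my reading of the Appendices for the two screen readers discussed above; it is an assumption for illustration, not an official compatibility matrix, and the function name `mechanisms_needed` is made up.

```python
from itertools import combinations

# Navigation support as this post reads the Appendices (an assumption,
# not an official compatibility matrix).
SUPPORT = {
    "JAWS": {"headings"},
    "Window-Eyes": {"bookmarks"},
}


def mechanisms_needed(support):
    """Return a smallest set of mechanisms that reaches every listed reader.

    Returns None if no combination covers every reader, for example when
    some reader supports none of the mechanisms.
    """
    all_mechanisms = sorted(set().union(*support.values()))
    for size in range(1, len(all_mechanisms) + 1):
        for combo in combinations(all_mechanisms, size):
            chosen = set(combo)
            # Every reader must support at least one chosen mechanism.
            if all(features & chosen for features in support.values()):
                return chosen
    return None


print(sorted(mechanisms_needed(SUPPORT)))  # ['bookmarks', 'headings']
```

With only these two readers in the table, neither mechanism is enough on its own, which is why every heading ends up being marked up twice. Adding a reader that supports neither mechanism makes the function return `None`.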

A few more accessibility concerns…

  • The abbreviation feature is not supported by the magnifier ZoomText.
  • You can embed media, but there is no equivalent of the HTML NOEMBED feature. This means users cannot set an alternative for any media they embed in the PDF.
  • You can only adjust the primary background colour of the PDF – not the background colour of regions of a page.

My conclusions

PDF is not an accessible technology… yet. I do commend Adobe for being one of the first companies to address accessibility issues in their products; however, they still have a long way to go to match the accessibility features of X/HTML. The difficulty in tagging a PDF also essentially prohibits accessible PDFs from being developed in the mainstream media.

18 thoughts on “A few problems with the concept of accessible PDFs, Part Two”

  1. AlastairC says:

    Playing devil’s advocate though, aren’t these issues with the user agent rather than the format?

    I realise that was the aim of the previous article, but I’ve had issues with JAWS and HTML previously that are similar in nature. For example, having a link in a heading meant the heading wasn’t read out (probably fixed now, but these sorts of issues come and go in screen readers).

    Don’t get me wrong, I do think there are issues with PDF use, but aren’t the issues here for user-agent vendors to fix?

  2. steve faulkner says:

    The difficulty in tagging a PDF also essentially prohibits accessible PDFs from being developed in the mainstream media.

    I created a PDF from part of your blog post: I copied it into Word and then converted it to PDF. I didn’t have to tag it; tags were added in the conversion process. NVDA, the free open source screen reader for Windows, had no problem identifying and navigating by the correctly tagged headings in the resulting PDF.

    Window Eyes also has poor support for some HTML4 elements, does that mean HTML is an inaccessible technology?

    1. Gian says:

      …I created a PDF from part of your blog post: I copied it into Word and then converted it to PDF. I didn’t have to tag it; tags were added in the conversion process. NVDA, the free open source screen reader for Windows, had no problem identifying and navigating by the correctly tagged headings in the resulting PDF….

      Hi Steve,
      Certainly creating a PDF from text is very easy. It is when creating a PDF from a complex document (one with images and tables etc) that it is difficult. Some of my work comes from companies that make accessible PDFs and then contract the difficult ones to me. For example the eGovernment Accessibility Toolkit (PDF, 3MB) was very difficult to tag in PDF.
      Cheers,
      Gian

  3. Samuel says:

    You wrote: “PDF is not an accessible technology”. That’s not true in my opinion.

    If WindowEyes does not support headings, this is a problem of WindowEyes. If JAWS does not support bookmarks, this is a problem of JAWS. If ZoomText does not support abbreviations, this is a problem of ZoomText.

    PDF allows an alternative text to be specified for every piece of embedded media. See section 12.5.2 in ISO 32000-1:2008.

    If a PDF reader does not allow the background color of regions of a page to be adjusted, this is a problem of the PDF viewer.

    The problems you mentioned are not part of the PDF technology itself. Rather, the screen readers are not ready for accessible PDFs yet.

    Samuel

    1. Gian says:

      …You wrote: “PDF is not an accessible technology”. That’s not true in my opinion. If WindowEyes does not support headings, this is a problem of WindowEyes….

      …Playing devil’s advocate though, aren’t these issues with the user agent rather than the format?…

      Hi Samuel and Alastair
      That is an interesting position to take – and certainly one that the W3C would agree with. But until screen reader manufacturers catch up, the end result is the same: people with disabilities are unable to use PDFs. This is what (in Australia) organisations like the Australian Government Information Management Office (AGIMO) are most interested in. It is whether people with disabilities can actually use the PDF that will inform whether AGIMO decide that PDF is defined as an “accessible technology” within the confines of WCAG2.
      Cheers,
      Gian

  4. Gian,
    Thanks for the follow-up post. I’m pleased to see how short your list of concerns is. I have a few comments:

    Related to your comments on heading support, the question that I keep coming back to is “how much is enough?”. Many people advocated for web developers to use headings in the past, prior to JAWS, Window-Eyes, or other assistive technologies providing support for headings. HTML contained support for headings, browsers supported identification of headings in their DOMs, and eventually assistive technologies added support for headings. I’m not sure that I could say that all assistive technologies support headings in HTML – I suspect that most do, but don’t have data on that. Similarly, PDF provides support for headings, Adobe Reader supports identification of headings in its DOM, and assistive technologies such as JAWS and NVDA have added support for headings. More will in the future, I’m sure.

    The follow-up question is “how much does it matter?”. Assistive technologies are products that are designed for their users. If a company that makes an assistive technology chooses to not implement support for headings in a product that supports headings within the content, does that mean that the non-assistive technology product isn’t accessible? It is a tough problem, since on one hand it is important for end-users to have support for features like headings, but on the other hand we can’t force any assistive technology vendor to do anything in a product that we do not control.

    You express dismay that authors might need to implement support for headings and bookmarks in PDF: “It means you need to markup every heading in your PDF twice: once as a heading (so JAWS will read it), and once as a bookmark (so WindowEyes will read it).” Is this different from what you need to do in HTML? It seems that document navigation support by headings in HTML is focused on screen readers, but keyboard users can’t use headings to navigate (unless you count Opera’s long-standing support for heading navigation). So an HTML author also needs to add skip-nav links to support keyboard users, right? Fortunately, creating bookmarks in PDF is done in the same step as headings, at least for Adobe Acrobat, so it isn’t really two separate steps.

    You cite concerns about embedded media. “This means users cannot set an alternative for any media they embed in the PDF.” Authors can provide equivalents such as captions or audio descriptions for media in a PDF. Authors can also link to or provide additional alternative versions, should an author decide to provide a video in Flash within the PDF and wants to also provide a link to the WindowsMedia version of the file. Of course for video the author can provide H.264 video and it can be played with the Flash Player or a different player, depending on the user’s choice.

    In your conclusion you say “The difficulty in tagging a PDF also essentially prohibits accessible PDFs from being developed in the mainstream media” but you didn’t write about this in your post at all so I’m not sure what the support for this conclusion is. If you are using a tool that supports accessibility well (e.g. Acrobat, Microsoft Word, Open Office, to name a few) then creating a tagged PDF is easy. There are also more production-scale tools that support accessibility. I’d venture a guess that the issue with the production of accessible PDF is the same in the media as it is for the production of accessible web pages – there is less knowledge on accessibility among these authors than is needed.

    Thanks again for your interest in PDF accessibility.
    AWK

    1. Gian says:

      …Thanks for the follow-up post. I’m pleased to see how short your list of concerns is. I have a few comments…

      Hi Andrew

      Thanks for coming back and responding again. I suppose my answer to “how much is enough?” and “how much does it matter?” is that it matters a lot to people who can’t use PDF. But I am sure things will improve with time. I believe it is important to highlight those things that need to be improved so that people don’t blindly say “PDF is accessible”, when there are a few things that still need fixing.

      With regard to my comment on the difficulty of tagging PDF – it is true I didn’t mention this in my post; it has been a personal observation. I get many requests from people having difficulty tagging complex PDFs – not because they don’t know what to do, but because their version of Adobe Reader crashes. These people are trying to do their very best in accessibility and the product is letting them down.

      And you are quite right: knowledge about the accessibility of PDFs is like knowledge of the accessibility of X/HTML; there will be many people who know little of either. Unlike HTML though, PDF has an auto-tagging function which tends to do 80% of the work, and I do commend Adobe for creating such a feature.

      Cheers,
      Gian

  5. Cliff Tyllick says:

    Gian, have you actually tested PDFs in Window Eyes? I have never before heard a complaint that Window Eyes does not read headings. I would be shocked if any screen reader were to omit the headings and read only the text between them.

    But that’s not what the Appendix says. It says that Window Eyes fails to read headers. Headers are not headings, in spite of the many people in this field who use the terms as if they are interchangeable. A header is the region above the text window, where a print document will usually display a running head — for example, the heading of the current chapter on odd pages, and the short title of the document on even pages. A heading, by contrast, is a text block that is assigned a semantic level above the level of body text.

    Until about two years, maybe four, ago, JAWS did not read headers, either — or, if it did, getting it to do so took a keystroke combination that was unknown to the quite proficient JAWS users I know.

    So, if you haven’t already — and your post suggests that you didn’t test this, but merely got the idea from the Appendix — you should verify that Window Eyes ignores headings. I suspect you’ll find that’s not true, but I’m eager to find out what you discover.

    1. Gian says:

      …Gian, have you actually tested PDFs in Window Eyes? I have never before heard a complaint that Window Eyes does not read headings….

      Hi Cliff
      Maybe this is something Andrew Kirkpatrick can answer – whether headers refer to headings or to essentially what I would call titles. I assumed headers meant headings, because the reference to headers was immediately underneath the JAWS reference to headings. However, I will test this and get back to you.
      Cheers,
      Gian

  6. AlastairC says:

    Gian wrote:
    “But until screen reader manufacturers catch up, the end result is the same: people with disabilities are unable to use PDFs.”

    So what is the next step? Websites already have millions of PDFs, so surely the best next-step *is* for screen reader manufacturers to catch up?

    Adobe (and the ISO process for the PDF standards) have done what is in their power to do; the only thing that will make screen reader vendors update (from previous experience) will be pressure from their users.

    Apart from the narrow case of NVDA and ARIA/HTML5, screen readers always trail mainstream usage, as they tend to wait for things to features.

    If we go around saying that PDF is not accessible (*very* disputed), then the screen reader vendors have an excuse not to improve support. If we maintain that the format is accessible, it moves the responsibility to the vendor.

    1. Gian says:

      …So what is the next step? Websites already have millions of PDFs, so surely the best next-step *is* for screen reader manufacturers to catch up?…

      Hi Alastair

      Well I don’t think screen reader manufacturers are deliberately not trying to work with PDFs, but PDFs have been around for a long time now and the screen readers still don’t manage the PDFs that well. Doesn’t that suggest that perhaps the screen reader manufacturers can’t make a screen reader that reads a PDF properly? Doesn’t that suggest that it is up to Adobe to create PDFs that screen readers can interpret? I really don’t think screen reader manufacturers are looking for an excuse not to support PDFs.

      And in terms of whether the PDF format is accessible or not – without thorough testing we cannot tell. And as I mentioned in my first post, very little testing was completed. A small range of assistive technologies was tested, on a very small number of operating systems and browsers. This document is certainly not enough to declare that PDF is accessible.

      Oh and by the way, saying that web sites have millions of PDFs and therefore it’s all up to the screen reader manufacturers is a little like saying that web sites have millions of images without ALT attributes, so the screen reader manufacturers should come up with an algorithm to describe images.

      Gian

  7. AlastairC says:

    Sorry, by “as they tend to wait for things to features.” I meant:
    as they tend to wait for things to be common before implementing new features.

  8. A few more comments:
    You wrote: “I suppose my answer to “how much is enough?” and “how much does it matter?” is that it matters a lot to people who can’t use PDF.”

    You’re not answering my question. You seem to be viewing the lack of heading support in Window-Eyes as an indication of an absolute barrier for Window-Eyes users. I agree with Alastair’s comment that end users who want the features need to advocate for them with their user agent of choice, but in the meantime, the users do have access to the information even if they don’t hear that it is a heading. This is not as good, no question, but it is the same situation that users had with HTML not so long ago.

    You wrote: I believe it is important to highlight those things that need to be improved so that people don’t blindly say “PDF is accessible”, when there are a few things that still need fixing.

    Agreed – but you can replace PDF in your sentence above with any technology. There is no technology, including HTML, that you can say is fully accessible with no qualification.

    You wrote: “I get many requests from people having difficulty tagging complex PDFs – not because they don’t know what to do, but because their version of Adobe Reader crashes. These people are trying to do their very best in accessibility and the product is letting them down.”

    I assume that you mean Acrobat instead of Reader. If people are experiencing this they should send the info in to Adobe since this would be a bug that we’d want to address. I haven’t experienced Acrobat crashing while working on PDF document tags myself, so customer comments are appreciated.

    You wrote: “For example the eGovernment Accessibility Toolkit (PDF, 3MB) was very difficult to tag in PDF”

    That’s a 272-page document that was created in Microsoft Word and exported with tagging disabled. 272 pages of anything would be a challenge, including semantic HTML, but the author(s) didn’t do one of the main steps correctly, and that created a lot of unnecessary work.

    AWK

    1. Gian says:

      …You’re not answering my question…

      I presume you refer to the question “how much is enough”? I think that product manufacturers like Adobe should be continually looking at accessibility.

      In 1998, the very first time I ran a seminar on accessibility, I was asked about the accessibility of PDFs. I am always asked about the accessibility of PDFs. I would love to say that they are accessible, but I just don’t believe they are. I believe that X/HTML has many more accessibility features than PDF and that they are both easier to implement and more commonly implemented than PDF tagging. But I don’t want to get into an argument about the accessibility of PDFs because I simply haven’t tested them for accessibility.

      What I would like to see is a comprehensive review of PDFs, testing them against a variety of assistive technologies and with various people with different disabilities to ascertain exactly how accessible they are/n’t. But I haven’t seen this testing done, and I believe the “Accessibility Support Documentation for PDF” is misleading in indicating that this kind of comprehensive testing has been completed.

      Gian

  9. Gian,
    You’ve said a number of times that very little testing was done, but you are still pointing to the accessibility support documentation for PDF, which is not the document that you initially thought that it was. It was created to indicate how the PDF format could address the assistive technology-specific items within WCAG 2.0, and the requirement for the WCAG 2.0 release was that implementors provide data on four combinations of user agents and assistive technologies. We provided six, but I certainly will not say that this represents all of the tools that we want to support PDF, nor will I say that this is all of the testing that we’ve done. We have done more testing, and will share more of that data in the future, but for the WCAG 2.0 implementation report this is what we provided.

    It is a little ironic that you say “But I don’t want to get into an argument about the accessibility of PDFs because I simply haven’t tested them for accessibility.” — I’m not trying to argue either – I just want the correct information out there – but surely you can’t believe that you aren’t putting forth an argument when you post a series on your blog titled “A few problems with the concept of accessible PDFs”?

    AWK

    1. Gian says:

      You’ve said a number of times that very little testing was done, but you are still pointing to the accessibility support documentation for PDF, which is not the document that you initially thought that it was… (AWK)

      I keep referring to the very little testing done because I believe that the document is pretending to be something it’s not. It is called “Accessibility support documentation for PDF”, but it isn’t really about the accessibility support of PDFs at all – it’s about a very limited range of testing completed, that was then obfuscated between the report itself and the Appendices.

      …We have done more testing, and will share more of that data in the future, but for the WCAG 2.0 implementation report this is what we provided… (AWK)

      I would very much like to see further testing. Perhaps your testing has shown that there is much more accessibility support for PDF than that document indicates, but (to be a cynic), I would assume you would have added that data into the document if it supported your case that PDFs are accessible.

      …It is a little ironic that you say “But I don’t want to get into an argument about the accessibility of PDFs because I simply haven’t tested them for accessibility.” — I’m not trying to argue either – I just want the correct information out there – but surely you can’t believe that you aren’t putting forth an argument when you post a series on your blog titled “A few problems with the concept of accessible PDFs”? (AWK)

      Once again this purely comes back to the document and how I believe it is misleading. I believe the document (if one didn’t know the scope of testing, and didn’t have access to the Appendices) is misleading in that it apparently talks about the accessibility support for PDFs. If you can prove to me that PDFs are accessible then that’s great (and a lot of people will be happier), but this document is misleading in that it indicates that PDFs are accessible, when little testing was done, and some of the tests failed. That’s my problem: the concept put out there by this document that PDFs are accessible, when this document can declare no such thing.

  10. […] I have finally had a chance to read through the extensive AGIMO study into PDFs. It’s a comprehensive review of PDFs and their accessibility, and the authors should be commended for completing such detailed testing while still being able to explain the findings in plain English. It is the most thorough review of PDFs that I have seen, and it confirms some of my previous statements. […]
