I Used The Web For A Day Using A Screen Reader

Chris Ashton

2018-12-19T13:30:48+01:002018-12-19T13:25:05+00:00

This article is part of a series in which I attempt to use the web under various constraints, representing a given demographic of user. I hope to raise the profile of difficulties faced by real people, which are avoidable if we design and develop in a way that is sympathetic to their needs. Last time, I navigated the web for a day with just my keyboard. This time around, I’m avoiding the screen and am using the web with a screen reader.

What Is A Screen Reader?

A screen reader is a software application that interprets things on the screen (text, images, links, and so on) and converts these to a format that visually impaired people are able to consume and interact with. Two-thirds of screen reader users choose speech as their screen reader output, and one-third of screen reader users choose braille.

Screen readers can be used with programs such as word processors, email clients, and web browsers. They work by mapping the contents and interface of the application to an accessibility tree that can then be read by the screen reader. Some screen readers have to manually map specific programs to the tree, whereas others are more generic and should work with most programs.

Accessibility Originates With UX

You need to ensure that your products are inclusive and usable for disabled people. A BBC iPlayer case study, by Henny Swan. Read article ?

Chart showing popularity of desktop screen readers ranks JAWS first, NVDA second and VoiceOver third. — Pie chart from the Screen Reader Survey 2017, showing that JAWS, NVDA and VoiceOver are the most used screen readers on desktop. (Large preview)

On Windows, the most popular screen reader is JAWS, with almost half of the overall screen reader market. It is commercial software, costing around a thousand dollars for the home edition. An open-source alternative for Windows is NVDA, which is used by just under a third of all screen reader users on desktop.

There are other alternatives, including Microsoft Narrator, System Access, Window-Eyes and ZoomText (not a full-screen reader, but a screen magnifier that has reading abilities); the combined sum of these equates to about 6% of screen reader usage. On Linux, Orca is bundled by default on a number of distributions.

The screen reader bundled into macOS, iOS and tvOS is VoiceOver. VoiceOver makes up 11.7% of desktop screen reader users and rises to 69% of screen reader users on mobile. The other major screen readers in the mobile space are Talkback on Android (29.5%) and Voice Assistant on Samsung (5.2%), which is itself based on Talkback, but with additional gestures.

Table showing popularity of mobile screen readers. Ranks VoiceOver first, Talkback second, Voice Assistant third. — Popularity of mobile screen readers: Ranks VoiceOver first, Talkback second, Voice Assistant third. (Large preview)

I have a MacBook and an iPhone, so will be using VoiceOver and Safari for this article. Safari is the recommended browser to use with VoiceOver, since both are maintained by Apple and should work well together. Using VoiceOver with a different browser can lead to unexpected behaviors.

How To Enable And Use Your Screen Reader

My instructions are for VoiceOver, but there should be equivalent commands for your screen reader of choice.

VoiceOver On Desktop

If you’ve never used a screen reader before, it can be a daunting experience. It’s a major culture shock going to an auditory-only experience, and not knowing how to control the onslaught of noise is unnerving. For this reason, the first thing you’ll want to learn is how to turn it off.

The shortcut for turning VoiceOver off is the same as the shortcut for turning it on: ? + F5 (? is also known as the Cmd key). On newer Macs with a touch bar, the shortcut is to hold the command key and triple-press the Touch ID button. Is VoiceOver speaking too fast? Open VoiceOver Utility, hit the ‘Speech’ tab, and adjust the rate accordingly.

Once you’ve mastered turning it on and off, you’ll need to learn to use the “VoiceOver key” (which is actually two keys pressed at the same time): Ctrl and ? (the latter key is also known as “Option” or the Alt key). Using the VO key in combination with other keys, you can navigate the web.

For example, you can use VO + A to read out the web page from the current position; in practice, this means holding Ctrl + ? + A. Remembering what VO corresponds to is confusing at first, but the VO notation is for brevity and consistency. It is possible to configure the VO key to be something else, so it makes sense to have a standard notation that everyone can follow.

You may use VO and arrow keys (VO + ? and VO + ?) to go through each element in the DOM in sequence. When you come across a link, you can use VO + Space to click it — you’ll use these keys to interact with form elements too.

Huzzah! You now know enough about VoiceOver to navigate the web.

VoiceOver On Mobile

The mobile/tablet shortcut for turning on VoiceOver varies according to the device, but is generally a ‘triple click’ of the home button (after enabling the shortcut in settings).

You can read everything from the current position with a Two-Finger Swipe Down command, and you can select each element in the DOM in sequence with a Swipe Right or Left.

You now know as much about iOS VoiceOver as you do desktop!

Navigating By Content Type

Think about how you use the web as a sighted user. Do you read every word carefully, in sequence, from top to bottom? No. Humans are lazy by design and have learned to ‘scan’ pages for interesting information as fast as possible.

Screen reader users have this same need for efficiency, so most will navigate the page by content type, e.g. headings, links, or form controls. One way to do this is to open the shortcuts menu with VO + U, navigate to the content type you want with the ? and ? arrow keys, then navigate through those elements with the ?? keys.

screenshot of 'Practice Webpage Navigation' VoiceOver tutoriasl screen — (Large preview)

Another way to do this is to enable ‘Quick Nav’ (by holding ? along with ? at the same time). With Quick Nav enabled, you can select the content type by holding the ? arrow alongside ? or ?. On iOS, you do this with a Two-Finger Rotate gesture.

screenshot of rota in VoiceOver, currently on 'Headings' — Setting the rotor item type using keyboard shortcuts. (Large preview)

Once you’ve selected your content type, you can skip through each rotor item with the ?? keys (or Swipe Up or Down on iOS). If that feels like a lot to remember, it’s worth bookmarking this super handy VoiceOver cheatsheet for reference.

A third way of navigating via content types is to use trackpad gestures. This brings the experience closer to how you might use VoiceOver on iOS on an iPad/iPhone, which means having to remember only one set of screen reader commands!

screenshot of ‘Practice Trackpad Gestures' VoiceOver tutorial screen — (Large preview)

You can practice the gesture-based navigation and many other VoiceOver techniques in the built-in training program on OSX. You can access it through System Preferences ? Accessibility ? VoiceOver ? Open VoiceOver Training.

After completing the tutorial, I was raring to go!

Case Study 1: YouTube

Searching On YouTube

I navigated to the YouTube homepage in the Safari toolbar, upon which VoiceOver told me to “step in” to the web content with Ctrl + ? + Shift + ?. I’d soon get used to stepping into web content, as the same mechanism applies for embedded content and some form controls.

Using Quick Nav, I was able to navigate via form controls to easily skip to the search section at the top of the page.

screenshot of YouTube homepage — When focused on the search field, VoiceOver announced: ‘Search, search text field Search’. (Large preview)

I searched for some quality content:

Screenshot of 'impractical jokers' in input field — Who doesn’t love Impractical Jokers? (Large preview)

And I navigated to the search button:

VoiceOver announces “Press Search, button.” (Large preview)

However, when I activated the button with VO + Space, nothing was announced.

I opened my eyes and the search had happened and the page had populated with results, but I had no way of knowing through audio alone.

Puzzled, I reproduced my actions with devtools open, and kept an eye on the network tab.

As suspected, YouTube is making use of a performance technique called “client-side rendering” which means that JavaScript intercepts the form submission and renders the search results in-place, to avoid having to repaint the entire page. Had the search results loaded in a new page like a normal link, VoiceOver would have announced the new page for me to navigate.

There are entire articles dedicated to accessibility for client-rendered applications; in this case, I would recommend YouTube implements an aria-live region which would announce when the search submission is successful.

Tip #1: Use aria-live regions to announce client-side changes to the DOM.

<div role="region" aria-live="polite" class="off-screen" id="search-status"></div>

<form id="search-form">
  <label>
    <span class="off-screen">Search for a video</span>
    <input type="text" />
  </label>
  <input type="submit" value="Search" />
</form>

<script>
  document.getElementById('search-form').addEventListener('submit', function (e) {
    e.preventDefault();
    ajaxSearchResults(); // not defined here, for brevity
    document.getElementById('search-status').textContent = 'Search submitted. Navigate to results below.'; // announce to screen reader
  });
</script>

Now that I’d cheated and knew there were search results to look at, I closed my eyes and navigated to the first video of the results, by switching to Quick Nav’s “headings” mode and then stepping through the results from there.

Playing Video On YouTube

As soon as you load a YouTube video page, the video autoplays. This is something I value in everyday usage, but this was a painful experience when mixed with VoiceOver talking over it. I couldn’t find a way of disabling the autoplay for subsequent videos. All I could really do was load my next video and quickly hit CTRL to stop the screen reader announcements.

Tip #2: Always provide a way to suppress autoplay, and remember the user’s choice.

The video itself is treated as a “group” you have to step into to interact with. I could navigate each of the options in the video player, which I was pleasantly surprised by — I doubt that was the case back in the days of Flash!

However, I found that some of the controls in the player had no label, so ‘Cinema mode’ was simply read out as “button”.

screenshot of YouTube player — Focussing on the ‘Cinema Mode’ button, there was no label indicating its purpose. (Large preview)

Tip #3: Always label your form controls.

Whilst screen reader users are predominantly blind, about 20% are classed as “low vision”, so can see some of the page. Therefore, a screen reader user may still appreciate being able to activate “Cinema mode”.

These tips aren’t listed in order of importance, but if they were, this would be my number one:

Tip #4: Screen reader users should have functional parity with sighted users.

By neglecting to label the “cinema mode” option, we’re excluding screen reader users from a feature they might otherwise use.

That said, there are cases where a feature won’t be applicable to a screen reader — for example, a detailed SVG line chart which would read as a gobbledygook of contextless numbers. In cases such as these, we can apply the special aria-hidden="true" attribute to the element so that it is ignored by screen readers altogether. Note that we would still need to provide some off-screen alternative text or data table as a fallback.

Tip #5: Use aria-hidden to hide content that is not applicable to screen reader users.

It took me a long time to figure out how to adjust the playback position so that I could rewind some content. Once you’ve “stepped in” to the slider (VO + Shift + ?), you hold ? + ?? to adjust. It seems unintuitive to me but then again it’s not the first time Apple have made some controversial keyboard shortcut decisions.

Autoplay At End Of YouTube Video

At the end of the video I was automatically redirected to a new video, which was confusing — no announcement happened.

screenshot of autoplay screen on YouTube video — There’s a visual cue at the end of the video that the next video will begin shortly. A cancel button is provided, but users may not trigger it in time (if they know it exists at all!) (Large preview)

I soon learned to navigate to the Autoplay controls and disable them:

In-video autoplay disable. (Large preview)

This doesn’t prevent a video from autoplaying when I load a video page, but it does prevent that video page from auto-redirecting to the next video.

Case Study 2: BBC

As news is something consumed passively rather than by searching for something specific, I decided to navigate BBC News by headings. It’s worth noting that you don’t need to use Quick Nav for this: VoiceOver provides element search commands that can save time for the power user. In this case, I could navigate headings with the VO + ? + H keys.

The first heading was the cookie notice, and the second heading was a

entitled ‘Accessibility links’. Under that second heading, the first link was a “Skip to content” link which enabled me to skip past all of the other navigation.

“Skip to content” link is accessible via keyboard tab and/or screen reader navigation. (Large preview)

‘Skip to content’ links are very useful, and not just for screen reader users; see my previous article “I used the web for a day with just a keyboard”.

Tip #6: Provide ‘skip to content’ links for your keyboard and screen reader users.

Navigating by headings was a good approach: each news item has its own heading, so I could hear the headline before deciding whether to read more about a given story. And as the heading itself was wrapped inside an anchor tag, I didn’t even have to switch navigation modes when I wanted to click; I could just VO + Space to load my current article choice.

Headings are also links on the BBC. (Large preview)

Whereas the homepage skip-to-content shortcut linked nicely to a #skip-to-content-link-target anchor (which then read out the top news story headline), the article page skip link was broken. It linked to a different ID (#page) which took me to the group surrounding the article content, rather than reading out the headline.

“Press visited, link: Skip to content, group” — not the most useful skip link result. (Large preview)

At this point, I hit VO + A to have VoiceOver read out the entire article to me.

It coped pretty well until it hit the Twitter embed, where it started to get quite verbose. At one point, it unhelpfully read out “Link: 1068987739478130688”.

VoiceOver can read out long links with no context. (Large preview)

This appears to be down to some slightly dodgy markup in the video embed portion of the tweet:

We have an anchor tag, then a nested div, then an img with an alt attribute with the value: “Embedded video”. (Large preview)

It appears that VoiceOver doesn’t read out the alt attribute of the nested image, and there is no other text inside the anchor, so VoiceOver does the most useful thing it knows how: to read out a portion of the URL itself.

Other screen readers may work fine with this markup — your mileage may vary. But a safer implementation would be the anchor tag having an aria-label, or some off-screen visually hidden text, to carry the alternative text. Whilst we’re here, I’d probably change “Embedded video” to something a bit more helpful, e.g. “Embedded video: click to play”).

The link troubles weren’t over there:

One link simply reads out “Link: 1,887”. (Large preview)

Under the main tweet content, there is a ‘like’ button which doubles up as a ‘likes’ counter. Visually it makes sense, but from a screen reader perspective, there’s no context here. This screen reader experience is bad for two reasons:

I don’t know what the “1,887” means.
I don’t know that by clicking the link, I’ll be liking the tweet.

Screen reader users should be given more context, e.g. “1,887 users liked this tweet. Click to like.” This could be achieved with some considerate off-screen text:

<style>
  .off-screen {
    clip: rect(0 0 0 0);
    clip-path: inset(100%);
    height: 1px;
    overflow: hidden;
    position: absolute;
    white-space: nowrap;
    width: 1px;
  }
</style>

<a href="/tweets/123/like">
  <span class="off-screen">1,887 users like this tweet. Click to like</span>
  <span aria-hidden="true">1,887</span>
</a>

Tip #7: Ensure that every link makes sense when read in isolation.

I read a few more articles on the BBC, including a feature ‘long form’ piece.

Reading The Longer Articles

Look at the following screenshot from another BBC long-form article — how many different images can you see, and what should their alt attributes be?

Screenshot of BBC article containing logo, background image, and foreground image (with caption). (Large preview)

Firstly, let’s look at the foreground image of Lake Havasu in the center of the picture. It has a caption below it: “Lake Havasu was created after the completion of the Parker Dam in 1938, which held back the Colorado River”.

It’s best practice to provide an alt attribute even if a caption is provided. The alt text should describe the image, whereas the caption should provide the context. In this case, the alt attribute might be something like “Aerial view of Lake Havasu on a sunny day.”

Note that we shouldn’t prefix our alt text with “Image: ”, or “Picture of” or anything like that. Screen readers already provide that context by announcing the word “image” before our alt text. Also, keep alt text short (under 16 words). If a longer alt text is needed, e.g. an image has a lot of text on it that needs copying, look into the longdesc attribute.

Tip #8: Write descriptive but efficient alt texts.

Semantically, the screenshot example should be marked up with