Well, by now I have told some people about my idea and have received only
positive feedback so far. Others appear interested. So I decided to write down
what I have come up with so far on how computers can be made a bit more
accessible for people with a hearing handicap.
This entry is a translation of
my German one.
But first, what is this post about?
- State of the art
- My motivation
- The application: Point of view of a user
- The application: Point of view of a developer
- Potential problem areas
State of the art
When it comes to "accessibility" on the internet, it's almost always
about users with a visual handicap. I can understand this, since these people
face the biggest hurdles. But there are other kinds of users who are
confronted with barriers, too. Just think of people with learning disabilities,
or hard-of-hearing people (h-o-h for short).
So I decided to focus on people with a hearing handicap, especially on sign
language users, since I sympathise with them.
You could say that people with a hearing handicap are the lucky ones.
They're almost always sighted and hence can enjoy web applications the way the
developers and designers intended. Everything's fine. Okay, sometimes one
has to add some subtitles, and that's it. You'd think.
But you're overlooking that the spoken language isn't their mother tongue.
Better think of h-o-h people as foreigners who aren't able to fully understand
your language. You have to think hard about how to phrase your sentences when
writing to them. Another drawback is that emotions and opinions expressed
between the lines don't always reach the signing reader.
Several attempts have been made to introduce an avatar for signing. A German
deaf blogger, HeWriteSilent, lists some of them
in his blog entry.
For example, have a look at
Or the fairly recent SiMAX.
You'll find more when you enter "sign language avatar" into your favourite
search engine. But none of them has managed to break through. Julia Probst,
a deaf member of the German Pirate Party, explained why.
A few days ago, this news flooded the press within the deaf community:
Although I'm fascinated by it (and already was when I saw a first video back
in June), I see some problems:
- It is proprietary. This implies that a user cannot be sure of not being
spied on. Besides, you cannot prevent the software from encroaching upon areas
it wasn't meant to access. Moreover, you cannot prevent the project from
being discontinued at some point in the future (say, because it isn't
- It's by Microsoft. Not only have they yet to prove that they are savvy in
security, but I'm also afraid of them restricting the software to their own
operating system. This ties in with the first point. And APIs don't last
forever (see the recent Skype API shutdown).
- It's commercial. Because the source code isn't open (see the first point),
Microsoft is able to dictate the price. And when you take an interest in the
affairs of h-o-h people, you'll quickly recognise that they have to face
several burdens. Yes, health insurance funds and social assistance offices pay
for most of the accessible technology, but I'm afraid they won't do so forever.
Moreover, people depend on their goodwill.
My motivation
Originally, I wanted to work on a free software solution on the q.t.
My motivation arises from these three sources:
However, I do know of one Indian developer who has made an important
contribution. His name is Krishnakant [Mane]. He came to a talk I gave, and
said that there was no Free Software that could speak words from the screen,
and so asked what he ought to do? I said, “Write some.” So a few years later,
he came to one of my talks, and reminded me of what I had said to him earlier
— and said that he had “written some”. So now he has made major contributions
to screen-reading software, which he and thousands of other people use.
- My h-o-h daughter. I want to prepare a world for her, where she can unfold
As mentioned above, it was meant to be developed on the q.t. Then I wanted to
showcase a proof of concept and ask the experts for support. But then I read
Aaron Seigo's thoughts on introducing new ideas to free software communities.
Besides, a friend of mine from my Google+ days
wrote a few lines in his blog
which tipped the scales (translated by me):
For me, freedom includes feeling unobserved in my life and not being bound
to something whose sole purpose is to treat me as a commodity.
It's more than absurd. I get the impression that some providers treat me
dismissively and patronisingly, and rip me off. A state I wouldn't knowingly
expose anyone to. Modesty forbids it.
So I will publish it on GitHub sooner. But I need help, too. I'm especially
unsure about the choice of license.
I think I should talk about it on IRC. But more on that below.
The application: Point of view of a user
Let's begin with the front end, that is, the piece of software an end user
will most likely get to see. At first, I want to write for the web. I imagine a
small icon (e.g. a hand, a common symbol among sign language users) beneath
the text, indicating that a translation is available. Later on I want to offer
native programs. Because I like Python very much, I will write that
software in Python, using an SVG library for Python. But that's in
the distant future. My intention is not to translate chats
(because that would need much more information in real time to decipher
emotions and so on), but to offer something that easily translates websites
into sign
I just mentioned SVG. This is an image format for so-called vector graphics.
These have the advantage of storing a picture not as a set of points but as a
set of curves and lines. Hence, a sharp image can be guaranteed at any size.
Besides that, it's an official
standard of the W3C, the World Wide Web
Consortium, which also defines the rules for HTML. Moreover, it is written in
XML, so it's plain text. On the one hand, it's highly compressible, and on the
other hand, text is easily modifiable.
Text can be embedded as well :-)
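To make the "plain text" point a bit more concrete, here is a minimal Python
sketch that uses only the standard library. The hand icon, its coordinates and
the function name `build_hand_icon` are made up for illustration and are not
part of the project. It builds a small SVG, changes its colour by editing an
attribute, and compares the size of the text with its gzipped size.

```python
import gzip
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # serialise without an "ns0:" prefix


def build_hand_icon(stroke="black"):
    """Return a tiny, made-up hand icon as an SVG element tree."""
    svg = ET.Element(f"{{{SVG_NS}}}svg", {"viewBox": "0 0 100 100"})
    # The palm is a circle, the five fingers are straight lines.
    ET.SubElement(svg, f"{{{SVG_NS}}}circle", {
        "cx": "50", "cy": "65", "r": "20", "fill": "none", "stroke": stroke,
    })
    for i in range(5):
        x = str(30 + i * 10)
        ET.SubElement(svg, f"{{{SVG_NS}}}line", {
            "x1": x, "y1": "45", "x2": x, "y2": "15", "stroke": stroke,
        })
    # Text can live inside the image as well.
    label = ET.SubElement(svg, f"{{{SVG_NS}}}text", {
        "x": "50", "y": "98", "text-anchor": "middle",
    })
    label.text = "sign"
    return svg


icon = build_hand_icon()

# Because SVG is XML, changing the look is an attribute edit, not a redraw.
for element in icon.iter():
    if "stroke" in element.attrib:
        element.set("stroke", "navy")

markup = ET.tostring(icon, encoding="unicode")
compressed = gzip.compress(markup.encode("utf-8"))
print(markup)
print(len(markup.encode("utf-8")), "bytes as text,", len(compressed), "bytes gzipped")
```

The exact numbers depend on the drawing, but repetitive markup like the five
finger lines is the kind of content gzip handles well.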
The compression rate is important to me. I mentioned above that avatars
couldn't break through. I assume these are the reasons:
- Bandwidth. Have you ever tried to watch a video on a smartphone via UMTS or
LTE? Lags guaranteed. In my opinion, deaf users can skip the audio. So only the
image has to be transmitted.
- Emotions. I'm going to construct my avatar from several layers of SVG.
That way, a designer is able to focus on particular areas without having to
worry about the rest. Besides, it is possible to exchange only certain elements
(e.g. the eyes).
- Caching. Because certain elements are moved out into stand-alone libraries,
these can be cached separately. Hence, they have to be loaded only once.
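The layering idea can be sketched in a few lines of Python. This is only an
illustration under assumptions: the layer ids (`head`, `eyes`, `mouth`), the
shapes and the helper `swap_layer` are hypothetical and not taken from the
project. Each layer is a top-level `<g>` element, so the eyes can be exchanged
without touching the other layers.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)

# A made-up avatar face: one <g> element per layer.
AVATAR = f"""<svg xmlns="{SVG_NS}" viewBox="0 0 100 100">
  <g id="head"><circle cx="50" cy="50" r="40" fill="none" stroke="black"/></g>
  <g id="eyes"><circle cx="35" cy="40" r="4"/><circle cx="65" cy="40" r="4"/></g>
  <g id="mouth"><path d="M35 65 Q50 75 65 65" fill="none" stroke="black"/></g>
</svg>"""

# A replacement layer that could be shipped and cached on its own.
SURPRISED_EYES = f"""<g xmlns="{SVG_NS}" id="eyes">
  <circle cx="35" cy="40" r="7" fill="none" stroke="black"/>
  <circle cx="65" cy="40" r="7" fill="none" stroke="black"/>
</g>"""


def swap_layer(avatar, layer):
    """Replace the top-level layer of `avatar` that shares its id with `layer`."""
    for index, child in enumerate(avatar):
        if child.get("id") == layer.get("id"):
            layer.tail = child.tail  # keep the surrounding whitespace
            avatar[index] = layer
            return
    raise KeyError(layer.get("id"))


avatar = ET.fromstring(AVATAR)
swap_layer(avatar, ET.fromstring(SURPRISED_EYES))
print(ET.tostring(avatar, encoding="unicode"))
```

Only the `eyes` group changes; `head` and `mouth` stay exactly as the designer
left them.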
I want to highlight some use cases.
As said, I'm in touch with sign language users every now and then. If you don't
meet in person, you usually have the alternatives of written communication
or video chat. Well, German (or in general: spoken language) isn't the
mother tongue of sign language users. I find myself wishing I could sign via my
smartphone. But that isn't possible yet.
"Das große Wörterbuch der Deutschen Gebärdensprache"
(Screenshot of the iPhone app)
Besides, I'm in the process of learning this (wonderful) language.
A manual would be handy. Well,
there are already apps for iOS and Android,
but they're pretty expensive: EUR 9.00 per theme package.
I can hardly believe anyone is willing to pay that.
So, let us imagine the program were already available. How should one use
the app? Mrs Kestner came up with a pretty clever taxonomy:
signs can be sorted by whether they're signed with one or two hands, how many
fingers are used, and where they are located.
I could imagine that this can be accessed via swipe gestures or submenus.
But I need a mock-up for it …
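Until there is a mock-up, the taxonomy can at least be sketched as a data
structure. The following Python snippet is a minimal sketch under assumptions:
the class `Sign`, the function `find_signs` and all entries are hypothetical
placeholders, not real signs and not taken from the dictionary. It shows how
the three criteria (number of hands, number of fingers, location) could narrow
down a list step by step, the way a submenu or swipe filter would.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sign:
    gloss: str     # the word the sign stands for
    hands: int     # 1 or 2
    fingers: int   # number of fingers used
    location: str  # where the sign is made, e.g. "chin"


# Placeholder entries only; these are NOT real signs.
SIGNS = [
    Sign("EXAMPLE-A", hands=1, fingers=1, location="chin"),
    Sign("EXAMPLE-B", hands=1, fingers=2, location="chest"),
    Sign("EXAMPLE-C", hands=2, fingers=5, location="chest"),
    Sign("EXAMPLE-D", hands=2, fingers=1, location="forehead"),
]


def find_signs(signs, hands=None, fingers=None, location=None):
    """Return the signs matching every criterion that is not None."""
    return [
        s for s in signs
        if (hands is None or s.hands == hands)
        and (fingers is None or s.fingers == fingers)
        and (location is None or s.location == location)
    ]


# Each filter corresponds to one choice in a submenu.
for sign in find_signs(SIGNS, hands=2):
    print(sign.gloss, sign.fingers, sign.location)
```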
The application: Point of view of a developer
As a developer, you think a project through up front. For example,
I discovered that sign language differs from region to region. But there are
also gestures which recur. Hence I want to keep the libraries as modular as
possible. It might look like this:
- A library describing what a move looks like. Which XML snippet has to be
manipulated to show a thumbs-up? Which lines have to be faded out?
Which elements have to be recalculated (but without describing
HOW this is accomplished)?
- A library associating a sequence of words with a gesture and describing the
how. Which steps are necessary to move a hand upwards? Which lines have to be
manipulated in order to show a surprised face?
- A library devoted to the design. Do you know The Sims? In the newer
instalments there are plenty of options for designing your figure. I want to
stick with a comic figure. See the problem areas below.
- A library for parsing the text. In the beginning, I'll focus on the spoken
language's grammar. But I'd like to see how words can be identified better
I inspected LanguageTool, but it
doesn't fit my needs.
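How these libraries might be kept apart can be sketched roughly as follows.
Everything in this Python snippet is an assumption made for illustration: the
move names, the element ids, the word-to-move mapping and the function names
are hypothetical and say nothing about real sign language. The point is only
the separation: one table says what a move changes, another maps word
sequences to moves, and a small parser connects text to both.

```python
import re

# "What a move looks like": which (hypothetical) SVG element ids are shown
# or hidden. Nothing here says how the change is animated.
MOVES = {
    "thumb-up": {"show": ["thumb"], "hide": ["index", "middle", "ring", "little"]},
    "wave": {"show": ["thumb", "index", "middle", "ring", "little"], "hide": []},
}

# "Sequence of words -> gesture". A regional variant could swap this table
# without touching MOVES. The mappings are invented placeholders.
LEXICON = {
    ("good",): "thumb-up",
    ("good", "bye"): "wave",
}


def tokenize(text):
    """Very naive text parsing: lower-case words only."""
    return re.findall(r"\w+", text.lower())


def to_moves(text):
    """Map a text to move names, preferring the longest matching word sequence."""
    words = tokenize(text)
    longest = max(len(key) for key in LEXICON)
    moves, i = [], 0
    while i < len(words):
        for size in range(min(longest, len(words) - i), 0, -1):
            move = LEXICON.get(tuple(words[i:i + size]))
            if move is not None:
                moves.append(move)
                i += size
                break
        else:
            i += 1  # unknown word: simply skipped in this sketch
    return moves


for name in to_moves("That is good. Good bye!"):
    print(name, MOVES[name])
```

A real parser would have to deal with grammar and unknown words; skipping them
silently is just the simplest thing a sketch can do.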
In the end, it should be possible to change the skin colour or the
complexion without interfering or dealing with other libraries. To accomplish
that, it's necessary to define fixed points. Those define, for example, the
position of the head, or where the hips begin. Such stuff. How plump a
body is can be adjusted later on ;-)
It's important to me to isolate the SVG snippets as much as possible, to
facilitate changing the programming language that manipulates them later on.
I'll begin with Java as well.
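A minimal sketch of the fixed-point idea, again in Python and again with
made-up names and numbers (`FIXED_POINTS`, `place`, `render`, the coordinates
and the colour values are all assumptions): body parts are drawn around the
origin and then moved to a named fixed point, so skin colour and plumpness can
change without the positions being touched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


# Hypothetical fixed points shared by all libraries.
FIXED_POINTS = {
    "head": Point(50, 20),
    "hips": Point(50, 70),
}


def place(part_svg, anchor):
    """Move a part that was drawn around (0, 0) to its fixed point."""
    return f'<g transform="translate({anchor.x:g} {anchor.y:g})">{part_svg}</g>'


def render(skin="#f1c27d", hip_width=20):
    """Design layer: colour and plumpness vary, the fixed points do not."""
    head = f'<circle r="12" fill="{skin}"/>'
    hips = f'<ellipse rx="{hip_width}" ry="8" fill="{skin}"/>'
    body = place(head, FIXED_POINTS["head"]) + place(hips, FIXED_POINTS["hips"])
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 100 100">'
        f"{body}</svg>"
    )


print(render())
print(render(skin="#8d5524", hip_width=28))
```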
Let's consider the embedding. When I look at how widespread accessible
elements currently are in websites and apps, I have to conclude that the
process is disillusioning. Of course, there are a couple of developers who pat
their fellows on the back. But accessibility has drifted away from the
mainstream. In the beginning, web design was … well, technical. There were
text files with images and the like. Then framesets appeared. In my opinion,
that's where the problems started.
The libraries have to be easy to include. I haven't finally decided whether I
will deploy them as a jQuery plugin
(I'm using jQuery.SVG.js, which is
compatible with jQuery) and then read out the text through an HTML class à la
"sign-this", or whether it will become a browser add-on. With an add-on, I can
"ensure" that accessibility is decoupled from the developers' goodwill.
But I tend towards the first solution, since there's one thing that isn't
easily read out: the mood. It's meta-information which has to be shipped along
with the text.
Emotions are an important part of how a message is interpreted. Imagine a
neutral face at a sad event. It'll have a completely different effect.
Just imagine what else could be done with it!
Let's say the mood is also defined via HTML classes
(at least as long as there isn't a standard for it). Maybe something like
"mood-". You could use it for sign language interpretation. But you could
reuse it, too, say for adjusting the background colour. Look, there are
apps for ambience.
Why not work on it collaboratively?
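To see what reading out such classes could involve, here is a small Python
sketch based on the standard library's `html.parser`. It is not the jQuery
plugin itself. The class name "sign-this" and the "mood-" prefix come from the
text above; the concrete suffix `sad`, the fallback value `neutral` and the
class `SignThisExtractor` are assumptions made for this example.

```python
from html.parser import HTMLParser

# Elements without an end tag must not change the nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class SignThisExtractor(HTMLParser):
    """Collect (mood, text) pairs from elements carrying class "sign-this"."""

    def __init__(self):
        super().__init__()
        self.passages = []
        self._depth = 0      # nesting depth inside a "sign-this" element
        self._mood = None
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "sign-this" in classes:
            self._depth = 1
            moods = [c[len("mood-"):] for c in classes
                     if c.startswith("mood-") and len(c) > len("mood-")]
            # Assumption: text without a mood class is treated as neutral.
            self._mood = moods[0] if moods else "neutral"
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            text = " ".join("".join(self._chunks).split())
            self.passages.append((self._mood, text))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


page = (
    '<p class="sign-this mood-sad">The <em>old</em> shop closed.</p>'
    "<p>Not marked for signing.</p>"
    '<div class="sign-this">Hello</div>'
)
extractor = SignThisExtractor()
extractor.feed(page)
print(extractor.passages)
```

The same pairs could feed other consumers of the mood, which is the reuse
argument made above. The sketch assumes well-formed markup with explicit end
tags.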
The mood could be used for blind users, too. The voice of the screen reader
could be adjusted accordingly. I have to look more into
screen readers for smartphones
(needs Flash, video, approx. 22 minutes) and desktops before I can go into more
detail. But it's a good chance to promote multiple purposes in the accessibility
Potential problem areas
Das große Wörterbuch der Deutschen Gebärdensprache
(Screenshot of the desktop application)
Well, the beloved patents. I'm worried about failing because of the
Große Wörterbuch der Deutschen Gebärdensprache
(large dictionary of German Sign Language).
I've successfully worked together with the publisher. She's really committed to
the affairs of h-o-h people. And she has to make money with her publishing
company. I can understand this. But is it possible to grant a patent on
grammar? Would it be possible to avoid potential conflicts with a clever
choice of a GPL-compatible license?
The plugin I use is licensed with
Workplaces. Another topic I have given much thought to. I've decided to use
comic figures deliberately. On the one hand, they're easily described; on the
other hand, their design is low-fidelity. I don't want to deprive
speech-to-text reporters of their jobs. My software is meant to be displayed on
websites, where text has a certain life span. Maybe it's even usable for
real-time translation, but it isn't meant for that.
Copyleft. Well, it's nice to have the source open to the public. And I can only
wish that for a technology as basic as the one I want to build. But I
want to allow commercial use as well. And hence I'm not sure whether
copyleft would prevent that …