The WordCloud
The WordCloud is a research product that lets people see how VAs hear the human voice. It responds to the research question ‘Can designing friction into VAs change domestic behaviours and routines?’
The WordCloud consists of a microphone that captures audio, an E-Ink display and an OLED display. Listening to the environment in which it has been placed, the device converts the spoken words it recognises into visible text and renders them as a word cloud on the E-Ink display. The E-Ink display is not fully dynamic; it refreshes after a set time delay. The small OLED display shows the last word the device has heard, giving users instant feedback that it is working.
The WordCloud is designed predominantly as a glanceable display with limited controls. A single button lets users clear the WordCloud's memory, i.e., delete all the words that the device has heard.
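The behaviour described above (counting recognised words, showing the last word immediately, refreshing the cloud only after a delay, and clearing everything on a button press) can be sketched in a few lines of Python. This is an illustrative sketch, not the researchers' code: the class name `WordMemory`, its method names, the 60-second refresh interval and the 30-word limit are all assumptions, and the drawing to the E-Ink and OLED displays is left out.

```python
import re
import time
from collections import Counter


class WordMemory:
    """Hypothetical store for the words the device has heard (not the authors' code)."""

    def __init__(self, refresh_interval_s=60.0, clock=time.monotonic):
        self.counts = Counter()                       # word -> times heard
        self.last_word = None                         # what the OLED would show
        self.refresh_interval_s = refresh_interval_s  # assumed E-Ink refresh delay
        self._clock = clock
        self._last_refresh = clock()

    def hear(self, transcript):
        """Add every word of a recognised phrase and return the most recent word."""
        # Unicode letters only, with an optional inner apostrophe (e.g. "don't").
        words = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", transcript.lower())
        if words:
            self.counts.update(words)
            self.last_word = words[-1]
        return self.last_word

    def cloud_due(self):
        """Return True once per refresh interval, when the E-Ink cloud should redraw."""
        now = self._clock()
        if now - self._last_refresh >= self.refresh_interval_s:
            self._last_refresh = now
            return True
        return False

    def cloud_words(self, limit=30):
        """Most frequent words with their counts; the counts could drive font size."""
        return self.counts.most_common(limit)

    def clear(self):
        """What the single button would do: forget every word heard so far."""
        self.counts.clear()
        self.last_word = None
```

In this sketch the last word is available as soon as `hear` returns, while `cloud_due` gates the slower E-Ink redraw, mirroring the split between instant OLED feedback and the delayed word cloud.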
The core hardware of the WordCloud consists of a Raspberry Pi 4, an array of four high-performance digital microphones (ReSpeaker Mic Array v2.0), an E-Ink display (Inky wHAT), and an OLED display (PiOLED). The Raspberry Pi ran Ubuntu, and real-time speech-to-text recognition was achieved using Microsoft Azure Cognitive Services. During the development of the WordCloud, the researchers found only two APIs with real-time speech-to-text capabilities that ran competently on a Raspberry Pi: Microsoft’s Azure Cognitive Services and Assembly.ai. The research team selected the Microsoft API because it could transcribe many languages, including Swiss German, whereas Assembly.ai was only effective in English.
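As a rough illustration of the real-time recognition step, the sketch below uses the Azure Speech SDK for Python (`azure-cognitiveservices-speech`) in continuous-recognition mode. It is a sketch under stated assumptions, not the researchers' implementation: the function names `run_live_transcription` and `last_word_of`, the `on_phrase` callback, the choice of the `de-CH` locale and the use of the system default microphone are all illustrative, and a valid subscription key and region are required to run it.

```python
import string


def last_word_of(phrase):
    """Return the final word of a recognised phrase without punctuation, or None."""
    tokens = [t.strip(string.punctuation) for t in phrase.split()]
    tokens = [t for t in tokens if t]
    return tokens[-1] if tokens else None


def run_live_transcription(key, region, on_phrase, language="de-CH"):
    """Start continuous recognition and pass each final phrase to on_phrase.

    Assumptions: key and region come from an Azure Speech resource, and the
    microphone array is the system default input device.
    """
    # Imported lazily so the helper above works without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=key, region=region)
    speech_config.speech_recognition_language = language  # assumed locale
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    recognizer = speechsdk.SpeechRecognizer(
        speech_config=speech_config, audio_config=audio_config
    )

    def handle_recognized(evt):
        # Forward only final results that actually contain recognised speech.
        if evt.result.reason == speechsdk.ResultReason.RecognizedSpeech:
            on_phrase(evt.result.text)

    recognizer.recognized.connect(handle_recognized)
    recognizer.start_continuous_recognition()
    # The caller keeps the recognizer and later calls stop_continuous_recognition().
    return recognizer
```

A caller might pass `on_phrase=lambda text: print(last_word_of(text))` to mimic the last-word feedback, and call `stop_continuous_recognition()` on the returned recognizer when finished.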