Text-to-speech continues

A lot of progress has been made on more interesting text-to-speech options in Max, using the Web Speech and Google Maps APIs. It’s up on GitHub as performable-maps, though it’s not quite a release candidate yet.

The gist:

The Web Speech API is implemented in a few browsers (Chrome, Safari, Firefox). The voices available through it are OS- and browser-dependent: in any browser on a Mac, you get the voices that come with the ‘say’ command, while in Chrome on any OS, you also get several Google voices that sound very much like the ones used in Google Maps turn-by-turn directions. Being able to use these voices in Max would be great, right? I agree!
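
For reference, here’s a minimal sketch of the browser-side code (written as TypeScript) for picking a voice and speaking with it; the voice name "Google US English" is only an example, since the exact names depend on the browser and OS:

// List the voices the browser exposes, then speak a line with one of them.
// Chrome fills in getVoices() asynchronously, so wait for voiceschanged.
function speakWithVoice(text: string, voiceName: string): void {
  const voices: SpeechSynthesisVoice[] = window.speechSynthesis.getVoices();
  const utterance = new SpeechSynthesisUtterance(text);
  // Fall back to the default voice if the requested one isn't available.
  utterance.voice = voices.find(v => v.name === voiceName) ?? null;
  window.speechSynthesis.speak(utterance);
}

window.speechSynthesis.onvoiceschanged = () => {
  // e.g. the Google voices in Chrome, or the 'say' voices in Safari on a Mac
  console.log(window.speechSynthesis.getVoices().map(v => `${v.name} (${v.lang})`));
  speakWithVoice("In 400 feet, turn left.", "Google US English");
};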

Challenges:

  • The Web Speech API does not hook into the Web Audio API in any form: there’s no way to route the synthesized speech into an audio graph, so you can’t process or record it in the browser.
  • [jweb], Max’s embedded web browser, is built on Chromium, which does not support the Web Speech API – you can load a page that uses it, but no sound will be produced.

Solution:

Using an intermediary localhost Python server and SQLite database, get the browser and Max to communicate by posting and polling. Python comes preinstalled on OS X, and it isn’t hard to install on Windows. The interaction looks like this:

  1. When Max loads, run a script to open a tab in Chrome that loads the localhost page. The page is basically a shell: the user won’t need to interact with it at all. It’s only loaded because it’s the only way for the Web Speech API to function.
  2. When the browser loads, it gets its list of voices and sends it to the server. It then begins polling the server (one request every 3 seconds) to see if there are any new requests for text to be spoken.
  3. The user clicks a bang in Max to populate a umenu with the list of available voices, and chooses one.
  4. In this implementation, the user also chooses a set of directions from Max; we’ll be iterating over a list of driving directions and requesting the next step from the browser.
  5. Pressing the “prev” or “next” buttons in Max sends data to the server about which direction we’d like spoken aloud, which path it came from, and the exact time it was requested. Max immediately begins polling (one request every 0.2 seconds) for the exact start/end times of the speech, to figure out when to start recording.
  6. When the browser discovers an unchecked speech request, it queues up the utterance and plays it when ready (multiple utterances won’t occur simultaneously). SpeechSynthesisUtterances have two event listeners, among others: onstart and onend. These are fairly exact markers (with a little wiggle room) of when sound begins and ends; when these listeners fire, the browser notifies the server (a sketch of this browser-side logic follows the list).
  7. Max has been polling furiously; when it receives onstart, it begins recording, and starts counting milliseconds. When it receives onend, it stops recording, stops polling, crops the buffer~ object to the number of milliseconds, and writes it to a file in a local directory.
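
To make steps 2 and 6 concrete, here’s a rough TypeScript sketch of the browser-side loop. The endpoint names and JSON fields are placeholders I’m making up for illustration, not the actual routes in performable-maps:

// Poll the local server for unchecked speech requests (hypothetical endpoints).
async function pollForSpeech(): Promise<void> {
  const response = await fetch("http://localhost:8000/pending");
  const pending: { id: number; text: string; voiceName: string }[] = await response.json();

  for (const request of pending) {
    const utterance = new SpeechSynthesisUtterance(request.text);
    const voices = window.speechSynthesis.getVoices();
    utterance.voice = voices.find(v => v.name === request.voiceName) ?? null;

    // Report onstart/onend back to the server; Max is polling for these
    // to know when to start and stop recording.
    utterance.onstart = () => {
      fetch(`http://localhost:8000/event?id=${request.id}&type=start&t=${Date.now()}`);
    };
    utterance.onend = () => {
      fetch(`http://localhost:8000/event?id=${request.id}&type=end&t=${Date.now()}`);
    };

    // speak() queues utterances, so multiple requests won't play simultaneously.
    window.speechSynthesis.speak(utterance);
  }
}

// One request every 3 seconds, as described in step 2.
setInterval(pollForSpeech, 3000);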

Note that in order for Max to get sound from the browser, you’ll need to use Soundflower or Jack, set it as your computer’s stereo output, and set Soundflower/Jack as Max’s input.

performable-maps! Check it out!

Personal text-to-speech odyssey

I’m currently in the process of working on a piece for percussion, and I’ve found that thinking about the gritty technology stuff first helps me home in on what I want a work to focus on. I’ve got a few different stumbling blocks taking up space in my head, so I’m writing to get all the ideas out.

For this piece, I’m hoping to have the performer get some driving directions from the Google Maps Directions API, and to use the Mac say command or something similar to do text-to-speech that I can then process with Max/MSP. I have the following constraints:

  1. The piece should be as OS-independent as possible.
  2. The performer should not need continuous internet access, and the piece should still be playable without it, as not all venues will have available WiFi.
  3. The piece should ask the performer to install nothing, or, if that’s unavoidable, an absolute minimum of things.
  4. All aspects of the piece should be controlled from within the patch; running outside applications/files should be a last resort.

Given this, I basically have one option for text-to-speech: the Web Speech API, a browser API that comes standard in Google Chrome. Mac say obviously covers only one half of the OS market. Asking the performer to install Chrome doesn’t strike me as the worst offense; most people have Chrome already, even if they use it infrequently.

The devil now is point #4. In order to use the Web Speech API, I have to ask the performer to host a page on localhost (on the off-chance that two people might be playing this piece in two different places at the same time) and open it in a Chrome tab. Not only that, but the locally hosted page needs to update live as requests come in to the server, and whatever I use to host it needs to be something that many people already have and use. Options:

  1. Python is pretty common, comes installed on all Macs, and a Windows install is not too difficult. Python’s standard library includes SimpleHTTPServer, which can be launched either with the shell object in Max/MSP (unfortunately also Mac-specific) or with the py/pyext objects. Unfortunately, py hasn’t been updated in a long time and doesn’t come with good documentation. There are Jython and embedded web server objects as well, which face similar issues.
  2. Package all the Python nonsense as a standalone executable somehow, using one of the Python libraries meant for doing this.
  3. mxj might allow a Java server. This is where I am right now.

Path to Hold Still

I premiered Hold Still on Friday at UMW’s Undergraduate Research and Creativity Symposium in the ITCC’s Digital Auditorium, and it got a really good response! A lot of the students and faculty involved in the poster sessions that happened right before stuck around, which meant I had a pretty strong science contingent in the audience. Andy Rush of DTLT did a great video of it, which you can see here:

I’ll do a post in a bit about the full setup of the piece, but I wanted to provide some context for the lack of posts between the start of this independent study and now.

About a month and a half ago, I was still struggling to figure out what I wanted the subject of the piece to be. Having been to a number of electroacoustic festivals, I’ve seen a lot of pieces that invoked some really interesting technology without surrounding the tech with a piece that could give it staying power. While this is a genre that hungers for innovation, and these types of pieces are important experiments, it made me heavily aware that the wrong piece could turn my paper controller concept into a gimmick rather than a strength.

The more I experimented with sounds and Max/MSP, the more I realized that the best thing to do for the piece would be to make it about drawing. While deliberating, I happened upon a spoken word video on YouTube that really spoke to me. One video became another, and another, and before I knew it I’d binge-watched nearly everything the Button Poetry channel had to offer.

So from that, I knew I needed to write something personal, and it needed to be mostly dialogue. I initially wanted to piece together individual words from spoken word recordings like a ransom letter, because I thought it would give me a good balance of openness and camouflage. I had an idea of where I wanted the words to go, but just me saying them? That seemed way too direct. Then, mid-spoken-word-binge, I landed on poet Sarah Kay’s TED Talk, and after it, I knew I needed to use my own voice. Her talk is about helping high school students use spoken word poetry to discover their voices, and about how important it is to use yours, especially when you are afraid to do so. Even if I wasn’t 100% confident that everyone would accept the poem I was about to write, that was what pushed me to do it: I have something to say.

I wrote up the main voice of the poem in a night, and did the secondary voices the next week. The whole time, I was really afraid of what I was doing, because a lot of electroacoustic music is about obscurity and collage. I’d never heard something like this, where whole sections of a voice are played back with very few changes; plus, a lot of the subject matter turned out to be very personal, and opening up your heart to a room full of strangers is daunting.

The first time I played the tape part back for Mark, I was nearly vibrating with anxiety. Alongside running on very little sleep, I had the obvious/normal fear that he’d reject this out-of-genre poem. Halfway through listening, he said “this is incredible,” and my fears disappeared. He helped assure me that the piece is stronger because of how different and raw it is; it’s vulnerable, it’s personal, it’s necessary. It’s not just some throwaway nostalgic poem scrawled on looseleaf.

I didn’t write this post to try and glorify the process of writing Hold Still; I wanted to share the experience of confusion and confidence. I can’t be the only one who’s considered backing down from performing something this open in front of a crowd, for fear that no one would want to hear it. After the climax of the piece, after its darkest point, I write “there is no voice but your own” somewhere on the paper controller, because that’s what writing this piece has meant to me. You have a voice, it is different from mine, and I want to hear what you have to say. Please don’t hide.

Working paper pitchbend!

Very little changed in the circuit, but somehow it still led to the LED lighting up! That quickly turned into making the paper the pitchbend control, which worked. Some discoveries, some things I’d guessed would be true:

  1. Graphite has a really high resistance! When plugged into an analog in, which can take values from 0 to 1023 (0 means the contacts on the test leads are directly touching), the smallest value I’ve seen in the serial monitor is ~780. If you look at the Arduino code in the post below, change the map line to map(paper, 780, 1023, 0, 127). Still not perfect, but better (see the quick arithmetic after this list).
  2. Having an extra, normal resistor in the circuit is important, as it is inevitable that your leads will touch accidentally at some point. RIP red LED.
  3. Interacting with the graphite as in the video will probably cause the connection to deteriorate over the course of the piece. In order for the graphite to work, the lines have to be really dark, and it helps if they’re thick enough that the performer doesn’t accidentally veer out of bounds. Since drawing is important here, the performer will probably have a pencil to correct issues. Or, I could take the slow destruction of the controller as part of its message/design, which could be cool.
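
(Quick arithmetic on that map change: Arduino’s map() is just a linear rescale done in integer math. With the original map(paper, 0, 1023, 0, 127), readings in the 780-1023 range all land in roughly the top quarter of the MIDI range, about 96-127. Remapping from 780 instead of 0 spreads them out, so a reading of 900, for example, becomes (900 - 780) * 127 / (1023 - 780) ≈ 62, and the whole 0-127 bend range becomes usable.)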

Arduino MIDI controllers – instructions

If you’re interested in how using Arduino for MIDI works, here’s how to make the simple controller I described in the previous post. These instructions are based on a number of tutorials, especially Arduino, Sensors, and MIDI. There are many methods of setting up an Arduino for MIDI, but this is what I found to be the most straightforward.

Wiring

First, hook up your Arduino as follows. The squiggly thing in the lower left is the photoresistor, used for pitchbend, and the large circle at the bottom is the push button for note on/off. Red and black cables are for power and ground, while the yellow cables are for sending input values to the Arduino (digital in 7 for the push button, analog in 0 for the photoresistor).

circuit sketch

Code

Arduino boards run code written in a dialect of C/C++ built on Wiring, whose IDE was in turn based on Processing. Copy the following code into the Arduino IDE (adapted from this tutorial):

int velocity = 100;      // default note-on velocity
int noteON = 144;        // 144 = 0x90, note on, MIDI channel 1
int pitchBEND = 224;     // 224 = 0xE0, pitch bend, MIDI channel 1
const int BUTTON = 7;    // push button on digital pin 7
const int SENSOR = A0;   // photoresistor on analog pin 0
int val = 0;             // current button reading

int old_val = 0;         // button reading from the previous loop
int state = 0;           // 1 while the button is held down
int old_state = 0;       // state from the previous loop
int note = 50;           // current MIDI note number

void setup() {
  Serial.begin(9600);    // must match the baud rate set in Hairless
  pinMode(BUTTON, INPUT); 
}

void loop() {
  int lsr = analogRead(SENSOR);              // photoresistor reading, 0-1023
  int bend_val = map(lsr, 0, 1023, 0, 127);  // rescale to a 7-bit MIDI value
 
  old_val = val;
  val = digitalRead(BUTTON);
  old_state = state;
  if (val == HIGH) {
    state = 1;
  }
  else {
    state = 0;
  }
  delay(10);                                 // crude debounce
  if (state == 1 && old_val == 0) {
    // button just pressed: send note on
    MIDImessage(noteON, note, velocity);
    delay(20);
  }
  else if (state == 1 && old_val == 1) {
    // button held: send pitch bend from the photoresistor
    MIDImessage(pitchBEND, 0, bend_val);
    delay(20);
  }
  else if (old_state == 1 && state == 0) {
    // button released: velocity 0 acts as note off,
    // then raise the pitch a half step for the next press
    MIDImessage(noteON, note, 0);
    note = note + 1;
    delay(20);
  }
}

// send a three-byte MIDI message over serial: status, data byte 1, data byte 2
void MIDImessage(int command, int MIDInote, int MIDIvelocity) {
  Serial.write(command);
  Serial.write(MIDInote);
  Serial.write(MIDIvelocity);
}

Verify the code and upload it to your Arduino via the USB connection. Once it’s uploaded without errors, it’s time for some serial trickery!

Interpreting Serial as MIDI

There are a number of different ways to send the serial information and have your computer interpret it as MIDI, each with its pros and cons. The easiest method, and the one most commonly used in MIDI controller tutorials, is the first option below, which is what I’m using.

  1. Install a MIDI <-> Serial bridge program, like Hairless. This will interpret serial information coming from a given USB port and output correctly formatted MIDI messages through the selected MIDI output port. This is easy, and the download requirement is minimal. However, you need to have Hairless running any time you want to use your device, and on low-grade PCs like mine, Hairless will crash if you accidentally send too many MIDI messages at an inhumanly fast rate (it happened during testing).
  2. Deconstruct a MIDI -> USB adapter cable, detach all its cables, and use its board to do the converting for you. This requires extra parts, including the MIDI converter and an extra USB jack to put in your breadboard. This is a pretty good permanent solution, but in the short term it costs more money to test and has more moving parts if you’re still just starting out.
  3. Write custom drivers. (Not my wheelhouse!)

Install Hairless, and check that the baud rate is set to 9600 in the Preferences. Select the Arduino as your Serial port, and select your desired MIDI out port. The MIDI out options depend on your OS. If you’re on Windows, the only default option will be the Microsoft GS Wavetable Synth. This is fine for initial tests, but it isn’t a port you can select as an input in a DAW. To get new MIDI ports, I installed loopMIDI, which lets you create virtual MIDI ports that last as long as you are logged in. On Mac, most tutorials say to send MIDI information to IAC Driver Bus 1.

Test!

Open up your desired DAW, and give it a try. The code provided will pitch your note up a half step on each button press; if the pitches get annoyingly high, press the red RESET button on the Arduino board. Changing the light levels in your room or cupping your hands over the photoresistor will alter the pitchbend values sent. The below screenshot shows Ableton with Hairless and loopMIDI up for reference.

circuit screenshot

Easel controller: some background

Instrument creation! I’ve always been fascinated by hackerspaces and the incredible things I’ve seen people create with microcontrollers, but I never thought device creation would be something I could or would do. Then, for the first time in a long while, I had a moment of clarity when my dad asked me what I wanted for Christmas. Haven’t I always wanted to try an Arduino?

And to my delight, he got me one of the Arduino Uno starter kits! Just enough to get your feet wet – wires, LEDs, a small breadboard, momentary buttons, photoresistors, and a whole mess of normal resistors (buyable via MakerShed, if you have a hankering). After playing around with it, I got obsessed with the idea of turning it into a MIDI controller. After all, most MIDI controllers connect via USB, and the Arduino already has a USB connection for uploading sketches. Shouldn’t that work?

Long story short, it does, though not quite as directly as you might want. Since I haven’t purchased new components yet, I have it set up so that pressing the momentary button controls note on/off messages, and a photoresistor controls pitchbend (dark room = lowered pitch, brighter room = raised pitch).

But to me, the goal of making my own MIDI controller is to create something unique and visually interesting, an instrument that’s fascinating in its own right. So I did some research.

Kobakant’s How to Get What You Want is a fascinating website all about wearable technology experimentation. They explore creating sensors and actuators, and all the different kinds of materials used to put technology in clothing. Good examples to start with are fabric potentiometers, their knit stretch sensor shootout, and embroidered cloth speakers (!!).

When I first found this website, I was overly excited about the possibility of turning a jacket into a MIDI controller, using a zipper as a potentiometer, all of it. There’s even an Arduino model specifically designed for use in clothing! But many of these materials are a bit pricey, conductive thread is known to tarnish over time, and I am not an experienced seamstress. So, while the excitement is still palpable, I’ll table my foray into e-textiles for now.

I kept looking around, until I had a thought – would graphite work? A quick Google search turned up a video, which inspires a number of exciting possibilities. First, it’s possible to draw on paper and have a circuit – that’s awesome! Second, since graphite is a much better resistor than it is a conductor, drawing a line of graphite naturally makes it a variable resistor, which also naturally makes it a potentiometer. That’s exciting too! Further digging turned up an instructable on making a four-channel mixer on paper, though it seems a little difficult to control. Also, check out these circular graphite sequencers! Though some of the sounds are a bit chirpy, they remind me of the beat to Hazey by Glass Animals.
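
(For what it’s worth, the standard way to read a variable resistor like this from an Arduino analog pin, not something taken from the video, is a voltage divider: put a fixed resistor in series with the graphite line, power the pair from 5V, and read the midpoint, which sits at 5V * Rfixed / (Rfixed + Rgraphite). Since a graphite line can easily run to hundreds of kilohms, the fixed resistor has to be in the same ballpark for the reading to swing usefully.)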

As far as testing graphite circuits goes, I haven’t gotten them to work yet. I’ve been working with just lighting LEDs, separate from the Arduino (I haven’t short-circuited anything on the Arduino yet, and I’d rather not if I can avoid it). Some troubleshooting things I’m going to try:

  1. Get a multimeter (someone nicked mine), and check if the resistance of my graphite lines is too high. Or, if I need a better 6B pencil!
  2. Get a 9V battery. Most of these experiments/devices use 9V batteries, while the Arduino puts out signal at 5V, and the battery pack I’ve been using is 6.5V. It’s possible the resistance is too high to light LEDs in my tests, even though LEDs light at around 3.3V. This opens up the issue of needing to step the 9V down before it reaches the Arduino, but I’ll cross that bridge if it’s necessary.
  3. Get alligator clips. Though this is probably less of an issue, pressing wires down onto paper isn’t necessarily the most sure-fire way of completing a circuit. With bigger leads that clip onto the paper, my hands would be free to test other things.

Hello World, and some goals

Here marks the start of a personal compositional odyssey!

After being tech director for numerous Barn Dances and attending all manner of electroacoustic festivals, I have a larger than average sample of classical electroacoustic music. But for all that I have experienced, I have yet to write any music in this style, and I intend for this independent study to be the push that forces me to explore the technology!

To that end, I will be writing at least two pieces this semester, which have broad definitions to start:

1. Custom controller & live electronics. After getting an Arduino Uno for Christmas, I immediately jumped on the idea of making a custom MIDI controller using it. I’ll do another post on all the research I did for possible directions to take the controller with non-standard materials, but I eventually chose to focus on using graphite drawn on paper as the circuit. We’re calling this the easel controller for now. Sonically, this piece will involve controlling an instrument in Max/MSP and triggering different sounds using the easel controller.

2. Electroacoustic composition with video. This piece will have a notated harp part to be played live, which will be processed with an instrument in Max/MSP. After the piece is written, I’ll make a video to go along with it.

I’ll be posting here fairly regularly as I make strides in this project, to document my progress.

(Side note: this is my first time seeing the Twenty-Fifteen theme. I’m impressed – it looks really good!)