Richard Williamson
- Aug 8, 2020

Bringing A Conversation online

Updated: Aug 17, 2020

A Conversation was initially a piece of live theatre that Nigel and Louise produced in 2013, and which I lit for performances at London’s Yard Theatre followed by a UK tour:

As part of the Yard Theatre’s Generation Game season of new writing, Shunt co-founders Louise Mari and Nigel Barrett use Ethel Cotton’s 60-year old conversation course to create a strange and unsettling piece of solo theatre. Barrett, as our seminar leader, guides us through the maxims of this mistress of conversation. It starts innocuous enough, but soon turns sinister.

“Have you ever stopped to think that your happiness depends upon your ability to carry on an interesting and intelligent conversation?” Cotton asks in Lesson No. 1.

Once the COVID-19 lockdown began, Louise, Nigel and I started discussing what work might be interesting to do during this strange period. Knowing we had the resources of Stone Nest – a historic Grade II listed former Welsh chapel, now a new home for performing arts – available to us, we began to think about what we could do that was more than just a Zoom call. The initial excitement of seeing performers in their own homes had started to wear off, and we felt ready to present something with all the beautiful qualities and transformative possibilities of a theatrical space. The fact that this show embodied the challenges at the very heart of remaking live performance for a digital platform – as a show that involved a literal conversation with its audience – made ‘A Conversation’ the most interesting and provocative choice.

And with good timing Nigel and Louise were approached by the lovely people from Crossover Labs and invited to propose a show for their exciting new Electric Dreams Online festival.

The challenge

Original production shot

As a piece of theatre, A Conversation was always staged to be interactive, while still based in a traditional theatre environment. The very nature of the source material led to the desire for Nigel to converse with the audience: a live conversation based on the rules of this book from 1927, to be viewed through the prism of the 21st century.

In a traditional theatre environment this is quite simple: basic cues – such as raising the houselights, or handing an audience member a card – quickly let the audience know it’s time for them to speak, and so they can be made a key part of the performance without feeling exposed.

Initial R&D

The challenge we faced was how to make a visually interesting piece of theatre that kept the audience part of the piece, without the show looking like a normal Zoom call or running the risk of audience members’ activities disturbing the performance.

We began by spending a couple of sessions in the space trying things out, and Louise developed a storyboard to help us understand how to bring the piece alive.

Storyboard extract from Louise

An early idea was to surround Nigel with the “front row” of the audience, making them part of the frame, but keeping Nigel centred. This would allow Nigel to speak to the audience and for their reactions to be visible and audible when required, and for the audience to feel like they were present, as in a theatre, together with the actor.

While this felt effective for the general interactions, we also wanted a method to really bring specific audience members into the scene. We discussed traditional split screen, or “picture in picture” style visual effects, but these didn’t feel suitable enough to allow the audience member concerned to appear a fully immersed part of the show. Instead we felt it would be more exciting to bring a physical screen into frame, through which we would see a selected audience member appear as if they were standing on the stage, live with the performer.

Storyboard extract from Louise

This would allow Nigel to fully appear to be having a conversation with the person involved, and for their reactions to appear as if in the same world as Nigel.

In the finished performance we ended up with a scene involving four audience members all having a chat together: two on traditional screens, one on an iPad in Nigel’s hand and one projected onto the wall behind him.

In addition to the audience integration we also wanted the piece to feel like a fully formed piece of theatre, while still something that felt meant for the screen. With this in mind, we decided to use a combination of projection onto the ragged surface of the building behind Nigel, content overlaid onto the frame in front of Nigel, and carefully designed lighting, sound, set and props, all framed for a single camera.

The solutions

Although Zoom always seemed the logical choice, I decided to investigate a few options and looked at various products on the market, discovering the following:

Zoom

Pro: Very widely used, meaning people would not have to install a specific app to use it so removing technical barriers for the audience
Pro: Low cost and reasonable quality
Pro: Large numbers of people able to join a “call” (up to 100 with basic plans)
Pro: Webinar option allows for increased control over what attendees can do
Pro: Ability to Mute/unmute audience members’ audio and video remotely
Pro: Ability to have a host with almost total control of the call
Con: Has certain audio processing which can’t be disabled
Con: Very much designed for “meetings” so some features, such as audience views, are not ideal
Con: No ability to take more than one audio feed from the entire call, and no ability to take more than either a gallery view or a single camera view from any one Zoom instance

Skype

Pro: Free to use and widely known
Pro: NDI feeds of participants’ video available
Con: No remote muting/unmuting of attendees’ audio and video
Con: Often lower quality
Con: App has become confusing over time, leading to reduced adoption

WebRTC/Jitsi/Custom site

Pro: Ability to design totally custom experience
Pro: Would be able to take individual audio/video feed of each audience member allowing greater flexibility
Con: Any custom features would require significant development time
Con: Without testing on every available device it would be likely that certain equipment at the audience end would create problems
Con: Audience may be uncomfortable with unfamiliar device
Con: Greater bandwidth would be required at the studio end to allow for a large number of feeds to arrive

With all of that we decided that Zoom was indeed the best choice although it still involved certain significant issues. These issues included:

Zoom has limited “batch” features, meaning that things that are simple for one user – such as unmuting their audio – become complicated when you need to unmute 20 people very quickly
Zoom only shows you one person's video at a time, or everyone in the gallery together, making it difficult to extract different people at the same time
Zoom messes with the audio, even when you tell it not to - meaning that background sound effects can disappear or sound strange
The host has only limited control over what the audience sees. Without the audience’s input they may see a gallery view of every other participant, or a big view of the show along with a thumbnail strip of the other participants, which we wanted to avoid. This meant we would need to find a way to ensure the audience was properly briefed

A very lucky circumstance solved many of these issues: a friend introduced me to Andy Carluccio of Liminal who, it transpired, had been working on a Zoom implementation called ZoomOSC. This allows control of a Zoom client with OSC and exposes certain information about the Zoom call back over OSC. This was a lifesaver, as you will see below.

The Picture Frame Gallery

At first glance the “picture frame gallery” seemed simple: all we needed to do was arrange the individual feeds of the audience around the screen. In reality, however, we had a few problems to overcome:

Zoom only displays the other video feeds as a gallery view
Zoom rearranges this gallery view depending on how many people are in the call
Zoom moves people around the gallery view as and when people join/leave the call

My initial plan was simply to use video mapping software (such as Millumin) to take a feed of the gallery view and set up crops to remap each square to a new size and position. While this worked, it relied on knowing exactly how many people would be in the call and setting up the mapping appropriately. As soon as this number changed, the size and layout of the gallery would alter and I would end up with weird crops.

Once I discovered ZoomOSC I was able to receive a notification of how many people were in the gallery, as well as to be notified when this number changed. This gave me the information but how to make use of it?

Extract of shader code

The eventual solution I came to (with the help of Ashley Green) was to write a custom ISF Shader to re-render the live feed in real time to create the required output. ISF shaders are great as they can effectively rearrange pixels at the graphics card level and so are super flexible and very fast.

The final product works as follows:

Shader properties

Take an NDI feed of the zoom gallery view
Take the count of the number of cells in the gallery
Take the index of the show cell (to ignore it)
Calculate the number of rows/columns the specified count will use
Calculate the maximum possible size of each cell within the size of the gallery frame
Break up the image and render the chosen cell into the relevant position
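
The arithmetic in the steps above can be sketched in Python rather than in ISF shader code. This is only an illustration under assumptions, not the shader used in the show: it assumes the gallery is a regular grid whose column count is the ceiling of the square root of the cell count, that cells are 16:9, and that a partly filled last row is centred. Zoom's real layout rules may differ, and all function names here are hypothetical.

```python
import math


def grid_shape(count):
    """Guess (rows, cols) for `count` cells. Assumed rule, not Zoom's documented one."""
    cols = math.ceil(math.sqrt(count))
    rows = math.ceil(count / cols)
    return rows, cols


def cell_rect(index, count, frame_w, frame_h, aspect_w=16, aspect_h=9):
    """Return (x, y, w, h) of gallery cell `index` inside a frame_w x frame_h image."""
    rows, cols = grid_shape(count)
    # Largest cell of the given aspect ratio that lets rows x cols fit in the frame.
    cell_w = min(frame_w / cols, frame_h / rows * aspect_w / aspect_h)
    cell_h = cell_w * aspect_h / aspect_w
    row, col = divmod(index, cols)
    # A partly filled last row is assumed to be centred horizontally.
    cells_in_row = min(cols, count - row * cols)
    x0 = (frame_w - cells_in_row * cell_w) / 2
    y0 = (frame_h - rows * cell_h) / 2
    return (x0 + col * cell_w, y0 + row * cell_h, cell_w, cell_h)


def audience_cells(count, show_index):
    """Indices of every gallery cell except the show feed, which is ignored."""
    return [i for i in range(count) if i != show_index]
```

Each rectangle returned by `cell_rect` is the source crop for one audience member. A shader would sample that region of the incoming gallery frame and draw it at that person's position around the edge of the output.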

After creating the shader I then created a separate Windows application to act as the bridge between Zoom and the shader. This would process and pass on some of the OSC and give the Zoom operator some batch controls (including muting/unmuting all users) for use during the performance.
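
As a rough idea of what a batch control involves, here is a minimal Python sketch that encodes OSC 1.0 messages by hand and sends one per participant over UDP. It is not the bridge application described above. The address `/example/unmute`, the port and the use of participant names as arguments are placeholders, not ZoomOSC's actual command set, which should be taken from its own documentation.

```python
import socket
import struct


def osc_string(text):
    """Encode an OSC string: ASCII, null-terminated, padded to a multiple of 4 bytes."""
    data = text.encode("ascii") + b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address, *args):
    """Build an OSC 1.0 message carrying int32 and string arguments."""
    tags = ","
    payload = b""
    for arg in args:
        if isinstance(arg, int):
            tags += "i"
            payload += struct.pack(">i", arg)  # big-endian 32-bit integer
        else:
            tags += "s"
            payload += osc_string(str(arg))
    return osc_string(address) + osc_string(tags) + payload


def unmute_all(names, host="127.0.0.1", port=9090, address="/example/unmute"):
    """Send one message per participant. Address and port are placeholders."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for name in names:
            sock.sendto(osc_message(address, name), (host, port))
    return len(names)
```

The point of the batch helper is that the operator triggers one action and the loop does the repetitive work, which is what makes unmuting 20 people at once practical.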

The on-stage screens

For the on-stage screens we needed the ability to send any one participant’s video to one of four feeds on stage. For ease of operation we wanted these feeds to run through QLab (which was running the pre-recorded content).

My initial thought was to use Zoom Rooms, a piece of Zoom software which allows you to select up to three meeting participants and put each of them onto a monitor. I would then send an NDI feed from this to QLab, which would route it to the screens. However, after experimenting I found a few problems:

Any one machine could only cope with three outputs, and needed a physical screen for every output
When using zoom in webinar mode with our show feed spotlighted in Zoom, one of these outputs would always be the spotlight feed with no way to disable
The above would mean we would need two machines and 6 monitors to get the four outputs we required
We would also need two iPads to control these two machines
We would then need two Zoom Room licenses to allow all of this to function

While the above did not make Zoom Rooms impossible, it did make them less attractive. I then realised that with a small modification I could use the shader I created for the gallery to take a crop of the gallery and expand it to full screen. Experiments showed that the quality of this was surprisingly good, and so it would be an acceptable solution.

With this in mind I adapted my control application to allow me to control additional instances of my shader in Millumin, each with its own NDI output. ZoomOSC helped massively here, as it has the option to output the “galOrder” – a list of which participant is where in the gallery. This allows our Zoom operator to easily pick the relevant person and, even better, allows the shader to automatically adjust to keep the correct crop if someone else leaves the call while a participant is pinned.
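
The idea of keeping a crop locked to one person as the gallery reshuffles can be sketched as follows. This assumes the gallery order arrives as a simple list of participant identifiers in cell order; the real ZoomOSC message format is not reproduced here, and the class and names are hypothetical.

```python
class PinnedCrop:
    """Track which gallery cell a pinned participant currently occupies."""

    def __init__(self, participant_id):
        self.participant_id = participant_id
        self.cell_index = None  # None means the participant is not in the gallery
        self.cell_count = 0

    def on_gallery_order(self, order):
        """Call with the latest gallery order (assumed: a list of participant ids)."""
        self.cell_count = len(order)
        if self.participant_id in order:
            self.cell_index = order.index(self.participant_id)
        else:
            self.cell_index = None
        return self.cell_index


pin = PinnedCrop("carol")
pin.on_gallery_order(["show", "alice", "bob", "carol"])  # carol is in cell 3
pin.on_gallery_order(["show", "alice", "carol"])         # bob left, carol moves to cell 2
```

After each update, `cell_index` and `cell_count` are the two values a crop shader instance would need in order to keep following the same person.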

Controlling the show

As this was an existing piece we already had a QLab file containing the sound for the show. However, we hadn’t originally had any video, which we felt was important to enhance the piece for streaming.

Video

To keep things simple for operating I decided to use QLab for all the cues in the show, other than some specific Zoom operations for which we had a second operator. This all then fed into OBS, a very powerful open-source piece of software which allows you to easily combine live and pre-generated content and package it for live broadcast into Zoom.

Any pre-recorded video was inserted as a standard video cue in QLab, and the physical screens/projectors were fed directly from this machine. To allow video to be placed as an overlay on top of the camera I then created an additional Syphon surface in QLab, which fed into NDI Syphon so that the content could be transmitted over our network using the powerful NDI protocol.

In order to change scenes in OBS I needed some additional control, since the gallery surround was sent directly to OBS, bypassing QLab. I used Bitfocus Companion to act as a bridge between QLab and OBS, since OBS does not support OSC directly. Companion is a very powerful piece of software designed to work with a Stream Deck, but happy without one.

Lighting was controlled by an ageing Strand 520i triggered via MIDI Show Control (MSC) from QLab, which kept the operation very simple and was a nice, nostalgic throwback to my younger days as a Strand programmer!
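
For readers unfamiliar with MSC, a GO command is a short MIDI SysEx message. The sketch below builds one in Python as raw bytes. The device ID and command format shown are assumptions about how a desk might be configured, not the settings used in the show, and the function name is hypothetical.

```python
def msc_go(cue, device_id=0x7F, command_format=0x01):
    """Build a MIDI Show Control GO message as raw SysEx bytes.

    device_id 0x7F is the MSC "all call" address and command_format 0x01 is
    Lighting (General). Both are assumed defaults, not values from the show.
    """
    cue_bytes = str(cue).encode("ascii")  # cue numbers travel as ASCII, e.g. b"12.5"
    header = bytes([0xF0, 0x7F, device_id, 0x02, command_format, 0x01])  # 0x01 = GO
    return header + cue_bytes + bytes([0xF7])
```

Calling `msc_go("12.5")` gives the bytes a MIDI interface would send to fire cue 12.5 on any desk listening for lighting commands.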

System diagram

In general the systems worked very well. The weakest point appeared to be NDI-Syphon, which would sometimes crash (luckily so far never during a performance) and which needs re-configuring for the show each time it is launched, as it appears to have no way of storing a configuration. The lack of NDI support in QLab itself is very frustrating, especially since NDI now appears to be a pretty standard part of many workflows.

The other issue I faced was that, thanks to Apple removing most ports from their machines, I needed to use external network adaptors on the two MacBooks. I discovered that while USB-Ethernet adaptors claim to work at up to 1 Gb/s, they do not cope nicely with both NDI and Dante on the same adaptor. I solved the problem by using a FireWire adaptor on the older machine, and in future may consider two separate interfaces for the two protocols.

Audio

The original Sound Designer, Jon McLeod, was on board but sadly unable to travel to our studio. This ended up being a benefit, as he was able to edit his original content from home (using TeamViewer to connect to QLab running on site) and listen to the content over Zoom, meaning he only heard it in the way the audience would.

Audio schematic

Along with getting the audio from QLab into Zoom, which we did by sending content via Dante over the network to OBS which fed into Zoom, we also had to ensure that Nigel could hear the track, and that Nigel could hear the audience without there being issues with them hearing themselves.

We discovered a few things along the way, some of which were obvious in hindsight, others maybe not:

It was imperative that the Zoom instance into which we fed Nigel’s video and audio was also the Zoom instance from which we took the audio of the audience – this way we could utilize Zoom’s audio processing to our advantage and avoid any of Nigel coming back into our feed
By close mic’ing Nigel we could have enough audio coming back into the room for Nigel to hear the audience, while Zoom was clever enough to cancel that sound out of the feed that went back into Zoom. We did however have to manually adjust this depending on the volume of the audience member concerned - a compressor would probably have been useful in the chain
We should never feed any of the audience’s audio back into Zoom, as Zoom does this for us
Having any soundscape running while the audience are un-muted is likely to be unsuccessful (as Zoom will often just cut that out)
Slow fades to silence will confuse Zoom, even with “original sound” turned on
Audience members are often unaware of background noise in their area (especially when wearing noise cancelling headphones), so the Zoom operator would need to keep careful watch on the gallery to mute anyone causing a disturbance to the piece

Teaching the audience

The final - and possibly most unexpected - piece of the puzzle was ensuring that the audience would watch the piece in the way that we imagined. For the show to work we needed the audience to see the other audience members only when we wanted them to, and we also wanted to encourage them to participate when required, and only then.

We decided early on to ask the audience when booking whether they wanted to be in the 'front row', i.e. to be placed in the gallery and to have the ability to speak to Nigel during the performance. In hindsight we feel we could have explained this question better, as many people found the term front row scary and so refused, meaning we had to do some encouragement to get people to move into it.

For those audience members who were not asked to participate, Zoom webinars are pretty perfect. By leaving audience members as 'attendees' they have no choice but to watch our speaker view (which we choose), they are unable to see the other audience members (unless we include them), and they cannot disturb the performance, so reducing the risk of unexpected 'zoom bombs'. We did however find that some audience members were concerned when logging in that their camera did not start up and that the Zoom interface felt different to the one they would normally use, leaving them to think they may not have joined the show correctly.

For those audience members who joined the 'front row' we used the Zoom 'panellist' feature. This allows us to promote attendees to a higher status but still gives us the opportunity to control when these audience members can mute/un-mute themselves. This generally worked well but we did discover a few issues:

The audience member needs to ensure they are in "Active Speaker View", otherwise they will see the normal Zoom gallery view
Unless the audience member is watching in full screen mode, there is no way for them to hide the other participants
Once in full screen mode the audience member needs to minimise their "thumbnail view", otherwise they will still see the other audience members
While (thankfully) the Windows and macOS views are the same, Zoom functions differently on a phone or tablet, so we had to ensure we also gave users with those devices the correct instructions
We have no way of knowing during the performance whether the user has seen or understood any instructions we have given
Audience members often only join the show at, or very slightly before, the performance start time, giving little time for them to understand the information given to them, or for us to sort them into the correct categories
When a Zoom performance starts late it is difficult to communicate to the audience the reasons for the delay

To solve these problems we developed a set of slides which played out pre-show to guide the audience and reassure them they were in the correct place. For the first few performances we had the additional problem that audience members were only asked to join five minutes before the start time, leading to a mad rush to allocate people to the correct role (front row or not) and leaving them very little time to read or digest the instructions.

We also developed a set of instructions to give to the audience pre-show, which we hope helped them to get their setup correct.

While these solutions helped, we still found that audience members were often confused or anxious at the start of the show, and we had no way of checking that they had configured their computer correctly.

If I were to do another similar piece of work I would encourage the artists I work with to include some form of introductory section to the piece which would feel like part of the performance, but allow information to be reiterated and give us time to perform any actions we need to once the audience have all joined.