WebRTC Video chat tutorial using Rust+WASM
Today we are going to walk through a simple video-chat application in Rust using WebRTC.
Scope of this tutorial
- Overview of WebRTC and Signalling flow
- Rust to WASM code for 2 way video + audio call client in the browser
- Seriously Simple Signalling Server
Pre-requisites
- Rust + Cargo installed
- 2 WebRTC-capable Browsers
- Some experience using Rust.
Longish Foreword
You need to enable Media Capture (camera and mic access) from URLs without certificates (instances hosted on your dev machine that don't have SSL/TLS certs).
This needs to be done on both mobile and desktop browsers.
To do this, navigate to chrome://flags and add your server IP and port (i.e. http://ip:port) under Insecure origins treated as secure.
On Firefox this is under about:config: change media.devices.insecure.enabled to true and media.getusermedia.insecure.enabled to true.
If you plan on using your smartphone browser as your second WebRTC client, I recommend using Chrome mobile, and the same process applies as above.
I also recommend using remote debugging in Chrome desktop connected to Chrome mobile over USB to get the JS console output from your smartphone on your PC; this will help you see what's happening inside the browser.
We will be compiling Rust to WASM. If you are new to this space it may be worthwhile to read up on the Rust-WASM scene first; there are links at the bottom of this article.
We will not be using TURN or STUN servers: this runs on a local network, and as long as both clients' IP addresses can be reached we won't need STUN/TURN.
This guide is meant to be read right alongside the corresponding source code: repo-link. I will include snippets, but I will mostly reference functions/types.
WebRTC TLDR
Before we get into the code, here is a quick rundown of the application architecture and WebRTC flow.
Architecture
To set up the video call we will be writing a simple signalling server; its only job is to relay setup information between the two peers.
Once the call has been set up, the call runs directly between the two browsers and no data flows through the signalling server.
WebRTC Signalling Flow
The signalling flow happens in a couple of steps; the in-depth version from Mozilla is over here. The short version is below.
- Peer B is set up as a listener waiting for a Video Offer; Peer A initiates by sending the Video Offer to Peer B. Peer A immediately starts sending ICE Candidates to Peer B.
- Peer B gets the Video Offer and multiple of Peer A's ICE Candidates. Peer B sets its remote SDP description equal to Peer A's Video Offer.
- Peer B sends a Video Answer back to Peer A. Peer B immediately starts sending ICE Candidates to Peer A.
- Peer A gets the Video Answer and sets its remote SDP description equal to Peer B's Video Answer.
- Multiple ICE Candidates are sent between the two peers until they have enough candidates to agree on a method of connecting to each other.
- Once they have agreed on a method to communicate, they add their own Media Streams to the connection, and each Peer gets notified that the other Peer has added a Media Stream to the connection. The RtcIceConnectionState changes to Connected, callbacks are triggered, the streams are output to a video element, and video is transmitted between the peers.
Let's get into the code
There are four parts to get this working:
1. A simple static webpage (with some JavaScript to bootstrap the WASM)
2. The client-side code that compiles to WASM and runs in the browser.
3. A signalling server implementation
4. A small library that holds our signalling protocol
1. The Webpage
This is nothing special, a simple HTML page with some buttons, and video elements.
2. The client-side WASM
This section can be further broken down into: the RTCPeerConnection, signalling (via WebSockets), the SDP messages, and ICE Candidate handling. In our implementation there is also some session-handling code, which is explained later in this article.
The RTCPeerConnection
The RTCPeerConnection is the main driver of the WebRTC connection.
The main functions we care about are:
- set_oniceconnectionstatechange: takes a closure which handles what to do when the ICE Connection State changes
- add_stream: adds a media stream to the connection; this is how each peer's camera/mic stream ends up in the other peer's video element
- set_onicecandidate: takes a closure which is called each time the local ICE agent produces a new candidate, which we then send to the other peer
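Here is a minimal sketch of wiring these up with web-sys. It is not the exact code from the repo; it assumes the web-sys features RtcPeerConnection, RtcPeerConnectionIceEvent, RtcIceCandidate, RtcIceConnectionState and console are enabled in Cargo.toml, and it only logs to the console where the real client would forward data over the signalling WebSocket.

```rust
use wasm_bindgen::prelude::*;
use wasm_bindgen::JsCast;
use web_sys::{RtcPeerConnection, RtcPeerConnectionIceEvent};

// Sketch: create a peer connection and hook up the two callbacks we care about.
fn create_peer_connection() -> Result<RtcPeerConnection, JsValue> {
    // No STUN/TURN config: on a local network the host candidates are enough.
    let pc = RtcPeerConnection::new()?;

    // Fires every time the local ICE agent gathers a candidate.
    // In the real client this is where we send it to the other peer via signalling.
    let on_ice_candidate = Closure::<dyn FnMut(_)>::new(move |ev: RtcPeerConnectionIceEvent| {
        if let Some(candidate) = ev.candidate() {
            web_sys::console::log_1(&candidate.candidate().into());
        }
    });
    pc.set_onicecandidate(Some(on_ice_candidate.as_ref().unchecked_ref()));
    on_ice_candidate.forget(); // keep the closure alive for the page's lifetime

    // Fires whenever the ICE Connection State changes (Checking, Connected, ...).
    let pc_clone = pc.clone();
    let on_state_change = Closure::<dyn FnMut()>::new(move || {
        web_sys::console::log_1(&format!("ICE state: {:?}", pc_clone.ice_connection_state()).into());
    });
    pc.set_oniceconnectionstatechange(Some(on_state_change.as_ref().unchecked_ref()));
    on_state_change.forget();

    Ok(pc)
}
```

The .forget() calls intentionally leak the closures so they outlive this function; for a page-lifetime object like the peer connection that is the simplest option.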
Signalling
To handle signalling, we will be using an enum to wrap the setup messages and then send them between client and server. We use serde to serialize/deserialize the messages, so our enum must derive Serialize and Deserialize.
For signalling we have 4 different types of messages:
VideoOffer(String, SessionID)
VideoAnswer(String, SessionID)
IceCandidate(String, SessionID)
ICEError(String, SessionID)
These are part of a larger enum, SignalEnum, covered later in this article, which we use to handle all communication between the client and the signalling server.
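A sketch of what those variants look like with the serde derives (the variant names and payloads come from the list above; the SessionID alias here is an assumption, the repo may use a newtype instead):

```rust
use serde::{Deserialize, Serialize};

// Assumed alias; in the repo SessionID may be a dedicated type.
type SessionID = String;

// The WebRTC-setup subset of SignalEnum.
// The session-management variants are covered in part 4.
#[derive(Serialize, Deserialize, Debug, Clone)]
pub enum SignalEnum {
    VideoOffer(String, SessionID),
    VideoAnswer(String, SessionID),
    IceCandidate(String, SessionID),
    ICEError(String, SessionID),
    // ...
}
```

On the wire, each message is simply the serde_json string of one of these variants, sent as text over the WebSocket.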
Session Description Protocol (SDP)
There are 3 functions that are important with respect to SDP:
1. Creating the SDP offer and sending it from Peer A inside a VideoOffer(String) to Peer B
2. Receiving the SDP offer at Peer B, setting Peer B's remote description to the offer, and sending an SDP answer description back from Peer B inside a VideoAnswer(String) to Peer A
3. Receiving the answer at Peer A, and setting Peer A's remote description to the answer.
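As a rough sketch of steps 1 and 2 using web-sys and wasm-bindgen-futures (the function names are mine, not the repo's, and error handling is trimmed):

```rust
use wasm_bindgen::JsValue;
use wasm_bindgen_futures::JsFuture;
use web_sys::{RtcPeerConnection, RtcSdpType, RtcSessionDescriptionInit};

// Peer A: create an offer and install it as the local description.
// The returned SDP string is what goes inside VideoOffer(..).
async fn create_offer(pc: &RtcPeerConnection) -> Result<String, JsValue> {
    let offer = JsFuture::from(pc.create_offer()).await?;
    let sdp = js_sys::Reflect::get(&offer, &JsValue::from_str("sdp"))?
        .as_string()
        .ok_or_else(|| JsValue::from_str("offer had no sdp field"))?;

    let mut desc = RtcSessionDescriptionInit::new(RtcSdpType::Offer);
    desc.sdp(&sdp);
    JsFuture::from(pc.set_local_description(&desc)).await?;
    Ok(sdp)
}

// Peer B: take the SDP string out of a received VideoOffer and make it the remote description.
async fn accept_offer(pc: &RtcPeerConnection, sdp: &str) -> Result<(), JsValue> {
    let mut desc = RtcSessionDescriptionInit::new(RtcSdpType::Offer);
    desc.sdp(sdp);
    JsFuture::from(pc.set_remote_description(&desc)).await?;
    Ok(())
}
```

Step 3 is the mirror image: Peer B calls create_answer, and Peer A applies that answer with set_remote_description using RtcSdpType::Answer.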
Interactive Connectivity Establishment (ICE)
There are three things we are concerned about with ICE:
1. Sending our candidates to the other peer
2. Receiving the other peer's candidates (a sketch of this follows below)
3. The ICE state change
The ICE Connection state changing to Connected is what triggers our call to display the remote incoming video stream in our local HTML video element.
The possible ICE Connection states are New, Checking, Connected, Completed, Failed, Disconnected and Closed.
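Here is a minimal sketch of item 2: applying a candidate string that arrived in an IceCandidate message. The helper name and the single-m-line assumption are mine, not the repo's.

```rust
use wasm_bindgen::JsValue;
use wasm_bindgen_futures::JsFuture;
use web_sys::{RtcIceCandidateInit, RtcPeerConnection};

// Apply a candidate received from the other peer to our local connection.
async fn add_remote_candidate(pc: &RtcPeerConnection, candidate: &str) -> Result<(), JsValue> {
    let mut init = RtcIceCandidateInit::new(candidate);
    // Assumption: audio and video are bundled on the first m-line. A fuller
    // implementation would also send sdp_mid / sdp_m_line_index over signalling.
    init.sdp_m_line_index(Some(0));
    JsFuture::from(pc.add_ice_candidate_with_opt_rtc_ice_candidate_init(Some(&init))).await?;
    Ok(())
}
```

Item 1 is the other direction: in the set_onicecandidate callback shown earlier, each locally gathered candidate gets wrapped in an IceCandidate message and sent over the signalling WebSocket.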
To compile the Rust client to WASM, from /wasm_client/ run cargo make build, or cargo make watch if you plan on tinkering. This will output compiled binaries and some bootstrap JS inside /wasm_client/pkg/.
To host this webpage we can use the microserver crate: first install it using cargo install microserver, then run cargo make serve in the root directory of the project.
Check /wasm_client/Makefile.toml to see the command that is run.
3. The Signalling Server
The signalling server will be using WebSockets to send strings to and from the clients; for this we will use async-tungstenite.
You can use whatever transport method you’d like to send signalling messages.
Internally the signalling server state will have to hold:
1. Active connections (a combination of PeerMap + UserList)
2. Some state of the WebRTC sessions (SessionList) to group peers into a session and pass signalling info between peers
The short version of how this works is: when a new client connects to the signalling server, they are assigned a random UserId and are added to the UserList and PeerMap application state.
When a client either starts listening for connections, or connects to another client that is listening, the client is placed in a SessionMembers struct, either as a host or a guest, inside the SessionList part of the application state.
When a client disconnects, they are removed from all of the above state: UserList, PeerMap, and SessionList if they started/joined a session.
The reason we are wrapping all of the state in Arc<Mutex<>>s is that async-tungstenite spawns a new task for each new client connection, so our access to any state needs to be Send (thread safe).
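A sketch of what that shared state can look like. The concrete shapes are assumptions (the repo's PeerMap, UserList and SessionList may differ in detail), and the futures unbounded channel is just one common way to hand each connection task an outgoing sender:

```rust
use std::collections::{HashMap, HashSet};
use std::sync::{Arc, Mutex};

use futures::channel::mpsc::UnboundedSender;

type UserId = u64;
type SessionID = String;

// Who is in a session: the listener (host) and, once someone joins, the guest.
pub struct SessionMembers {
    pub host: UserId,
    pub guest: Option<UserId>,
}

// Outgoing message channels keyed by user, so the task handling one client's
// socket can forward signalling messages to the other peer's task.
pub type PeerMap = Arc<Mutex<HashMap<UserId, UnboundedSender<String>>>>;
pub type UserList = Arc<Mutex<HashSet<UserId>>>;
pub type SessionList = Arc<Mutex<HashMap<SessionID, SessionMembers>>>;
```

Each connection task clones the Arcs it needs, locks briefly to read or update the maps, and should never hold a guard across an .await.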
To build: from the /signalling-server/ directory run cargo make build.
To start the server: run cargo make servesignal in the root directory of the project.
Check /signalling-server/Makefile.toml to see the command that is run.
4. The Signalling Protocol
This describes the protocol that needs to be known by both the front end and the back end. VideoOffer and VideoAnswer contain SDP messages; IceCandidate contains (you guessed it) an ICE candidate.
The SessionJoin/Ready/New/JoinSuccess/JoinError enum members are all for managing a WebRTC Session between our two peers.
When a client chooses to start listening, SessionNew is sent to the signalling server, and a SessionReady holding a SessionID is sent from the signalling server back to the client.
This SessionID is displayed on the web page.
When a different client wants to join that session, they need to enter the SessionID, which gets sent wrapped inside SessionJoin. The server responds with either a SessionJoinSuccess, and begins the signalling flow, or a SessionJoinError if the session does not exist in the signalling server state.
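Putting the two halves together, the server-side handling of those session messages might look roughly like this. It reuses the hypothetical SessionList / SessionMembers shapes from part 3, assumes the session variants carry the SessionID payloads described above, and uses a toy session-id generator:

```rust
// Toy session-id generator for the sketch; the repo presumably uses something random.
fn generate_session_id() -> SessionID {
    use std::time::{SystemTime, UNIX_EPOCH};
    format!("{}", SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_nanos())
}

// Decide how to answer a session message from `user`; returns the reply to send back.
fn handle_session_message(
    msg: SignalEnum,
    user: UserId,
    sessions: &SessionList,
) -> Option<SignalEnum> {
    match msg {
        SignalEnum::SessionNew => {
            // Client wants to start listening: create a session with them as host.
            let id = generate_session_id();
            sessions
                .lock()
                .unwrap()
                .insert(id.clone(), SessionMembers { host: user, guest: None });
            Some(SignalEnum::SessionReady(id))
        }
        SignalEnum::SessionJoin(id) => match sessions.lock().unwrap().get_mut(&id) {
            Some(members) => {
                // Session exists: record the guest and kick off the signalling flow.
                members.guest = Some(user);
                Some(SignalEnum::SessionJoinSuccess(id))
            }
            // No such session in the server state.
            None => Some(SignalEnum::SessionJoinError(id)),
        },
        // VideoOffer / VideoAnswer / IceCandidate are simply forwarded to the other peer.
        _ => None,
    }
}
```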
Conclusion
So now you should have an overview of how to make a basic WebRTC application in Rust!
You can take this further by looking into TURN and STUN servers, and into the webrtc project that is underway to flesh out support for the WebRTC ecosystem in Rust.
If you feel like it, you can visit my buy me a coffee page, but there is no pressure from me. The great thing about this article is the sharing of free knowledge and some (hopefully) working code of a cool little project.
Happy coding and go make something great!
Further Useful links:
- Wasm-Bindgen (CLI tool that facilitates high-level interactions between WASM modules and JavaScript): https://rustwasm.github.io/docs/wasm-bindgen
- Wasm-pack (tool for building and working with Rust-generated WebAssembly that you want to interop with JavaScript): https://rustwasm.github.io/docs/book/ and https://rustwasm.github.io/docs/wasm-pack/
- In-depth guide to WebRTC and its protocols: https://webrtcforthecurious.com/
- Web-sys crate (very useful for web dev): https://crates.io/crates/web-sys
- In-depth guide to the signalling flow: https://developer.mozilla.org/en-US/docs/Web/API/WebRTC_API/Signaling_and_video_calling
- Does your browser support WebRTC? Check https://caniuse.com/?search=webrtc