When OBS Ingest Is the Right Choice
If you already produce events in OBS, vMix, or hardware encoders, InterScribe can ingest your feed through RTMP, SRT, or WHIP and generate live captions and translations for your audience.
This model is ideal for centralized broadcast workflows and remote audiences. It is usually not the first choice for in-room low-latency interpretation because protocol buffering introduces delay.
Choose the Right Protocol
From InterScribe protocol guidance:
- RTMP: easiest compatibility, typically 5-10s latency.
- SRT: usually lower latency than RTMP (often 2-5s), stronger on unstable networks.
- WHIP: modern low-latency path (around 1s in favorable conditions), tool support still maturing.
Choose based on compatibility first, then latency budget.
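The "compatibility first, then latency budget" rule can be sketched as a small lookup. This is an illustrative sketch only: the latency ranges are the typical figures quoted above, not guarantees, and `choose_protocol` and its parameters are hypothetical names rather than part of any InterScribe API.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Protocol:
    name: str
    # Typical (low, high) latency in seconds, taken from the guidance above.
    typical_latency_s: tuple[float, float]


# Ordered from broadest encoder compatibility to least mature tool support.
PROTOCOLS = [
    Protocol("RTMP", (5.0, 10.0)),
    Protocol("SRT", (2.0, 5.0)),
    Protocol("WHIP", (1.0, 1.0)),  # ~1s only in favorable conditions
]


def choose_protocol(encoder_supports: set[str], latency_budget_s: float) -> Optional[str]:
    """Return the most compatible supported protocol whose typical
    worst-case latency fits the budget, or None if nothing fits."""
    for protocol in PROTOCOLS:
        supported = protocol.name in encoder_supports
        fits_budget = protocol.typical_latency_s[1] <= latency_budget_s
        if supported and fits_budget:
            return protocol.name
    return None


if __name__ == "__main__":
    print(choose_protocol({"RTMP", "SRT"}, latency_budget_s=6.0))
```

A `None` result means no ingest protocol your encoder supports is likely to meet the budget, which is a signal to revisit the budget or consider a non-ingest path.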
Step-by-Step Setup
1. Create or open your target session
In the InterScribe dashboard, open the session that will receive encoder audio.
2. Assign an AV Channel to the session
Under audio configuration, set the AV Channel this ingress will feed.
3. Create an ingress endpoint
Go to A/V Inputs → Ingresses and create a new ingress tied to the same AV Channel. Select RTMP, SRT, or WHIP.
4. Copy endpoint credentials
Copy the stream URL, stream key, or protocol endpoint details exactly as shown.
5. Configure OBS output
In OBS output settings:
- Set the service or custom server to the endpoint InterScribe provided.
- Paste the stream key if the protocol requires it.
- Route the correct audio source (clean speech feed preferred).
6. Validate your audio source strategy
Do not send a loud music-heavy mix when speech intelligibility is your goal. Prioritize podium/host microphones and clear voice channels.
7. Start stream and monitor in InterScribe
Once OBS starts, confirm that InterScribe receives signal on the assigned channel and captions begin flowing.
8. Validate attendee-facing outputs
From a second device, join as attendee and verify caption language switching and readability.
9. Monitor latency expectations live
Set moderator expectations based on protocol. If using RTMP, communicate that caption and translation display will reflect normal broadcast delay.
10. Capture final settings for reuse
Save OBS scene profile, routing notes, and endpoint mappings for faster recurring setup.
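Because the endpoint details from step 4 must be copied exactly, a quick pre-flight check can catch common paste mistakes before going live. The sketch below assumes the usual URL schemes for each protocol (`rtmp`/`rtmps` for RTMP, `srt` for SRT, and `http`/`https` for WHIP, which is HTTP-based). The `check_endpoint` helper and the `ingest.example.com` host are hypothetical placeholders, not InterScribe values.

```python
from urllib.parse import urlparse

# URL schemes conventionally used by each ingest protocol.
EXPECTED_SCHEMES = {
    "RTMP": {"rtmp", "rtmps"},
    "SRT": {"srt"},
    "WHIP": {"http", "https"},
}


def check_endpoint(protocol: str, url: str) -> list:
    """Return a list of human-readable problems; an empty list means the
    URL looks plausible for the chosen protocol."""
    problems = []
    if url != url.strip():
        problems.append("URL has leading or trailing whitespace")
    parsed = urlparse(url.strip())
    allowed = EXPECTED_SCHEMES.get(protocol.upper())
    if allowed is None:
        problems.append(f"unknown protocol: {protocol}")
    elif parsed.scheme not in allowed:
        problems.append(
            f"scheme '{parsed.scheme}' does not match {protocol.upper()}"
        )
    if not parsed.hostname:
        problems.append("URL has no host")
    return problems


if __name__ == "__main__":
    # Placeholder host; substitute the endpoint shown in your dashboard.
    for issue in check_endpoint("SRT", "rtmp://ingest.example.com/live"):
        print(issue)
```

This only checks the shape of the URL. It does not confirm that the endpoint is reachable or that the stream key is valid, so the monitoring in step 7 is still needed.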
Production Best Practices
- Keep encoder machines on stable wired internet when possible.
- Avoid unnecessary transcode layers before InterScribe ingest.
- Label channels with source intent (for example: "Main Stage Clean Feed").
- Rehearse with real speaking pace, not only test tones.
Troubleshooting Matrix
| Symptom | Likely root cause | Fix |
|---|---|---|
| No captions despite active stream | Wrong channel binding | Confirm ingress and session use same AV Channel |
| Captions are inaccurate because audio is distorted | Gain staging or clipping | Lower input gain, validate clean source before encoder |
| Delay is too high for use case | Protocol mismatch | Move from RTMP to SRT/WHIP, or use Desktop/Web Agent for lower latency |
| Intermittent drops | Network instability | Prefer wired uplink, reduce encoder load, monitor packet loss |
Latency Decision Framework
Use this simple decision rule:
- Need near real-time in-room language access: use Desktop Agent/Web Agent first.
- Need compatibility with existing broadcast stack for remote audience: OBS ingest is a strong fit.
- Need lower-delay protocol in modern pipeline: evaluate WHIP with your toolchain.
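For teams that want this rule in a runbook or pre-event script, it can be written as a short function. This is a sketch under assumptions: the in-room check comes first because the rule says to use the agents "first", but the ordering of the other two branches and the `None` fallback are illustrative choices, and all names are hypothetical.

```python
from enum import Enum
from typing import Optional


class IngestPath(Enum):
    AGENT = "Desktop Agent/Web Agent"
    OBS_INGEST = "OBS ingest"
    WHIP_EVALUATION = "OBS ingest, evaluating WHIP with your toolchain"


def recommend_path(
    in_room_realtime: bool,
    existing_broadcast_stack: bool,
    wants_lower_delay: bool,
) -> Optional[IngestPath]:
    """Apply the three decision rules in order of priority."""
    if in_room_realtime:
        # In-room language access needs the lowest delay available.
        return IngestPath.AGENT
    if wants_lower_delay:
        # Assumed ordering: a lower-delay requirement outranks plain compatibility.
        return IngestPath.WHIP_EVALUATION
    if existing_broadcast_stack:
        return IngestPath.OBS_INGEST
    # None of the rules apply; decide manually.
    return None


if __name__ == "__main__":
    choice = recommend_path(
        in_room_realtime=False,
        existing_broadcast_stack=True,
        wants_lower_delay=False,
    )
    print(choice.value if choice else "no rule matched")
```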
Hand-off Documentation Template
After each event, store:
- Protocol used.
- Encoder profile name.
- Audio source map.
- Observed end-to-end delay.
- Incidents and fix notes.
Teams that document this can onboard new operators quickly and avoid re-debugging known setup details.
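One way to keep these records consistent is a small structured template saved alongside the event files. The sketch below is a possible shape, not an InterScribe format: the `EventHandoff` class, its field names, and the sample values are all hypothetical.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass, field


@dataclass
class EventHandoff:
    protocol: str                     # e.g. "RTMP", "SRT", or "WHIP"
    encoder_profile: str              # name of the saved encoder profile
    audio_source_map: dict[str, str]  # channel label -> physical source
    observed_delay_s: float           # measured end-to-end delay in seconds
    incidents: list[str] = field(default_factory=list)  # incident and fix notes

    def to_json(self) -> str:
        """Serialize the record as stable, human-readable JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Sample values are illustrative only.
    record = EventHandoff(
        protocol="SRT",
        encoder_profile="main-stage-profile",
        audio_source_map={"Main Stage Clean Feed": "podium microphone"},
        observed_delay_s=3.5,
    )
    print(record.to_json())
```

Sorted keys keep the output stable between events, so differences between two hand-off files show only what actually changed.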
Final Checklist Before You Publish This Process Internally
- The workflow names the exact InterScribe menu path for every critical action.
- Your team has a pre-event test session and a post-event review rhythm.
- Staff can explain fallback behavior in one sentence.
- Attendee-facing instructions are short, visible, and multilingual.
- Ownership is clear for setup, go-live monitoring, and post-event follow-up.
When these five points are true, the process is no longer theoretical. It is operational, trainable, and repeatable.

