
How is A/V sync specified in the IETF RFCs?

RFC 6051 describes in general terms how A/V sync is done: https://tools.ietf.org/html/rfc6051

RTP flows are synchronised by receivers based on information that is
   contained in RTCP SR packets generated by senders (specifically, the
   NTP-format timestamp and the RTP timestamp).  Synchronisation
   requires that a common reference clock MUST be used to generate the
   NTP-format timestamps in a set of flows that are to be synchronised
   (i.e., when synchronising several RTP flows, the RTP timestamps for
   each flow are derived from separate, and media specific, clocks, but
   the NTP-format timestamps in the RTCP SR packets of all flows to be
   synchronised MUST be sampled from the same clock).  To achieve faster
   and more accurate synchronisation, it is further RECOMMENDED that
   senders and receivers use a synchronised common NTP-format reference
   clock with common properties, especially timebase, where possible
   (recognising that this is often not possible when RTP is used outside
   of controlled environments); the means by which that common reference
   clock and its properties are signalled and distributed is outside the
   scope of this memo.
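
As a rough illustration of the paragraph above: a receiver can map any RTP timestamp of a flow onto the sender's NTP timeline by using the (NTP timestamp, RTP timestamp) pair from the most recent SR together with the flow's clock rate. The following is a minimal sketch, not WebRTC code. The names SrMapping, RtpToNtpMs and AvOffsetMs are hypothetical, and the sketch assumes that the NTP timestamp has already been converted to milliseconds and that clock drift can be ignored.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical holder for the SR fields a receiver needs for synchronisation.
struct SrMapping {
  int64_t ntp_ms;     // NTP-format timestamp from the SR, converted to ms
  uint32_t rtp_ts;    // RTP timestamp carried in the same SR
  int clock_rate_hz;  // media clock rate, e.g. 90000 for video
};

// Maps an RTP timestamp of the flow onto the sender's NTP timeline (in ms).
// The unsigned subtraction followed by a cast to int32_t copes with 32-bit
// wraparound as long as both timestamps are less than 2^31 ticks apart.
inline int64_t RtpToNtpMs(const SrMapping& sr, uint32_t rtp_ts) {
  assert(sr.clock_rate_hz > 0);
  const int32_t delta_ticks = static_cast<int32_t>(rtp_ts - sr.rtp_ts);
  return sr.ntp_ms +
         static_cast<int64_t>(delta_ticks) * 1000 / sr.clock_rate_hz;
}

// Capture-time offset between a video frame and an audio frame.
// A positive result means the video frame was captured later.
inline int64_t AvOffsetMs(const SrMapping& audio_sr, uint32_t audio_rtp,
                          const SrMapping& video_sr, uint32_t video_rtp) {
  return RtpToNtpMs(video_sr, video_rtp) - RtpToNtpMs(audio_sr, audio_rtp);
}
```

Because both flows' NTP timestamps are sampled from the same sender clock, the two mapped values are directly comparable even though the RTP clocks (here 90 kHz and, say, 48 kHz) are unrelated.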

A minimum reporting interval of 5 seconds is RECOMMENDED.

RTCP sender report

The layout of a sender report (SR) packet:

6.3.1 SR: Sender report RTCP packet (RFC 1889 numbering; the same layout is Section 6.4.1 of RFC 3550)

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|    RC   |   PT=SR=200   |             length            | header
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         SSRC of sender                        |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|              NTP timestamp, most significant word             | sender
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ info
|             NTP timestamp, least significant word             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         RTP timestamp                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                     sender's packet count                     |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      sender's octet count                     |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                 SSRC_1 (SSRC of first source)                 | report
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
| fraction lost |       cumulative number of packets lost       |   1
-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           extended highest sequence number received           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                      interarrival jitter                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         last SR (LSR)                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                   delay since last SR (DLSR)                  |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                 SSRC_2 (SSRC of second source)                | report
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ block
:                               ...                             :   2
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
|                  profile-specific extensions                  |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
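
To make the layout concrete, here is a minimal sketch of reading the header and sender-info part (the first 28 bytes) of an SR from a raw buffer. This is not WebRTC's parser; SenderReportInfo, ReadBe32, ParseSenderReport and NtpToMs are hypothetical names, and report blocks and profile-specific extensions are deliberately ignored.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain struct for the fixed part of an SR.
struct SenderReportInfo {
  uint8_t report_count;   // RC: number of report blocks that follow
  uint16_t length_words;  // length field: packet size in 32-bit words minus one
  uint32_t sender_ssrc;
  uint32_t ntp_msw;       // NTP seconds (most significant word)
  uint32_t ntp_lsw;       // NTP fraction, in units of 2^-32 s
  uint32_t rtp_timestamp;
  uint32_t packet_count;
  uint32_t octet_count;
};

// Reads a big-endian (network byte order) 32-bit value.
inline uint32_t ReadBe32(const uint8_t* p) {
  return (uint32_t{p[0]} << 24) | (uint32_t{p[1]} << 16) |
         (uint32_t{p[2]} << 8) | uint32_t{p[3]};
}

// Parses the header and sender info. Returns false if the buffer is too
// short, the version is not 2, or the packet type is not SR (200).
inline bool ParseSenderReport(const uint8_t* data, size_t size,
                              SenderReportInfo* out) {
  assert(out != nullptr);
  constexpr size_t kFixedSize = 28;
  constexpr uint8_t kPacketTypeSr = 200;
  if (data == nullptr || size < kFixedSize) return false;
  if ((data[0] >> 6) != 2 || data[1] != kPacketTypeSr) return false;

  out->report_count = data[0] & 0x1F;
  out->length_words = static_cast<uint16_t>((data[2] << 8) | data[3]);
  out->sender_ssrc = ReadBe32(data + 4);
  out->ntp_msw = ReadBe32(data + 8);
  out->ntp_lsw = ReadBe32(data + 12);
  out->rtp_timestamp = ReadBe32(data + 16);
  out->packet_count = ReadBe32(data + 20);
  out->octet_count = ReadBe32(data + 24);
  return true;
}

// Converts the 32.32 fixed-point NTP timestamp to milliseconds.
inline uint64_t NtpToMs(uint32_t msw, uint32_t lsw) {
  return uint64_t{msw} * 1000 + ((uint64_t{lsw} * 1000) >> 32);
}
```

For A/V sync only two of these fields matter: the NTP timestamp and the RTP timestamp, which together tie the media clock to the sender's wallclock.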

 

How WebRTC sends the sender report

The related code is:

bool RTCPSender::TimeToSendRTCPReport(bool sendKeyframeBeforeRTP) const {
  /*
      For audio we use a configurable interval (default: 5 seconds)

      For video we use a configurable interval (default: 1 second) for a BW
      smaller than 360 kbit/s, technically we break the max 5% RTCP BW for
      video below 10 kbit/s but that should be extremely rare
  */
  // ... (rest of the function is not shown in this excerpt)
}

 

The sender report itself is built in BuildSR:

std::unique_ptr<rtcp::RtcpPacket> RTCPSender::BuildSR(const RtcpContext& ctx) {
  // Timestamp shouldn't be estimated before first media frame.
  RTC_DCHECK_GE(last_frame_capture_time_ms_, 0);
  // The timestamp of this RTCP packet should be estimated as the timestamp of
  // the frame being captured at this moment. We are calculating that
  // timestamp as the last frame's timestamp + the time since the last frame
  // was captured.
  int rtp_rate = rtp_clock_rates_khz_[last_payload_type_];
  if (rtp_rate <= 0) {
    rtp_rate =
        (audio_ ? kBogusRtpRateForAudioRtcp : kVideoPayloadTypeFrequency) /
        1000;
  }
  // Round now_us_ to the closest millisecond, because Ntp time is rounded
  // when converted to milliseconds,
  uint32_t rtp_timestamp =
      timestamp_offset_ + last_rtp_timestamp_ +
      ((ctx.now_us_ + 500) / 1000 - last_frame_capture_time_ms_) * rtp_rate;

  rtcp::SenderReport* report = new rtcp::SenderReport();
  report->SetSenderSsrc(ssrc_);
  report->SetNtp(TimeMicrosToNtp(ctx.now_us_));
  report->SetRtpTimestamp(rtp_timestamp);
  report->SetPacketCount(ctx.feedback_state_.packets_sent);
  report->SetOctetCount(ctx.feedback_state_.media_bytes_sent);
  report->SetReportBlocks(CreateReportBlocks(ctx.feedback_state_));

  return std::unique_ptr<rtcp::RtcpPacket>(report);
}
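
The arithmetic in BuildSR can be tried out in isolation. The sketch below reproduces the RTP timestamp extrapolation as a free function and adds an illustrative conversion from microseconds to the 32.32 fixed-point NTP format. ExtrapolateRtpTimestamp, NtpTimestamp and UnixMicrosToNtp are hypothetical names; in particular UnixMicrosToNtp is not WebRTC's TimeMicrosToNtp, and it assumes its input is microseconds since the Unix epoch.

```cpp
#include <cassert>
#include <cstdint>

// Estimates the RTP timestamp a frame captured "now" would carry, starting
// from the last sent frame. The clock rate is given in kHz (ticks per ms).
inline uint32_t ExtrapolateRtpTimestamp(uint32_t timestamp_offset,
                                        uint32_t last_rtp_timestamp,
                                        int64_t last_frame_capture_time_ms,
                                        int64_t now_us,
                                        int rtp_rate_khz) {
  assert(rtp_rate_khz > 0);
  const int64_t now_ms = (now_us + 500) / 1000;  // round to nearest ms
  const int64_t elapsed_ticks =
      (now_ms - last_frame_capture_time_ms) * rtp_rate_khz;
  // Unsigned arithmetic wraps modulo 2^32, exactly like RTP timestamps.
  return timestamp_offset + last_rtp_timestamp +
         static_cast<uint32_t>(elapsed_ticks);
}

// NTP timestamp: seconds since 1 Jan 1900 plus a 32-bit binary fraction.
struct NtpTimestamp {
  uint32_t seconds;
  uint32_t fraction;
};

// Assumption of this sketch: unix_us is microseconds since 1 Jan 1970.
inline NtpTimestamp UnixMicrosToNtp(int64_t unix_us) {
  constexpr int64_t kNtpUnixOffsetSeconds = 2208988800LL;  // 1900 to 1970
  assert(unix_us >= 0);
  const int64_t secs = unix_us / 1000000;
  const int64_t rem_us = unix_us % 1000000;
  NtpTimestamp ntp;
  ntp.seconds = static_cast<uint32_t>(secs + kNtpUnixOffsetSeconds);
  ntp.fraction = static_cast<uint32_t>((rem_us << 32) / 1000000);
  return ntp;
}
```

For example, with a 90 kHz video clock, a last frame stamped 90000 and captured at 5000 ms, an offset of 1000, and a current time of 5040400 us, the elapsed time rounds to 40 ms, i.e. 3600 ticks, so the SR carries RTP timestamp 1000 + 90000 + 3600 = 94600.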

 

How WebRTC handles a received sender report

When WebRTC receives a sender report, it tries to calculate the playout delay for audio and for video:

RTP/RTCP delay + decode delay + render delay = playout delay (for both audio and video)

The main code is at:

https://cs.chromium.org/chromium/src/third_party/webrtc/video/rtp_streams_synchronizer.cc

RtpStreamsSynchronizer::Process() runs periodically. If the audio/video difference is within 30 ms it does nothing; otherwise it updates the minimum playout delays:

 syncable_audio_->SetMinimumPlayoutDelay(target_audio_delay_ms);
 syncable_video_->SetMinimumPlayoutDelay(target_video_delay_ms);
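
The decision described above can be sketched roughly as follows. This is a heavily simplified illustration and not the WebRTC algorithm: the real code has more logic than a single comparison, and everything here except the 30 ms threshold mentioned in the text (PlayoutTargets, ComputeTargets, kSyncThresholdMs, the sign convention) is an assumption made for the example.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical result type: the delays that would be handed to the audio
// and video receivers.
struct PlayoutTargets {
  int audio_delay_ms;
  int video_delay_ms;
};

constexpr int kSyncThresholdMs = 30;  // threshold mentioned in the text

// relative_delay_ms: how much later video arrives than the matching audio,
// as derived from RTP/RTCP timing (sign convention assumed by this sketch).
// Returns false when the streams are close enough and nothing should change.
inline bool ComputeTargets(int relative_delay_ms,
                           int current_audio_delay_ms,
                           int current_video_delay_ms,
                           PlayoutTargets* targets) {
  assert(targets != nullptr);
  // Total lag of video behind audio at playout time.
  const int diff_ms =
      current_video_delay_ms - current_audio_delay_ms + relative_delay_ms;
  if (std::abs(diff_ms) < kSyncThresholdMs) return false;

  if (diff_ms > 0) {
    // Video plays out later than audio: hold audio back by the difference.
    targets->audio_delay_ms = current_audio_delay_ms + diff_ms;
    targets->video_delay_ms = current_video_delay_ms;
  } else {
    // Audio plays out later than video: hold video back instead.
    targets->audio_delay_ms = current_audio_delay_ms;
    targets->video_delay_ms = current_video_delay_ms - diff_ms;
  }
  return true;
}
```

When the function returns true, the two values in targets play the role of target_audio_delay_ms and target_video_delay_ms in the calls above.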

 

https://cs.chromium.org/chromium/src/third_party/webrtc/video/stream_synchronization.cc

This file contains ComputeRelativeDelay (mainly from the RTP/RTCP point of view) and ComputeDelays (mainly for the playout delay).