Enhancing SRS 5.0 with GB28181 Support: Challenges, Solutions, and Integration Strategies

Supporting GB28181 is the right thing to do, although it is also a challenging task. These challenges make it interesting.

Introduction

Thanks to the efforts of many friends, SRS already has numerous GB functionalities; details can be found in srs-gb28181[1]. Due to the complexity of GB and of cameras, there are also many issues, especially stability problems. This is why GB has not been integrated into the SRS 5.0 branch yet.

Now that SRS 5.0 is nearing a feature freeze, we have added several significant features and improvements. The last feature under consideration is whether to support GB. Given the current stability performance of GB, it certainly cannot be fully merged. Is there a way to achieve a more stable integration?

Reducing features naturally improves stability. Therefore, a possible approach for merging into SRS 5.0 is to merge only the simplest GB capability, which we will call PoC. Here is a list of GB functionalities I am aware of:

  1. Camera registration via SIP. Supported by srs-gb28181. PoC supported.
  2. Automatic camera stream pushing. Supported by srs-gb28181. PoC supported.
  3. GB/2016 to RTMP protocol conversion. Supported by srs-gb28181. PoC supported.
  4. SIP signaling over TCP. Supported by srs-gb28181. PoC supported.
  5. Single-port media transmission via TCP. Supported by srs-gb28181. PoC supported.
  6. SIP signaling over UDP. Supported by srs-gb28181. PoC not supported.
  7. Single-port media transmission via UDP. Supported by srs-gb28181. PoC not supported.
  8. GB/2011 to RTMP protocol conversion. Supported by srs-gb28181. PoC not supported.
  9. Multi-port media transmission via UDP/TCP. Supported by srs-gb28181. PoC not supported.
  10. GB stream query via HTTP API. Supported by srs-gb28181. PoC not supported.
  11. PTZ camera control via HTTP API. Supported by srs-gb28181. PoC not supported.
  12. Web management interface. Supported by srs-gb28181. PoC not supported.
  13. GB lower-level servers. Not supported by srs-gb28181. PoC not supported.
  14. GB voice intercom. Not supported by srs-gb28181. PoC not supported.
  15. GB playback. Not supported by srs-gb28181. PoC not supported.
  16. GB encrypted transmission. Not supported by srs-gb28181. PoC not supported.

Note: PoC refers to the GB capability that SRS 5.0 plans to integrate. srs-gb28181[2] refers to the current GB capabilities supported by SRS.

There are two conditions for merging into 5.0: the feature must be able to reach a relatively stable state, and it must be maintainable over the long term. Fewer features serve both goals.

In practice, it is almost impossible to launch a GB project using SRS directly. In fact, no open-source project can do this, because GB involves a lot of business logic and requires deep customization before going online. For this reason, GB support in SRS is similar to multithread support: it mainly provides a basic framework, so that users have a relatively stable foundation to customize.

It is especially recommended to use mature SIP libraries, such as the Java-based jsip[3], rather than relying on the SIP implementation of SRS. SIP involves complex business logic, and writing such logic in C++ is prone to errors. We have implemented an example based on jsip for reference; see srs-sip[4].

Note: SRS 5.0 will still implement a SIP protocol stack, but it will only cover the most essential capabilities, and there are no plans for future expansion. Its main role is to facilitate integration.

There is a continuous stream of friends using SRS for GB projects, and the general feedback is that GB is a deep pit that is hard to fill. We hope SRS’s humble contribution can make GB developers’ work a little easier.

Note: Apart from the GB protocol, more and more cameras are beginning to support RTMP streaming, and even SRT or WebRTC streaming. Looking forward, the GB protocol may not be the only viable protocol. However, considering the present usage of SRS for GB, it is reasonable for SRS to optimize GB.

Finally, most importantly, I hope everyone lowers their expectations. GB is too challenging a problem to solve, and it would be unwise to expect SRS to perform flawlessly.

Usage

First, compile and launch SRS; make sure the version is 5.0.74 or later:

    ./configure --gb28181=on
    make
    ./objs/srs -c conf/gb28181.conf

Then, in the camera settings, select AAC encoding, and configure the SIP server on the platform as SRS, as shown below:

[Image: GB28181 support]


  • Audio must be AAC. Under audio encoding, select AAC with a sampling rate of 44100 Hz.
  • The device must comply with the GB-2016 standard; otherwise, TCP is not supported. Select GB/T28181-2016 as the protocol version.
  • The transport must be TCP; UDP is not supported. Select TCP as the transport protocol and use the GB-2016 standard.

After camera registration, SRS will automatically invite the camera to stream. You can open the following links for playback:

  • http://localhost:8080/live/34020000001320000001.flv[5]
  • http://localhost:8080/live/34020000001320000001.m3u8[6]
  • webrtc://localhost/live/34020000001320000001[7]

Note: Please replace the stream name with your device name and then play.

Because my equipment at home is not suited to long-running stream tests, I have only verified 13 hours of continuous streaming without issues. We invite everyone to test with dedicated environments and equipment; our first small goal is a month of uninterrupted streaming.

Candidate

The definition of Candidate in GB aligns with the concept of a Candidate[8] in WebRTC; both require exposing an IP address accessible to the client and transmitting it in SDP. For example:

  1. Set stream_caster.sip.candidate in the SRS configuration; SRS will read this configuration upon startup, for example, 192.168.1.100.
  2. The GB device registers with SRS via SIP, and SRS sends an INVITE message. The body of the message is SDP, specifying this IP address, e.g., IN IP4 192.168.1.100.
  3. The GB device connects to this IP address at tcp://192.168.1.100:9000 and initiates a media request.

Note: The media port is configured in stream_caster.listen. Currently, only TCP ports are supported.

This CANDIDATE is the IP of the media server, and it can differ from the SIP server address configured on the device as described in Usage[9].

Note: Due to how the SIP protocol in GB works, the To field during REGISTER does not contain the server’s address, making it impossible for the server to discover its address through SIP; server configuration is the only way.

Of course, if a network card is configured with an address accessible by the client, CANDIDATE can be set as * to allow SRS to discover it automatically.
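
To make the candidate flow concrete, here is a minimal sketch of how an INVITE SDP body could carry the candidate address and media port. It is an illustration only, not the SRS implementation: the function name build_invite_sdp, the server ID 34020000002000000001, the payload type 96 and the exact set of attribute lines are assumptions chosen for the example.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not an SRS API): builds an SDP body that asks the
// device to push PS-over-RTP to candidate:media_port over TCP.
// The attribute lines and payload type 96 are illustrative assumptions.
std::string build_invite_sdp(const std::string& server_id,
                             const std::string& candidate,
                             int media_port,
                             const std::string& ssrc) {
    std::ostringstream ss;
    ss << "v=0\r\n"
       << "o=" << server_id << " 0 0 IN IP4 " << candidate << "\r\n"
       << "s=Play\r\n"
       // The candidate is the media server IP that the device connects to.
       << "c=IN IP4 " << candidate << "\r\n"
       << "t=0 0\r\n"
       // The media port corresponds to stream_caster.listen (TCP only).
       << "m=video " << media_port << " TCP/RTP/AVP 96\r\n"
       << "a=recvonly\r\n"
       << "a=rtpmap:96 PS/90000\r\n"
       // y= is the GB extension that carries the SSRC.
       << "y=" << ssrc << "\r\n";
    return ss.str();
}
```

Calling build_invite_sdp("34020000002000000001", "192.168.1.100", 9000, "0200000001") yields a body whose c= line advertises 192.168.1.100 and whose m= line advertises TCP port 9000, which is where the device would then connect.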

Protocols

The GB-associated protocols are as follows:

  • RFC3261: SIP: Session Initiation Protocol[10]
  • RFC4566: SDP: Session Description Protocol[11]
  • RFC4571: RTP & RTCP over Connection-Oriented Transport[12]
  • GB28181-2016: Public Security Video Surveillance Networking System Information Transmission, Exchange, Control Technology Requirements[13]
  • ISO13818-1-2000[14]: MPEGPS (Program Stream), PS media stream specifications.

SIP Parser

There isn’t a particularly good SIP library for C++, which is one reason SIP processing is unstable.

However, considering the structural similarity between the SIP and HTTP protocols, SRS uses http-parser[15] to parse SIP. This library is maintained by Node.js and was seemingly extracted from NGINX before, so its stability is quite high.

Parsing SIP with an HTTP parser requires some modifications. The primary changes are as follows:

  • Method: New methods like REGISTER, INVITE, ACK, MESSAGE, and BYE need to be added; these are common messages in GB.
  • RequestLine: Path parsing needs adjustments, as SIP uses a format like sip:xxx, which can be mistaken for a complete HTTP URL, causing parsing failures.
  • ResponseLine: The generation of the Response needs modification, mainly in the protocol header, changing from HTTP/1.1 to SIP/2.0.

The changes are minimal, assuring protocol stability. This can be considered a solution to a challenging problem.
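
The differences in the start lines can be illustrated with a small standalone sketch. This is not the patched http-parser used by SRS; the names SipRequestLine, parse_sip_request_line and format_sip_status_line are hypothetical and exist only to show what a SIP request line and status line look like compared with HTTP.

```cpp
#include <sstream>
#include <string>

// Hypothetical type for illustration: the three parts of a SIP request line.
struct SipRequestLine {
    std::string method;   // e.g. REGISTER, INVITE, ACK, MESSAGE, BYE
    std::string uri;      // e.g. sip:user@domain, not an HTTP path
    std::string version;  // always SIP/2.0
};

// Splits "METHOD sip:... SIP/2.0" on whitespace.
// Returns false if the line is not a well-formed SIP request line.
bool parse_sip_request_line(const std::string& line, SipRequestLine& out) {
    std::istringstream is(line);
    std::string extra;
    if (!(is >> out.method >> out.uri >> out.version)) return false;
    if (is >> extra) return false;                          // trailing tokens
    if (out.uri.compare(0, 4, "sip:") != 0) return false;   // SIP URI scheme
    return out.version == "SIP/2.0";
}

// Builds a SIP status line; the only difference from HTTP is the protocol tag.
std::string format_sip_status_line(int code, const std::string& reason) {
    return "SIP/2.0 " + std::to_string(code) + " " + reason + "\r\n";
}
```

For example, a REGISTER request line is accepted, an HTTP request line such as "GET /index.html HTTP/1.1" is rejected, and format_sip_status_line(200, "OK") produces "SIP/2.0 200 OK" followed by CRLF.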

Unlike HTTP, in SIP, a single TCP channel does not necessarily correspond to one Request and one Response. For instance, after an INVITE, there might be both 100 and 200 responses. SRS can also be a Client, not just a Server. Under these conditions, http-parser can be set to BOTH mode, enabling it to parse Requests and Responses:

    SrsHttpParser* parser = new SrsHttpParser();
    SrsAutoFree(SrsHttpParser, parser);

    // We might get SIP request or response message.
    if ((err = parser->initialize(HTTP_BOTH)) != srs_success) {
        return srs_error_wrap(err, "init parser");
    }

Note: HTTP messages don’t require strictly one Request per one Response, so this won’t introduce additional issues.

During actual parsing, sometimes headers with spaces are encountered, like:

Content-Length:         142\r\n

This is compliant with the standard but can cause problems for hand-written parsers. http-parser handles such cases correctly.
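
As an illustration of why padded values need care, the sketch below splits a header line at the first colon and trims surrounding whitespace from both parts. It is a simplified stand-in with hypothetical names (trim_ws, parse_header_line), not the logic inside http-parser, and it ignores details such as folded header lines.

```cpp
#include <string>

// Removes spaces, tabs, CR and LF from both ends of s.
std::string trim_ws(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Splits "Name:   value\r\n" into a trimmed name and a trimmed value.
// Returns false when there is no colon or the name is empty.
bool parse_header_line(const std::string& line,
                       std::string& name, std::string& value) {
    std::string::size_type colon = line.find(':');
    if (colon == std::string::npos || colon == 0) return false;
    name = trim_ws(line.substr(0, colon));
    value = trim_ws(line.substr(colon + 1));
    return !name.empty();
}
```

With the line shown above, the name is Content-Length and the value is 142, regardless of how many spaces follow the colon.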

REGISTER

GB Registration Process:

  1. Set the SIP server to SRS on the device.
  2. The device sends a SIP-format REGISTER message.
  3. SRS responds with 200/OK, completing the registration.

GB Heartbeat:

  1. After registration, the device periodically sends a MESSAGE as a heartbeat.
  2. SRS responds with 200/OK, confirming the heartbeat.

If SRS restarts, no state is saved, so the first message it receives from a device may be a MESSAGE without a preceding REGISTER. In that case the device should be prompted to re-register. After consulting peers and SIP/GB experts, possible re-registration methods include:

  • Not responding to the MESSAGE. The device typically times out after three failed heartbeats (there is a setting for this on the device). Verification shows that Hikvision devices default to a heartbeat cycle of 60 seconds, so re-registration may occur after around 3 minutes.
  • Responding to the MESSAGE with 403 or another error. Verification shows the same result as not responding; the device does not treat this specially.
  • Sending a reboot command to the device, as specified in A.2.3 Control Command, with remote start being Boot. Rebooting the device should lead to re-registration; this has not yet been verified.
  • Shortening Expires in the response to REGISTER to reduce the registration interval, for instance to 30 seconds. Verification shows that with 30 seconds the device still re-registers at around one heartbeat interval, which is 60 seconds.
  • Shortening the heartbeat interval in the device configuration, so that timeouts happen sooner. The default is 60 seconds and the minimum is 5 seconds. Verification shows that setting it to 5 seconds results in re-registration after approximately 26 seconds.
  • Adding a SIP Proxy layer to hold the related state, moving the state out of SRS and into the Proxy. This is viable, but it does not suit SRS, because introducing additional components complicates the open-source model; it can be tried in individual deployments.
  • Sending messages before a restart. This works with Gracefully Quit in SRS, but not with kill -9 or a system OOM, where there is no chance to clean up, so it does not cover all scenarios. However, proactive upgrades usually use Gracefully Quit, which provides an opportunity to handle this; feel free to try it.

In summary, there is no particularly reliable method to prompt the camera for immediate re-registration. SRS must logically address this issue: After restarting, if the camera is already registered or streaming, it should be encouraged to re-register and re-stream.

Note: Many issues arise from long-term operations where a part of the system restarts, leading to inconsistent states, triggering various problems. Therefore, if a camera is in a registered or streaming state on SRS restart or startup, it should attempt to have the camera re-run its process, such as re-registering and re-streaming, aligning both parties’ states for higher reliability.

Note: Verification shows re-registration does not affect ongoing media stream transmission. The device probes port accessibility, and if TCP disconnects or UDP port becomes unreachable, it halts the stream transmission.

TCP or UDP

We opted for TCP support first, both for signaling and PS media in GB.

Per the SIP specification, TCP support is mandatory; this is one of the major changes in RFC3261 relative to RFC2543, see RFC3261: Transport[16].

Regarding the media protocol, GB uses the PS format. PS is typically used for storage, while TS is designed for network transmission: TS accounts for more network transmission concerns, whereas PS assumes a reliable medium, like disk reads and writes. Therefore, PS is simpler to carry over TCP.

GB 2016’s description of TCP can be found in Appendix L, marked as Audio and Video Media Transmission Based on TCP Protocol:

Real-time video on-demand, historical video playback and download via TCP media transmission should support audio and video PS stream encapsulated by RTP, with the encapsulation format referencing IETF RFC 4571.
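
RFC 4571 framing prefixes each RTP packet on the TCP stream with a 16-bit big-endian length. The following is a minimal sketch of extracting complete frames from a receive buffer; the function name extract_rfc4571_frames is hypothetical and this is not the SRS code. Frames with a length of zero are returned as empty payloads here, and what to do with them is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: pulls every complete RFC 4571 frame out of buf.
// Each frame is a 2-byte big-endian length followed by that many bytes
// of RTP (or RTCP) data. Returns the number of bytes consumed; the
// caller keeps the remaining bytes until more data arrives on the socket.
std::size_t extract_rfc4571_frames(
        const std::vector<std::uint8_t>& buf,
        std::vector<std::vector<std::uint8_t> >& frames) {
    std::size_t pos = 0;
    while (buf.size() - pos >= 2) {
        std::size_t len = (static_cast<std::size_t>(buf[pos]) << 8) | buf[pos + 1];
        if (buf.size() - pos - 2 < len) {
            break;  // incomplete frame, wait for more data
        }
        std::vector<std::uint8_t>::const_iterator first =
            buf.begin() + static_cast<std::ptrdiff_t>(pos + 2);
        frames.push_back(std::vector<std::uint8_t>(
            first, first + static_cast<std::ptrdiff_t>(len)));
        pos += 2 + len;
    }
    return pos;
}
```

Because TCP delivers a byte stream, one read may end in the middle of a frame; returning the consumed byte count lets the caller retain the tail and retry after the next read.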

In practice, TCP is used far more than UDP, particularly over public networks, where UDP may lose packets and GB has no retransmission or FEC design to compensate. The advantages of using TCP include:

  • UDP is stateless, so when the server restarts the device is unaware of it and may keep sending data, leaving the two sides in mismatched states. Prolonged mismatched states may trigger problems such as the device reporting that the request limit has been exceeded.
  • Signaling and media are separate in GB, and TCP makes their states easier to synchronize, because connectivity itself is visible: for example, signaling is available while media is not, or vice versa. For details, refer to Protocol Notes[17].
  • After a restart, the server can shorten the REGISTER Expires value and the heartbeat interval to prompt the device to re-register and resume streaming. With TCP, the device also quickly detects that the media connection broke when the server restarted.
  • During transmission, if network jitter drops the connection, both the server and the device quickly enter their fault-handling process.

Therefore, SRS initially supports TCP, not UDP. In effect, this means supporting GB28181 2016 rather than GB28181 2011.

Note: GB28181-2016 and TCP protocol must be explicitly enabled.

Protocol Notes

Key considerations for the SIP protocol:

  • The Via branch must start with z9hG4bK, as specified in Via[18].
  • The ACK message following the 200(OK) to an INVITE requires a new Via branch, since that ACK is not part of the INVITE transaction; refer to Via[19] and Example[20].
  • The Contact in an INVITE is the sender's own address, not the GB device’s. In other words, Contact should be generated from From, not To, according to Contact[21] and Example[22].
  • The Subject of an INVITE has the form media stream sender ID:sender media stream sequence number, media stream receiver ID:receiver media stream sequence number, as per Appendix K[23]. In s=Play real-time scenarios, the receiver media stream sequence number (SSRC) isn’t specified; based on feedback, this field is generally zero-filled.
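
The Via branch and Subject rules can be sketched with two small helpers. The names make_via_branch and make_subject are hypothetical, the caller is assumed to supply a unique token per transaction, and writing the zero-filled receiver sequence as a single 0 is an assumption for the example.

```cpp
#include <string>

// Hypothetical helper: every Via branch starts with the RFC 3261 magic
// cookie z9hG4bK, followed by a token that is unique per transaction.
std::string make_via_branch(const std::string& unique_token) {
    return "z9hG4bK" + unique_token;
}

// Hypothetical helper: Subject is
//   senderID:senderSeq,receiverID:receiverSeq
std::string make_subject(const std::string& sender_id,
                         const std::string& sender_seq,
                         const std::string& receiver_id,
                         const std::string& receiver_seq) {
    return sender_id + ":" + sender_seq + "," + receiver_id + ":" + receiver_seq;
}
```

Since the ACK for a 200(OK) is its own transaction, a caller would pass a different token for the ACK than for the INVITE, for example make_via_branch("invite01") and then make_via_branch("ack01"), so that the two branches differ.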

Points of concern for the SDP protocol:

  • The y field is a decimal string denoting the SSRC value, with the format dddddddddd. The first digit indicates whether the media is real-time or historical: 0 is real-time and 1 is historical. Digits 2-6 are taken from digits 4-8 of the 20-digit SIP monitoring domain ID; for example, 13010000002000000001 gives 10000. Digits 7-10 are the media stream ID within the domain, typically a four-digit decimal integer, and must differ from the SSRC values of other media streams currently active in the same domain.

Note: The y= field in SDP is a GB extension, similar to a=ssrc:xxxx in WebRTC defining SSRC.
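
The composition of the y field can be written out as a short sketch. The function name build_gb_ssrc is hypothetical, and the choice to return an empty string on invalid input is an assumption made for the example.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: composes the 10-digit y= value.
//   digit 1     : 0 for real-time, 1 for historical media
//   digits 2-6  : digits 4-8 of the 20-digit SIP domain ID
//   digits 7-10 : stream ID within the domain, zero-padded to 4 digits
// Returns an empty string when the inputs are out of range.
std::string build_gb_ssrc(bool historical,
                          const std::string& domain_id,
                          unsigned stream_seq) {
    if (domain_id.size() != 20 || stream_seq > 9999) return "";
    char seq[5];
    std::snprintf(seq, sizeof(seq), "%04u", stream_seq);
    std::string ssrc;
    ssrc += historical ? '1' : '0';
    ssrc += domain_id.substr(3, 5);  // 1-based digits 4-8
    ssrc += seq;
    return ssrc;
}
```

For the domain ID 13010000002000000001 and stream ID 1, a real-time stream gets 0100000001: the leading 0, then 10000 from the domain ID, then 0001.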

Signaling and media integration:

  • Media transmission begins after signaling completes registration, INVITE, TRYING, 200, and ACK. See gb-media-ps-normal.pcapng.zip[24].
  • While media is being transmitted, re-registration on the signaling side does not affect the media, which continues normally. Refer to gb-media-ps-sip-register-loop.pcapng.zip[25].
  • If signaling completes the INVITE successfully but the media TCP port is not listening, the device attempts a single connection and then gives up. See gb-media-disabled-sip-ok.pcapng.zip[26].
  • During media transmission, if signaling disconnects, the media connection also disconnects after a period. See gb-media-ps-sip-disconnect.pcapng.zip[27].
  • If the media TCP connection is interrupted, the client doesn’t retry. See gb-media-disconnect-sip-ok.pcapng.zip[28].

Media protocol:

  • Various errors may occur while parsing the media stream, in which case the whole pack is discarded until the next pack start code (00 00 01 ba) appears. These errors include RTP parsing failures, invalid PS headers (not starting with 00 00 01), partial PES headers (for example, one cut off at the end of the previous TCP packet), and RFC4571 parsing errors (the first two bytes indicate a length of zero).
  • SRS offers a recovery mode that tries to recover when a packet from the camera fails to parse. Recovery is not always possible, so the number of recovery attempts is capped. If consecutive packets cannot be recovered, SRS disconnects the media connection, which triggers a new INVITE. When the packet length itself is abnormal, recovery is unlikely to succeed, so SRS skips recovery mode and re-INVITEs directly.
  • Media employs MPEGPS streams, where length is 16-bit, capping P…