Remote Video Monitoring

Automated Post-STB QoS monitoring with a RaspberryPi

$BUSINESS is an MVPD that broadcasts several dozen channels from its NOC, located on the hill at the top of $TOWN. Some channels are affected by weather, and every channel and every piece of hardware in the processing chain needs periodic maintenance or recovery from a bad state. The existing strategy has been periodic manual monitoring, with increased attention during inclement weather. Even so, some channel issues only come to $BUSINESS's attention through customer complaints.

Over a short time period, the probability that any one component fails is low, but the probability that at least one component somewhere in the chain fails is quite high.
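A short calculation makes this concrete. The sketch below assumes components fail independently, each with the same probability in a given window, so the chance of at least one failure is 1 - (1 - p)^n. The figures used are illustrative assumptions, not measurements from $BUSINESS.

```python
def prob_any_failure(p_single: float, n_components: int) -> float:
    """Probability that at least one of n independent components fails."""
    return 1.0 - (1.0 - p_single) ** n_components


if __name__ == "__main__":
    # Hypothetical figures: a 1% chance per component, 100 components in the chain.
    print(f"{prob_any_failure(0.01, 100):.2f}")
```

Even with a small per-component probability, the combined figure grows quickly as components are added.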

Background Information

$BUSINESS receives dozens of satellite signals, which are handled by decoders that decrypt the streams. Some of the decrypted streams then pass through additional processing such as transcoders, and analog streams have the extra step of being encoded into digital streams. The now standardized digital streams are passed through a CAS, which encrypts them again, this time with a key held by $BUSINESS. Finally, the digital streams are muxed together and broadcast over the air to the surrounding area.

Diagram

At any point in the processing chain, hardware can malfunction, freeze or reset due to random chance, power issues or weather. It is also surprisingly common for an upstream provider to change their catalog unexpectedly, which requires an update to return to the original program.

With this many points of failure and this many components, there is always something to fix.

Remote Monitoring Solution

Some issues occur at the NOC itself, such as power loss or broadcast signal problems. A monitoring solution inside the NOC would be taken down by any failure that affects the entire broadcasting system, and it would miss broadcast signal issues occurring after the point in the processing chain where it hooks in. A remote monitoring solution was therefore preferred. Using the same kind of decoder or STB that customers use, we can monitor downstream of the NOC and in isolation from it.

Diagram

Technical Details

The solution was built with minimalism and simplicity in mind. To meet those goals, a single RaspberryPi hooks into the STB to control the selected channel, fetch video samples, review those samples and immediately report issues over the internet to those responsible for fixing them.

Software

There are four software services on the RPI, and each runs independently so that it can be swapped out easily in the future. If the method of fetching samples changes, or a new method of reporting issues is required, the code can be updated or replaced without interfering with the other services.

In keeping with the simplicity principle, the services share information where needed via files on the system. Had the project grown any further, it would have been worth upgrading to something like a shared SQLite database, but at the current level of complexity flat files are sufficient.
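When independent services share flat files, one thing to watch is a reader picking up a half-written file. A minimal sketch of one way to avoid that is below, assuming JSON state files; the function names and file layout are hypothetical, not the project's actual code.

```python
import json
import os
import tempfile
from pathlib import Path


def write_state(path: Path, state: dict) -> None:
    """Write JSON to a temp file, then rename it into place in one step."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as tmp:
        json.dump(state, tmp)
    # os.replace is atomic when source and target are on the same filesystem,
    # so a reader sees either the old file or the new one, never a partial write.
    os.replace(tmp_name, path)


def read_state(path: Path, default: dict) -> dict:
    """Return the stored state, or a default if nothing has been written yet."""
    try:
        return json.loads(path.read_text())
    except FileNotFoundError:
        return default
```

Each service can then own the files it writes and treat everyone else's files as read-only, which keeps the services easy to swap out.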

Collection Service

An async event loop written with Trio controls the STB via the GPIO pins on the RPI. A sample of the channel is then fetched using ffmpeg. The sample is associated with its channel number by reading the number directly from the sample with an OCR tool. Lastly, the sample is stored, together with the timestamp at which it was collected, alongside other samples from the same channel number.

The loop then repeats, adding another sample for the monitoring service to review.
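The fetch-and-store step can be sketched roughly as follows. The Trio loop, GPIO control and OCR are left out. The storage layout, the helper names and the stream URL are assumptions for illustration; only the ffmpeg flags (-y, -i, -frames:v) are standard ffmpeg options.

```python
from datetime import datetime, timezone
from pathlib import Path

SAMPLE_ROOT = Path("samples")  # hypothetical storage directory


def sample_path(channel: int, taken_at: datetime, root: Path = SAMPLE_ROOT) -> Path:
    """Group samples by channel number and name them by collection time.

    taken_at is assumed to be in UTC, hence the literal 'Z' suffix.
    """
    stamp = taken_at.strftime("%Y%m%dT%H%M%SZ")
    return root / f"{channel:03d}" / f"{stamp}.png"


def ffmpeg_grab_command(stream_url: str, out_file: Path) -> list[str]:
    """Build an ffmpeg call that saves a single video frame as an image."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", stream_url,   # input stream, e.g. from the HDMI-to-IP converter (assumed)
        "-frames:v", "1",   # stop after one video frame
        str(out_file),
    ]


if __name__ == "__main__":
    path = sample_path(7, datetime.now(timezone.utc))
    print(ffmpeg_grab_command("udp://192.0.2.10:5004", path))  # placeholder URL
```

The resulting list can be handed to subprocess.run with a timeout, so a stalled stream cannot block the loop indefinitely.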

Monitoring Service

The monitoring loop reviews the collected samples for each channel at a regular interval. For each channel, each sample is compared against the samples preceding it to check whether the channel appears to be frozen or stuck, which happens quite often during inclement weather. The comparison uses a similarity hash algorithm, so samples do not need to be perfectly identical, only close enough, to trigger an alert that the channel should be reviewed. There are also two caches of samples: one of known bad samples that trigger an alert immediately, and one of samples that should never trigger an alert, such as the children’s channel displaying a static “We’ll be back tomorrow” message during the night.
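To illustrate the idea, here is a minimal sketch using an average hash, which is one common similarity hash and not necessarily the algorithm used in this project. It assumes each sample has already been reduced to a small grayscale grid (8x8 is typical). The threshold and the order in which the caches are checked are also assumptions.

```python
from typing import Iterable, Optional


def average_hash(pixels: list[list[int]]) -> int:
    """Hash a small grayscale image: one bit per pixel, set if above the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


def classify(
    sample_hash: int,
    previous_hash: Optional[int],
    alert_hashes: Iterable[int],
    ignore_hashes: Iterable[int],
    threshold: int = 4,  # assumed tolerance, in differing bits
) -> str:
    """Return 'ignore', 'alert', 'frozen' or 'ok' for a new sample."""

    def near(candidates: Iterable[int]) -> bool:
        return any(hamming_distance(sample_hash, h) <= threshold for h in candidates)

    if near(ignore_hashes):   # known-harmless frames never alert
        return "ignore"
    if near(alert_hashes):    # known-bad frames alert immediately
        return "alert"
    if previous_hash is not None and hamming_distance(sample_hash, previous_hash) <= threshold:
        return "frozen"       # almost unchanged since the previous sample
    return "ok"
```

Comparing by Hamming distance rather than equality is what lets small differences, such as compression noise, still count as the same frame.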

This process can also identify when no new samples have been collected at all, or when just one particular channel has stopped gathering samples, which is a sign of an issue with the STB itself and calls for a reboot of the STB.
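A staleness check along these lines could look like the sketch below. It assumes the monitor keeps the timestamp of the newest sample per channel; the function names, return labels and age limit are hypothetical.

```python
from datetime import datetime, timedelta


def stale_channels(latest: dict[int, datetime], now: datetime, max_age: timedelta) -> list[int]:
    """Return the channels whose newest sample is older than max_age."""
    return sorted(ch for ch, seen in latest.items() if now - seen > max_age)


def diagnose(latest: dict[int, datetime], now: datetime, max_age: timedelta) -> str:
    """Tell a fully stalled collector apart from individual quiet channels."""
    if not latest:
        return "all_stale"  # nothing has ever been collected
    stale = stale_channels(latest, now, max_age)
    if not stale:
        return "ok"
    if len(stale) == len(latest):
        return "all_stale"
    return "some_stale"
```

The age limit should be comfortably longer than one full pass of the collection loop, otherwise every channel looks stale between passes.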

Web Service

A simple web interface shows the most recent collected samples side by side, along with current statistics. The recent history of samples is also available on selection, which helps diagnose when an issue began. Users can also add samples to the “ignore” or “alert” caches used by the monitoring service.

web interface

Telegram Bot Service

Finally, for remote monitoring and alerting, the Telegram bot reaches users directly on their phones. Its commands let users review samples, see current alerts, and instruct the collection service or monitoring service to perform a particular action, all from a remote location.

This service could be replaced by, or run alongside, a sibling service such as email or another messaging platform to reach those who need to know about issues. Telegram was just a preference.

Hardware

The hardware is simple and consists of a few components.

The hardware was scavenged, as it was impossible to import components due to geography and border lockdowns at the time. Luckily, a RaspberryPi was on hand, but it is a generation 1 model, so performance leaves a lot to be desired.

  • RaspberryPi single board computer
  • Standard customer STB
  • HDMI-to-IP converter
  • Relay Module

The RaspberryPi uses its GPIO pins both to control power to the STB via the relay module and to control channel selection on the STB. Channel selection works through a photodiode, taken from a broken security camera, that shorts the STB’s physical channel control buttons when light from an LED controlled by the RPI is applied to it.
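The two GPIO actions can be sketched as below. To keep the example runnable off the Pi, the pin write is passed in as a function rather than calling a GPIO library directly. The pin numbers, timings and relay polarity are all assumptions, as the real wiring is not described here.

```python
import time
from typing import Callable

# Hypothetical BCM pin numbers for illustration only.
RELAY_PIN = 17
CHANNEL_UP_LED_PIN = 27

PinWriter = Callable[[int, bool], None]


def press_channel_up(write_pin: PinWriter, hold_s: float = 0.2, sleep=time.sleep) -> None:
    """Light the LED briefly so the photodiode conducts and 'presses' the button."""
    write_pin(CHANNEL_UP_LED_PIN, True)
    sleep(hold_s)
    write_pin(CHANNEL_UP_LED_PIN, False)


def power_cycle_stb(write_pin: PinWriter, off_s: float = 5.0, sleep=time.sleep) -> None:
    """Cut power to the STB through the relay, wait, then restore it.

    Assumes True means 'powered'; many relay modules are active-low,
    in which case the two values would be swapped.
    """
    write_pin(RELAY_PIN, False)
    sleep(off_s)
    write_pin(RELAY_PIN, True)
```

On the Pi itself, write_pin could be a thin wrapper around a GPIO library call such as RPi.GPIO's output(pin, value), after the pins have been set up as outputs.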

Final Thoughts

As a result, monitoring QoS is now much more of a background process than something that requires frequent mental interruptions to remember to check the channels manually. The Telegram bot also allows users to monitor remotely when no physical TV is available.

Adding another STB or upgrading the RPI would greatly reduce the time between sample collections for each channel and help catch issues more quickly. At the moment the collection loop takes 15 to 20 minutes, which isn’t bad compared to the previous manual process, where an issue could go unnoticed for longer.

Personal Note

The hardware was the most difficult part of the project for me, as I had not automated a hardware system before. I did enjoy the experience and see myself using these skills more in the future.
