In short - I've found that two methods of decoding H264 give quite different results. Both ends of the pipeline are Raspberry Pi 4.
V4L method (latency builds up to 10 seconds):
libcamera@30 fps > TX program (uses V4L) > UDP over lossy network > RX program (uses V4L) > framebuffer

LAVC method (remains responsive):
libcamera@30 fps > TX program (uses V4L) > UDP over lossy network > RX program (uses LAVC) > framebuffer
I would happily forget pipeline 1 and use pipeline 2, but as I understand it, V4L is the "official" way one is supposed to decode H264 (other people might use it), and of course software decoding eats 100% of one CPU core on the receiver, even at a modest resolution of 640 x 480.
I have firmly excluded latency on the transmitter: when I restart the receiving side, latency reverts to low and then builds up again. I have also excluded bugs in the UDP receiver code, since the same receiver with an LAVC decoder works fine. Can someone guess why V4L would develop multi-second latency during prolonged operation?
Some definitions of V4L variables:
Code:
// Video4Linux file descriptor.
int v4l_fd;

// Video4Linux capabilities structure.
struct v4l2_capability v4l_capa;

// Video4Linux format.
struct v4l2_format v4l_format;

// A buffer for interacting with Video4Linux, containing:
// start pointer, length field, inner buffer, data plane.
struct v4l_buffer {
    void* start;
    int length;
    struct v4l2_buffer inner; // index, type, bytesused, flags, field,
                              // timestamp, timecode, sequence, memory,
                              // [union] m (offset, userptr, planes, fd),
                              // length, reserved, reserved2
    struct v4l2_plane plane;  // bytesused, length,
                              // [union] m (mem_offset, userptr, fd),
                              // data_offset, reserved
};

// Video4Linux request for buffers.
struct v4l2_requestbuffers v4l_reqbuf;

// Two instances of a V4L buffer: one for output to the decoder,
// the other for capturing decoder output.
struct v4l_buffer output_to_v4l;
struct v4l_buffer capture_from_v4l;

Next, the function that I can switch over from one pipeline to the other, which makes all the difference. It is fed with reassembled encoded frames. "Reassembled" means that while some encoded frames might never arrive at the decoder, those that do arrive in the correct order and have been checksummed against corruption. Visible deterioration of the video occurs when I-frames are lost, but the loss of P-frames passes almost undetectably.

Code:
void decode_rx_data(uint8_t* p_data, size_t len)
{
    // Works, no perceivable latency.
    if (decode_method == DECODE_METHOD_LAVC) {
        // Prepare packet for decoding.
        avpkt.size = len;
        avpkt.data = p_data;

        // Send packet to decoder.
        avcodec_send_packet(avcontext, &avpkt);

        // Get decoded frame.
        int res = avcodec_receive_frame(avcontext, avframe);
        if (res < 0) {
            printf("Cannot decode frame.\n");
        } else {
            // Decoded; convert from YUV420 to BGRA straight into the framebuffer.
            sws_scale(sws_context, avframe->data, avframe->linesize, 0,
                      STREAM_FRAME_H, sws_out_planes, sws_out_linesize);
            rx_frameno++;
        }

        // Release.
        av_packet_unref(&avpkt);
    }

    // Starts responsive, but slows down to unbearable.
    if (decode_method == DECODE_METHOD_V4L) {
        // Query the output buffer, check if we can give input.
        ioctl(v4l_fd, VIDIOC_QUERYBUF, &output_to_v4l.inner);
        if (output_to_v4l.inner.flags & V4L2_BUF_FLAG_DONE) {
            // Dequeue the V4L output buffer.
            ioctl(v4l_fd, VIDIOC_DQBUF, &output_to_v4l.inner);

            // Copy received data to the buffer start pointer,
            // stating how much data is provided.
            memcpy(output_to_v4l.start, p_data, len);
            output_to_v4l.plane.bytesused = len;

            // Queue the buffer for decoding.
            ioctl(v4l_fd, VIDIOC_QBUF, &output_to_v4l.inner);
        } else {
            printf("Cannot send encoded data to V4L.\n");
        }

        // Query the capture buffer, check if we can get output.
        ioctl(v4l_fd, VIDIOC_QUERYBUF, &capture_from_v4l.inner);
        if (capture_from_v4l.inner.flags & V4L2_BUF_FLAG_DONE) {
            // Dequeue the V4L capture buffer.
            ioctl(v4l_fd, VIDIOC_DQBUF, &capture_from_v4l.inner);

            // Get decoded length.
            size_t decoded_len = capture_from_v4l.inner.m.planes[0].bytesused;
            uint8_t* decode_p = (uint8_t*) capture_from_v4l.start;

            // Copy the image to the framebuffer (don't overwrite the metadata box).
            memcpy(framebuf_p + (DATABOX_W * DATABOX_H * FB_BPP), decode_p, decoded_len);

            // Queue the capture buffer again.
            ioctl(v4l_fd, VIDIOC_QBUF, &capture_from_v4l.inner);
            rx_frameno++;
        } else {
            printf("Cannot get decoded data.\n");
        }
    }
}

Guess: maybe I'm doing something wrong with the VIDIOC_QUERYBUF, VIDIOC_QBUF and VIDIOC_DQBUF calls?

More info: a fragment showing how I initialize V4L on the receiver:
Code:
// Open a Video4Linux file descriptor.
if ((v4l_fd = open("/dev/video10", O_RDWR)) < 0) {
    elog("Failed opening V4L file descriptor.\n");
    exit(EXIT_FAILURE);
}

// Clear the format structure.
memset(&v4l_format, 0, sizeof(v4l_format));

// DISABLED, made no difference.
// Demand the H264 decoder to return a frame EVERY time.
// ioctl(v4l_fd, V4L2_CID_MPEG_MFC51_VIDEO_DECODER_H264_DISPLAY_DELAY_ENABLE, true);
// ioctl(v4l_fd, V4L2_CID_MPEG_MFC51_VIDEO_DECODER_H264_DISPLAY_DELAY, 1);

// Output stream: get the default format, then set the needed format.
v4l_format.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
ioctl(v4l_fd, VIDIOC_G_FMT, &v4l_format);
v4l_format.fmt.pix_mp.width = FRAME_W;
v4l_format.fmt.pix_mp.height = STREAM_FRAME_H;
v4l_format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_H264;
v4l_format.fmt.pix_mp.field = V4L2_FIELD_NONE;
ioctl(v4l_fd, VIDIOC_S_FMT, &v4l_format);

// Capture stream: get the default format, then set the needed format.
v4l_format.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
ioctl(v4l_fd, VIDIOC_G_FMT, &v4l_format);
v4l_format.fmt.pix_mp.width = FRAME_W;
v4l_format.fmt.pix_mp.height = STREAM_FRAME_H;
// Caution: if you request something other than BGR32, you get YUV420.
v4l_format.fmt.pix_mp.pixelformat = V4L2_PIX_FMT_BGR32;
v4l_format.fmt.pix_mp.field = V4L2_FIELD_NONE;
ioctl(v4l_fd, VIDIOC_S_FMT, &v4l_format);

// Get the values again to examine them.
ioctl(v4l_fd, VIDIOC_G_FMT, &v4l_format);
// Check if you got the requested pixel format.
// if (v4l_format.fmt.pix_mp.pixelformat == V4L2_PIX_FMT_XBGR32)

// Memory map V4L buffers.
v4l_reqbuf.memory = V4L2_MEMORY_MMAP;
v4l_reqbuf.count = 1;
v4l_reqbuf.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
ioctl(v4l_fd, VIDIOC_REQBUFS, &v4l_reqbuf);
map(v4l_fd, V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE, &output_to_v4l);     // a wrapper around mmap()

v4l_reqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
ioctl(v4l_fd, VIDIOC_REQBUFS, &v4l_reqbuf);
map(v4l_fd, V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE, &capture_from_v4l); // a wrapper around mmap()

// Queue the mapped buffers, granting the decoder exclusive access.
ioctl(v4l_fd, VIDIOC_QBUF, &capture_from_v4l.inner);
ioctl(v4l_fd, VIDIOC_QBUF, &output_to_v4l.inner);

// Start the V4L conveyor with the VIDIOC_STREAMON call.
// The decoder will now read from the "output to V4L" buffer
// and put decoded frames into the "capture from V4L" buffer.
int type_output = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE;
ioctl(v4l_fd, VIDIOC_STREAMON, &type_output);
int type_capture = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
ioctl(v4l_fd, VIDIOC_STREAMON, &type_capture);

I have also looked for an elegant way of "resetting" my decoder pipeline regularly, or whenever latency is detected (I also send timestamps as metadata), but short of releasing all the V4L components, re-allocating and re-starting them, I have not found a remedy.

It's not a pressing issue for me, since LAVC works, but it's one of those puzzles that keeps haunting you after you've chased bugs for a month and didn't catch them. :)
Statistics: Posted by diastrikos — Sun Sep 01, 2024 10:57 am — Replies 1 — Views 47