[DISCUSSION NEEDED] AV-sync fundamental issues #1248

bjuncek · 2019-08-19T12:41:08Z

As pointed out in #1221, the _read_from_stream function assumes that all offsets are global, which is fundamentally wrong because pts is a stream-specific representation.

Here, I propose to make the precomputed pts a global offset of the form k/fps, of type fractions.Fraction, where k is the index of the k-th frame. Then, in every function that operates at the stream level, the global pts would be converted to a stream-specific measure by dividing it by the stream-specific time base, i.e. int(round(float(global_offset / stream_time_base))).
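
A minimal sketch of the conversion proposed above, for illustration only. The helper names global_pts and to_stream_pts and the two time bases are assumptions made up for this example; they are not taken from the PR. Only the rounding expression comes from the description.

```python
from fractions import Fraction


def global_pts(k, fps):
    """Global offset of the k-th frame, in seconds, as an exact fraction k/fps."""
    return Fraction(k) / Fraction(fps)


def to_stream_pts(global_offset, stream_time_base):
    """Convert a global offset (seconds) into integer ticks of a single stream."""
    return int(round(float(global_offset / stream_time_base)))


# Hypothetical time bases; real values come from the container's streams.
video_time_base = Fraction(1, 15360)
audio_time_base = Fraction(1, 44100)

# Frame 30 of a 30 fps video sits exactly one second into the file.
offset = global_pts(30, 30)
video_pts = to_stream_pts(offset, video_time_base)
audio_pts = to_stream_pts(offset, audio_time_base)
```

The same global offset maps to different integer pts values in each stream, which is why a single pts value cannot be reused across the video and audio streams.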

Obviously, tests are modified to reflect this change.

This is an early version and a suggestion on how to tackle the issue mentioned above. If anyone has a better idea, please let me know.

cc @fmassa @iyah4888

fmassa

Thanks a lot for the PR Bruno!

This is indeed a problem with the current audio synchronization.

I've made a few comments, let me know what you think.
My main concern is that we introduce a slight backwards-incompatibility for the timestamps, so we should think about proper deprecation strategies for it.

torchvision/io/video.py

fmassa · 2019-08-28T11:48:58Z

torchvision/io/video.py

@@ -250,4 +268,4 @@ def read_video_timestamps(filename):
                                             container.streams.video[0], {'video': 0})
        video_fps = float(container.streams.video[0].average_rate)
    container.close()
-    return [x.pts for x in video_frames], video_fps
+    return [x.pts * x.time_base for x in video_frames], video_fps


This is a BC-breaking change, and I wonder if there would be a way of keeping backwards-compatibility for one version before removing the old behavior, with a loud warning.

Maybe we should add an extra option to read_video_timestamps, something like output_format=None, raise a warning if it is None, and keep the current behavior in this case. This is similar to what has been done in interpolate with the align_corners flag.
Thoughts?

Yeah, that makes a lot of sense.
Is there a preferred warning system in torchvision?

let's just use warnings.warn for now

Sounds good.
What do we want to keep as the default behaviour?
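
The deprecation path discussed in this thread could look roughly like the sketch below. It is an assumption-laden illustration, not the torchvision implementation: format_timestamps is a hypothetical helper, and the option values "pts" and "sec" are invented here, since the thread only proposes output_format=None with a warning and the current behavior as fallback.

```python
import warnings
from fractions import Fraction


def format_timestamps(pts_list, time_base, output_format=None):
    """Hypothetical helper: return timestamps as raw pts or as seconds."""
    if output_format is None:
        # Loud warning for one release, then fall back to the old behavior.
        warnings.warn(
            "output_format was not specified; returning raw stream pts. "
            "Pass output_format='pts' or 'sec' explicitly.",
            DeprecationWarning,
            stacklevel=2,
        )
        output_format = "pts"
    if output_format == "pts":
        return list(pts_list)
    if output_format == "sec":
        # Same conversion as x.pts * x.time_base in the diff above.
        return [pts * time_base for pts in pts_list]
    raise ValueError("unknown output_format: {!r}".format(output_format))


# Explicit request for seconds, using a made-up time base of 1/15360.
seconds = format_timestamps([0, 512, 1024], Fraction(1, 15360), output_format="sec")
```

Keeping None as the default means existing callers see unchanged return values plus a warning, which matches the one-version grace period suggested above.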

fmassa · 2019-08-29T09:29:40Z

test/test_io.py

            for start in range(5):
                for l in range(1, 4):
                    lv, _, _ = io.read_video(f_name, pts[start], pts[start + l - 1])
                    s_data = data[start:(start + l)]
                    self.assertEqual(len(lv), l)
                    self.assertTrue(s_data.equal(lv))

-            lv, _, _ = io.read_video(f_name, pts[4] + 1, pts[7])
+            lv, _, _ = io.read_video(f_name, pts[4] + 1 / (fps + 1), pts[7])


Can you explain the need of the +1 in (fps + 1)?

My understanding was that the original test makes sure that, if we try to decode from a pts that doesn't exist (pts[4] + 1), we return the closest possible frame to that pts. We then check that we get 4 frames (from pts[4] to pts[7]).

In the same way, pts[4] + 1 / (fps + 1) is a non-existing pts that falls after the existing pts[4] and before pts[5].
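
The arithmetic behind the 1 / (fps + 1) offset can be checked with a small sketch. It assumes timestamps in the k/fps form from the PR description and a made-up frame rate of 30; exact fractions are used here instead of the float expression in the test so that the comparisons are exact.

```python
from fractions import Fraction

fps = 30  # hypothetical frame rate, chosen only for illustration

# Timestamps in seconds for the first 10 frames, in the k/fps form.
pts = [Fraction(k, fps) for k in range(10)]

# Start offset used by the modified test.
probe = pts[4] + Fraction(1, fps + 1)

# 1/(fps + 1) is smaller than the frame spacing 1/fps, so the probe
# lies strictly between two consecutive frames and matches neither.
assert pts[4] < probe < pts[5]
assert probe not in pts

# Index of the last frame that starts at or before the probe.
containing = max(i for i, p in enumerate(pts) if p <= probe)
```

With integer pts, adding 1 tick produced such an in-between value. Once timestamps are expressed in seconds, adding 1 would jump a whole second, so an offset smaller than one frame interval is needed instead.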

fmassa · 2019-08-29T09:33:38Z

test/test_io.py

+                raise unittest.SkipTest(msg)
+
+            lv, la, info = io.read_video(f_name, pts[3], pts[7])
+            # FIXME: add Another video - this one doesn't have audio


Another option for getting the video would be to use av.datasets https://github.com/mikeboers/PyAV/blob/cd458ffe89988b0feca44da6c56ef29f17555962/tests/common.py#L11

Sounds reasonable - I'll take a look.

fmassa · 2019-09-30T13:37:23Z

Subsumed by #1331

[WIP] AV-sync commint

f4a2919

fmassa requested changes Aug 29, 2019

fmassa mentioned this pull request Sep 9, 2019

modified code of io.read_video to interpret start_pts and end_pts in seconds #1313

Closed

Addressing comments

2419469

fmassa closed this Sep 30, 2019
