
Sign language video recordings appear as audio on Commons
Closed, Resolved | Public | BUG REPORT

Description

TLDR: when we record a video on Lingualibre.org and send it to commons.wikimedia.org, we likely lose some metadata, because the video appears on Wikimedia Commons as an audio file. We need to inspect this further. Videos were OK prior to 2020; see here for broken vs. working Lingualibre videos.

Critical: this bug is the only blocker preventing sign language contributors from recording sign language videos, and preventing the elegant Lingua Libre SignIt add-on for Firefox from doing outreach and expanding internationally.

"T316113 : Vue dev tools must be allowed" could help.

List of steps to reproduce

  • Go to https://lingualibre.org/wiki/Special:RecordWizard > (Log in with your Wikimedia account)
  • Step 2: add French Sign Language as one of your languages.
  • Step 3: Select Sign Language, add a few words: "crabe" (crab), "poisson" (fish), "lapin" (rabbit).
  • Step 4: Record 3 videos and stash them on Lingualibre (some console.log output appears here).
  • Step 5: Review the videos (you may download them to inspect them), then upload the files to Commons (some console.log output appears here).

What happens?:

Screenshot_20220824_185848.png (667×1 px, 188 KB)

  • On a wiki page, when you add the media via [[file:my_video.webm]], the thumbnail is an audio player.
    • Inspecting the HTML shows that MediaWiki indeed embeds the file in an audio element with an audio class.
    • Changing that code into a video HTML tag makes the file appear as a video.

Screenshot_20220728_video_not_recognized.png (588×1 px, 259 KB)

  • That same file from Commons, when opened directly in the browser, appears as a video. Use [[:media:my_video.webm]] or https://commons.wikimedia.org/wiki/Special:FilePath/my_video.webm .

Screenshot_20220728_video_recognized.png (1×1 px, 506 KB)

Downloading the file as recorded and stashed on Lingualibre (Step 4) and as uploaded and published to Commons (Step 5) shows that the files are identical and that some metadata is missing at both steps.
Lingua Libre's Record Wizard video recording system may be faulty. Where? How?

What should have happened instead?:

  • All relevant video metadata should be present when stashed.
  • Media should be recognized as video on Commons.

Software version: n.a.

See also: Discussion and example -> might be caused by an issue when uploading to Commons, where bytes (which ones?) are missing.

Other considerations:

Event Timeline

Yug triaged this task as High priority. Jul 7 2022, 2:42 PM
Yug created this task.
Poslovitch renamed this task from "Recording: sign language video recording appears as audio" to "Sign language video recordings appear as audio on Commons". Jul 28 2022, 2:29 PM
Poslovitch updated the task description.
Yug updated the task description.
Yug updated the task description.

Historical comparison

Please note the difference in metadata displayed below the file.

2019 it works :

2019_video.png (708×910 px, 159 KB)

2022 it fails :

2022_video.png (237×910 px, 33 KB)

2022 file comparison: is there transfer corruption?

Here I aim to check whether the file is corrupted when it is sent to Commons.

  1. While recording on Lingualibre, at step 4/5, while the file was still a client-side file, I downloaded it to my computer.
  2. After finishing the process and uploading the files to Wikimedia Commons, I visited the file's page and downloaded Wikimedia Commons' version of the file to my computer.

Both files show an identical MD5 checksum:

Screenshot_20220824_141241.png (835×1 px, 204 KB)

There is no loss in transfer.
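
The checksum comparison above can be reproduced outside the browser. Here is a minimal Python sketch, assuming both downloads were saved locally; the file names in the usage comment are hypothetical placeholders, not the real paths.

```python
import hashlib
from pathlib import Path


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(path_a, path_b, algorithm="md5"):
    """True when both files hash to the same digest."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)


# Usage (hypothetical file names):
#   same_content("stash_download.webm", "commons_download.webm")
```

Calling file_digest(path, "sha1") gives a value that can be compared with the "sha1" field reported in the upload logs.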

File logs

I cannot dig further myself, but I see the following logs at step 4 (recording and stashing) and step 5 (review and upload to Commons):

Step 4 logs : stash on Lingualibre.org

Logs appearing at Step 4, at the end of recording one word, before Step 5 "publishing to commons" :

LL-step5-Chromium-upload_logs_before_publish_on_commons.png (1×1 px, 292 KB)

{
    "upload": {
        "result": "Success",
        "filekey": "19ftq4hdk6zk.ch4fow.8.webm",
        "sessionkey": "19ftq4hdk6zk.ch4fow.8.webm",
        "imageinfo": {
            "timestamp": "2022-08-24T12:24:45Z",
            "size": 257139,
            "width": 0,
            "height": 0,
            "canonicaltitle": "File:20220824122445!phpSFRBUZ.webm",
            "url": "https://lingualibre.org/wiki/Special:UploadStash/file/19ftq4hdk6zk.ch4fow.8.webm",
            "descriptionurl": "https://lingualibre.org/wiki/Special:UploadStash/file/19ftq4hdk6zk.ch4fow.8.webm",
            "sha1": "0dc33f65efaa4c1cbf267fef166c4742a738bf21",
            "metadata": [
                {
                    "name": "GETID3_VERSION",
                    "value": "1.9.20-202006061653"
                },
                {
                    "name": "filesize",
                    "value": 257139
                },
                {
                    "name": "avdataoffset",
                    "value": 0
                },
                {
                    "name": "avdataend",
                    "value": 257139
                },
                {
                    "name": "fileformat",
                    "value": "webm"
                },
                {
                    "name": "audio",
                    "value": [
                        {
                            "name": "dataformat",
                            "value": "A_OPUS"
                        },
                        {
                            "name": "sample_rate",
                            "value": 48000
                        },
                        {
                            "name": "channels",
                            "value": 1
                        },
                        {
                            "name": "language",
                            "value": "eng"
                        },
                        {
                            "name": "bits_per_sample",
                            "value": 32
                        },
                        {
                            "name": "streams",
                            "value": [
                                {
                                    "name": "fbdf6d9e2b41f8",
                                    "value": [
                                        {
                                            "name": "dataformat",
                                            "value": "A_OPUS"
                                        },
                                        {
                                            "name": "default",
                                            "value": true
                                        },
                                        {
                                            "name": "sample_rate",
                                            "value": 48000
                                        },
                                        {
                                            "name": "channels",
                                            "value": 1
                                        },
                                        {
                                            "name": "language",
                                            "value": "eng"
                                        },
                                        {
                                            "name": "bits_per_sample",
                                            "value": 32
                                        }
                                    ]
                                }
                            ]
                        },
                        {
                            "name": "channelmode",
                            "value": "mono"
                        }
                    ]
                },
                {
                    "name": "video",
                    "value": [
                        {
                            "name": "dataformat",
                            "value": "V_VP9"
                        },
                        {
                            "name": "resolution_x",
                            "value": 640
                        },
                        {
                            "name": "resolution_y",
                            "value": 480
                        },
                        {
                            "name": "display_unit",
                            "value": "pixels"
                        },
                        {
                            "name": "display_x",
                            "value": 640
                        },
                        {
                            "name": "display_y",
                            "value": 480
                        },
                        {
                            "name": "streams",
                            "value": [
                                {
                                    "name": "affc7a1157c058",
                                    "value": [
                                        {
                                            "name": "dataformat",
                                            "value": "V_VP9"
                                        },
                                        {
                                            "name": "default",
                                            "value": true
                                        },
                                        {
                                            "name": "resolution_x",
                                            "value": 640
                                        },
                                        {
                                            "name": "resolution_y",
                                            "value": 480
                                        },
                                        {
                                            "name": "display_unit",
                                            "value": "pixels"
                                        },
                                        {
                                            "name": "display_x",
                                            "value": 640
                                        },
                                        {
                                            "name": "display_y",
                                            "value": 480
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                },
                {
                    "name": "error",
                    "value": [
                        {
                            "name": 0,
                            "value": "EBML parser: ran out of file at offset 257139"
                        }
                    ]
                },
                {
                    "name": "warning",
                    "value": [
                        {
                            "name": 0,
                            "value": "Unhandled track.video element [module.audio-video.matroska.php:712] (5056::13c0 [1 bytes]) at 185"
                        },
                        {
                            "name": 1,
                            "value": "Unhandled audio type \"A_OPUS\""
                        }
                    ]
                },
                {
                    "name": "encoding",
                    "value": "UTF-8"
                },
                {
                    "name": "mime_type",
                    "value": "video/webm"
                },
                {
                    "name": "matroska",
                    "value": [
                        {
                            "name": "comments",
                            "value": [
                                {
                                    "name": "muxingapp",
                                    "value": [
                                        {
                                            "name": 0,
                                            "value": "Chrome"
                                        }
                                    ]
                                },
                                {
                                    "name": "writingapp",
                                    "value": [
                                        {
                                            "name": 0,
                                            "value": "Chrome"
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                },
                {
                    "name": "version",
                    "value": 2
                }
            ],
            "commonmetadata": [],
            "extmetadata": {
                "DateTime": {
                    "value": "2022-08-24T12:24:45Z",
                    "source": "mediawiki-metadata",
                    "hidden": ""
                },
                "ObjectName": {
                    "value": "20220824122445!phpSFRBUZ",
                    "source": "mediawiki-metadata",
                    "hidden": ""
                }
            },
            "mime": "video/webm",
            "bitdepth": 0
        }
    }
}
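
The nested name/value lists in this response are hard to scan by eye. The sketch below folds them into plain dictionaries and pulls out the fields that seem most relevant here. It is only a reading aid: the helper names flatten and summarize are made up for this example, and STASH_EXCERPT is a hand-trimmed excerpt of the log above, not the full response.

```python
def flatten(entries):
    """Fold [{"name": ..., "value": ...}] lists into nested dicts."""
    if not isinstance(entries, list):
        return entries
    if all(isinstance(e, dict) and set(e) == {"name", "value"} for e in entries):
        return {e["name"]: flatten(e["value"]) for e in entries}
    return [flatten(e) for e in entries]


def summarize(imageinfo):
    """Collect the top-level and stream-level fields side by side."""
    meta = flatten(imageinfo.get("metadata", []))
    video = meta.get("video", {})
    return {
        "mime": imageinfo.get("mime"),
        "mediatype": imageinfo.get("mediatype"),
        "reported_size": (imageinfo.get("width"), imageinfo.get("height")),
        "stream_size": (video.get("resolution_x"), video.get("resolution_y")),
        "errors": list(meta.get("error", {}).values()),
        "warnings": list(meta.get("warning", {}).values()),
    }


# Hand-trimmed excerpt of the Step 4 response (values copied from the log).
STASH_EXCERPT = {
    "width": 0,
    "height": 0,
    "mime": "video/webm",
    "metadata": [
        {"name": "video", "value": [
            {"name": "resolution_x", "value": 640},
            {"name": "resolution_y", "value": 480},
        ]},
        {"name": "error", "value": [
            {"name": 0, "value": "EBML parser: ran out of file at offset 257139"},
        ]},
    ],
}

print(summarize(STASH_EXCERPT))
```

For this excerpt, reported_size is (0, 0) while stream_size is (640, 480), and the EBML parser error is listed under errors.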

Step 5 logs : upload to Commons

Logs appearing when file is actually sent and published on Commons.

{
    "upload-to-commons": {
        "oauth": {
            "upload": {
                "result": "Success",
                "filename": "LL-Q33302_(fsl)-Yug-crabes.webm",
                "warnings": {
                    "badfilename": "LL-Q33302_(fsl)-Yug-crabes.webm"
                },
                "imageinfo": {
                    "timestamp": "2022-08-24T12:26:43Z",
                    "user": "Yug",
                    "userid": 5554,
                    "size": 257139,
                    "width": 0,
                    "height": 0,
                    "parsedcomment": "",
                    "comment": "",
                    "html": "<p>A file with this name exists already, please check <strong><a class=\"mw-selflink selflink\">File:LL-Q33302 (fsl)-Yug-crabes.webm</a></strong> if you are not sure if you want to overwrite it.\n</p>\n<div class=\"thumb tright\"><div class=\"thumbinner\" style=\"width:182px;\"><a href=\"/w/index.php?title=Special:Upload&amp;wpDestFile=LL-Q33302_(fsl)-Yug-crabes.webm\" class=\"new\" title=\"File:LL-Q33302 (fsl)-Yug-crabes.webm\">File:LL-Q33302 (fsl)-Yug-crabes.webm</a>  <div class=\"thumbcaption\"></div></div></div>",
                    "canonicaltitle": "File:LL-Q33302 (fsl)-Yug-crabes.webm",
                    "url": "https://upload.wikimedia.org/wikipedia/commons/2/20/LL-Q33302_%28fsl%29-Yug-crabes.webm",
                    "descriptionurl": "https://commons.wikimedia.org/wiki/File:LL-Q33302_(fsl)-Yug-crabes.webm",
                    "sha1": "0dc33f65efaa4c1cbf267fef166c4742a738bf21",
                    "metadata": [
                        {
                            "name": "GETID3_VERSION",
                            "value": "1.9.21-202109171300"
                        },
                        {
                            "name": "filesize",
                            "value": 257139
                        },
                        {
                            "name": "avdataoffset",
                            "value": 0
                        },
                        {
                            "name": "avdataend",
                            "value": 257139
                        },
                        {
                            "name": "fileformat",
                            "value": "webm"
                        },
                        {
                            "name": "audio",
                            "value": [
                                {
                                    "name": "dataformat",
                                    "value": "A_OPUS"
                                },
                                {
                                    "name": "sample_rate",
                                    "value": 48000
                                },
                                {
                                    "name": "channels",
                                    "value": 1
                                },
                                {
                                    "name": "language",
                                    "value": "eng"
                                },
                                {
                                    "name": "bits_per_sample",
                                    "value": 32
                                },
                                {
                                    "name": "streams",
                                    "value": [
                                        {
                                            "name": "fbdf6d9e2b41f8",
                                            "value": [
                                                {
                                                    "name": "dataformat",
                                                    "value": "A_OPUS"
                                                },
                                                {
                                                    "name": "default",
                                                    "value": ""
                                                },
                                                {
                                                    "name": "sample_rate",
                                                    "value": 48000
                                                },
                                                {
                                                    "name": "channels",
                                                    "value": 1
                                                },
                                                {
                                                    "name": "language",
                                                    "value": "eng"
                                                },
                                                {
                                                    "name": "bits_per_sample",
                                                    "value": 32
                                                }
                                            ]
                                        }
                                    ]
                                },
                                {
                                    "name": "channelmode",
                                    "value": "mono"
                                }
                            ]
                        },
                        {
                            "name": "video",
                            "value": [
                                {
                                    "name": "dataformat",
                                    "value": "V_VP9"
                                },
                                {
                                    "name": "resolution_x",
                                    "value": 640
                                },
                                {
                                    "name": "resolution_y",
                                    "value": 480
                                },
                                {
                                    "name": "display_unit",
                                    "value": "pixels"
                                },
                                {
                                    "name": "display_x",
                                    "value": 640
                                },
                                {
                                    "name": "display_y",
                                    "value": 480
                                },
                                {
                                    "name": "streams",
                                    "value": [
                                        {
                                            "name": "affc7a1157c058",
                                            "value": [
                                                {
                                                    "name": "dataformat",
                                                    "value": "V_VP9"
                                                },
                                                {
                                                    "name": "default",
                                                    "value": ""
                                                },
                                                {
                                                    "name": "resolution_x",
                                                    "value": 640
                                                },
                                                {
                                                    "name": "resolution_y",
                                                    "value": 480
                                                },
                                                {
                                                    "name": "display_unit",
                                                    "value": "pixels"
                                                },
                                                {
                                                    "name": "display_x",
                                                    "value": 640
                                                },
                                                {
                                                    "name": "display_y",
                                                    "value": 480
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        },
                        {
                            "name": "error",
                            "value": [
                                {
                                    "name": 0,
                                    "value": "EBML parser: ran out of file at offset 257139"
                                }
                            ]
                        },
                        {
                            "name": "warning",
                            "value": [
                                {
                                    "name": 0,
                                    "value": "Unhandled track.video element [module.audio-video.matroska.php:712] (5056::13c0 [1 bytes]) at 185"
                                },
                                {
                                    "name": 1,
                                    "value": "Unhandled audio type \"A_OPUS\""
                                }
                            ]
                        },
                        {
                            "name": "encoding",
                            "value": "UTF-8"
                        },
                        {
                            "name": "mime_type",
                            "value": "video/webm"
                        },
                        {
                            "name": "matroska",
                            "value": [
                                {
                                    "name": "comments",
                                    "value": [
                                        {
                                            "name": "muxingapp",
                                            "value": [
                                                {
                                                    "name": 0,
                                                    "value": "Chrome"
                                                }
                                            ]
                                        },
                                        {
                                            "name": "writingapp",
                                            "value": [
                                                {
                                                    "name": 0,
                                                    "value": "Chrome"
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        },
                        {
                            "name": "version",
                            "value": 2
                        }
                    ],
                    "commonmetadata": [],
                    "extmetadata": {
                        "DateTime": {
                            "value": "2022-08-24 12:26:43",
                            "source": "mediawiki-metadata",
                            "hidden": ""
                        },
                        "ObjectName": {
                            "value": "LL-Q33302 (fsl)-Yug-crabes",
                            "source": "mediawiki-metadata",
                            "hidden": ""
                        },
                        "CommonsMetadataExtension": {
                            "value": 1.2,
                            "source": "extension",
                            "hidden": ""
                        },
                        "Categories": {
                            "value": "",
                            "source": "commons-categories",
                            "hidden": ""
                        },
                        "Assessments": {
                            "value": "",
                            "source": "commons-categories",
                            "hidden": ""
                        }
                    },
                    "mime": "video/webm",
                    "mediatype": "VIDEO",
                    "bitdepth": 0
                }
            }
        }
    }
}
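
To see exactly where the Step 4 and Step 5 responses differ, the two "imageinfo" objects can be compared leaf by leaf. Below is a rough sketch of such a comparison. The function name diff_json and the "<missing>" marker are inventions for this example, and the two small dictionaries are trimmed stand-ins for the real logs.

```python
MISSING = "<missing>"


def diff_json(left, right, path=""):
    """Yield (path, left_value, right_value) for every leaf that differs."""
    if isinstance(left, dict) and isinstance(right, dict):
        for key in sorted(set(left) | set(right), key=str):
            yield from diff_json(
                left.get(key, MISSING), right.get(key, MISSING), f"{path}/{key}"
            )
    elif isinstance(left, list) and isinstance(right, list):
        for index in range(max(len(left), len(right))):
            item_l = left[index] if index < len(left) else MISSING
            item_r = right[index] if index < len(right) else MISSING
            yield from diff_json(item_l, item_r, f"{path}[{index}]")
    elif left != right:
        yield path, left, right


# Trimmed stand-ins for the two logged "imageinfo" objects.
stash = {
    "sha1": "0dc33f65efaa4c1cbf267fef166c4742a738bf21",
    "metadata": [{"name": "GETID3_VERSION", "value": "1.9.20-202006061653"}],
}
commons = {
    "sha1": "0dc33f65efaa4c1cbf267fef166c4742a738bf21",
    "mediatype": "VIDEO",
    "metadata": [{"name": "GETID3_VERSION", "value": "1.9.21-202109171300"}],
}

for change in diff_json(stash, commons):
    print(change)
```

Run on the full responses, this kind of comparison would list every differing field, such as the getID3 version and the stream "default" values, without having to read both dumps line by line.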

Logs for data-writing on Lingualibre wikibase

(This should not be relevant to current bug)

{
    "entity": {
        "labels": {
            "en": {
                "language": "en",
                "value": "crabes"
            }
        },
        "descriptions": {
            "en": {
                "language": "en",
                "value": "audio record - fsl - Yug (Yug)"
            }
        },
        "aliases": {},
        "sitelinks": {},
        "claims": {
            "P2": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P2",
                        "hash": "6511c2931be27bcac018d16006b07c7c2e704c57",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 2,
                                "id": "Q2"
                            },
                            "type": "wikibase-entityid"
                        },
                        "datatype": "wikibase-item"
                    },
                    "type": "statement",
                    "id": "Q797707$A42F8F38-B0D5-4EF3-999D-6980346F3821",
                    "rank": "normal"
                }
            ],
            "P9": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P9",
                        "hash": "0edb2831a379eec7c9bed1be6ea9d8c3025b30ad",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 8,
                                "id": "Q8"
                            },
                            "type": "wikibase-entityid"
                        },
                        "datatype": "wikibase-item"
                    },
                    "type": "statement",
                    "id": "Q797707$822B4435-97D4-41C5-B1B6-3320FA4DB5ED",
                    "rank": "normal"
                }
            ],
            "P3": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P3",
                        "hash": "84062240524e8a58731cacc7e0ecb7b629835e27",
                        "datavalue": {
                            "value": "LL-Q33302 (fsl)-Yug-crabes.webm",
                            "type": "string"
                        },
                        "datatype": "commonsMedia"
                    },
                    "type": "statement",
                    "id": "Q797707$4EFA827E-A4B6-4672-90CD-A3742747FE14",
                    "rank": "normal"
                }
            ],
            "P4": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P4",
                        "hash": "95591f07f16320e048886d049410164647254ec3",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 99628,
                                "id": "Q99628"
                            },
                            "type": "wikibase-entityid"
                        },
                        "datatype": "wikibase-item"
                    },
                    "type": "statement",
                    "id": "Q797707$22FECC98-6546-4A76-8035-452A9F7C544E",
                    "rank": "normal"
                }
            ],
            "P5": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P5",
                        "hash": "b81a7211c2ebed5f07ba758a1a64b50b8f2c9f2b",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 480,
                                "id": "Q480"
                            },
                            "type": "wikibase-entityid"
                        },
                        "datatype": "wikibase-item"
                    },
                    "type": "statement",
                    "id": "Q797707$A97B78DE-9317-4BC8-8670-10E82E65B777",
                    "rank": "normal"
                }
            ],
            "P6": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P6",
                        "hash": "5872c16c70df6f2a4abb4b509689b65f17656fbc",
                        "datavalue": {
                            "value": {
                                "time": "+2022-08-24T00:00:00Z",
                                "timezone": 0,
                                "before": 0,
                                "after": 0,
                                "precision": 11,
                                "calendarmodel": "http://www.wikidata.org/entity/Q1985727"
                            },
                            "type": "time"
                        },
                        "datatype": "time"
                    },
                    "type": "statement",
                    "id": "Q797707$C4093EFE-94AC-49BA-A224-8AEDA7D0F21B",
                    "rank": "normal"
                }
            ],
            "P7": [
                {
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P7",
                        "hash": "4f6d8a94b46a7d6808d58681ca3e18d98dfdbf1e",
                        "datavalue": {
                            "value": "crabes",
                            "type": "string"
                        },
                        "datatype": "string"
                    },
                    "type": "statement",
                    "id": "Q797707$D5926131-6EB9-453D-8376-07ABB69D32F0",
                    "rank": "normal"
                }
            ]
        },
        "id": "Q797707",
        "type": "item",
        "lastrevid": 785511
    },
    "success": 1
}

@Aklapper hello, I investigated this issue as far as I could. Could you now refer this bug to people with detailed knowledge of UploadWizard?

My best guess is that we (Lingualibre) make a typographic mistake in the data above, shown in the logs, so that Commons treats the file as an audio file.
It could be something like "mediatype": "VIDEO" where Commons expects "mediatype": "video", some missing data, or a wrong codec.

With some luck, one of the UploadWizard folks can quickly pinpoint the incorrect value.

Hi, please also connect the attachments F35487578, F35487577, F35487478 to this ticket; they are not visible to anybody else. See https://www.mediawiki.org/wiki/Developers/Maintainers for potential contacts; however, I don't see an issue with UploadWizard here.

Pulling the two videos from https://commons.wikimedia.org/wiki/Category_talk:Lingua_Libre_pronunciation-fsl , with the first one showing a video:

$:acko\> wget -q https://upload.wikimedia.org/wikipedia/commons/4/48/LL-Q33302_%28fsl%29-Antoine_l._%280x010C%29-livre.webm
$:acko\> ffprobe -v error -select_streams v:0 -show_entries stream=width,height,bit_rate -show_entries format=filename -of csv=s=x:p=0 -show_entries stream=codec_name -of default=noprint_wrappers=1:nokey=1 LL-Q33302_\(fsl\)-Antoine_l._\(0x010C\)-livre.webm 
vp8
640
480
N/A
LL-Q33302_(fsl)-Antoine_l._(0x010C)-livre.webm
$:acko\> ffmpeg -i LL-Q33302_\(fsl\)-Antoine_l._\(0x010C\)-livre.webm 
ffmpeg version 4.4.2 Copyright (c) 2000-2021 the FFmpeg developers
Input #0, matroska,webm, from 'LL-Q33302_(fsl)-Antoine_l._(0x010C)-livre.webm':
  Metadata:
    ENCODER         : Lavf58.29.100
  Duration: 00:00:03.02, start: -0.007000, bitrate: 2307 kb/s
  Stream #0:0(eng): Video: vp8, yuv420p(progressive), 640x480, SAR 1:1 DAR 4:3, 30 fps, 30 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      DURATION        : 00:00:03.000000000
  Stream #0:1(eng): Audio: opus, 48000 Hz, mono, fltp (default)
    Metadata:
      DURATION        : 00:00:03.020000000

$:acko\> wget -q https://upload.wikimedia.org/wikipedia/commons/c/c6/LL-Q33302_%28fsl%29-Yug-chinois.webm
$:acko\> ffprobe -v error -select_streams v:0 -show_entries stream=width,height,bit_rate -show_entries format=filename -of csv=s=x:p=0 -show_entries stream=codec_name -of default=noprint_wrappers=1:nokey=1 LL-Q33302_\(fsl\)-Yug-chinois.webm 
vp9
640
480
N/A
LL-Q33302_(fsl)-Yug-chinois.webm
$:acko\> ffmpeg -i LL-Q33302_\(fsl\)-Yug-chinois.webm 
ffmpeg version 4.4.2 Copyright (c) 2000-2021 the FFmpeg developers
Input #0, matroska,webm, from 'LL-Q33302_(fsl)-Yug-chinois.webm':
  Metadata:
    encoder         : Chrome
  Duration: N/A, start: 0.000000, bitrate: N/A
  Stream #0:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)
  Stream #0:1(eng): Video: vp9 (Profile 0), yuv420p(tv), 640x480, SAR 1:1 DAR 4:3, 29.67 tbr, 1k tbn, 1k tbc (default)
    Metadata:
      alpha_mode      : 1

3 screenshots should now be visible.

I also noticed a change from vp8 to vp9.

[[mw:Developers/Maintainers:maintainer]] lists 5 possibly relevant projects:

  • File management (unassigned)
  • Gallery (unassigned)
  • Uploading (unassigned)
  • Media Storage (Filippo_Giunchedi, Matthew Vernon)
  • Extension:CommonsMetadata (Bawolff)

I sent the following call to the 2 Media Storage devs :

== Videos appear as audio on mw ==
As a [[mw:Developers/Maintainers:maintainer]] of Media storage, could you take a look at this phabricator ticket :
* [[phab:T312554]] — Sign language video recordings appear as audio on Commons
The recorded videos end up appearing as audio on MediaWiki and we don't know why. We provided logs to inspect for clues. Depending on the scope of Media storage, you may have the relevant expertise.
<br>If not, feel free to inform us of the contrary or to refer this ticket to more relevant developers.

Need to dig further later.

There's a lot of missing metadata. There is also Duration: N/A, start: 0.000000, bitrate: N/A...
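To make the comparison between the two probes above repeatable, here is a minimal Python sketch that flags the two differences visible in the ffmpeg output: a missing container duration and an audio stream listed before the video stream. The function names `probe` and `container_warnings` are made up for this example, and the checks are only observed differences between the working and the broken file, not confirmed causes of the bug. It assumes ffprobe's JSON writer either omits the duration key or reports "N/A" when the duration is unknown.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe on a file and return its JSON report as a dict.

    Requires ffprobe on PATH; not needed if you already have a report.
    """
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_format", "-show_streams",
         "-of", "json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)


def container_warnings(report):
    """Return human-readable warnings for a parsed ffprobe report.

    These are heuristics taken from this thread, not a validity check.
    """
    warnings = []
    duration = report.get("format", {}).get("duration")
    if duration in (None, "N/A"):
        warnings.append("container duration is missing")
    streams = report.get("streams", [])
    if not any(s.get("codec_type") == "video" for s in streams):
        warnings.append("no video stream found")
    elif streams[0].get("codec_type") != "video":
        warnings.append("first stream is not video")
    return warnings
```

Usage would be something like `container_warnings(probe("LL-Q33302_(fsl)-Yug-chinois.webm"))`, run against both the pre-2020 and the recent file.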

Thanks @Aklapper, your finding is very interesting. I have good will, but I'm more of a Community/Product manager; I do my best. It goes faster when engineers tackle it, though. I will repeat my call to Wikimedia France.

Yug updated the task description.

I guess the question is which exact software, software versions, and commands produce the recent videos. Where to find more info about the tech stack?

Bawolff added subscribers: TheDJ, Bawolff.

Unlikely to be UploadWizard or CommonsMetadata related. Also unlikely to be related to the method of upload.

Sounds sort of similar to T226311

"EBML parser: ran out of file at offset 257139"

So the EBML storage format said: go look here for more data. The parser went to that point in the file, and the file ended before that point. Thus the file that was created is broken or partial.
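As an illustration of that failure mode, here is a minimal Python sketch of reading one EBML element header and checking that the declared payload actually fits in the file. It is not the parser MediaWiki uses; the names `read_vint` and `check_element` are made up for this example, and the error text only mimics the message quoted above. EBML stores both the element ID and the element size as variable-length integers whose byte length is given by the leading zero bits of the first byte.

```python
def read_vint(data, pos, keep_marker):
    """Decode one EBML variable-length integer starting at data[pos].

    Returns (value, length_in_bytes). Element IDs keep the length-marker
    bit; element sizes drop it.
    """
    if pos >= len(data):
        raise EOFError(f"EBML parser: ran out of file at offset {pos}")
    first = data[pos]
    if first == 0:
        raise ValueError("vints longer than 8 bytes are not supported")
    length = 9 - first.bit_length()  # leading zero bits + 1
    if pos + length > len(data):
        raise EOFError(f"EBML parser: ran out of file at offset {len(data)}")
    value = first if keep_marker else first & ((1 << (8 - length)) - 1)
    for byte in data[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, length


def check_element(data, pos=0):
    """Parse the element header at pos and verify its payload fits in data.

    Returns (element_id, payload_size, payload_offset); payload_size is
    None when the element declares the reserved "unknown size" value.
    """
    element_id, id_len = read_vint(data, pos, keep_marker=True)
    size, size_len = read_vint(data, pos + id_len, keep_marker=False)
    payload = pos + id_len + size_len
    if size == (1 << (7 * size_len)) - 1:  # all value bits set: unknown size
        return element_id, None, payload
    if payload + size > len(data):
        # The header promises more bytes than the file contains.
        raise EOFError(f"EBML parser: ran out of file at offset {len(data)}")
    return element_id, size, payload
```

For example, `check_element(bytes([0x1A, 0x45, 0xDF, 0xA3, 0x84, 1, 2]))` raises `EOFError`, because the header declares a 4-byte payload and only 2 bytes follow.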

VLC reports:

mkv warning: no cues/empty cues found->seek won't be precise
mkv debug: MKV/Ebml Parser: m_el[mi_level] == NULL
mkv warning: EOF
mkv warning: cannot get block EOF?
main debug: EOF reached

The video also doesn't want to open at all in Safari (which also indicates that there is something wrong with the video)

MediaWiki says:
https://commons.wikimedia.org/wiki/Special:ApiSandbox#action=query&format=json&prop=videoinfo&meta=&titles=File%3ALL-Q33302%20(fsl)-Antoine%20l.%20(0x010C)-livre.webm&viprop=timestamp%7Cuser%7Cmetadata%7Cextmetadata

Unhandled seekhead element (4x)
Unhandled track element (2x)
Unhandled audio type \"A_OPUS\"

Interestingly, two durations are listed as well, which is also strange; both have Unicode replacement characters at the end.
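For anyone who wants to repeat this check on other files, here is a small Python sketch that builds the equivalent api.php request from the parameters in the ApiSandbox link above. The helper name `videoinfo_url` is made up for this example; it only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Action API endpoint of Wikimedia Commons.
API = "https://commons.wikimedia.org/w/api.php"


def videoinfo_url(title, props=("timestamp", "user", "metadata", "extmetadata")):
    """Build a prop=videoinfo query URL for one file title."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "videoinfo",
        "titles": title,
        "viprop": "|".join(props),  # the API separates multiple values with "|"
    }
    return API + "?" + urlencode(params)
```

The parser warnings quoted above were found in the metadata part of the response.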

Conclusion: the muxer you are using (in this case, by the looks of it, the browser's own muxer) is not spec compliant.

It seems you are using the browser directly, and its recording APIs originally targeted web streaming. They are likely not well tested for producing valid, clean WebM files.

This is probably related to many of the comments/issues reported here:
https://bugs.chromium.org/p/chromium/issues/detail?id=642012
https://bugs.chromium.org/p/chromium/issues/detail?id=561606

You should probably do some WebM cleanup with a tool such as mkclean after taking the file from MediaRecorder.
There is also this JS project, which implements an EBML parser and apparently can be used to clean up MediaRecorder output: https://github.com/legokichi/ts-ebml/blob/master/readme.md

@TheDJ , @Aklapper, @Bawolff thank you for jumping in and providing interpretations and leads.
I asked the Lingualibre community to test various browsers; we need to wait for that feedback.

Note also that the "stash on Lingualibre.org" step reports the language as "eng" (English), while it should be "fra" (audio part, in French) or "fsl" (gesture video part, in French Sign Language).

Done in commit ee86f4ea.

I tried to fix the files before uploading them to the UploadStash using several JavaScript libraries, including ts-ebml mentioned above. The file became seekable and the metadata looked fine, but I still faced the same issue once it was uploaded to a MediaWiki. To avoid loading yet another heavy JS library, I switched to doing the cleanup server-side with ffmpeg, which is really efficient.
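For readers who want to try a similar cleanup locally, here is a minimal Python sketch of a remux with ffmpeg. It is an assumption about what such a server-side step could look like, not the content of the commit above: it copies the existing streams into a freshly written container without re-encoding, which lets ffmpeg's own muxer rewrite the container structure. The function names `remux_command` and `remux` are made up for this example.

```python
import subprocess


def remux_command(src, dst):
    """Build an ffmpeg command that rewrites the container without re-encoding."""
    return [
        "ffmpeg",
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file as produced by MediaRecorder
        "-c", "copy",    # copy the streams as-is, no re-encoding
        dst,
    ]


def remux(src, dst):
    """Run the remux; requires ffmpeg on PATH and raises if ffmpeg fails."""
    subprocess.run(remux_command(src, dst), check=True)
```

After `remux("recorded.webm", "cleaned.webm")`, running `ffmpeg -i cleaned.webm` as in the earlier comments shows whether the duration and stream layout changed.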

Hello @0x010C , thanks already for this commit.
Also, do you think Wikimedia Commons should consider adding a similar server-side file checker for uploaded files, if it can restore metadata?

I'm not sure, as this issue is very specific to files created by the MediaRecorder API.