This release contains sign language videos **embedded as CSV files** of landmarks inside zip archives. Landmark values are rounded to 4 decimal places, which gives a precision of 0.1 mm in world coordinates and 1 pixel in a 10k-resolution image.
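The stated precision follows directly from the rounding: a 4-decimal step is 0.0001, which is 0.1 mm on a value in meters and 1 pixel on a normalized fraction of a 10,000-pixel dimension. A quick arithmetic check (using `Decimal` to avoid float rounding noise):

```python
from decimal import Decimal

step = Decimal("0.0001")  # granularity of 4-decimal rounding

# World coordinates are in meters: one step is 0.1 mm.
assert step * 1000 == Decimal("0.1")

# Image coordinates are fractions of width/height:
# at 10,000 px resolution, one step is exactly 1 pixel.
assert step * 10_000 == Decimal("1")
```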
The text transcription (gloss) of each sign is encoded in the file name. Additional synonyms and translations that map to these signs can be found in the JSON data in the repo. The dataset has three categories:
- **Standard Dictionary**: (788 + 1)
Standard sign language dictionaries obtained from recognized organizations. File names follow `country-organization-groupNumber.landmarks-embeddingModel-extension.zip`.
- **Dictionary Replications**: (788 * 12 * 4 = 37,824) (coming soon!)
Manually recorded sign language videos that replicate the reference clips. File names follow `country-organization-groupNumber_personCode_cameraAngle.landmarks-embeddingModel-extension.zip`.
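As an illustration, the replication file-name pattern above can be split into its fields with a regular expression. This is a sketch; the example name at the bottom is hypothetical and only follows the documented pattern:

```python
import re

# Pattern for replication archives:
# country-organization-groupNumber_personCode_cameraAngle.landmarks-embeddingModel-extension.zip
PATTERN = re.compile(
    r"^(?P<country>[^-]+)-(?P<organization>[^-]+)-(?P<group_number>[^_]+)"
    r"_(?P<person_code>[^_]+)_(?P<camera_angle>[^.]+)"
    r"\.landmarks-(?P<embedding_model>[^-]+)-(?P<extension>[^.]+)\.zip$"
)

def parse_replication_name(filename: str) -> dict:
    """Split a replication archive name into its named fields."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized file name: {filename}")
    return match.groupdict()

# Hypothetical example name, for illustration only
fields = parse_replication_name(
    "pk-hfad-1_person1_front.landmarks-mediapipe-csv.zip"
)
```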
<!--
- **Miscellaneous Sentences**:
These are labeled sign language videos scraped from the internet. The names are `languageName-source-serialNumber_featureType-subtype.zip`.
-->
**MediaPipe Landmarks Header**
World coordinates are 3D body-joint coordinates in meters. Image coordinates are the fraction of the video width/height at which the landmark is located, and the z value is depth from the camera.
For both models, we get 33 pose landmarks and 21 landmarks per hand, with 5 values per landmark (x, y, z, visibility, presence).
`total_rows` = number of frames in the source video
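Putting the counts together: 33 pose + 2 × 21 hand landmarks = 75 landmarks, and 75 × 5 values = 375 values per frame (one row per frame). A minimal sketch of reading such a CSV out of a zip archive, here built in memory with synthetic data since the real archive names and internal file layout are not assumed:

```python
import csv
import io
import zipfile

N_POSE, N_HANDS, N_HAND_LM, N_VALUES = 33, 2, 21, 5
COLUMNS = (N_POSE + N_HANDS * N_HAND_LM) * N_VALUES  # 375 values per frame

# Build a tiny in-memory zip with a synthetic 2-frame CSV;
# real archives follow the naming scheme described above.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    rows = [["0.1234"] * COLUMNS for _ in range(2)]
    zf.writestr("example.csv", "\n".join(",".join(r) for r in rows))

# Read the CSV back out of the archive without extracting to disk.
with zipfile.ZipFile(buffer) as zf:
    with zf.open("example.csv") as f:
        reader = csv.reader(io.TextIOWrapper(f, encoding="utf-8"))
        frames = [list(map(float, row)) for row in reader]

# total_rows equals the number of frames in the source video
assert len(frames) == 2 and len(frames[0]) == COLUMNS
```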