# Tube-Archivist Scripts

A small collection of Bash helpers that prepare offline or archived YouTube videos for import into TubeArchivist. Written for Debian-like systems; they should also work on other Linux distributions that have Bash and the standard GNU utilities.

---

## Goal

Normalize filenames and create accompanying metadata (`.info.json`) so TubeArchivist can ingest local archives (especially those from archive.org or other offline sources).

Example input filenames:

- Example A: `20170311 (5XtCZ1Fa9ag) Terry A Davis Live Stream.mp4`
- Example B: `20131003 - 001 - 1okW1RTPZ7Q - TempleOS Hymns #1.mp4`

Resulting filenames and sidecar JSON:

- Example A:
  - `20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].mp4`
  - `20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].info.json`
- Example B:
  - `20131003 - 001 - TempleOS Hymns #1 [1okW1RTPZ7Q].mp4`
  - `20131003 - 001 - TempleOS Hymns #1 [1okW1RTPZ7Q].info.json`

---

## How it works / Usage

1. Put all the scripts in the directory with your video files (the scripts currently do not recurse into subdirectories).
2. Edit `Example.info.json`.
   - Update these lines, plus any other lines you want copied to each video that the scripts will not fill in. Fields left as null are ones I didn't have data for yet.

   ```
   "channel_id": "Change to Youtube Username",
   "uploader": "Change to Youtube Username",
   "uploader_id": "Change To Channel ID",
   "uploader_url": "https://www.youtube.com/channel/ChangeToChannelID-or-username",
   ```
3. Run the scripts listed below, in order, from the directory containing your media. Each script performs a single transformation so you can inspect the results between steps.

## Scripts (order and purpose)

1a. `convert-()-to-[].bash`
   - Replace parentheses containing an ID with square brackets (e.g. `(ID)` -> `[ID]`) and clean up spacing.
   - If the ID is already at the end of the filename, skip to 3.

1b. `move-find-id-to-end-filename.bash`
   - Split the filename into parts, find the unbracketed video ID between the second and third " - ", add brackets, and move `[id]` to the end of the filename, just before the extension.
   - After this script, skip 1a and 2a and go straight to 3.

2a. `move-[id]-to-end-filename.bash`
   - Ensure the video ID appears at the end of the filename inside square brackets.

3. `create-json-alongside-each-file.bash`
   - Create an empty `.info.json` file (sidecar) for each video filename.

4. `insert-id-into-json.bash`
   - Populate the sidecar JSON with the video ID field.

5. `insert-title-into-json.bash`
   - Insert the cleaned title into the sidecar JSON.

6. `insert-date-into-json.bash`
   - Insert the date from the filename (if available) into the sidecar JSON.

---

## Notes and tips

- The scripts do not process subdirectories. Run them from the directory that holds each archive's video files.
- Always test on a copy, or on a subset of files first, to confirm the behavior.
- If filenames may contain unusual characters, run a quick grep for non-ASCII characters before processing.
- Add a dry-run mode to the scripts if you want safer previews.
- Elasticsearch commands for updates: [ElasticSearch Common Commands](ElasticSearch-Common-Commands.md)

---

## Example archive

Archive used for testing: `https://archive.org/details/TempleOS-TheMissingVideos`

---
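
The renaming described under "Scripts" (steps 1a through 2a) can be previewed without touching any files. The following is a minimal sketch of such a dry run, not code taken from the scripts in this repository. The function names `normalize_name` and `preview_renames` are hypothetical, and the sketch assumes video IDs are exactly 11 characters drawn from letters, digits, `_` and `-`, and that the videos end in `.mp4`.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the repository's scripts.

# Print the normalized form of one filename.
# Assumption: a video ID is exactly 11 characters from [A-Za-z0-9_-].
normalize_name() {
  local name=$1
  # Form A: "<prefix> (<id>) <title>.<ext>"
  local re_paren='^(.*) \(([A-Za-z0-9_-]{11})\) (.*)\.([^.]+)$'
  # Form B: "<prefix> - <id> - <title>.<ext>"
  # Note: an 11-character word between " - " separators would also match.
  local re_dash='^(.*) - ([A-Za-z0-9_-]{11}) - (.*)\.([^.]+)$'

  if [[ $name =~ $re_paren ]]; then
    printf '%s %s [%s].%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" \
      "${BASH_REMATCH[2]}" "${BASH_REMATCH[4]}"
  elif [[ $name =~ $re_dash ]]; then
    printf '%s - %s [%s].%s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" \
      "${BASH_REMATCH[2]}" "${BASH_REMATCH[4]}"
  else
    # Unrecognized or already normalized: print the name unchanged.
    printf '%s\n' "$name"
  fi
}

# Dry run: list the planned renames in the current directory.
# Nothing is renamed; subdirectories are not visited.
preview_renames() {
  local f new
  for f in *.mp4; do
    [[ -e $f ]] || continue   # the glob matched nothing
    new=$(normalize_name "$f")
    if [[ $new != "$f" ]]; then
      printf '%s -> %s\n' "$f" "$new"
    fi
  done
}

# normalize_name '20170311 (5XtCZ1Fa9ag) Terry A Davis Live Stream.mp4'
# prints: 20170311 Terry A Davis Live Stream [5XtCZ1Fa9ag].mp4
```

To try it, source the snippet in a copy of the archive directory and call `preview_renames`; it prints one `old -> new` line per file that would change. Filenames that match neither form are left out of the listing, which makes it easy to spot names that need manual attention.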