- **BREAKING CHANGE**
- BEFORE:
- `create_list_for()` returned a `str` containing the name of the file the program wrote to
- NOW:
- `create_list_for()` returns a `tuple` containing
- a `list` of `list`s containing the video information found by the program for the current run
- by default, returns dummy video data to avoid cluttering the output
- to return the actual video data, set the `video_data_returned` ListCreator attribute to `True`
- dummy data: `[[0, '', '', '']]`
- a `tuple` containing a `str` with the name of the channel (taken from the channel's heading) and a `str` with the name of the file written to
- `('The Channel Name', 'the_name_of_the_file')`
- `('The Channel Name', '')` if the ListCreator attributes are `txt=False`, `csv=False`, `md=False`, AND `video_data_returned=True`
- see the **NEW FEATURES** section below for more details about `video_data_returned`
- access the full documentation for the updated `create_list_for` method by running `help(ListCreator.create_list_for)` in the Python interpreter
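
A minimal sketch of how calling code might adapt to the new return shape described above. `create_list_for_stub` and its two parameters are hypothetical stand-ins that only mimic the documented return value. They are not part of the package, and the non-dummy row contents are placeholders rather than real scraped fields.

```python
def create_list_for_stub(video_data_returned=False, file_written=True):
    """Hypothetical stand-in that mimics only the SHAPE of the new return value."""
    if video_data_returned:
        # placeholder row: the real field layout is described in the full documentation
        video_data = [[1, 'placeholder', 'placeholder', 'placeholder']]
    else:
        # dummy data returned by default to avoid cluttering the output
        video_data = [[0, '', '', '']]
    # the file name is an empty string when no file is written
    file_name = 'the_name_of_the_file' if file_written else ''
    return video_data, ('The Channel Name', file_name)


# BEFORE: file_name = create_list_for(...)
# NOW: unpack the outer tuple and the inner (channel name, file name) tuple
video_data, (channel_name, file_name) = create_list_for_stub()
print(video_data)    # [[0, '', '', '']]
print(channel_name)  # The Channel Name
print(file_name)     # the_name_of_the_file
```

Code that previously treated the return value as a `str` needs to unpack the file name from the second element of the outer `tuple` instead.
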
- **BUGFIX**
- fixes `cookie_consent` blocking logic for new HTML in GDPR regions
- YouTube updated the HTML formatting for blocking cookie consent, and the previous cookie consent blocking logic broke
- this release fixes the blocking logic to work with the new HTML formatting
- **NEW FEATURES**
- overview of the new ListCreator attributes given here, but run `help(ListCreator)` in the Python interpreter or read the "More API information" section in the python README to see the full documentation:
- `file_suffix` allows more control over the file naming (`True` by default)
- `all_video_data_in_memory` scrapes the ENTIRE YouTube channel's videos page, EVEN if files exist for the channel already (`False` by default)
- must also set the `video_data_returned` attribute to `True` to actually get this information
- `video_data_returned` returns the video data for all videos the program scraped (`False` by default)
- data returned depends on a number of factors, see full documentation for more details
- `video_id_only` saves only the video ID instead of the entire URL (`False` by default)
- example: saves 'abcdefghijk' instead of 'https://www.youtube.com/watch?v=abcdefghijk'
- overview of the updated `file_name` argument options in the `create_list_for` method given here, but run `help(ListCreator.create_list_for)` in the Python interpreter to see the full documentation:
- `file_name='auto'` names the output file(s) using the name that shows up under the banner when you navigate to the channel's homepage (with spaces removed)
- `file_name='id'` names the output file(s) using the identifier from the URL provided to the `url` argument
- run `help(ListCreator.create_list_for)` for a comprehensive list of examples
- using `file_name='id'` is very useful when multiple channels have the SAME channel name
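
To illustrate what `video_id_only` changes in the saved output, the sketch below pulls the video ID out of a watch URL with the standard library. `video_id_from_url` is a hypothetical helper written for this example. It is not the package's own implementation, which may extract the ID differently.

```python
from urllib.parse import urlparse, parse_qs


def video_id_from_url(url):
    """Return the value of the `v` query parameter from a YouTube watch URL."""
    query = parse_qs(urlparse(url).query)
    return query['v'][0]


print(video_id_from_url('https://www.youtube.com/watch?v=abcdefghijk'))  # abcdefghijk
```
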
- **PERFORMANCE IMPROVEMENTS**
- BEFORE:
- the program pulled the video data from the selenium instance and wrote to the file(s) directly
- NOW:
- the program loads the video data from the selenium instance into memory, THEN writes the saved video data from memory to the file(s)
- the performance improvement is more noticeable when writing more information
- for example:
- writing information for 200 videos to just a csv file: negligible performance difference between writing to csv file directly and loading to memory & THEN writing to csv file
- writing information for 200 videos to csv, txt, md files: slight performance difference between writing to files directly and loading to memory & THEN writing to files, but still not much of a performance difference
- writing information for 20000 videos to just a csv file: noticeable performance difference between writing to csv file directly and loading to memory & THEN writing to csv file
- writing information for 20000 videos to csv, txt, md files: significant performance difference between writing to files directly and loading to memory & THEN writing to files
- summary:
- the performance difference between writing to ONE file directly and loading to memory & THEN writing to ONE file is barely noticeable for small jobs (about 19.5 s either way for 230 videos in the logs below) and more noticeable for larger jobs (about 8415 s vs. 8254 s for roughly 32350 videos)
- the performance difference between writing to MULTIPLE files directly and loading to memory & THEN writing to MULTIPLE files is more noticeable for small jobs (compared to writing to only ONE file; about 25.6 s vs. 18.8 s for 230 videos) and SIGNIFICANT for larger jobs (about 11086 s vs. 8508 s for roughly 32350 videos)
- logs from tests used to benchmark performance included below:
<details>
<summary><strong><em>See logs</em></strong></summary>
<details>
<summary><strong>for https://www.youtube.com/user/schafer5 (small channel, 230 videos)</strong></summary>
<details>
<summary>writing to 1 file directly with csv=True, txt=False, md=False </summary>
- to create the file:
It took 9.240757292005583 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 4.265756259999762 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
This program took 19.537945401003526 seconds to complete.
- to update the file:
It took 0.8453300589972059 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 0.6392399440010195 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
This program took 7.754261410002073 seconds to complete.
</details>
<details>
<summary>writing to 1 file by loading video information into memory THEN writing to the file with csv=True, txt=False, md=False</summary>
- to create the file:
It took 9.163404727999989 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 4.260267737000007 seconds to load information for 230 videos into memory
It took 0.002389371999996115 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
This program took 19.483281371000004 seconds to complete.
- to update the file:
It took 0.8521808300000089 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 1.0964175420000117 seconds to load information for 60 videos into memory
It took 0.0015745449999826633 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
This program took 7.985743492000012 seconds to complete.
</details>
<details>
<summary>writing to 3 files directly with csv=True, txt=True, md=True</summary>
- to create the files:
It took 9.166668037003546 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 10.160974278995127 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.txt
It took 10.164936708999448 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
It took 10.168633003995637 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.md
This program took 25.594990328005224 seconds to complete.
- to update the files:
It took 0.8503098270011833 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 1.5225159670007997 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
It took 1.5322243859991431 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.txt
It took 1.5359413480036892 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.md
This program took 8.472728426997492 seconds to complete.
</details>
<details>
<summary>writing to 3 files by loading video information into memory THEN writing to files with csv=True, txt=True, md=True</summary>
- to create the files:
It took 9.367390958000005 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 4.218187391999997 seconds to load information for 230 videos into memory
It took 0.003894963000000473 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.md
It took 0.005060710999998719 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
It took 0.006283445999997639 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.txt
This program took 18.754924324 seconds to complete.
- to update the files:
It took 0.8672965029999986 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 1.0901944209999996 seconds to load information for 60 videos into memory
It took 0.005667658999996661 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
It took 0.008393589000000645 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.txt
It took 0.008197031000001687 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.md
This program took 8.090583961999997 seconds to complete.
</details>
</details>
<details>
<summary><strong>for https://www.youtube.com/c/KhanAcademy (medium channel, 8095 videos)</strong></summary>
<details>
<summary>writing to 1 file directly with csv=True, txt=False, md=False </summary>
- to create the file:
It took 322.72226654399856 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 256.63442500399833 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
This program took 585.4076739919983 seconds to complete.
- to update the file:
It took 0.8482559289986966 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 0.5600300389996846 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
This program took 7.653723870003887 seconds to complete.
</details>
<details>
<summary>writing to 1 file by loading video information into memory THEN writing to the file with csv=True, txt=False, md=False</summary>
- to create the file:
It took 316.9717323640002 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 248.92245618300012 seconds to load information for 8095 videos into memory
It took 0.07691853599999376 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
This program took 572.114162118 seconds to complete.
- to update the file:
It took 0.8459371520000332 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 0.9670944140000302 seconds to load information for 60 videos into memory
It took 0.02941359300007207 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
This program took 8.209143252000104 seconds to complete.
</details>
<details>
<summary>writing to 3 files directly with csv=True, txt=True, md=True</summary>
- to create the files:
It took 314.01985485899786 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 519.1903085960002 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.txt
It took 519.1941804189992 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
It took 519.197644068001 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.md
This program took 839.4073893879977 seconds to complete.
- to update the files:
It took 0.8488957250010571 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 1.580211615000735 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
It took 1.681963879003888 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.txt
It took 1.6842712280049454 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.md
This program took 8.823843261001457 seconds to complete.
</details>
<details>
<summary>writing to 3 files by loading video information into memory THEN writing to files with csv=True, txt=True, md=True</summary>
- to create the files:
It took 316.342601403 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 261.87072707100003 seconds to load information for 8095 videos into memory
It took 0.1363127509999913 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
It took 0.1775351439999895 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.md
It took 0.18588107000005039 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.txt
This program took 584.703847726 seconds to complete.
- to update the files:
It took 0.8483775499998956 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 1.0671216570001434 seconds to load information for 60 videos into memory
It took 0.17331316700006028 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
It took 0.22995445900005507 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.txt
It took 0.23345572800008085 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.md
This program took 8.503321469999833 seconds to complete.
</details>
</details>
<details>
<summary><strong>for https://www.youtube.com/user/NBCNews/videos (large channel, ~32550 videos)</strong></summary>
<details>
<summary>writing to 1 file directly with csv=True, txt=False, md=False </summary>
- to create the file:
It took 3420.0639533489993 seconds to find 32347 videos from https://www.youtube.com/user/NBCNews/videos
It took 4988.648231769999 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.csv
This program took 8414.909623333002 seconds to complete.
- to update the file:
forgot to run this test :D
</details>
<details>
<summary>writing to 1 file by loading video information into memory THEN writing to the file with csv=True, txt=False, md=False</summary>
- to create the file:
It took 3367.386001154002 seconds to find 32357 videos from https://www.youtube.com/user/NBCNews/videos
It took 4880.191474030002 seconds to load information for 32357 videos into memory
It took 0.24478799300050014 seconds to write all 32357 videos to NBCNews_reverse_chronological_videos_list.csv
This program took 8253.73690525 seconds to complete.
- to update the file:
It took 0.8474488579995523 seconds to find 60 videos from https://www.youtube.com/user/NBCNews/videos
It took 1.1012943870009622 seconds to load information for 60 videos into memory
It took 0.11654774600174278 seconds to write the 5 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
This program took 8.668505469999218 seconds to complete.
</details>
<details>
<summary>writing to 3 files directly with csv=True, txt=True, md=True</summary>
- to create the files:
It took 3396.025502143 seconds to find 32347 videos from https://www.youtube.com/user/NBCNews/videos
It took 7683.585577874001 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.txt
It took 7683.592947972 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.md
It took 7684.030176524999 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.csv
This program took 11086.336240618999 seconds to complete.
- to update the files:
It took 0.8738655359993572 seconds to find 60 videos from https://www.youtube.com/user/NBCNews/videos
It took 1.8775347520004289 seconds to write the 0 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
It took 2.120259861001614 seconds to write the 0 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.txt
It took 2.132926509999379 seconds to write the 0 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.md
This program took 9.435579917999348 seconds to complete.
</details>
<details>
<summary>writing to 3 files by loading video information into memory THEN writing to files with csv=True, txt=True, md=True</summary>
- to create the files:
It took 3478.1540728540003 seconds to find 32353 videos from https://www.youtube.com/user/NBCNews/videos
It took 5022.493407319 seconds to load information for 32353 videos into memory
It took 0.5065521739998076 seconds to write the 6 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
It took 0.587243801997829 seconds to write all 32353 videos to NBCNews_reverse_chronological_videos_list.txt
It took 0.6058889249979984 seconds to write all 32353 videos to NBCNews_reverse_chronological_videos_list.md
This program took 8507.703900004002 seconds to complete.
- to update the files:
It took 0.8569685050024418 seconds to find 60 videos from https://www.youtube.com/user/NBCNews/videos
It took 1.1060196290018212 seconds to load information for 60 videos into memory
It took 0.5880495099991094 seconds to write the 4 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
It took 0.8386826800015115 seconds to write the 4 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.txt
It took 0.8496009250011411 seconds to write the 4 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.md
This program took 9.45503293100046 seconds to complete.
</details>
</details>
</details>
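
The load-into-memory-THEN-write approach from the **PERFORMANCE IMPROVEMENTS** section can be sketched roughly as follows. Everything here is a simplified illustration and not the program's actual code. `FakeVideoSource` is a hypothetical stand-in for the selenium instance, and the sketch assumes the direct approach reads from the browser once per output file, which is what the timings in the logs suggest. Each output file gets the same tab-separated rows for simplicity.

```python
import tempfile
from pathlib import Path


class FakeVideoSource:
    """Hypothetical stand-in for the selenium instance; counts every row read."""

    def __init__(self, count):
        self.count = count
        self.reads = 0

    def rows(self):
        for number in range(self.count, 0, -1):
            self.reads += 1  # each read stands in for a slow browser query
            yield [number, f'title {number}', f'url {number}']


def write_rows(path, rows):
    """Write one tab-separated line per row."""
    with open(path, 'w', encoding='utf-8') as file:
        for row in rows:
            file.write('\t'.join(str(field) for field in row) + '\n')


def write_directly(source, paths):
    """Assumed BEFORE behavior: pull from the source again for every file."""
    for path in paths:
        write_rows(path, source.rows())


def load_then_write(source, paths):
    """NOW behavior: pull from the source once, then write every file from memory."""
    rows = list(source.rows())
    for path in paths:
        write_rows(path, rows)


with tempfile.TemporaryDirectory() as directory:
    paths = [Path(directory) / f'videos.{ext}' for ext in ('csv', 'txt', 'md')]
    direct = FakeVideoSource(200)
    write_directly(direct, paths)
    in_memory = FakeVideoSource(200)
    load_then_write(in_memory, paths)
    print(direct.reads, in_memory.reads)  # 600 200
```

Under these assumptions the number of slow reads grows with the number of output files in the direct approach but stays constant when the data is loaded once, which fits the pattern in the logs where the gap widens for three files.
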