Suppose I have an archive or file on Yandex Disk, such as https://disk.yandex.ru/d/KGA6qXDT87pTVA, and I want to download it directly to my Google Drive. How can I do that? The first idea that comes to mind is:
from google.colab import drive
drive.mount('/content/drive/path/to/my/dir')
! wget https://disk.yandex.ru/d/KGA6qXDT87pTVA
But this downloads an HTML page, not the content. To use wget I need a direct link to the file, and I can't find one on the page. What should I do?
Another solution I came up with is to parse the HTML of the page and extract the download button's link using BeautifulSoup. However, many sites protect their pages with a CAPTCHA to guard against automated access, so scraping is unreliable. The proper way to do this without violating any rules is to use API requests.
The official Yandex documentation describes in detail how to do it.
Check https://yandex.com/dev/disk/api/concepts/about.html
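As a rough illustration of the API route, here is a minimal sketch that asks the Yandex Disk REST API for a temporary direct link to a public resource and then streams it to disk. It assumes the `public/resources/download` endpoint, which as far as I know takes the share URL as `public_key` and returns JSON with an `href` field, and needs no OAuth token for public links. The function names and the destination path are my own, purely for illustration, and if the link points to a folder the API may hand back an archive rather than a single file.

```python
import json
import shutil
import urllib.parse
import urllib.request

# Public-resources download endpoint of the Yandex Disk REST API
# (assumed here; check the official docs for the current URL).
API_ENDPOINT = "https://cloud-api.yandex.net/v1/disk/public/resources/download"


def build_api_url(public_url: str) -> str:
    """Return the API request URL for a given public share link."""
    return API_ENDPOINT + "?" + urllib.parse.urlencode({"public_key": public_url})


def get_direct_link(public_url: str, opener=urllib.request.urlopen) -> str:
    """Ask the API for a temporary direct download link (the "href" field)."""
    with opener(build_api_url(public_url)) as response:
        return json.load(response)["href"]


def download(public_url: str, destination: str) -> None:
    """Stream the shared resource to `destination` without loading it into memory."""
    direct_link = get_direct_link(public_url)
    with urllib.request.urlopen(direct_link) as src, open(destination, "wb") as dst:
        shutil.copyfileobj(src, dst)


# Example call (hypothetical destination on a Drive mounted at /content/drive):
# download("https://disk.yandex.ru/d/KGA6qXDT87pTVA",
#          "/content/drive/MyDrive/archive.zip")
```

In Colab you would mount Google Drive first and then pass a path under the mount point as `destination`, so the file lands on Drive directly. The direct link is short-lived, so request it right before downloading rather than storing it.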
Unfortunately, this doesn't help in every case: some file-sharing platforms provide no API while still protecting their pages with a CAPTCHA. In those situations you have to download the file manually, without any automation.