Switch from Python urlretrieve() to requests for better features


Python’s old urllib.request.urlretrieve is a legacy interface that the Python documentation says might become deprecated. One of the biggest problems with urlretrieve is that it has no timeout parameter, so on a bad internet connection it can hang for many minutes. This leads to complaints from users who think your program is hanging, when the real cause is the connection.

Fix: Upgrade to Python requests

Python requests is strongly recommended for most Python internet use. Instead of bashing your head against solved problems with the vagaries of network and internet connections, use requests.

This is a recommended, robust way to download files in Python with a timeout. I name it urlretrieve to remind myself not to use the old one.

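from pathlib import Path
import requests

def urlretrieve(url: str, fn: Path):
    # Download first, so a failed request does not leave an empty file behind
    r = requests.get(url, allow_redirects=True, timeout=10)
    r.raise_for_status()  # raise on HTTP 4xx/5xx instead of saving an error page
    fn.write_bytes(r.content)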
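The function above holds the whole response in memory before writing it, which is fine for small files. For large downloads, a streaming variant may be a better fit. The following is a minimal sketch, not part of the original recipe: the name download_stream, the chunk_size default, and the returned byte count are my own assumptions for illustration. Note that the requests timeout applies to connecting and to each wait for data from the server, not to the total download time.

```python
from pathlib import Path

import requests


def download_stream(url: str, fn: Path, timeout: float = 10.0,
                    chunk_size: int = 65536) -> int:
    """Stream url to fn in chunks and return the number of bytes written.

    Hypothetical helper for illustration; the name and defaults are assumptions.
    """
    written = 0
    # stream=True defers reading the body until we iterate over it
    with requests.get(url, stream=True, allow_redirects=True,
                      timeout=timeout) as r:
        r.raise_for_status()  # raise on HTTP 4xx/5xx before creating the file
        with fn.open("wb") as f:
            for chunk in r.iter_content(chunk_size=chunk_size):
                f.write(chunk)
                written += len(chunk)
    return written
```

Wrapping the call in try / except requests.exceptions.RequestException lets the program report a timeout or a bad connection to the user instead of appearing to hang; requests.exceptions.Timeout can be caught separately if timeouts need their own message.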

Why isn’t this in requests? Because the Requests BDFL doesn’t want it.
