Compressing Blog Images with TinyPNG

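My personal blog, https://blog.zerolacqua.top/, has been running for almost three years. Compared with this blog, one problem there is that images are a hassle to handle. Every post needs a cover image. At first I picked them patiently, but later I increasingly just grabbed anime art without thinking. Because finding images was such a chore, I stopped wanting to write; I had plenty of ideas but ended up too lazy to put them down.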

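The bigger problem, though, is that the images were barely compressed. The only processing was resizing them in the Windows image editor. These images take up space and bandwidth, slow the pages down, and eat into my CDN traffic. Before writing this post I had been looking for a solution, hoping to find a CI step that could be integrated into the hexo deployment process, but nothing was quite satisfactory. Recently I couldn't take it anymore, so I set aside a day to build a stopgap: compress the images first and worry about the rest later. Hence this post on compressing blog images with TinyPNG.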

Using the API

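If you only have one or two images, you can simply upload them on the TinyPNG website, but that is limited to 20 images at a time and there is a file size limit.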

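A registered account gets 500 free API calls per month, which is enough for a small blog. The TinyPNG API documentation is very simple; a quick read is basically all it takes. The API also offers resize functionality that can resize your images in different ways (a resize also counts as one request). For avatars, for example, I use the thumb method and resize to 128×128; even avatars that aren't square are adjusted automatically. Post covers are brought to a uniform size the same way (if an image's aspect ratio differs a lot from the target, the crop may not be ideal).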
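To get a feel for how much a crop can remove when aspect ratios differ, here is a rough calculation. It assumes thumb sizes images like a plain cover crop (scale until the target box is filled, then trim the overflow); the real thumb method may pick the crop region differently, and `cover_crop_loss` is a name made up for this sketch.

```python
from fractions import Fraction


def cover_crop_loss(src_w: int, src_h: int, dst_w: int, dst_h: int) -> float:
    """Fraction of the source image area discarded by a cover-style resize.

    Assumption: the image is scaled uniformly until it covers the target
    box, then whatever overflows the box is cropped away.
    """
    # Scale factor needed so that both target dimensions are covered.
    scale = max(Fraction(dst_w, src_w), Fraction(dst_h, src_h))
    scaled_w, scaled_h = src_w * scale, src_h * scale
    # Share of the scaled image that survives inside the target box.
    kept = Fraction(dst_w * dst_h) / (scaled_w * scaled_h)
    return float(1 - kept)


if __name__ == "__main__":
    print(cover_crop_loss(1000, 1000, 640, 360))  # square image as a 16:9 cover
    print(cover_crop_loss(1920, 1080, 640, 360))  # already 16:9
```

Under that assumption, a square 1000×1000 image turned into a 640×360 cover loses about 44% of its area, while a 1920×1080 image loses nothing.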

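On top of this API, the script scans the image folder and replaces each original with the compressed image of the same name. The catch: if the whole process runs every time I add a few images, won't many images be compressed all over again? 500 requests wouldn't last long that way.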

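Clearly, only new or changed images need to be processed. So the script uses a file fingerprint: it computes a hash of each file and caches it. Before each API request, it compares the file's current hash with the cached one and skips the request if they match. The cached value is the hash of the file after compression, so an already-compressed image matches on the next run.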
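Before spending any quota, it can help to list which images a run would actually send. The following is a minimal dry-run sketch of the fingerprint check, assuming the cache is a dict that maps path strings to SHA-256 hex digests; `pending_images` and `file_sha256` are illustrative names, not part of any library.

```python
import hashlib
from pathlib import Path

IMAGE_SUFFIXES = {".avif", ".webp", ".jpg", ".jpeg", ".png"}


def file_sha256(path: Path) -> str:
    """Hash a file in chunks so large images are not loaded into memory at once."""
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def pending_images(folder: Path, cache: dict) -> list:
    """Return images whose current hash is missing from or differs from the cache."""
    pending = []
    for path in sorted(folder.rglob("*")):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            if cache.get(str(path)) != file_sha256(path):
                pending.append(path)
    return pending
```

Each pending image costs at least one request, and one more if it is also resized, so `len(pending_images(folder, cache))` gives a lower bound on what a run will use.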

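For testing, and out of file-organizing habit, I keep the compression script in a different place from my image folders. So that the script runs inside its own project path yet can still modify images under the assets folder (which is deployed to my site's CDN origin), I use symbolic links. The script doesn't need to live in assets; it sits in a separate project folder, and the image folders to process are symlinked into that project folder.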
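The link can be created by hand, or from Python as in this small sketch. The function name and the folder names in the usage line are placeholders; note also that creating symlinks on Windows may require Developer Mode or an elevated prompt.

```python
from pathlib import Path


def link_image_dir(target: Path, link: Path) -> Path:
    """Create `link` as a symlink to the `target` directory unless it already exists."""
    if link.is_symlink() or link.exists():
        return link  # leave an existing link or folder untouched
    link.symlink_to(target.resolve(), target_is_directory=True)
    return link
```

For example, `link_image_dir(Path("../assets/images"), Path("images"))` would let a script that works on `images` inside the project folder modify the files under assets.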

The source code is below. The idea is fairly simple (asking GPT to write it is enough).

Python
import hashlib
import json
from pathlib import Path
import tinify

tinify.key = 'your_key'

IMAGE_DIR = Path("images")
CACHE_FILE = IMAGE_DIR / ".tinypng-cache.json"


def compute_file_hash(filepath: Path):
    hasher = hashlib.sha256()
    with filepath.open("rb") as f:
        while chunk := f.read(8192):
            hasher.update(chunk)
    return hasher.hexdigest()


def load_cache():
    if CACHE_FILE.exists():
        with CACHE_FILE.open("r") as f:
            return json.load(f)
    return {}


def save_cache(cache):
    with CACHE_FILE.open("w") as f:
        json.dump(cache, f, indent=2)


def is_image_file(filename):
    return str(filename).lower().endswith((".avif", ".webp", ".jpg", ".jpeg", ".png"))


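def compress_image(filepath: Path, resize_config_dict) -> bool:
    """Compress one image in place; return True only if the file was rewritten."""
    try:
        print(f"🔄 Compressing: {filepath}")
        source = tinify.from_file(str(filepath))
        if resize_config_dict is not None:
            # Resizing counts as an additional request against the quota.
            resized = source.resize(**resize_config_dict)
            resized.to_file(str(filepath))
        else:
            source.to_file(str(filepath))
        return True
    except tinify.errors.AccountError:
        print("❌ Account verification failed. Check the API key and the monthly quota.")
        raise SystemExit(1)
    except Exception as e:
        print(f"⚠️ Compression failed: {filepath} - {e}")
        return False


def compress_folder_with_config(folder: Path, cache, resize_config_dict=None):
    """Recursively compress changed images under folder; return True if any were updated."""
    updated = False
    for filepath in Path(folder).rglob("*"):
        if not (filepath.is_file() and is_image_file(filepath)):
            continue
        file_hash = compute_file_hash(filepath)
        if cache.get(str(filepath)) == file_hash:
            print(f"✔️ Already compressed, skipping: {filepath}")
        elif compress_image(filepath, resize_config_dict):
            # Cache the hash of the compressed output so the next run skips it.
            # Failed files are not cached, so they are retried next time.
            cache[str(filepath)] = compute_file_hash(filepath)
            updated = True
    return updated


def compress_only_current_folder(folder: Path, cache):
    """Compress changed images directly inside folder (no recursion, no resize)."""
    updated = False
    for file in folder.iterdir():
        if not (file.is_file() and is_image_file(file.name)):
            continue
        file_hash = compute_file_hash(file)
        if cache.get(str(file)) == file_hash:
            print(f"✔️ Already compressed, skipping: {file}")
        elif compress_image(file, resize_config_dict=None):
            cache[str(file)] = compute_file_hash(file)
            updated = True
    return updated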


def add_cache_to_skip(folder: Path):
    """Record current hashes for images in folder so later runs skip them without an API call."""
    cache = load_cache()
    updated = False

    for file in folder.iterdir():
        if file.is_file() and is_image_file(file.name):
            if str(file) in cache:
                print(f"✔️ Already cached, skipping: {file}")
                continue
            cache[str(file)] = compute_file_hash(file)
            updated = True

    if updated:
        save_cache(cache)
        print("📦 Cache entries added.")


def main():
    cache = load_cache()
    cached_before = dict(cache)  # snapshot used to detect new cache entries
    try:
        # Images directly under IMAGE_DIR are compressed without resizing.
        compress_only_current_folder(IMAGE_DIR, cache)

        for child in IMAGE_DIR.iterdir():
            if not child.is_dir():
                continue
            resize_config_dict = None
            if child.name in ["link", "user_avatar"]:
                print(f"🔄 Processing avatar folder: {child}")
                resize_config_dict = {
                    "method": "thumb",
                    "width": 128,
                    "height": 128,
                }
            elif child.name in ["blog_covers"]:
                print(f"🔄 Processing post cover folder: {child}")
                resize_config_dict = {
                    "method": "thumb",
                    "width": 640,
                    "height": 360,
                }
            elif child.name in ["resources"]:
                print(f"🔄 Processing resource cover folder: {child}")
                resize_config_dict = {
                    "method": "fit",
                    "width": 320,
                    "height": 180,
                }
            else:
                print(f"🔄 Processing regular folder: {child}")

            compress_folder_with_config(child, cache, resize_config_dict)
    finally:
        # Save even when the run aborts early (for example on an account
        # error) so images already compressed are not sent again next time.
        if cache != cached_before:
            save_cache(cache)
            print("📦 Cache updated.")

    if cache == cached_before:
        print("✅ Nothing new was compressed; the cache is unchanged.")


if __name__ == "__main__":
    main()

Possible Improvements

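In the end, this process put out the immediate fire but hasn't fully freed my hands. I still have to copy images over after writing a post and then run the script. It might be possible to use gulp to write the logic that runs before hexo deploys, but that would mean changing a lot.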
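Short of a full gulp setup, one low-effort option is a tiny wrapper that runs the compression script and then the hexo commands in order, stopping at the first failure. This is only a sketch: `compress_images.py` is a hypothetical file name for the script above, and on Windows the `hexo` command may need to be invoked through a shell.

```python
import subprocess
import sys

# Hypothetical pipeline: compress first, then build and deploy with hexo.
DEPLOY_STEPS = [
    [sys.executable, "compress_images.py"],  # assumed script name
    ["hexo", "generate"],
    ["hexo", "deploy"],
]


def run_steps(steps: list) -> None:
    """Run each command in order; stop at the first one that fails."""
    for cmd in steps:
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later steps (such as the deploy) are skipped.
        subprocess.run(cmd, check=True)
```

Calling `run_steps(DEPLOY_STEPS)` from the blog's root folder would then replace the separate manual steps with a single command.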

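My assets folder contains more than just blog images. Perhaps images could first live in the blog's source tree, with an ignore file configured; the build and deploy process would copy them from the source folder to the matching location in assets, where gulp or an npm plugin would call the API to process them. There would also need to be a way to upload incrementally updated images to my server.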

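Of course, images could then be accessed by relative path during local debugging, while at deploy time gulp would replace the image URLs with network addresses and upload the images to the server. Then I could just focus on picking images and writing.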
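The URL replacement step could look roughly like this, shown in Python rather than gulp to match the script above. It assumes posts use standard Markdown image syntax and that relative paths map directly onto the CDN; the base URL in the usage example is a placeholder.

```python
import re

# Matches Markdown images such as ![alt](images/cover.png).
# Paths starting with http://, https:// or "/" are left alone.
IMAGE_LINK = re.compile(r"(!\[[^\]]*\]\()(?!https?://|/)([^)\s]+)")


def rewrite_image_urls(markdown: str, base_url: str) -> str:
    """Prefix relative Markdown image paths with base_url."""
    base = base_url.rstrip("/")

    def repl(match):
        path = match.group(2)
        while path.startswith("./"):  # drop leading "./" segments
            path = path[2:]
        return f"{match.group(1)}{base}/{path}"

    return IMAGE_LINK.sub(repl, markdown)
```

For example, `rewrite_image_urls("![cover](./images/a.png)", "https://cdn.example.com/assets/")` turns the relative path into a URL under that base, while images that already point to a full URL are left unchanged.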

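Sadly I'm too lazy, and the current process isn't annoying enough to justify turning it into an even lazier CI setup, so I'll make do for now.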
