Description
Describe the bug
I have a maintenance script I run to keep a local copy of billing & usage data for my personal AWS account. On every run, aws s3 sync identifies almost every file as changed, even though most of the files haven't been modified in years.
Expected Behavior
Only changed files -- in this case, files representing the current billing period -- should be downloaded.
Current Behavior
Of 6,279 files that do not represent the current billing period, it's consistently re-downloading 5,831 of them. The files it downloads are byte-for-byte identical to the existing ones. I spot-checked one of the files, and aws s3 ls reports the exact same size and timestamp as ls does.
Reported by aws s3 sync:
download: s3://mrjoy-billing-data//cur//billing_and_usage/20210101-20210201/20210122T235314Z/billing_and_usage-00001.csv.gz to ../personal/Finance/AWS_Billing_Data/cur/billing_and_usage/20210101-20210201/20210122T235314Z/billing_and_usage-00001.csv.gz
Reported by aws s3 ls:
% aws-vault exec mrjoy -- aws s3 ls s3://mrjoy-billing-data//cur//billing_and_usage/20210101-20210201/20210122T235314Z/billing_and_usage-00001.csv.gz
2021-01-22 15:53:24 296522 billing_and_usage-00001.csv.gz
Reported by ls:
% ls -laD "%Y-%m-%d %H:%M:%S" ~/personal/Finance/AWS_Billing_Data/cur/billing_and_usage/20210101-20210201/20210122T235314Z/billing_and_usage-00001.csv.gz
-rw-r--r-- 1 jonathonfrisby staff 296522 2021-01-22 15:53:24 /Users/jonathonfrisby/personal/Finance/AWS_Billing_Data/cur/billing_and_usage/20210101-20210201/20210122T235314Z/billing_and_usage-00001.csv.gz
The post-fetch commit in all cases shows diffs for the files in the current billing period (as would be expected), and no changes to any of the other files that aws s3 sync reports as being downloaded.
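The same conclusion can be checked without git. What follows is a minimal sketch, not part of the AWS CLI: files_identical is a hypothetical helper name, and it assumes a copy of the local file was set aside before the sync ran so the re-downloaded file can be compared against it.

```shell
#!/bin/bash
# Hypothetical helper (not part of the AWS CLI): report whether two local
# files have the same size and identical contents.
files_identical() {
  local a="$1" b="$2"
  local size_a size_b
  # wc -c pads its output on BSD/macOS, so strip the spaces.
  size_a=$(wc -c < "$a" | tr -d ' ')
  size_b=$(wc -c < "$b" | tr -d ' ')
  if [ "$size_a" != "$size_b" ]; then
    echo "size differs: $size_a vs $size_b"
    return 1
  fi
  # cmp -s is silent and exits 0 only when every byte matches.
  if cmp -s "$a" "$b"; then
    echo "identical ($size_a bytes)"
  else
    echo "same size, different bytes"
    return 1
  fi
}
```

For example, copying billing_and_usage-00001.csv.gz to a scratch directory before the sync and passing both paths to files_identical afterwards would be expected to print "identical (296522 bytes)" if the download really was redundant.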
All told, aws s3 sync appears to be downloading around 700MB of files on each run that it shouldn't be.
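To put a number on this without transferring anything, the sync can be run with --dryrun and its output counted. This is a rough sketch: count_downloads is a hypothetical helper name, and it assumes each planned transfer is printed on its own line containing "download: ", as in the sync output quoted above.

```shell
#!/bin/bash
# Hypothetical helper: count the "download: " lines in sync output read
# from stdin. grep -c exits non-zero when the count is 0, so the status
# is ignored to stay compatible with `set -e`.
count_downloads() {
  grep -c 'download: ' || true
}

# Intended use (not executed here; needs credentials and the real bucket):
#   aws-vault exec mrjoy -- aws s3 sync --dryrun \
#     s3://mrjoy-billing-data/ ~/personal/Finance/AWS_Billing_Data/ \
#     | count_downloads
```

Running this twice in a row, with no real sync in between, should show whether the count stays in the thousands even when nothing in the bucket has changed.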
Reproduction Steps
The relevant portion of my script is:
#!/bin/bash
IFS=$'\n\t'
set -euo pipefail
(
  cd ~/personal
  git add .
  git commit --all --allow-empty -m "AWS bill snapshot, pre-fetch..."
  aws-vault exec mrjoy -- aws s3 sync s3://mrjoy-billing-data/ ~/personal/Finance/AWS_Billing_Data/
  git add .
  git commit --all --allow-empty -m "AWS bill snapshot, post-fetch..."
)
The data in the bucket is written by AWS itself.
Possible Solution
No response
Additional Information/Context
No response
CLI version used
2.7.26
Environment details (OS name and version, etc.)
macOS 12.5.1