Unzipping big compressed files on upload to S3
I was looking into ways to solve this problem and have already seen multiple threads from people with the same issue. The proposed solutions are almost always:
- AWS Fargate
- Decompressing on Lambda with streaming
Now, I have spent some time looking for a Python script that streams data to and from S3, but I could not find one that worked, so I am starting to consider other approaches.
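For what it's worth, the closest I got was something like the sketch below. It uses the third-party smart_open library to give zipfile a seekable file-like view of the S3 object (zipfile needs to seek to the central directory at the end of the archive), then streams each member back up with upload_fileobj. The `unzipped/` prefix is a placeholder, and I haven't verified this end to end on really large archives:

```python
import zipfile

import boto3
from smart_open import open as s3_open

s3 = boto3.client("s3")

def handler(event, context):
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]
    # smart_open exposes the S3 object as a seekable file-like object
    # (issuing ranged GETs under the hood), which zipfile requires in
    # order to read the central directory at the end of the archive.
    with s3_open(f"s3://{bucket}/{key}", "rb") as fileobj:
        with zipfile.ZipFile(fileobj) as archive:
            for entry in archive.infolist():
                if entry.is_dir():
                    continue
                with archive.open(entry) as member:
                    # upload_fileobj does a chunked multipart upload,
                    # so the member is never held fully in memory.
                    s3.upload_fileobj(
                        member, bucket, f"unzipped/{entry.filename}"
                    )
```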
I also thought about using an EC2 instance to do the decompression. When triggered, the Lambda function would invoke a shell script on the EC2 instance to download the zip, decompress it, and upload the results back to S3 (a rough sketch follows). I have not seen this solution proposed anywhere else yet, but it avoids the time and memory constraints that Lambda has, and it does not require paying for another AWS product if you already run EC2.
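Something like this is what I had in mind, using SSM Run Command so the Lambda never touches the data itself. The instance ID and the `/tmp` paths are made up for illustration, and the instance would need the SSM agent running plus an instance profile that allows SSM and S3 access:

```python
import boto3

ssm = boto3.client("ssm")

# Placeholder instance ID for illustration only.
INSTANCE_ID = "i-0123456789abcdef0"

def handler(event, context):
    bucket = event["Records"][0]["s3"]["bucket"]["name"]
    key = event["Records"][0]["s3"]["object"]["key"]
    # AWS-RunShellScript is the stock SSM document for running shell
    # commands on a managed instance; the download/unzip/upload work
    # happens entirely on the EC2 box.
    ssm.send_command(
        InstanceIds=[INSTANCE_ID],
        DocumentName="AWS-RunShellScript",
        Parameters={
            "commands": [
                f"aws s3 cp 's3://{bucket}/{key}' /tmp/archive.zip",
                "unzip -o /tmp/archive.zip -d /tmp/unzipped",
                f"aws s3 cp /tmp/unzipped 's3://{bucket}/unzipped/' --recursive",
                "rm -rf /tmp/archive.zip /tmp/unzipped",
            ]
        },
    )
```

The Lambda returns as soon as the command is queued, so it stays well inside its own time limits; the trade-off is that the EC2 instance needs enough disk for the decompressed contents.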
What do you think?