I receive anywhere from 4 to 100 very large tar archive files (~20GB each) every day. In the past I have been concatenating them by looping through each archive I see on the file system and doing something like this:

/bin/tar --concatenate --file=allTars.tar receivedTar.tar

The problem with this, however, is that as I concatenate more and more tar files, tar must read to the end of allTars.tar before it can begin concatenating again. Sometimes it takes over 20 minutes just to start adding another tar file. It is just too slow, and I am missing an agreed-upon delivery time for the complete allTars.tar.
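
For concreteness, here is a minimal sketch of that loop (the /incoming directory and the receivedTar*.tar naming are assumptions for illustration):

for t in /incoming/receivedTar*.tar; do
    # Each iteration re-reads allTars.tar to locate its end-of-archive
    # marker before appending, so the loop gets slower as the file grows
    /bin/tar --concatenate --file=allTars.tar "$t"
done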
I also tried handing my tar command a list of files, like so:

/bin/tar --concatenate --file=alltars.tar receivedTar1.tar receivedTar2.tar receivedTar3.tar...etc
This gave very odd results. allTars.tar would be the expected size (i.e. close to all the receivedTar.tar files' sizes added together) but seemed to overwrite files when allTars.tar was unpacked.
Is there any way to concatenate all these tar files in one command, so tar doesn't have to read to the end of the archive being appended to every time, and have everything unpack correctly with all files/data?
Answer
This may not help you, but if you are willing to use the -i option when extracting from the final archive, then you can simply cat the tars together.
A tar file ends with a header full of nulls and more null padding till the end of the record. With --concatenate, tar must go through all the headers to find the exact position of the final header, in order to start overwriting there.
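
You can see this end-of-archive padding for yourself; a quick sketch, assuming GNU tar and od from coreutils:

echo hello > demo.txt
tar -cf demo.tar demo.txt
# The archive ends in 512-byte blocks of zeros, which od
# renders as runs of \0 collapsed into a single * line
tail -c 1024 demo.tar | od -c | head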
If you just cat the tars together, you simply end up with extra nulls between the headers. The -i option asks tar to ignore these nulls between headers. So you can
cat receivedTar1.tar receivedTar2.tar ... >> alltars.tar
tar -itvf alltars.tar
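
Because appending with cat never re-reads the existing archive, you can also apply it per arriving file; a sketch reusing the hypothetical /incoming directory from above:

for t in /incoming/receivedTar*.tar; do
    cat "$t" >> alltars.tar    # appends in time proportional to $t only
done
tar -itvf alltars.tar          # -i (--ignore-zeros) skips the padding between archives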
Also, your tar --concatenate example ought to be working. However, if the same file name appears in several of the tar archives, you will rewrite that file several times when you extract everything from the resulting tar: the last copy extracted wins.
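
If you want to check for such collisions before unpacking, you can list duplicate entry names; a sketch using only tar's listing mode and standard coreutils:

# Print entry names that occur more than once in the combined archive
tar -itf alltars.tar | sort | uniq -d
# With GNU tar 1.28 or newer, --skip-old-files makes the first
# extracted copy win instead of the last
tar -ixf alltars.tar --skip-old-files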