batch - How to compress multiple files with similar names?

Tuesday, January 3, 2017

I've got some 20,000 files that I want to compress and group by the following logic:

compress together all files whose names are identical up to the first (

also include files whose names contain no (

The files are named like this:

file_123.foo
file_123(abc).foo
file_123(b9)(ca)[a1].foo
foobar(a).foo
foobar.foo
foobar(123).foo

which should be compressed to

file_123.7z
foobar.7z

I'm open to Windows batch files, Unix scripts or any compression program (I can work from there), though the most convenient combination would be .7z on Windows.

UPDATE

cYrus gave me a perfect answer; the problem was that my question wasn't precise enough :) Now that I'm smarter, here's the next problem I haven't figured out how to get around yet:

So everything works perfectly unless this happens:

file_123(abc).foo
file_123456789(b9).foo

Those two shouldn't be grouped, i.e., they should end up in two separate files:

file_123.7z
file_123456789.7z

This command:

for pfx in $(for i in *.foo; do echo "${i%%[.(]*}"; done | sort -u); do 7z a "$pfx.7z" $pfx*; done

creates the two archives, but the shorter prefix acts as a catch-all: file_123.7z ends up containing both files, which it shouldn't.

Answer

This should work:

for pfx in $(for i in *.foo; do echo "${i%%[.(]*}"; done | sort -u); do 7z a "$pfx.7z" $pfx[.\(]*; done
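To see why the extra [.\(] matters, here is a small sketch that can be run in a scratch directory. It only counts what each pattern matches and never calls 7z. The helper count_matches and the variable names are made up for this illustration.

```shell
# Scratch directory holding the two file names from the question's update.
# Empty files are enough, since only the names matter for globbing.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch 'file_123(abc).foo' 'file_123456789(b9).foo'

# Hypothetical helper: prints how many arguments it was given.
count_matches() { echo "$#"; }

pfx=file_123

# Old pattern: anything that merely starts with the prefix.
loose=$(count_matches "$pfx"*)

# New pattern: the prefix must be followed directly by "." or "(".
strict=$(count_matches "$pfx"[.\(]*)

echo "loose=$loose strict=$strict"   # loose=2 strict=1
```

The loose pattern picks up file_123456789(b9).foo as well, which is the catch-all behaviour described in the update. The strict one stops at the end of the prefix.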

Explanation

First we iterate over all the input files (*.foo) and strip the suffix from each name with ${i%%[.(]*}, which removes everything from the first . or ( onward, obtaining:

file_123
file_123
file_123
foobar
foobar
foobar
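The stripping step can be tried on its own. The sketch below wraps the same expansion in a function; the name prefix_of is invented here and is not part of the answer.

```shell
# Hypothetical helper: print a file name with everything from the first
# "." or "(" onward removed. %% deletes the LONGEST suffix that matches
# the pattern [.(]* , so the cut happens at the first such character.
prefix_of() {
    printf '%s\n' "${1%%[.(]*}"
}

prefix_of 'file_123(b9)(ca)[a1].foo'   # file_123
prefix_of 'foobar.foo'                 # foobar
prefix_of 'file_123456789(b9).foo'     # file_123456789
```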

Then we can remove duplicates with sort -u:

file_123
foobar

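Finally, for each prefix ($pfx) we build the archive, using the prefix both as the archive name ("$pfx.7z") and as the start of the pattern that selects the files ($pfx[.\(]*). Because the pattern requires a . or ( right after the prefix, file_123 no longer matches file_123456789(b9).foo. The result is equivalent to: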

7z a file_123.7z 'file_123(abc).foo' 'file_123(b9)(ca)[a1].foo' 'file_123.foo'
7z a foobar.7z 'foobar(123).foo' 'foobar(a).foo' 'foobar.foo'
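The one-liner relies on word splitting, so it would likely misbehave if a file name contained spaces, and on a second run the pattern would also match the .7z archives created earlier. A more defensive variant might look like the sketch below. The function name archive_groups and the DRY_RUN switch are assumptions made for this example. It assumes 7z is on the PATH and that no file name contains a newline.

```shell
# Sketch: same grouping as the one-liner, written as a function.
# With DRY_RUN set to a non-empty value it prints the commands instead
# of running them.
archive_groups() {
    for f in *.foo; do
        [ -e "$f" ] || continue            # no *.foo here: pattern stayed literal
        printf '%s\n' "${f%%[.(]*}"        # prefix up to the first "." or "("
    done | sort -u | while IFS= read -r pfx; do
        # Collect the members of this group, keeping only *.foo so that
        # an existing "$pfx.7z" is not added to itself on a re-run.
        set --
        for member in "$pfx"[.\(]*; do
            case $member in
                *.foo) set -- "$@" "$member" ;;
            esac
        done
        [ "$#" -gt 0 ] || continue
        if [ -n "${DRY_RUN:-}" ]; then
            printf '7z a %s.7z' "$pfx"; printf ' %s' "$@"; echo
        else
            # stdin carries the prefix list, so keep 7z away from it
            7z a "$pfx.7z" "$@" </dev/null
        fi
    done
}
```

Running DRY_RUN=1 archive_groups in the target directory first shows which files would go into which archive. Running archive_groups without DRY_RUN then performs the compression.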
