Copying stuff with find and cpio

cpio is somewhat similar to tar, but I’m pretty sure the main thing anyone still uses it for by hand is copying directory structures (or subsets thereof).

A little background on my mundane task:
I have a big repository of data files (zipped CSV) that are generated daily. They are organized into a date-based directory structure going back several years, so a given day’s directory contents might look like:

[someguy@somebox /data/2010/04/05]$ ls
type1.zip       type2.zip       type3.zip       type4.zip
[someguy@somebox /data/2010/04/05]$ ls ../06
type1.zip       type2.zip       type3.zip       type4.zip
[someguy@somebox /data/2010/04/05]$

The filenames are always the same.

Sometimes I get a request for a dump of this data, usually to be burnt to DVD. The requests vary in which data types they want included, e.g. “please burn a disk containing the full repository, but only include type2 and type3 files”.

Maintaining the directory structure is necessary here so we know which date a file belongs to.

Anyway, the easiest way to do this is with find and cpio:

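#!/bin/sh
# cpio -p (pass-through) copies each file named on stdin into the target
# directory; -d creates leading directories, -v lists files as they are copied.
find /data/ -name 'type2.zip' -print | cpio -pvd .
find /data/ -name 'type3.zip' -print | cpio -pvd .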

This gives us a nice copy of the huge directory structure, but only containing the file types we want. Pretty simple.

P.S. If I’m burning this to DVD, I create an ISO from the dump on the server itself, so I copy one image over the network to my workstation instead of thousands of little files:

mkisofs -o dump.iso -v -J -l data