You might remember my bringing up a lot of questions about dealing with huge numbers of inodes to support an application we use.
Well, now that application has another interesting need.
They are starting to use NIS on a bunch of the application servers and so have to change the UIDs on MILLIONS of files and directories on our F840. Obviously the usual methods (chown, chgrp) are exceedingly slow for something like this... we could be at it for weeks.
Can anyone suggest a method to change the uid and gid on massive quantities of files/directories more quickly?
Thanks! Tom
If you run a separate chown process for each file, yes, that will be very slow. However, if these files are all in a small number of directory trees, then just use the recursive option of chown.
chown -R newuser:newgrp dir
This chowns the entire tree and I can't think of anything on the client side that would be faster. I don't think ONTAP has anything on the server side to do this operation.
Of course this assumes that you want the entire directory tree chown-ed to the same user and group. If you need to pick and choose based on the existing uid (probably what you want) then I suggest doing something like this:
find dir -print | mychown.pl
And write a mychown.pl perl script like this:
===========
%uidmap = (          # in %uidmap each line is olduid => newuid
    120 => 507,
    153 => 515,
    ...
);
$defaultuid = 507;   # for when the old uid is not in %uidmap

%gidmap = (          # in %gidmap each line is oldgid => newgid
    200 => 217,
    201 => 234,
    ...
);
$defaultgid = 212;   # for when the old gid is not in %gidmap

while (<>) {
    chop;
    @stat = stat($_);
    next if @stat == 0;   # stat call failed, move on
    $olduid = $stat[4];
    $oldgid = $stat[5];
    $newuid = exists($uidmap{$olduid}) ? $uidmap{$olduid} : $defaultuid;
    $newgid = exists($gidmap{$oldgid}) ? $gidmap{$oldgid} : $defaultgid;
    chown($newuid, $newgid, $_);
}
===========
This will run much faster than an individual chown process for each file. perl calls the system chown() function and the script runs as a single process, so you don't have the overhead of creating a new unix process for each file that you chown.
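For contrast, the one-process-per-file pattern that makes the naive approach so slow would look something like this (a sketch; newuser:newgrp is a placeholder, and it only works when every file gets the same owner):

# one chown process forked per file -- exactly the overhead the perl script avoids
find dir -exec chown newuser:newgrp {} \;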
Your F840 filer can easily handle several of these mychown.pl jobs running in parallel. If you can break the job into separate directories, then you can take advantage of this. You can even spread the load out over multiple NFS clients if you have them. But even if you only have one NFS client, you will get much faster throughput if you can run about 10 of these in parallel rather than just one job. If the NFS client has multiple CPUs, you can run even more parallel jobs.
find dir1 -print | mychown.pl &
find dir2 -print | mychown.pl &
...
find dirN -print | mychown.pl &
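If the work doesn't already fall into a fixed set of dir1 ... dirN, the same fan-out can be driven from a short loop (a rough sketch, assuming a Bourne-style shell; /vol/data stands in for your actual path):

for d in /vol/data/* ; do
    [ -d "$d" ] || continue          # only recurse into directories
    find "$d" -print | mychown.pl &  # one background job per subdirectory
done
wait                                 # block until every job has finished

If there are far more subdirectories than the ~10 parallel jobs suggested above, start them in batches rather than all at once.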
Steve Losen scl@virginia.edu phone: 804-924-0640
University of Virginia ITC Unix Support
Three points:

I have found "walking the tree" on NetApps to be slow. Parallel recursive chowns will increase speed by an order of magnitude. Something like: for i in * .??* ; do chown -R newuser:newgrp "$i" & done

Be careful not to walk the .snapshot tree. It's a waste of time and annoys the filer [+]

When using the -R option of chown, read up on how symbolic links are handled. Depending on your OS, you might or might not need a "-h" in there. We've been bitten by people doing "chown -R" and unknowingly changing files through symlinks. It usually shows up in the weirdest ways - like a selection of users *almost* logging in correctly, while other users have no problems at all. (A sketch combining all three points follows below.)

[+] Apologies to Samuel Clemens.
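Putting the three points together, a rough sketch (newuser:newgrp is a placeholder, and it assumes a find that understands "! -type l" and the "{} +" terminator; driving chown through find is one way to leave symlinks alone if your chown has no -h):

for i in * .??* ; do
    [ "$i" = ".snapshot" ] && continue                      # stay out of snapshots
    find "$i" ! -type l -exec chown newuser:newgrp {} + &   # skip symlinks, batch the rest
done
wait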
--
Lance R. Bailey, unix Administrator, PMC-Sierra, Inc
105-8555 Baxter Place, Burnaby BC, V5A 4V7
email: lance_bailey@pmc-sierra.com  vox: +1 604 415 6646  fax: +1 604 415 6151
http://www.lydia.org/~zaphod
Disk is cheap. Deleting files is cheaper.
Steve Losen wrote:
If you run a separate chown process for each file, yes, that will be very slow. However, if these files are all in a small number of directory trees, then just use the recursive option of chown.
chown -R newuser:newgrp dir
Steve Losen scl@sasha.acc.virginia.edu writes:
chown($newuid, $newgid, $_);
This rings a warning bell. How does Perl treat symlinks when doing a chown? If it follows symlinks you have to be careful, or else you can wind up giving files all over the file system to the wrong user.
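One defensive tweak to the mychown.pl loop (a sketch, not from the original posts): use lstat() instead of stat() and skip symbolic links outright, since chown() on a path name normally follows the link and would hand the target file to the new owner. The %uidmap / %gidmap setup stays exactly as in the original script.

===========
while (<>) {
    chop;
    @stat = lstat($_);      # lstat does not follow symlinks
    next if @stat == 0;     # lstat call failed, move on
    next if -l _;           # it's a symlink; leave it alone
    $olduid = $stat[4];
    $oldgid = $stat[5];
    $newuid = exists($uidmap{$olduid}) ? $uidmap{$olduid} : $defaultuid;
    $newgid = exists($gidmap{$oldgid}) ? $gidmap{$oldgid} : $defaultgid;
    chown($newuid, $newgid, $_);
}
===========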