Hi folks,
We recently copied some NetApp volumes to a non-NetApp file server via NFS and rsync. Unfortunately, in doing so we mangled a lot of file names that were created via CIFS on the NetApp using a variety of character sets, such as Arabic, Cyrillic, Persian, etc.
As I understand it, a NetApp keeps two file names for each file/folder. The CIFS filename is 16-bit Unicode, while the NFS filename is "literal" 8-bit characters (i.e., not UTF-8).
I presume that when a file is created on the NetApp via CIFS, the NetApp creates both the CIFS filename and an NFS filename. When we copied the volumes via NFS and rsync, the NFS names were copied over, so we "lost" the CIFS names.
Is there any algorithm or procedure that we can use to convert these 8-bit "NFS" names to their equivalent 16-bit Unicode names? Ultimately we need to convert to UTF-8, which is what the destination file server uses for file names (both NFS and CIFS use the same UTF-8 filename).
We have turned off CIFS access to the NetApps, and I would like to avoid turning it back on, but I may have no other way to retrieve the original 16-bit Unicode names.
The NetApp volumes are Unix security style and were accessed by both NFS and CIFS. Some files were originally created via NFS, some via CIFS. And unfortunately, some CIFS filenames used non-English character sets.
Here are some sample NFS filenames. I have no idea what the corresponding CIFS filenames are or which character set they used.
:31K4J00 :44O8K00 :5J0TA00 :8QLDT00 :BQLDT00 :EA0IL00 :FA0IL00
These look like the output of a hash, so there may be no way to convert them back to the 16-bit Unicode CIFS names.
Steve Losen scl@virginia.edu
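To get a feel for how many copied names are affected, the destination tree could be scanned for names shaped like the samples above. This is only a rough sketch: the pattern (a colon, five characters from 0-9 and A-Z, then "00") is an assumption inferred from the seven sample names and nothing else, and the function names are made up for illustration. Names are handled as bytes so that undecodable file names do not raise errors.

```python
import os
import re

# Assumption: this pattern is inferred only from the seven sample names
# in the post; it is not a documented NetApp naming rule.
GENERATED_NAME = re.compile(rb":[0-9A-Z]{5}00")


def looks_generated(name: bytes) -> bool:
    """Return True if a file name has the same shape as the sample names."""
    return GENERATED_NAME.fullmatch(name) is not None


def find_generated_names(root: bytes):
    """Yield the path of every file or directory under root whose name
    matches the assumed pattern. root is bytes, so os.walk yields bytes."""
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if looks_generated(name):
                yield os.path.join(dirpath, name)
```

Something like `for p in find_generated_names(b"/path/to/copy"): print(p)` would list the suspects. If these names really are generated rather than encoded, the list only tells you which files need their names recovered from the filer some other way.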
You might be able to write a bash script to convert the names back, but I'm not very conversant with the different methods of text encoding. It's possible that the damage is one-way and that you're missing the information needed to translate the names to Unicode. Another option would be to reshare the data on the NAS as CIFS and redo the migration using another method. Either mount it using a CIFS mount on Unix and try a Unix copy, or mount a new Unix target on a Windows machine and use robocopy. One of those might work.
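For names that are real 8-bit encoded text (as opposed to the hash-like samples, which this cannot help with), a reconversion script could look roughly like the sketch below, written in Python rather than bash. The codec list is a guess based on the character sets mentioned in the original post (Cyrillic, Arabic); the encoding the filer actually used for NFS names is not known here, so a person who can read the language would still have to pick the right candidate.

```python
# Assumption: candidate legacy encodings guessed from the scripts named in
# the original post; the filer's real NFS character set may be different.
CANDIDATE_CODECS = ("cp1251", "koi8_r", "iso8859_5", "cp1256", "iso8859_6")


def candidate_decodings(raw: bytes, codecs=CANDIDATE_CODECS) -> dict:
    """Map codec name -> decoded text for each codec that can decode raw.

    Returns an empty dict for names that are plain ASCII or already valid
    UTF-8, since those do not need converting.
    """
    if raw.isascii():
        return {}
    try:
        raw.decode("utf-8")
        return {}
    except UnicodeDecodeError:
        pass

    results = {}
    for codec in codecs:
        try:
            results[codec] = raw.decode(codec)
        except UnicodeDecodeError:
            continue  # this codec has no mapping for some byte in the name
    return results
```

Most single-byte codecs will happily decode any byte string, so several candidates usually come back and all but one are gibberish. Once the right codec for a directory is known, the new name is just `raw.decode(codec).encode("utf-8")`, applied with `os.rename` on byte paths.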
On Fri, Dec 12, 2014 at 7:01 AM, scl@virginia.edu wrote:
[snip]
Toasters mailing list Toasters@teaparty.net http://www.teaparty.net/mailman/listinfo/toasters
Silly question. Do you have a snapshot?
From: toasters-bounces@teaparty.net On Behalf Of Basil
Sent: Friday, December 12, 2014 7:41 AM
To: scl@virginia.edu
Cc: toasters@teaparty.net
Subject: Re: CIFS and NFS file names
[snip]