Hi! I have several NetApp filers with close to 25 TB of user data on them (our best guess is around 50 million files, but we aren't sure, since our backup software only backs up the current data set, not the snapshots). We are looking to consolidate the environment given the latest storage capacities, and I thought I would ask the community what tools you have used to understand which data needs to be migrated, which data should be destroyed, and what the data costs per user and per department. We already have the storage costing worked out down to a dollar fee per MB/GB.
Any thoughts? We have tried some freeware and shareware programs, but they require weeks, if not months, of personnel time to make sense of the data they collect. We are looking for a packaged solution.
Thank you for your help.
Hi!
1. You can use the "df -i" command to see how many inodes are in use. This will be higher than the number of files (since it counts directories as well), but it is a good estimate of the ceiling.
2. Look at Kazeon - NetApp resells them.
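If you want to script the tally across all your filers, something like the rough sketch below should do it. It is untested, assumes rsh administrative access is enabled, assumes your ONTAP release prints the iused count in the second column of "df -i", and the hostnames are placeholders:

    import subprocess

    FILERS = ["filer1", "filer2"]  # placeholder hostnames

    total_inodes = 0
    for filer in FILERS:
        # "rsh <filer> df -i" prints a header row, then one row per volume
        out = subprocess.run(["rsh", filer, "df", "-i"],
                             capture_output=True, text=True).stdout
        for line in out.splitlines()[1:]:        # skip the header row
            fields = line.split()
            if len(fields) >= 2 and fields[1].isdigit():
                total_inodes += int(fields[1])   # the "iused" column
    print("Approximate ceiling on file count:", total_inodes)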
Eyal.
Personally, I like what StorageX (now part of Brocade) has to offer.
It will help you make sense of what you have in terms of file types and ages, and with a policy-based strategy you will be able to offload the files you are targeting onto a lower tier of storage.
I have used it at multiple clients, and it's an awesome solution to keep close at hand in your toolbox.
Julio Calderon West Region - Systems Engineer
agami Mobile: 408.394.5638 Email: jcalderon@agami.com IM: juliocus (Yahoo)
There are several solutions on the market that can provide data classification and management. I work for a company, Intermine, Inc., which has been providing a software solution for data classification and management since 1999. Having spent several years at Network Appliance myself, I understand the challenges of managing a growing distributed environment.
There is an iterative process that needs to be put in place for managing the environment, including:
1) Analysis of the information in place, including:
   A) Security for the information - ownership of, and access to, the files
2) Improvement of the environment:
   A) Cleanup and archival or deletion of information - not just duplicates (each organization defines duplicates a bit differently; it is not as simple as saying there are two copies of the same file)
   B) Capacity recovery
3) Control of the information:
   A) Policy creation
   B) Policy management
   C) Data migration - tiered storage
4) Measurement:
   A) Understanding consumption (a rough sketch of this step follows the list)
   B) Monitoring change
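To make 4A concrete, here is a minimal sketch of the kind of per-owner capacity and cost report involved, assuming a volume is NFS-mounted on a scan host; the mount point and per-GB rate are made up for illustration, and a real product would also map UIDs to departments against a directory service, which this deliberately skips:

    import os

    MOUNT = "/mnt/vol0"        # hypothetical NFS mount of a filer volume
    COST_PER_GB = 0.50         # hypothetical $/GB chargeback rate

    usage = {}                 # uid -> total bytes owned
    for dirpath, dirnames, filenames in os.walk(MOUNT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:    # file vanished or permission denied
                continue
            usage[st.st_uid] = usage.get(st.st_uid, 0) + st.st_size

    for uid, nbytes in sorted(usage.items(), key=lambda kv: -kv[1]):
        gb = nbytes / (1024 ** 3)
        print(f"uid {uid}: {gb:8.1f} GB  ${gb * COST_PER_GB:8.2f}/period")

In practice you would resolve each UID against NIS or Active Directory and roll the totals up by department, but the walk-and-aggregate pattern is the same.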
Based on our work with hundreds of organizations, we have found that each stage in the process takes time. Many organizations want to jump right to the migration or archival step without taking the time to understand their data and make the decisions that will ultimately shape the migration/archival solution. The goal should be to reduce the risk, cost, and complexity of the information at each stage, with the goals set as a group that includes both the information owners and the information managers.
This is an iterative process, because organizations grow and change over time, and the information (data) within an organization changes as well.
Brett P. Cooper Director Intermine, Inc. Brett.Cooper@Intermine.com