Does anyone know why NMC 3.1 (DFM 5.1.0) sometimes reports "No data is currently available…", and how to fix it?
It makes it even harder to track down issues when the reporting tools themselves fail.
thanks

On Jan 26, 2013, at 5:42 PM, Nicholas Bernstein <nick@nicholasbernstein.com> wrote:

Usually ps will show you the process that's using the I/O indirectly, since it's also probably using some CPU. Disk scrub, media scrub, and reallocate_measure are a few things off the top of my head that could cause read I/O.
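
To check for those, something like the following should work (syntax from memory, so double-check it; these are advanced-priv commands and vary a bit by release):

   na04> priv set advanced
   na04*> aggr scrub status          # RAID scrub running against sata0?
   na04*> aggr media_scrub status    # media scrub progress, per plex
   na04*> reallocate status -v       # any reallocate / measure jobs active?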

'stats explain' should be able to give you more info on that counter. Sorry this isn't a more useful response; I'm on my phone and sick in bed. :/
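
For example (the object and counter names here are my guess; 'stats list objects' and 'stats list counters <object>' will show the real names on your release):

   na04*> stats list counters disk               # confirm the counter name first
   na04*> stats explain counters disk disk_busy  # what the busy counter actually measures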

--
Sent from my mobile device

On Jan 26, 2013, at 10:15 AM, Fletcher Cocquyt <fcocquyt@stanford.edu> wrote:

On Nick's advice I set up a job to log both wafltop and ps -c 1 once per minute, and we had a sustained sata0 disk busy from 5am-7am as reported by NMC.
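(For the record, the job is just a loop on an admin host, roughly like the sketch below. It assumes SSH keys to the filer, and I'm not sure the 'priv set -q diag; ...' chaining in a single ssh call works on every ONTAP release; split it into separate calls if not.)

   #!/bin/sh
   # crude once-a-minute logger for wafltop / ps -c 1 samples
   while true; do
       date
       ssh na04 'priv set -q diag; wafltop show'
       ssh na04 'priv set -q diag; ps -c 1'
       sleep 60
   done >> /var/log/na04-io.log 2>&1
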
First question I have from wafltop show: what is the first row (sata0::file i/o) reporting? What could be the source of these 28907 non-volume-specific Read IOs?

           Application   MB Total MB Read(STD) MB Write(STD) Read IOs(STD) Write IOs(STD) 
           -----------   -------- ------------ ------------- ------------- -------------- 
      sata0::file i/o:       5860         5830            30         28907              0
   sata0:backup:nfsv3:        608            0           608            31              0           

I'm just starting to go through the data.

na04*> aggr status
           Aggr State           Status            Options
          sata0 online          raid_dp, aggr     nosnap=on, raidsize=12
                                64-bit            
          aggr2 online          raid_dp, aggr     nosnap=on, raidsize=19
                                64-bit            
          aggr1 online          raid_dp, aggr     root, nosnap=on, raidsize=14
                                32-bit            
na04*> df -Ah                      
Aggregate                total       used      avail capacity  
aggr1                     13TB       11TB     1431GB      89%  
aggr2                     19TB       14TB     5305GB      74%  
sata0                     27TB       19TB     8027GB      72%  


<sataIOPSJan26.jpeg>

thanks


On Jan 25, 2013, at 5:33 PM, Nicholas Bernstein <nick@nicholasbernstein.com> wrote:

Try doing a 'ps -c 1' or a 'wafltop show' (double-check the syntax) while you're getting the spike; those will probably help you narrow down the processes that are using your disks. Both are priv set advanced/diag commands.
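
Something like this (again, treat the syntax as approximate and verify it on your release):

   na04> priv set diag
   na04*> ps -c 1          # per-process CPU snapshot; the I/O consumer usually shows up here
   na04*> wafltop show     # per-workload WAFL read/write activity
   na04*> priv set admin   # drop back down when you're done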

Nick