Daniel Quinlan quinlan@transmeta.com writes:
I'm also interested in NTP support, but I think I'd like to see the current software become a lot more reliable (and complete) before adding another new feature. All I need is more NetApp crashes because of NTP bugs.
I should add that this happened to us just a few days ago:
- xntpd died on our adminhost (that ran the cron job to set the NetApp dates)
- time drifted by 8 minutes on the adminhost and the filers over the course of a few days
- someone started having problems, managed to figure out that it was because of the time drift, and we fixed it. (Other people probably had problems, but didn't report them.)
To prevent it from happening again, we're remotely monitoring the system time of the adminhost (using "mon") to make sure it doesn't drift off again. Incidentally, there seems to be no way to remotely query the system time from a NetApp (except by creating a new file and running stat() on it, or by using unsupported/hidden commands).
However, I'm still saying that I don't want NTP until other supported protocols (such as NDMP) start working reliably.
Dan
I thought this follow-up (to a thread from several months ago) might be interesting for people at sites that stress time synchronization.
Daniel Quinlan quinlan@transmeta.com wrote:
To prevent it from happening again, we're remotely monitoring the system time of the adminhost (using "mon") to make sure it doesn't drift off again. Incidentally, there seems to be no way to remotely query the system time from a NetApp (except by creating a new file and running stat() on it, or by using unsupported/hidden commands).
So, I started monitoring the time with the kludgey method of creating a file and using stat() on it. I wasn't very happy with that method because it tended to give false alerts, so I eventually disabled it.
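For anyone who wants to try the file-creation method, here is a minimal Python sketch of the idea, not the script we actually ran. It assumes the filer is NFS-mounted on the monitoring host and that the server stamps a newly written file's mtime from its own clock; the function name and the ".timecheck-" prefix are made up for illustration. Client-side attribute caching and network latency can both skew the result, so treat the number as approximate.

```python
import os
import tempfile
import time


def server_clock_offset(directory):
    """Estimate (file server clock - local clock) in seconds.

    Creates a scratch file in `directory`, reads back its mtime, and
    compares it with the midpoint of the local time taken before and
    after the write. Assumes the mtime comes from the server's clock.
    """
    before = time.time()
    fd, path = tempfile.mkstemp(dir=directory, prefix=".timecheck-")
    try:
        # Write one byte so the server has to update the mtime.
        with os.fdopen(fd, "wb") as handle:
            handle.write(b"x")
        after = time.time()
        mtime = os.stat(path).st_mtime
    finally:
        # Always clean up the scratch file, even if stat() failed.
        os.unlink(path)
    return mtime - (before + after) / 2.0
```

A monitor would call this on a directory on the mounted filer and alert when the absolute offset exceeds some threshold; which threshold is sensible depends on the site.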
Earlier this week, we had another problem with time synchronization. It wasn't actually the NetApp, but I still had to rule it out because it wasn't being monitored. I tried port-scanning a NetApp, looking for some hidden protocol that might let me reliably query the time. Here's what I found:
port   tcp          udp
-----  -----------  -----------
23     telnet       -
80     http         -
111    sunrpc       sunrpc
137    -            netbios-ns
138    -            netbios-dgm
139    netbios-ssn  -
161    -            snmp
514    shell        syslog
520    -            route
602    -            unknown
603    unknown      -
604    -            unknown
605    unknown      -
606    -            unknown
607    unknown      -
608    -            unknown
609    unknown      -
618    -            unknown
619    -            unknown
620    unknown      -
1063   -            unknown
2049   nfs          nfs
10000  unknown      -
That's a lot of unknown services (ones not listed in /etc/services; some are probably NDMP). Anyway, my first thought was sunrpc. rpcbind version 3 has a GETTIME function that does exactly what I want, but NetApp only supports up to rpcbind version 2.
I'm glad it doesn't have GETTIME, though, because hitting that dead end is what made it finally dawn on me: the response to every single HTTP request carries a perfectly functional "Date" header with the system time, represented in a very standard form. It's simple and it works, so we're using that now.
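The Date-header check can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not our production monitor: the function names are made up, and it assumes the filer answers a HEAD request for "/" on port 80 (any response status will do, as long as it includes a Date header). The header has one-second resolution, so the offset is only good to about a second.

```python
import http.client
import time
from email.utils import parsedate_to_datetime


def date_header_to_epoch(value):
    """Convert an HTTP Date header value to seconds since the Unix epoch."""
    return parsedate_to_datetime(value).timestamp()


def http_clock_offset(host, port=80, timeout=10.0):
    """Estimate (server clock - local clock) in seconds from the Date header."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        before = time.time()
        # Assumption: the server accepts HEAD; the status code is irrelevant.
        conn.request("HEAD", "/")
        response = conn.getresponse()
        after = time.time()
        response.read()
        date_value = response.getheader("Date")
    finally:
        conn.close()
    if date_value is None:
        raise ValueError("no Date header in response")
    # Compare against the midpoint of the request to average out latency.
    return date_header_to_epoch(date_value) - (before + after) / 2.0
```

A monitor would call http_clock_offset() with the filer's hostname and alert when the absolute value exceeds a site-specific threshold.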
Of course, life would have been easier if NetApp just supported the tcp/time service, or if you could get the system time with SNMP.
Dan