Hi All,
We have a customer with SBS2003 and every week or so the server stops accepting network connections. Clients are unable to connect to the server via Terminal Services, log on to the domain, map drives or open Outlook. There are no third-party applications running scans or backups at this time, so I am discounting these as the cause of the issue. I have checked the AT scheduled tasks and there is nothing. After a reboot the server works fine for a number of days (6-10 or so) and then it all happens again.
Below are the first errors from each event log from around the time I believe the server went offline.
Application Log:
Event Type: Error
Event Source: MSExchangeAL
Event Category: LDAP Operations
Event ID: 8026
Date: 12/03/2006
Time: 04:37:16
User: N/A
Computer: P1SVR
Description: LDAP Bind was unsuccessful on directory p1svr.p1international.local for distinguished name ''. Directory returned error:[0x34] Unavailable.
System Log:
Event Type: Error
Event Source: Application Popup
Event Category: None
Event ID: 333
Date: 12/03/2006
Time: 04:38:14
User: N/A
Computer: P1SVR
Description: An I/O operation initiated by the registry failed unrecoverably. The registry could not read in, or write out, or flush, one of the files that contain the system's image of the registry.
Directory Service Log:
Event Type: Error
Event Source: NTDS General
Event Category: Global Catalog
Event ID: 1126
Date: 12/03/2006
Time: 12:24:26
User: NT Authority\system
Computer: P1SVR
Description: Active Directory was unable to establish a connection with the global catalog.
Additional Data
Error value: 9852 No DNS servers configured for local system.
Internal ID: 3200d11
User Action: Make sure a global catalog is available in the forest, and is reachable from this domain controller. You may use the nltest utility to diagnose this problem.
DNS Server Log:
Event Type: Error
Event Source: DNS
Event Category: None
Event ID: 4015
Date: 12/03/2006
Time: 03:50:38
User: N/A
Computer: P1SVR
Description: The DNS Server has encountered a critical error from the Active Directory. Check that the Active Directory is functioning properly. The extended error debug information (which may be empty) is "". The event data contains the error.
Data: 0000: 51 00 00 00
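As an aside on reading the "Data" bytes above: event data of this kind is usually a little-endian DWORD. The sketch below decodes it and looks it up in a small table of LDAP result codes. Treating the value as an LDAP result code is an assumption (the event text only says "the event data contains the error"), and `decode_event_data` and `LDAP_CODES` are hypothetical names used for illustration only.

```python
import struct

# Illustrative subset of LDAP result codes (values as defined in Winldap.h).
LDAP_CODES = {
    0x34: "LDAP_UNAVAILABLE",
    0x51: "LDAP_SERVER_DOWN",
}


def decode_event_data(hex_bytes: str) -> int:
    """Decode the first four event-data bytes as a little-endian DWORD.

    Hypothetical helper; assumes the data is a single 32-bit error value.
    """
    raw = bytes.fromhex(hex_bytes.replace(" ", ""))
    (value,) = struct.unpack("<I", raw[:4])
    return value


code = decode_event_data("51 00 00 00")
print(hex(code), LDAP_CODES.get(code, "unknown"))
```

If that assumption holds, 0x51 would mean the DNS service could not reach the directory at all, which would be consistent with the 0x34 "Unavailable" bind failure in the Application log.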
Please let me know if you have any ideas or thoughts as to what the cause/fix may be.
Thanks, Joe
Hi Joe,
I have had similar symptoms on a customer's box.
I unchecked the following boxes in Exchange System Manager, under Global Settings > Mobile Services (right-click) > Properties:
Enable up-to-date notifications …….
Enable direct push over HTTPs
The fault cleared and users could log on almost instantly. Not sure if this will help, but it will only take a minute to test.
Rob
Hi Guys,
Tried both your ideas and the server is still unhappy. A couple of new "features" have presented themselves now on this server.
A weekly reboot stops the server from crashing but I'd rather get this resolved.
One other thing to add: I added the /3GB /USERVA=3030 switches to the boot.ini file and the server crashed with the same symptoms as before, but within 6-12 hours. It seemed to accelerate the process.
Afterwards I read this article, http://weblogs.asp.net/oldnewthing/archive/2004/08/06/209840.aspx, which brings me to the conclusion that the server has memory management problems.
I believe the mainboard is AMD and the processor is an Opteron 244.
Thanks again, Joe
I'd be really surprised if this wasn't a memory fault.
There is a Linux boot floppy I can dig out if you like, which tests the memory without using the HDDs, but these days memory is so cheap. Can you change all the memory on the box?
Let us know how you get on.
Cheers, G
Thanks again guys,
We are going to replace and increase the memory on the server. The issue is that we do see very similar problems on other hardware with AMD processors and mainboards.
We are installing only Intel now and have done so for the last six months, but that doesn't help with the older servers we maintain. Of course we don't see any of these problems with Intel hardware.
Will post reports on how the memory upgrades go.
Joe
Still waiting for memory but I've been chatting to a tech support agent at the manufacturer. He claims to have seen similar issues after 'bum' installations of SP1 for SBS. He advises uninstalling SP1 and reinstalling.
I am going to try this on a test server first but would like to hear of any pitfalls, advice or procedures.
Thanks,
Joe hi,
If it's a clean server without data etc., go for a reinstall (nothing to lose), but if the machine is in production, I'd await the memory being delivered.
We recently had a workstation PC (XP) that wouldn't switch on the Windows Firewall; it turned out to be a faulty motherboard/CPU. I think it's too easy to point to the OS or software. In my experience, problems like this don't develop without the help of failing hardware.
Ben,
I know this is not popular, but I would honestly suggest a call to support. If it is a bug it costs you nothing (you get your cash returned); otherwise it costs around £200, potentially much less than your chargeable time.
The error you are getting means that one of the applications on the system is crashing. Is there any more information in the event log? Could you hit the copy button and paste the event log entry into the post?
Thanks
David
(c)David Overton 2006-23