DavidOverton.com
This site is my way to share my views and general business and IT information with you about Microsoft, IT solutions for ISVs, technologists and businesses, large and small.  

Unwell SBS Server :(


Joe Posted: Tue, Apr 4 2006 11:57 AM

Hi All,

 

We have a customer with SBS 2003, and every week or so the server stops accepting network connections.  Clients are unable to connect to the server via Terminal Services, log on to the domain, map drives or open Outlook.  There are no third-party applications running scans or backups at the time, so I am discounting these as the cause.  I have checked the AT scheduled tasks and there is nothing.  After a reboot the server works fine for a number of days (6-10 or so) and then it all happens again.

 

Below are the first errors from each event log after the point at which I believe the server went offline.

 

Application Log:
Event Type:  Error
Event Source: MSExchangeAL
Event Category: LDAP Operations
Event ID:  8026
Date: 12/03/2006
Time: 04:37:16
User: N/A
Computer: P1SVR

 

Description: LDAP Bind was unsuccessful on directory p1svr.p1international.local for distinguished name ''. Directory returned error:[0x34] Unavailable.

 

System Log:
Event Type: Error
Event Source: Application Popup
Event Category: None
Event ID: 333
Date: 12/03/2006
Time: 04:38:14
User: N/A
Computer: P1SVR

 

Description: An I/O operation initiated by the registry failed unrecoverably.  The registry could not read in, or write out, or flush, one of the files that contain the system's image of the registry.

 

Directory Service Log:
Event Type: Error
Event Source: NTDS General
Event Category: Global Catalog
Event ID: 1126
Date: 12/03/2006
Time: 12:24:26
User: NT Authority\system
Computer: P1SVR

 

Description: Active Directory was unable to establish a connection with the global catalog.

 

Additional Data
Error value: 9852 No DNS servers configured for local system.
Internal ID: 3200d11
User Action: Make sure a global catalog is available in the forest, and is reachable from this domain controller. You may use the nltest utility to diagnose this problem.

 

DNS Server Log:
Event Type: Error
Event Source: DNS
Event Category: None
Event ID: 4015
Date: 12/03/2006
Time: 03:50:38
User: N/A
Computer: P1SVR

 

Description: The DNS Server has encountered a critical error from the Active Directory.  Check that the Active Directory is functioning properly.  The extended error debug information (which may be empty) is "".  The event data contains the error.

 

Data:
0000: 51 00 00 00 
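
For anyone reading the Data field above: a minimal sketch of decoding it, assuming the four bytes are a little-endian DWORD and that the value is an LDAP result code (the event text does not say so explicitly). The lookup table is deliberately tiny and only lists the one code relevant here.

```python
import struct

# Raw bytes copied from the "Data:" field of the DNS 4015 event.
raw = bytes.fromhex("51 00 00 00")

# Assumption: the field is a single little-endian unsigned 32-bit value.
(code,) = struct.unpack("<I", raw)

# Assumption: the value is an LDAP result code. 0x51 is LDAP_SERVER_DOWN.
LDAP_CODES = {0x51: "LDAP_SERVER_DOWN"}

print(code, hex(code), LDAP_CODES.get(code, "unknown"))
```

If that reading is right, the DNS service could not reach the directory at all at that moment, which would fit the other errors rather than point to a separate DNS fault.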

 

Please let me know if you have any ideas or thoughts as to what the cause or fix may be.

 

Thanks,
Joe

Paulie replied on Wed, Apr 12 2006 10:51 PM
Strange indeed!

I had a PC running XP that did something similar a couple of years ago.  It turned out to be a crappy NIC.

I would check to see if there are any updated drivers or consider using a new high quality NIC.

Paul
SmartFix replied on Thu, Apr 13 2006 12:20 AM

Hi Joe,

I have had similar symptoms on a customer's box.

I unchecked the following boxes in Exchange System Manager > Global Settings > (right-click) Mobile Services > Properties:

          Enable up-to-date notifications …….

          Enable direct push over HTTPs

The fault cleared and users could log on almost instantly.  Not sure if this will help, but it will only take a minute to test.

Rob

Joe replied on Thu, May 11 2006 10:34 AM

Hi Guys,

Tried both your ideas and the server is still unhappy.  A couple of new "features" have presented themselves now on this server.

  1. CALs disappear
  2. In Task Manager the processes are no longer listed on the processes tab and the performance details are no longer on the performance tab.

A weekly reboot stops the server from crashing but I'd rather get this resolved.

One other thing to add.  I added the /3GB /USERVA=3030 switches to the boot.ini file and the server crashed with the same symptoms as before, but within 6-12 hours.  It seemed to accelerate the process.

Afterwards I read this http://weblogs.asp.net/oldnewthing/archive/2004/08/06/209840.aspx article which brings me to the conclusion that the server has memory management problems.
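
A rough sketch of the arithmetic behind that conclusion, assuming a 32-bit server with a 4 GB virtual address space where /USERVA sets the user-mode share in MB and the kernel gets whatever is left. The function name is made up for illustration.

```python
TOTAL_MB = 4096  # 32-bit virtual address space, in MB


def kernel_space_mb(user_va_mb=2048):
    """Kernel share of the address space for a given user-mode share.

    The 2048 MB default is the standard 2 GB / 2 GB split without /3GB.
    """
    return TOTAL_MB - user_va_mb


default_kernel = kernel_space_mb()      # no switches
tuned_kernel = kernel_space_mb(3030)    # with /3GB /USERVA=3030

print(default_kernel, tuned_kernel)
```

Under these assumptions the switches roughly halve the kernel's address space, so if the kernel side is what runs out over 6-10 days, it would plausibly run out much sooner with the switches in place.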

I believe the mainboard is AMD and the processor is an Opteron 244.

Thanks again
Joe

Paulie replied on Thu, May 11 2006 1:20 PM
I know I am going to get my ass kicked for this......

But one piece of advice I always give people when putting together any machine, be it a server or a PC, is...

Intel, Intel, Intel!!

Have as much Intel as possible: CPU, Mainboard, NICs.

The value of Intel isn't in speed etc., it is in reliability.  I have known so many people with seemingly random problems on their machines which have chipsets/CPUs which are not Intel based.  I don't know for a fact that the two are related, but my own experience shows that Intel-based systems seem to experience fewer problems.

You have my sympathy, problems like this are always time consuming and very difficult to get to the bottom of.


Gareth replied on Thu, May 11 2006 7:26 PM

I'd be really surprised if this wasn't a memory fault.

There is a boot disc (floppy, Linux) I can dig out if you like, which tests the memory and doesn't use the HDDs.  But these days memory is so cheap: can you change all the memory on the box?

Let us know how you get on.

Cheers, G

Joe replied on Fri, May 12 2006 12:42 PM

Thanks again guys,

We are going to replace and increase the memory on the server.  The issue is that we do see very similar problems on other hardware with AMD processors and mainboards.  

We are installing only Intel now and have done so for the last six months, but that doesn't help with the older servers we maintain.    Of course we don't see any of these problems with Intel hardware. 

Will post reports on how the memory upgrades go. 

Joe

Joe replied on Wed, May 24 2006 12:22 PM

Still waiting for memory but I've been chatting to a tech support agent at the manufacturer.  He claims to have seen similar issues after 'bum' installations of SP1 for SBS.  He advises uninstalling SP1 and reinstalling.

I am going to try this on a test server first but would like to hear of any pitfalls, advice or procedures.

Thanks,

Joe

Gareth replied on Wed, May 24 2006 12:30 PM

Joe hi,

If it's a clean server without data etc., go for a re-install (nothing to lose), but if the machine is in production, I'd wait for the memory to be delivered.

We recently had a workstation PC (XP) which wouldn't switch on the Windows Firewall.  It turned out to be a faulty motherboard/CPU.  I think it's too easy to point to the OS or software; in my experience problems like this don't develop without the help of failing hardware.

 

ben59 replied on Wed, Jun 28 2006 4:51 PM
Hi
I have the same effects.  I found out that it all starts with

Event Type: Error
Event Source: Application Popup
Event Category: None
Event ID: 333

This starts as soon as I log on via a remote terminal session (to do some support tasks).

Then the server becomes overloaded and all the other messages start to occur. I guess this is caused by a memory leak or a resource-heavy process.
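
One way to check which error really comes first is to sort the first error from each log by time. A minimal sketch using the four events quoted in the opening post of this thread; it assumes the dates are day/month/year and reads the DNS event's time, shown there as "03/50:38", as 03:50:38.

```python
from datetime import datetime

# (log, source, event ID, timestamp) as quoted in the opening post.
# Assumption: the DNS time "03/50:38" is a typo for 03:50:38.
events = [
    ("Application", "MSExchangeAL", 8026, "12/03/2006 04:37:16"),
    ("System", "Application Popup", 333, "12/03/2006 04:38:14"),
    ("Directory Service", "NTDS General", 1126, "12/03/2006 12:24:26"),
    ("DNS Server", "DNS", 4015, "12/03/2006 03:50:38"),
]


def parse(timestamp):
    # Assumption: day/month/year, as on a UK-configured server.
    return datetime.strptime(timestamp, "%d/%m/%Y %H:%M:%S")


ordered = sorted(events, key=lambda event: parse(event[3]))

for log, source, event_id, timestamp in ordered:
    print(timestamp, log, source, event_id)
```

On that data the DNS 4015 error sorts ahead of the 333, so on that server the 333 may be a symptom rather than the starting point. Only four hand-picked events are compared here, so this is a hint at best.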

I removed all programs from startup folder and also automatic login script from Administrator profile.

This didn't prevent the error message from starting a minute after the remote session started. Without a remote session, everything is fine!

My system is fully Intel (HP ProLiant ML350, brand new), German SBS 2003 Premium SP1, freshly installed.

Just wondering if MS has a clue about that?

Ben

  • | Post Points: 21
Top 10 Contributor
Points 84,771

Ben,

I know this is not popular, but I would honestly suggest a call to support. If it is a bug it costs you nothing (you get your cash returned); otherwise it costs around £200.  That is potentially much less than the cost of your chargeable time.

The error you are getting means that one of the applications on the system is crashing.  Is there any more information in the event log?  Could you hit the copy button and paste the event log entry into the post?

 

thanks

 

David


(c)David Overton 2006-23