sqlstate = 08S01 [microsoft][odbc sqlserver]communication link failure

naiomie_asks
We are using MS SQL Server 6.5, and just recently the application we run on it started giving "communication link failure" error alerts. Could anyone please explain why we are getting this kind of error? Could it be coming from the application, from SQL Server itself, or from the network? We've been trying to track this down for almost a month now.

Any reply would be greatly appreciated.

Thanks very much.

n
    animatech

    What you are getting is a network connection problem.
    If you are running clusters, check cluster.log.
    Check the switches for errors on the ports if the server is on the network.
    Check account permissions and passwords to make sure nothing has changed.
    Check the other machines that connect to the SQL Server for network errors.
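
As a rough first pass at the network checks listed above, repeated connection attempts from an affected client can show whether the link to the server is flaky. The sketch below is only an illustration: the function names are made up, and it assumes the server is listening on TCP port 1433, the usual SQL Server default. A 6.5 install may well be using named pipes instead, in which case this check does not apply.

```python
import socket
import time


def probe_tcp(host, port=1433, attempts=5, timeout=3.0, pause=0.0):
    """Try to open a TCP connection several times and record each outcome.

    Returns a list of (succeeded, seconds_or_error_text) tuples.
    Port 1433 is an assumption; change it to match your server.
    """
    results = []
    for _ in range(attempts):
        started = time.monotonic()
        try:
            # This only completes the TCP handshake; it does not log in to SQL Server.
            with socket.create_connection((host, port), timeout=timeout):
                results.append((True, time.monotonic() - started))
        except OSError as exc:
            results.append((False, str(exc)))
        time.sleep(pause)
    return results


def failure_rate(results):
    """Fraction of attempts that failed (0.0 when there were no attempts)."""
    if not results:
        return 0.0
    return sum(1 for ok, _ in results if not ok) / len(results)
```

Running `probe_tcp` from a client that sees the alerts and from one that does not, then comparing the two failure rates, gives a hint as to whether the problem follows the network path or the application. A clean result does not rule out brief dropouts between attempts.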

    naiomie_asks

    thanks, animatech

    I'll ask the services team to check the hardware, but what if everything is fine on the hardware side and the users still receive the same error alerts?

    thanks a lot for your help

    HAL 9000 Moderator

    If the database has grown too big and you are using the cut-down free version of MS SQL, that will be a problem, as it only supports databases up to a certain size. If you have the complete version of MS SQL, it could be that a patch for one of the OSes is causing a problem with MS SQL 6.5 and preventing it from working properly.

    If this is happening on just one node, I would look at the hardware on that node first, starting with the switches, hubs or whatever you are using.

    However, if it's happening network-wide, I would first check that you have the correct number of CALs or Terminal Services licenses available, whichever you are using, and then start looking at any known problems that have arisen recently with the product that is using SQL on the MS server. The program's maker is quite often the best place to start looking for fixes, unless of course this is an in-house product, in which case it's possible that one of the programming shortcuts used when it was originally written has been disabled by an M$ patch and is causing the program to become unresponsive.

    Col

    naiomie_asks

    To start, thanks for the reply; your suggestions help greatly. We have a complete version of MS SQL 6.5 and it's authenticated; it runs on Windows 2000 SP4. Are service packs and patches not the same thing? Should I look for a patch, or could one of the patches be causing the problem?

    thanks a lot.

    HAL 9000 Moderator

    Let's start at the beginning. SQL in whatever form is always at the back end of a database program, generally with something loaded on top of it doing the leg work to manage the database that SQL is holding.

    Now, several things can stop it working properly. The most obvious is a new M$ patch, which can be either a Windows patch or an Office patch if Office is loaded on the computer holding the SQL program. These patches have been known to break things, so it's possible that a recently applied M$ patch has broken the application. Or, if you are running 2000 Pro with SP4, it could be enforcing the limit of 10 concurrent connections that is built into 2000. If you are using 2000 Server, you could be exceeding the available CALs or Terminal Services licenses, which will cause the problem as well.

    Unless there is something wrong with the 2000 machine, though, this should only affect connections in excess of the number of CALs/Terminal Services licenses that you have, or in excess of the upper limit of 10 concurrent connections on the desktop version of 2000. So if you have more than 10 computers connecting at the same time to the desktop version of 2000, the extra computers will not get a connection. And if you have, let's say, 15 CALs available and they are not being released when a computer logs off, fewer than 15 computers will be able to connect at the same time, with the excess no longer being allocated a connection.

    There could also be a problem where the connection is not released when a computer logs off, so when it logs back in it takes another licensed connection (or another supported connection on the desktop version). That way one computer can end up holding several CALs/Terminal Services licenses/concurrent connections, which limits the number of other computers that can be connected at the same time.

    To test whether this is happening, a reboot will do. If the problem disappears for a while, you'll have to start looking at a possible problem on the file server, which is failing to release connections from connected computers when they log off.
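
A less disruptive check than a reboot is to count how many sessions each client machine is holding. The sketch below assumes you have copied (spid, hostname) pairs out of something like `sp_who` output by hand. The function names and the sample data are made up for illustration, and some applications legitimately open more than one connection per client, so the threshold is a judgment call.

```python
from collections import Counter


def connections_per_host(sessions):
    """Count sessions per client hostname.

    `sessions` is an iterable of (spid, hostname) pairs, for example
    copied from the spid and hostname columns of sp_who output.
    """
    counts = Counter()
    for _spid, hostname in sessions:
        name = hostname.strip().upper()
        if name:  # skip rows with a blank hostname, such as system processes
            counts[name] += 1
    return counts


def hosts_over(counts, expected_per_host=1):
    """Hosts holding more sessions than expected, busiest first."""
    return [(h, n) for h, n in counts.most_common() if n > expected_per_host]


# Made-up sample data, for illustration only.
sample = [(10, "PC-01"), (11, "PC-02"), (12, "pc-01 "), (13, ""), (14, "PC-01")]
print(hosts_over(connections_per_host(sample)))  # prints [('PC-01', 3)]
```

If the same machine keeps climbing in this list over the course of a day, that points at connections not being released rather than at the network.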

    The next thing that can happen is that the front-end program which does all of the leg work on SQL has been broken by a recent M$ patch: it does something which was perfectly acceptable but is now considered a security risk, so that avenue of performing the operation has been shut down and the front end stops working properly. If the front-end program is from a software house of some kind, contact them to see if they have a workaround for the problem. If it is an in-house program you are generally stuffed, as these tend to have more shortcuts in them which can be broken by M$ patches.

    Even small-scale software like medical programs for doctors' surgeries can have massive problems, as they occupy a very small market and don't have the staff available to rewrite as required when M$ change the rules and plug openings that were previously being used to make something happen. Also, because they only have a small staff of programmers, you may not necessarily get the right information.

    Recently a medical program stopped working because the database had exceeded 45 GIG, which the free version of SQL could no longer handle, so the program makers told the doctor to install 2003 SBS as it had a full version of SQL in it. When I installed the new server I installed SQL 2005, and the medical program didn't work, as it had only been written for SQL 2003. Unfortunately they couldn't say that until they had tried some fault finding, after the new server application and the necessary CALs had been purchased and installed. The company's solution is to wait a few months until the next rewrite of their program becomes available, which will be SQL 2005 compliant. It's not so much that the support desk lied about a fix as that they didn't know the proper answer to give, and I had to spend hours on the phone with one of the programmers to work out what was going wrong.

    But when did the application stop working properly, and what changes were made just before this happened? That is what you are going to have to look at to find a solution here. You have the full version of SQL, and I'm taking it that you have the right version of the front-end program to run on it, so you need to look at what changes were implemented just before the application broke.

    You should keep a list, by item and date, of everything new added to any server, and alongside it a list of any problems that arise. I know that's not easy, particularly if you allow the unit to auto update, which is dangerous when mission-critical business systems are involved. You should always do things manually, as that way you retain control over what is happening and know where to start looking when something goes wrong. But with shrinking budgets and more work than is easily handled per day, things can get missed, and auto updates get allowed to run to save time and effort. In my experience this has always led to problems eventually: when things break, you don't know what has happened recently, so you have no idea where to start looking to fix the problem.
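
If such a change list exists, narrowing it down to what happened shortly before the first failure is simple to automate. This is a minimal sketch with made-up entries and a hypothetical `changes_before` helper. The 14-day window is an arbitrary assumption, not a recommendation.

```python
from datetime import date, timedelta


def changes_before(change_log, first_failure, window_days=14):
    """Return changes made in the `window_days` up to `first_failure`, newest first.

    `change_log` is an iterable of (date, description) pairs.
    """
    earliest = first_failure - timedelta(days=window_days)
    recent = [(d, text) for d, text in change_log if earliest <= d <= first_failure]
    return sorted(recent, key=lambda item: item[0], reverse=True)


# Made-up entries, for illustration only.
log = [
    (date(2007, 3, 1), "Replaced switch in comms room"),
    (date(2007, 3, 10), "Applied Windows patches to DB server"),
    (date(2007, 3, 20), "Front-end application updated"),
    (date(2007, 4, 1), "Added two client PCs"),
]
for when, what in changes_before(log, first_failure=date(2007, 3, 21)):
    print(when.isoformat(), what)
```

The most recent change before the first alert is listed first, which is usually where to start looking.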

    I hope that explanation is of some use to you.

    Col


    Hi

    animatech

    I agree with HAL, as his recent article :) covered most of the solutions.
    From my experience with these errors, the SQL error log will give you a hint most of the time.
    A lot of the time it is a database that is too big, the concurrent connection limit, or a network problem caused by hardware or a 3rd-party application.

    tallsexyblone

    To reiterate Col's statement that this could be just about anything:

    SAN attached cluster, W2K3x64 SP2.
    As part of a migration process, I had to disconnect a number of connections to SAN attached drives on the active SQL cluster node.

    At exactly the same time as the SAN drive paths were disconnected, the application event log listed a number of event ID 19019 entries, all pointing to a network connectivity issue:

    ------------------------------
    [sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]TCP Provider: The specified network name is no longer available.

    [sqsrvres] printODBCError: sqlstate = 08S01; native error = 40; message = [Microsoft][SQL Native Client]Communication link failure
    ------------------------------

    No SQL cluster resources went down, though.

    How could this happen? Well here's my take on it:

    Windows is humming along, gets wind of a few devices it knows about going AWOL, and has to engage the hardware management subsystem to verify who's there and who's not and send status change notifications to other subsystems as necessary...

    This process also holds up network communications at a low (driver) level for a split second or so, making it look like there was a problem with the link when there actually wasn't.
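
To line events like the ones quoted above up against other things happening at the same moment, it helps to pull the fields out of each log line. The sketch below assumes lines in exactly the `sqlstate = ...; native error = ...; message = ...` shape shown in this post. The function names are made up, and other drivers or log sources may format things differently.

```python
import re

# Matches the "sqlstate = X; native error = N; message = ..." part of a log line.
ODBC_ERROR = re.compile(
    r"sqlstate\s*=\s*(?P<sqlstate>\w+);\s*"
    r"native error\s*=\s*(?P<native>-?\w+);\s*"
    r"message\s*=\s*(?P<message>.*)",
    re.IGNORECASE,
)


def parse_odbc_error(line):
    """Return a dict with sqlstate, native_error and message, or None if no match."""
    match = ODBC_ERROR.search(line)
    if match is None:
        return None
    return {
        "sqlstate": match.group("sqlstate").upper(),
        "native_error": match.group("native"),
        "message": match.group("message").strip(),
    }


def is_link_failure(parsed):
    """True for SQLSTATE 08S01, the ODBC code for a communication link failure."""
    return parsed is not None and parsed["sqlstate"] == "08S01"
```

Filtering a log for `is_link_failure` entries and comparing their timestamps against hardware or storage events is one way to test an explanation like the one given here.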

    Rgs,
    K28.5

    burtonnirmal

    Hi, I'm getting the same error (sqlstate = 08S01 [microsoft][odbc sqlserver]communication link failure).

    Please help me solve this error.
    Thanks in advance.
    Burton.
