Could not find the index entry for RID problem keeps reoccurring

  • We have been getting the same RID error ever since we migrated to a new server. It is always on the same table. In an attempt to fix the problem, I have completely dropped the table and indexes and recreated them. I'm not sure what else to try.

    Could not find the index entry for RID '160000b19600694600730008000dd40091000b13004a1009000' in index page (1:501509), index ID 2, database 'Frst'..

    Thanks for any input,

    Chip

  • Chip - here's some info from the MSN site for pointers/guidance....

    "Because a stale read or a lost write results in data storage that is not expected, a wide variety of behaviors may occur. It may appear as missing data, but some of the more common effects of missing data appear as index corruptions, such as Error 644 or Error 625:

    Error 644 Severity Level 21 Message Text Could not find the index entry for RID '%.*hs' in index page %S_PGID, index ID %d, database '%.*ls'.

    Error 625 Severity Level 21 Message Text Cannot retrieve row from page %S_PGID by RID because the slotid (%d) is not valid.

    Some customers have reported missing rows after they perform row count activities. This problem occurs because of a lost write. Perhaps the page was supposed to be linked to the clustered index page chain. If the write was physically lost, the data is also lost.

    Important If you experience any of the behaviors, or if you are suspicious of similar problems together with disabling caching mechanisms, Microsoft strongly recommends that you obtain the latest update for SQL Server and the latest SQL Server I/O Stress Simulator. Microsoft also strongly encourages that you perform a strict review of your operating system and its associated configurations.

    For additional information, click the following article number to view the article in the Microsoft Knowledge Base:

    (231619) HOW TO: Use the SQLIOStress Utility to Stress a Disk Subsystem Such as SQL Server"







    **ASCII stupid question, get a stupid ANSI !!!**

  • Thanks for your reply. We have not upgraded to SP 4 yet. We also have not moved to 2003. We are going to monitor the disk for errors, but nothing shows up in the SQL error logs except this error.

     

    We never had this problem on our other server, so it appears as if this is something server specific. The difference between the two servers is that the new one has two Zeon processors (vs two intel Pentiums) and the disk storage is on a SAN. Do you think any of these could be a contributing factor?

     

    Chip

  • to eliminate any hardware related issues causing data-integrity compromise, you could try upgrading to sp4 and see if that resolves the issue ...however, googling revealed that heaps of other people seem to have run into this problem...maybe someone on this site with personal knowledge of this problem will happen along and respond - but meanwhile, here're a few more pointers that I unearthed...

    look specifically at the msn help link first since you mention that it happens only with a specific table and it might have to do with the collation....

    "This looks like data corruption. Run DBCC CHECKDB on the database and see

    how you go. If CHECKDB won't fix corruption, restore from your latest good

    backup. "

    "DBCC DBREINDEX"

    msn help

    google group







    **ASCII stupid question, get a stupid ANSI !!!**

  • I had trouble like that once...  Had no idea how to fix it - I dropped clustered indices, recreated them, etc, etc - ran repairs which never fixed the problem (even with allowing data loss).  If I dropped one index, I had NO errors reported, but then if I recreated the index, the exact same error came back... It made no sense

    I think I exported the data from the table and recreated all of the RI constraints etc, and then renamed the new table to match the old one...  Awful issue though, especially when the backup hasn't been running because of the settings in the maintenance plan stopping it backing up if there was an error.

  • Luckily I can recreate the data in this table programmatically. So When this happens I actually delete the table and indexes, recreate them and then run my program to create the data. This works for a few days and then the problem comes back.

     

     

  • This may be from the same MSN post above but read these:

    http://msdn.microsoft.com/SQL/default.aspx?pull=/library/en-us/dnsqldev/html/sqldev_04222005.asp

    and

    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx

    The first article has some read world examples seen my MS SQL PSS

    First thoughts - latest SP for OS, latest SP for SQl, HBA firmware and drivers.

    Do you have (excuse terminology if wrong) dedicated SAN disks are are part of the general melting pot?

  • If you can afford a server downtime it would be worthwhile to run sqliostress utility as sushila suggested. In addition to the mentioned article you can see these:

    http://support.microsoft.com/default.aspx?scid=kb;en-us;826433

    http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/sqlIObasics.mspx

    http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsqldev/html/sqldev_04222005.asp

    The general rule I took from these is that such errors are usuallyhardware related (disks, controllers, caching) - so sqliostress mightreveal something. I had similar problem some time ago (irregular transienterror 605) and controller driver and firmware updates solved the problem.

  • Don't forget an IO filter driver such as Antivirus, 3rd party backup solutions etc messing things up

  • I've seen this problem with an MSA1000 SAN on SQL SP3.  In my case, the "corruption" went away when I issued a DBCC DROPCLEANBUFFERS.   Kind of a buzz-kill solution, but it proved to be an acceptable workaround.

    The correct solution, later implemented, was to upgrade the SAN firmware.  It was long ago, so I don't remember what version had the badness.

     

  • I was searching error 625 and come across this post.

    Does anyone know how to find this link, I would like to see how PSS says about this error.

    "This may be from the same MSN post above but read these:

    http://msdn.microsoft.com/SQL/default.aspx?pull=/library/en-us/dnsqldev/html/sqldev_04222005.asp" -- Location Cannot Be Found.

  • I had this problem doing an update to the clustered index which was *not* a unique index. If I updated the column to Null (it had a default of '' (empty string)), it would blow out with this error, blow out my spid, leave the DB in 'Suspect' status, which could only be detached and restored from backup. Thankfully, I had done that, so I restored it and guess what, :w00t: it happened again if I tried to again update it to Null. Did the CheckDB and CheckAlloc and CheckTable and all were perfectly fine before the update. :angry:

    The 'solution' was to update the column to the empty string instead of trying to nullify it. That was acceptable to the end user and once I did that, no errors. 🙂

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply