Cannot fetch a row from OLE DB provider IBMDADB2 for linked server

We have a very peculiar error where we spent months working with Microsoft and IBM to figure it out what could be wrong. You can see the short summary below

I have SQL Server 2008R2 SP2 with the DB2 drivers installed, v. 11.64.11.00 (64 bits). I have two instance in the same box, both using the same ODBCs, one instance is fully functional but second instance gives me this result when I try to execute a command on a linked server:

The OLE DB provider IBMDADB2 for linked server reported an error. The provider did not give any information about the error. Msg 7330, Level 16, State 2, Line 1 Cannot fetch a row from OLE DB provider IBMDADB2 for linked server

I already verified allow in process is = 1, and also assign the users the “Create global objects” user right.

This affects only the windows account but it works fine with SQL Logins. After months of troubleshooting we found the error can be fixed by Disabling the UAC you can refer to https://technet.microsoft.com/en-us/library/cc709691(v=ws.10).aspx

  1. Click Start, and then click Control Panel.
  2. In Control Panel, click User Accounts.
  3. In the User Accounts window, click User Accounts.
  4. In the User Accounts tasks window, click Turn User Account Control on or off.
  5. If UAC is currently configured in Admin Approval Mode, the User Account Control message appears. Click Continue.
  6. Clear the Use User Account Control (UAC) to help protect your computer check box, and then click OK.
  7. Click Restart Now to apply the change right away, or click Restart Later, and then close the User Accounts tasks window.

I sincerely hope this could save somebody months of workaround. If this is your case, leave a comment.

A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: seconds. Working set, committed, memory utilization

Long story short: you need to add the SQL service account to Lock Pages in Memory and set a max SQL memory configuration.

The problem is when SQL Server stops working but don’t attempt to failover. There are a few scenarios where this error applies but most of the cases SQL doesn’t not accept new connections for short periods (5 seconds.)  The problem can be identified in SQL Error Log with the following entry:

A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: 0 seconds. Working set (KB): 157348, committed (KB): 444456, memory utilization: 35%.

Environment

This error is more visible in virtual server and server with more than 128 GB in RAM. This affects so far SQL Server 2012 and 2014 Standard or Enterprise Editions.

 Root Cause

Problem happened because Windows paged out SQL server memory. KB 918483 discusses the issue and the resolution.

Resolution

You will need to work with Enterprise Windows or Wintel team to resolve the paging out issue. You can refer “How to troubleshoot this problem” of the “More information” section of the following article: How to reduce paging of buffer pool memory in the 64-bit version of SQL Server https://support.microsoft.com/en-us/kb/918483

Workaround

You can grant the SQL service account is ‘DOMAIN\SA_SQLAccount_P’ Lock pages in memory user right to avoid the issue. Before you grant the privilege, you need to make sure SQL maximum memory is set such that there is enough memory left for the OS to work smoothly. This needs to be done in each node of the cluster.

To enable the lock pages in memory option

1. On the Start menu, click Run. In the Open box, type gpedit.msc.

2. On the Local Group Policy Editor console, expand Computer Configuration, and then expand Windows Settings.

3. Expand Security Settings, and then expand Local Policies. –> Lock Pages in Memory

Lock Pages in Memory

In the Local Security Setting – Lock pages in memory dialog box, click Add User or Group.

Adding user to Lock pages in memory
Adding user to Lock pages in memory

In the Select Users, Service Accounts, or Groups dialog box, add an account with privileges to run sqlservr.exe.

Log out and then log back in for this change to take effect.

SQL Server has now lock page in memory and it is registered in the SQL Error Log:

sql_after_plm_configured

Conclusion

SQL Server memory is tricky when running in a virtual machine. Anybody can get the Page Out Memory error but it will become more and more frequent. The problem is when it goes down, not even monitoring tools can catch it as recovery time is two to five seconds. However, important SSIS, jobs, or request can occur during those seconds and it will reject all connections. The problem goes very serious when the application is not able to recover itself and SQL did not register any attempt of failover.

Fixing the problem with this relatively easy solution will not only improve the performance but also maintain a consistent database health. Make sure to configure the SQL Server max memory 80/20 to avoid any issue.

 

References

Enable lock pages https://msdn.microsoft.com/en-us/library/ms190730(v=sql.120).aspx

How to reduce paging of buffer pool memory in the 64-bit version of SQL Server https://support.microsoft.com/en-us/kb/918483

SQL Server Set Up Error The RPC Server is Unavailable

This error occurs at installation time when SQL is unable to authenticate the service accounts to start the services. This occurs with A domain accounts, B domain accounts, and local system accounts. This error stops all the installation of SQL Server and it says “The RPC server is unavailable.” The screen below shows how this error occurs.

SQL Error

I troubleshoot this error by changing the accounts to start the services with no luck. Also attempted with a valid domain_A\sa_account, domain_B\valid_account, and NT AUTHORITY\Local Account but none of them worked.

Solution
Windows Server by default has some policies enabled. In order to fix this error HP configured the following
1. Below marked FW rules should be enabled from group policy.
Enable_Windows_FW_Rules

2. Network binding should be below order. Yours may change but the production/primary must be listed first.
Windows_Binding

3. Make sure DNS suffix entry should be correct order.
DNS_Suffix_Order

4. Execute below command if any issue found on Time synchronization

w32tm.exe /register
net stop w32time
net start w32time

After these changed were made, DBA team was able to proceed with the installation and the error “The RPC server is unavailable” disappear.

Technical Specifications
OS Version: Microsoft Windows Server 2012 R2 Standard x64
SQL Version: SQL Server Standard Edition 2014 x64
Memory: 16Gb
Processor: 4 CPUs
Two nodes Active-Passive Cluster.

SSRS Subscription Failure sending mail: The user or group name ‘domain\user’ is not recognized.Mail will not be resent.

When you’re getting the error above is because the user is no longer valid for Reporting Services.
Quick Fix –> Re-create the subscription.
Best Fix –> Reassign the subscriptions to a Service Account.

However, if you have several subscriptions it is not feasible to recreate all of them. Here it is how can you reassign these subscription to someone else.

1. Create a Service Account to assign it as owner of these subscriptions. This is because the Active Directory account set before is no longer valid (the employee left the company.) If you just use another user account and that user left the company you’ll have to do it over and over again. Account created: SA_REPORTS_P

2. Grant DBO permissions on ReportServer database to the new Service Account (SA_REPORTS_P)

3. Grant ALL permissions to the Service Account on Report Server Home directory. Make sure all the folder inherit permissions from parent, if not, individually check the permissions in each folder.

SSRS Permissions grant
SSRS Permissions grant

4. Find the OWNERID of the subcriptions that have an invalid user or group name. Run the query below in your ReportServer database and find those with the error.

--Query to get the Last Status of each Subcription and its OwenerID.
SELECT s.[SubscriptionID] -- Subscription ID
,s.[OwnerID] -- Report Owner
--,s.[Report_OID] -- Report ID
, c.Path -- Report Path
--,rs.ScheduleID as SQLJobName -- Name of Job on SQL Server
,s.[LastStatus] -- Status of last subscription execution.
,s.[Description] -- Description of the report subscription
,s.[EventType] -- Subscription type
,s.[LastRunTime] -- Last time subscription executed
--,s.[Parameters] -- Parameters used for subscription
,s.[DeliveryExtension] -- How to deliver the subscription
FROM [ReportServer].[dbo].[Subscriptions] as s left join dbo.Catalog as c
on c.ItemID = s.Report_OID left join dbo.ReportSchedule as rs
on rs.ReportID = s.Report_OID
order by s.LastStatus

OwnerID_LastStatus

5. Identify OwnerID for the Service Account Created (or whatever account you’re going to use). Run the query below to list the users.

SELECT [UserID],[UserName]
FROM ReportServer.dbo.Users

SSRS_users
6. Replace the invalid OwnerID with a valid OwnerID. I use the Service Account ID.

UPDATE [Subscriptions]
SET [OwnerID] = '96C17DCA-42D6-44AE-B036-9EA1F870AEFD' -- New VALID OwnerID
WHERE [OwnerID] = 'EFBCD353-3745-41E4-B2BF-8C23198014AD' -- Old and invalid OwnerID

7. Verify the subscriptions were updated. Run the query below to see how the new user name is not in place.

--Query to get the Report Name and its Subcription Owner.
SELECT
jobs.name AS JobName,
C.path AS ReportPath,
C.name AS ReportName,
u.username AS SubscriptionOwner
FROM dbo.ReportSchedule RS JOIN msdb.dbo.sysjobs jobs
ON CONVERT(varchar(36), RS.ScheduleID) = jobs.name
INNER JOIN dbo.Subscriptions S
ON RS.SubscriptionID = S.SubscriptionID
INNER JOIN dbo.Catalog C
ON s.report_oid = C.itemid
INNER JOIN dbo.users u
ON s.ownerid = u.userid
order by u.username

SSRS_SubscriptionOwner

The problem should be fixed at this point. Nevertheless, if you want to check if it really works, you can wait for the next subscription to run or invoke it manually. So, here you have a bonus on how to manually call a subscription.

A. In ReportServer, execute the query below and it will give you the SQL Job Name. Two Jobs (subcriptions) that were failing due to an invalid OwnerId are 36F500A6-58DA-4ACB-87AF-9947FEBB5797 and
542C8818-914D-4BAF-A9AA-C58FCADC24E0

--Query to get the SQL Job Name of each subscription in the server
SELECT
jobs.name AS JobName,
C.path AS ReportPath,
C.name AS ReportName,
u.username AS SubscriptionOwner,
s.LastStatus
FROM dbo.ReportSchedule RS JOIN msdb.dbo.sysjobs jobs
ON CONVERT(varchar(36), RS.ScheduleID) = jobs.name
	INNER JOIN dbo.Subscriptions S
ON RS.SubscriptionID = S.SubscriptionID
	INNER JOIN dbo.Catalog C
ON s.report_oid = C.itemid
	INNER JOIN dbo.users u
ON s.ownerid = u.userid
order by s.LastStatus

SSRS_SQLJobName

B. Execute the job manually with the script below.

EXEC msdb.dbo.sp_start_job @job_name = '36F500A6-58DA-4ACB-87AF-9947FEBB5797'

C. Validate the new status by executing the script in step A.
SSRS_SQLJobName_result2.jpg

Veeam Failed to truncate transaction logs for SQL instances: MICROSOFT##WID. Possible reasons: lack of permissions, or transaction log corruption

This error occurs for several reason, but it can be fixed by applying a hotfix that replaces some libraries. However, follow the steps outlined in this KB article https://www.veeam.com/kb2027 as is relevant to this type of errors, if none of them work out, then you’re in the right place!

NOTE: The hotfix I’ve uploaded is in PDF format but it is a ZIP file, wordpress doesn’t allow to upload ZIP files.

Short solution
– Download this ZIP file and replace them on your affected VM C:\Program Files\Veeam\Backup and Replication\VSS\VeeamGuestHelpres\. (rename it from PDF to ZIP)

Detailed solution
In my architecture we are working with a VM with the following software:
– SQL Version: Microsoft SQL Server 2014 Standard Edition (64-bit)
– OS Version: Windows Server 2012 Standard
– Veeam Backup & Replication 8.0.0.2084

The error or warning shows a full text like this:
Unable to truncate SQL server transaction logs. Details: Failed to process ‘TruncateSQLLog’ command.
Failed to truncate transaction logs for SQL instances: MICROSOFT##WID. Possible reasons: lack of permissions, or transaction log corruption.

In your Veeam console you may see something like this:

Console warning
Console warning

To fix this, follow these steps:
1. Download the new files here (I got them directly from Veeam Support – make sure to rename it form pdf to zip)
2. Copy the zip file or extracted files to you VM.
3. Connect to the affected VM and find the path C:\Program Files\Veeam\Backup and Replication\VSS\VeeamGuestHelpres\. Where you’re going to see some libraries like VeeamVSSSupport*
4. Take a backup of those files
5. Replace the files with the new files. The new files size should be similar to this:
Files_To_Replace
6. Execute the backup job in your Veeam Console

The issue will be fixed after you replace the old files for the new files.

AlwaysOn Availability Replica is in Resolving State

AlwaysOn is a “new” feature in SQL server and there is a little understanding on how it really works. Microsoft has set a lot of functions in a black box and even Microsoft’s help says it is complicated to explain.

I had a problem with an Availability Replica that was shown in Resolving state after Windows team applied the monthly patches in the server.

As result all the databases in the Availability Group were set in Recovery Pending state.
 
 
If you try to bring to start the availability replica you will get an error as this:

TITLE: Microsoft SQL Server Management Studio
------------------------------
The local node is not part of quorum and is unable to initiate a failover. This could be caused by one of the following reasons:
•   The local node is not able to communicate with the WSFC cluster.
•   No quorum set exists across the WSFC cluster.
 For more information on recovering from quorum loss, see SQL Server Books Online.
  (Microsoft.SqlServer.Management.HadrTasks)
------------------------------
BUTTONS:
OK
------------------------------
Recovery Pending
Recovery Pending

WSFC error
WSFC error

 
 
 
 

Before you go further and spend time looking up for the WSFC (Windows Failover Service Clustering) here it is the solution that worked for me. In my case we have configured a cluster with 3 nodes. Only two nodes have a vote SQL01 and SQL02, however, SQL03 is a replica and it doesn’t have a vote.

So the next step is to open the Cluster Administrator.

Cluster Administrator
Cluster Administrator

Next, go to the Node section and you you’ll see the SQL03 is down. The server that is resolving state is down in nodes. It doesn’t have a vote and that is why it is not visible in the main page of the Cluster Administrator.

Node is down
Node is down

Finally, bring the node online as the picture below:

Bring node online
Bring node online

Once the node is online the Availability group got fixed and the synchronization begins in all the databases. If your node was up and running and everything was well configured then it could be something in the ports, endpoints, and so on. Here it is a useful link from Microsoft https://msdn.microsoft.com/en-us/library/ff878308.aspx

If your node is constantly going down because of different reasons you can schedule a script to bring it online automatically. I have posted a solution in my Windows/Scripting section that may help you to simplify your Administrator life.

Transactional Replication is not getting synchronized

First, let’s define what a replication is in human terms. A replication is when we have one table, schema, or database in different locations at the same time. For example, one database that is consumed for reporting purposes in the west and east cost  will be slower in one side of another. To avoid this, and in order to have a higher data availability (and some many other reasons) we set up a replicated database. Now we have one central server named publisher (Chicago), one distributor server (NYC) that will be sending the most updated transaction to the subscribers. Two subscribers  (NYC and SFO) that will be pulling the data from the distributor.

The schema above is a replicated database working around the US.

Now, we’ve been experiencing a lot of problems with one of our replications.  This is the replication topology:

=============================================================
REPLICATION TOPOLOGY
=============================================================
SELECT THE PUBLICATION-SUBSCRIPTION PAIR FOR SCOPING THE CASE
=============================================================
MiamiServerDB04 (DISTRIBUTOR :: 11.0.3128 X64) (Distribution Database: distribution)
|- PUBLISHERS
|- MSSQLCLU (Publisher 1)
|- PUBLICATIONS
|- PublicationXMS (Publication ID: 25; Publication type: Transactional (1-way); Publication database: Xms)
|- ARTICLES
|- 639 article(s) - Aprox 200 Gb.
|- SUBSCRIPTIONS
|- AtlantaServerDB (Subscription Database: RPT_DB; Subscription Type: Push)
|- MiamiServerDB04 (Subscription Database: Repl_DB; Subscription Type: Push

The problem here is that we cannot add new articles to the replications because every time we add a new article the changes are not reflected on MiamiServerDB04 however, AtlantaServerDB shows the changes.

The solution is simple but slow. Unfortunately you must restart the replication for the damaged subscriber (MiamiServerDB04), take a new snapshot, and apply it to the destination server.

Many times when we rebuilt the subscription it didn’t work. We figure it out that we had to stop the agents manually and delete the jobs manually in the distributor server.  Execute the following query in the distributor server to get the status of the replication and the name of the agents and job (the one that you’re going to remove manually)

The code below has been provided by Microsoft

--Query to get replication information, agents, servers, transactions
exec sp_configure 'Ad Hoc Distributed Queries', 1
go
RECONFIGURE
go
Use Master;
GO
Set NOCOUNT ON;
GO

IF object_id('tempdb..#tempsub2') is not null
DROP TABLE #tempsub2
IF object_id('tempdb..#tempsub1') is not null
DROP TABLE #tempsub1
Create TABLE #tempsub2 ( publisher SYSNAME , publisher_db SYSNAME, publication SYSNAME, subscriber SYSNAME, subscriber_db SYSNAME, Pending_Commands int , time_to_deliver_pending_Commands int)

DECLARE @publisher SYSNAME, @publisher_db SYSNAME,@publication SYSNAME,@subscriber SYSNAME,@subscriber_db SYSNAME
SELECT
sub3.publisher
,sub1.publisher_db
,sub1.publication
,CASE when sub1.anonymous_subid is not null then upper(sub1.subscriber_name) ELSE UPPER (srv.name) END 'Subscriber'
,sub1.subscriber_db
,Sub1.job_id,sub1.id
,subscription_type
,sub1.name
into #tempsub1
FROM
(
SELECT * FROM distribution..msdistribution_agents agents
Where subscriber_db not in ('virtual') -- Don't retrieve Virtual subscriptions
and anonymous_subid is null -- Don't retrieve anonymous subscriptions
) sub1
Inner join
(
SELECT
publisher
,publisher_db
,publication
,publication_type
,agent_name
,publisher_srvid
,job_id
FROM distribution..MSreplication_monitordata
WHERE publication_id is not null
AND agent_type = 3 -- Distribution agent
)sub3

on sub1.publisher_id = sub3.publisher_srvid
and cast(sub1.job_id as uniqueidentifier) = sub3.job_id
and sub1.publisher_db=sub3.publisher_db
and sub1.publication= sub3.publication
and sub1.subscription_type=sub3.publication_type
and sub1.name =sub3.agent_name
join master.sys.servers as srv
on srv.server_id = sub1.subscriber_id

DECLARE subscribers cursor for SELECT publisher, publisher_db ,publication ,subscriber ,subscriber_db from #tempsub1
OPEN subscribers FETCH NEXT FROM subscribers INTO @publisher, @publisher_db ,@publication ,@subscriber ,@subscriber_db
WHILE @@FETCH_STATUS = 0 BEGIN
INSERT into #tempsub2
EXEC
(
'
SELECT '''+ @publisher +''' , '''+ @publisher_db +''' ,'''+ @publication + ''' , ''' + @subscriber + ''' , ''' + @subscriber_db + ''' ,*
FROM OPENROWSET (''SQLOLEDB'',''Server=(local);TRUSTED_CONNECTION=YES;'',''set fmtonly off EXEC distribution..sp_replmonitorsubscriptionpENDingcmds @publisher= '''''+ @publisher +''''' ,@subscription_type=0, @publisher_db= '''''+ @publisher_db +''''',@publication = '''''+ @publication+''''',@subscriber= '''''+@subscriber+''''' ,@subscriber_db='''''+@subscriber_db+''''''')
'
)

FETCH NEXT FROM subscribers INTO @publisher,@publisher_db ,@publication ,@subscriber ,@subscriber_db

END

CLOSE subscribers DEALLOCATE subscribers

SELECT
Pending_commands.*
,comment.comments
,comment.delivery_latency 'Delivery_latency MSs'
,comment.time 'Time of message'
,CASE comment.runstatus

when 1 then 'Started'

when 2 then 'Succeeded'

when 3 then 'In progress'

when 4 then 'Idle'

when 5 then 'Retrying'

when 6 then 'Failed ' END status ,

CASE Info.subscription_type When 0 then 'Push' When 1 then 'Pull' When 2 then 'Anonymous' END 'Subscription Type'
,Info.name 'Distribution agent name'
,jobs.name 'Distribution_agent_job'

FROM

#tempsub1 Info

inner join

#tempsub2 PENDing_commands

on Info.publisher_db = PENDing_commands.publisher_db
and Info.publication = PENDing_commands.publication
and Info.subscriber = PENDing_commands.subscriber
and Info.subscriber_db = PENDing_commands.subscriber_db

left outer join msdb..sysjobs jobs

on Info.job_id=jobs.job_id

inner join
(
SELECT time, agent_id ,runstatus,delivery_latency, comments,row_number() over ( partition by agent_id order by time desc ) as pos

FROM distribution..MSdistribution_history
)comment

on comment.agent_id = Info.id

where comment.pos =1 ;
GO

DROP TABLE #tempsub1 ;
go
exec sp_configure 'Ad Hoc Distributed Queries', 0
go
RECONFIGURE
go

Using the GUI and the wizard get the code of the current publisher replication configuration. Save the T-sql to execute it once you have cleaned the agent jobs.

The code above gives you many information once you have the name of the agent jobs proceed to delete them. Finally execute the script in the publisher server to recreate the replication as follow:

The code below was generated using SSMS

-----------------------------------------------------------------------------------
--script to recreate the subscriptions
-----------------------------------------------------------------------------------
-- Enabling the replication database
use master
exec sp_replicationdboption @dbname = N'Xms', @optname = N'publish', @value = N'true'
GO

exec [Xms].sys.sp_addlogreader_agent @job_login = null, @job_password = null, @publisher_security_mode = 0, @publisher_login = N'lms_replication', @publisher_password = N''
GO
-- Adding the transactional publication
use [Xms]
exec sp_addpublication @publication = N'TestXms', @description = N'Transactional publication of database ''Xms'' from Publisher ''MSSQLCLU''.', @sync_method = N'concurrent', @retention = 0, @allow_push = N'true', @allow_pull = N'true',<strong> @allow_anonymous = N'false'</strong>, @enabled_for_internet = N'false', @snapshot_in_defaultfolder = N'false', @alt_snapshot_folder = N'\\path\f$\Replication', @compress_snapshot = N'false', @ftp_port = 21, @ftp_login = N'anonymous', @allow_subscription_copy = N'false', @add_to_active_directory = N'false', @repl_freq = N'continuous', @status = N'active', @independent_agent = N'true', <strong>@immediate_sync = N'false', @allow_sync_tran = N'false',</strong> @autogen_sync_procs = N'false', @allow_queued_tran = N'false', @allow_dts = N'false', @replicate_ddl = 1, @allow_initialize_from_backup = N'true', @enabled_for_p2p = N'false', @enabled_for_het_sub = N'false'
GO
exec sp_addpublication_snapshot @publication = N'Testxms', @frequency_type = 1, @frequency_interval = 0, @frequency_relative_interval = 0, @frequency_recurrence_factor = 0, @frequency_subday = 0, @frequency_subday_interval = 0, @active_start_time_of_day = 0, @active_end_time_of_day = 235959, @active_start_date = 0, @active_end_date = 0, @job_login = null, @job_password = null, @publisher_security_mode = 0, @publisher_login = N'xms_replication', @publisher_password = N''
exec sp_grant_publication_access @publication = N'Testxms', @login = N'sa'
GO
exec sp_grant_publication_access @publication = N'Testxms', @login = N'domain\loginX'
GO

Finally, attached is a short manual on how to initialize a transnational subscription from a backup. This document comes from Microsoft support. Initialize a Transactional Subscription from a Backup.

DM