Troubleshooting Exchange Event ID 4002 from MSExchange Availability

This blog post is about a strange incident I had with a fresh Exchange 2016 two-node DAG.

The environment is a virtual VMware environment. I was hired by a customer to migrate from a working Exchange 2013 environment to a new Exchange 2016 deployment. The customer had a relatively simple setup with a single AD site and nothing more.

I installed the new Exchange servers and configured the environment accordingly, setting up the DAG, configuring mail flow and so on. I then proceeded with the pilot users and did some testing to confirm the environment was OK. Everything checked out, the customer moved all users from Exchange 2013 to 2016, and the 2013 servers were decommissioned without issue.

After a couple of months we suddenly experienced free-busy problems. Users with a mailbox on one node could not see free-busy information for users on the other node. This started happening out of the blue, without any changes being made in the environment. We also started to see Event ID 4002 in the event log on the server trying to do the free-busy lookup.

Process 17932: ProxyWebRequest CrossSite from S-1-5-21-1409082233-1343024091-725345543-35887 to https://dagmember02.domain.com:444/EWS/Exchange.asmx failed. Caller SIDs: NetworkCredentials. The exception returned is Microsoft.Exchange.InfoWorker.Common.Availability.ProxyWebRequestProcessingException: Proxy web request failed. ---> System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send. ---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host. ---> System.Net.Sockets.SocketException: An existing connection was forcibly closed by the remote host
at System.Net.Sockets.Socket.EndReceive(IAsyncResult asyncResult)
at System.Net.Sockets.NetworkStream.EndRead(IAsyncResult asyncResult)
--- End of inner exception stack trace ---
at System.Net.TlsStream.EndWrite(IAsyncResult asyncResult)
at System.Net.ConnectStream.WriteHeadersCallback(IAsyncResult ar)
--- End of inner exception stack trace ---
at System.Net.HttpWebRequest.EndGetResponse(IAsyncResult asyncResult)
at Microsoft.Exchange.InfoWorker.Common.Availability.Proxy.RestService.EndGetUserPhoto(IAsyncResult asyncResult)
at Microsoft.Exchange.InfoWorker.Common.UserPhotos.UserPhotoApplication.EndProxyWebRequest(ProxyWebRequest proxyWebRequest, QueryList queryList, IService service, IAsyncResult asyncResult)
at Microsoft.Exchange.InfoWorker.Common.Availability.ProxyWebRequest.EndInvoke(IAsyncResult asyncResult)
at Microsoft.Exchange.InfoWorker.Common.Availability.AsyncWebRequest.EndInvokeWithErrorHandling()
--- End of inner exception stack trace ---
. Name of the server where exception originated: dagmember01. LID: 43532. Make sure that the Active Directory site/forest that contain the user's mailbox has at least one local Exchange 2010 server running the Availability service. Turn up logging for the Availability service and test basic network connectivity.
I immediately started searching for a solution to this strange behaviour, but there was no working solution to be found anywhere (I read through most of the posts on this event ID on the internet :)). All other services in the Exchange environment were working fine, and there were no other error messages in the logs indicating that something was wrong, just this Event ID 4002 from time to time when people were trying to add someone to a meeting using the Scheduling Assistant.
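The event text itself suggests turning up logging for the Availability service and testing basic network connectivity. A quick sanity check of the latter (just a sketch; the hostname and port are taken from the event above) is to verify that the server logging the event can reach the other DAG member's back-end EWS endpoint:

Test-NetConnection dagmember02.domain.com -Port 444

If port 444 is blocked between the DAG members, cross-server free-busy proxying will fail, so it is worth ruling that out before digging deeper.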
After quite a while of research and asking a couple of colleagues, the solution suddenly appeared.

A colleague of mine asked me whether or not the customer had used templates to create the VMs. After checking with the customer, we could confirm this. He told me that he had seen some similarly strange behaviour on Exchange 2007 a few years earlier, and asked me to check whether the servers had unique SIDs. I did so, and discovered that both of the new Exchange 2016 servers had identical SIDs. The tool I used was PsGetSid from Microsoft Sysinternals.
It turned out that the servers had been created from a template in VMware and never sysprepped. After removing one of the servers and reinstalling it, everything started working fine again.
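For reference, checking this is quick with PsGetSid. Run it against both servers and compare the machine SIDs (the C:\Tools path is just where I happen to unpack the Sysinternals tools; adjust it to wherever you keep them):

C:\Tools\PsGetsid.exe \\dagmember01
C:\Tools\PsGetsid.exe \\dagmember02

If both commands return the same SID, the servers were cloned without being sysprepped.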
Bottom line is:
If your Exchange servers start acting weird and there doesn't seem to be a logical explanation for the problem, check the server SIDs. They have to be unique, or, obviously, strange things can start to happen in your environment. In my case there was no obvious reason for the problems that suddenly started to appear, and the server setup had been made in good faith 🙂
This might be a noob mistake, but I can imagine that someone other than me has run into this or other strange problems with no logical explanation, so I think the tip could be useful when everything else leads nowhere.
The weird part here is that the servers worked 100% fine for a couple of months before the problems started. I've never experienced this before, so for all I know that's just how Exchange handles this kind of misconfiguration?

Exchange 2013 Event ID 1039: Failed to detect the bitlocker state for EDS log drive 'C:\'.

I came across this event on an Exchange 2013 CU9 server which I was configuring for a customer.

(Screenshot: Event ID 1039, Exchange 2013 CU9)

Searching for solutions to this event, I learned that it has been showing up since Exchange 2013 CU7. The fix is quite simple and has no impact on the Exchange system.

Simply disable the Bitlocker check on the drive where the diagnostics root directory exists.

Open the file below in Notepad (run as admin):
C:\Program Files\Microsoft\Exchange Server\V15\bin\Microsoft.Exchange.Diagnostics.Service.exe.config

Change the parameter "DriveLockCheckEnabled" from value="True" to value="False" and save the config file.

<!-- Settings used when checking Bitlocker state of the drive where the diagnostics root directory exists -->
<add key="DriveLockCheckEnabled" value="False" />
<add key="DriveLockCheckInterval" value="00:00:10" />
<add key="DriveLockMaxDuration" value="00:04:00" />

Restart the Microsoft Exchange Diagnostics service, and the error message is gone.
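If you would rather do the edit from an elevated PowerShell prompt, here is a minimal sketch. It assumes the key lives under appSettings in that config file (which is where the snippet above comes from) and that the service name is MSExchangeDiagnostics:

$configPath = 'C:\Program Files\Microsoft\Exchange Server\V15\bin\Microsoft.Exchange.Diagnostics.Service.exe.config'
[xml]$config = Get-Content $configPath
# Locate the DriveLockCheckEnabled appSetting and disable the check
$setting = $config.configuration.appSettings.add | Where-Object { $_.key -eq 'DriveLockCheckEnabled' }
$setting.value = 'False'
$config.Save($configPath)
# The service only reads the config at startup, so restart it to pick up the change
Restart-Service MSExchangeDiagnostics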

Failed mailbox migration to Office 365: mrsproxy.svc' failed because no service was listening on the specified endpoint. The remote server returned an error: (404) Not Found

I was doing a migration between Exchange 2013 and Office 365 in a hybrid configuration when I received the above error message. I couldn't quite figure out why until I stumbled across a forum thread that pointed me in the right direction.

This is what you have to check out and remediate if you have this error:

The ExchangeGuids of the on-premises users are different from the ExchangeGuids of the corresponding users in Office 365.

Update the online user's ExchangeGuid to match the on-premises ExchangeGuid and start the migration.

1. On the on-premises Exchange server:

Get-Mailbox -Identity userID | Select ExchangeGuid

2. In an Office 365 PowerShell session:

Get-MailUser -Identity UserID | Select ExchangeGuid

If the results don’t match, copy the guid result from command 1 and then run the following command in the Office 365 PowerShell session:

Set-MailUser -Identity userID -ExchangeGuid "copied guid"
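If several users are affected, the same fix can be batched. This is only a rough sketch: it assumes you have already exported the on-premises GUIDs to a CSV with UserPrincipalName and ExchangeGuid columns (both column names are my own, not from the original steps) and that you run it in the Office 365 session:

Import-Csv .\exchangeguids.csv | ForEach-Object {
    Set-MailUser -Identity $_.UserPrincipalName -ExchangeGuid $_.ExchangeGuid
}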

Start migration:

$Cred = Get-Credential

$s = New-PSSession -ConfigurationName Microsoft.Exchange -ConnectionUri https://ps.outlook.com/powershell -Credential $Cred -Authentication Basic -AllowRedirection

Import-PSSession $s

$OnPremAdmin = Get-Credential

New-MoveRequest -Identity "UPN" -Remote -RemoteHostName "mail.domain.com" -RemoteCredential $OnPremAdmin -TargetDeliveryDomain domain.mail.onmicrosoft.com
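For reference, -RemoteHostName is the externally reachable on-premises namespace, typically the same one you use for OWA/EWS (mail.domain.com in the example above); the /EWS/mrsproxy.svc endpoint from the error message lives behind that name. Once the move request has been accepted, you can follow its progress from the same Office 365 session (not part of the original fix, just a convenience):

Get-MoveRequest -Identity "UPN" | Get-MoveRequestStatistics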