Friday, 6 July 2012

Resolving issues with the Search Query component when the server hosting the query component goes offline and cannot be restored

I suspect that the MinimumReadyQueryComponentsPerPartition property is set to 2 (the default), which would cause problems if one of the Query Components were not available.



1.       Verify the current value for MinimumReadyQueryComponentsPerPartition using the following shell commands:
$ssa = Get-SPEnterpriseSearchServiceApplication "Your Search Service Application Name"
$ssa.MinimumReadyQueryComponentsPerPartition

·         This property defines the minimum number of query components used in each index partition
·         The value was set to 2 (the default value)
o        When the query server in ATT-2 was taken offline, the farm only had 1 query component available for this index partition (i.e. below the ‘Minimum’ value)
o        Secondly, during a crawl, the crawler attempts to propagate to the query components
§         With only 1 query component available and a ‘Minimum’ of 2 required, the propagation could not complete, making the crawl appear hung (crawl stuck on "Running")
§         This apparently also left the query component in a state that prevented queries from completing

2.       Updated the value for MinimumReadyQueryComponentsPerPartition using the following:
$ssa.MinimumReadyQueryComponentsPerPartition = 1
$ssa.update()
·         Upon updating, the query component from the offline server was no longer present in the UI

3.       We then removed the inactive query topology using:
$ssa | Get-SPEnterpriseSearchQueryTopology
$oldTopo = Get-SPEnterpriseSearchQueryTopology -identity [Id-GUID-value-for-inactive-qTopo] -SearchApplication $ssa
$oldTopo | Remove-SPEnterpriseSearchQueryTopology
4.       When we attempted to remove the Query topology in ‘Deactivating’ state, it failed because it was not inactive
o        We attempted to use $oldTopo.cancelActivate(), but this made no difference
o        Attempting to stop the currently running crawl also made no difference
5.       We overcame the pinned crawl and the deactivating query component by resetting the index
o        Afterwards, the ‘Deactivating’ Query topology went to an ‘Inactive’ state, which allowed it to then be removed
6.       We were then able to start a full crawl
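
Steps 3 to 5 above depend on the state of each query topology: only an ‘Inactive’ topology could be removed. The following is a minimal sketch for sorting topologies by that rule. The function name Get-RemovableQueryTopology and the sample objects are hypothetical, and it assumes each topology object exposes Id and State properties whose State reads as 'Active', 'Inactive', 'Deactivating' and so on.

```powershell
# Sketch only: report which query topologies are in a state that allows removal.
# Assumption: each topology has Id and State properties (State text such as 'Inactive').
function Get-RemovableQueryTopology {
    param(
        [Parameter(Mandatory = $true)]
        [object[]]$QueryTopology
    )
    foreach ($topo in $QueryTopology) {
        $state = "$($topo.State)"
        New-Object PSObject -Property @{
            Id        = $topo.Id
            State     = $state
            Removable = ($state -eq 'Inactive')   # per step 4, anything else fails to remove
        }
    }
}

# Stand-in objects so the sketch runs without a farm. On a real farm, pass
#   @($ssa | Get-SPEnterpriseSearchQueryTopology)
$sample = @(
    (New-Object PSObject -Property @{ Id = 'topo-A'; State = 'Active' }),
    (New-Object PSObject -Property @{ Id = 'topo-B'; State = 'Deactivating' }),
    (New-Object PSObject -Property @{ Id = 'topo-C'; State = 'Inactive' })
)
$report = @(Get-RemovableQueryTopology -QueryTopology $sample)
$report | Format-Table Id, State, Removable -AutoSize
```

Anything reported as not removable would still need the index reset described in step 5 (or some other way of reaching the ‘Inactive’ state) before Remove-SPEnterpriseSearchQueryTopology can succeed.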







I also tested whether having 4 query components (i.e. two index partitions, with Query Components 1 and 2m on ServerA and Query Components 1m and 2 on ServerB) made any difference, but the crawl also “hung” waiting for query components in my test. Given the definition of this property (“This property defines the minimum number of the query components used in each index partition”), this is expected: with one server down, each partition is still left with only one ready component.
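
The per-partition arithmetic behind that result can be sketched as follows. This is a simplified model of the behaviour described in this post, not SharePoint's actual logic; the function name Get-PartitionBelowMinimum and the Partition/Server fields on the sample objects are hypothetical.

```powershell
# Simplified model: count ready query components per partition and compare to the minimum.
function Get-PartitionBelowMinimum {
    param(
        [Parameter(Mandatory = $true)][object[]]$QueryComponent,
        [Parameter(Mandatory = $true)][string[]]$OfflineServer,
        [int]$Minimum = 2
    )
    foreach ($group in ($QueryComponent | Group-Object -Property Partition)) {
        # A component counts as ready only if its server is not in the offline list
        $ready = @($group.Group | Where-Object { $OfflineServer -notcontains $_.Server })
        if ($ready.Count -lt $Minimum) {
            New-Object PSObject -Property @{
                Partition = $group.Name
                Ready     = $ready.Count
                Minimum   = $Minimum
            }
        }
    }
}

# The four-component layout from the test above: components 1 and 2m on ServerA, 1m and 2 on ServerB
$layout = @(
    (New-Object PSObject -Property @{ Name = '1';  Partition = 1; Server = 'ServerA' }),
    (New-Object PSObject -Property @{ Name = '1m'; Partition = 1; Server = 'ServerB' }),
    (New-Object PSObject -Property @{ Name = '2';  Partition = 2; Server = 'ServerB' }),
    (New-Object PSObject -Property @{ Name = '2m'; Partition = 2; Server = 'ServerA' })
)

# With ServerB offline and a minimum of 2, both partitions fall short
$shortWithTwo = @(Get-PartitionBelowMinimum -QueryComponent $layout -OfflineServer 'ServerB' -Minimum 2)
# With the minimum lowered to 1, neither partition falls short
$shortWithOne = @(Get-PartitionBelowMinimum -QueryComponent $layout -OfflineServer 'ServerB' -Minimum 1)
$shortWithTwo | Format-Table Partition, Ready, Minimum -AutoSize
```

Adding more partitions does not help here because the minimum applies to each partition separately.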

Also, in my test, the Query components on the server I took offline (I simply paused the VM hosting this component) were not marked as Offline or Disabled even after setting this ‘Minimum’ value. Therefore, when I started my crawl after pausing this server, the crawl appeared to hang. However, I was able to manually disable the Query components on this offline server by using the following PowerShell (it’s also worth noting that this took effect immediately, without cloning/activating a topology, and it also took effect during an active crawl):

$qTopo = $ssa | Get-SPEnterpriseSearchQueryTopology -Active
$QCs = Get-SPEnterpriseSearchQueryComponent -QueryTopology $qTopo
$QCs | Format-List
# Using the IDs of the Query Components in this output,
# I entered the Id as the -Identity value below
$p2QC = Get-SPEnterpriseSearchQueryComponent -QueryTopology $qTopo -Identity 111-111-111-111
$p2QC.setOffline()
# Repeat for any Query Component in the now-offline data center
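
The "repeat for any Query Component" step can be sketched as a loop. This is a hedged example: the function name Set-QueryComponentOfflineByServer is hypothetical, it assumes each query component exposes a ServerName property, and it reuses the setOffline() call from the commands above. The sample objects are stand-ins so the sketch runs without a farm.

```powershell
# Sketch: call setOffline() on every query component hosted on one server.
# Assumption: components expose ServerName and the setOffline() method used above.
function Set-QueryComponentOfflineByServer {
    param(
        [Parameter(Mandatory = $true)][object[]]$QueryComponent,
        [Parameter(Mandatory = $true)][string]$ServerName
    )
    foreach ($qc in $QueryComponent) {
        if ($qc.ServerName -eq $ServerName) {
            $qc.setOffline()
            $qc   # emit the component so the caller can see what was changed
        }
    }
}

# Stand-in component with a fake setOffline() that just flips a flag
function New-SampleQueryComponent($Name, $Server) {
    $qc = New-Object PSObject -Property @{ Name = $Name; ServerName = $Server; IsOffline = $false }
    $qc | Add-Member -MemberType ScriptMethod -Name setOffline -Value { $this.IsOffline = $true }
    $qc
}

$sample = @(
    (New-SampleQueryComponent 'query-1' 'ServerA'),
    (New-SampleQueryComponent 'query-2' 'ServerB')
)
$changed = @(Set-QueryComponentOfflineByServer -QueryComponent $sample -ServerName 'ServerB')
$changed | Format-Table Name, ServerName, IsOffline -AutoSize
```

On a real farm the input would be the $QCs collection from the commands above, passed as -QueryComponent @($QCs).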







SharePoint 2010: the internal SharePoint load balancer sends traffic to an offline server, causing SharePoint endpoint failures

When an endpoint fails, a user gets the “Internal Server Error” exception, and the topology service marks the endpoint as “bad” for a configurable period of time (BadListPeriod). This period is 10 minutes by default.

After the “BadListPeriod” has elapsed, the failed endpoint is marked “good” again, and a user may get another “Internal Server Error” exception if the server is still down. At that point, the server is taken out of rotation for another “BadListPeriod”.
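
To see why the default matters during a long outage, here is a rough worst-case estimate of how many times a down endpoint would be put back into rotation and fail again. This is only arithmetic on the behaviour described above; the function name Get-FailedRequestEstimate is hypothetical, and it assumes a request hits the endpoint as soon as it is marked “good” again.

```powershell
# Worst-case model: the endpoint fails at t = 0, then again each time BadListPeriod expires,
# for as long as the server is still down.
function Get-FailedRequestEstimate {
    param(
        [Parameter(Mandatory = $true)][double]$OutageMinutes,
        [Parameter(Mandatory = $true)][double]$BadListPeriodMinutes
    )
    [int][math]::Ceiling($OutageMinutes / $BadListPeriodMinutes)
}

# A 10-hour outage with the default 10-minute BadListPeriod
$withDefault = Get-FailedRequestEstimate -OutageMinutes 600 -BadListPeriodMinutes 10    # 60
# The same outage with BadListPeriod raised to 10 hours
$withTenHours = Get-FailedRequestEstimate -OutageMinutes 600 -BadListPeriodMinutes 600  # 1
"Default period: $withDefault failures; 10-hour period: $withTenHours failure"
```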

You can increase the “BadListPeriod” to the number of hours it will take to bring the server back online. This can be done using the script shown in the next section, which has been created and tested.

That script sets the “BadListPeriod” to 10 hours, which means that after a user gets an error, the endpoint is taken out of rotation for 10 hours.




Set-SPTopologyServiceApplicationProxy (BadListPeriod): SharePoint 2010 endpoint failures

To prevent SharePoint from repeatedly sending requests to a bad server, run the following commands:

------------------EXAMPLE-----------------------
Set-SPTopologyServiceApplicationProxy 67877d63-bff4-4521-867a-ef4979ba07ce -BadListPeriod 1234


Specifies the time period that a node is kept in a bad list.
The type must be a valid value between 1 and 480 (in minutes).
The default value is 10.

Or use the script that I created:

$proxy = Get-SPTopologyServiceApplicationProxy
$proxy.ExpireFailedEndPointsInterval = [System.TimeSpan]::FromHours(10)
$proxy.Update()
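
The parameter description above gives a range of 1 to 480 minutes, while the script sets 10 hours. A small sketch for checking a planned value against that stated range before applying it is shown below. The function name Test-BadListPeriodRange is hypothetical, and whether the cmdlet or the property actually enforces the range was not verified here.

```powershell
# Sketch: convert a planned period to minutes and compare it with the 1-480 range quoted above.
function Test-BadListPeriodRange {
    param(
        [Parameter(Mandatory = $true)][System.TimeSpan]$Period
    )
    $minutes = $Period.TotalMinutes
    New-Object PSObject -Property @{
        Minutes           = $minutes
        InDocumentedRange = ($minutes -ge 1 -and $minutes -le 480)
    }
}

$tenHours   = Test-BadListPeriodRange -Period ([System.TimeSpan]::FromHours(10))
$eightHours = Test-BadListPeriodRange -Period ([System.TimeSpan]::FromHours(8))
$tenHours, $eightHours | Format-Table Minutes, InDocumentedRange -AutoSize
```

If a value falls outside the stated range, it is worth reading the setting back after the update to confirm what was actually stored.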


I did some additional testing and found the following two parameters (I’ve provided additional details below):

·        $topoProxy.ExpireFailedEndPointsInterval = [System.TimeSpan]::FromHours(10)
·        $SSA.TimeBeforeAbandoningQueryComponent = 10

Tuesday, 3 July 2012

Remove or Delete Enterprise Search Query Component

Get-SPEnterpriseSearchServiceApplication | Get-SPEnterpriseSearchQueryTopology | Get-SPEnterpriseSearchQueryComponent | where {$_.Name -eq "8464b697-04df-4cc7-a5b4-c5e1a4e38f51-query-1"} | Remove-SPEnterpriseSearchQueryComponent


Display your SharePoint Farm Topology via power shell

Run the commands below in the SharePoint PowerShell console (copy and paste them into the window):

$searchApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

$InitialQueryTopology = $searchApp | Get-SPEnterpriseSearchQueryTopology -Active

$QueryTopology = $searchApp | New-SPEnterpriseSearchQueryTopology -Clone -QueryTopology $InitialQueryTopology

$qcomp = $QueryTopology | Get-SPEnterpriseSearchQueryComponent

$qcomp | Remove-SPEnterpriseSearchQueryComponent


Monday, 2 July 2012

List your entire SharePoint 2010 search topology in your search farm

This script lists your entire SharePoint 2010 / SharePoint 2013 search topology in your SharePoint search farm.
You will get a detailed list of the search topology and configuration. Use the PowerShell commands below to see everything in full detail:



function Get-SPSearchServiceAppInfo()
{
    $searchapps = Get-SPEnterpriseSearchServiceApplication
    foreach ($ssa in $searchapps) {
        Write-Host "======================================================================================"
        Write-Host -ForegroundColor DarkCyan $ssa.Name
        Write-Host "======================================================================================"
        # Dump the search service application itself
        $ssa
        foreach ($ct in $ssa.CrawlTopologies) {
            Write-Host "----------------------------------------------------------------------------------"
            Write-Host -ForegroundColor DarkCyan "*** Crawl Topology - " $ssa.Name "***"
            $ct
            Write-Host "----------------------------------------------------------------------------------"
            Write-Host -ForegroundColor DarkCyan "*** Crawl Components - " $ssa.Name "***"
            $ct.CrawlComponents
            Write-Host "----------------------------------------------------------------------------------"
        }
        foreach ($qt in $ssa.QueryTopologies) {
            Write-Host "----------------------------------------------------------------------------------"
            Write-Host
            Write-Host -ForegroundColor DarkCyan "*** Query Topology - " $ssa.Name "***"
            $qt
            Write-Host -ForegroundColor DarkCyan "*** Query Components - " $ssa.Name "***"
            $qt.QueryComponents
            Write-Host
        }
    }
}
Get-SPSearchServiceAppInfo

SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure Process Name: w3wp Process ID: 10720

SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure
Process Name: w3wp
Process ID: 10720
AppDomain Name: /LM/W3SVC/994389949/ROOT-1-129855207409220439
AppDomain ID: 2
Service Application Uri: urn:schemas-microsoft-com:sharepoint:service:ae4bee41343344b8a2dc69cd2b41d7ba#authority=urn:uuid:4f9c761a93684405b938c2739ccb0cb9&authority=https://spplpb2sbp65212:32844/Topology/topology.svc
Active Endpoints: 1
Failed Endpoints:1
Affected Endpoint: http://spplpb3abx25212:32843/ae4bee41343344b8a2dc69cd2b41d7ba/SearchService.svc



SharePoint Topology not handling offline servers (clone the topology, delete the query component for the server that was turned off)

You will get a lot of issues if you switch off the server that hosts the search Admin component or one of the search Query or Crawl components. You will see a lot of errors and endpoint failures in the logs. To recover from these issues, perform the following steps:

Clone the topology and delete the query component for the server that was turned off, using the following:
$searchApp = Get-SPEnterpriseSearchServiceApplication "Search Service Application"

$InitialQueryTopology = $searchApp | Get-SPEnterpriseSearchQueryTopology -Active

$QueryTopology = $searchApp | New-SPEnterpriseSearchQueryTopology -Clone -QueryTopology $InitialQueryTopology

$qcomp = $QueryTopology | Get-SPEnterpriseSearchQueryComponent

$qcomp | Remove-SPEnterpriseSearchQueryComponent

- We selected Y for the component we wanted to remove and N for the query component on the server we wished to keep.
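
As an alternative to answering Y/N for each component, the removal can be limited to one server by filtering first. This is a sketch under assumptions rather than a tested procedure: the function name Remove-QueryComponentFromClone is hypothetical, it assumes query components expose a ServerName property, and it can only run in the SharePoint 2010 Management Shell on a farm server.

```powershell
# Sketch: clone the active query topology and remove only the components on one server.
# Assumption: each query component has a ServerName property to filter on.
function Remove-QueryComponentFromClone {
    param(
        [Parameter(Mandatory = $true)][string]$SearchApplicationName,
        [Parameter(Mandatory = $true)][string]$OfflineServerName
    )
    $searchApp = Get-SPEnterpriseSearchServiceApplication $SearchApplicationName
    $active    = $searchApp | Get-SPEnterpriseSearchQueryTopology -Active
    $clone     = $searchApp | New-SPEnterpriseSearchQueryTopology -Clone -QueryTopology $active

    # Only components hosted on the offline server reach the remove cmdlet,
    # which still asks for confirmation on each one
    $clone | Get-SPEnterpriseSearchQueryComponent |
        Where-Object { $_.ServerName -eq $OfflineServerName } |
        Remove-SPEnterpriseSearchQueryComponent

    # Return the edited clone so the caller can inspect it
    $clone
}

# Example call (server name is a placeholder):
#   $clone = Remove-QueryComponentFromClone -SearchApplicationName "Search Service Application" -OfflineServerName "ServerB"
# The clone stays inactive until it is activated. The steps above do not show that part;
# Set-SPEnterpriseSearchQueryTopology -Identity $clone -Active is the cmdlet I would expect to use.
```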


SharePoint 2010 error in server logs: Windows detected your registry file is still in use by other applications or services. The file will be unloaded now. The applications or services that hold your registry file may not function properly afterwards

Basically, some registry entry was holding onto the fact that query component 5 was still part of the farm even though it wasn’t. You will start to get COM errors and SharePoint Topology endpoint failures as well as the errors highlighted below. The error is most often caused by a certain OS setting, and may have been triggered by logging off the old query server and then shutting it down. Here is what to do on the servers to ensure this does not occur in future:

Change the behavior of the OS using the following procedure:
- On each SharePoint server, start gpedit.msc
- Go to Computer Configuration > Administrative Templates > System > User Profiles
- Enable the policy "Do not forcefully unload the user registry at user logoff"
- Reboot each server after enabling the policy so that the change takes effect.

See below for further information:


SharePoint Log Error

You'll see events like this in the application event log:
Log Name:      Application
Source:        Microsoft-Windows-User Profiles Service
Date:          10/26/2009 8:22:13 AM
Event ID:      1530
Task Category: None
Level:         Warning
Keywords:      Classic
User:          SYSTEM
Computer:      SVR2008-SP2010
Description: Windows detected your registry file is still in use by other applications or services. The file will be unloaded now. The applications or services that hold your registry file may not function properly afterwards. 
DETAIL -
1 user registry handles leaked from \Registry\User\S-1-5-21-1049297961-3057247634-349289542-1004_Classes:
Process 2428 (\Device\HarddiskVolume1\Windows\System32\dllhost.exe) has opened key \REGISTRY\USER\S-1-5-21-1049297961-3057247634-349289542-1004_CLASSES
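
To check whether a server has logged this warning, the Application log can be filtered on the source and event ID shown above. This is a minimal sketch: the function name Get-RegistryUnloadWarning and the default of 20 events are my own choices, and it needs to run on the Windows server being checked.

```powershell
# Sketch: list recent Event ID 1530 warnings from the User Profiles Service.
function Get-RegistryUnloadWarning {
    param(
        [int]$MaxEvents = 20
    )
    Get-WinEvent -FilterHashtable @{
        LogName      = 'Application'
        ProviderName = 'Microsoft-Windows-User Profiles Service'
        Id           = 1530
    } -MaxEvents $MaxEvents -ErrorAction SilentlyContinue |
        Select-Object TimeCreated, MachineName, Id, Message
}

# Example: show the five most recent occurrences
#   Get-RegistryUnloadWarning -MaxEvents 5 | Format-List
```

The Message field includes the process that still held the registry handle (dllhost.exe in the example above), which helps tie the warning to a COM+ application.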

Problem Description: You have a COM+ server application. The application is set to run as a particular user. After working for some time on Windows Server 2008, the application may stop working and keep failing. Unless you restart the COM+ application, it won’t come back. In the meantime you may see an error like this in the application event log on the CLIENT machine:

Event Type:        Error
Event Source:    DCOM
Event Category:                None
Event ID:              10006
Date:                     10/17/2009
Time:                    1:36:39 PM
User:                     Domain\user
Computer:          *****
Description:
DCOM got error "Unspecified error " from the computer ‘servername’ when attempting to activate the server: {EF047BF9-F91A-4D5B-A18F-BED49553703B}
In this case the event message tells you that the error (E_FAIL or 80004005 or Unspecified error ) is returned from the server during activation vs. a method call. The component CLSID is {EF047BF9-F91A-4D5B-A18F-BED49553703B}

The Cause: The identity user initially logged on to the server when the application launched. The issue happens when the identity user logs off and the COM+ application can no longer read registry keys in the profile of the identity user, because a new User Profile Service feature on Windows 2008 forces the unload of the user profile when the user logs off. Note that this feature is built into the OS and is on by default. This is a situation where forcing the unload of the user profile may break an application if registry handles are not closed in the process.
If you enable COM tracing, you’ll see the error ERROR_KEY_DELETED in the ole32 trace log:
[2] 0BA8.15D0::10/17/2009-13:07:54.390 [OLECOM](:CComRegCatalog::GetClassInfoW) CLSID:ecabafae-7f19-11d2-978e-0000f8757e2a 1018(ERROR_KEY_DELETED)
[2] 0BA8.15D0::10/17/2009-13:07:54.390 [OLECOM](:CComCatalog::GetClassInfoInternal) CLSID:ecabafae-7f19-11d2-978e-0000f8757e2a Flags:0 IID:00000000-0000-0000-c000-000000000046 0x800703fa(ERROR_KEY_DELETED)
You'll see events like this in the application event log:
Log Name:      Application
Source:        Microsoft-Windows-User Profiles Service
Date:          10/26/2009 8:22:13 AM
Event ID:      1530
Task Category: None
Level:         Warning
Keywords:      Classic
User:          SYSTEM
Computer:      DAVIDQIU2008
Description:
Windows detected your registry file is still in use by other applications or services. The file will be unloaded now. The applications or services that hold your registry file may not function properly afterwards. 
DETAIL -
1 user registry handles leaked from \Registry\User\S-1-5-21-1049297961-3057247634-349289542-1004_Classes:
Process 2428 (\Device\HarddiskVolume1\Windows\System32\dllhost.exe) has opened key \REGISTRY\USER\S-1-5-21-1049297961-3057247634-349289542-1004_CLASSES
Resolution: As a workaround it may be necessary to disable this feature, which is the default behavior. The policy setting 'Do not forcefully unload the user registry at user logoff' counters the default behavior of Windows 2008. When enabled, Windows 2008 does not forcefully unload the registry and waits until no other processes are using the user registry before it unloads it.
The policy can be found in the group policy editor (gpedit.msc):
Computer Configuration -> Administrative Templates -> System -> User Profiles
Do not forcefully unload the user registry at user logoff
Change the setting from “Not Configured” to “Enabled”, which disables the new User Profile Service feature.
'DisableForceUnload' is the value added to the registry
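
Whether the policy has been applied on a server can be checked by reading that value back. This is a sketch under an assumption: the registry key HKLM:\SOFTWARE\Policies\Microsoft\Windows\System is where I would expect this policy to store 'DisableForceUnload', but that path is not given above, so confirm it on your own servers. The function name Test-DisableForceUnload is hypothetical.

```powershell
# Sketch: return $true if DisableForceUnload is present and set to 1 under the given key.
# Assumption: the default path below is where the group policy writes the value.
function Test-DisableForceUnload {
    param(
        [string]$Path = 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\System'
    )
    try {
        $item = Get-ItemProperty -Path $Path -Name 'DisableForceUnload' -ErrorAction Stop
        return ($item.DisableForceUnload -eq 1)
    }
    catch {
        # Key or value missing: treat the policy as not enabled
        return $false
    }
}

# Example: run on each SharePoint server after the reboot
#   Test-DisableForceUnload
```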

Note that the same issue can happen on Vista, Windows 7 and Windows 2008 R2.