Monday, June 4, 2012

Ongoing Indexing Issues on DAG02

See the update to this problem here.

We've been having issues with indexing on one DAG, and we thought we had it fixed when we went to Service Pack 1, Rollup 6. This indexing issue doesn't happen very often, and I have to learn everything all over again each time. Putting all my notes here gives me one place to refresh my view of the history.

Before Rollup 6 we would see these issues...
People would call and say "I click search and it just spins ... nothing is ever returned."

Do a "Get-MailboxDatabaseCopyStatus DB01" and it reports all is well.
Odd that Get-MailboxDatabaseCopyStatus would say the Content Index is fine when it obviously isn't. You could fail over to another database copy and have the same issue: reports good, but search doesn't return anything.

(During this, we also found that some clients would search and return results, but when you clicked on one of the items, you'd get an error message: "Could not display item." This turned out to be a client issue, and an update to a newer version cleared it up. But it sent us on a wild goose chase for a bit.)

To fix the "index-don't-work-but-reports-good" issue, we would log onto the server where the passive database copy lived, then:
  1. Suspend the database copy
  2. Stop both indexing services: "Microsoft Search (Exchange)" Service and "Microsoft Exchange Search Indexer" Service
  3. Remove the catalog for that database
  4. Start both Indexing services
  5. Resume the database copy
(This link shows a better way to reset these indexes. Although it's essentially the same thing, it's done via a script already written for you.
http://blogs.msdn.com/b/pepeedu/archive/2010/12/14/exchange-2010-database-copy-on-server-has-content-index-catalog-files-in-the-following-state-failed.aspx)
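The five steps above can be sketched in PowerShell. This is just a sketch for my notes: the service short names (msftesql-Exchange and MSExchangeSearch) and the CatalogData folder layout are what I believe they are on Exchange 2010, so verify them on your own servers before running anything.

```powershell
# Run on the server holding the PASSIVE copy. Service short names and the
# catalog folder naming (CatalogData-<GUID>-<GUID> next to the EDB) are
# assumptions -- verify in your environment first.

Suspend-MailboxDatabaseCopy "DB01\MBX02" -Confirm:$false

# Stop the indexer first; it depends on Microsoft Search (Exchange)
Stop-Service MSExchangeSearch
Stop-Service msftesql-Exchange

# Remove the content index catalog for this database
$edbFolder = Split-Path (Get-MailboxDatabase DB01).EdbFilePath.PathName
Get-ChildItem $edbFolder -Filter "CatalogData-*" | Remove-Item -Recurse -Force

Start-Service msftesql-Exchange
Start-Service MSExchangeSearch

Resume-MailboxDatabaseCopy "DB01\MBX02"
```

The ResetSearchIndex.ps1 script from the link above does essentially this for you.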

This generated a "Crawl" of the database to rebuild the index from scratch. After the crawl is complete, you can activate that copy of the database and then update the Content Index on the now-passive database copy using:

Update-MailboxDatabaseCopy DB01\MBX02 -CatalogOnly -Force

I still needed a way to know when this was failing, and not by a customer calling and saying "I ain't getting nuthin."

Test-ExchangeSearch to the rescue -- at least this was something real. The test adds a message to the System mailbox and then checks to see if it is available through the Search Indexer. An actual honest to goodness test!

Get-MailboxDatabase -Server MBX02 | Test-ExchangeSearch

This command gets us the results we want. It tells us if the indexing is crap (Result = -1) or how long it takes to retrieve the results. Sometimes this returns a MAPI error, but you run the test again and it returns a value just fine. Maybe by testing it you woke it up?
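A quick way to surface only the broken databases is to filter on that -1. The property names below (ResultFound, SearchTimeInSeconds) are from memory on Exchange 2010; check them with Get-Member before relying on this.

```powershell
# Show only databases where the test message never came back from the indexer
Get-MailboxDatabase -Server MBX02 |
    Test-ExchangeSearch |
    Where-Object { $_.ResultFound -eq -1 } |
    Select-Object Database, ResultFound, SearchTimeInSeconds
```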

After Rollup 6
We never found out why those indexes were getting corrupted, or why it happened only on that DAG. We have 7 other DAGs that don't have that issue. But after Rollup 6, we never saw those indexing issues again.
No, not those issues. We found new ones. But not-so-terrible ones. At least Get-MailboxDatabaseCopyStatus reports correctly now.

And in this latest episode we started seeing a Content Indexing error on the active copy of the database. Update-MailboxDatabaseCopy -CatalogOnly doesn't work on the active copy of a database. And since we are under strict orders not to change anything without Change Management documentation and a 14-day wait for approval, we could not move the database to fix this content index.

I tried to restart the Microsoft Search (Exchange) Service, which in turn tries to restart the Microsoft Exchange Search Indexer Service, and the Indexer service hung. I could restart the indexing duo on any other server in the DAG. It was very clear this database and this server were in a deadlock battle with no relief in sight. In fact, this proved it:

sl $exscripts
.\TroubleShoot-CI.ps1 -Database DB01

So I tried to restart the Microsoft Search (Exchange) Service again, and the Microsoft Exchange Search Indexer hung while stopping again. I loaded up Task Manager and killed the Exchange Search process, which immediately restarted (the Process ID changed). It finally dawned on me that the Microsoft Search (Exchange) Service had never actually restarted, so I found that process and killed it.

Everything started to work again. So the way to clear a deadlock, at least in this case, was to kill the Microsoft Search (Exchange) Service process.
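For next time, here is a rough sketch of that fix: find the PID behind the hung service and kill it, then let the services come back up. The WMI lookup is generic, but the service short name is an assumption -- verify it with Get-Service first.

```powershell
# Find the process behind the Microsoft Search (Exchange) service and kill it.
# 'msftesql-Exchange' is my best guess at the short name -- verify before use.
$svc = Get-WmiObject Win32_Service -Filter "Name='msftesql-Exchange'"
If ($svc.ProcessId -ne 0) {
    Stop-Process -Id $svc.ProcessId -Force
}

# Bring the duo back up in dependency order
Start-Service msftesql-Exchange
Start-Service MSExchangeSearch
```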



This Explains the Troubleshooters:
http://blogs.technet.com/b/exchange/archive/2011/01/18/3411844.aspx

Resetting the Content Indexer to force a new crawl:
http://blogs.msdn.com/b/pepeedu/archive/2010/12/14/exchange-2010-database-copy-on-server-has-content-index-catalog-files-in-the-following-state-failed.aspx


Thursday, May 24, 2012

Enterprise Wide PST Import - Morning Status Report

This is Part 9 in a series of posts about my experience tackling the migration of PST files.
The first post in the series is here.

The next post in the series is here.


This is the report I get each morning. The information is derived from several sources.

First, we took a snapshot of our starting point in this project. That was done on Oct 9th, 2011, which is why all the calculations are done from that date.

We run a report every Sunday. It looks at all users, checks whether they have a HomeDirectory defined, and whether they have any PST files there. We log each PST file, so we know when new ones appear and when one is removed.
It's just a simple CSV file.
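The Sunday report is roughly this shape (the report path is a placeholder, not our real one):

```powershell
# Inventory every PST on every user's home directory into a CSV.
# \\server\reports\... is a placeholder path.
Get-ADUser -Filter * -Properties HomeDirectory |
    Where-Object { $_.HomeDirectory } |
    ForEach-Object {
        $user = $_
        Get-ChildItem $user.HomeDirectory -Filter *.pst -Recurse -ErrorAction SilentlyContinue |
            Select-Object @{n='User';e={$user.SamAccountName}},
                          FullName,
                          @{n='SizeMB';e={[math]::Round($_.Length / 1MB)}},
                          LastWriteTime
    } | Export-Csv \\server\reports\PSTInventory.csv -NoTypeInformation
```

Diffing this week's CSV against last week's is how we spot new and removed PST files.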

And every day, we get a list of all the people that have an Archive Mailbox and check each one again. Has that file been removed? Is the PST file in use? Etc. The user also gets a reminder message about PST files still attached to Outlook, with instructions on how to detach those PST files. The user gets a message every 14 days from the day they were migrated.

We run another process manually on Wednesdays that looks at how long it's been since a PST file was last accessed. If it is older than 30 days, it tries to move the PST file to the user's local hard drive. Assuming there is space to hold it, of course.

New Hires are put in a GPO that "Disallows PST Growth." Archive Mailbox users are put in that same GPO. We are calling these people "Controlled Users" because we can control their PST file habit.

In the gap between New Hires and Archive Mailbox users are the "Uncontrolled Users": users not in the GPO and free to add messages and create new PST files.

The ultimate goal of this project is the removal of the PST files from the HomeDirectory.


What this report means:

· We have removed 4.3TB of PST data off Home shares
· Uncontrolled users (Not Online PST Users and Not New Hires) added 1TB of data to new PST files since Oct 2011
· Uncontrolled users have added, to their existing PST files, 2TB of data
· Leaving us a net removal of 1.3TB
· Uncontrolled Users added 55 PST files just last week.
· The 42% is: NumberOfPSTFilesRemoved / NumberOfPSTFilesOnOct92011


Percent Complete: 42%
People with Archive MBX: 1,585

All Users with PST files on HomeShares
                           9-Oct-11      24-May-2012   Diff
  Number of PSTs           23,516        18,739        4,777
  Size on Shares (MB)      12,396,679    11,068,508    1,328,171
  Users with PSTs on H:    3,004         2,506         498

                           Since Beginning   Last Week
  New PSTs on Shares       2,293             55
  Size of New PSTs (MB)    1,063,297         29,538

Users w/Archive Mailbox: PST files on HomeShares
                           Discovered    Processed    Removed
  Number of PST Files      14,248        8,810        9,916
  Size on Shares (MB)      6,992,846     3,495,739    4,346,864




Introduction: The Beginnings
Part 1: Script Requirements
Part 2: Add-PSTImportQueue
Part 3: Process-PSTImportQueue
Part 4: Some Tools we added
Part 5: Set-PSTImportQueue
Part 6: About PST Capture
Part 7: More PST Import Tools
Part 8: Using RoboCopy
Part 9: Morning Status Report
Part 10: Using BITS Transfer
Part 11: Get the script / Set up
Part 12: The Functions


Monday, May 21, 2012

Enterprise Wide PST Import - Using RoboCopy

This is Part 8 in a series of posts about my experience tackling the migration of PST files.
The first post in the series is here.
The next post in this series is here.

Our users are worldwide, and even though we are concentrating on our Headquarters, many other users are seeing the benefit of importing their PST files. Especially those who travel a lot. Nearly every one of these heavy travelers carries an external hard drive with their PST files. Now they can get those messages via OWA and discard the extra weight of the external hard drive.

We started getting many requests for Archive mailboxes from all over the world. The challenge is to copy the PST file to our staging area so it can be processed by the server.

Alas, Copy-Item seems to be really slow when working over a WAN link. And many of our pipes can get saturated. I started experimenting with Robocopy; I am not entirely happy with the results, but it's better than Copy-Item.

I created a small standalone script (RoboCopy-Item) that does a very simple copy of a file. I am experimenting with settings like /IPG:300, /Z, etc., trying to find the best overall throughput for our environment. Fast, but not choking the WAN. I am still experimenting there.
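For the curious, RoboCopy-Item is little more than a wrapper around robocopy; something along these lines (the retry and wait values here are illustrative, not what we settled on):

```powershell
# Thin wrapper around robocopy for a single file.
# /Z       = restartable mode, so an interrupted WAN copy can resume
# /IPG:300 = 300ms gap between 64KB blocks, to avoid saturating the link
Function RoboCopy-Item {
    param (
        [string]$SourceFile,   # full path to the PST
        [string]$DestFolder    # staging area folder
    )
    $srcDir  = Split-Path $SourceFile -Parent
    $srcName = Split-Path $SourceFile -Leaf
    robocopy $srcDir $DestFolder $srcName /Z /IPG:300 /R:5 /W:30 /NP
}
```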

In the Add-PSTImportQueue function we now look to see if a user is in HQ or outside somewhere. If they are outside HQ then we mark the job with a status of RoboCopy.

Then we use Robocopy-PSTImportQueue, which looks at the import queue and starts background jobs for each job with a RoboCopy status. I do this on a by-user, by-location selection. We don't want 50 jobs running for 30 users in 1 location, so I keep it at 1 user per location at any given time.

Then we use Get-RoboCopyJob <jobnumber>, which just gets the first 10 lines and the last 20 lines of a job, so you can quickly see the status of the job.
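Get-RoboCopyJob is trivial; roughly:

```powershell
# Peek at a background copy job: the head shows what's being copied and with
# which options, the tail shows current progress. -Keep leaves the output in
# the job so you can peek again later.
Function Get-RoboCopyJob {
    param ([int]$JobNumber)
    $output = Receive-Job -Id $JobNumber -Keep
    $output | Select-Object -First 10
    "..."
    $output | Select-Object -Last 20
}
```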

We toyed with the idea of incorporating this into the overall script so we could just run it, but too many things can go wrong: the PowerShell window with all the jobs running can get closed, or the server rebooted, etc.

We want to find a better way to do this.

Still, bigger files -- over 1.5GB -- are taking forever. We are using BITS to control the amount of traffic allowed, so we have to modify the Registry and restart the BITS service. But copies from computers with that setting fill up the pipe to that location.

I'll update as we search for the better way.



Introduction: The Beginnings
Part 1: Script Requirements
Part 2: Add-PSTImportQueue
Part 3: Process-PSTImportQueue
Part 4: Some Tools we added
Part 5: Set-PSTImportQueue
Part 6: About PST Capture
Part 7: More PST Import Tools
Part 8: Using RoboCopy
Part 9: Morning Status Report
Part 10: Using BITS Transfer
Part 11: Get the script / Set up
Part 12: The Functions





Tuesday, May 8, 2012

More PST Import Utils - Get-ImportStatus & Lock-File

This is Part 7 in a series of posts about my experience tackling the migration of PST files.
The first post in the series is here.
The next post in this series is here.

Quicker Stats
We found the MailboxImport queue needs a little tender care from time to time. We would always run this command to get a quick understanding of what was going on in the queue:

Get-MailboxImportRequest | Get-MailboxImportRequestStatistics

Sometimes we needed:

Get-MailboxImportRequest -Status Failed | Resume-MailboxImportRequest

I really got tired of typing all that out all the time, so I created a short function for myself.

Function Get-ImportStatus {
    #---------------------------------------------------------
    # A helper function to display mailbox import info.
    # Option to show only a subset -- by batch name:
    #     sometimes the list is just long.
    # Option to restart failed jobs:
    #     sometimes jobs failed because the service crashed on a bad PST file,
    #     or too many jobs for one mailbox.
    # Option to restart suspended jobs:
    #     the script can suspend jobs it thinks may be causing issues.
    # Option to suspend all jobs:
    #     a single PST can crash the MB rep service and all jobs start over;
    #     you can't tell exactly which is the culprit, so suspend all jobs
    #     and investigate.

    # Left the -Confirm off on purpose

    param (
        $Batch = $null,
        [switch]$RestartFailed,
        [switch]$RestartSuspended,
        [switch]$SuspendAll
    )
    If ($Batch) {
        Get-MailboxImportRequest -BatchName $Batch |
            Get-MailboxImportRequestStatistics
    }
    ElseIf ($RestartFailed.IsPresent) {
        Get-MailboxImportRequest -Status Failed |
            Resume-MailboxImportRequest
    }
    ElseIf ($RestartSuspended.IsPresent) {
        Get-MailboxImportRequest -Status Suspended |
            Resume-MailboxImportRequest
    }
    ElseIf ($SuspendAll.IsPresent) {
        Get-MailboxImportRequest |
            Suspend-MailboxImportRequest
    }
    Else {
        Get-MailboxImportRequest | Sort-Object Name |
            Get-MailboxImportRequestStatistics
    }
}

Stepping all over each other
You know how it is: everyone gets busy and stops checking with others about what's going on, and crap happens. The way we were handling the queue files became an issue. If two people ran the script, the last one to write was the winner. I noticed this when I tried to schedule a task and the queue was trashed. (Lots of manual fixing up there.) And there were a few other small disasters, too.

I remembered an old way to make sure one process did not step on the other: create a "lock" file when you start your work and then delete the "lock" file when you're done.
This isn't exactly elegant, but it works just fine. We're just doing this with a zero-length file.

Simply check for the existence of the file (Test-PSTIQLock) and, if false, lock the file (Lock-PSTIQ), process the queue, and then remove the lock (Unlock-PSTIQ).

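The whole trio is only a few lines. A sketch (the lock file path is a placeholder):

```powershell
# Zero-length lock file; the path is a placeholder.
$script:LockFile = "\\server\PSTImport\PSTImportQueue.lock"

Function Test-PSTIQLock { Test-Path $script:LockFile }
Function Lock-PSTIQ     { New-Item $script:LockFile -ItemType File -Force | Out-Null }
Function Unlock-PSTIQ   { Remove-Item $script:LockFile -Force -ErrorAction SilentlyContinue }

# Typical use:
If (Test-PSTIQLock) { Write-Warning "Queue locked by another session"; Return }
Lock-PSTIQ
Try     { <# process the queue #> }
Finally { Unlock-PSTIQ }
```

There is still a small race between the test and the lock, but for a couple of admins sharing a queue it has been good enough.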


Introduction: The Beginnings
Part 1: Script Requirements
Part 2: Add-PSTImportQueue
Part 3: Process-PSTImportQueue
Part 4: Some Tools we added
Part 5: Set-PSTImportQueue
Part 6: About PST Capture
Part 7: More PST Import Tools
Part 8: Using RoboCopy
Part 9: Morning Status Report
Part 10: Using BITS Transfer
Part 11: Get the script / Set up
Part 12: The Functions




Tuesday, April 10, 2012

Enterprise Wide PST Import -- PST Capture

This is Part 6 in a series of posts about my experience tackling the migration of PST files.
The first post in the series is here.
The next post in the series is here.

When the Exchange Team at Microsoft posted about PST Capture in July of 2011, I was very excited. That post was really the catalyst that got us started thinking we could really import PSTs. We started putting our infrastructure together to handle all the PSTs floating out in the wild. The future looked bright.

And we waited...

During our waiting period we had our own crisis or two that propelled us into the PST import business. By the time PST Capture was officially out in Jan of 2012, we were already fully functional with our scripts and we saw no compelling reason to change.

Still, I saw value. We have many users in the field that don't have Home Shares at Headquarters and were saving their PST files locally. We needed to get those PSTs as well, eventually. So I downloaded the PST Capture tool and did some testing.

I was sad to find out that to do discovery on a PC, you had to have an agent installed. We still have that same issue with delivering an EXE to all the PCs in the world. So that was out. I tried it on a few PCs, thinking we could do this from time to time, but I could not get it to work. I didn't go deep into troubleshooting and just assumed it was a firewall issue. That would be a nightmare to get opened!

But I did try using the UNC file path, and that was working. Until I found out that this did not work on some clients. It seemed to fail on all the clients we needed it to work on. Like Office 2007.

There is no reporting feature with PST Capture. True, we could still use the same reports we use now, but we would lose a few statistics, like how many files were processed and skipped.

We stopped testing.

So if you're wondering why we went to all the trouble to write this family of scripts to import PSTs, instead of using PST Capture, now you know the rest of the story.


Introduction: The Beginnings
Part 1: Script Requirements
Part 2: Add-PSTImportQueue
Part 3: Process-PSTImportQueue
Part 4: Some Tools we added
Part 5: Set-PSTImportQueue
Part 6: About PST Capture
Part 7: More PST Import Tools
Part 8: Using RoboCopy
Part 9: Morning Status Report
Part 10: Using BITS Transfer
Part 11: Get the script / Set up
Part 12: The Functions

Tuesday, March 27, 2012

Enterprise Wide PST Import -- Set-PSTImportQueue

This is Part 5 in a series of posts about my experience tackling the migration of PST files.
The first post in the series is here.
The next post in the series is here.

As we progressed with importing PST files and processing more and more users, we found we were having to repair PST files or mark them to be skipped manually. Some users were saying: please wait, don't process me yet, wait 2 days.
This was becoming more of a pain than anything, so I sat down and decided to do something about it.

Right about this same time, I had decided we had too many little scripts scattered about everywhere, and I wanted them all in one easily accessible place. And all of these scripts shared some functions. So I created a Module and started to migrate all the scripts there. Well, it's not a real Module, but more like a repository for all the functions and scripts we used with the PST migrations.
Then I added two new functions: Set-PSTImportQueue and Remove-PSTImportQueue
(I've only used remove once, when a user decided to not have their PST imported.)

Set-PSTImportQueue is just a way for me to change settings on an Import Job without loading up Excel and making mistakes. Here are the options:

  • -DisplayName -- The person we are working on. This allows you to work on all the jobs associated with this name. You use it with the other options to change settings, like -JobStatus, etc.
  • -JobName -- Isolate this update to a particular job
  • -JobStatus -- Change the JobStatus -- reset back to New, etc. Sometimes it's useful to change this status to something the script doesn't recognize, just to skip this job or set of jobs.
  • -IP -- This is the computer name. Sometimes you may not have known it during the add, so this is just a way to add the computer name to the jobs. We use this entry later when moving PST files to the local PC.
  • -OrgUNCName  -- There are cases where the user moved the file, and rather than doing a new discovery, just change the location.
  • -MRServer -- In our 2-AD-site world, having the wrong MR server setting can make the jobs just sit in the queue. We have dedicated CAS servers for this process, one in each site. If the Archive database is in Site 1 and an MRServer in Site 2 gets chosen by mistake, or the database is moved, you need to reset the job status and change this to an MRServer in Site 1.
  • -SkipReason -- A Place to log why a PST file was skipped, "Age, Size, Backup, Sharepoint List, Corrupt, Missing, etc" This shows up on the Final Report.
  • -ClientVer -- A place to log the client version, mainly for records and reports.
  • -ClientVerOK -- This is true or false. By default, Process-PSTImportQueue will not process jobs that have ClientVerOK set to false; it just skips them. Sometimes setting this to false on all jobs for a user lets you skip that user for now.
  • -ProcessFileOff -- As it sounds, changes the ProcessJob to $false -- skipped jobs are set to false. You might want to set the SkipReason at the same time.
  • -ProcessFileOn -- As it sounds, changes the ProcessJob to $true
  • -CompleteQueueFile -- The PSTCompleteQueue file needs maintenance from time to time, so this setting allows you to work on that file. The default is the PSTImportQueue
Remove-PSTImportQueue just takes two options:
  • -DisplayName -- This will remove all jobs with this users name.
  • -JobName -- This will remove this particular Job

Tuesday, March 13, 2012

Enterprise Wide PST Import -- Some Tools we added

This is Part 4 in a series of posts about my experience tackling the migration of PST files.
The first post in the series is here.
The next post in the series is here.


Tools To Help Do the Job 


When this project started we were mandated to import over 23,000 PST files into mailboxes and get the PST files off the home shares. The important part here is "get the space cleared up off the home shares."

So each week we are given a list of names of people moved to the Windows 7 OS and Office 2010. The list can have 20 names, or 120 names. So we process them. In many cases, there can be upwards of 50 PST files per person. One user had over 500. (I know! I had to double-check.)

So the queue can be fat, with over 1,500 items at times, and we found that the processing of the queue was sometimes very slow, mostly due to one person having a lot of files, most of them big.


Optimized For Speed

We needed to optimize the queue. We talked about just sorting by the size of the file, then decided that would make some people with only one file wait too long to be finished. We wanted to finish as many people as possible in the night, so all the people with just one PST file needed to be moved to the front of the line.

Pipe the queue to group by DisplayName, sort by count, then recreate the queue from the smallest to the largest. Quick and easy. This usually gets 80-90% of the users done overnight. We start about 7 PM.
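That reorder is a one-liner once the queue is loaded. A sketch, with an assumed DisplayName column and a placeholder path:

```powershell
# Users with the fewest PST files go to the front of the queue.
# Load the whole queue first, so we can safely rewrite the same file.
$queue = Import-Csv \\server\PSTImport\PSTImportQueue.csv   # placeholder path
$queue |
    Group-Object DisplayName |
    Sort-Object Count |
    ForEach-Object { $_.Group } |
    Export-Csv \\server\PSTImport\PSTImportQueue.csv -NoTypeInformation
```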

Move the PST files off.

We really struggled to get a good handle on this. People are pretty freaky about their PST files. In our final report, we ask them to disconnect the PST from their client and then move them locally. But a large majority can't do that without help, or don't care. Many never read the message at all.

We had to figure out how to get the PST files off the Home shares and do this without freaking out the user.

Finally we agreed on a place to put the files: the user's "My Documents" directory on their C: drive. It's not the most perfect place, but it is somewhat secure. At least secure enough from the common user. We had to know the PC name and what OS it was, just to find the correct place quickly. We also needed to know that the user is in our HQ building; a small percentage of users with Archive mailboxes are outside our HQ office, and we wanted to skip those.

We agreed the PST files needed to be disconnected for 30 days. This was long enough, everyone thought, for the user to forget they had them. If a PST file has a LastWrite Timestamp of "right now" -- that PST is most likely open and connected to Outlook.

So:
If you have PST files on your home share,
And you are in the HQ office,
And those PST files' last-write timestamps are at least 30 days old,
We move the files for you to ..\Documents\Outlook Files\
If there is a file there with that name already, we just add a random number to the name and keep going.
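Those rules translate into something like this ($homeShare, $pcName, and $userName stand in for values our script already knows; the Win7 profile path is one reason we needed to know the OS):

```powershell
# Move PSTs untouched for 30+ days to the user's local Documents folder.
$cutoff = (Get-Date).AddDays(-30)
Get-ChildItem $homeShare -Filter *.pst | ForEach-Object {
    If ($_.LastWriteTime -gt $cutoff) { Return }   # recently written: likely still open in Outlook

    # Win7 profile layout; XP used \Documents and Settings\ instead
    $dest = "\\$pcName\c$\Users\$userName\Documents\Outlook Files"
    If (-not (Test-Path $dest)) { New-Item $dest -ItemType Directory | Out-Null }

    $target = Join-Path $dest $_.Name
    If (Test-Path $target) {
        # name collision: add a random number to the name and keep going
        $target = Join-Path $dest ("{0}-{1}{2}" -f $_.BaseName, (Get-Random -Maximum 9999), $_.Extension)
    }
    Move-Item $_.FullName $target
}
```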

Grading Our Progress

Since moving the PSTs off the Home shares is the real ultimate goal, we started keeping track of all the PST files, their creation dates, and their removal dates.

As of this date, we had over 23,000 files to move, and we have moved about 6,000. Not too bad. Not fantastic, but good. I'll post some info on our reports later on.


But there is more work to do...

I am still constantly opening the PST Import Queue in a spreadsheet, modifying it, and saving it again. I am human and make many mistakes. I must figure out an easier way to do that...
Next time: "Set-PSTImportQueue"


Introduction: The Beginnings
Part 1: Script Requirements
Part 2: Add-PSTImportQueue
Part 3: Process-PSTImportQueue
Part 4: Some Tools we added
Part 5: Set-PSTImportQueue
Part 6: About PST Capture
Part 7: More PST Import Tools
Part 8: Using RoboCopy
Part 9: Morning Status Report
Part 10: Using BITS Transfer
Part 11: Get the script / Set up
Part 12: The Functions