Sunday, April 7, 2013

Email Archive Re-hydration - The Plan

re·hy·drate 
tr.v. re·hy·drat·ed, re·hy·drat·ing, re·hy·drates
1. To cause (something dehydrated) to take up fluid.
2. To replenish the body fluids of.

rehy·dration n.

When an Email Archiver has replaced many of the messages in a mailbox with shortcuts, the mailbox shrinks to the smallest size possible, much like squeezing the water out of a sponge. When we replace each shortcut with the original email, the mailbox grows larger, much like adding the water back to that sponge.

We call this process of replacing a shortcut with the original message retrieved from the Email Archive "re-hydration."

The Plan
This project calls for re-hydrating 25,000 users without disrupting the users' daily activities. We also need to remember that some users have PST files with stubs, and Personal Archives with stubs.

To start with, we needed a way to keep track of the re-hydrated users and give those users a new quota of 'no quota'. During testing we discovered the mailboxes would be expanding by 170%. We decided the easiest way to make this work was to create new mailbox databases named <Group><number>, so we could just grow each database to a certain size, then create a new mailbox database and increment the number. For example, when the mailbox database "IT01" reached 75 GB, we created "IT02", and so on. These databases have no quota; they are unlimited. We do issue warnings at 2 GB. The warning is just to remind users that they have to move some items to their personal archive.
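The naming scheme above can be sketched in a few lines of PowerShell. This is only an illustration: Get-NextDatabaseName is a hypothetical helper, not part of the Exchange cmdlets, and the current size is a made-up value rather than something read from a server.

```powershell
# Hypothetical helper: given a name such as "IT01", return the next name
# in the <Group><number> sequence, keeping the zero padding.
function Get-NextDatabaseName {
    param([Parameter(Mandatory = $true)][string]$Name)

    if ($Name -match '^(?<group>.*?)(?<number>\d+)$') {
        $width = $Matches['number'].Length
        $next  = [int]$Matches['number'] + 1
        # "D2" pads to two digits, so 2 becomes "02"
        '{0}{1}' -f $Matches['group'], $next.ToString("D$width")
    }
    else {
        throw "Database name '$Name' does not end in a number."
    }
}

# Made-up values for illustration; 75 GB is the threshold from the text.
$currentName   = 'IT01'
$currentSizeGB = 76
if ($currentSizeGB -ge 75) {
    Get-NextDatabaseName -Name $currentName   # IT02
}
```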

Since these are new databases and they were all going to grow, we needed a new, spacious place to put them. All the mailbox servers got two new 1 TB drives. The plan was to restrict the number of databases to 6 per drive in the beginning and then see where we stood for free space. We knew the databases would grow, not only from re-hydration, but from new data that wasn't being stubbed anymore. We needed extra room for growth. We created a new mailbox database for each group and tagged on a "01" to the name.
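As a rough back-of-the-envelope check, the drive budget works out as below. This assumes every database on a drive is allowed to reach the 75 GB mark mentioned earlier and treats a 1 TB drive as 1024 GB; the variable names are only for illustration.

```powershell
# Rough capacity estimate per drive (assumptions noted above).
$driveSizeGB       = 1TB / 1GB    # 1024
$databasesPerDrive = 6
$databaseCapGB     = 75

$plannedGB  = $databasesPerDrive * $databaseCapGB   # 450
$headroomGB = $driveSizeGB - $plannedGB             # 574

"Planned: $plannedGB GB, headroom: $headroomGB GB"
```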

Next task to tackle was how to select the users and in what order. At first the obvious thing was: "Let's get all the smallest mailboxes and work our way up." The thought was there was less data to migrate and therefore the process would be fast and we could tear through the largest number of users in the least amount of time.

But there was a complication. Many users have PST files, and those users have very small mailboxes. It's what we've taught them to do: avoid your quota problems by using PST files. We needed to skip these people, because we want to re-hydrate their PST files and import them into a Personal Archive for them.

When we started our PST file migration project, we created a GPO that disallowed growth of PST files. New users created after a certain date were added to this GPO and have no PST files. We'll work on these people first. Also, people created after 1/1/2013 were put in these new mailbox databases right away, so they were never touched by the Email Archiving system.

So the list of people to re-hydrate was compiled by the date they were hired or "when the mailbox was created."

$Date = '1/1/2013'
Get-Mailbox -ResultSize Unlimited  | ?{$_.WhenMailboxCreated -gt $Date}

But that gets all mailboxes, and we want to limit the search to the databases that don't have a "01", or any other number, at the end of the name.

$MBXDB = Get-MailboxDatabase | ?{ $_.Name -notmatch "\d{2}$"}

$Date = '1/1/2013'
$MBXDB | Get-Mailbox -ResultSize Unlimited  | ?{$_.WhenMailboxCreated -gt $Date}
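To see what the "\d{2}$" pattern keeps and drops, here is a small test against made-up database names (none of these are real names from our environment). Note that the pattern only excludes names ending in two digits; a name ending in a single digit would still pass, so "\d+$" might be the better choice if such names exist.

```powershell
# Made-up database names, for illustration only.
$names = 'IT', 'IT01', 'Finance', 'Finance02', 'Sales2'

# Keep only the names that do NOT end in two digits.
$old = $names | Where-Object { $_ -notmatch '\d{2}$' }
$old    # IT, Finance, Sales2
```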

That gets us a list of potential mailboxes to process. Another requirement of the project is to skip certain mailboxes: those belonging to users of the mobile device services Blackberry and Good Messaging. We also want to skip any users who have Personal Archives.

Skipping those with Personal Archives is easy; just add (-and $_.ArchiveDatabase -eq $null) to the filter.


Skipping Blackberry and Good Messaging users was not too hard. The nice thing is we have a DL for each that contains these people. Instead of querying the group membership for every single user, we load the members into an array once, check that array with -Contains for each user, and skip the user if it returns true.


$BBs =  get-qadgroupmember 'BlackberryDevices' -SL 0 | sort Displayname |%{$_.Displayname }
$GDs =  get-qadgroupmember 'GoodDevices' -SL 0 | sort Displayname |%{$_.Displayname }


(I find some things easier to accomplish with the QAD tools.)
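Here is a small sketch of the array lookup, using made-up names in place of the real group members. $skip and $toProcess are names I've introduced for the example. One thing to watch: display names are not guaranteed to be unique, so matching on a unique attribute may be safer if both sides expose one.

```powershell
# Stand-ins for the display names the two Get-QADGroupMember calls return.
$BBs = 'Alice Adams', 'Bob Brown'
$GDs = 'Carol Clark'

# One combined skip list, built once instead of once per user.
$skip = @($BBs) + @($GDs)

# Made-up mailbox objects with only the property we need here.
$mailboxes = 'Alice Adams', 'Dan Davis', 'Carol Clark', 'Erin Evans' |
    ForEach-Object { [pscustomobject]@{ DisplayName = $_ } }

# -contains / -notcontains compare case-insensitively by default.
$toProcess = $mailboxes | Where-Object { $skip -notcontains $_.DisplayName }
$toProcess.DisplayName    # Dan Davis, Erin Evans
```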

It turned out we wanted to exclude more mailbox databases, because we were getting service accounts and other false positives. To exclude those, the mailbox database selection became:

$MBXDBs = Get-Mailboxdatabase | ?{$_.Recovery -eq $False -and $_.Name -notmatch "Mailbox" -and $_.Name -notmatch "^SA_" -and $_.Name -notmatch "\d{2}$" -and $_.Name -notmatch "Apple"}

And for the same false positive reasons, the mailbox selection became:

$MBX = $MBXDBs | Get-Mailbox -ResultSize Unlimited | ?{ $_.DisplayName -notmatch "^System" -and $_.DisplayName -notmatch "SA_" -and $_.DisplayName -notmatch "^CAS" -and $_.ArchiveDatabase -eq $null -and $_.WhenMailboxCreated -gt $Date }


And the basic outline of the script looked like this:

$MBXDBs = Get the mailbox databases
ForEach ($DB in $MBXDBs) {
    $MBX = Select the mailboxes in this $DB
    If ($MBX) {
        Is this a Blackberry or Good user?
            If so skip
            If not then collect mailbox info, size item count location etc
    }
}

Save results to a CSV file we will use later to move these users to their proper new mailbox database and then re-hydrate
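The outline can be fleshed out roughly as follows. This is a sketch that runs against made-up sample objects rather than a live Exchange server. Get-RehydrationCandidate is a hypothetical function, and the Database, ItemCount and TotalItemSizeMB properties are stand-ins for whatever the real script collects for each mailbox.

```powershell
# Hypothetical function: emit one row per mailbox that is not on the skip list.
function Get-RehydrationCandidate {
    param(
        [Parameter(Mandatory = $true)][object[]]$Mailbox,
        [string[]]$SkipDisplayName = @()
    )
    foreach ($m in $Mailbox) {
        # Blackberry or Good user? Skip.
        if ($SkipDisplayName -contains $m.DisplayName) { continue }
        [pscustomobject]@{
            DisplayName     = $m.DisplayName
            Database        = $m.Database
            ItemCount       = $m.ItemCount
            TotalItemSizeMB = $m.TotalItemSizeMB
        }
    }
}

# Made-up sample data standing in for the mailboxes found in each database.
$sample = @(
    [pscustomobject]@{ DisplayName = 'Alice Adams'; Database = 'IT';      ItemCount = 900;  TotalItemSizeMB = 120 }
    [pscustomobject]@{ DisplayName = 'Dan Davis';   Database = 'IT';      ItemCount = 1200; TotalItemSizeMB = 350 }
    [pscustomobject]@{ DisplayName = 'Erin Evans';  Database = 'Finance'; ItemCount = 400;  TotalItemSizeMB = 80 }
)
$skip = 'Alice Adams'

# Walk the mailboxes one database at a time, as in the outline.
$rows = foreach ($db in ($sample | Group-Object Database)) {
    Get-RehydrationCandidate -Mailbox $db.Group -SkipDisplayName $skip
}

# Save the results for the later move and re-hydrate steps.
$csvPath = Join-Path ([IO.Path]::GetTempPath()) 'rehydrate-candidates.csv'
$rows | Export-Csv -Path $csvPath -NoTypeInformation
```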

I am doing a lot of the steps here manually, piecing together bits when I have time. Right now I'm just pushing a lot of buttons over and over, just to get the initial group of easy people done.

People hired after 1/1/2013 did not need any work done on them; all I had to do was move their mailboxes to the correct new mailbox database.

People hired after 1/1/2012 -- those who cannot create PST files -- went fairly fast, but the further I go back in time, the more data needs to be re-hydrated.

I believe I've stepped in all the potholes we're going to find, so it's time to get someone else to concentrate on the button-pushing part and just churn out some numbers. They have a cheat sheet and can always come back to me with issues they can't figure out, and we'll work on those together. I can then concentrate on gaining some ground on my other projects. Soon I'll have some time to post some more on this re-hydration adventure.




