Archive for the ‘Servers’ Category

At CF Webtools we run SpamAssassin on our Windows email server by means of “SpamAssassin in a Box” by JAM Software. (We also use their TreeSize software, which I love to use to track down disk space issues in Windows.) SpamAssassin is a local service that our SmarterMail server talks to, filtering out SPAM based on a scoring system.

This morning I was alerted by our alerting software, Nagios, that the service was not responding. When I checked on the service I saw it was stopped. From there I went down the freak’n rabbit hole.

I opened the event log and saw “SpamAssassin daemon (spamd.exe) restart successful.” Looking through the list, I saw that it restarts every hour. I hate when this happens because it usually means that some other process is running on a schedule and causing it to crash. I did note in the “Recovery” tab of the services window that the service is set to restart upon failure. So my thought was that it was crashing and then automatically restarting.

On to further investigation. I found a “SpamD.log” file in “C:\ProgramData\JAM Software\spamdService\sa-logs\”. Upon inspection around the time of the alleged crash I found this entry:

[4980] info: spamd: server killed by SIGINT, shutting down

Okay, so what is “SIGINT” and what is it doing to my service? Googling really didn’t help. Any suggestions were Linux based and didn’t seem to be relevant.

I then turn to “Service.config” under “C:\ProgramData\JAM Software\spamdService\”. Here I find “<SpamdRestartIntervalInMinutes>”. Wait what?! Looking at the manual I find this bit: “Specifies the interval for restarting ‘SpamD’ (in minutes).” Well this is interesting. I do some further Googling and find a changelog note for version 1.0 from back in 2010. It reads:

To consider possible changes in SpamAssassin configuration and to prevent potential memory leaks, SpamAssassin in a Box periodically restarts the daemon.

Okay, now this sounds like some hacks I’ve done in my younger years.

But at least I found my answer. It’s supposed to do that!
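If you want to confirm the interval without opening the file by hand, here is a minimal Python sketch. It assumes Service.config is plain XML and that the element holds a whole number of minutes. The wrapper element name and the value 60 in the sample are made up for illustration (60 simply matches the hourly restarts I saw); only the element name comes from the real file.

```python
import xml.etree.ElementTree as ET


def restart_interval_minutes(config_xml):
    """Return the SpamdRestartIntervalInMinutes value, or None if absent."""
    root = ET.fromstring(config_xml)
    # Walk the whole tree, since the element's exact position is an unknown here.
    for element in root.iter():
        # Drop any XML namespace prefix before comparing the tag name.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "SpamdRestartIntervalInMinutes" and element.text:
            return int(element.text.strip())
    return None


# Hypothetical file contents: the <ServiceConfig> wrapper and the value 60
# are assumptions, not copied from a real Service.config.
SAMPLE = """<ServiceConfig>
  <SpamdRestartIntervalInMinutes>60</SpamdRestartIntervalInMinutes>
</ServiceConfig>"""

if __name__ == "__main__":
    print(restart_interval_minutes(SAMPLE))
```

To run it against the real file, read the file's text and pass it to restart_interval_minutes() in place of SAMPLE.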

Why didn’t the service automatically restart? Just a one-off issue I suppose. I found this in the log:

SpamAssassin daemon (spamd.exe) was started, but didn’t respond to any signals from localhost.
Please restart the service. Should the problem persist, please contact support

Here at CF Webtools we’ve been shifting towards Virtual Machines to replace our dedicated “iron” as Mark Kruger likes to say. Let me say for starters that I’m very impressed with the Dell PowerEdge VRTX Shared Infrastructure Platform. There is so much back-end power in that thing that we’ve been able to move our entire set of staging platforms onto one M620 Server Node along with a RAID 6 shared PERC disk array. It handles it like a champ and each virtual server is extremely fast. After about a year of testing and no real issues we’ve been able to move some of our production servers onto another server node. We’re looking forward to adding a couple more server nodes as well. Each node runs VMware ESXi Hypervisor 6.0 via RAID 1 SD cards.

In addition we’re also migrating some workstation VMs away from Hyper-V and onto a separate Dell Server that we’ve reclaimed.

But here’s the real reason I’m writing today:

FAILED: A file I/O error occurred while accessing ''.

I get this error when using “VMware vCenter Converter Standalone 6.0.0” to convert any powered-on Windows machine onto one of the ESXi instances. I don’t get this issue when converting powered-on Linux machines. Very odd, and the forum threads Google turned up haven’t been very helpful. Mostly just a lot of “run chkdsk” and “check that fully qualified domain names resolve”.

I’m not going to cover Linux conversions here since they work. Basically, a powered-on Windows conversion installs a helper VM on the machine to be converted. It runs as a service, and you can either uninstall it manually when finished or let it uninstall automatically.

Something, probably this helper service, then takes a snapshot of the source system. Then the helper VM does a block-level clone for each volume it finds.

Mine always failed after the snapshot and before the clone.

What I did was use the “Export logs…” link in the converter. The line I found interesting, reading the file vmware-converter-server-1.log, was:

error vmware-converter-server[01288] [Originator@6876 sub=Ufa.HTTPService] Failed to read request; stream: <io_obj p:0x03dc40ac, h:-1, <pipe ‘\\.\pipe\vmware-converter-server-soap’>, <pipe ‘\\.\pipe\vmware-converter-server-soap’>>, error: class Vmacore::TimeoutException(Operation timed out)

After some Google searching it dawned on me that I am using two IP subnets. One for the general network and one for management. My machine runs 10.0.0.* (general) and 10.1.1.* (management) subnets. The source system has 10.0.0.* assigned to it while the destination ESXi server has 10.1.1.* assigned to it.
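To make the mismatch concrete, here is a small Python sketch using the standard ipaddress module. The host addresses are hypothetical, and the /24 prefix length is an assumption based on the 10.0.0.* and 10.1.1.* notation above.

```python
import ipaddress


def same_subnet(addr_a, addr_b, prefix_len=24):
    """Return True if both addresses fall inside the same network."""
    net_a = ipaddress.ip_interface(f"{addr_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{addr_b}/{prefix_len}").network
    return net_a == net_b


# Hypothetical addresses; only the two subnets come from the post.
SOURCE_MACHINE = "10.0.0.25"      # general network
ESXI_HOST = "10.1.1.10"           # management network
SOURCE_TEMP_ADDRESS = "10.1.1.25" # an extra management address on the source

if __name__ == "__main__":
    # The source and the ESXi host do not share a subnet.
    print(same_subnet(SOURCE_MACHINE, ESXI_HOST))
    # A management-subnet address on the source would share one.
    print(same_subnet(SOURCE_TEMP_ADDRESS, ESXI_HOST))
```

Being on different subnets is not a problem by itself if there is a route between them. In my case there wasn’t one from the source machine.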

Because my machine sits on both networks, it could talk to both the source and the destination machines just fine, so nothing looked wrong from where I was sitting.

However once things get rolling, the process moves from my machine to communicating between the source and destination. My machine merely monitors the progress. Which makes sense. Keep out the middle man and you have efficient network data transfer.

So the fix here was to bind a temporary management subnet address (10.1.1.*) to the source machine’s NIC. Now the helper VM is able to communicate with the destination server over that management subnet.
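A quick way to verify this kind of fix is to test, from the source machine itself, whether it can open a TCP connection to the destination. The Python sketch below does that. The host address in the usage comment is hypothetical, and the ports (443 and 902) are my assumption about what the converter uses to reach ESXi, so check the VMware documentation for your version.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


def check_destination(host, ports=(443, 902)):
    """Map each port to whether it is reachable from this machine."""
    return {port: can_connect(host, port) for port in ports}


# Usage, run on the SOURCE machine (the address is hypothetical):
#     print(check_destination("10.1.1.10"))
```

If every port comes back False on the source machine but True on your own workstation, you are probably looking at the same routing gap I had.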

Our normal web server consists of an OS and Program Files drive (C:) and a data drive to hold website files (E:). This provides an extra layer of security, speed and helpful structure. Sometimes we will also add another data drive (F:) for clients with really large storage needs. For example, all user-uploaded photos go onto a 2TB drive array.

So let’s say you have user upload photos dedicated to one drive. You may want to just place the data onto the root of the drive. Simple right?

Well here’s what you may run into: When migrating/copying that drive to a new drive/machine using Robocopy you’ll find a few issues: (robocopy \\OLD-SERVER\UserPhotos F:\Data\UserPhotos /e /copy:DT /MT:8)

  1. If you’re putting the data into a subfolder this time, that root subfolder will become a system-hidden folder. The reason is you are copying the root of a drive. Pretty annoying.
    1. You can fix this by running this after the copy starts: “attrib -H -S F:\Data”
  2. It will try to copy “System Volume Information” and “Recycle Bin”. But you’ll find out that your process will just get stuck because it doesn’t have permissions to do so.
    1. You can fix this by excluding those system folders from the copy:
      “robocopy \\OLD-SERVER\UserPhotos F:\Data\UserPhotos /e /copy:DT /MT:8 /xd $Recycle.bin "System Volume Information"” FYI: I tried using “/xa:HS” instead of the /xd, but that didn’t work as expected.
    2. If you’ve already gone 8 hours into your copy operation just to find this out, speed things up by syncing instead of starting over: “robocopy \\OLD-SERVER\UserPhotos F:\Data\UserPhotos /mir /copy:DT /MT:8 /xd $Recycle.bin "System Volume Information" /xo /fft”
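If you script your migrations, building the command as an argument list avoids the quoting headaches around “System Volume Information”. Here is a minimal Python sketch of the two commands above. The function name and the resume switch are my own invention for illustration; the robocopy flags are exactly the ones used in this post.

```python
import subprocess

# Root-level system folders that robocopy should skip.
EXCLUDED_DIRS = ["$Recycle.bin", "System Volume Information"]


def build_robocopy_command(source, destination, resume=False):
    """Build the robocopy argument list for a first copy or a resumed sync."""
    # /e copies subfolders (including empty ones). /mir mirrors the tree,
    # which also deletes destination files that no longer exist in the source.
    mode = "/mir" if resume else "/e"
    cmd = ["robocopy", source, destination, mode,
           "/copy:DT", "/MT:8", "/xd", *EXCLUDED_DIRS]
    if resume:
        # /xo skips older source files; /fft relaxes timestamp comparison.
        cmd += ["/xo", "/fft"]
    return cmd


if __name__ == "__main__":
    cmd = build_robocopy_command(r"\\OLD-SERVER\UserPhotos",
                                 r"F:\Data\UserPhotos", resume=True)
    # Show the command line Windows would see, with quoting applied.
    print(subprocess.list2cmdline(cmd))
    # To actually run it on Windows: subprocess.run(cmd)
```

Passing the list to subprocess.run() means each folder name arrives as one argument, spaces and all.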

So my point is, don’t put your data folder/file structure in the drive root. It’ll get mixed up with hidden-system files and folders and one day throw you for a loop. Instead put that all in a subfolder such as “F:\data”. Another example might be “E:\websites”.

Side-note: There are other copy methods to avoid this situation, however Robocopy is going to be one of your fastest options.

Today’s challenge at CF Webtools for myself was to find any “_” (underscore) characters in a URL’s .htm file name and replace them with “-” (dash). The list I was given had file names with up to 7 underscores in any position. Example: my_file_name.htm

While I figured this would be a straight-forward task with IIS URL Rewrite, I was wrong.

In the end I found that I either had to create one rule for each possible underscore count or write a custom rewrite rule. I went the one rule per count route. I read in one blog that you can only use up to 9 variables ({R:x}).

The other part of the rule was they had to be only in the “/articles/” directory.

My first challenge was just to get the right regular expression in place. What I found out was that the IIS (7.5) UI’s “Test Pattern” utility doesn’t accurately test. In the test this worked:

Pattern: ^.*\/articles\/(.*)_(.*).htm$
Capture Groups: {R:1} : "my", {R:2} : "test"

However, this does not match in real-world testing. #1, don’t escape “/” (forward-slash) (really??). #2 the pattern is only matched against everything after the domain and first slash (

So really, only this works:

Pattern: ^articles/(.*)_(.*).htm$
Capture Groups: {R:1} : "my", {R:2} : "test"
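Here is a small Python sketch of what tripped me up. Python’s re module stands in for the IIS regex engine here, which is an assumption on my part, although for a pattern this simple I would expect the two to agree. The helper name is made up for illustration.

```python
import re

# The working pattern from above, unchanged.
PATTERN = re.compile(r"^articles/(.*)_(.*).htm$")


def rule_input(request_path):
    """Return the part of the path a rule is matched against (no leading slash)."""
    return request_path[1:] if request_path.startswith("/") else request_path


if __name__ == "__main__":
    path = "/articles/my_test.htm"
    # With the leading slash the anchored pattern cannot match.
    print(PATTERN.match(path))
    # Without it, the two capture groups come back as "my" and "test".
    print(PATTERN.match(rule_input(path)).groups())
```

So a pattern that starts with “^articles/” is correct, even though the browser’s address bar shows “/articles/”.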

In order to match against up to 8 underscores, you need 8 rules, each one looking for more underscores. So the next one would be:

Pattern: ^articles/(.*)_(.*)_(.*).htm$
Capture Groups: {R:1} : "my", {R:2} : "test", {R:3} : "file"
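Since every rule follows the same shape, you can generate the pattern and redirect target instead of typing them out. This is a minimal Python sketch. The function name is hypothetical, and the cap of 8 underscores is based on the 9 back-reference limit I read about, which I haven’t verified against the official documentation.

```python
def build_rule(underscore_count):
    """Return (name, pattern, redirect_url) for a given number of underscores."""
    if not 1 <= underscore_count <= 8:
        # 8 underscores need 9 capture groups, the assumed {R:x} ceiling.
        raise ValueError("underscore_count must be between 1 and 8")
    groups = underscore_count + 1
    # One "(.*)" per name segment, joined by the underscores being replaced.
    pattern = "^articles/" + "_".join(["(.*)"] * groups) + ".htm$"
    # The same segments as back-references, joined by dashes.
    back_refs = "-".join("{R:%d}" % i for i in range(1, groups + 1))
    redirect_url = "articles/" + back_refs + ".htm"
    return "AUSx%d" % underscore_count, pattern, redirect_url


if __name__ == "__main__":
    for count in range(1, 9):
        print(build_rule(count))
```

Each tuple maps onto one rule: the name, the match url, and the action url.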

To do this efficiently you just edit the web.config in the web root for that site. The end result ended up being:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <system.webServer>
        <rewrite>
            <rules>
                <rule name="AUSx1" stopProcessing="true">
                    <match url="^articles/(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}.htm" />
                </rule>
                <rule name="AUSx2" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}.htm" />
                </rule>
                <rule name="AUSx3" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}.htm" />
                </rule>
                <rule name="AUSx4" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}.htm" />
                </rule>
                <rule name="AUSx5" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}.htm" />
                </rule>
                <rule name="AUSx6" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}-{R:7}.htm" />
                </rule>
                <rule name="AUSx7" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}-{R:7}-{R:8}.htm" />
                </rule>
                <rule name="AUSx8" stopProcessing="true">
                    <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
                    <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}-{R:7}-{R:8}-{R:9}.htm" />
                </rule>
            </rules>
        </rewrite>
    </system.webServer>
</configuration>

In the end this URL:


After a Windows Update the lovely “Blue Screen of Death” appeared on one of our servers. I was frantic to find a solution, and “Boot to the last known working configuration” wasn’t working. A system restore was a last-resort option.

Here’s what the error consisted of:

STOP: c0000218 {Registry File Failure}
The registry cannot load the hive (file):
or its log or alternate.
It is corrupt, absent, or not writable.

To resolve the issue, I did the following:

  1. Boot to the Windows Server 2008 install DVD
  2. Click “Repair Computer” on the second screen
  3. Open a command prompt on the second or third prompt
  4. Change directory to C:\Windows\System32\Config\
  5. Rename “SOFTWARE” to “SOFTWARE.BAK”
  6. Copy “RegBack\SOFTWARE” to that directory
  7. Reboot

This restored the SOFTWARE registry hive to the state it was in before the Windows Update. I then had a pending list of Windows Updates to install again. But I’ll leave that for another day, to see if anyone else is having issues.