Today’s challenge for me at CF Webtools was to find every “_” (underscore) character in the file name of an .htm URL and replace it with “-” (dash). The list I was given had file names with up to 7 underscores in any position. Example: my_file_name.htm
I figured this would be a straightforward task with IIS URL Rewrite. I was wrong.
In the end I found that I either had to create one rule for each possible underscore count or write a custom rewrite rule. I went the one-rule-per-count route. One blog I read says you can only use up to 9 capture-group variables ({R:x}) in a rule, which would cap this approach at 8 underscores.
The other requirement was that the rules apply only to files in the “/articles/” directory.
My first challenge was just getting the right regular expression in place. What I found out was that the “Test Pattern” utility in the IIS (7.5) UI doesn’t accurately reflect how patterns are matched against real requests. In the test, this worked:
Input: http://www.test.com/articles/my_test.htm
Pattern: ^.*\/articles\/(.*)_(.*).htm$
Capture Groups: {R:1} : "my", {R:2} : "test"
However, this does not match in real-world testing, for two reasons. #1: don’t escape the “/” (forward slash) (really??). #2: the pattern is matched only against the part of the URL that comes after the domain and first slash (http://www.test.com/), so the pattern must not try to match them.
So really, only this works:
Input: http://www.test.com/articles/my_test.htm
Pattern: ^articles/(.*)_(.*).htm$
Capture Groups: {R:1} : "my", {R:2} : "test"
To match up to 8 underscores, you need 8 rules, each looking for one more underscore than the last. So the next one would be:
Input: http://www.test.com/articles/my_test_file.htm
Pattern: ^articles/(.*)_(.*)_(.*).htm$
Capture Groups: {R:1} : "my", {R:2} : "test", {R:3} : "file"
The efficient way to do this is to edit the web.config in that site’s web root directly. The end result was:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <system.webServer>
    <rewrite>
      <rules>
        <rule name="AUSx1" stopProcessing="true">
          <match url="^articles/(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}.htm" />
        </rule>
        <rule name="AUSx2" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}.htm" />
        </rule>
        <rule name="AUSx3" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}.htm" />
        </rule>
        <rule name="AUSx4" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}.htm" />
        </rule>
        <rule name="AUSx5" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}.htm" />
        </rule>
        <rule name="AUSx6" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}-{R:7}.htm" />
        </rule>
        <rule name="AUSx7" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}-{R:7}-{R:8}.htm" />
        </rule>
        <rule name="AUSx8" stopProcessing="true">
          <match url="^articles/(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*)_(.*).htm$" />
          <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}-{R:4}-{R:5}-{R:6}-{R:7}-{R:8}-{R:9}.htm" />
        </rule>
      </rules>
    </rewrite>
  </system.webServer>
</configuration>
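A possible refinement, offered as an untested sketch rather than the configuration that was actually deployed: because (.*) is greedy and can itself match underscores, the AUSx1 pattern also matches file names with more than one underscore (for articles/my_test_file.htm it captures "my_test" and "file"). Since AUSx1 comes first and stops processing, such a URL would be fixed one underscore per redirect instead of ever reaching the later rules. Swapping each (.*) for ([^_]*) and escaping the dot before htm should make each rule match only its own underscore count. The rule names below are hypothetical, and only the first two rules are shown; the remaining ones would follow the same pattern.

```xml
<!-- Sketch only: rule names are hypothetical. These go inside <rewrite><rules> in web.config. -->
<rule name="AUSx1Strict" stopProcessing="true">
  <!-- [^_]* cannot cross an underscore, so this matches exactly one underscore -->
  <match url="^articles/([^_]*)_([^_]*)\.htm$" />
  <action type="Redirect" url="articles/{R:1}-{R:2}.htm" />
</rule>
<rule name="AUSx2Strict" stopProcessing="true">
  <!-- Exactly two underscores; \. matches a literal dot instead of any character -->
  <match url="^articles/([^_]*)_([^_]*)_([^_]*)\.htm$" />
  <action type="Redirect" url="articles/{R:1}-{R:2}-{R:3}.htm" />
</rule>
```

With patterns like these, the order of the rules should no longer matter, since a given file name can match at most one of them, and each URL should reach its final form in a single redirect.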
In the end this URL:
http://www.domain.com/articles/my_file_foo_bar.htm
becomes:
http://www.domain.com/articles/my-file-foo-bar.htm