Jack is Here, asp.net findings

As a software engineer, I focus on .NET, especially asp.net, C#, WCF and so on, and I am also very interested in Search Engine Optimization.


A hyperlink control that is used to open a file or an e-mail message fails in an application that uses MSXML 6.0 on a Windows XP-based or Windows Vista-based computer

Symptoms
On a Windows XP-based or Windows Vista-based computer, you run an application that uses Microsoft XML Core Services (MSXML) 6.0. The application contains a hyperlink control that users can use to open a file or an e-mail message. However, in this scenario, the hyperlink control may fail.
This issue occurs if the following conditions are true:
  • The file URL or the e-mail message URL is encoded in the UTF-8 format.
  • The output is in the HTML format.
Note: This issue does not occur if the URL starts with http://.
Cause
XSL Transformation (XSLT) does not recognize the UTF-8-encoded URL because there is no “no-uri-escaping” XSLT flag in MSXML 6.0.

How to use mod_rewrite to simplify URL Rewriting in Apache – A basic guide to the mod_rewrite module

Introduction

URL rewriting is the process of manipulating a URL or link sent to a web server so that the server changes it dynamically to include additional parameters and information, together with a server-initiated redirection. The web server performs all these manipulations on the fly, so the browser is kept out of the loop about the change in the URL and the redirection.

URL rewriting can benefit your websites and web-based applications by providing better security and better visibility or friendliness with search engines, and it helps keep the structure of the website easier to maintain for future changes.

You can read about the theory and benefits of URL rewriting in my previous article. In this article we will look at how to implement URL rewriting in an Apache-based web server environment using the mod_rewrite module for Apache.

What is mod_rewrite?

mod_rewrite is one of the most favored modules for the Apache web server, and many web developers and administrators would vote it the best thing to happen to Apache. It has so many tricks up its sleeve that it can be called the Swiss Army knife of Apache modules. Apart from providing simple URL rewriting for an Apache-based website, this module arms the website with better URL protection, better search engine visibility, protection against bandwidth thieves by stopping hot linking, hassle-free restructuring, and the option of providing the friendliest of URLs for the website's users. Because of its versatility and functionality the module can at times feel a bit daunting to master, but a thorough understanding of the basics will take you a long way in the craft of URL rewriting.

Let's Begin! – A look at all the stuff you need to have on your test environment to get mod_rewrite alive and kicking.

First of all, your test machine must have the Apache web server configured correctly. mod_rewrite is usually installed along with the Apache server, but if it is missing (this can happen on a Linux machine where Apache was compiled without the mod_rewrite module) you have to get it installed. To use mod_rewrite on your own box, you must configure Apache to load the module dynamically. On a shared server, you have to ask your web hosting company to get the module installed and loaded by Apache.

On your local machine you can find out whether the module was installed along with Apache by looking in Apache's modules directory. Check for a file named mod_rewrite.so; if it is there, the module can be loaded into the Apache server dynamically. On many installations this module is not loaded when Apache starts, and you need to tell Apache to enable it for dynamic loading by making changes in the web server's configuration file, as explained below.

How to Enable mod_rewrite on Apache?

You can make the mod_rewrite module load dynamically into the Apache web server environment using the LoadModule directive in the httpd.conf file. Open this file in a text editor and find a line similar to the one given below.

#LoadModule rewrite_module modules/mod_rewrite.so

Uncomment this line by removing the # and save the httpd.conf file. Restart your Apache server and if all went well mod_rewrite module will now be enabled on your web server.

Let's Rewrite our first URL using mod_rewrite

OK, now the mod_rewrite module is enabled on your server. Let's have a look at how to switch the rewrite engine on and make it work for us.

To turn the rewrite engine on you have to add a single line to your .htaccess file. The .htaccess files are configuration files with Apache directives defined in them, and they provide distributed, directory-level configuration for a website. Create a .htaccess file in your web server's test directory – or any other directory in which you want to make URL rewriting active – and add the line given below to it.

RewriteEngine on
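
For a .htaccess line like the one above to take effect, the main server configuration has to permit it. The following is a minimal sketch, assuming the test directory lives at /var/www/html/test (a hypothetical path; substitute your own):

```apache
# httpd.conf (server configuration, not .htaccess)
# /var/www/html/test is an assumed path used only for illustration.
<Directory "/var/www/html/test">
    # Let .htaccess files in this directory use mod_rewrite directives.
    AllowOverride FileInfo
    # Per-directory rewriting requires symlink following to be enabled.
    Options FollowSymLinks
</Directory>
```

Without an AllowOverride setting that includes FileInfo (or All), Apache ignores or rejects the rewrite directives placed in .htaccess. Restart Apache after changing httpd.conf.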

Now we have the rewrite engine turned on and Apache is ready to rewrite URLs for you. Let's look at a sample rewrite instruction that makes a request to our server for first.html get redirected to second.html at the server level. Add the line given below to your .htaccess file, along with the RewriteEngine directive that we added before.

RewriteRule ^first.html$ second.html

I will explain what we have done here in the next section, but if all went well then any request for first.html made on your server will be transferred to second.html. This is one of the simplest forms of URL rewriting.

The point to note here is that the redirection is completely hidden from the client, which is what sets it apart from a classic HTTP redirect. The client or browser is given the impression that the content of second.html is being fetched from first.html. Because the client is unaware of which URL is actually served, very powerful URL rewriting can be generated on the fly.

Basics of mod_rewrite module

Now we know that mod_rewrite can be enabled for an entire website or a specific directory by using a .htaccess file, and we have written a basic rewrite directive in the previous example. Here I will explain what exactly we did in that first sample rewrite.

The mod_rewrite module provides a set of configuration directives for URL rewriting, and the RewriteRule directive – which we saw in the previous sample – is the most important one. The mod_rewrite engine uses pattern-matching substitutions to make the translations, which means a good grasp of regular expressions can help you a lot.

Note: Regular expressions are so vast that they will not fit into the scope of this article. I will try to write another article on that topic someday.

1. The RewriteRule Directive

The general syntax of the RewriteRule is very straightforward.

RewriteRule Pattern Substitution [Flags]

The Pattern part is what the rewrite engine looks for in the incoming URL. So in our first sample ^first.html$ is the Pattern. The pattern is written as a regular expression.

The Substitution is the replacement or translation that is applied to the URL when the pattern matches. In our sample second.html is the Substitution part.

Flags are optional, and they make the rewrite engine perform certain other tasks apart from just doing the substitution on the URL string. The flags, if present, are written within square brackets and separated by commas.
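
As a small illustration of the three parts together, here is the earlier sample written with an escaped dot and two flags. This is only a sketch; the flags are added purely for illustration and are not part of the original sample.

```apache
RewriteEngine on
# Pattern:      ^first\.html$  (the backslash makes the dot match a literal ".")
# Substitution: second.html
# Flags:        NC = case-insensitive match, L = stop processing further rules
RewriteRule ^first\.html$ second.html [NC,L]
```

Without the backslash, the dot matches any single character, so ^first.html$ would also match a request such as firstxhtml.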

Let's look at a more complex rewrite rule. Consider the following URL.

http://yourwebsite/articles.php?category=stamps&id=122

Now we will convert the above URL into a search engine and user friendly URL like the one given below.

http://yourwebsite/articles/stamps/122

Create a page called articles.php with the following code:

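<?php
// Read the two query-string parameters (empty string if one is missing).
$category = isset($_GET['category']) ? $_GET['category'] : '';
$id = isset($_GET['id']) ? $_GET['id'] : '';

// Escape the values before printing them so they cannot inject HTML.
echo "Category : " . htmlspecialchars($category) . " ";
echo "ID : " . htmlspecialchars($id);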

This page simply prints the two GET variables passed to it on the webpage.

Open the .htaccess file and write in the below given Rule.

RewriteEngine on
RewriteRule ^articles/(\w+)/([0-9]+)$ /articles.php?category=$1&id=$2

The pattern ^articles/(\w+)/([0-9]+)$ can be broken down as:

^articles/ – checks if the request starts with ‘articles/’

(\w+)/ – checks if this part is a single word (one or more letters, digits or underscores) followed by a forward slash. The parentheses are used to extract the parameter values, which we need for the actual query string in the substituted URL. Whatever matches the pattern placed in parentheses is stored in a special variable, which can be back-referenced in the substitution part as $1, $2 and so on, one for each pair of parentheses.

([0-9]+)$ – checks for one or more digits as the last part of the URL.

Try requesting the articles.php file in your test server with the below given url.

http://yourwebsite/articles/coins/1222

The URL rewrite rule you have written will kick in, and you will see the result as if the URL requested were:

http://yourwebsite/articles.php?category=coins&id=1222

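Now you can work on this sample to build more and more complex URL rewriting rules. By using URL rewriting in the above example we have achieved a search engine and user friendly URL. Because the pattern accepts only word characters for the category and digits for the id, the friendly URL also resists casual script-kiddie injection attempts, although articles.php should still validate its own input since it can be requested directly.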
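
As one possible next step, the rule could be loosened slightly to accept hyphenated category names and an optional trailing slash. This is a sketch under those assumptions, not a required form; the hyphen support and the flags are illustrative additions.

```apache
RewriteEngine on
# Category may contain letters, digits, underscores or hyphens; id must be digits.
# The trailing "/?" also accepts /articles/stamps/122/ .
# L stops further rule processing; QSA keeps any extra query string the visitor sent.
RewriteRule ^articles/([\w-]+)/([0-9]+)/?$ /articles.php?category=$1&id=$2 [L,QSA]
```

With QSA in place, a request for /articles/rare-stamps/122?page=2 would reach articles.php with category, id and page all present in the query string.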

What does the Flags parameter of RewriteRule directive do?

RewriteRule flags offer us a way to control how each mod_rewrite rule behaves. The flags are written together in a single set of square brackets, separated by commas, and there are well over a dozen flags to choose from. They range from flags that control the way a rule is interpreted to more complex ones, such as those that send specific HTTP headers back to the client when a match is found on the pattern.

Let's look at some of the basic flags.

  • [NC] flag (nocase) – This makes mod_rewrite treat the pattern in a case-insensitive manner.
  • [F] flag (forbidden) – This makes Apache send a forbidden HTTP response – status 403 – back to the client.
  • [R] flag (redirect) – This flag makes mod_rewrite use a formal HTTP redirect instead of the internal Apache rewrite. You can use it to inform the client about the redirection. By default it sends a Moved Temporarily response – status 302 – but the flag takes an optional parameter that you can use to change the response code. If you wish to send a response code of 301 – Moved Permanently – the flag can be written as [R=301].
  • [G] flag (gone) – This flag makes Apache respond with HTTP status 410 – Gone.
  • [L] flag (last) – This makes mod_rewrite stop processing the directives that follow if the current rule matches.
  • [N] flag (next) – This flag makes the rewrite engine stop processing and loop back to the start of the rule list. A point to note is that the URL used for pattern matching will be the rewritten one. This flag can create an endless loop, so extreme care should be taken while using it.
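
The flags listed above might be used along the following lines in a .htaccess file. This is a sketch only; every file and directory name in it is made up for illustration.

```apache
RewriteEngine on

# [NC]: match /About.html, /ABOUT.HTML and so on.
RewriteRule ^about\.html$ about-us.html [NC]

# [F]: refuse access to anything under private/ with a 403 response.
# A substitution of "-" means "leave the URL unchanged".
RewriteRule ^private/ - [F]

# [G]: tell clients that a retired page is gone for good (410).
RewriteRule ^old-promo\.html$ - [G]

# [R=301,L]: send the browser to the new address and stop processing.
RewriteRule ^old-page\.html$ /new-page.html [R=301,L]
```

Unlike the internal rewrite in the first rule, the [R=301] rule changes the address shown in the visitor's browser.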

There are other flags too, but they are too complex to explain within the scope of this article; you can find more information on them in the mod_rewrite manual.

2. The RewriteCond Directive

This directive gives you additional power by letting you test a range of conditions. When combined with RewriteRule, it lets you rewrite URLs on your website only when those conditions succeed. RewriteCond statements are like if () statements in a programming language, except that here they decide whether the RewriteRule directive that follows should be carried out or not. You can use this directive to prevent hot linking, to check whether the client meets certain criteria before rewriting the URL, and so on.

The general syntax of the RewriteCond is:

RewriteCond string-to-test condition-pattern

The string-to-test part of the RewriteCond has access to a large set of variables, such as the HTTP header variables, request variables, server variables and time variables, so you can do a lot of complex conditional checking while writing directives. You can use any of these variables as the string to test by writing it in the %{VARIABLE_NAME} format. For example, to use the HTTP_REFERER variable, write %{HTTP_REFERER}.

The condition part can be a simple string or a very complex regular expression as your imagination is the only limit with this module.

Let's take a look at an example of conditional rewriting using the RewriteCond directive:

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/4(.*)MSIE
RewriteRule ^index.html$ /index.ie.html [L]

RewriteCond %{HTTP_USER_AGENT} ^Mozilla/5(.*)Gecko
RewriteRule ^index.html$ /index.netscape.html [L]

RewriteRule ^index.html$ /index.other.html [L]

This example uses HTTP_USER_AGENT as the test string of the RewriteCond directive. It uses the HTTP_USER_AGENT header variable to detect the visitor's browser, matches it against known preset values, and serves different pages based on the result. The first RewriteCond checks HTTP_USER_AGENT for a match with the ^Mozilla/4(.*)MSIE pattern. This match occurs when a user visits the page using IE as the browser. The RewriteRule given just under that condition then kicks in and rewrites the URL to serve the index.ie.html page to the IE visitor.

Similarly, the second RewriteCond checks for Mozilla-specific browsers, and its RewriteRule substitutes index.netscape.html when a positive match is made on the ^Mozilla/5(.*)Gecko pattern. The third RewriteRule is there to catch other browsers: since a RewriteCond applies only to the RewriteRule that immediately follows it, this last rule has no condition and is considered whenever both the first and second RewriteCond fail. A point to note in the above example is the use of the [L] flag with all the RewriteRule directives. It prevents later rules from being applied once one RewriteRule has matched.

Two flags can be used to further control the way the RewriteCond directive behaves: [NC] – case-insensitive – and [OR] – chaining of multiple RewriteCond directives with a logical OR.
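
To make the two condition flags concrete, here is a sketch of hot-link protection followed by an [OR] chain. The domain example.com and the user-agent names BadBot and EvilScraper are placeholders, not real values.

```apache
RewriteEngine on

# Hot-link protection: consecutive RewriteCond lines are ANDed by default.
# Skip requests that send no Referer at all...
RewriteCond %{HTTP_REFERER} !^$
# ...and block the rest unless the Referer is our own site (example.com is a placeholder).
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com/ [NC]
RewriteRule \.(gif|jpe?g|png)$ - [F,NC]

# [OR]: forbid the request if either User-Agent condition matches.
# BadBot and EvilScraper are made-up names.
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC,OR]
RewriteCond %{HTTP_USER_AGENT} ^EvilScraper [NC]
RewriteRule ^ - [F]
```

The leading ! in a condition pattern negates the match, so the first block fires only when a Referer is present and does not point at the site itself.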

By using these two directives – RewriteRule and RewriteCond – you can implement a lot of powerful URL rewriting functionality on your website.

Other mod_rewrite Directives

  1. RewriteBase Directive – This directive sets the base URL for per-directory (.htaccess) rewrites. It can solve the problem of RewriteRule creating non-existent URLs when the physical file system structure on the web server differs from the structure of the website's URLs. Setting this directive as given below can solve this problem.

    RewriteBase /

  2. RewriteMap Directive – This directive is very powerful as it allows you to map unique values to a set of other replacement values from a table and to use it in the substitution to generate on the fly URLs. This can be especially useful for huge e-commerce or CMS kind of applications where you need to replace each section name or category name in the URL with a corresponding id taken from a database.
  3. RewriteLog Directive – This directive can be used to set the log file that the mod_rewrite engine will use to log all the actions taken during processing on client requests. The syntax is:

    RewriteLog /path/to/logfile

    This directive should be defined in the httpd.conf file as this directive is applied on a per-server basis.

  4. RewriteLogLevel Directive – This directive tells the mod_rewrite module how much information about its internal processing should be logged while rewriting URLs. It takes values from 0 to 9, where 0 means no logging and 9 means all the information is logged. A higher level of logging can make Apache run slowly, so a level above 2 is desirable only for debugging purposes. Note that RewriteLog and RewriteLogLevel exist only in Apache 2.2 and earlier; Apache 2.4 replaced them with the LogLevel directive. This directive can be applied using the syntax given below.
    RewriteLogLevel levelnumber
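
The server-level directives in this list could be combined roughly as follows on Apache 2.4, where rewrite logging is configured through LogLevel. This is a sketch under stated assumptions: the map name catmap, the map file path and its contents are all hypothetical.

```apache
# httpd.conf or a <VirtualHost> block (server context, not .htaccess)
RewriteEngine on

# Apache 2.4 logging: rewrite trace messages go to the normal error log.
# trace1..trace8 give increasing detail; use only while debugging.
LogLevel alert rewrite:trace3

# RewriteMap with a plain-text lookup table (path and map name are assumptions).
# Each line of the file holds "key value", for example:
#   stamps 12
#   coins  7
RewriteMap catmap "txt:/etc/apache2/category-map.txt"

# In server context the pattern is matched against the full URL path, leading slash included.
# ${catmap:$1|0} looks up the category name and falls back to 0 if it is not in the map.
RewriteRule ^/articles/(\w+)/([0-9]+)$ /articles.php?category=${catmap:$1|0}&id=$2 [L]
```

RewriteMap can only be declared in the server or virtual host configuration, which is why this sketch does not belong in a .htaccess file.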

Conclusion

In this article we have taken only a brief look at the power of the mod_rewrite module. It is only a scratch on the surface but I hope it is enough to get you started on using this module on your web server environment.