htaccess Rewrites SEF URL and submits to PHP

What?
A quick note on an .htaccess rewrite rule I've come to like.

What does it do?
What I type:
http://www.mywebsite.com/blog/videos.html
Sends this to server:
http://www.mywebsite.com/index.php?myFolder=blog&myFiles=videos
How?
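Options -Indexes +FollowSymLinks
RewriteEngine On
RewriteBase /
# Skip requests that already target index.php (avoids a rewrite loop)
RewriteCond %{REQUEST_URI}  !index\.php
# %1 = folder path (may contain slashes), %2 = file name without ".html"
RewriteCond %{REQUEST_URI} ^/([^.]+)/(\w+)\.html$  [NC]
RewriteRule .*    index.php?myFolder=%1&myFiles=%2    [L]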

ErrorDocument 400 /error/?v=400
ErrorDocument 401 /error/?v=401
ErrorDocument 403 /error/?v=403
ErrorDocument 404 /error/?v=404
ErrorDocument 500 /error/?v=500

Additional Notes
If you do apply the above to your site, bear in mind the following is also true:
http://www.mysite.com/blog/pretty_much_anything_i_want_to_type_here.html

--yields
http://www.mysite.com/index.php?myFolder=blog&myFiles=pretty_much_anything_i_want_to_type_here
Anything not ending in ".html" will simply return a 404 error. I've included my error rules (they basically redirect to a branded error page).
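
To see what the second RewriteCond captures, the pattern can be tried out in PHP with preg_match. This is only a sketch: the function name sef_capture is made up for illustration, the pattern is written here with the dot escaped and the end anchored, and it assumes PHP's regex engine treats this simple pattern the same way Apache does.

```php
<?php
// Hypothetical helper: mimics the second RewriteCond from the .htaccess rules.
// Returns array(myFolder, myFiles) on a match, or null when the request
// would not be rewritten (and so would end up as a 404).
function sef_capture($request_uri)
{
    // The "i" modifier stands in for the [NC] (no case) flag.
    if (preg_match('#^/([^.]+)/(\w+)\.html$#i', $request_uri, $m)) {
        return array($m[1], $m[2]);
    }
    return null;
}

var_dump(sef_capture('/blog/videos.html'));                 // folder "blog", file "videos"
var_dump(sef_capture('/blog/videos/2010/january/21.html')); // folder "blog/videos/2010/january", file "21"
var_dump(sef_capture('/blog/videos.php'));                  // NULL: does not end in ".html"
var_dump(sef_capture('/blog.html'));                        // NULL: no folder segment
```

The first group is greedy and may contain slashes, so everything up to the last slash lands in myFolder and only the final segment lands in myFiles.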

So I sanitize in the receiving index.php file:
  1. Check for possible code injection.
  2. Do NOT allow apostrophes or double quotes. Convert them to a numeric representation (e.g. 034, 039) only if you need to convert them back later.
  3. Do NOT allow any punctuation you don't use in your site structure. Slashes and underscores (/ and _) are fine, so the regexp is /[^a-zA-Z0-9_\/]/. If you allow percent signs (%) or asterisks (*) then you are asking for trouble.
  4. Note my redirect for errors.
  5. Split the first string, "myFolder", using the slash (/) as a delimiter; this lets you control the syntax/format of your site URLs.
For Example
http://www.mysite.com/blog/videos/2010/january/21.html

// sends
index.php?myFolder=blog/videos/2010/january&myFiles=21
Which, hopefully, the PHP file will handle as:
$site_structure_string = isset($_GET['myFolder']) ? $_GET['myFolder'] : '';
// Strip everything except letters, digits, underscores and slashes
$site_structure_string = preg_replace('/[^a-zA-Z0-9_\\/]/', '', $site_structure_string);
$site_structure_item = isset($_GET['myFiles']) ? $_GET['myFiles'] : '';
$site_structure_array = explode('/', $site_structure_string);

// yields
// $site_structure_array[0] = 'blog'
// $site_structure_array[1] = 'videos'
// $site_structure_array[2] = '2010'
// $site_structure_array[3] = 'january'
// $site_structure_item = '21'
And don't forget to redirect the user to an error page or back to the home page if something is amiss.
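
The checks described above could be pulled together roughly as follows. Treat it as a sketch, not the code running on this site: the function name parse_site_request is hypothetical, and it rejects a request containing unexpected characters outright instead of stripping them with preg_replace as in the snippet above.

```php
<?php
// Hypothetical helper: validates the two parameters produced by the rewrite rule.
// Returns array('folders' => array(...), 'item' => '...'), or null if
// anything is amiss.
function parse_site_request(array $query)
{
    $folder = isset($query['myFolder']) ? $query['myFolder'] : '';
    $item   = isset($query['myFiles'])  ? $query['myFiles']  : '';

    // Both values must be plain strings (rejects input such as ?myFolder[]=x).
    if (!is_string($folder) || !is_string($item)) {
        return null;
    }
    // Whitelist: letters, digits and underscores, with single slashes
    // between folder parts. \z anchors at the very end of the string.
    if (!preg_match('#^[a-zA-Z0-9_]+(/[a-zA-Z0-9_]+)*\z#', $folder)) {
        return null;
    }
    if (!preg_match('/^[a-zA-Z0-9_]+\z/', $item)) {
        return null;
    }
    return array('folders' => explode('/', $folder), 'item' => $item);
}

// Assumed usage inside index.php (the target reuses the ErrorDocument URL):
//   $request = parse_site_request($_GET);
//   if ($request === null) { header('Location: /error/?v=404'); exit; }
```

Rejecting instead of stripping means a URL such as /blog/vid'eos.html goes to the error page, where stripping would silently treat it as /blog/videos.html. Which behaviour you prefer is a design choice.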

Oh and the above does NOT allow:
http://www.mysite.com/blog.html
If you want this, I think the rewrite condition is the following (note that it captures only one group, so the RewriteRule would need to use just %1):
RewriteCond %{REQUEST_URI} ^/(\w+)\.html$  [NC]
But, er, I like that first check (myFolder): it confirms that the submitted URL matches the format of your site, and it gives a lot more opportunity to check for malicious code.

Credit where Credit is Due:


Feel free to copy, redistribute and share this information. All that we ask is that you attribute credit and possibly even a link back to this website as it really helps in our search engine rankings.

Disclaimer: The information on this website is provided without warranty and any content is merely the opinion of the author. Please try to test in development environments prior to adapting them to your production environments. The articles are written in good faith and, at the time of print, are working examples used in a commercial setting.

Thank you for visiting and, as always, we hope this website was of some use to you!

Kind Regards,

Joel Lipman
www.joellipman.com

© 2021 Joel Lipman .com. All Rights Reserved.