htaccess Rewrites SEF URL and submits to PHP

What?
A quick note on an .htaccess rewrite rule I like.

What does it do?
What I type:
http://www.mywebsite.com/blog/videos.html
Sends this to server:
http://www.mywebsite.com/index.php?myFolder=blog&myFiles=videos
How?
Options -Indexes +FollowSymlinks
RewriteEngine On
RewriteBase /
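# Skip requests that already target index.php (avoids a rewrite loop)
RewriteCond %{REQUEST_URI}  !index\.php
# %1 = folder path (no dots), %2 = file name without the .html extension
RewriteCond %{REQUEST_URI} ^/([^.]+)/(\w+)\.html$  [NC]
RewriteRule .*    index.php?myFolder=%1&myFiles=%2    [L]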

ErrorDocument 400 /error/?v=400
ErrorDocument 401 /error/?v=401
ErrorDocument 403 /error/?v=403
ErrorDocument 404 /error/?v=404
ErrorDocument 500 /error/?v=500
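To see what the condition captures without touching the server, here is a rough PHP sketch that applies the same shape of pattern with preg_match. The helper name split_sef_url is made up for illustration, and the pattern is written with an escaped dot and an end anchor, which is an assumption about the intent rather than a copy of anything Apache does internally.

```php
<?php
// Hypothetical helper: applies the same pattern shape as the RewriteCond
// so the captures can be inspected from PHP.
function split_sef_url(string $path): ?array
{
    // "/<folder path without dots>/<file>.html", case-insensitive like [NC].
    if (preg_match('#^/([^.]+)/(\w+)\.html$#i', $path, $m) !== 1) {
        return null; // the rewrite condition would not match this path
    }
    return ['myFolder' => $m[1], 'myFiles' => $m[2]];
}

// Two path levels: matches.
var_export(split_sef_url('/blog/videos.html'));
echo PHP_EOL;
// Deeper paths: everything before the last slash lands in myFolder.
var_export(split_sef_url('/blog/videos/2010/january/21.html'));
echo PHP_EOL;
// Only one path level: no match.
var_export(split_sef_url('/blog.html'));
echo PHP_EOL;
```

Because the first group is greedy and cannot contain a dot, it takes as many folder levels as it can and leaves only the final segment for the second group.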

Additional Notes
If you do apply the above to your site, bear in mind the following is also true:
http://www.mysite.com/blog/pretty_much_anything_i_want_to_type_here.html

--yields
http://www.mysite.com/index.php?myFolder=blog&myFiles=pretty_much_anything_i_want_to_type_here
Anything not ending in ".html" will simply return a 404 error. I've included my error rules (they basically redirect to a branded error page).

So I sanitize on the receiving index.php file:
  1. Check for possible code injection.
  2. Do NOT allow apostrophes or double quotes. Convert them to a numeric representation (e.g. 034, 039) only if you need to convert them back later.
  3. Do NOT allow any punctuation you don't use in your site structure. Slashes and underscores (/ and _) are fine, so the regexp is /[^a-zA-Z0-9_\/]/. If you allow percent signs (%) or asterisks (*), you are asking for trouble.
  4. Note my redirect for errors.
  5. Split the first string, "myFolder", on the slash (/) so that you control the syntax and format of your site URLs.
For Example
http://www.mysite.com/blog/videos/2010/january/21.html

// sends
index.php?myFolder=blog/videos/2010/january&myFiles=21
Which, hopefully, the PHP file will handle as:
// Read the rewritten parameters (empty string if one is missing)
$site_structure_string = $_GET['myFolder'] ?? '';
// Strip everything except letters, digits, underscores and slashes
$site_structure_string = preg_replace('/[^a-zA-Z0-9_\\/]/', '', $site_structure_string);
$site_structure_item = $_GET['myFiles'] ?? '';
$site_structure_array = explode('/', $site_structure_string);

// yields
// $site_structure_array[0] = 'blog'
// $site_structure_array[1] = 'videos'
// $site_structure_array[2] = '2010'
// $site_structure_array[3] = 'january'
// $site_structure_item = '21'
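If you would rather reject a bad path than silently strip characters from it, a stricter variant might look like the sketch below. The function name parse_folder, the list of allowed top-level sections and the depth limit are all assumptions for illustration; swap in whatever matches your own site structure.

```php
<?php
// Hypothetical helper: returns the path segments, or null if the path
// does not fit the expected site structure.
function parse_folder(string $myFolder, array $allowedSections = ['blog'], int $maxDepth = 4): ?array
{
    // Reject (rather than strip) anything outside letters, digits, underscore, slash.
    if (preg_match('/[^a-zA-Z0-9_\/]/', $myFolder) === 1) {
        return null;
    }
    $parts = explode('/', $myFolder);
    // No empty segments (e.g. "blog//videos") and not deeper than the assumed limit.
    if (in_array('', $parts, true) || count($parts) > $maxDepth) {
        return null;
    }
    // The first segment must be a section the site actually has.
    if (!in_array($parts[0], $allowedSections, true)) {
        return null;
    }
    return $parts;
}

var_export(parse_folder('blog/videos/2010/january'));
echo PHP_EOL;
var_export(parse_folder("blog/vid'eos")); // contains an apostrophe, so rejected
echo PHP_EOL;
```

Returning null for anything unexpected keeps the decision about what to do next (error page, home page) in one place in the calling code.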
And don't forget to redirect the user to an error page or back to the home page if something is amiss.
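A minimal sketch of that redirect, assuming the branded error page lives at /error/?v=400 as in the ErrorDocument lines. The helper name redirect_target and the policy (no parameters at all means a normal visit to index.php, anything malformed goes to the error page) are my assumptions; use the home page URL instead if you prefer to send people there.

```php
<?php
// Hypothetical helper: returns null when the request looks fine,
// otherwise the URL to redirect to.
function redirect_target($folder, $file): ?string
{
    // No rewritten parameters at all: a direct visit to index.php, nothing to do.
    if ($folder === null && $file === null) {
        return null;
    }
    // Arrays (e.g. ?myFolder[]=x) or a missing half are never valid.
    if (!is_string($folder) || !is_string($file)) {
        return '/error/?v=400';
    }
    // Same character whitelist as the sanitizing step, applied to the whole value.
    if (preg_match('/^[a-zA-Z0-9_\/]+\z/', $folder) !== 1
        || preg_match('/^[a-zA-Z0-9_]+\z/', $file) !== 1) {
        return '/error/?v=400';
    }
    return null;
}

$target = redirect_target($_GET['myFolder'] ?? null, $_GET['myFiles'] ?? null);
// Only redirect when served over the web, so the sketch also runs from the command line.
if ($target !== null && PHP_SAPI !== 'cli') {
    header('Location: ' . $target, true, 302);
    exit;
}
```

The parameters are deliberately untyped: a crafted query string can turn a $_GET value into an array, and a string type hint would throw before the check could run.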

Oh and the above does NOT allow:
http://www.mysite.com/blog.html
If you want this, I think the rewrite rule is:
RewriteCond %{REQUEST_URI} ^/(\w+)\.html$  [NC]
But, er, I like having that first check (myFolder) that the submitted URL matches the format of your site, and it gives a lot more opportunity to check for malicious code.

Joel Lipman
www.joellipman.com
