Tiki 1.6: Make it Work with Search Engine Friendly URLs

The big issue with many web applications is that they're not Search Engine Friendly. That is, they use URL query strings to navigate to documents, which search engines won't follow (most search engines ignore URLs that have a question mark (?) in them). Unless a search engine has specific support for the web application in question, you'll be happily ignored by the majority.

I've got Tiki working with nice URLs that work with search engines. Here's how...

There are two stages to this problem. Firstly, you need to make the system understand the new URLs so that it can find the real pages, even when the "encoded" URLs are used. Secondly, you need to get the system to generate these new URLs so that all links on your Tiki site are of the new form.

Just so you know, here's a normal Tiki URL:

  http://mysite/tiki-pagehistory.php?page=HomePage&preview=2

Here's how it will look when we've worked our magic on it:

  http://mysite/tiki-pagehistory.php+qu+page+eq+HomePage+an+preview+eq+2
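
As a rough illustration of the mapping between the two forms, here is a minimal PHP sketch. It is not part of Tiki or Smarty, and the function names urlt_encode and urlt_decode are made up for the example. It simply swaps "?", "=" and "&" for "+qu+", "+eq+" and "+an+", and back again.

```php
<?php
// Hypothetical helpers, for illustration only; not part of Tiki or Smarty.

// Turn a normal Tiki URL into the "encoded" form.
function urlt_encode($url)
{
  // strtr() with an array performs all the replacements in a single pass
  return strtr($url, array('?' => '+qu+', '=' => '+eq+', '&' => '+an+'));
}

// Turn an "encoded" URL back into the normal form.
function urlt_decode($url)
{
  return strtr($url, array('+qu+' => '?', '+eq+' => '=', '+an+' => '&'));
}

echo urlt_encode('http://mysite/tiki-pagehistory.php?page=HomePage&preview=2'), "\n";
// prints http://mysite/tiki-pagehistory.php+qu+page+eq+HomePage+an+preview+eq+2
```

The rest of the article does the decoding in the web server and the encoding in a template filter, so these helpers are only a way to see the scheme at a glance.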

It is possible to use "/" instead of "+" in the URLs, if you like, but you'll have to edit some Tiki templates to get Tiki to use absolute URLs to images, CSS and files (it's not the end of the world if you don't get this right, so feel free to try it if you like).

Back to getting it to work. Firstly, get your web server to understand the new-form URLs. I use Apache; if you do as well, you can copy this. Otherwise, work with your web server to achieve the same thing. Here is the bit of config that makes things work on my server:

<Location />
  RewriteEngine On
  RewriteRule (.+)\+an\+(.+) $1\&$2
  RewriteRule (.+)\+an\+(.+) $1\&$2
  RewriteRule (.+)\+eq\+(.+) $1\=$2
  RewriteRule (.+)\+eq\+(.+) $1\=$2
  RewriteRule (.+)\+qu\+(.+) $1\?$2
</Location>

(There's more about why some lines are in there twice later on, by the way).

I'd suggest you manually enter some "encoded" URLs into a browser and make sure you get the pages you expect before you continue. Apache's mod_rewrite is pretty complex, and you need to be sure you've got everything right before there's any point moving on.
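
To work out what a given "encoded" URL ought to turn into before trying it in a browser, the five substitutions can be emulated in PHP. This is only a sketch: urlt_simulate_rewrite is a hypothetical name, and it assumes PHP's regular expressions behave like mod_rewrite's for these simple patterns. It does not model anything else mod_rewrite does.

```php
<?php
// Rough emulation of the five RewriteRule lines using preg_replace().
// urlt_simulate_rewrite() is a made-up name; this only mimics the pattern
// substitutions, not mod_rewrite itself.
function urlt_simulate_rewrite($path)
{
  $rules = array(
    array('/(.+)\+an\+(.+)/', '$1&$2'),
    array('/(.+)\+an\+(.+)/', '$1&$2'),
    array('/(.+)\+eq\+(.+)/', '$1=$2'),
    array('/(.+)\+eq\+(.+)/', '$1=$2'),
    array('/(.+)\+qu\+(.+)/', '$1?$2'),
  );
  foreach ($rules as $rule) {
    // Each rule rewrites at most one occurrence (the last one), because
    // the greedy (.+) groups swallow the rest of the string.
    $path = preg_replace($rule[0], $rule[1], $path);
  }
  return $path;
}

echo urlt_simulate_rewrite('/tiki-pagehistory.php+qu+page+eq+HomePage+an+preview+eq+2'), "\n";
// prints /tiki-pagehistory.php?page=HomePage&preview=2
```

If the real server gives a different result from the sketch for the same URL, that is a hint that the Apache config isn't being picked up as expected.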

Once you're happy with that, you can get Tiki to output the encoded URLs. Initially, I was going to write an Apache Output Filter for this, but it turns out that Smarty (the PHP template engine) can do it. Using Smarty is preferable, not least because it's much quicker, but also because it's just a bit of PHP.

Create a file called Smarty/plugins/outputfilter.urlt.php and put this in it:

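<?php
// Smarty output filter: rewrites the href of every <a> tag into the "encoded" form.
function smarty_outputfilter_urlt($string)
{
  // "?" after ".php" becomes "+qu+". [^>]*? keeps each match inside a single
  // tag, so several links on the same line all get rewritten.
  $string=preg_replace('/<a\s+([^>]*?)href=([^>]*?)\.php\?([^=]*)/i','<a $1href=$2.php+qu+$3',$string);
  // "=" becomes "+eq+" (one per link per pass, hence the repeated line)
  $string=preg_replace('/<a\s+([^>]*?)href=[\'\"]([^\'\"]*)=([^\'\"]*)[\'\"]/i','<a $1href="$2+eq+$3"',$string);
  $string=preg_replace('/<a\s+([^>]*?)href=[\'\"]([^\'\"]*)=([^\'\"]*)[\'\"]/i','<a $1href="$2+eq+$3"',$string);
  // "&" or "&amp;" becomes "+an+" (again one per link per pass)
  $string=preg_replace('/<a\s+([^>]*?)href=[\'\"]([^\'\"]*)\&a?m?p?;?([^\'\"]*)[\'\"]/i','<a $1href="$2+an+$3"',$string);
  $string=preg_replace('/<a\s+([^>]*?)href=[\'\"]([^\'\"]*)\&a?m?p?;?([^\'\"]*)[\'\"]/i','<a $1href="$2+an+$3"',$string);
  return $string;
}
?>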

(The sharp-eyed may spot that some lines are duplicated; more on that later.)

Save the file, and then edit the setup.php file (in the Tiki root). Near the bottom, find this:

$smarty = new Smarty_Sterling();
$smarty->load_filter('pre','tr');
//$smarty->load_filter('output','trimwhitespace');

...and change it to this:

$smarty = new Smarty_Sterling();
$smarty->load_filter('pre','tr');
//$smarty->load_filter('output','trimwhitespace');
$smarty->load_filter('output','urlt');

That's it! All done. Now go and try it out. You should find that all URLs Tiki generates are of the "encoded" form (except when you go to index.php - still working on that one). Since you've got your mod_rewrite working, you should find that all of the "new" links still work like the old ones.

A note about the duplicated lines.
I haven't figured out a way to make mod_rewrite change all occurrences of "+eq+" into "=" in one go. Each rule only seems to replace one occurrence, which is a problem for URLs that have more than one equals sign. Tiki happens never to use more than two parameters in a URL, so there are only ever two equals signs in one. So, rather than do some sort of recursive substitution, I just duplicate the lines in the Apache config and the Smarty filter (that way, if something does have three equals signs, it'll still work, because the filter simply leaves the extra one unencoded, even if that's not quite as ideal as I'd like).
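
On the Smarty side, the duplication could probably be avoided with a callback that rewrites the whole href in one pass, however many parameters it has. The following is a sketch under assumptions, not a tested replacement for the filter above: the names urlt_any and urlt_any_encode_href are made up, it only touches quoted hrefs containing ".php?", and it assumes every such link points at the Tiki site itself.

```php
<?php
// Hypothetical alternative output filter; the name "urlt_any" is made up.
function smarty_outputfilter_urlt_any($string)
{
  // Match each <a> tag up to and including its quoted href value.
  // \2 makes the closing quote match the opening one.
  return preg_replace_callback(
    '/(<a\s[^>]*?href=)(["\'])(.*?)\2/is',
    'urlt_any_encode_href',
    $string
  );
}

function urlt_any_encode_href($m)
{
  // $m[1] = tag text up to "href=", $m[2] = quote character, $m[3] = URL
  if (strpos($m[3], '.php?') === false) {
    return $m[0]; // no query string on a .php page: leave the link alone
  }
  // strtr() tries the longest key first, so "&amp;" wins over a bare "&"
  $url = strtr($m[3], array(
    '&amp;' => '+an+',
    '&'     => '+an+',
    '?'     => '+qu+',
    '='     => '+eq+',
  ));
  return $m[1] . $m[2] . $url . $m[2];
}
```

Going by Smarty's plugin naming, this would live in a file called outputfilter.urlt_any.php and be loaded with load_filter('output','urlt_any'). The Apache rules would still need one copy per parameter, since this only changes the encoding side.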

A note about using "/" instead of "+".
You can use slashes instead of plusses in URLs. In fact, the URLs are a lot more intuitive that way. However, Tiki's templates will need some adjusting to make things work if you do this. That's because Tiki uses relative pathnames for just about everything (images, files and CSS). So, if you've fetched tiki-index.php/qu/page/eq/fred then your browser will request "elegant.css" relative to that, which won't exist on your server. You can work around this by changing the header.tpl template to specify that the CSS is "/styles/.....css". You'll need to do the same for all the images and other files though, which is a fair bit of work, so be careful.

Submitted by coofercat on Tue, 2003-06-17 15:56

Comments

Update for Tiki 1.8.3:

"Smarty/plugins/outputfilter.urlt.php" should be "lib/smarty/plugins/"

"setup.php" should be "setup_smarty.php"

...other than that, it's all good. Stand by for a much more sophisticated solution from me though ;-)

Submitted by coofercat on Sat, 2004-07-24 18:23.

It would look kinda weird with +. Can't it be done using "_" or "-"? Not only that, but from my experiments with searching, the plus isn't viewed as a separator by search engines (by Google at least). So if you have "www.somesite.com/Harry+Potter.html", chances are that if you search for "Potter" only, it won't show up as a result.

Submitted by trickster (not verified) on Sun, 2004-07-25 16:01.