TikiURLt | Coofer Cat's Weblog

LATEST NEWS

3rd December 2006

I've moved all the sites I'm responsible for to Drupal, so am not doing any active work on Tiki (or URLt) any more. However, if you use this mod and want to ask a question, feel free - I'll do my best to help you.

22nd May, 2005

Tiki URLt Version 1.0 is now ready to go! Download it here

It's had lots of work done to it, and has the following main features:

Super-fast HTML parsing and URL translation
Advanced cached database reads for greater efficiency (configurable!)
Handles errors gracefully (eg. Page Not Found, Translation Error)
Can handle non-PHP content (configurable MIME types)
Works on any web server (Apache, IIS, etc) or can be used with Apache mod_rewrite
Written to the latest Tiki guidelines (ie. ADODB, PHPdoc, etc)
Tested and working on Tiki 1.8.5 and 1.9.0 (but should also work with other versions)

Now it just needs to be part of the Tiki distribution ;-)

See TikiURLtInstall for install details.

Reference sites:

http://www.coofercat.com (this site!)
http://www.pre-emptive.net

What is Tiki URLt? How does it work?

Tiki URLt is the name given to the process of transforming URLs within the application. The intention is to make URLs look different from the way they would normally look. For example, by default, Tiki would have a URL for this page that looks like this:

http://www.coofercat.com/tiki-index.php?page=TikiURLt

However, in fact it looks like this:

http://www.coofercat.com/wiki/TikiURLt

This is because the URLt process has changed it. This change, or transformation, is a two-part process. The first part is to change the URL on its way out of Tiki. That is, the Tiki application embeds URLs in the HTML it sends to the browser. The URLt process intercepts this HTML and transforms the embedded URLs. Thus, the links the user sees on a page are of the transformed type, not the default type.

The second part of the problem is when the user clicks a transformed URL. Clearly the transformed URL doesn't actually exist on the server, so the URLt process has to intercept the request as it comes into the server and transform it into the default URL style. The Tiki application then takes over, processing the request as it normally would.
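As a rough illustration of the round trip (in Python, purely for readability, and not taken from the URLt code itself), the two halves can be thought of as a pair of regular expression substitutions. The two rules and the `transform` helper below are made up for this example; real rules are whatever the administrator enters.

```python
import re

# Hypothetical rules, written for this example only; real rules are
# entered through the URLt administration forms.
OUTGOING = (r"^/tiki-index\.php\?page=(\w+)$", r"/wiki/\1")
INCOMING = (r"^/wiki/(\w+)$", r"/tiki-index.php?page=\1")


def transform(url, rule):
    """Apply one regular expression substitution to a URL."""
    pattern, replacement = rule
    return re.sub(pattern, replacement, url)


default_url = "/tiki-index.php?page=TikiURLt"
public_url = transform(default_url, OUTGOING)   # what the browser sees
restored_url = transform(public_url, INCOMING)  # what Tiki receives
print(public_url)
print(restored_url)
```

The important property is that the incoming rule undoes the outgoing one, so Tiki always receives a URL it already understands.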

Okay, so how does it actually do it?

Tiki URLt is three main pieces of code. These are:

Outgoing URLt: A Smarty Output Filter
Incoming URLt: A PHP wrapper
Administration: A few forms that allow configuration of URLt

Outgoing URLt: A Smarty Output Filter

Tiki uses the Smarty template engine to control its output. Smarty can have a number of "output filters" installed. These filters run after the HTML has been created and is about to be sent to the browser. The filters have access to the HTML, so can modify it as they wish.

The URLt filter contains an optimised (cut-down) HTML parser that looks for links embedded in the HTML (ie. within a...href, form...action and img...src tags, for example). A cut-down parser is used because it is much faster than a full-blown HTML parser (of which there are a few for PHP). As links are found, they are passed through the transformation process, the result of which is put back into the HTML in place of the original URL. (See below for more on the transformation process.)
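The idea of a cut-down parser can be sketched as follows. This is a Python approximation, not the actual Smarty filter: the `LINK_ATTR` pattern, the `translate` stand-in and the single rule inside it are all assumptions made for the example, and it only handles double-quoted attribute values.

```python
import re

# Matches href, action and src attributes with double-quoted values.
# A deliberately small pattern, not a full HTML parser.
LINK_ATTR = re.compile(r'\b(href|action|src)="([^"]*)"', re.IGNORECASE)


def translate(url):
    """Stand-in for the URLt transformer (hypothetical single rule)."""
    return re.sub(r"^tiki-index\.php\?page=(\w+)$", r"/wiki/\1", url)


def output_filter(html):
    """Rewrite every link attribute found in the HTML."""
    def swap(match):
        # Keep the attribute name, replace only the URL inside the quotes.
        return '{}="{}"'.format(match.group(1), translate(match.group(2)))
    return LINK_ATTR.sub(swap, html)


page = '<a href="tiki-index.php?page=HomePage">Home</a> <img src="logo.png">'
print(output_filter(page))
```

URLs that no rule matches (the image here) pass through untouched, which is what you would want from a filter that runs over every page.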

Incoming URLt: A PHP wrapper

Incoming URLt presents somewhat of a problem: how to intercept requests from browsers? There are a number of solutions to this problem, most notably using Apache mod_rewrite rules to change the incoming request into the real location of resources. This is of course Apache specific, and requires access to the Apache configuration, which precludes shared hosting users from using this facility.

Tiki URLt uses a PHP wrapper instead. Here, the web server must pass all requests to a single location (the PHP wrapper). This can be achieved in a number of ways (see below), but all web servers can do this in some way, and in most cases access to the server configuration is not required.

Since the PHP wrapper is called for every request, it effectively takes over the role of the web server. That is, it has to transform the URL and then locate the document that is required. In the case of a PHP document, it has to execute this PHP, but make the environment "look" as if the PHP had been called directly. Thus, the PHP that actually handles the request is unaware that it has been called by URLt.
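To give a feel for what "making the environment look right" involves, here is a small Python sketch (an illustration only, not how the PHP wrapper is written). Once the incoming URL has been translated back to the default style, it has to be split into the script to run and the request parameters that script expects. The `describe_request` function and its dictionary keys are invented for the example.

```python
from urllib.parse import urlsplit, parse_qs


def describe_request(translated_url):
    """Split a translated URL into the parts the target script expects."""
    parts = urlsplit(translated_url)
    return {
        "script": parts.path,             # the PHP file to execute
        "query_string": parts.query,      # raw query string
        "params": parse_qs(parts.query),  # parsed request parameters
    }


request = describe_request("/tiki-index.php?page=TikiURLt")
print(request["script"])
print(request["params"])
```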

In the case of non-PHP documents (eg. images, CSS, etc) the PHP wrapper has to look up the MIME type of the document. It then has to construct suitable HTTP headers and return the document. Thus, the client is unaware that URLt has taken place - the response is exactly as if the document had been retrieved directly.
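The static-file case can be sketched like this. It is Python using the standard library's `mimetypes` table as a stand-in for URLt's own configurable MIME types, and the `DEFAULT_MIME_TYPE` value is an assumption (URLt lets you set the default in its configuration).

```python
import mimetypes

DEFAULT_MIME_TYPE = "application/octet-stream"  # assumed default


def build_headers(path, size):
    """Return HTTP headers for a static document."""
    mime_type, _encoding = mimetypes.guess_type(path)
    return {
        # Fall back to the default when the extension is not recognised.
        "Content-Type": mime_type or DEFAULT_MIME_TYPE,
        "Content-Length": str(size),
    }


print(build_headers("styles/site.css", 2048))
print(build_headers("download.unknownext", 10))
```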

Administration

Administration of the URLt process is performed by three forms. The first is general configuration, which covers the following:

Is incoming and/or outgoing URLt enabled?
URLt database read cache times
Default MIME type
Default index page name
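The read cache time is easiest to picture as a simple time-based cache in front of the rules table. The sketch below is Python and only shows the general idea; the `RuleCache` class, its `loader` callback and the 60 second lifetime are assumptions for the example, not a description of how URLt stores its cache.

```python
import time


class RuleCache:
    """Keep rules in memory and reload them after max_age seconds."""

    def __init__(self, loader, max_age):
        self._loader = loader      # function that reads rules from storage
        self._max_age = max_age    # cache lifetime in seconds
        self._rules = None
        self._loaded_at = 0.0

    def get(self, now=None):
        now = time.monotonic() if now is None else now
        if self._rules is None or now - self._loaded_at >= self._max_age:
            self._rules = self._loader()
            self._loaded_at = now
        return self._rules


reads = []


def load_rules():
    reads.append(1)  # count how often storage is hit
    return [("^/wiki/(\\w+)$", "/tiki-index.php?page=\\1")]


cache = RuleCache(load_rules, max_age=60)
cache.get(now=0)    # first call reads from storage
cache.get(now=30)   # served from the cache
cache.get(now=61)   # cache expired, reads again
print(len(reads))
```

A longer cache time means fewer database reads, at the cost of rule changes taking longer to show up.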

The remaining two forms are almost identical (one for incoming, one for outgoing). They allow regular expressions to be entered that will perform transformations on URLs. Each expression is called a "rule". Each rule can have an optional textual comment added as an aid to memory and ease of administration (regular expressions can be notoriously hard to understand!). Each rule can also have a number of "modes". For example, the URLt process (see below) can continue processing after using the rule, or stop processing further rules. Rules can also be disabled for testing purposes.

The URL Transformation Process

As URLs are found (either in outgoing HTML or in incoming requests) they are passed to a transformer. The transformer works on "rules". Each rule is a regular expression substitution, with optional regular expression flags (e.g. i = case insensitive) and a "mode". The mode can be one of:

Continue
Stop on match
Stop on no match
Disabled rule

If the mode is "continue" then rule processing proceeds to the next rule, regardless of the outcome of this rule. Alternatively, the transformer can stop processing further rules if either a match is found, or if no match is found. This allows for significant optimisation of transformer execution. Finally, the transformer will skip over any disabled rules, which can be useful for testing and development of rules.

With careful use of rules and modes, complex transformations can be performed on URLs. The use of "continue" means that a single URL can undergo numerous changes, stage by stage, depending on its exact makeup. The ability to stop transformation means that careful rule ordering can not only optimise transformation, but also perform rudimentary exception handling.
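The four modes can be sketched as a small loop. This is a Python approximation written for this page, not the URLt source: the `Rule` class, the mode constants and the example rules are all hypothetical, and it assumes a rule "matches" when its substitution changes at least one occurrence.

```python
import re
from dataclasses import dataclass

CONTINUE, STOP_ON_MATCH, STOP_ON_NO_MATCH, DISABLED = range(4)


@dataclass
class Rule:
    pattern: str
    replacement: str
    mode: int = CONTINUE
    flags: int = 0  # e.g. re.IGNORECASE for the "i" flag


def transform(url, rules):
    """Run a URL through the rules in order, honouring each rule's mode."""
    for rule in rules:
        if rule.mode == DISABLED:
            continue  # skipped entirely, handy while testing
        url, count = re.subn(rule.pattern, rule.replacement, url, flags=rule.flags)
        matched = count > 0
        if rule.mode == STOP_ON_MATCH and matched:
            break
        if rule.mode == STOP_ON_NO_MATCH and not matched:
            break
    return url


rules = [
    # Leave anything outside Tiki alone: stop if this does not match.
    Rule(r"^/tiki-", r"/tiki-", STOP_ON_NO_MATCH),
    Rule(r"^/tiki-index\.php\?page=(\w+)$", r"/wiki/\1", STOP_ON_MATCH),
    Rule(r"^/tiki-list_blogs\.php$", r"/blogs", CONTINUE),
    Rule(r"^/blogs$", r"/never-used", DISABLED),
]

print(transform("/tiki-index.php?page=HomePage", rules))
print(transform("/tiki-list_blogs.php", rules))
print(transform("/images/logo.png", rules))
```

The first rule shows the "exception handling" use: it changes nothing, but its "stop on no match" mode lets non-Tiki URLs drop out before any of the more expensive rules run.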

Web Server Configuration

To use incoming URLt, the web server has to be configured to direct all requests to a single location (outgoing URLt does not require any web server configuration). Web server configuration is of course server specific, and access to the configuration is site specific. Readers are advised to check with their systems administrators or providers for details.

Apache server users can direct all requests for an area of their site to a single location in a number of ways. Here are two, the first using mod_rewrite, the second using core Apache functionality:

Example 1 - mod_rewrite

RewriteEngine on

 RewriteCond %{REQUEST_URI} ^\/wiki\/
 RewriteRule \/wiki\/(.*)$                      /tiki/tiki-urlt.php [L]

Here, any requests to /wiki/ will be directed to the file /tiki/tiki-urlt.php (below the DocumentRoot). This is possibly the best way to achieve the required functionality.

Example 2 - ErrorDocument

 <Location /wiki>
  ErrorDocument 404 /tiki/tiki-urlt.php
 </Location>

Here, the /wiki directory has to be created inside the DocumentRoot (but left empty). Since no requests can be satisfied, they will all result in a 404 error, which the web server will call /tiki/tiki-urlt.php to handle. This method is less than ideal because the web server has to perform slightly more work than the mod_rewrite method. Also, the web server will log a "page not found" message in the error_log for each request. However, this is a quick way to achieve the required functionality and can usually be used in shared hosting environments (by using a .htaccess file in the /wiki directory).

Current Known Issues

(1) The HTML parser in the output filter does not handle JavaScript links correctly. This is not ideal, but it is unlikely the parser will ever really be able to translate links embedded in JavaScript.

(2) Smarty seems to submit the same template information multiple times. This means that URLs are double-translated, so the "stop on match" facility is somewhat diminished in value. A Tiki bug has been raised for the problem, although it may be a broader Smarty issue.

(3) Tiki programs that return HTTP redirects (ie. "302 Location:" headers) are not available to the output filter. As a result, the redirects do not bear translated URLs. It is unclear how to solve this issue (are outgoing headers available to PHP4? If so, how!?)

(4) Numerous Tiki programs do not output consistent URLs. For example, a blog page post is identified as "tiki-view_blog_post.php?blogId=1&PostId=1". However, the link "back to blog" is "tiki-view_blog_post.php?find=&blogId=1&PostId=1...". Whilst only a slight difference in ordering, and no doubt easily done, this causes immense problems for URL translations. It is unclear how this can be overcome in a sensible Tiki-wide way.

(5) Blogs are a real pain in the neck for SEFURL work. I'm not sure what the solution is, short of some major blog hackery.

(6) Actually making regex transformations work is a really difficult process. Users require intimate knowledge of regexes, and how web pages are delivered and how requests are made. As a result, this is a very advanced feature, which only the very cunning should consider!
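To make issue (4) concrete: the two blog post URLs differ only in parameter order and an empty "find=" parameter, yet a regex rule sees them as completely different strings. One possible pre-processing step, shown here in Python as an untested idea rather than anything URLt does, is to put the query string into a canonical form before applying rules. The `canonical` function is hypothetical, and it assumes that empty parameters can safely be dropped, which may not hold for every Tiki program.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode


def canonical(url):
    """Drop empty parameters and sort the rest so equivalent URLs compare equal."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = sorted((k, v) for k, v in pairs if v != "")
    return parts.path + ("?" + urlencode(kept) if kept else "")


a = canonical("tiki-view_blog_post.php?blogId=1&PostId=1")
b = canonical("tiki-view_blog_post.php?find=&blogId=1&PostId=1")
print(a)
print(a == b)
```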

Legal Stuff

Do what you want with this information. Let me know if there's anything wrong with it, but by all means use it for (real!) Tiki documentation if URLt ever makes it into Tiki for real.

Attachment	Size
urlt1-0.tar.gz	10.74 KB
tiki-urlt-1.1.tar.gz	17.46 KB

Submitted by coofercat on Mon, 2005-05-23 18:28

Comments

Cannot download the package

Hi there!

I don't seem to be able to download TikiURLt from your link. I just get a 0 bytes gzip file (:cry:)

Submitted by Anonymous (not verified) on Mon, 2006-10-16 20:17.

Fixed!

This should be fixed - download to your heart's content!

Submitted by coofercat on Sun, 2006-12-03 19:36.

does it add hyphens?

How will it display for example "Some Subject"

/SomeSubject/, or /Some-Subject/

A SEF mod without hyphens is utterly useless in my opinion.

Submitted by Anonymous (not verified) on Thu, 2006-10-19 20:04.

Hyphens

I'm not sure how you want to do this, but there's no reason why you couldn't construct regexes to insert/remove hyphens if you want. I'm not sure what you mean by "display", as it's all about hacking about with URLs.

The good thing about this mod is that it's incredibly flexible, so you can do what you want. The bad thing about this mod is that it's incredibly flexible, so it's not always obvious what it can do.

Submitted by coofercat on Sun, 2006-12-03 19:35.

Missing files from package.

Hi, the following files:

./lib/urlt/urltlib.php
./lib/urlt/index.php
./templates/tiki-urlt_admin.tpl
./tiki-urlt_admin.php
./tiki-urlt_edit_in.php
./tiki-urlt_edit_out.php
./tiki-urlt.php

are missing from the tar.gz package posted for download. I cannot follow the installation instructions without those files. Has anything changed, and are they not needed any more? In that case, how is the installation procedure modified?

Thanks,

Submitted by Larry (not verified) on Sat, 2007-02-10 13:52.

Check again...?

I've just had a look, and I can't see anything wrong. From your list, it implies you get the database SQL file, so you're obviously getting something. Either way, I can't see anything wrong with the archive or the download process.

Oh, and no, the install hasn't changed, you do need all of those files!

Submitted by coofercat on Sat, 2007-02-10 18:04.

drupal and tikiwiki

I found your post about porting all your sites to Drupal. I'm fairly familiar with Drupal, but I was in the process of setting up a new site in tiki, since I need some wiki functionality as well. Is this a bad idea? Should I stick with drupal and find a way to use modules and permissions to do something wiki-like?

I love Drupal otherwise, and I know its advantages; I just need some collaborative editing. What do you think? I'd appreciate your thoughts on tiki's future prospects, if you have a minute, since I'm worried I might be driving into a dead-end.

Thanks in advance.

rena

Submitted by rena (not verified) on Tue, 2007-07-17 06:52.

Re: drupal and tikiwiki

Okay, here goes... I'm a couple of years out of date with Tiki, so I may not be 100% about it, by the way.

I really liked Tiki because it's got about every feature you could ever want and a load more besides. Accordingly, it's pretty easy to get a site up that has all sorts of whizzy features and looks fairly reasonable.

Drupal by comparison requires a little more "low level" work, and despite hundreds of contrib modules struggles to do the same number of things. That said, getting a basic site up is really pretty easy.

My problem with Tiki was that vast swathes of it seem to be un-maintained or otherwise very slow to change. I submitted the odd patch here and there, and got as good as no response from them (not even to say "sorry, but no thanks"). I suspect other people have had the same experience, which means that version after version, the same problems persist.

Drupal on the other hand is incredibly tightly maintained. I've patched or requested the odd thing, and have been directed down very tight paths to refine the patches or otherwise forget about them. Whilst this may feel like some sort of infringement of my freedom or whatever, it's led to a very clean, very robust and very well maintained and tested product.

Ultimately, I decided I didn't need 95% of Tiki. The 5% I did need proved to be inefficient and seemingly unable to improve. I use maybe 20% of Drupal, but it's rock solid, extensible, efficient (by comparison) and frankly a joy to work with.

As for Wiki functionality, I don't think much of Drupal's Liquid Wiki modules. They're not overly good quality, and the maintainer very much has his own plans for them. However, they work fairly well, and do what I need. I wouldn't like to say Drupal's Wiki is immediately ideal for collaborative editing, but if you're into PHP you could probably adapt what's available. There are other Wiki modules that may suit better, I don't know.

In my opinion, stick with Drupal if you possibly can. Otherwise, if it's just a Wiki you need, maybe try MediaWiki or some such as well as looking at Tiki.

Submitted by coofercat on Tue, 2007-07-17 11:38.

took me 5mn to become a developer on tikiwiki

I guess tikiwiki has a different development model. It took me 5mn to get CVS access to tikiwiki: I went on IRC, explained what I wanted to do, read some rules, and got access to the CVS immediately.

Usually the dev team does not accept patches, because they prefer you commit your own code.

Submitted by Franck (not verified) on Sun, 2007-12-23 05:21.

Missing Files?

Hi Coofer Cat,

Like "Larry", I apparently did not download all the files..? When I open the archive this is what I see:

Could you double check the source archive again?
Thanks

Submitted by Matt (not verified) on Mon, 2007-08-13 16:22.

Whoops!

Whoops, it seems the archive is indeed in a mess.

I don't currently use Tiki on any websites, so have no test environment to try this stuff out, or indeed to develop anything new. However, from an old backup, I've created an updated archive. Hopefully you'll have better luck with that!

Sorry for the confusion - not quite sure what I was smoking back then ;-)

Submitted by coofercat on Tue, 2007-08-14 14:13.

It's not a problem, I got the

It's not a problem, I got the updated archive and am working on it now. I have taken a look at drupal though in the meantime :)
Thanks! And whatever you were smoking, pass it!

Submitted by Matt (not verified) on Tue, 2007-08-14 16:57.

Excuse me... newbie

Excuse me... newbie here...

So what if i installed the wiki in the root of my server?
www.server.com/tiki-index.php

assuming i put the wrapper files into a /tiki dir as advised, can you let me know what the .htaccess should be?

Sorry... v v new...

Submitted by Anonymous Coward (not verified) on Mon, 2007-08-27 23:39.

In the Root

In which case, put the files into the root (so the .php files go alongside the Tiki ones, templates into the templates directory and so on).

Be very careful, URLt is not for the faint of heart - it's super-advanced and not easy to get working.

Good luck!

Submitted by coofercat on Tue, 2007-08-28 08:34.

I shall tread

I shall tread carefully...

Can you let me know what the .htaccess should read though, in that event?

Sorry...!

Submitted by Anonymous Coward (not verified) on Tue, 2007-08-28 09:06.

Not useful, the URL did not

Not useful, the URL did not change...
And in pattern, replace, flags and continue, what do the flags mean?

Submitted by Anonymous Coward (not verified) on Thu, 2008-01-10 10:49.

Comments and issues

Hi there, long time tiki user here, trying this out, got a few observations and comments.

I have the incoming URLs working so far (only tried a couple of examples) and they seem ok.

When trying to allow normalised URLs (that is ones with a trailing slash if there is an item that does not look like a page name) then the "?" question mark is not recognised by the regex parser as the 0 or 1 operator, which appears to be down to the unescaped slash. While I get regex escaping rules, perhaps a link to a perl regex cheat card would help - this one perhaps http://www.addedbytes.com/cheat-sheets/regular-expressions-cheat-sheet/.

It also suffers from the same problems as using an apache rewrite rule, that internal links now are all prefixed with the pattern if it is a subdirectory pattern.

Trying the output URLs, I came across a couple of problems:

A small issue - The current archive will output "Fatal error: Smarty error: [plugin] could not load plugin file 'outputfilter.urlt.php' (core.load_plugins.php, line 118) in /home/danny/public_html/working_branch/lib/smarty/libs/Smarty.class.php on line 1095".

This is because the file outputfilter.urlt.php has been extracted under lib/smarty/plugins, and not lib/smarty/libs/plugins. It is just a matter of the archive paths still not being quite right.

So after moving that on my system I tried the first pattern:
^\/tiki-list_blogs.php$ blogs [no flags] stop on match

Enabled is true, 1 second cache.
However, I am not seeing the list blogs urls being replaced yet.
What else is needed?

Also, it would be handy if the admin interface could finally go to another 404 handler, so that the URLt error is only shown to a developer, and a user gets a search of likely matches. I am working on these, and will share my results once it all works.

Submitted by Danny Staple (not verified) on Sun, 2008-08-03 13:19.

Hmmm

Sorry - found the page not found URL already. Scratch that one.
Using periods instead of slashes means that there is no need for the route back up.

Submitted by Danny Staple (not verified) on Sun, 2008-08-03 19:15.