Using Dots in URLs

Why I finally decided to allow dots in URLs on Beamtic and in Beamtic's projects.


By Jacob

Edited: 2020-05-28 20:46

From a technical standpoint, there is nothing preventing us from using dots (.) in URLs. However, from a usability standpoint, literal dots can be a cause of confusion.

Dots in the URL may not make immediate sense to novice users who already have a hard time understanding the concept of a URL. We, as developers, can do much to improve people's understanding and normalize URLs to a point where they make more sense. The fact that most of the web does things in illogical and ugly ways does not excuse us from doing better on our own sites.

I personally avoid using a slash "/" at the end of article URLs, based on the idea that a slash indicates a directory or an index. My logic continues for the dot, because a dot, at least near the end of a string, indicates a file extension. Of course, on the World Wide Web this is not always the case, since extensions do not have to match the MIME type of the content, but making them match is still nicer to our users!

Another confusing element is the "double dot notation", or the so-called "dot segments". A dot segment consisting of two dots between slashes (/../) tells a client that the file should be requested from the parent segment. Of course, we would probably never have that in a URL, but our code should still prevent users from choosing such invalid URLs.
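A check for dot segments could look roughly like the sketch below. It is written in Python purely for illustration; the function name has_dot_segment is made up for this example and is not taken from any Beamtic project. It only looks at literal "." and ".." segments, so percent-encoded forms such as %2e%2e would need separate handling.

```python
from urllib.parse import urlsplit


def has_dot_segment(url: str) -> bool:
    """Return True if the URL path contains a "." or ".." segment.

    Hypothetical helper: only literal dot segments are detected,
    percent-encoded variants (e.g. %2e%2e) are not decoded here.
    """
    path = urlsplit(url).path
    # A dot segment is a whole path segment equal to "." or "..";
    # dots inside a longer segment (like "v1.4") are left alone.
    return any(segment in (".", "..") for segment in path.split("/"))


if __name__ == "__main__":
    for candidate in ("/photos/../admin", "/some-directory/template-v1.4/"):
        verdict = "rejected" if has_dot_segment(candidate) else "accepted"
        print(candidate, verdict)
```

A user-chosen URL would then be rejected whenever has_dot_segment returns True, while ordinary dots inside a segment remain allowed.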

See also: Absolute and Relative Paths

The need to use dots in URL paths

I write this because I personally have a real need to use dots in URLs. However, I also realize that doing so would violate my own URL logic. My future needs may not be limited to dots, but may also include other valid special characters; luckily I can easily update my regular expressions as needed.

The plus sign is another special character. It is used in the name of the C++ programming language, and it might actually cause worse problems than the dot (.). Certain programs also use a dot in their names. I personally prefer people not to use special characters when naming things, because it might cause problems when people need to talk about the product. I still hope these problems will eventually get fixed, but in the meantime, I really think people should avoid dots, plus signs, and other special characters when naming things.

The free image editor Paint.NET is a good example of software that includes a dot in the name. I think the thought behind the decision is clear: the developer probably thought something along the lines of "I am writing this in .NET, so I will just name it Paint.NET".

It is a good example of how developers sometimes put very little thought into the marketing part of their creations. In Paint.NET's case, writing about the program on social media will actually, sometimes, cause the platform to incorrectly recognize the name as a link, and even apply an anchor to it automatically. I think Instagram fixed the problem, but both Facebook and Twitter still struggle with this bug (as of May 2020).

Another example would be the robots.txt standard. This standard does not even have a proper name in my eyes. They simply chose to call it robots.txt, not thinking about the issues the dot might cause when people discuss it in tutorials.

For a long time, when writing about things with dots in their names, I chose to write DOT in caps instead of using a literal dot; but this has its own limitations. If the word before the dot is itself also in caps, this just does not look right, and it makes the name harder to read. So, I have finally decided to just allow the use of literal dots (periods). I have still not implemented it in Beamtic's PHP Photo Gallery, but it will be allowed in my future projects. It is basically just a small modification to a regular expression anyway.
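To give an idea of how small that modification can be, here is a rough sketch in Python. The two patterns and the function is_valid_slug are assumptions made up for this example; they are not the regular expressions actually used on Beamtic. The only difference between the patterns is that the separator class grows from a hyphen to a hyphen or a dot.

```python
import re

# Hypothetical slug rules, for illustration only.
# Without dots: groups of lowercase letters/digits separated by single hyphens.
SLUG_WITHOUT_DOTS = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
# With dots: the separator may now be a hyphen or a dot. Because a separator
# must sit between two alphanumeric groups, leading dots, trailing dots and
# ".." can never match.
SLUG_WITH_DOTS = re.compile(r"[a-z0-9]+(?:[.-][a-z0-9]+)*")


def is_valid_slug(slug: str, allow_dots: bool = True) -> bool:
    """Return True if the whole slug matches the selected pattern."""
    pattern = SLUG_WITH_DOTS if allow_dots else SLUG_WITHOUT_DOTS
    return pattern.fullmatch(slug) is not None


if __name__ == "__main__":
    for slug in ("robots.txt", "template-v1.4", "..", "using-dots-in-urls"):
        print(slug, is_valid_slug(slug), is_valid_slug(slug, allow_dots=False))
```

Requiring the dot to sit between two alphanumeric groups is a design choice of this sketch; it keeps dot segments out without needing a separate check.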

The good, the bad and the ugly

The good: I am less concerned about dots in directories, e.g. /some-directory/template-v1.4/, because the slash at the end clearly shows users that it is not a file. In other cases the dot is also logically clear, e.g. /some-directory/template-v1.4.css; I have nothing against those cases.

The bad: If we were to write an article about robots.txt, and the URL ended up looking like /robots.txt, this would clearly be a violation of my strict ideas on URL friendliness, since the content type would be text/html rather than the expected text/plain.
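The mismatch described in "the bad" can be checked mechanically. The sketch below uses Python's standard mimetypes module to guess the type an extension suggests and compares it with the type a page would actually be served as. The function extension_matches is a made-up name for this example, and the guess depends on the MIME table available on the system running it.

```python
import mimetypes


def extension_matches(path: str, served_type: str) -> bool:
    """Return True if the path's extension does not contradict served_type.

    Hypothetical helper: served_type is whatever Content-Type the site
    would send for this URL, e.g. "text/html" for an article.
    """
    guessed, _encoding = mimetypes.guess_type(path)
    # No recognisable extension: nothing for the content type to contradict.
    if guessed is None:
        return True
    return guessed == served_type


if __name__ == "__main__":
    # An article served as HTML at /robots.txt contradicts its extension.
    print(extension_matches("/robots.txt", "text/html"))
    # A stylesheet at a .css URL does not.
    print(extension_matches("/some-directory/template-v1.4.css", "text/css"))
```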

The ugly: Another possibility, which I would personally avoid, is to use a URL logic similar to this: /article/robots.txt. This would likely avoid confusion, but it can also be argued that the article/ part wastes valuable space in the URL, which is especially important on mobile devices. Therefore, I baptize this "the ugly" :-D

As much as I dislike allowing dots in URLs in Beamtic's projects, I also see some potential benefits when it comes to search engines and users. As for the SEO benefits, if they even exist, they will be minuscule. So this alone would not be enough for me to violate my URL logic and start allowing it.

Usually it will be very clear that a given dot does not indicate a file extension. Most users will likely not get confused. And besides, any such confusion should quickly be cleared up when they view the contents of the URL.

I have, however, also seen some bad cases that were confusing, even to me. These had to do with sites that were actually linking to real files, and instead of linking directly to the static resource, they would link to some HTML version, but the URL would have .ext in it – this is not just ugly, it is bad. Do not do this!

Conclusion

Now you know the thoughts that went into the decision to finally allow dots in URLs, and I have walked you through some specific cases which I think fully justify the use of dots in URLs.

I actually think it is often us, as web developers, who are overthinking things that are not even a problem in the first place. We need to consider the big picture when making these technical decisions, which should also include the usability side of things.

Finally, some characters will need encoding when used in a URL, and I would avoid allowing them for this reason alone.
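One way to see which names are affected is to run them through a percent-encoder and check whether anything changes. The sketch below does this with quote from Python's standard urllib.parse module; the helper needs_encoding is a made-up name for this example. Note that quote is deliberately conservative here: with safe="" it encodes everything except letters, digits, and the characters "-", ".", "_" and "~", which is stricter than the URL specification requires for a path. A plus sign, for instance, is technically permitted in a path, but it still gets encoded.

```python
from urllib.parse import quote


def needs_encoding(segment: str) -> bool:
    """Return True if percent-encoding would change the path segment.

    Hypothetical helper: uses quote() with no extra safe characters,
    so only letters, digits, "-", ".", "_" and "~" pass unchanged.
    """
    return quote(segment, safe="") != segment


if __name__ == "__main__":
    for name in ("paint.net", "template-v1.4", "c++"):
        print(name, "->", quote(name, safe=""), needs_encoding(name))
```

With this rule, a dot passes through unchanged, while a name such as c++ comes out as c%2B%2B.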
