{"id":71,"date":"2019-04-17T18:13:12","date_gmt":"2019-04-17T18:13:12","guid":{"rendered":"https:\/\/r0b3rt.com\/andr-web\/?p=71"},"modified":"2019-04-17T18:13:12","modified_gmt":"2019-04-17T18:13:12","slug":"compact-url","status":"publish","type":"post","link":"http:\/\/r0b3rt.com\/andr-web\/compact-url\/","title":{"rendered":"Compact URL"},"content":{"rendered":"<p><img decoding=\"async\" src=\"https:\/\/anandjoshi.me\/img\/url-shorten.PNG\" alt=\"url-shorten\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>The heart of the project is to understand how to write a good <a href=\"http:\/\/www.eternallyconfuzzled.com\/tuts\/algorithms\/jsw_tut_hashing.aspx\" target=\"_blank\" rel=\"noopener noreferrer\">Hashing function.<\/a><\/p>\n<p>Let us look at the the <strong><em>model.py<\/em><\/strong> inside the smallify app in the above project.<\/p>\n<div class=\"language-python highlighter-rouge\">\n<div class=\"highlight\">\n<pre class=\"highlight\"><code><span class=\"k\">class<\/span> <span class=\"nc\">URL<\/span><span class=\"p\">(<\/span><span class=\"n\">models<\/span><span class=\"o\">.<\/span><span class=\"n\">Model<\/span><span class=\"p\">):<\/span>\r\n    <span class=\"n\">original_url<\/span> <span class=\"o\">=<\/span> <span class=\"n\">models<\/span><span class=\"o\">.<\/span><span class=\"n\">CharField<\/span><span class=\"p\">(<\/span><span class=\"n\">max_length<\/span><span class=\"o\">=<\/span><span class=\"mi\">300<\/span><span class=\"p\">,<\/span> <span class=\"n\">unique<\/span><span class=\"o\">=<\/span> <span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\r\n    <span class=\"n\">shortened_url<\/span> <span class=\"o\">=<\/span> <span class=\"n\">models<\/span><span class=\"o\">.<\/span><span class=\"n\">CharField<\/span><span class=\"p\">(<\/span><span class=\"n\">max_length<\/span><span class=\"o\">=<\/span><span class=\"mi\">30<\/span><span class=\"p\">,<\/span> <span class=\"n\">unique<\/span> <span class=\"o\">=<\/span>  <span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\r\n    \r\n    <span class=\"k\">def<\/span> <span class=\"nf\">shorten_url<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\r\n        <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">shortened_url<\/span> <span class=\"o\">=<\/span> <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">__fnv_hash<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">original_url<\/span><span class=\"p\">)<\/span>\r\n        \r\n        <span class=\"k\">if<\/span><span class=\"p\">(<\/span><span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">shortened_url<\/span><span class=\"p\">)<\/span> <span class=\"o\">&gt;=<\/span> <span class=\"nb\">len<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">original_url<\/span><span class=\"p\">)):<\/span>\r\n            <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">shortened_url<\/span> <span class=\"o\">=<\/span>  <span class=\"bp\">self<\/span><span class=\"o\">.<\/span><span class=\"n\">original_url<\/span>\r\n        \r\n        \r\n    <span class=\"nd\">@staticmethod<\/span>\r\n    <span class=\"k\">def<\/span> <span class=\"nf\">__fnv_hash<\/span><span class=\"p\">(<\/span><span class=\"n\">key<\/span><span class=\"p\">):<\/span>\r\n        <span class=\"n\">h<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">2166136261<\/span>\r\n        \r\n        <span class=\"k\">for<\/span> <span class=\"n\">k<\/span> <span class=\"ow\">in<\/span> <span class=\"n\">key<\/span><span class=\"p\">:<\/span>\r\n            <span class=\"n\">h<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"n\">h<\/span><span class=\"o\">*<\/span><span class=\"mi\">16777619<\/span><span class=\"p\">)<\/span><span class=\"o\">^<\/span><span class=\"nb\">ord<\/span><span class=\"p\">(<\/span><span class=\"n\">k<\/span><span class=\"p\">)<\/span>\r\n        \r\n        <span class=\"c\"># Return 8 bit URL<\/span>\r\n        <span class=\"k\">return<\/span> <span class=\"n\">base64<\/span><span class=\"o\">.<\/span><span class=\"n\">encode<\/span><span class=\"p\">(<\/span><span class=\"n\">h<\/span><span class=\"o\">%<\/span><span class=\"mi\">281474976710656<\/span><span class=\"p\">)<\/span>\r\n    \r\n    <span class=\"k\">def<\/span> <span class=\"nf\">__str__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">):<\/span>\r\n        <span class=\"k\">return<\/span> <span class=\"n\">models<\/span><span class=\"o\">.<\/span><span class=\"n\">Model<\/span><span class=\"o\">.<\/span><span class=\"n\">__str__<\/span><span class=\"p\">(<\/span><span class=\"bp\">self<\/span><span class=\"p\">)<\/span>  \r\n<\/code><\/pre>\n<\/div>\n<\/div>\n<p>It makes use of the <a href=\"http:\/\/www.isthe.com\/chongo\/tech\/comp\/fnv\/\" target=\"_blank\" rel=\"noopener noreferrer\">FNV hashing algorithm<\/a>. Look at the private method <code class=\"highlighter-rouge\">__fnv_hash<\/code> in the above code snippet. It converts an arbitary length of string <em>key<\/em> into a 8 bit URL. Here every character of the key (original url) is XORed with a random integer <strong>h<\/strong> which is also multiplied at each stage to get a good folding effect. XORing gives a good uniform distribution. After this we encode the given integer to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Base64\" target=\"_blank\" rel=\"noopener noreferrer\">base64<\/a>. Like a binary base has only 2 digits to represent every possible number, base64 uses 64 different characters. This gives us a more compact representation of the integer <strong>h<\/strong>. Finally, we do modular division by <em>(281474976710655 + 1)<\/em> to get a 8 bit compact URL.<\/p>\n<h3 id=\"why-281474976710656-\">Why 281474976710656 ?<\/h3>\n<p>As we can see from the below commands, the 64th character (starts with 0 to 63) is an underscore <code class=\"highlighter-rouge\">\" _ \"<\/code>. So the value of 8 consecutive <code class=\"highlighter-rouge\">\" _ \"<\/code> is 281474976710655 and hence we do a modular division with (281474976710655 + 1)<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/anandjoshi.me\/img\/encoding-url-shortener.PNG\" alt=\"encoding-url-shortener\" \/><\/p>\n<p class=\"notice\">With this, we have some basic understanding of URL Shorteners. The demo project <a href=\"https:\/\/github.com\/anandjoshi91\/url-shortener\" target=\"_blank\" rel=\"noopener noreferrer\">url-shortener<\/a> in quite naive and created for this tutorial. Do check it out and let me know if you have any improvements in it. Feel free to send a pull request and looking forward to your views, opinions, and tips in the comments below.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>&nbsp; The heart of the project is to understand how to write a good Hashing function. Let us look at the the model.py inside the smallify app in the above project. class URL(models.Model): original_url = models.CharField(max_length=300, unique= True) shortened_url = models.CharField(max_length=30, unique = True) def shorten_url(self): self.shortened_url = self.__fnv_hash(self.original_url) if(len(self.shortened_url) &gt;= len(self.original_url)): self.shortened_url = self.original_url &hellip; <a href=\"http:\/\/r0b3rt.com\/andr-web\/compact-url\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Compact URL&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[8],"tags":[],"class_list":["post-71","post","type-post","status-publish","format-standard","hentry","category-computer-it"],"_links":{"self":[{"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/posts\/71","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/comments?post=71"}],"version-history":[{"count":1,"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/posts\/71\/revisions"}],"predecessor-version":[{"id":72,"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/posts\/71\/revisions\/72"}],"wp:attachment":[{"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/media?parent=71"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/categories?post=71"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/r0b3rt.com\/andr-web\/wp-json\/wp\/v2\/tags?post=71"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}