A look at base62 and TinyURL

Preface

Base64 is familiar to most developers. But what is base62? It is an encoding that uses only the 62 alphanumeric characters, and it is often used to generate short URLs.

The 62 ASCII alphanumeric characters

Value Encoding  Value Encoding  Value Encoding  Value Encoding
  0 a            17 r            34 I            51 Z
  1 b            18 s            35 J            52 0
  2 c            19 t            36 K            53 1
  3 d            20 u            37 L            54 2
  4 e            21 v            38 M            55 3
  5 f            22 w            39 N            56 4
  6 g            23 x            40 O            57 5
  7 h            24 y            41 P            58 6
  8 i            25 z            42 Q            59 7
  9 j            26 A            43 R            60 8
 10 k            27 B            44 S            61 9
 11 l            28 C            45 T
 12 m            29 D            46 U
 13 n            30 E            47 V
 14 o            31 F            48 W
 15 p            32 G            49 X
 16 q            33 H            50 Y

26 lowercase letters + 26 uppercase letters + 10 digits = 62

    public static final String BASE_62_CHAR = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    public static final int BASE = BASE_62_CHAR.length();

Converting between base62 and decimal

Base62 to decimal

Recall how binary is converted to decimal: working from right to left, multiply each binary digit by the corresponding power of 2, with the exponent starting at 0, and add up the results. Base62 to decimal works the same way: working from right to left, multiply each digit's value by 62 to the power of n, where n starts at 0, and sum the products.

    public static long toBase10(String str) {
        // start from the rightmost (least significant) character
        return toBase10(new StringBuilder(str).reverse().toString().toCharArray());
    }

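    private static long toBase10(char[] chars) {
        long n = 0;
        int pow = 0;
        // chars are least significant first, so the exponent grows as we go
        for (char item : chars) {
            int digit = BASE_62_CHAR.indexOf(item);
            if (digit < 0) {
                throw new IllegalArgumentException("Not a base62 character: " + item);
            }
            n += toBase10(digit, pow);
            pow++;
        }
        return n;
    }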

    private static long toBase10(int n, int pow) {
        return n * (long) Math.pow(BASE, pow);
    }
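As a worked example of the rule above, "cb" decodes to 2 * 62^1 + 1 * 62^0 = 125. The same sum can also be accumulated from left to right with Horner's rule, which avoids reversing the string and avoids Math.pow. The following is only a sketch of that alternative, not the code used in this article; the class name Base62Decoder and the method name decode are made up for illustration.

```java
final class Base62Decoder {
    // Same alphabet as BASE_62_CHAR above.
    static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    static final int BASE = ALPHABET.length();

    /** Decodes a base62 string, most significant character first. */
    static long decode(String str) {
        if (str == null || str.isEmpty()) {
            throw new IllegalArgumentException("input must not be empty");
        }
        long n = 0;
        for (int i = 0; i < str.length(); i++) {
            int digit = ALPHABET.indexOf(str.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException(
                        "not a base62 character: " + str.charAt(i));
            }
            // Horner's rule: shift the running total one base62 place, then add the digit.
            // The *Exact methods throw ArithmeticException instead of silently overflowing.
            n = Math.addExact(Math.multiplyExact(n, (long) BASE), (long) digit);
        }
        return n;
    }
}
```

With this sketch, Base62Decoder.decode("cLs") returns 10000, since 2 * 3844 + 37 * 62 + 18 = 10000.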

Decimal to base62

Recall how decimal is converted to binary: repeatedly divide by 2, record each remainder, and read the remainders in reverse order. Converting to base62 works the same way: repeatedly divide by 62, map each remainder to its character, and reverse the result.

    public static String fromBase10(long i) {
        StringBuilder sb = new StringBuilder("");
        if (i == 0) {
            return "a";
        }
        while (i > 0) {
            i = fromBase10(i, sb);
        }
        return sb.reverse().toString();
    }

    private static long fromBase10(long i, final StringBuilder sb) {
        int rem = (int)(i % BASE);
        sb.append(BASE_62_CHAR.charAt(rem));
        return i / BASE;
    }
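To make the divide-and-reverse procedure concrete, the sketch below records each division step for a given number. It is a hypothetical helper written for illustration (the names Base62EncodeTrace, trace and encode do not appear elsewhere in this article), and it assumes the same alphabet as BASE_62_CHAR.

```java
import java.util.ArrayList;
import java.util.List;

final class Base62EncodeTrace {
    static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    /** Returns one line per division step, least significant digit first. */
    static List<String> trace(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        List<String> steps = new ArrayList<>();
        long n = value;
        do {
            long quotient = n / 62;
            int rem = (int) (n % 62);
            steps.add(n + " / 62 = " + quotient + " remainder " + rem
                    + " -> " + ALPHABET.charAt(rem));
            n = quotient;
        } while (n > 0);
        return steps;
    }

    /** Encodes a non-negative value by collecting remainders and reversing them. */
    static String encode(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        StringBuilder sb = new StringBuilder();
        long n = value;
        do {
            sb.append(ALPHABET.charAt((int) (n % 62)));
            n /= 62;
        } while (n > 0);
        return sb.reverse().toString();
    }
}
```

For 10000 the trace has three steps (remainders 18, 37 and 2, that is s, L and c), and reading them in reverse gives "cLs". The do-while loop also covers 0, which encodes to "a" without a special case.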

Short URL conversion

The main idea is to maintain a global auto-incrementing ID. Each long URL is bound to the next ID, and that ID is then converted to a base62 string, which serves as the short URL. To look up the long URL, decode the short string back to the ID.

    import java.util.HashMap;
    import java.util.Map;

    public class Base62UrlShorter {

        // next ID to hand out (this simple version is not thread-safe)
        private long autoIncrId = 10000;

        // ID -> long URL
        Map<Long, String> longUrlIdMap = new HashMap<Long, String>();

        public long incr() {
            return autoIncrId++;
        }

        public String shorten(String longUrl) {
            long id = incr();
            // add to mapping
            longUrlIdMap.put(id, longUrl);
            return Base62.fromBase10(id);
        }

        public String lookup(String shortUrl) {
            long id = Base62.toBase10(shortUrl);
            // returns null if the short URL is unknown
            return longUrlIdMap.get(id);
        }
    }
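The class above increments a plain long field and uses a HashMap, so it is not safe to call from several threads, and shortening the same long URL twice consumes two IDs. One possible variant, sketched below under the assumption that everything lives in a single JVM, uses AtomicLong and ConcurrentHashMap and reuses the ID of a URL it has already seen. The class name ConcurrentUrlShortener and its private helpers are hypothetical; a real service would more likely rely on a database sequence.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

final class ConcurrentUrlShortener {
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";

    private final AtomicLong nextId = new AtomicLong(10000);
    private final Map<Long, String> idToUrl = new ConcurrentHashMap<>();
    private final Map<String, Long> urlToId = new ConcurrentHashMap<>();

    public String shorten(String longUrl) {
        // computeIfAbsent runs the lambda at most once per key, so a URL gets one ID.
        long id = urlToId.computeIfAbsent(longUrl, url -> {
            long fresh = nextId.getAndIncrement();
            idToUrl.put(fresh, url);
            return fresh;
        });
        return encode(id);
    }

    /** Returns the long URL, or null if the short URL is unknown. */
    public String lookup(String shortUrl) {
        return idToUrl.get(decode(shortUrl));
    }

    private static String encode(long value) {
        StringBuilder sb = new StringBuilder();
        long n = value;
        do {
            sb.append(ALPHABET.charAt((int) (n % 62)));
            n /= 62;
        } while (n > 0);
        return sb.reverse().toString();
    }

    private static long decode(String str) {
        long n = 0;
        for (int i = 0; i < str.length(); i++) {
            int digit = ALPHABET.indexOf(str.charAt(i));
            if (digit < 0) {
                throw new IllegalArgumentException(
                        "not a base62 character: " + str.charAt(i));
            }
            n = Math.addExact(Math.multiplyExact(n, 62L), (long) digit);
        }
        return n;
    }
}
```

The trade-off of the second map is memory: every long URL is stored twice in exchange for stable short URLs.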

Test

    @Test
    public void testLongUrl2Short(){
        Base62UrlShorter shorter= new Base62UrlShorter();
        String longUrl = "https://movie.douban.com/subject/26363254/";
        String shortUrl = shorter.shorten(longUrl);
        System.out.println("short url:"+shortUrl);
        System.out.println(shorter.lookup(shortUrl));
    }

Capacity

The auto-incrementing ID is a Java long, so its maximum value is 2^63 - 1 (Long.MAX_VALUE), which encodes to at most 11 base62 characters.
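The capacity figures can be checked with a few lines of arithmetic. The sketch below is a hypothetical helper (the names Base62Capacity, capacity and digitsNeeded are chosen for illustration) that computes how many distinct strings a given length allows and how many base62 digits a given ID needs. Note that a Java long is signed, so Long.MAX_VALUE is 2^63 - 1.

```java
import java.math.BigInteger;

final class Base62Capacity {
    /** Number of distinct strings of exactly `length` base62 characters: 62^length. */
    static BigInteger capacity(int length) {
        return BigInteger.valueOf(62).pow(length);
    }

    /** Number of base62 digits needed to write a non-negative value. */
    static int digitsNeeded(long value) {
        if (value < 0) {
            throw new IllegalArgumentException("value must be non-negative");
        }
        int digits = 1;
        while (value >= 62) {
            value /= 62;
            digits++;
        }
        return digits;
    }
}
```

For example, capacity(6) is 56,800,235,584, so six characters already cover more than 56 billion IDs, and digitsNeeded(Long.MAX_VALUE) is 11.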
