Node.js: Chinese URL encoding problem

  node.js, question

For example, if you want to download the file at this address:

http://torrent.google.com.btba.xinqiys.com/upload/2016/08/26/[btba] [720p]- Dragon Wolf -6.56GB.torrent

If you request it directly, the server returns a 404.

It must be converted to

http://torrent.google.com.btba.xinqiys.com/upload/2016/08/26/%E3%80%90BT%E5%90%A7%E3%80%91%5B720p%5D-%E9%BE%99%E7%8B%BC%E8%A1%E6%88%98-6.56GB.torrent

Only an address in this form can be read.

This means that an address containing Chinese characters cannot be used as-is; it must be percent-encoded first.

Is there a convenient way to do the conversion?

I used PHP for this before, and it feels quite troublesome in Node.js. A ready-made module would be best.


Addendum: I wrote one myself.

function url_encode(url){
    // Percent-encode everything, including the URL delimiters
    url = encodeURIComponent(url);
    // Restore the delimiters that must stay literal: ":", "/", "?", "=", "&"
    url = url.replace(/%3A/g, ":");
    url = url.replace(/%2F/g, "/");
    url = url.replace(/%3F/g, "?");
    url = url.replace(/%3D/g, "=");
    url = url.replace(/%26/g, "&");

    return url;
}
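
For comparison with the hand-written helper, here is a minimal sketch that relies on the WHATWG `URL` class built into Node.js instead of restoring delimiters by hand. The host `example.com` and the file name are placeholders, not the address from the question. One caveat, as far as I know: `URL` percent-encodes non-ASCII characters and spaces in the path but leaves `[` and `]` untouched, so its output can differ from that of an `encodeURIComponent`-based helper.

```javascript
// Sketch: let the built-in URL parser percent-encode the path.
// The address below is a made-up placeholder.
const raw = 'http://example.com/upload/2016/08/26/龙 狼.torrent';

const parsed = new URL(raw);

// pathname and href are serialized with UTF-8 percent-encoding applied.
console.log(parsed.pathname);
console.log(parsed.href);
```

The resulting `parsed.href` can then be passed to whichever HTTP client is used for the download.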

Here is my earlier PHP solution:

function cnurl($url){
    global $_G;
    if(ischinese($url) != 'encn') return $url;
    $_G['cn_charset'] = $_G['cn_charset'] ? $_G['cn_charset'] : $_G['cache']['evn_milu_pick']['charset'];
    if(!$_G['cn_charset']){
        $content = get_contents($url);
        $_G['cn_charset'] = strtoupper(get_charset($content));
    }
    $url = url_unescape($url);
    $url_info = parse_url($url);
    $url_query = $url_info['query'];
    parse_str($url_query, $url_arr);
    $args_arr = array();
    if($url_arr){
        foreach((array)$url_arr as $k => $v){
            $v = cnurl_format($v);
            $args_arr[] = $k.'='.$v; 
        }
        $args_str = implode('&', $args_arr);
        $url = str_replace($url_query, $args_str, $url);
    }else{
        return cnurl_format($url);
    }
    return $url;
}

function cnurl_format($str){
    global $_G;
    $str = trim($str);
    if(!$str) return;
    $str = url_unescape($str);
    if(ischinese($str) == 'allen') return $str;
    $str = piconv($str, CHARSET, $_G['cn_charset']);
    return preg_replace(array('/%3A/i', '/%2F/i', '/%3F/i', '/%3D/i', '/%26/i'), array(':', '/', '?', '=', '&'), rawurlencode($str));
}

function url_unescape($str) {
    // Decode standard %XX escapes first
    $str = rawurldecode($str);
    // Split into %uXXXX escapes, HTML numeric entities, and plain text runs
    preg_match_all("/(?:%u.{4})|&#x.{4};|&#\d+;|.+/U",$str,$r);
    $ar = $r[0];
    foreach($ar as $k=>$v) {
        if(substr($v,0,2) == "%u"){
            // %uXXXX escape
            $ar[$k] = iconv("UCS-2","GB2312",pack("H4",substr($v,-4)));
        }elseif(substr($v,0,3) == "&#x"){
            // Hexadecimal entity: &#xXXXX;
            $ar[$k] = iconv("UCS-2","UTF-8",pack("H4",substr($v,3,-1)));
        }elseif(substr($v,0,2) == "&#") {
            // Decimal entity: &#NNNN;
            $ar[$k] = iconv("UCS-2","UTF-8",pack("n",substr($v,2,-1)));
        }
    }
    return join("",$ar);
}

I used to think this was complicated in PHP, because if the other website's character encoding differs from yours, your conversion of the Chinese text goes wrong. For example, if you urlencode UTF-8 encoded Chinese and then request that address, it may fail, because the other site may be using GBK.
So if you don't know the other side's encoding, this problem is hard to solve.
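
To make the charset issue concrete, here is a small sketch that percent-encodes raw bytes. `percentEncodeBytes` is a hypothetical helper written for this illustration, and the GBK bytes for 中 (0xD6 0xD0) are hard-coded because Node.js has no built-in GBK encoder; in practice a third-party module such as iconv-lite is commonly used to produce them.

```javascript
// Hypothetical helper: percent-encode every byte of a Buffer.
function percentEncodeBytes(bytes) {
  return Array.from(bytes)
    .map((b) => '%' + b.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

// The same character, 中, as bytes in two different charsets.
const utf8Bytes = Buffer.from('中', 'utf8');  // E4 B8 AD
const gbkBytes = Buffer.from([0xd6, 0xd0]);   // GBK bytes, hard-coded assumption

console.log(percentEncodeBytes(utf8Bytes)); // %E4%B8%AD
console.log(percentEncodeBytes(gbkBytes));  // %D6%D0
```

A server that stores its file names as GBK will not match a request that uses the UTF-8 form, and vice versa, which is why the target site's encoding matters.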

However, I tried it in JS, and it seems less complicated than in PHP. At least there is no need to handle character encodings yourself, since the built-in functions always produce UTF-8 escapes. I'm not sure yet whether that covers every case.

JS has two functions for encoding URIs: encodeURIComponent() and encodeURI(). For the original poster's case, using encodeURI() directly is fine; encodeURIComponent() would also escape the punctuation marks that separate the parts of a URI, such as '/'.
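
A minimal sketch of the `encodeURI()` approach, using a placeholder address on `example.com` rather than the real one from the question:

```javascript
// Placeholder address; not the real one from the question.
const rawUrl = 'http://example.com/upload/[720p]-龙狼.torrent?from=列表';

// encodeURI keeps ":", "/", "?", "=" and "&" and escapes the rest as UTF-8.
const encoded = encodeURI(rawUrl);
console.log(encoded);

// encodeURI is not idempotent: a second pass would also escape the "%" signs.
console.log(encodeURI(encoded) === encoded); // false
```

Because of the second point, it is worth encoding an address only once, at the point where it is known to be unencoded.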

Note that the difference between encodeURIComponent() and encodeURI() is that the former assumes its argument is a single part of a URI (such as the protocol, host name, path, or query string), so it escapes the punctuation marks used to separate those parts.
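
The difference can be sketched as follows. The base address and the `q` parameter are hypothetical, chosen only so that the value contains delimiter characters:

```javascript
const base = 'http://example.com/search';
const keyword = '龙 & 狼/战'; // hypothetical query value containing delimiters

// encodeURIComponent escapes "&" and "/", so the value stays one parameter.
const good = base + '?q=' + encodeURIComponent(keyword);

// encodeURI leaves "&" and "/" alone, so the value is cut off at the "&".
const bad = encodeURI(base + '?q=' + keyword);

// Applied to a whole URL, encodeURIComponent escapes the delimiters too.
const whole = encodeURIComponent(base);

console.log(good);
console.log(bad);
console.log(whole); // http%3A%2F%2Fexample.com%2Fsearch
```

In short: use `encodeURI()` on a complete address, and `encodeURIComponent()` on individual values that are inserted into one.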
http://www.w3school.com.cn/js …