Is there a string method that does what urllib.parse.unquote does?

I have a value extracted from a cookie that looks like this:

http%3A//www.baidu.com/sinopec4/dep4809/swgl_4809.nsf/dbgwview%3Fopenform%26view%3Dvwdbgw%26db%3Dopec4/dep4809/swgl_4809.nsf%26count%3D50

I can decode it with from urllib.parse import unquote. Is there a method on str itself that decodes it directly, without importing this package?
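For reference, the import-based decoding mentioned in the question can be sketched as follows. The variable names `raw` and `decoded` are only illustrative; `unquote` is the standard-library function named above.

```python
from urllib.parse import unquote

# The cookie value from the question, split across two literals for readability.
raw = ('http%3A//www.baidu.com/sinopec4/dep4809/swgl_4809.nsf/dbgwview%3Fopen'
       'form%26view%3Dvwdbgw%26db%3Dopec4/dep4809/swgl_4809.nsf%26count%3D50')

# unquote replaces each %XX escape with the character it encodes
# (%3A -> ':', %3F -> '?', %26 -> '&', %3D -> '=').
decoded = unquote(raw)
print(decoded)
```
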

joeyun

Jason990420

Best answer

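No, str has no method that decodes this directly. The only way around it is to copy the unquote function out of urllib/parse.py and use it yourself; then you no longer need to import urllib (the copy still relies on re):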

import re

def unquote_to_bytes(string):
    """unquote_to_bytes('abc%20def') -> b'abc def'."""
    # Note: strings are encoded as UTF-8. This is only an issue if it contains
    # unescaped non-ASCII characters, which URIs should not.
    if not string:
        # Is it a string-like object?
        string.split
        return b''
    if isinstance(string, str):
        string = string.encode('utf-8')
    bits = string.split(b'%')
    if len(bits) == 1:
        return string
    res = [bits[0]]
    append = res.append
    # Delay the initialization of the table to not waste memory
    # if the function is never called
    global _hextobyte
    if _hextobyte is None:
        _hextobyte = {(a + b).encode(): bytes.fromhex(a + b)
                      for a in _hexdig for b in _hexdig}
    for item in bits[1:]:
        try:
            append(_hextobyte[item[:2]])
            append(item[2:])
        except KeyError:
            append(b'%')
            append(item)
    return b''.join(res)

def unquote(string, encoding='utf-8', errors='replace'):
    """Replace %xx escapes by their single-character equivalent. The optional
    encoding and errors parameters specify how to decode percent-encoded
    sequences into Unicode characters, as accepted by the bytes.decode()
    method.
    By default, percent-encoded sequences are decoded with UTF-8, and invalid
    sequences are replaced by a placeholder character.

    unquote('abc%20def') -> 'abc def'.
    """
    if '%' not in string:
        string.split
        return string
    if encoding is None:
        encoding = 'utf-8'
    if errors is None:
        errors = 'replace'
    bits = _asciire.split(string)
    res = [bits[0]]
    append = res.append
    for i in range(1, len(bits), 2):
        append(unquote_to_bytes(bits[i]).decode(encoding, errors))
        append(bits[i + 1])
    return ''.join(res)

_hextobyte = None
_hexdig = '0123456789ABCDEFabcdef'
_asciire = re.compile('([\x00-\x7f]+)')

text = ('http%3A//www.baidu.com/sinopec4/dep4809/swgl_4809.nsf/dbgwview%3Fopen'
        'form%26view%3Dvwdbgw%26db%3Dopec4/dep4809/swgl_4809.nsf%26count%3D50')

print(unquote(text))
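If the goal is simply to avoid every import, a shorter variant is possible using only built-in str and bytes methods. The sketch below is not taken from urllib; the function name `percent_decode` and its parameters are hypothetical, and it assumes the input is an ordinary str whose escapes encode UTF-8 bytes by default.

```python
def percent_decode(text, encoding="utf-8", errors="replace"):
    """Decode %XX escapes with built-ins only (hypothetical helper, no imports)."""
    hexdigits = "0123456789abcdefABCDEF"
    # Everything before the first '%' is literal text.
    head, *parts = text.split("%")
    out = bytearray(head.encode(encoding))
    for part in parts:
        code = part[:2]
        if len(code) == 2 and code[0] in hexdigits and code[1] in hexdigits:
            # Valid escape: append the decoded byte, then the rest as literal text.
            out.append(int(code, 16))
            out += part[2:].encode(encoding)
        else:
            # Not a valid escape: keep the '%' and the following text unchanged.
            out += b"%" + part.encode(encoding)
    # Decode once at the end so multi-byte sequences such as %E4%B8%AD join up.
    return out.decode(encoding, errors)


print(percent_decode("http%3A//www.baidu.com/dbgwview%3Fopenform%26count%3D50"))
```

Collecting raw bytes and decoding once at the end is what lets multi-byte characters come out intact. For anything beyond a quick script, the standard-library unquote remains the safer choice.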

Commented 5 years ago

joeyun (original poster)

Then I'll just import the package after all.
