⚝
One Hat Cyber Team
⚝
Your IP:
216.73.216.118
Server IP:
84.32.84.162
Server:
Linux sg-nme-web1518.main-hosting.eu 5.14.0-611.16.1.el9_7.x86_64 #1 SMP PREEMPT_DYNAMIC Mon Dec 22 03:40:39 EST 2025 x86_64
Server Software:
LiteSpeed
PHP Version:
8.3.28
Buat File
|
Buat Folder
Eksekusi
Dir :
~
/
opt
/
gsutil
/
third_party
/
chardet
/
docs
/
View File Name :
usage.rst
Usage ===== Basic usage ----------- The easiest way to use the Universal Encoding Detector library is with the ``detect`` function. Example: Using the ``detect`` function -------------------------------------- The ``detect`` function takes one argument, a non-Unicode string. It returns a dictionary containing the auto-detected character encoding and a confidence level from ``0`` to ``1``. .. code:: python >>> import urllib.request >>> rawdata = urllib.request.urlopen('http://yahoo.co.jp/').read() >>> import chardet >>> chardet.detect(rawdata) {'encoding': 'EUC-JP', 'confidence': 0.99} Advanced usage -------------- If you’re dealing with a large amount of text, you can call the Universal Encoding Detector library incrementally, and it will stop as soon as it is confident enough to report its results. Create a ``UniversalDetector`` object, then call its ``feed`` method repeatedly with each block of text. If the detector reaches a minimum threshold of confidence, it will set ``detector.done`` to ``True``. Once you’ve exhausted the source text, call ``detector.close()``, which will do some final calculations in case the detector didn’t hit its minimum confidence threshold earlier. Then ``detector.result`` will be a dictionary containing the auto-detected character encoding and confidence level (the same as the ``chardet.detect`` function `returns <usage.html#example-using-the-detect-function>`__). Example: Detecting encoding incrementally ----------------------------------------- .. code:: python import urllib.request from chardet.universaldetector import UniversalDetector usock = urllib.request.urlopen('http://yahoo.co.jp/') detector = UniversalDetector() for line in usock.readlines(): detector.feed(line) if detector.done: break detector.close() usock.close() print(detector.result) .. code:: python {'encoding': 'EUC-JP', 'confidence': 0.99} If you want to detect the encoding of multiple texts (such as separate files), you can re-use a single ``UniversalDetector`` object. Just call ``detector.reset()`` at the start of each file, call ``detector.feed`` as many times as you like, and then call ``detector.close()`` and check the ``detector.result`` dictionary for the file’s results. Example: Detecting encodings of multiple files ---------------------------------------------- .. code:: python import glob from chardet.universaldetector import UniversalDetector detector = UniversalDetector() for filename in glob.glob('*.xml'): print(filename.ljust(60), end='') detector.reset() for line in open(filename, 'rb'): detector.feed(line) if detector.done: break detector.close() print(detector.result)