UniversalDetector.reset() does not reset the detector #71

@laurielounge

OS/Arch

$ python -c 'import platform;print(platform.uname())'

uname_result(system='Linux', node='testserver.mimeanalytics.com', release='4.18.0-240.10.1.el8_3.x86_64', version='#1 SMP Mon Jan 18 17:05:51 UTC 2021', machine='x86_64', processor='x86_64')

Python version

$ python --version

Python 3.6.8

cChardet version

$ python -c 'import cchardet;print(cchardet.__version__)'

2.1.7

What is the problem?

After creating a detector with ud = cchardet.UniversalDetector() and detecting the encoding of one file, calling ud.reset() does not reset ud.done or ud.result.

Expected behavior

ud.done == False
ud.result == None

Actual behavior

ud.done == True
ud.result == (the last result)

Steps to reproduce the behavior

#!/usr/bin/env python3
import cchardet

files = ['file1', 'file2', 'file3']
ud = cchardet.UniversalDetector()
for file in files:
    ud.reset()
    print(f'Before: {ud.done}, {ud.result}')
    with open(file, 'rb') as ifh:
        for line in ifh:  # iterate lazily; same lines as readlines()
            ud.feed(line)
            if ud.done:
                break
    ud.close()
    print(f'After: {ud.done}, {ud.result}')
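Until reset() behaves as expected, one possible workaround is to build a fresh detector for every file instead of reusing one. The sketch below is not from the cChardet project: detect_file, detector_factory and chunk_size are hypothetical names, and the only assumption about the detector is that it exposes the feed(), close(), done and result members used in the script above.

```python
def detect_file(path, detector_factory, chunk_size=4096):
    """Detect the encoding of one file using a brand-new detector.

    detector_factory is any zero-argument callable returning an object
    with feed(), close(), done and result (hypothetical parameter; the
    intent is to pass cchardet.UniversalDetector here).
    """
    # A new instance per file sidesteps any state left over by reset().
    detector = detector_factory()
    with open(path, 'rb') as fh:
        # Read fixed-size binary chunks until EOF (read() returns b'').
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            detector.feed(chunk)
            if detector.done:
                break
    detector.close()
    return detector.result
```

With cChardet installed, this could be called as detect_file('file1', cchardet.UniversalDetector) inside the loop over files. It trades a small per-file allocation for not depending on reset() at all.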
