Configuration options to enable / disable all validation checks #95

@vers-one

Description

Quite a few EPUB books don't conform to the EPUB specification and therefore fail the parsing validation checks in EpubReader. Sometimes it is desirable to turn off some of those checks and ignore the parts of the book that couldn't be parsed.

Proposed solution

Go through all EPUB parsing validation checks and create configuration options (one per check) to turn them on or off individually. For example, the value of the package/manifest/item/id attribute must be unique (otherwise it is impossible to determine which manifest item is referenced in the spine). A new PackageReaderOptions.SkipManifestItemsWithDuplicateIds configuration property will tell PackageReader whether to skip duplicate manifest items or throw an exception.
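To make the intended behavior of such an option concrete, here is a minimal, language-neutral sketch in Python. It is not EpubReader code: the `read_manifest` function, the `DuplicateManifestItemError` type, and the choice to keep the first item with a given id are assumptions made for illustration. Only the option name mirrors the proposed SkipManifestItemsWithDuplicateIds property.

```python
from dataclasses import dataclass


@dataclass
class PackageReaderOptions:
    # Mirrors the proposed SkipManifestItemsWithDuplicateIds property.
    # False means the uniqueness check is enforced (strict behavior).
    skip_manifest_items_with_duplicate_ids: bool = False


class DuplicateManifestItemError(Exception):
    """Hypothetical error type used only in this sketch."""


def read_manifest(items, options):
    """Build an id -> href mapping from (id, href) pairs.

    Hypothetical helper: on a duplicate id it either skips the
    duplicate (keeping the first item) or raises, depending on options.
    """
    manifest = {}
    for item_id, href in items:
        if item_id in manifest:
            if options.skip_manifest_items_with_duplicate_ids:
                continue  # ignore the duplicate, keep the first item
            raise DuplicateManifestItemError(
                f"Duplicate manifest item id: {item_id}"
            )
        manifest[item_id] = href
    return manifest
```

With the option left at its default, a duplicate id raises; with it set to true, the later item is dropped and parsing continues.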

Additionally, create three configuration presets:

  • STRICT: all validation checks are enabled;
  • RELAXED: ignore the errors that are most common in real-world EPUB books;
  • IGNORE_ALL_ERRORS: turn off all validation checks and try to salvage as much data as possible.

STRICT needs to be the default option to preserve backward compatibility.
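One possible shape for the presets, sketched in Python rather than in EpubReader's own API. Everything here is hypothetical: the `ReaderOptions` class, the `from_preset` factory, the `skip_invalid_spine_items` flag, and in particular the choice of which checks RELAXED turns off, since the issue does not list them.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Preset(Enum):
    STRICT = "strict"
    RELAXED = "relaxed"
    IGNORE_ALL_ERRORS = "ignore_all_errors"


@dataclass
class ReaderOptions:
    # One boolean per validation check; False means the check is enforced.
    skip_manifest_items_with_duplicate_ids: bool = False
    skip_invalid_spine_items: bool = False  # hypothetical second check

    @classmethod
    def from_preset(cls, preset=Preset.STRICT):
        options = cls()  # all checks enabled, i.e. STRICT
        if preset is Preset.RELAXED:
            # Assumption: duplicate ids count as a "common" error.
            options.skip_manifest_items_with_duplicate_ids = True
        elif preset is Preset.IGNORE_ALL_ERRORS:
            # Turn off every check by setting every skip flag.
            for field in fields(options):
                setattr(options, field.name, True)
        return options
```

Because every flag defaults to the enforcing value, constructing the options without arguments is equivalent to STRICT, so existing callers would see no change in behavior.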

Additional context

Some of those options are already implemented and documented here: https://os.vers.one/EpubReader/malformed-epub/
