Description
Many real-world EPUB books don't conform to the EPUB specification and therefore fail the parsing validations in EpubReader. It is sometimes desirable to turn off some of those validations and ignore the parts of the book that can't be parsed.
Proposed solution
Go through all EPUB parsing validation checks and create one configuration option per check so that each can be turned on or off individually. For example, the value of the package/manifest/item/id attribute must be unique (otherwise it is impossible to determine which manifest item the spine references). A new PackageReaderOptions.SkipManifestItemsWithDuplicateIds configuration property will tell PackageReader whether to skip duplicate manifest items or throw an exception.
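The skip-or-throw behavior described above can be modeled roughly as follows. EpubReader is a .NET library, so this Python is only a language-neutral sketch of the logic, not the library's code: `read_manifest`, `EpubPackageError`, and the snake_case option name are hypothetical stand-ins, and keeping the first of two items that share an ID is an assumption, since the proposal does not say which duplicate survives.

```python
from dataclasses import dataclass


@dataclass
class PackageReaderOptions:
    # Hypothetical mirror of the proposed SkipManifestItemsWithDuplicateIds
    # property. False keeps the current strict behavior.
    skip_manifest_items_with_duplicate_ids: bool = False


class EpubPackageError(Exception):
    """Hypothetical error raised when a validation check fails."""


def read_manifest(items, options=None):
    """Build an id -> href mapping from (id, href) pairs.

    A duplicate ID raises EpubPackageError unless the option is set,
    in which case the later item is skipped (assumption: first one wins).
    """
    options = options or PackageReaderOptions()
    manifest = {}
    for item_id, href in items:
        if item_id in manifest:
            if options.skip_manifest_items_with_duplicate_ids:
                continue  # drop the duplicate, keep the first occurrence
            raise EpubPackageError(f"Duplicate manifest item ID: {item_id}")
        manifest[item_id] = href
    return manifest
```

With the option left at its default, a book with two manifest items sharing an ID fails to parse, as it does today. With the option set, the spine can still be resolved against the items that were kept.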
Additionally, create three configuration presets:
STRICT: all validation checks are enabled;
RELAXED: ignore the errors that are most common in real-world EPUB books;
IGNORE_ALL_ERRORS: turn off all validation checks and try to salvage as much data as possible.
STRICT needs to be the default option to preserve backward compatibility.
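One way the three presets could map onto per-check flags is sketched below, again in Python as a language-neutral model rather than the library's .NET API. Everything here is an assumption made for illustration: the `ReaderOptions` type, the `options_for` helper, the two extra flag names, and in particular which flags RELAXED enables, since the proposal does not list them.

```python
from dataclasses import dataclass, fields
from enum import Enum


class Preset(Enum):
    STRICT = "strict"
    RELAXED = "relaxed"
    IGNORE_ALL_ERRORS = "ignore_all_errors"


@dataclass
class ReaderOptions:
    # One flag per validation check; True turns the check off.
    # Only the first flag comes from the proposal; the others are hypothetical.
    skip_manifest_items_with_duplicate_ids: bool = False
    ignore_missing_toc: bool = False
    skip_invalid_spine_items: bool = False


# Assumption: the set of "most common" errors that RELAXED tolerates.
RELAXED_FLAGS = {"ignore_missing_toc"}


def options_for(preset=Preset.STRICT):
    """Return options for a preset; STRICT is the default."""
    options = ReaderOptions()  # every check enabled
    if preset is Preset.STRICT:
        return options
    for field in fields(options):
        if preset is Preset.IGNORE_ALL_ERRORS or field.name in RELAXED_FLAGS:
            setattr(options, field.name, True)
    return options
```

Because IGNORE_ALL_ERRORS iterates over the option fields instead of naming them, a check added later is covered by that preset automatically, while STRICT stays unchanged for existing callers.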
Additional context
Some of those options are already implemented and documented here: https://os.vers.one/EpubReader/malformed-epub/