Add support for older Encore formats, fix duration/tuplet/tie handling #2
Open
manolo wants to merge 31 commits into
Qt6 requires explicit QChar() construction from integer types; add QIODevice include for data.device()->pos() calls.
EncMeasureElemOrnament::read() was hardcoded to read 28 bytes of ornament-specific data, but ornaments with m_size <= 28 have fewer bytes available. The overflow corrupted the stream, causing cascading parse failures (missing notes, missing measures, crashes on elemType 15). Rewrite the ornament read to consume exactly m_size-5 bytes using a size-aware loop that skips unavailable fields.
Unknown element types (>= 12) called exit(1); replace with EncMeasureElemUnknown so parsing continues.
Update adornoj.ref.txt: the old ref had stale values derived from the buggy parser reading past ornament boundaries into adjacent MEAS bytes.
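The size-aware read can be sketched roughly as below. This is not the actual EncMeasureElemOrnament::read(): the buffer-based helper, its name, and the one-byte fields are assumptions for illustration; only the m_size-5 payload length comes from the commit message.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: read up to maxFields one-byte fields from an element's
// payload without crossing the element boundary.
// payloadStart is the index of the first payload byte; elemSize plays the
// role of m_size, so the payload is elemSize - 5 bytes long.
// Returns the index of the first byte after the element's payload.
std::size_t readBoundedPayload(const std::vector<std::uint8_t>& buf,
                               std::size_t payloadStart,
                               std::size_t elemSize,
                               std::size_t maxFields,
                               std::vector<std::uint8_t>& fields)
{
    fields.clear();
    const std::size_t payloadLen = elemSize > 5 ? elemSize - 5 : 0;
    // Clamp to the buffer so a corrupt size cannot overrun it.
    const std::size_t start = std::min(payloadStart, buf.size());
    const std::size_t end = std::min(start + payloadLen, buf.size());
    std::size_t pos = start;
    // Read only the fields that are actually present; missing ones are skipped.
    while (fields.size() < maxFields && pos < end)
        fields.push_back(buf[pos++]);
    // Returning the payload end (not pos) consumes any unread payload bytes,
    // so the next element starts at the right position.
    return end;
}
```

With a 28-field layout and an element whose payload is only 3 bytes, the helper fills 3 fields and still returns the correct position for the next element.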
Extends Enc2MusicXML to handle legacy Encore files from older versions:
- v0xC2 format (version 773/775): 22-byte NOTE elements with pitch stored in a different field position
- v0xA6 format (version 166/592): 10-byte NOTE elements with doubled element spacing (size*2) and different measure header offset (0x3E)
Additional fixes:
- Bounds checking in addSpannerEnds() to prevent crashes with corrupted measure references
- Protection for empty m_lines vector in countStaves()
- Fixed grace note detection (removed incorrect bit 0x40 check)
- Fixed chord detection (notes with same tick are chords regardless of x_offset visual positioning)
Test results: 993 of 1067 files (93.1%) now convert successfully with 1.4M+ notes extracted. Remaining 74 empty files are ZBOT encrypted or template files without musical content.
- Calculate real note/rest durations from tick differences instead of relying on faceValue, which is incorrect in v0xC2 format files
- Scale durations proportionally when internal m_durTicks doesn't match the actual time signature (e.g., files using 720 ticks for 3/4 time)
- Detect and output time signature changes in MusicXML
- Group elements by staff/voice for accurate duration calculation
This improves measure duration accuracy from ~50% to 96.4% across the test corpus of 686 files (305918 measures).
- Remove incorrect scale factor in calculateRealDurations()
- Encore files use the same tick scale as MusicXML (240 ticks = quarter note)
- Enable forward generation for notes/rests that don't start at tick 0
- Improves validation success rate from 96.4% to 99.4%
The xoffset field in ties represents visual placement, not note matching. Ties are now correctly detected by matching tick, voice, and staff only.
Old format files (v0xC2) don't properly set m_dotControl. Now we calculate the number of dots by comparing the real duration with the base note duration. A dotted note has duration = base * 1.5, double dotted = base * 1.75, etc.
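A minimal sketch of the ratio comparison, assuming integer tick durations. The function name and signature are hypothetical and not taken from the PR; 15/8 for three dots is the standard continuation of the 3/2 and 7/4 series.

```cpp
// Hypothetical helper: infer the number of augmentation dots by comparing the
// measured duration with the undotted base duration (both in ticks).
// Returns 0 when no dotted ratio matches exactly.
struct DotRatio { int num; int den; };

int dotsFromDuration(int realDuration, int baseDuration)
{
    if (realDuration <= 0 || baseDuration <= 0)
        return 0;
    // Ratios for 1, 2 and 3 dots: 3/2, 7/4, 15/8.
    static constexpr DotRatio ratios[] = {{3, 2}, {7, 4}, {15, 8}};
    int dots = 0;
    for (const DotRatio& r : ratios) {
        ++dots;
        // Cross-multiply to stay in integer arithmetic.
        if (realDuration * r.den == baseDuration * r.num)
            return dots;
    }
    return 0;
}
```

For a quarter note of 240 ticks, a measured duration of 360 gives one dot and 420 gives two.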
Old format files (v0xC2) don't store tuplet info in m_tuplet field. Now we detect triplets (3:2) by checking if duration = base * 2/3. Also supports quintuplets (5:4).
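The duration test can be sketched as follows. The struct and function names are hypothetical (the PR's own detectTuplet signature is not shown here), and the sketch only checks exact integer matches.

```cpp
// Hypothetical result type: actual notes played in the time of normal notes.
struct TupletRatio { int actual; int normal; };

// Returns {3, 2} for a triplet (duration == base * 2/3),
// {5, 4} for a quintuplet (duration == base * 4/5),
// and {1, 1} when the duration matches neither.
TupletRatio detectTupletRatio(int realDuration, int baseDuration)
{
    if (realDuration > 0 && baseDuration > 0) {
        if (realDuration * 3 == baseDuration * 2)
            return {3, 2};
        if (realDuration * 5 == baseDuration * 4)
            return {5, 4};
    }
    return {1, 1};
}
```

A 160-tick note with a quarter-note face value (240 ticks) is reported as 3:2, and a 48-tick note with a 16th face value (60 ticks) as 5:4.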
- Add triplet duration mappings to correctNoteType() (20, 40, 80, 160, 320, 640)
- Fix TupletHandler to use duration-based grouping instead of count-based
- Calculate noteDur/restDur once at function start for consistency
- Pass calculated tick to note()/rest() for proper tuplet grouping
- Handle chord notes correctly in tuplet brackets (skip for lookahead)
- Add forceCloseTuplet logic for measure-end tuplets
- Extract common duration calculation into calculateDuration()
- Add nstaves() helper method to avoid repeated pattern
- Simplify durationNote() and durationRest() to use common function
- Add isTablature() method to detect tablature staves (staffType::TAB or clefType::TAB)
- Skip tablature parts in partList() and parts()
- Fix writeTimeModification to only write real tuplets (actual > 1 and actual != normal)
Live-recorded chords have individual notes offset by 1-3 ticks from each other. Without the CHORD_CLUSTER_THRESHOLD (4 ticks) skip, the first note in such a cluster gets realDuration=3 instead of the full note value. Add CHORD_CLUSTER_THRESHOLD constant and use it in calculateRealDurations() after skipping same-tick elements.
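A rough sketch of the threshold skip, not the actual calculateRealDurations(). It assumes the ticks of one staff/voice are already sorted, and that "skip" means ignoring later elements closer than CHORD_CLUSTER_THRESHOLD ticks to the current one; the function name and the measureEnd parameter are hypothetical.

```cpp
#include <cstddef>
#include <vector>

constexpr int CHORD_CLUSTER_THRESHOLD = 4; // value quoted in the commit message

// Hypothetical helper: duration of the element at `index`, measured to the
// next element that is not part of the same live-recorded cluster.
// Falls back to the distance to measureEnd when no such element exists.
int realDurationAt(const std::vector<int>& ticks, std::size_t index, int measureEnd)
{
    if (index >= ticks.size())
        return 0;
    const int start = ticks[index];
    for (std::size_t j = index + 1; j < ticks.size(); ++j) {
        // Elements within the threshold belong to the same cluster: skip them.
        if (ticks[j] - start >= CHORD_CLUSTER_THRESHOLD)
            return ticks[j] - start;
    }
    return measureEnd - start;
}
```

For ticks 0, 1, 3, 240 the first note gets 240 ticks instead of the 1-tick distance to its immediate neighbour.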
Encore stores a show/hide flag for each staff at byte offset +19 in the 30-byte EncLineStaffData entry (3rd byte of the skip block after pageIdx). 0x00 = hidden from printed score; any other value = visible. Add m_showStaff to EncLineStaffData and EncInstrument. New function propagateStaffVisibility() copies the flag from the first LINE block into the corresponding EncInstrument entry.
Three related fixes for v0xC4 files saved by Encore 5.0.2:
1. Probe-based encoding detection in EncInstrument::read(): when the
offset field is <= 250 (would be ONE_BYTE) but the first two bytes
look like UTF-16 LE (b0 printable ASCII, b1 == 0x00), upgrade to
TWO_BYTES. Fixes instrument names truncated to first letter ("B").
2. Pad the instruments vector to header.instrumentCount after reading
TK blocks, so parts without a TK block header are still created.
3. For each padded (name-empty) entry, seek to NAME_BASE + n*NAME_STEP
(202 + n*2158) and read the name if UTF-16 LE is detected there.
Encore 5.0.2 writes name content at this position even without the
TK block header. Fixes "Guitarra" being absent from the part list.
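The probe and the name offset can be sketched as below. The function names are hypothetical, and treating "printable ASCII" as the range 0x20 to 0x7E is an assumption; the constants 202 and 2158 come from the commit message.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t NAME_BASE = 202;  // from the commit message
constexpr std::size_t NAME_STEP = 2158; // from the commit message

// Heuristic probe: a printable ASCII byte followed by 0x00 looks like the
// first UTF-16 LE code unit of a Latin-script name.
bool looksLikeUtf16LE(std::uint8_t b0, std::uint8_t b1)
{
    return b0 >= 0x20 && b0 <= 0x7E && b1 == 0x00;
}

// File offset of the name for the n-th instrument (0-based).
std::size_t nameOffset(std::size_t n)
{
    return NAME_BASE + n * NAME_STEP;
}
```

The bytes 'B', 0x00 pass the probe, while 'B', 'a' (a one-byte string) do not.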
…size MIDI program (v0xC4 only): each instrument's 1-indexed GM program is stored at PRG_BASE + n*PRG_STEP (2278 + n*2158) as 8 identical bytes. Read and store in EncInstrument::m_midiProgram for use in MusicXML output. TITL encoding: override the TK00-derived charSize using the block's own varsize (>= 10000 → TWO_BYTES). Encore 5.0.2 writes TITL in UTF-16 LE even when the instrument block's offset field suggests ONE_BYTE, which caused title text to be read as "C" (first byte only) instead of "Canon".
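A sketch of the program lookup, assuming the whole file is available as a byte buffer. The function name is hypothetical, and rejecting runs whose 8 bytes differ is an added sanity check, not something the commit message states.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t PRG_BASE = 2278; // from the commit message
constexpr std::size_t PRG_STEP = 2158; // from the commit message

// Hypothetical helper: read the 1-indexed GM program of the n-th instrument.
// Returns 0 (no program) when the offset is out of range or the 8 bytes are
// not identical.
int readMidiProgram(const std::vector<std::uint8_t>& file, std::size_t n)
{
    const std::size_t offset = PRG_BASE + n * PRG_STEP;
    if (offset + 8 > file.size())
        return 0;
    const std::uint8_t first = file[offset];
    for (std::size_t i = 1; i < 8; ++i) {
        if (file[offset + i] != first)
            return 0;
    }
    return first;
}
```

Returning 0 for "no program" lines up with the default used when no MIDI program is written to the output.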
- Add isHidden() helper: returns true when EncInstrument::m_showStaff is false. Update partList() and parts() to skip both tablature and hidden staves, so they are absent from the MusicXML output.
- writeScorePart() now accepts an optional midiProgram (default 0). When non-zero, a <score-instrument> / <midi-instrument> block is emitted with the 1-indexed GM program number.
Test: pachbel-hiden.enc has one hidden staff; its MusicXML output contains 4 parts instead of 5 (the hidden second staff is not written).
EncMeasureElemTie::read() now reads the direction byte at +5 from
elemStart. 0xfe = outgoing tie (TIE-START); anything else = arc-only
visual marker that does not initiate a tie. The m_isTieStart flag is
stored on the element.
NoteConnector::tieStart() is updated to:
- skip arc-only TIE elements (m_isTieStart == false)
- tolerate up to CHORD_CLUSTER_THRESHOLD ticks of timing drift between
the TIE element and the note (live-recorded chords offset by 1-3 ticks)
Update pachbel-hiden ref files to reflect improved tie output.
The seek to measStart+0x1A placed the jump-sign byte at bit position 0 of m_coda, but repeat() extracts it from bit position 8 via (m_coda >> 8) & 0xFF. Seeking one byte earlier (measStart+0x19) reads the full 4-byte field with the sign byte correctly at byte index 1. Update apoghiaturo, kordorkestro, and opoj XML refs to include the score-instrument and midi-instrument elements added by the MIDI program output feature.
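The off-by-one can be illustrated with a small buffer. This sketch assumes the 4-byte field is read as little-endian, which is what the description implies; the helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Read a 4-byte little-endian value; returns 0 if it would overrun the buffer.
std::uint32_t readLE32(const std::vector<std::uint8_t>& buf, std::size_t offset)
{
    if (offset + 4 > buf.size())
        return 0;
    return static_cast<std::uint32_t>(buf[offset])
         | (static_cast<std::uint32_t>(buf[offset + 1]) << 8)
         | (static_cast<std::uint32_t>(buf[offset + 2]) << 16)
         | (static_cast<std::uint32_t>(buf[offset + 3]) << 24);
}

// Same extraction that repeat() is described as using: bits 8..15.
std::uint8_t jumpSignByte(std::uint32_t coda)
{
    return static_cast<std::uint8_t>((coda >> 8) & 0xFF);
}
```

If the sign byte sits at offset 0x1A, reading from 0x1A puts it in bits 0..7 and the extraction sees the following byte instead; reading from 0x19 puts it in bits 8..15, where the extraction expects it.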
Encore records live-performed chords and ties as individual MIDI events that are very short (1-14 ticks). These notes are hidden in the Encore display but appear in the binary file. Two filtering passes are applied during the first-pass collection of voice elements:
1. Short-note filter: notes with realDuration > 0 && < 15 are skipped unless they are tie-starts or chord extensions (within CHORD_MIDI_THRESHOLD = 8 ticks of the previous note in the same voice). For small face values (64th/128th, fvBase <= 15) the filter applies; for longer face values it filters only when realDuration > CHORD_CLUSTER_THRESHOLD to preserve live-recorded chord roots.
2. Cascade filter: when a filtered note has grace1 & 0x0F == 1 (tie-sender), its pitch is recorded. A subsequent note with grace1 & 0x0F == 2 (tie-receiver) and the same staff/voice/pitch is also filtered so that Encore's dotted-note continuation artifacts are suppressed consistently.
Add two test files from the MuseScore test suite:
- midi_artifact_filter: Q rest, filtered 64th C4 (rdur=11), Q E4 placed
- midi_artifact_cascade: filtered 64th C4 (g1low=1) cascades to Q C4 (g1low=2)
durationNote() and durationRest() previously returned m_realDuration unconditionally, which for MIDI-recorded files is the raw tick difference between consecutive notes and can greatly exceed the measure length. Use realDuration only when it is a recognized standard value (whole, half, quarter, dotted variants, triplets). For non-standard values (e.g. 2564 ticks in a 1920-tick measure), fall back to calculateDuration() based on face value, dot control, and tuplet ratio. This matches how the MuseScore Encore importer advances its cursor: by written duration, not MIDI timing. Update XML refs where the old code produced non-standard durations (675, 709) that are now correctly replaced by face-value durations.
The Encore dotControl byte is not a reliable dot count. Using it in calculateDuration() with an iterative *3/2 formula produced non-standard values (e.g. 135 for a 16th with dotControl=2, which should be 105 for a double-dotted 16th or 60 for a plain 16th). These values caused MuseScore to report "<duration> not equal to specified duration" errors. durationNote() now uses realDuration only when it encodes a valid written duration: a standard note length (plain/dotted/triplet) or a value that matches the face-value note under a recognised tuplet ratio (e.g. 48 = 16th x 4/5 quintuplet). All other realDuration values fall back to the plain face-value duration, preventing overflow and type/duration mismatches. isStandardDuration() updated to include correct double-dotted values (base x 7/4: 840/420/210/105) and triple-dotted values (base x 15/8: 900/450/225), replacing the previously incorrect list.
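A sketch of a standard-duration table built from ratios rather than a hand-written list. It is not the PR's isStandardDuration(): the function names, the choice of bases (whole note down to 64th at 240 ticks per quarter), and the inclusion of triplet values in the same set are assumptions.

```cpp
#include <set>

// Build the set of durations that correspond to a plain, dotted (3/2),
// double-dotted (7/4), triple-dotted (15/8) or triplet (2/3) note value.
// Values that are not whole ticks are left out.
const std::set<int>& standardDurations()
{
    static const std::set<int> values = [] {
        std::set<int> s;
        const int bases[] = {960, 480, 240, 120, 60, 30, 15};
        const int ratios[][2] = {{1, 1}, {3, 2}, {7, 4}, {15, 8}, {2, 3}};
        for (int base : bases) {
            for (const auto& r : ratios) {
                if ((base * r[0]) % r[1] == 0)
                    s.insert(base * r[0] / r[1]);
            }
        }
        return s;
    }();
    return values;
}

// Hypothetical check, named differently from the PR's own function.
bool isStandardDurationSketch(int ticks)
{
    return standardDurations().count(ticks) > 0;
}
```

With this table 105 (double-dotted 16th) is accepted and 135 is rejected; quintuplet values such as 48 are not in the set and would need the separate tuplet-ratio check described above.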
notesAreInChord() used exact tick equality, so live-recorded chords (where simultaneous notes land at ticks 0,1,2,... due to MIDI latency) were treated as separate melody notes. Each such note advanced the measure tick counter by its full written duration, causing massive measure overflow (e.g. Found: 2750/1920 in a 4/4 measure). Replace the chord-detection scheme in both passes with isChordOf(), which considers a note a chord extension when its tick falls within CHORD_MIDI_THRESHOLD (= 8) of the current chord root. The chord root is tracked in chordRootTick / chordRootTick2 (updated only for non-chord notes) to prevent cascading: notes at ticks 0, 1, 4, 9 form one chord (root=0, all within 8 ticks), while a note at tick=240 starts a new chord. The same threshold is used in the first pass (tick accumulation and artifact filter) and the second pass (chord element output and the tuplet lookahead), so the two passes stay consistent.
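The root-tracking idea can be sketched as below. This is not the PR's isChordOf(): the function name is hypothetical, the input is a plain list of sorted ticks for one voice, and "within the threshold" is taken to mean a difference of at most CHORD_MIDI_THRESHOLD ticks.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

constexpr int CHORD_MIDI_THRESHOLD = 8; // value quoted in the commit message

// For each tick, report whether the note is a chord extension of the current
// chord root. The root only changes when a note is NOT an extension, so a
// chain of closely spaced notes cannot drift away from the root.
std::vector<bool> markChordExtensions(const std::vector<int>& ticks)
{
    std::vector<bool> isExtension(ticks.size(), false);
    bool haveRoot = false;
    int rootTick = 0;
    for (std::size_t i = 0; i < ticks.size(); ++i) {
        if (haveRoot && std::abs(ticks[i] - rootTick) <= CHORD_MIDI_THRESHOLD) {
            isExtension[i] = true;
        } else {
            rootTick = ticks[i];
            haveRoot = true;
        }
    }
    return isExtension;
}
```

Ticks 0, 1, 4, 7 form one chord and 240 starts a new one. With ticks 0, 6, 12 the third note is a new root: it is 6 ticks from its neighbour but 12 from the root, which is the cascading case the root tracking is meant to prevent.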
The MuseScore Encore importer never uses raw Encore ticks (e->tick) for note placement — it advances a cumulative written-duration counter (cumTick) and ignores MIDI timing noise. Enc2MusicXML was doing the opposite: generating <forward>/<backup> elements for every mismatch between elem->m_tick and the accumulated tick, even when the difference was just 1-7 ticks of MIDI latency jitter. These tiny forwards accumulated across many notes to produce measure overflows. Only generate forward/backup when the gap exceeds CHORD_MIDI_THRESHOLD (8 ticks), which reliably separates real musical gaps (e.g. a 240-tick rest gap) from MIDI timing noise.
First pass: replace tick >= m_durTicks with tick + noteDur > m_durTicks so a single note whose written duration would push past the bar line is skipped, not just notes that start at or beyond it. This prevents live-recorded MIDI files from having notes that overflow the measure. Second pass: after each voice, write a <forward> to fill any remaining gap up to m_durTicks. This handles MIDI artifact notes that were filtered out, leaving a small hole at the end of the measure, and non-standard time signatures (e.g. 129/128) whose tick count is not divisible by the note grid of divisions=240.
m_durTicks comes from the Encore binary and reflects the actual MIDI recording timing, which for live-recorded files can differ by several ticks from the theoretical value (e.g. 958 or 968 instead of 960 for 4/4). Using m_durTicks as the measure boundary caused MuseScore to report "Incomplete measure" warnings for measures like 129/128 or 1916/1920 of expected 4/4 — the fill/overflow checks were targeting the wrong boundary. Introduce theoreticalDurTicks() which computes the correct measure duration from m_timeSigNum * (960 / m_timeSigDen). Replace all measure-boundary uses of m_durTicks in measure() with this value so that the first-pass overflow check, the garbage-element filter, and the end-of-voice forward fill all use the written time signature.
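The formula is small enough to show directly. This free-function form is an assumption; in the PR it is described as working on m_timeSigNum and m_timeSigDen.

```cpp
// Measure length in ticks for a written time signature, at 240 ticks per
// quarter note (960 per whole note). Integer division means denominators
// that do not divide 960 are truncated.
int theoreticalDurTicks(int timeSigNum, int timeSigDen)
{
    if (timeSigDen <= 0)
        return 0;
    return timeSigNum * (960 / timeSigDen);
}
```

A 4/4 measure is 960 ticks, so a recorded m_durTicks of 958 or 968 no longer moves the measure boundary; 3/4 and 6/8 both give 720.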
…ration
MuseScore ignores <forward> elements when computing voice duration for the "Incomplete measure" check: only actual note and rest elements count. Two changes:
1. Remove all intra-voice <forward>/<backup> elements based on raw MIDI ticks. Like the MuseScore importer (which uses cumTick exclusively), notes are now placed at the current accumulated written-duration position, ignoring MIDI timing jitter from elem->m_tick.
2. Replace the end-of-voice <forward> fill with writeGapRest(), which writes a proper <note><rest/> element so MuseScore counts the gap toward the voice's total duration. For standard durations (16th, 8th, quarter, etc.) the correct <type> and <dot> are emitted; for non-standard gaps <type> is omitted and MuseScore infers it from <duration>.
…w m_tuplet calculateDuration() with raw m_tuplet values from MIDI-recorded Encore files (e.g. actualNotes=11, normalNotes=6) produces non-standard durations like 60 * 3/2 * 6/11 = 49, which MuseScore's MusicXML importer cannot represent as a standard TDuration and aborts with an assertion failure. Apply the same realDuration-based logic as durationNote(): use realDuration only when it is a recognised standard or tuplet value; otherwise fall back to plain face-value. This eliminates <duration>49</duration><type>64th</type> mismatches that caused the MuseScore crash.
The previous durationRest() used calculateDuration() with the raw m_tuplet byte (actualNotes/normalNotes), which for MIDI-recorded Encore files can contain noise like 11:6, producing non-standard values (e.g. 49 divisions) that crash MuseScore's MusicXML importer. The fix matches the MuseScore Encore importer's approach:
1. Use face value for the base duration (plain note type).
2. Detect dot count by comparing dotControl (Encore's sounding duration in Encore ticks) against base*3/2, base*7/4, base*15/8.
3. If dotControl is non-standard, use the plain face value (0 dots).
This ignores m_tuplet entirely for rests, which is correct because tuplet rests in MIDI-recorded files don't need explicit ratio encoding: the rest's sounding duration (dotControl) already describes the visual value.
Two fixes that eliminate measure overflows in MIDI-recorded Encore files:
1. durationRest dot formula: the iterative *3/2 loop gave 9/4 (= 270) for 2 dots when the correct formula is 7/4 (= 210, double-dotted 8th). Replace with direct: if dots==1 use base*3/2, dots==2 use base*7/4, dots==3 use base*15/8.
2. Second-pass overflow guard: a note accepted by the first pass as a chord extension (isChord=true in the first pass, no overflow check) can appear as a root note in the second pass when chordRootTick2 diverges from chordRootTick (they diverge when elements filtered in the first pass alter chordRootTick but not chordRootTick2). Add an explicit tick + duration > measureDur check for root notes in the second pass, matching the first-pass guard.
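The difference between the two dot formulas in fix 1 can be reproduced with a short sketch. Both function names are hypothetical; the iterative version is shown only to illustrate the bug.

```cpp
// Direct formula: each dot adds half of the previous addition, which gives
// 3/2, 7/4 and 15/8 of the base for 1, 2 and 3 dots.
int dottedDuration(int base, int dots)
{
    switch (dots) {
    case 1: return base * 3 / 2;
    case 2: return base * 7 / 4;
    case 3: return base * 15 / 8;
    default: return base;
    }
}

// Buggy variant: multiplying by 3/2 once per dot compounds the factor,
// giving 9/4 for two dots instead of 7/4.
int iterativeDottedDuration(int base, int dots)
{
    int duration = base;
    for (int i = 0; i < dots; ++i)
        duration = duration * 3 / 2;
    return duration;
}
```

For an 8th note (120 ticks) with two dots the direct formula gives 210 and the iterative one gives 270; the two agree only for a single dot.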
Three fixes for compatibility with MuseScore import of MIDI-recorded files:
- Tuplet lookahead: the forceCloseTuplet logic checked nextNote->actualNotes() to decide if the next element continues a tuplet. In MIDI-recorded files actualNotes is 0 and the tuplet ratio is inferred from duration; the check now falls back to detectTuplet so MIDI triplets are recognised correctly. Without this, the last note of a 3-note triplet was incorrectly marked as a tuplet stop one note too early, leaving an orphaned start that MuseScore imported as an incomplete tuplet.
- Gap rest splitting: writeGapRest now recursively splits non-standard durations (e.g. 75 ticks = 60+15) into standard note-type rests. Previously a typeless rest was written and MuseScore's TDuration rounded it to the nearest standard value, causing measure overflow.
- Chord note duration: chord extension notes now inherit the root note duration for the <duration> element, preventing MusicXML importers from double-counting chord notes with different raw MIDI durations.
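The gap splitting can be sketched with a greedy loop. This is not the PR's writeGapRest(), which is described as recursive: the function name is hypothetical, only plain (undotted) values from whole note to 64th are used, and appending a leftover smaller than a 64th unchanged is a choice made for this sketch.

```cpp
#include <vector>

// Split a gap (in ticks, 240 per quarter) into standard rest durations,
// largest first. Any remainder smaller than a 64th (15 ticks) is appended
// as-is so the pieces always sum to the original gap.
std::vector<int> splitGapRest(int gapTicks)
{
    static const int standard[] = {960, 480, 240, 120, 60, 30, 15};
    std::vector<int> pieces;
    int remaining = gapTicks;
    for (int value : standard) {
        while (remaining >= value) {
            pieces.push_back(value);
            remaining -= value;
        }
    }
    if (remaining > 0)
        pieces.push_back(remaining);
    return pieces;
}
```

A 75-tick gap becomes a 16th rest (60) plus a 64th rest (15), so each piece can be written with an explicit <type>.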
Add atraing.enc (31-staff MIDI-recorded orchestral score) as a regression test for the tuplet lookahead and gap rest splitting fixes. The file exercises complex MIDI timing patterns including inferred triplets, chord clusters across multiple staves, and non-standard voice gaps. Remove pachbel-hiden.enc from the test suite.
Summary
This PR adds support for legacy Encore file formats and fixes several conversion issues found while testing against a corpus of 1067 files. It also improves compatibility with MuseScore Studio import for MIDI-recorded orchestral scores.
Older Encore format support
Duration and measure fixes
- Calculate real note/rest durations from tick differences instead of relying on faceValue, which is unreliable in v0xC2 files
- Scale durations proportionally when m_durTicks does not match the actual time signature (e.g. files using 720 ticks for 3/4 time)
- Remove incorrect scale factor in calculateRealDurations(): Encore files use the same 240-ticks-per-quarter convention as MusicXML
- Fix durationRest() to use dotControl as sounding duration, with correct dot formula (base*3/2 for 1 dot, base*7/4 for 2 dots)
- durationRest() now ignores the m_tuplet byte, which produced non-standard values in tuplet contexts
Tuplet and dot fixes
- Detect triplets (3:2) and quintuplets (5:4) from duration when m_tuplet is not set (old format files)
- Add triplet duration mappings to correctNoteType()
- Fix TupletHandler to use duration-based grouping instead of count-based grouping
- Calculate the dot count from the real duration when m_dotControl is not set (old format files)
- forceCloseTuplet logic now uses detectTuplet() as fallback when actualNotes is 0 in the file, so MIDI-recorded triplets are not closed one note too early
Gap rest fixes
- Replace end-of-voice <forward> elements with actual <rest> elements so MuseScore counts them toward voice duration
- Split non-standard gap durations into standard rests; a typeless rest previously caused TDuration to round to the wrong value and produce measure overflow
Tie fix
- Remove xoffset comparison from tie detection: xoffset represents visual placement, not note identity. Ties are now matched by tick, voice, and staff only.
MIDI-recorded file improvements
- Detect chords with a tick tolerance (CHORD_MIDI_THRESHOLD) so notes recorded near-simultaneously are grouped correctly
- Remove raw-tick-based <forward>/<backup> placement
- Chord extension notes inherit the root note duration for <duration> to prevent MusicXML importers double-counting notes with different raw MIDI durations
Robustness fixes
- Bounds checking in addSpannerEnds() to prevent crashes with corrupted measure references
- Protection for empty m_lines vector in countStaves()
- Fixed grace note detection (removed incorrect 0x40 bit check)
- Fixed chord detection (notes with the same tick are chords regardless of x_offset visual positioning)
Refactoring
- Extract common duration calculation into calculateDuration()
- Add nstaves() and isTablature() helpers
- Fix writeTimeModification to only emit real tuplets (actual > 1 and actual != normal)
Test data
- Adds a .enc file as a regression test. It was validated against a personal collection of 691 .enc files of various formats and complexity.
Results
The remaining 74 empty-output files are ZBOT-encrypted or template files with no musical content.