Feature Request: Add stripBinaryFields() for LLM-friendly output #7

@PhilflowIO

🚀 Feature Request

Add a new function stripBinaryFields() to remove binary/large fields from iCal/vCard data, making output LLM-friendly.

🔗 Related Issue

This addresses PhilflowIO/dav-mcp#33 - Contact photos (PHOTO field) cause LLM context overflow

💡 Problem

When vCards contain photos, the Base64-encoded image data can be several MB in size, causing:

  • LLM context overflow
  • Massive token usage and costs
  • Slow responses or timeouts
  • Memory issues in MCP servers

A similar issue affects iCalendar ATTACH properties that carry inline data with ENCODING=BASE64.
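To put "several MB" into numbers, here is a rough sketch of the size arithmetic. The Base64 formula is exact (every 3 bytes become 4 characters). The characters-per-token ratio is only an assumption for illustration and varies by tokenizer, and the 1.5 MB photo size is hypothetical.

```typescript
// Base64 encodes every 3 input bytes as 4 output characters (padding included).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// ASSUMPTION: ~4 characters per token is a rough rule of thumb, not a measured
// value; real tokenizers may differ considerably on Base64 text.
function roughTokenEstimate(charCount: number, charsPerToken = 4): number {
  return Math.ceil(charCount / charsPerToken);
}

const photoBytes = 1_500_000; // hypothetical 1.5 MB JPEG
const encodedChars = base64Length(photoBytes); // 2,000,000 characters
const estimatedTokens = roughTokenEstimate(encodedChars); // 500,000 under the assumption above

console.log({ photoBytes, encodedChars, estimatedTokens });
```

Even under generous assumptions, a single contact photo can exceed the whole context budget of a request, which is why stripping has to happen before serialization.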

📝 Proposed API

/**
 * Strip binary/large fields from iCal/vCard data for LLM-friendly output
 * Removes: PHOTO, LOGO (vCard), ATTACH with BASE64 encoding (iCal)
 * 
 * @param calendarObject - iCal/vCard string or object with 'data' field
 * @returns Cleaned iCal/vCard string without binary data
 * 
 * @example
 * // vCard with photo
 * const cleanedCard = stripBinaryFields(vcard);
 * // PHOTO and LOGO properties removed
 * 
 * // Event with attachment
 * const cleanedEvent = stripBinaryFields(event);
 * // ATTACH properties with ENCODING=BASE64 removed
 */
export function stripBinaryFields(calendarObject: CalendarObjectInput): string;
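For readers unfamiliar with the input type, the following is a minimal sketch of what `CalendarObjectInput` could look like, inferred only from the `@param` description above ("string or object with 'data' field"). The actual type in tsdav-utils may differ, and `toIcalString` is a hypothetical helper name.

```typescript
// ASSUMPTION: inferred from the @param description; the real tsdav-utils
// type may carry more fields (url, etag, ...).
type CalendarObjectInput = string | { data: string };

// Hypothetical helper: normalize either input shape to the raw iCal/vCard text.
function toIcalString(input: CalendarObjectInput): string {
  return typeof input === 'string' ? input : input.data;
}

console.log(toIcalString('BEGIN:VCARD\r\nEND:VCARD'));
console.log(toIcalString({ data: 'BEGIN:VCALENDAR\r\nEND:VCALENDAR' }));
```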

🔧 Implementation Approach

Using the same ical.js pattern as updateFields():

import ICAL from 'ical.js';

export function stripBinaryFields(calendarObject: CalendarObjectInput): string {
  const icalString = typeof calendarObject === 'string'
    ? calendarObject
    : calendarObject.data;

  const jcalData = ICAL.parse(icalString);
  const component = new ICAL.Component(jcalData);

  // vCard: remove PHOTO and LOGO
  if (component.name === 'vcard') {
    component.removeAllProperties('photo');
    component.removeAllProperties('logo');
    return component.toString();
  }

  // iCalendar: inspect every VEVENT/VTODO/VJOURNAL, not only the first one.
  // A bare component without a VCALENDAR wrapper is inspected directly.
  const targets = component.name === 'vcalendar'
    ? ['vevent', 'vtodo', 'vjournal'].flatMap(name => component.getAllSubcomponents(name))
    : [component];

  for (const target of targets) {
    // Iterate over a copy so that removing properties cannot disturb the loop
    for (const attach of [...target.getAllProperties('attach')]) {
      const encoding = attach.getParameter('encoding');
      // Parameter values are case-insensitive; only inline BASE64 data is removed,
      // URI attachments are left untouched
      if (typeof encoding === 'string' && encoding.toLowerCase() === 'base64') {
        target.removeProperty(attach);
      }
    }
  }

  return component.toString();
}
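As a point of comparison, the same filtering can be sketched without a parser by working on unfolded content lines. This is not the ical.js approach proposed above but a swapped-in, text-level alternative. The name `stripBinaryLines` is hypothetical, and the sketch deliberately ignores quoted parameter values that contain `:` or `;` and does not re-fold long lines on output.

```typescript
// Property names whose content lines are dropped unconditionally (vCard).
const ALWAYS_STRIP = new Set(['PHOTO', 'LOGO']);

// Join folded content lines: a physical line that starts with a space or tab
// continues the previous logical line.
function unfold(text: string): string[] {
  const lines: string[] = [];
  for (const raw of text.split(/\r?\n/)) {
    if ((raw.startsWith(' ') || raw.startsWith('\t')) && lines.length > 0) {
      lines[lines.length - 1] += raw.slice(1);
    } else if (raw !== '') {
      lines.push(raw);
    }
  }
  return lines;
}

// Hypothetical text-level variant of stripBinaryFields().
function stripBinaryLines(text: string): string {
  const kept = unfold(text).filter(line => {
    // Everything before the first ':' is "NAME;PARAM=...;PARAM=..."
    const colon = line.indexOf(':');
    const head = (colon === -1 ? line : line.slice(0, colon)).toUpperCase();
    const parts = head.split(';');
    const rawName = parts[0] ?? '';
    const params = parts.slice(1);
    // Drop an optional group prefix such as "item1." in "item1.PHOTO"
    const name = rawName.slice(rawName.lastIndexOf('.') + 1);
    if (ALWAYS_STRIP.has(name)) return false;
    if (name === 'ATTACH' && params.includes('ENCODING=BASE64')) return false;
    return true;
  });
  return kept.join('\r\n') + '\r\n';
}

const sample =
  'BEGIN:VCARD\r\nVERSION:3.0\r\nFN:John Doe\r\n' +
  'PHOTO;ENCODING=b;TYPE=JPEG:AAAA\r\n BBBB\r\nEND:VCARD\r\n';
console.log(stripBinaryLines(sample));
```

The parser-based version remains preferable for correctness. A text-level pass like this mainly avoids a full parse and re-serialize of multi-megabyte payloads.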

🎯 Use Case in dav-mcp

import { stripBinaryFields } from 'tsdav-utils';

// In formatContactList() - strip PHOTO before sending to LLM
output += JSON.stringify(contacts.map(c => ({
  url: c.url,
  etag: c.etag,
  data: stripBinaryFields(c.data)  // ← Removes PHOTO/LOGO
})), null, 2);

📦 Fields to Strip

vCard (RFC 6350)

  • PHOTO - Contact photo (often several MB Base64)
  • LOGO - Organization logo

iCalendar (RFC 5545)

  • ATTACH with ENCODING=BASE64 - Binary attachments

Future Considerations

  • SOUND (vCard) - Audio pronunciation
  • Other large proprietary extensions
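One way the future cases could be covered is a name list plus a size threshold, so that SOUND and unknown oversized extension properties are both caught. The sketch below is an assumption-laden illustration: `stripLargeProperties`, the option names, and the 1024-character default are all hypothetical and not part of the proposed API.

```typescript
// ASSUMPTION: option names and defaults are illustrative only.
interface SizeStripOptions {
  extraNames?: string[];   // additional property names to drop, e.g. ['SOUND']
  maxLineLength?: number;  // drop any unfolded content line longer than this
}

function stripLargeProperties(text: string, options: SizeStripOptions = {}): string {
  const names = new Set((options.extraNames ?? []).map(n => n.toUpperCase()));
  const maxLen = options.maxLineLength ?? 1024;

  // Unfold: a line break followed by a space or tab continues the previous line.
  const lines = text
    .replace(/\r?\n[ \t]/g, '')
    .split(/\r?\n/)
    .filter(line => line !== '');

  const kept = lines.filter(line => {
    // Optional "group." prefix, then the property name up to ';' or ':'
    const match = /^(?:[^.;:]+\.)?([^;:]+)/.exec(line);
    const name = match?.[1]?.toUpperCase() ?? '';
    return !names.has(name) && line.length <= maxLen;
  });

  return kept.join('\r\n') + '\r\n';
}

const bigCard = [
  'BEGIN:VCARD',
  'VERSION:4.0',
  'FN:Jane Doe',
  'SOUND:data:audio/ogg;base64,AAAA',
  'X-CUSTOM-BLOB:' + 'A'.repeat(5000),
  'END:VCARD',
].join('\r\n') + '\r\n';

console.log(stripLargeProperties(bigCard, { extraNames: ['SOUND'] }));
```

A threshold has the drawback that it can also drop legitimately long text such as a large NOTE or DESCRIPTION, so it would likely need to be opt-in.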

✅ Benefits

  1. LLM-safe output - No context overflow
  2. Token efficiency - Massive reduction in token usage
  3. Reusable - Any project using tsdav-utils benefits
  4. Consistent API - Matches updateFields() pattern
  5. Non-destructive - Only affects output, not source data

🧪 Testing

describe('stripBinaryFields', () => {
  it('removes PHOTO from vCard', () => {
    const vcard = 'BEGIN:VCARD\nPHOTO;ENCODING=b:' + 'x'.repeat(100000) + '\nEND:VCARD';
    const cleaned = stripBinaryFields(vcard);
    expect(cleaned).not.toContain('PHOTO');
    expect(cleaned.length).toBeLessThan(vcard.length);
  });
  
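  it('removes BASE64 ATTACH from event', () => {
    // Wrapped in VCALENDAR: the implementation looks for VEVENT inside a vcalendar component
    const event = 'BEGIN:VCALENDAR\nBEGIN:VEVENT\nATTACH;ENCODING=BASE64:..huge..\nEND:VEVENT\nEND:VCALENDAR';
    const cleaned = stripBinaryFields(event);
    expect(cleaned).not.toContain('ATTACH');
  });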
  
  it('preserves non-binary fields', () => {
    const vcard = 'BEGIN:VCARD\nFN:John Doe\nEMAIL:john@example.com\nPHOTO:...\nEND:VCARD';
    const cleaned = stripBinaryFields(vcard);
    expect(cleaned).toContain('FN:John Doe');
    expect(cleaned).toContain('EMAIL:john@example.com');
  });
});
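The fixtures above use unfolded single-line values, whereas servers typically return folded lines with CRLF endings. A more realistic fixture could be generated as sketched below. `foldLine` and `makeVCardWithPhoto` are hypothetical helper names, the payload is a run of `A` characters (valid Base64 alphabet, not a real image), and the folding assumes ASCII-only content so that characters equal octets.

```typescript
// Fold a content line at 75 characters; continuation lines start with one space.
function foldLine(line: string, width = 75): string {
  const chunks: string[] = [line.slice(0, width)];
  // Each continuation line spends one character on the leading space,
  // so it carries width - 1 payload characters.
  for (let i = width; i < line.length; i += width - 1) {
    chunks.push(' ' + line.slice(i, i + width - 1));
  }
  return chunks.join('\r\n');
}

// Hypothetical fixture helper: a vCard 3.0 whose PHOTO carries
// `payloadChars` Base64 characters, folded and CRLF-terminated.
function makeVCardWithPhoto(payloadChars: number): string {
  const lines = [
    'BEGIN:VCARD',
    'VERSION:3.0',
    'N:Doe;John;;;',
    'FN:John Doe',
    'EMAIL:john@example.com',
    'PHOTO;ENCODING=b;TYPE=JPEG:' + 'A'.repeat(payloadChars),
    'END:VCARD',
  ];
  return lines.map(line => foldLine(line)).join('\r\n') + '\r\n';
}

const fixture = makeVCardWithPhoto(1000);
console.log(fixture.split('\r\n').length, 'physical lines');
```

Feeding `makeVCardWithPhoto(...)` into the tests would exercise the folded-line path that hand-written one-line strings skip.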

🚦 Priority

Medium-High - Blocks LLM usage for contacts with photos (common in Google Contacts, iCloud, etc.)
