Context

The Nano Banana App currently saves PNG files by directly writing the byte stream received from the Gemini API to disk. While this works for most applications, Adobe Photoshop and other professional image editing tools have stricter requirements for PNG file format compliance. The fact that WeChat can "fix" these files suggests that WeChat's image processing pipeline re-writes PNG files to ensure standard compliance.

Current Issues

  1. Direct Byte Writing: The application writes Gemini API responses directly to files without validation
  2. PNG Header Issues: Some PNG files may lack proper chunks or metadata expected by professional tools
  3. No Format Validation: No verification that the written files conform to PNG specification
  4. Inconsistent Behavior: Some images work fine, others fail in specific applications
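
One way to confirm which of these issues affects a given file is to inspect its chunk layout directly. The sketch below uses only the standard library; `list_png_chunks` is a hypothetical helper name, not part of the existing code base. It walks the file, lists each chunk, and checks its CRC, which is enough to spot a missing signature, a truncated file, or a corrupt chunk.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def list_png_chunks(data: bytes):
    """Return [(chunk_type, length, crc_ok), ...] for a PNG byte string.

    Raises ValueError if the signature is missing or a chunk is truncated.
    """
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    chunks = []
    pos = 8
    while pos < len(data):
        if pos + 8 > len(data):
            raise ValueError("truncated chunk header")
        # Each chunk: 4-byte big-endian length, 4-byte type, body, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 8 + length + 4
        if end > len(data):
            raise ValueError("truncated chunk body")
        body = data[pos + 8:pos + 8 + length]
        (stored_crc,) = struct.unpack(">I", data[end - 4:end])
        # The CRC covers the chunk type and the body, not the length field.
        crc_ok = (zlib.crc32(ctype + body) & 0xFFFFFFFF) == stored_crc
        chunks.append((ctype.decode("latin-1"), length, crc_ok))
        pos = end
        if ctype == b"IEND":
            break
    return chunks
```

Running this on a file that fails in Photoshop and on the same file after WeChat has re-written it should show which chunks differ.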

Goals / Non-Goals

Goals

  • Ensure all generated PNG files are fully compliant with PNG specification
  • Fix compatibility issues with Adobe Photoshop and other professional tools
  • Maintain 100% backward compatibility with existing functionality
  • Provide graceful fallback if Pillow library is not available
  • Apply fixes to all image save locations consistently

Non-Goals

  • Change image quality or visual appearance
  • Modify file naming or directory structure
  • Add new image formats beyond PNG
  • Change the API interaction with Gemini

Decisions

Decision: Use Pillow (PIL) for PNG Processing

  • What: Use the third-party Pillow library (the maintained fork of PIL) to validate and re-write PNG files
  • Why: Pillow is the de facto standard for image processing in Python and ensures PNG specification compliance
  • How: Wrap existing byte writing with Pillow Image.open() and Image.save() operations

Decision: Implement Graceful Fallback

  • What: If Pillow is not available or fails, fall back to original byte writing
  • Why: Ensures the application continues to work even without Pillow dependency
  • How: try/except blocks around Pillow operations with fallback to original method

Decision: Apply Fix Consistently

  • What: Apply PNG re-writing to all save locations (generated images, reference images, downloads)
  • Why: Ensures consistent behavior across all PNG files created by the application
  • How: Create utility function and apply it in all image save operations

Technical Architecture

PNG Processing Pipeline

Gemini API Response (bytes)
    ↓
Pillow Image.open(io.BytesIO(bytes))
    ↓
Validate Image Integrity
    ↓
Image.save(file_path, 'PNG', optimize=True)
    ↓
Standard PNG File (Photoshop Compatible)

Fallback Mechanism

Gemini API Response (bytes)
    ↓
Try Pillow Processing
    ↓
If Success → Use Standard PNG
    ↓
If Failed → Original Byte Writing
    ↓
Ensure Application Continues Working
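
The fallback flow above can be sketched as a single function. This is a minimal illustration, not the final implementation: `save_image_with_fallback` and its return values are hypothetical names chosen for the example. Pillow is imported inside the `try` block so that a missing installation takes the same path as a decode failure.

```python
import io
import logging

logger = logging.getLogger(__name__)


def save_image_with_fallback(file_path: str, image_bytes: bytes) -> str:
    """Write image_bytes to file_path, re-encoding through Pillow when possible.

    Returns "pillow" if the file was re-encoded, or "raw" if the original
    bytes were written unchanged.
    """
    try:
        from PIL import Image  # optional dependency

        with Image.open(io.BytesIO(image_bytes)) as img:
            img.save(file_path, "PNG")
        return "pillow"
    except Exception as exc:  # covers ImportError as well as decode/save errors
        logger.warning("Pillow processing failed, writing raw bytes: %s", exc)

    # Fallback: the original behavior. Opening with "wb" also overwrites any
    # partial file left behind by a failed Pillow save.
    with open(file_path, "wb") as fh:
        fh.write(image_bytes)
    return "raw"
```

Returning which path was taken lets the caller log or surface it without inspecting the file again.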

Implementation Strategy

Phase 1: Create PNG Utility Function

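import io
import logging

logger = logging.getLogger(__name__)


def save_png_with_validation(file_path: str, image_bytes: bytes) -> bool:
    """Re-encode image bytes as a PNG file using Pillow.

    Returns True on success. Returns False if Pillow is missing or the data
    cannot be processed, so the caller can fall back to writing the raw bytes.
    """
    try:
        # Imported lazily so a missing Pillow install triggers the fallback
        from PIL import Image

        with Image.open(io.BytesIO(image_bytes)) as img:
            img.save(file_path, 'PNG', optimize=True)
        return True
    except Exception as e:
        logger.warning(f"Pillow processing failed, using fallback: {e}")
        return False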

Phase 2: Apply to All Save Locations

  1. HistoryManager.save_generation() - generated images
  2. HistoryManager.save_generation() - reference images
  3. ImageGeneratorWindow.download_image() - user downloads

Phase 3: Add Validation and Logging

  • Log when Pillow processing succeeds/fails
  • Provide user feedback for processing issues
  • Add error handling for corrupt image data

Risk Mitigation

Dependency Management

  • Risk: Adding Pillow dependency
  • Mitigation: Graceful fallback to original method if Pillow unavailable

Performance Impact

  • Risk: Additional processing time for PNG re-writing
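  • Mitigation: Re-encoding overhead is expected to be small for typical image sizes. Note that optimize=True makes encoding slower, not faster (it forces the maximum zlib compression level), so if save latency matters, drop optimize or pass a lower compress_level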

Compatibility Issues

  • Risk: Pillow processing might change image characteristics
  • Mitigation: Extensive testing with various image types and use cases
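
One concrete check for this risk is to confirm that re-encoding leaves the decoded pixels untouched. The sketch below assumes Pillow is installed; `reencode_png` and `same_pixels` are hypothetical helper names for illustration.

```python
import io


def reencode_png(image_bytes: bytes) -> bytes:
    """Re-encode image bytes as PNG in memory, mirroring the save step."""
    from PIL import Image  # Pillow is assumed to be installed here

    with Image.open(io.BytesIO(image_bytes)) as img:
        out = io.BytesIO()
        img.save(out, "PNG", optimize=True)
    return out.getvalue()


def same_pixels(a_bytes: bytes, b_bytes: bytes) -> bool:
    """True if both images decode to the same mode, size, and pixel data."""
    from PIL import Image

    with Image.open(io.BytesIO(a_bytes)) as a, Image.open(io.BytesIO(b_bytes)) as b:
        return a.mode == b.mode and a.size == b.size and a.tobytes() == b.tobytes()
```

A test can then assert `same_pixels(original, reencode_png(original))` for each sample image. This compares pixel data only. It does not compare metadata such as color profiles, which can also affect how an editor renders the file.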

File Size Changes

  • Risk: PNG re-writing might change file sizes
  • Mitigation: Monitor and optimize file size impact

Testing Strategy

Compatibility Testing

  • Test with Adobe Photoshop (multiple versions)
  • Test with other professional tools (GIMP, Affinity Photo, etc.)
  • Test with web browsers and image viewers
  • Test with various PNG source images

Regression Testing

  • Ensure existing functionality remains intact
  • Test image generation workflow end-to-end
  • Verify history management still works correctly
  • Confirm download functionality unchanged

Performance Testing

  • Measure PNG processing time impact
  • Test with large image files
  • Validate memory usage during processing
  • Ensure UI remains responsive
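
The processing-time measurement could be sketched as follows. This is a rough harness under stated assumptions: `time_call` and `write_raw` are hypothetical names, and the random payload is only a stand-in for real PNG bytes. Timing the Pillow-based save function with the same helper gives a like-for-like comparison against the original byte write.

```python
import os
import statistics
import tempfile
import time


def time_call(fn, *args, repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over `repeats` calls."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def write_raw(file_path: str, image_bytes: bytes) -> None:
    """Baseline: the original direct byte write."""
    with open(file_path, "wb") as fh:
        fh.write(image_bytes)


if __name__ == "__main__":
    payload = os.urandom(1_000_000)  # stand-in data; use real PNG bytes in practice
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.png")
        print(f"raw write: {time_call(write_raw, path, payload):.4f}s")
```

The median is used rather than the mean so that a single slow run (for example, a cold disk cache) does not skew the result.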

Success Criteria

  1. Photoshop Compatibility: All PNG files can be opened in Adobe Photoshop
  2. Backward Compatibility: Existing functionality remains unchanged
  3. Performance: No significant impact on generation or save times
  4. Reliability: Graceful handling of edge cases and errors
  5. Consistency: All PNG files in the application follow the same standard

Rollback Plan

If issues arise, the implementation can be rolled back by:

  1. Removing Pillow imports and utility functions
  2. Restoring original direct byte writing code
  3. Removing PNG validation logging

Because all changes are isolated to specific save methods, the rollback is limited to those methods.