Context
The Nano Banana App currently saves PNG files by directly writing the byte stream received from the Gemini API to disk. While this works for most applications, Adobe Photoshop and other professional image editing tools have stricter requirements for PNG file format compliance. The fact that WeChat can "fix" these files suggests that WeChat's image processing pipeline re-writes PNG files to ensure standard compliance.
Current Issues
- Direct Byte Writing: The application writes Gemini API responses directly to files without validation
- PNG Header Issues: Some PNG files may lack proper chunks or metadata expected by professional tools
- No Format Validation: No verification that the written files conform to PNG specification
- Inconsistent Behavior: Some images work fine, others fail in specific applications
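The chunk and header hypothesis above can be checked on a failing file before any save code changes. The following is a minimal, standard-library-only sketch that lists each chunk and verifies its CRC; `list_png_chunks` is a hypothetical helper name, not something that exists in the app.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def list_png_chunks(data: bytes) -> list[tuple[str, int, bool]]:
    """Return (chunk type, payload length, CRC matches) for each chunk in PNG bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("missing PNG signature")
    chunks = []
    offset = len(PNG_SIGNATURE)
    # Each chunk is: 4-byte big-endian length, 4-byte type, payload, 4-byte CRC.
    while offset + 12 <= len(data):
        (length,) = struct.unpack(">I", data[offset:offset + 4])
        chunk_type = data[offset + 4:offset + 8]
        payload = data[offset + 8:offset + 8 + length]
        crc_bytes = data[offset + 8 + length:offset + 12 + length]
        if len(payload) < length or len(crc_bytes) < 4:
            raise ValueError("truncated chunk")
        (stored_crc,) = struct.unpack(">I", crc_bytes)
        # The CRC covers the chunk type and payload, not the length field.
        crc_ok = zlib.crc32(chunk_type + payload) == stored_crc
        chunks.append((chunk_type.decode("latin-1"), length, crc_ok))
        offset += 12 + length
        if chunk_type == b"IEND":
            break
    return chunks
```

Running this on a file that Photoshop rejects and on the same file after it has passed through WeChat should show which chunks differ, if any.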
Goals / Non-Goals
Goals
- Ensure all generated PNG files are fully compliant with PNG specification
- Fix compatibility issues with Adobe Photoshop and other professional tools
- Maintain 100% backward compatibility with existing functionality
- Provide graceful fallback if Pillow library is not available
- Apply fixes to all image save locations consistently
Non-Goals
- Change image quality or visual appearance
- Modify file naming or directory structure
- Add new image formats beyond PNG
- Change the API interaction with Gemini
Decisions
Decision: Use Pillow (PIL) for PNG Processing
- What: Use the third-party Pillow library (imported as PIL) to validate and re-write PNG files
- Why: Pillow is the de facto standard for image processing in Python and ensures PNG specification compliance
- How: Wrap existing byte writing with Pillow Image.open() and Image.save() operations
Decision: Implement Graceful Fallback
- What: If Pillow is not available or fails, fall back to original byte writing
- Why: Ensures the application continues to work even without Pillow dependency
- How: try/except blocks around Pillow operations with fallback to original method
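A common way to make a dependency optional is to attempt the import once at module load and record the result. The sketch below assumes that approach; `PILLOW_AVAILABLE` and `choose_save_strategy` are hypothetical names used only for illustration.

```python
try:
    from PIL import Image  # third-party; installed as the "Pillow" package
    PILLOW_AVAILABLE = True
except ImportError:
    Image = None
    PILLOW_AVAILABLE = False


def choose_save_strategy() -> str:
    """Name the save path that would be used in this environment."""
    return "pillow-reencode" if PILLOW_AVAILABLE else "raw-bytes"
```

Recording availability once avoids repeating the import attempt on every save, and the flag can also drive a one-time log message at startup.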
Decision: Apply Fix Consistently
- What: Apply PNG re-writing to all save locations (generated images, reference images, downloads)
- Why: Ensures consistent behavior across all PNG files created by the application
- How: Create utility function and apply it in all image save operations
Technical Architecture
PNG Processing Pipeline
Gemini API Response (bytes)
↓
Pillow Image.open(io.BytesIO(bytes))
↓
Validate Image Integrity
↓
Image.save(file_path, 'PNG', optimize=True)
↓
Standard PNG File (Photoshop Compatible)
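The pipeline above might be sketched as follows. Note that `Image.open()` is lazy and does not check the file by itself, so this sketch assumes the "Validate Image Integrity" step maps to Pillow's `Image.verify()`. The function name `reencode_png` is hypothetical.

```python
import io


def reencode_png(image_bytes: bytes) -> bytes:
    """Verify PNG bytes with Pillow, then re-encode them as a fresh PNG."""
    from PIL import Image  # third-party dependency (Pillow)

    # verify() checks file integrity without decoding pixel data. The image
    # object cannot be used afterwards, so the bytes are opened a second time.
    with Image.open(io.BytesIO(image_bytes)) as probe:
        probe.verify()

    out = io.BytesIO()
    with Image.open(io.BytesIO(image_bytes)) as img:
        img.save(out, format="PNG", optimize=True)
    return out.getvalue()
```

Returning bytes instead of writing to `file_path` directly keeps the sketch easy to test; the caller can write the result to disk in a single step.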
Fallback Mechanism
Gemini API Response (bytes)
↓
Try Pillow Processing
↓
If Success → Use Standard PNG
↓
If Failed → Original Byte Writing
↓
Ensure Application Continues Working
Implementation Strategy
Phase 1: Create PNG Utility Function
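import io
import logging

logger = logging.getLogger(__name__)

def save_png_with_validation(file_path: str, image_bytes: bytes) -> bool:
    """Save PNG with format validation using Pillow.

    Returns True if Pillow re-wrote the file. Returns False if Pillow is
    missing or fails, so the caller can fall back to original byte writing.
    """
    try:
        # Imported here so that a missing Pillow is handled like any other failure
        from PIL import Image
        with Image.open(io.BytesIO(image_bytes)) as img:
            # Re-encoding forces a full decode, which surfaces corrupt data
            img.save(file_path, 'PNG', optimize=True)
        return True
    except Exception as e:
        logger.warning(f"Pillow processing failed, using fallback: {e}")
        return False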
Phase 2: Apply to All Save Locations
- HistoryManager.save_generation() - generated images
- HistoryManager.save_generation() - reference images
- ImageGeneratorWindow.download_image() - user downloads
Phase 3: Add Validation and Logging
- Log when Pillow processing succeeds/fails
- Provide user feedback for processing issues
- Add error handling for corrupt image data
Risk Mitigation
Dependency Management
- Risk: Adding Pillow dependency
- Mitigation: Graceful fallback to original method if Pillow unavailable
Performance Impact
- Risk: Additional processing time for PNG re-writing
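- Mitigation: Overhead is expected to be minimal. Note that optimize=True makes Pillow spend extra encoding time to minimize file size; drop it or use a lower compress_level if save latency becomes noticeable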
Compatibility Issues
- Risk: Pillow processing might change image characteristics
- Mitigation: Extensive testing with various image types and use cases
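Part of this testing can be automated by decoding a file before and after re-writing and comparing the results. The sketch below assumes Pillow is installed in the test environment; `pixels_unchanged` is a hypothetical helper name.

```python
import io


def pixels_unchanged(original: bytes, rewritten: bytes) -> bool:
    """True if both PNG byte strings decode to the same mode, size, and pixel data."""
    from PIL import Image  # third-party dependency (Pillow)

    with Image.open(io.BytesIO(original)) as a, Image.open(io.BytesIO(rewritten)) as b:
        # tobytes() decodes the full image, so this compares actual pixel values.
        return a.mode == b.mode and a.size == b.size and a.tobytes() == b.tobytes()
```

This compares decoded pixels only. Ancillary metadata such as text chunks is not compared and would need a separate check if it matters.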
File Size Changes
- Risk: PNG re-writing might change file sizes
- Mitigation: Monitor and optimize file size impact
Testing Strategy
Compatibility Testing
- Test with Adobe Photoshop (multiple versions)
- Test with other professional tools (GIMP, Affinity Photo, etc.)
- Test with web browsers and image viewers
- Test with various PNG source images
Regression Testing
- Ensure existing functionality remains intact
- Test image generation workflow end-to-end
- Verify history management still works correctly
- Confirm download functionality unchanged
Performance Testing
- Measure PNG processing time impact
- Test with large image files
- Validate memory usage during processing
- Ensure UI remains responsive
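Processing time and output size can be measured together with a small helper like the one below. This is a sketch that assumes Pillow is installed; `measure_png_save` is a hypothetical name, and the keyword arguments are passed straight through to Pillow's PNG writer.

```python
import io
import time


def measure_png_save(image_bytes: bytes, **save_kwargs) -> tuple[float, int]:
    """Re-encode PNG bytes in memory; return (elapsed seconds, output size in bytes)."""
    from PIL import Image  # third-party dependency (Pillow)

    start = time.perf_counter()
    out = io.BytesIO()
    with Image.open(io.BytesIO(image_bytes)) as img:
        img.save(out, format="PNG", **save_kwargs)
    elapsed = time.perf_counter() - start
    return elapsed, len(out.getvalue())
```

Comparing, for example, `measure_png_save(data, optimize=True)` against `measure_png_save(data, compress_level=1)` on representative images shows the time and size trade-off for the chosen settings.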
Success Criteria
- Photoshop Compatibility: All PNG files can be opened in Adobe Photoshop
- Backward Compatibility: Existing functionality remains unchanged
- Performance: No significant impact on generation or save times
- Reliability: Graceful handling of edge cases and errors
- Consistency: All PNG files in the application follow the same standard
Rollback Plan
If issues arise, the implementation can be rolled back by:
- Removing Pillow imports and utility functions
- Restoring original direct byte writing code
- Removing PNG validation logging
Rollback is straightforward because all changes are isolated to specific save methods.