Context
The Nano Banana App currently uses PySide6 for its GUI and relies on a traditional file dialog for image uploads. Users must click the "添加图片" (Add Image) button, navigate the file explorer, and select images. This workflow is inefficient for users who frequently work with screenshots or web images, or who need to add several reference images quickly.
Goals / Non-Goals
Goals
- Enable drag-and-drop of image files directly onto the preview area
- Support paste functionality for clipboard images (screenshots, copied web images)
- Maintain all existing functionality without breaking changes
- Provide clear visual feedback during drag operations
- Enhance image validation to reject invalid files
Non-Goals
- Replace existing file dialog upload method
- Support non-image file drag operations
- Implement complex file management features
- Change existing image storage structure
Decisions
Decision: Extend Existing UI Components
- What: Enhance the existing QScrollArea and image preview container to accept drops
- Why: Maintains current layout and behavior while adding new capabilities
- How: Subclass existing components or add event handlers to current widgets
Decision: Use Qt's Built-in Drag-and-Drop Framework
- What: Implement dragEnterEvent, dropEvent, and related Qt methods
- Why: Native Qt support provides cross-platform compatibility and consistent behavior
- How: Enable setAcceptDrops(True) on target widgets and handle QMimeData
Decision: Support Multiple Input Methods
- What: File dialog, drag-and-drop, and clipboard paste all coexist
- Why: Users have different preferences and workflows
- How: Route all inputs through a unified image validation and storage system
Risks / Trade-offs
Risk: File Type Security
- Risk: Malicious files could be dropped/pasted into the application
- Mitigation: Implement MIME type checking, file header validation, and size limits
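The header and size checks named in this mitigation could look roughly like the following. This is a minimal sketch using only the standard library; the function names, the 20 MiB limit, and the list of supported formats are assumptions, since the design leaves the actual limits open.

```python
from pathlib import Path
from typing import Optional

# Assumed limit; the design leaves the actual maximum as an open question.
MAX_IMAGE_BYTES = 20 * 1024 * 1024

# Leading "magic" bytes for a few common image formats.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"BM": "bmp",
}


def sniff_image_format(header: bytes) -> Optional[str]:
    """Return a format name based on the file's first bytes, or None."""
    for magic, name in _SIGNATURES.items():
        if header.startswith(magic):
            return name
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return None


def validate_image_file(path: str, max_bytes: int = MAX_IMAGE_BYTES) -> tuple:
    """Return (ok, detail): detail is the format name or a rejection reason."""
    p = Path(path)
    if not p.is_file():
        return False, "not a regular file"
    size = p.stat().st_size
    if size == 0:
        return False, "file is empty"
    if size > max_bytes:
        return False, f"file exceeds {max_bytes} bytes"
    with p.open("rb") as fh:
        header = fh.read(16)
    fmt = sniff_image_format(header)
    if fmt is None:
        return False, "unrecognized image header"
    return True, fmt
```

Checking the header instead of trusting the extension means a renamed non-image file is rejected before it reaches the preview code. A header match alone does not prove the file decodes cleanly, so a decode step may still be worth adding.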
Trade-off: Complex Event Handling
- Trade-off: More complex event handling code vs. improved user experience
- Decision: Accept complexity for significant UX improvement
Risk: Cross-platform Clipboard Variations
- Risk: Different clipboard behaviors across Windows, macOS, Linux
- Mitigation: Use Qt's unified clipboard API with fallbacks
Migration Plan
- Phase 1: Implement drag-and-drop file support
  - Add event handlers to image preview area
  - Implement file validation and processing
  - Add visual feedback (border highlighting)
- Phase 2: Add clipboard paste support
  - Implement paste event handling
  - Add temporary file handling for clipboard images
  - Integrate with existing preview system
- Phase 3: Enhance validation and error handling
  - Improve file type detection
  - Add user-friendly error messages
  - Implement size limits and warnings
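For the temporary file handling in Phase 2, one possible approach is sketched below. It assumes the clipboard image has already been encoded to PNG bytes, and that pasted images are written to a dedicated temp directory so they can be treated like any other file on disk. The directory name and both function names are hypothetical.

```python
import tempfile
import uuid
from pathlib import Path

# Hypothetical location; the design does not say where temp files should live.
TEMP_DIR = Path(tempfile.gettempdir()) / "nano_banana_clipboard"


def save_clipboard_png(png_bytes: bytes, temp_dir: Path = TEMP_DIR) -> Path:
    """Write PNG-encoded clipboard data to a uniquely named temp file."""
    temp_dir.mkdir(parents=True, exist_ok=True)
    target = temp_dir / f"clipboard_{uuid.uuid4().hex}.png"
    target.write_bytes(png_bytes)
    return target


def cleanup_clipboard_files(temp_dir: Path = TEMP_DIR) -> int:
    """Delete leftover clipboard temp files and return how many were removed."""
    removed = 0
    for path in temp_dir.glob("clipboard_*.png"):
        path.unlink(missing_ok=True)
        removed += 1
    return removed
```

Unique names avoid collisions when several screenshots are pasted in a row. Calling the cleanup function on application exit is one way to keep the directory from growing, though files still referenced by the upload list would need to be excluded.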
Open Questions
- Should we support drag-and-drop onto the entire application window or just the image area?
- What should be the maximum clipboard image size to prevent memory issues?
- Should we provide different visual feedback for different drag content types?
- How should we handle duplicate images from different sources?
Implementation Details
Technical Architecture
Input Sources:
├── File Dialog (existing)
├── Drag-and-Drop (new)
│   ├── Files from Explorer/Finder
│   └── Images from Web Browsers
└── Clipboard Paste (new)
    ├── Screenshots (PrtScn, Win+Shift+S)
    └── Copied Images (Ctrl+C)
↓ Unified Processing Pipeline:
Image Validation:
├── MIME Type Check
├── File Header Validation
├── Size Limits
└── Format Support Check
↓ Storage & Display:
Image Management:
├── Add to self.uploaded_images list
├── Generate thumbnail preview
├── Update UI count and layout
└── Maintain existing delete functionality
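The architecture above funnels every input source through one validation step before anything reaches the upload list. A small sketch of that routing follows. `ImageIntake` is a hypothetical stand-in for the part of the application that owns `uploaded_images`, the list is assumed to hold file paths, and the validator is injected so the example stays independent of the real checks.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterable


@dataclass
class ImageIntake:
    """Hypothetical owner of the upload list; all sources go through add_images."""

    validate: Callable[[str], bool]
    uploaded_images: list = field(default_factory=list)
    rejected: list = field(default_factory=list)

    def add_images(self, paths: Iterable[str], source: str) -> int:
        """Validate paths from any input source; return how many were added."""
        added = 0
        for path in paths:
            if self.validate(path):
                self.uploaded_images.append(path)
                added += 1
            else:
                # Remember the source so the UI can report a useful error.
                self.rejected.append((source, path))
        return added


# Placeholder validator for illustration only (extension check).
intake = ImageIntake(validate=lambda p: p.lower().endswith((".png", ".jpg")))
intake.add_images(["a.png"], source="file_dialog")
intake.add_images(["b.jpg", "notes.txt"], source="drag_drop")
intake.add_images(["clipboard_1.png"], source="clipboard")
print(intake.uploaded_images)
print(intake.rejected)
```

After each call, the real application would refresh the thumbnails and the image count. Duplicate handling is deliberately left out here because it is still listed as an open question.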
Event Handling Flow
- Drag Enter Event: Validate drag content, accept only if it contains valid images
- Drag Move Event: Update visual feedback (highlight, cursor)
- Drop Event: Process dropped files, add to upload list
- Paste Event: Check clipboard for image data, process if present
- Validation Event: Unified validation for all input sources
Key Components to Modify
- image_generator.py: Main application file
- upload_images() method: Extend to handle multiple input sources
- update_image_preview() method: Reuse existing preview logic
- Image preview area: Add drag-and-drop event handlers
- Main window: Add paste event handling