CivilPlan MCP v2 - Bird's-Eye View Generation Design Spec
Overview
Add a new MCP tool generate_birdseye_view to CivilPlan MCP that generates 3D architectural/civil engineering bird's-eye view and perspective renderings using Google's Nano Banana Pro (Gemini 3 Pro Image) API. Additionally, remove all local LLM dependencies and create a polished release with comprehensive documentation.
Scope
In Scope
- New MCP tool: generate_birdseye_view, which generates 2 images (bird's-eye + perspective)
- Nano Banana Pro integration via google-genai Python SDK
- Project-type-specific prompt templates (road, building, water/sewerage, river, landscaping)
- Local LLM removal — delete all local LLM code and dependencies
- Release v2.0.0 — GitHub release with detailed README and connection guides
Out of Scope
- Night/day or seasonal variations
- Video/animation generation
- 3D model file export (OBJ, FBX, etc.)
Architecture
Data Flow
MCP Client (Claude / ChatGPT)
|
| MCP Protocol (HTTP)
v
CivilPlan MCP Server (FastMCP)
|
| generate_birdseye_view tool called
v
BirdseyeViewGenerator
|
|-- [If SVG drawing exists] Convert SVG to PNG reference image
|-- [Always] Build optimized prompt from project data
|
v
Google Gemini API (Nano Banana Pro model)
|
v
2x PNG images returned (bird's-eye + perspective)
|
|-- Save to output directory
|-- Return base64 + file paths via MCP response
New Files
| File | Purpose |
|---|---|
| civilplan_mcp/tools/birdseye_generator.py | MCP tool implementation |
| civilplan_mcp/prompts/birdseye_templates.py | Project-type prompt templates |
| civilplan_mcp/services/gemini_image.py | Nano Banana Pro API client wrapper |
| tests/test_birdseye_generator.py | Unit tests |
Tool Interface
@mcp.tool()
async def generate_birdseye_view(
    project_summary: str,            # Parsed project description (from project_parser)
    project_type: str,               # "road" | "building" | "water" | "river" | "landscape" | "mixed"
    svg_drawing: str | None,         # Optional SVG drawing content from drawing_generator
    resolution: str = "2k",          # "2k" | "4k"
    output_dir: str = "./output/renders"
) -> dict:
    """
    Returns:
        {
            "birdseye_view": {"path": str, "base64": str},
            "perspective_view": {"path": str, "base64": str},
            "prompt_used": str,
            "model": "nano-banana-pro"
        }
    """
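The return shape above could be assembled roughly as follows. This is a minimal sketch, not the project's implementation: `package_render` and `build_response` are hypothetical helper names, the file names `birdseye.png` and `perspective.png` are assumptions, and the demo uses placeholder bytes rather than real PNG data.

```python
import base64
import tempfile
from pathlib import Path

MODEL_LABEL = "nano-banana-pro"  # label taken from the Returns docstring


def package_render(png_bytes: bytes, name: str, output_dir: str) -> dict:
    """Save one image and return its {"path", "base64"} entry."""
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)  # create ./output/renders on first use
    path = out / f"{name}.png"              # file naming is an assumption
    path.write_bytes(png_bytes)
    return {
        "path": str(path),
        "base64": base64.b64encode(png_bytes).decode("ascii"),
    }


def build_response(birdseye: bytes, perspective: bytes,
                   prompt: str, output_dir: str) -> dict:
    """Assemble the dict described in the tool's docstring."""
    return {
        "birdseye_view": package_render(birdseye, "birdseye", output_dir),
        "perspective_view": package_render(perspective, "perspective", output_dir),
        "prompt_used": prompt,
        "model": MODEL_LABEL,
    }


# Usage with placeholder bytes standing in for real PNG data.
demo_dir = tempfile.mkdtemp()
response = build_response(b"fake-birdseye", b"fake-perspective",
                          "demo prompt", demo_dir)
print(sorted(response))
```

Returning both the path and the base64 payload lets an MCP client display the image inline while the server keeps a copy on disk.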
Prompt Template Strategy
Each project type gets a specialized prompt template:
- Road: Emphasize road alignment, terrain, surrounding land use, utility corridors
- Building: Emphasize building mass, facade, site context, parking/landscaping
- Water/Sewerage: Emphasize pipeline routes, treatment facilities, connection points
- River: Emphasize riverbank, embankments, bridges, flood plains
- Landscape: Emphasize vegetation, pathways, public spaces, terrain grading
- Mixed: Combine relevant elements from applicable types
Template format:
"Create a photorealistic {view_type} of a {project_type} project:
{project_details}
Style: Professional architectural visualization, Korean construction context,
clear weather, daytime, {resolution} resolution"
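One way the template and the per-type emphasis lists might be combined is sketched below. The names `EMPHASIS`, `VIEW_TYPES`, and `build_prompts` are illustrative, not taken from the codebase, and treating "mixed" as the union of all types is an assumption, since the spec does not say how the applicable types are chosen.

```python
PROMPT_TEMPLATE = (
    "Create a photorealistic {view_type} of a {project_type} project:\n"
    "{project_details}\n"
    "Style: Professional architectural visualization, Korean construction context,\n"
    "clear weather, daytime, {resolution} resolution"
)

# Emphasis hints copied from the project-type list in this section.
EMPHASIS = {
    "road": "road alignment, terrain, surrounding land use, utility corridors",
    "building": "building mass, facade, site context, parking/landscaping",
    "water": "pipeline routes, treatment facilities, connection points",
    "river": "riverbank, embankments, bridges, flood plains",
    "landscape": "vegetation, pathways, public spaces, terrain grading",
}

# Keys are hypothetical; the values fill the {view_type} slot.
VIEW_TYPES = {"birdseye": "bird's-eye view", "perspective": "perspective view"}


def build_prompts(project_summary: str, project_type: str,
                  resolution: str = "2k") -> dict:
    """Return one prompt per view for the given project type."""
    if project_type == "mixed":
        # Assumption: without more information, combine every type's hints.
        emphasis = "; ".join(EMPHASIS.values())
    elif project_type in EMPHASIS:
        emphasis = EMPHASIS[project_type]
    else:
        raise ValueError(f"unknown project_type: {project_type!r}")

    details = f"{project_summary}\nEmphasize: {emphasis}"
    return {
        key: PROMPT_TEMPLATE.format(
            view_type=view,
            project_type=project_type,
            project_details=details,
            resolution=resolution,
        )
        for key, view in VIEW_TYPES.items()
    }


prompts = build_prompts("2-lane road, 1.2 km, rural area", "road")
print(prompts["birdseye"])
```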
API Configuration
- API key stored via existing .env / secure_store.py pattern
- New env var: GEMINI_API_KEY
- SDK: google-genai (official Google Gen AI Python SDK)
- Model: gemini-3-pro-image (Nano Banana Pro)
- Error handling: On API failure, return error message without crashing the MCP tool
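The key lookup and the "return an error instead of crashing" rule could look something like this. It is a sketch under assumptions: `load_api_key` and `safe_generate` are hypothetical names, the `ok`/`error` result shape is invented for illustration, and the real google-genai request is deliberately left out and represented by a plain callable.

```python
import os
from typing import Callable, Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Read GEMINI_API_KEY; return None when it is missing or blank."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    return key or None


def safe_generate(call: Callable[[], bytes]) -> dict:
    """Run an image request and convert any failure into an error dict.

    `call` stands in for the actual SDK request, which is not shown here.
    """
    try:
        return {"ok": True, "image": call()}
    except Exception as exc:  # broad on purpose: the tool must not crash
        return {"ok": False, "error": f"{type(exc).__name__}: {exc}"}


def failing_request() -> bytes:
    raise RuntimeError("quota exceeded")


print(safe_generate(failing_request))
```

Catching `Exception` broadly is a deliberate trade-off here: the MCP client receives a readable message, at the cost of less specific handling per failure type.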
SVG-to-PNG Conversion
When an SVG drawing is provided as reference:
- Convert SVG to PNG using cairosvg (Pillow has no SVG renderer; use it only to resize or re-encode the resulting PNG)
- Send as reference image alongside the text prompt
- Nano Banana Pro uses it for spatial understanding
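The conversion step might be wrapped as below. This sketch assumes CairoSVG is installed and uses its `svg2png` function with the `bytestring` and `output_width` parameters. The function name `svg_to_png`, the 1024-pixel default width, and the injectable `render` argument (there so the logic can be tested without CairoSVG) are all illustrative choices.

```python
from typing import Callable, Optional

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def _cairosvg_render(svg_bytes: bytes, width: int) -> bytes:
    import cairosvg  # third-party; imported lazily so it stays optional
    return cairosvg.svg2png(bytestring=svg_bytes, output_width=width)


def svg_to_png(svg_text: Optional[str], width: int = 1024,
               render: Optional[Callable[[bytes, int], bytes]] = None
               ) -> Optional[bytes]:
    """Return PNG bytes for an SVG string, or None when no SVG is given."""
    if not svg_text or "<svg" not in svg_text:
        return None  # the tool falls back to a text-only prompt
    render = render or _cairosvg_render
    png = render(svg_text.encode("utf-8"), width)
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("renderer did not return PNG data")
    return png


# Demo with a stand-in renderer so the example runs without CairoSVG.
def fake_render(svg_bytes: bytes, width: int) -> bytes:
    return PNG_SIGNATURE + b"demo"


print(svg_to_png("<svg xmlns='http://www.w3.org/2000/svg'/>", render=fake_render)[:4])
```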
Local LLM Removal
Identify and remove:
- Any local model loading code (transformers, llama-cpp, ollama, etc.)
- Related dependencies in requirements.txt / pyproject.toml
- Config entries referencing local models
- Replace with Gemini API calls where needed
Release Plan
Version: v2.0.0
README Overhaul
- Project overview with feature highlights
- Quick start guide (clone, install, configure, run)
- Tool reference table (all 20 tools including new birdseye)
- Claude Desktop connection guide (step-by-step, with screenshot descriptions)
- ChatGPT / OpenAI connection guide
- API key setup guide (Gemini, public data portal)
- Example outputs (birdseye rendering description)
- Troubleshooting FAQ
GitHub Release
- Tag: v2.0.0
- Release notes summarizing changes
- Installation instructions
Testing Strategy
- Unit test for prompt template generation
- Unit test for SVG-to-PNG conversion
- Integration test with mocked Gemini API response
- Manual end-to-end test with real API key
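The mocked integration test might take a shape like the following. Everything here is hypothetical: `render_views` stands in for the generator's orchestration, and `generate_image` is an invented method on the client wrapper, not a google-genai call.

```python
from unittest.mock import Mock


def render_views(client, prompts: dict) -> dict:
    """Hypothetical orchestration: one client call per view prompt."""
    return {view: client.generate_image(prompt) for view, prompt in prompts.items()}


def test_render_views_calls_client_once_per_view():
    client = Mock()
    # Each call returns the next canned payload instead of hitting the API.
    client.generate_image.side_effect = [b"birdseye-png", b"perspective-png"]

    result = render_views(client, {"birdseye": "p1", "perspective": "p2"})

    assert result == {"birdseye": b"birdseye-png", "perspective": b"perspective-png"}
    assert client.generate_image.call_count == 2
    client.generate_image.assert_any_call("p1")
    client.generate_image.assert_any_call("p2")


test_render_views_calls_client_once_per_view()
```

Mocking at the wrapper boundary keeps the test independent of network access and API quota, leaving the manual end-to-end run to cover the real service.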
Dependencies Added
| Package | Purpose |
|---|---|
| google-genai | Gemini API SDK (Nano Banana Pro) |
| cairosvg | SVG to PNG conversion |
| Pillow | Image processing |
Dependencies Removed
All local LLM packages (to be identified during implementation by scanning current requirements).