codeslord committed on
Commit
64e16be
·
1 Parent(s): cf0883d

mcp server

Files changed (5)
  1. LICENSE +196 -0
  2. README.md +100 -7
  3. app.py +205 -0
  4. requirements.txt +7 -0
  5. tools.py +310 -0
LICENSE ADDED
@@ -0,0 +1,196 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have
+ made, use, offer to sell, sell, import, and otherwise transfer the
+ Work, where such license applies only to those patent claims
+ licensable by such Contributor that are necessarily infringed by
+ their Contribution(s) alone or by combination of their
+ Contribution(s) with the Work to which such Contribution(s) was
+ submitted. If You institute patent litigation against any entity
+ (including a cross-claim or counterclaim in a lawsuit) alleging that
+ the Work or a Contribution incorporated within the Work constitutes
+ direct or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate as of
+ the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ You must give any other recipients of the Work or Derivative Works
+ a copy of this License; and
+ You must cause any modified files to carry prominent notices stating
+ that You changed the files; and
+ You must retain, in the Source form of any Derivative Works that You
+ distribute, all copyright, patent, trademark, and attribution notices
+ from the Source form of the Work, excluding those notices that do not
+ pertain to any part of the Derivative Works; and
+ If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not pertain
+ to any part of the Derivative Works, in at least one of the following
+ places: within a NOTICE text file distributed as part of the
+ Derivative Works; within the Source form or documentation, if
+ provided along with the Derivative Works; or, within a display
+ generated by the Derivative Works, if and wherever such third-party
+ notices normally appear. The contents of the NOTICE file are
+ for informational purposes only and do not modify the License. You may
+ add Your own attribution notices within Derivative Works that You
+ distribute, alongside or as an addendum to the NOTICE text from the
+ Work, provided that such additional attribution notices cannot be
+ construed as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on
+ the same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright 2025 Rohith Raghunathan Nair
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at:
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
README.md CHANGED
@@ -1,14 +1,107 @@
  ---
- title: GameSmith
- emoji: 🔥
+ title: PixelForge AI
+ emoji: 🎮
  colorFrom: purple
- colorTo: green
+ colorTo: blue
  sdk: gradio
- sdk_version: 6.0.1
+ sdk_version: 5.9.0
  app_file: app.py
  pinned: false
- license: apache-2.0
- short_description: 'Bring your characters to life: Create. Animate. Ship.'
+ tags:
+ - building-mcp-track-creative
+ - mcp-in-action-track-creative
+ - tool
+ - game-dev
+ - pixel-art
  ---
 
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ # 🎮 PixelForge AI: The Intelligent Game Asset Studio
+
+ **Author:** Rohith Raghunathan Nair
+
+ ## 🚀 Overview
+
+ **PixelForge AI** is a dual-interface Game Asset Generator that functions as both a human-friendly **Web App** and an AI-accessible **MCP Server**.
+
+ It solves the biggest bottleneck in indie game development: **Creating consistent, animated assets.**
+
+ By leveraging **Google Gemini 2.5 Flash** for style-consistent sprite generation and **Google Veo** for physics-aware animation, PixelForge allows developers (and AI agents!) to go from a text prompt to a ready-to-use sprite sheet in seconds.
+
+ ## 🏆 Hackathon Tracks
+
+ This project is submitted to:
+ - **Track 1: Building MCP** (Creative) - `building-mcp-track-creative`
+ - **Track 2: MCP in Action** (Creative) - `mcp-in-action-track-creative`
+
+ ## ✨ Features
+
+ 1. **Text-to-Sprite Generation:** Creates high-quality, flat 2D pixel art characters using advanced prompting strategies to ensure game-ready assets (no 3D artifacts, solid backgrounds).
+ 2. **AI Animation (Veo):** Uses Google's Veo model to generate fluid animations (Idle, Walk, Run, Jump) that strictly adhere to the pixel art style of the input sprite.
+ 3. **Asset Extraction:** Automatically converts the generated video animations into standard sprite sheet frames (PNGs) and ZIP archives, ready for engines like Unity, Godot, or Phaser.
+ 4. **MCP Native:** All functionality is exposed via the Model Context Protocol. You can ask Claude: *"Make a knight sprite, animate it walking, and give me the frames"* and it will execute the entire pipeline autonomously.
+
+ ## 🛠️ Installation & Usage
+
+ ### Prerequisites
+ - Python 3.10+
+ - Google Gemini API Key (with access to Veo and Gemini models)
+ - FFmpeg (installed on system)
+
+ ### Setup
+
+ 1. Clone the repository:
+ ```bash
+ git clone https://huggingface.co/spaces/YOUR_USERNAME/PixelForge-AI
+ cd PixelForge-AI
+ ```
+
+ 2. Install dependencies:
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ 3. Set your API Key:
+ ```bash
+ export GEMINI_API_KEY="your_google_api_key_here"
+ ```
+
+ 4. Run the server:
+ ```bash
+ python app.py
+ ```
+
+ 5. **For Humans:** Open `http://localhost:7860` in your browser.
+ 6. **For Agents (MCP):** Connect your MCP client (Claude Desktop, Cursor) to `http://localhost:7860/gradio_api/mcp/sse`.
+
+ ## 🤖 MCP Configuration (Claude Desktop)
+
+ Add this to your `claude_desktop_config.json`:
+
+ ```json
+ {
+   "mcpServers": {
+     "pixel-forge": {
+       "command": "python",
+       "args": [
+         "/path/to/PixelForge-AI/app.py"
+       ],
+       "env": {
+         "GEMINI_API_KEY": "your_key_here"
+       }
+     }
+   }
+ }
+ ```
+
+ ## 💡 Winning Ideas for MCP Hackathon
+ *(As requested, here are 5 winning concepts for the MCP Hackathon)*
+
+ 1. **PixelForge AI (This Project):** A creative pipeline that bridges the gap between generative video and usable game assets, solving a real "last mile" problem for developers.
+ 2. **NPC-GPT (The Living Character Engine):** An MCP server that generates not just the visual sprite (using PixelForge tools) but also the character's stats, dialogue trees, and behavior scripts, packaging them into a JSON file for Godot/Unity.
+ 3. **RetroRemix (Legacy Game Reskinner):** A tool where users upload screenshots of old games, and the AI identifies assets (tiles, enemies) and "remasters" them into a new style (e.g., "Cyberpunk Mario") using the sprite pipeline.
+ 4. **StoryBoarder (Cinematic Cutscene Gen):** An agentic workflow that takes a script, breaks it into scenes, generates keyframes using Gemini, animates short loops using Veo, and assembles a rough animatic video.
+ 5. **LevelGod (Procedural World Builder):** A tool focused on "Tile Connectivity". You generate a center tile, and the MCP server iteratively generates the connecting tiles (top, bottom, corners) to ensure seamless looping textures for infinite runners or RPG maps.
+
+ ## 📜 License
+
+ Apache License 2.0 (see the `LICENSE` file added in this commit)
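
The README's feature list describes a three-step pipeline: sprite, then animation, then frames. The sketch below shows how those steps might chain together, assuming the three tools have the call shapes `app.py` uses in this commit. `run_pipeline` and the injected-callable design are illustrative assumptions, not part of the project.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical aliases mirroring how app.py in this commit calls the tools.
GenerateFn = Callable[[str, Optional[str]], str]          # prompt, reference_b64 -> sprite_b64
AnimateFn = Callable[[str, str, str], str]                # sprite_b64, animation_type, extra -> video_b64
ExtractFn = Callable[[str, int], Tuple[str, List[str]]]   # video_b64, fps -> (zip_b64, frames_b64)


def run_pipeline(
    prompt: str,
    animation_type: str,
    fps: int,
    generate: GenerateFn,
    animate: AnimateFn,
    extract: ExtractFn,
) -> Dict[str, object]:
    """Chain the three steps; each stage consumes the previous stage's base64 output."""
    sprite_b64 = generate(prompt, None)
    video_b64 = animate(sprite_b64, animation_type, "")
    zip_b64, frames_b64 = extract(video_b64, fps)
    return {"sprite": sprite_b64, "video": video_b64, "zip": zip_b64, "frames": frames_b64}


if __name__ == "__main__":
    # Stub callables stand in for the real, API-backed tools.
    result = run_pipeline(
        "a knight", "walk", 8,
        generate=lambda prompt, ref: f"sprite({prompt})",
        animate=lambda sprite, kind, extra: f"video({sprite},{kind})",
        extract=lambda video, fps: (f"zip({video})", [f"frame{i}" for i in range(2)]),
    )
    print(result["zip"])
```

Passing the tools in as arguments keeps the sketch runnable without an API key. In the real project the three callables would be the functions imported from `tools.py`.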
app.py ADDED
@@ -0,0 +1,205 @@
+ import gradio as gr
+ import os
+ import base64
+ from io import BytesIO
+ from PIL import Image
+ import tempfile
+ from tools import generate_pixel_character, animate_pixel_character, extract_sprite_frames
+
+ # --- Helper Functions for Gradio Logic ---
+
+ def process_generate_sprite(prompt, ref_img):
+     try:
+         ref_b64 = None
+         if ref_img is not None:
+             # Convert a file path or PIL image to base64
+             if isinstance(ref_img, str): # path
+                 with open(ref_img, "rb") as f:
+                     ref_b64 = base64.b64encode(f.read()).decode('utf-8')
+             elif hasattr(ref_img, "save"): # PIL Image
+                 buffered = BytesIO()
+                 ref_img.save(buffered, format="PNG")
+                 ref_b64 = base64.b64encode(buffered.getvalue()).decode('utf-8')
+
+         b64_result = generate_pixel_character(prompt, ref_b64)
+
+         # Convert back to PIL for display
+         img_data = base64.b64decode(b64_result)
+         return Image.open(BytesIO(img_data)), b64_result
+     except Exception as e:
+         raise gr.Error(str(e))
+
+ def process_animate_sprite(sprite_img, animation_type, extra_prompt):
+     try:
+         if sprite_img is None:
+             raise ValueError("Please provide a sprite image first.")
+
+         # Convert input image to base64
+         sprite_b64 = None
+         if isinstance(sprite_img, str): # path provided by Gradio example or upload
+             with open(sprite_img, "rb") as f:
+                 sprite_b64 = base64.b64encode(f.read()).decode('utf-8')
+         elif hasattr(sprite_img, "save"): # PIL Image
+             buffered = BytesIO()
+             sprite_img.save(buffered, format="PNG")
+             sprite_b64 = base64.b64encode(buffered.getvalue()).decode('utf-8')
+         elif isinstance(sprite_img, tuple): # Sometimes Gradio returns (path, meta)
+             with open(sprite_img[0], "rb") as f: # assumes the first element is the file path
+                 sprite_b64 = base64.b64encode(f.read()).decode('utf-8')
+
+         # If sprite_b64 is still None (e.g. numpy array), try to convert
+         if sprite_b64 is None:
+             # Assuming numpy array -> PIL -> Base64
+             im = Image.fromarray(sprite_img)
+             buffered = BytesIO()
+             im.save(buffered, format="PNG")
+             sprite_b64 = base64.b64encode(buffered.getvalue()).decode('utf-8')
+
+         video_b64 = animate_pixel_character(sprite_b64, animation_type, extra_prompt)
+
+         # Save to temp file for Gradio to display
+         video_bytes = base64.b64decode(video_b64)
+         with tempfile.NamedTemporaryFile(delete=False, suffix=".mp4") as f:
+             f.write(video_bytes)
+             video_path = f.name
+
+         return video_path, video_b64
+     except Exception as e:
+         raise gr.Error(str(e))
+
+ def process_extract_frames(video_file, fps):
+     try:
+         if video_file is None:
+             raise ValueError("Please upload a video file.")
+
+         # Read video file to base64
+         with open(video_file, "rb") as f:
+             video_b64 = base64.b64encode(f.read()).decode('utf-8')
+
+         zip_b64, frames_b64 = extract_sprite_frames(video_b64, fps)
+
+         # Save zip to temp file
+         with tempfile.NamedTemporaryFile(delete=False, suffix=".zip") as f:
+             f.write(base64.b64decode(zip_b64))
+             zip_path = f.name
+
+         # Convert frames to gallery format (list of paths or PIL images)
+         gallery_images = []
+         for fb64 in frames_b64:
+             img_data = base64.b64decode(fb64)
+             gallery_images.append(Image.open(BytesIO(img_data)))
+
+         return gallery_images, zip_path
+     except Exception as e:
+         raise gr.Error(str(e))
+
+
+ # --- Gradio UI Layout ---
+
+ with gr.Blocks(title="PixelForge AI - Game Asset Studio", theme=gr.themes.Soft()) as demo:
+     gr.Markdown(
+         """
+         # 🎮 PixelForge AI
+         ### The Intelligent Game Asset Studio
+
+         Generate, animate, and export pixel art sprites for your games using Google Gemini & Veo.
+         *Built for the Hugging Face MCP Hackathon.*
+         """
+     )
+
+     with gr.Tab("1. Generate Sprite"):
+         with gr.Row():
+             with gr.Column():
+                 prompt_input = gr.Textbox(
+                     label="Character Description",
+                     placeholder="A brave knight in rusty armor, side view...",
+                     lines=3
+                 )
+                 ref_input = gr.Image(label="Style Reference (Optional)", type="pil")
+                 gen_btn = gr.Button("Generate Sprite", variant="primary")
+
+             with gr.Column():
+                 result_image = gr.Image(label="Generated Sprite", type="pil")
+                 # Hidden state to pass base64 to next tab if needed
+                 sprite_b64_state = gr.State()
+
+         gen_btn.click(
+             process_generate_sprite,
+             inputs=[prompt_input, ref_input],
+             outputs=[result_image, sprite_b64_state]
+         )
+
+     with gr.Tab("2. Animate"):
+         with gr.Row():
+             with gr.Column():
+                 # Allow user to use generated image or upload new
+                 anim_input_image = gr.Image(label="Input Sprite", type="pil")
+                 anim_type = gr.Dropdown(
+                     choices=["idle", "walk", "run", "jump"],
+                     value="idle",
+                     label="Animation Type"
+                 )
+                 extra_anim_prompt = gr.Textbox(
+                     label="Motion Tweaks (Optional)",
+                     placeholder="Make it bounce more..."
+                 )
+                 anim_btn = gr.Button("Animate", variant="primary")
+
+             with gr.Column():
+                 result_video = gr.Video(label="Generated Animation")
+                 video_b64_state = gr.State()
+
+         # Link previous tab result to this input
+         result_image.change(
+             lambda x: x,
+             inputs=[result_image],
+             outputs=[anim_input_image]
+         )
+
+         anim_btn.click(
+             process_animate_sprite,
+             inputs=[anim_input_image, anim_type, extra_anim_prompt],
+             outputs=[result_video, video_b64_state]
+         )
+
+     with gr.Tab("3. Extract Frames"):
+         with gr.Row():
+             with gr.Column():
+                 # Allow user to use generated video or upload new
+                 extract_input_video = gr.Video(label="Input Animation")
+                 fps_slider = gr.Slider(minimum=4, maximum=24, value=8, step=1, label="FPS")
+                 extract_btn = gr.Button("Extract Frames", variant="primary")
+
+             with gr.Column():
+                 frames_gallery = gr.Gallery(label="Sprite Sheet Frames")
+                 download_zip = gr.File(label="Download Sprite Sheet (ZIP)")
+
+         # Link previous tab result
+         result_video.change(
+             lambda x: x,
+             inputs=[result_video],
+             outputs=[extract_input_video]
+         )
+
+         extract_btn.click(
+             process_extract_frames,
+             inputs=[extract_input_video, fps_slider],
+             outputs=[frames_gallery, download_zip]
+         )
+
+     gr.Markdown("---")
+     gr.Markdown("### 🤖 Model Context Protocol (MCP)")
+     gr.Markdown(
+         """
+         This app doubles as an MCP Server! Connect it to Claude or Cursor to generate assets directly in your chat.
+
+         **Tools Exposed:**
+         - `generate_pixel_character(prompt)`
+         - `animate_pixel_character(sprite_b64, animation_type)`
+         - `extract_sprite_frames(video_b64)`
+         """
+     )
+
+ if __name__ == "__main__":
+     # Launch with MCP support enabled
+     demo.launch(mcp_server=True)
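
The helper functions in `app.py` repeat the same path-or-PIL-to-base64 conversion in several places. One way to consolidate it is sketched below, using only the standard library. `encode_image_input` is a hypothetical name, and the `.save(buffer, format=...)` duck-typing assumes a PIL-style image object.

```python
import base64
from io import BytesIO


def encode_image_input(source) -> str:
    """Return base64 text for raw bytes, a file path, or a PIL-style image."""
    if isinstance(source, (bytes, bytearray)):
        raw = bytes(source)
    elif isinstance(source, str):
        # Strings are treated as file paths, as app.py does.
        with open(source, "rb") as f:
            raw = f.read()
    elif hasattr(source, "save"):
        # Duck-typed PIL image: serialize to PNG in memory.
        buffer = BytesIO()
        source.save(buffer, format="PNG")
        raw = buffer.getvalue()
    else:
        # Fail loudly instead of passing an unknown type further down the pipeline.
        raise TypeError(f"Unsupported image input: {type(source).__name__}")
    return base64.b64encode(raw).decode("utf-8")
```

A NumPy array would still need to be converted first (for example with `Image.fromarray`, as `process_animate_sprite` does) before calling this helper.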
requirements.txt ADDED
@@ -0,0 +1,7 @@
+ gradio>=5.9.0
+ mcp
+ google-genai
+ pillow
+ numpy
+ ffmpeg-python
+ python-dotenv
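
`ffmpeg-python` is a wrapper around the `ffmpeg` executable, so installing these requirements does not by itself satisfy the FFmpeg prerequisite listed in the README. A startup check could be sketched as follows. `missing_prerequisites` is a hypothetical helper, not something in this commit.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """List human-readable problems that would stop the app from working."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY environment variable is not set")
    if which("ffmpeg") is None:
        problems.append("ffmpeg executable not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print(f"warning: {problem}")
```

The `env` and `which` parameters exist only so the check can be exercised without touching the real environment.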
tools.py ADDED
@@ -0,0 +1,310 @@
+ import os
+ import base64
+ import io
+ import json
+ import math
+ import time
+ from typing import List, Optional, Tuple, Dict, Any
+ from PIL import Image
+ import ffmpeg
+ from google import genai
+ from google.genai import types
+
+ # Initialize API Key from environment
+ # Users should set GEMINI_API_KEY in their environment variables
+ API_KEY = os.environ.get("GEMINI_API_KEY")
+
+ def _get_client():
+     if not API_KEY:
+         raise ValueError("GEMINI_API_KEY environment variable is not set.")
+     return genai.Client(api_key=API_KEY)
+
+ def generate_pixel_character(
+     prompt_text: str,
+     reference_image_b64: Optional[str] = None
+ ) -> str:
+     """
+     Generates a static 2D pixel art sprite character based on a text description.
+
+     Args:
+         prompt_text: Description of the character (e.g., "A brave knight with a red cape").
+         reference_image_b64: Optional base64 string of a reference image to influence style.
+
+     Returns:
+         A base64 string of the generated PNG image.
+     """
+     client = _get_client()
+
+     base_instructions = (
+         "Generate a 2D pixel art sprite character. Style: flat 2D pixel art, NOT 3D, "
+         "NOT photorealistic, NOT rendered. Use pixelated art style with visible pixels. "
+         "Character should be a 2D sprite sheet style character, side view, full body, centered. "
+         "White solid background (not transparent, not checkerboard). No shadows, no 3D effects, "
+         "no depth, no gradients. Pure 2D pixel art sprite style like classic video game sprites. "
+         "Do NOT add text or UI."
+     )
+
+     full_prompt = f"{prompt_text}\n\n{base_instructions}" if prompt_text else base_instructions
+
+     # Config for Gemini 2.5 Flash Image (supports generate_content)
+     model_id = "gemini-2.5-flash-image"
+
+     contents = [types.Content(parts=[types.Part.from_text(text=full_prompt)])]
+
+     if reference_image_b64:
+         # Remove data URL prefix if present
+         if "," in reference_image_b64:
+             reference_image_b64 = reference_image_b64.split(",", 1)[1]
+
+         image_bytes = base64.b64decode(reference_image_b64)
+         contents[0].parts.append(types.Part.from_bytes(data=image_bytes, mime_type="image/png"))
+
+     try:
+         response = client.models.generate_content(
+             model=model_id,
+             contents=contents,
+             config=types.GenerateContentConfig(
+                 temperature=0.4
+             )
+         )
+
+         if not response.candidates or not response.candidates[0].content.parts:
+             raise RuntimeError("No content generated by the model.")
+
+         for part in response.candidates[0].content.parts:
+             if part.inline_data:
+                 # Return raw base64
+                 return base64.b64encode(part.inline_data.data).decode('utf-8')
+
+         raise RuntimeError("Model returned content but no image data found.")
+
+     except Exception as e:
+         raise RuntimeError(f"Failed to generate character: {str(e)}") from e
+
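
`generate_pixel_character` strips a data-URL prefix by splitting on the first comma. A slightly stricter variant is sketched below, assuming inputs are either bare base64 or `data:<mime>;base64,<payload>` URLs. `decode_image_b64` is a hypothetical name and is not part of `tools.py`.

```python
import base64
import binascii


def decode_image_b64(value: str) -> bytes:
    """Decode bare base64 or a base64 data URL into raw bytes."""
    value = value.strip()
    if value.startswith("data:"):
        # Split "data:<mime>;base64" from the payload at the first comma.
        header, sep, payload = value.partition(",")
        if not sep or not header.endswith(";base64"):
            raise ValueError("Expected a base64-encoded data URL.")
        value = payload
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(value, validate=True)
    except binascii.Error as exc:
        raise ValueError(f"Invalid base64 image data: {exc}") from exc
```

Only strings that actually start with `data:` are treated as URLs, and malformed input surfaces as a `ValueError` before any API call is made.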
+ def animate_pixel_character(
+     sprite_b64: str,
+     animation_type: str = "idle",
+     extra_prompt: str = ""
+ ) -> str:
+     """
+     Animates a static pixel art sprite using Google's Veo model.
+
+     Args:
+         sprite_b64: Base64 string of the input static sprite (PNG).
+         animation_type: One of "idle", "walk", "run", "jump".
+         extra_prompt: Optional additional instructions for the motion.
+
+     Returns:
+         A base64 string of the generated video (MP4).
+     """
+     client = _get_client()
+
+     # Clean base64
+     if "," in sprite_b64:
+         sprite_b64 = sprite_b64.split(",", 1)[1]
+
+     sprite_bytes = base64.b64decode(sprite_b64)
+
+     # Prompt construction (ported from original project)
+     base_style = """
+ CRITICAL STYLE REQUIREMENTS - YOU MUST FOLLOW THESE EXACTLY:
+ - This is a 2D PIXEL ART animation. The input image is a 2D pixel art sprite.
+ - The clip should be about 4 seconds long so it can be used as a game sprite animation.
+ - You MUST maintain the EXACT same 2D pixel art style as the input image.
+ - DO NOT make it 3D, DO NOT make it realistic, DO NOT add depth, DO NOT add shadows.
+ - DO NOT add lighting, DO NOT add gradients, DO NOT add shine or gloss.
+ - DO NOT render it in 3D style, DO NOT make it photorealistic.
+ - The character must remain flat 2D pixel art throughout the entire animation.
+ - Match the pixelated, low-resolution, retro game sprite aesthetic of the input image exactly.
+ - Use a solid pure white background (#FFFFFF), completely flat, no gradient, no textures, no shadows, no ground line, no objects.
+ - The character must stay centered, side view, and fill a reasonable portion of the frame.
+ - No text, UI, logos, borders, or props.
+ - Keep the same pixel density and resolution as the input image.
+ """.strip()
+
125
+ motion_prompts = {
126
+ "idle": """
127
+ ANIMATION TYPE: IDLE
128
+ - The character must stand completely still in place.
129
+ - Only animate a very subtle breathing motion: tiny up/down movement of the chest.
130
+ - Optional: very slight idle sway (left/right) of the body, but minimal.
131
+ - NO walking, NO movement across the screen, NO leg movement.
132
+ - The animation should be a single seamless idle loop that starts and ends in almost the same pose so it can be looped cleanly.
133
+ - Do not add any extra actions or transitions after the idle cycle finishes.
134
+ - Keep the animation loop smooth and subtle.
135
+ """,
136
+ "walk": """
137
+ ANIMATION TYPE: WALK CYCLE
138
+ - Animate a classic 2D side-scrolling walk cycle in place, side view.
139
+ - The character walks on the spot (does NOT move across the screen).
140
+ - Show clear leg alternation: left leg forward, right leg back, then switch.
141
+ - Arms should swing opposite to legs (left arm forward when right leg forward).
142
+ - The character's body should have a slight up/down bounce as they walk.
143
+ - The video should contain one clean walk cycle that returns to the starting pose so it can loop seamlessly.
144
+ - Avoid extra camera movement or additional actions at the end of the clip.
145
+ - The character must stay centered in the frame throughout.
146
+ """,
147
+ "run": """
148
+ ANIMATION TYPE: RUN CYCLE
149
+ - Animate a faster run cycle in place, side view.
150
+ - The character runs faster than walking with more exaggerated motion.
151
+ - Legs move faster with longer strides.
152
+ - Arms pump more vigorously than walking.
153
+ - Body has more pronounced up/down bounce.
154
+ - The character stays centered and runs on the spot, like a classic game sprite.
155
+ - The clip should be a single, smooth run cycle that ends in nearly the same pose as the first frame for looping.
156
+ - Do not add extra motions or transitions after the run cycle is complete.
157
+ """,
158
+ "jump": """
159
+ ANIMATION TYPE: JUMP CYCLE
160
+ - Animate a complete jump cycle in place: anticipation (squat down), jump up, hang time at peak, fall down, land, then settle back into the starting pose.
161
+ - The character should compress slightly before jumping (anticipation).
162
+ - At the peak of the jump, there should be a brief hang time.
163
+ - The landing should have a slight compression/squat.
164
+ - The clip should contain one complete jump cycle that ends very close to the initial idle pose, so it can be looped with other animations.
165
+ - Avoid extra steps, actions, or camera moves after the landing.
166
+ - NO camera movement, keep the character centered throughout.
167
+ """
168
+ }
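+
+     # Unknown animation types fall back to the idle prompt
+     selected_motion = motion_prompts.get(animation_type, motion_prompts["idle"]).strip()
+     # Append the user instruction only when one is given, so the prompt has no trailing blank lines
+     extra = f"\n\nAdditional user instruction: {extra_prompt.strip()}" if extra_prompt else ""
+     full_prompt = f"{base_style}\n\n{selected_motion}{extra}"
+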
+     # Fallback chain for Veo models
+     veo_models = [
+         "veo-3.1-fast-generate-preview", # Currently the best for this
+         "veo-3.1-generate-preview"
+     ]
+
+     last_error = None
+
+     for model_name in veo_models:
+         try:
+             print(f"Attempting animation with model: {model_name}")
+
+             response = client.models.generate_videos(
+                 model=model_name,
+                 prompt=full_prompt,
+                 image=types.Image(
+                     image_bytes=sprite_bytes,
+                     mime_type="image/png"
+                 ),
+                 config=types.GenerateVideosConfig(
+                     aspect_ratio="16:9",
+                     # duration_seconds=4 # Not always supported in all SDK versions/models, but implied
+                 )
+             )
+
+             # Poll for completion.
+             # In the google-genai SDK, generate_videos returns a long-running
+             # operation rather than the finished video, so wait until it is done.
+             operation = response
+             while not operation.done:
+                 time.sleep(5)
+                 # operations.get takes the operation object, not its name
+                 operation = client.operations.get(operation)
+
+             if operation.error:
+                 raise RuntimeError(f"Video generation failed: {operation.error}")
+
+             # Depending on the SDK version, the payload is exposed as `response` or `result`.
+             result = getattr(operation, "response", None) or getattr(operation, "result", None)
+             if not result or not result.generated_videos:
+                 raise RuntimeError("Operation finished without any generated videos")
+
+             video = result.generated_videos[0].video
+             # The API may return only a URI instead of inline bytes; in that case
+             # fetch the bytes through the Files API (Gemini Developer API client assumed).
+             video_bytes = video.video_bytes or client.files.download(file=video)
+             return base64.b64encode(video_bytes).decode('utf-8')
+
+         except Exception as e:
+             print(f"Model {model_name} failed: {e}")
+             last_error = e
+             continue
+
+     raise RuntimeError(f"All animation models failed. Last error: {last_error}")
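
The polling step in the function above waits on a long-running operation with no upper bound. As a minimal sketch of how that wait could be bounded, the helper below polls a caller-supplied function until a condition holds or a deadline passes. The name `poll_until_done` and its parameters are hypothetical and not part of this commit or of the google-genai SDK; the SDK-specific calls would be passed in as `fetch` and `is_done`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until_done(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch() repeatedly until is_done(state) is true or timeout_s elapses.

    Hypothetical helper: fetch would wrap the SDK call that refreshes the
    operation, and is_done would read its completion flag.
    """
    deadline = time.monotonic() + timeout_s
    state = fetch()
    while not is_done(state):
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Operation not finished after {timeout_s} seconds")
        sleep(interval_s)
        state = fetch()
    return state
```

Because the sleep function is injectable, the helper can be exercised in tests without real delays, and a timeout surfaces as an exception that the existing `except Exception` fallback would treat like any other model failure.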
+
+ def extract_sprite_frames(
+     video_b64: str,
+     fps: int = 8
+ ) -> Tuple[str, List[str]]:
+     """
+     Extracts frames from an MP4 video and returns them as a ZIP archive plus a list of frame images.
+
+     Args:
+         video_b64: Base64 string of the MP4 video (a data-URL prefix is stripped if present).
+         fps: Frames per second to extract (default 8).
+
+     Returns:
+         Tuple containing:
+         1. Base64 string of the ZIP file containing all frames.
+         2. List of Base64 strings for the individual frames (for preview).
+     """
+     import glob
+     import tempfile
+     import zipfile
+
+     # Strip a data-URL prefix such as "data:video/mp4;base64," if present
+     if "," in video_b64:
+         video_b64 = video_b64.split(",", 1)[1]
+
+     video_bytes = base64.b64decode(video_b64)
+
+     # Work in a temporary directory that is removed automatically
+     with tempfile.TemporaryDirectory() as tmp_dir:
+         input_path = os.path.join(tmp_dir, "input.mp4")
+         with open(input_path, "wb") as f:
+             f.write(video_bytes)
+
+         output_pattern = os.path.join(tmp_dir, "frame_%03d.png")
+
+         try:
+             (
+                 ffmpeg
+                 .input(input_path)
+                 .filter('fps', fps=fps)
+                 .output(output_pattern)
+                 .overwrite_output()
+                 .run(capture_stdout=True, capture_stderr=True)
+             )
+         except ffmpeg.Error as e:
+             raise RuntimeError(f"FFmpeg error: {e.stderr.decode('utf8')}") from e
+
+         # Collect frames
+         frame_files = sorted(glob.glob(os.path.join(tmp_dir, "frame_*.png")))
+
+         frames_b64 = []
+         for frame_file in frame_files:
+             with open(frame_file, "rb") as f:
+                 frames_b64.append(base64.b64encode(f.read()).decode('utf-8'))
+
+         # Create a ZIP with only the frame PNGs. Archiving the whole tmp_dir
+         # would also pull in input.mp4 and the archive being written.
+         zip_path = os.path.join(tmp_dir, "sprites.zip")
+         with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
+             for frame_file in frame_files:
+                 zf.write(frame_file, arcname=os.path.basename(frame_file))
+
+         # Read ZIP
+         with open(zip_path, "rb") as f:
+             zip_b64 = base64.b64encode(f.read()).decode('utf-8')
+
+     return zip_b64, frames_b64
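
`extract_sprite_frames` hands back the archive as a Base64 string, so a caller has to decode it before the frames can be used. The sketch below shows one way a consumer might do that with only the standard library. The function name `unpack_frames_zip` is hypothetical and not part of this commit; it assumes the frame entries keep the `frame_NNN.png` naming used above and it ignores any other archive members.

```python
import base64
import io
import posixpath
import zipfile
from typing import Dict


def unpack_frames_zip(zip_b64: str) -> Dict[str, bytes]:
    """Decode a Base64 ZIP and return {file name: PNG bytes} for the frame entries.

    Hypothetical consumer-side helper; entries that do not look like
    frame_*.png (for example a stray input.mp4) are skipped.
    """
    frames: Dict[str, bytes] = {}
    with zipfile.ZipFile(io.BytesIO(base64.b64decode(zip_b64))) as archive:
        # Sorting by name keeps frames in playback order (frame_001, frame_002, ...)
        for name in sorted(archive.namelist()):
            base = posixpath.basename(name)
            if base.startswith("frame_") and base.endswith(".png"):
                frames[base] = archive.read(name)
    return frames
```

Keying the result by file name keeps the frame order explicit, which is convenient when the frames are later packed into a sprite sheet.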