MIT's GenCAD turns images into editable CAD programs
MIT researchers Md Ferdous Alam and Faez Ahmed release a demo, paper and code for an image-conditioned model that outputs full parametric CAD command histories.
By Ryan Merket
Why it matters
For engineering teams, editable CAD is the source of truth. A model that maps images to parametric command histories could slot into real workflows: rapid reverse engineering, template generation, design exploration, and retrieval across past projects. It is also a pointer for founders building CAD copilots and PLM integrations: output the program, not just the geometry.

MIT researchers Md Ferdous Alam and Faez Ahmed have released GenCAD, a model that turns a single rendered image into a 3D solid and the full, editable CAD program that built it. The team put up a project page with an interactive demo, the paper, and the code.
Both researchers are based at MIT, and the work sits at the intersection of Ahmed's DECODE research focus on computational design tools and Alam's recent work on CAD representations. Their bet is that engineering workflows need models that generate the source code of geometry, not just meshes. As they write in the paper, GenCAD "does not merely generate a 3D solid but also the entire CAD program."
What is new
Most generative 3D systems output meshes or point clouds. Those are fine for visuals but lose the parametric constraints engineers use to modify, audit, and manufacture parts. GenCAD tackles that gap by predicting sequences of CAD commands (the program) that a geometry kernel can execute to produce a boundary-representation solid.
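To make "a sequence of CAD commands" concrete, here is a minimal Python sketch of a toy program for a rectangular plate. The command names (Line, Extrude), the helper functions, and the volume calculation are illustrative assumptions, not GenCAD's actual command vocabulary or a real geometry kernel. The point is only that a parametric history can be edited by changing one number and re-evaluating.

```python
from dataclasses import dataclass, replace
from typing import List, Union


@dataclass(frozen=True)
class Line:
    """Hypothetical sketch command: draw a segment to (x, y) on the sketch plane."""
    x: float
    y: float


@dataclass(frozen=True)
class Extrude:
    """Hypothetical feature command: extrude the closed sketch by `distance`."""
    distance: float


Command = Union[Line, Extrude]


def rectangle_plate(width: float, height: float, thickness: float) -> List[Command]:
    """Build a toy command history: a rectangle sketch followed by one extrude."""
    return [
        Line(width, 0.0),
        Line(width, height),
        Line(0.0, height),
        Line(0.0, 0.0),  # closes the loop back at the origin
        Extrude(thickness),
    ]


def volume(program: List[Command]) -> float:
    """Toy stand-in for a geometry kernel: shoelace area of the sketch times depth."""
    points = [(0.0, 0.0)] + [(c.x, c.y) for c in program if isinstance(c, Line)]
    twice_area = sum(
        x0 * y1 - x1 * y0 for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
    depth = sum(c.distance for c in program if isinstance(c, Extrude))
    return 0.5 * abs(twice_area) * depth


program = rectangle_plate(width=40.0, height=20.0, thickness=5.0)

# Editing the history: change only the extrude distance and re-evaluate.
edited = program[:-1] + [replace(program[-1], distance=10.0)]

print(volume(program))  # 4000.0
print(volume(edited))   # 8000.0
```

A mesh of the same plate would carry no such handle: the thickness would be implicit in vertex positions rather than stored as a single editable parameter.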
In their abstract, Alam and Ahmed frame the key idea: learn a joint representation between images and CAD command sequences, then generate in that latent space so the output is both accurate and modifiable.
How it works
GenCAD is a multi-stage system:
- An autoregressive transformer encoder learns latent representations of CAD command sequences.
- A contrastive model aligns latents from CAD programs and CAD images.
- A latent diffusion model generates CAD-program latents conditioned on images.
- A decoder converts those latents back into a sequence of parametric CAD commands.
This pipeline lets the model produce multiple valid programs for a single image (they showcase sample diversity on the project page) and retrieve similar programs from a library. In one retrieval demo, they surface the top-3 candidates from a collection of roughly 7,000 CAD programs.
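The retrieval step can be pictured as a nearest-neighbor search in the shared image/CAD embedding space. The sketch below assumes cosine similarity as the ranking score, which is a common choice for contrastively trained embeddings but is not confirmed here as the authors' exact method. The three-dimensional vectors and part names are made up for illustration; in practice the embeddings would come from the trained encoders.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float],
    library: Dict[str, Sequence[float]],
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Rank every library entry by similarity to the query and keep the best k."""
    scored = [(name, cosine(query, emb)) for name, emb in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical CAD-program embeddings (a real library would hold thousands).
library = {
    "bracket": [0.9, 0.1, 0.0],
    "flange": [0.1, 0.9, 0.1],
    "plate": [0.8, 0.2, 0.1],
    "shaft": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of the query image.
query = [1.0, 0.0, 0.0]

results = top_k(query, library, k=3)
print([name for name, _ in results])  # ['bracket', 'plate', 'flange']
```

A brute-force scan like this is enough for a library of a few thousand programs; larger collections would typically use an approximate nearest-neighbor index instead.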
What ships today
- A public project page with examples and an interactive teaser.
- The full arXiv paper detailing architecture and results.
- An open GitHub repo with code.
- A short video overview.
Why this line of work
The authors argue that data availability has pushed the field toward meshes and voxels, which are easier to collect but miss what makes CAD useful in engineering: parametric constraints and editability. By predicting the command history itself, GenCAD outputs something teams can tweak, run parameter sweeps on, version, and pass along to downstream tools.
For operators building design tools, that is the wedge: you can imagine image-to-CAD as an onboarding step, a copilot that returns an editable starting point instead of a dead-end mesh, or a search layer that treats past CAD programs as building blocks.