AI Code Compliance Checker
Type: Personal Project
Year: 2025
Team: Team of 2 (Arjun Khurana, Angad Khurana)
AEC TECH
AI
Retrieval Augmented Generation | Artificial Intelligence
The Idea
I worked as an architect for six months at one of India’s largest real estate developers, where much of my role involved manual, labor-intensive code compliance reviews.
It wasn’t until almost a year after leaving, when I saw a similar job description shared in my college WhatsApp group, that the inefficiency of these tasks truly struck me. That moment clarified the opportunity: much of this work could be streamlined with the right technology. It became the starting point for pursuing a solution.
I realized that nearly 50% of the job description was not a skill at all, but a set of tasks that the right technology could handle efficiently.
Defining the Challenges
Information Overload
Architects and engineers must navigate massive regulatory documents (building codes, bylaws, LEED, TOD), often thousands of pages long
Manual Bottlenecks
Searching these documents is slow, tedious, and highly error-prone
Traditional AI Limitations
Can only process a limited amount of text at once, and tend to overlook details buried in the middle of long inputs (the “lost-in-the-middle” problem)
Models often generate confident but incorrect information when context is missing or unclear
Models can’t retain uploaded information across sessions or manage large, changing document sets
The Problem
Navigating endless code-compliance documents is an architect’s nightmare
The Solution
An intelligent document retrieval system that transforms how architecture professionals access regulatory information
All compliance documents for construction were grouped into four categories: building guidelines, local bylaws, sustainability policies, and miscellaneous standards (e.g., TOD). The goal was to organize them for intuitive, high-level selection while emphasizing discovery rather than traditional search
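The four-category grouping described above can be sketched as a small catalog structure. This is a minimal illustration, not the project's actual data model: the `Category` and `Document` names are hypothetical, the placement of NBC under building guidelines and LEED under sustainability policies is an assumption, and the bylaws entry is a placeholder.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The four high-level groups named in the text."""
    BUILDING_GUIDELINES = "Building guidelines"
    LOCAL_BYLAWS = "Local bylaws"
    SUSTAINABILITY_POLICIES = "Sustainability policies"
    MISC_STANDARDS = "Miscellaneous standards"


@dataclass(frozen=True)
class Document:
    title: str
    category: Category


# Illustrative entries only; category assignments are assumptions.
CATALOG = [
    Document("NBC", Category.BUILDING_GUIDELINES),
    Document("Example municipal bylaws", Category.LOCAL_BYLAWS),  # placeholder
    Document("LEED", Category.SUSTAINABILITY_POLICIES),
    Document("TOD", Category.MISC_STANDARDS),
]


def group_by_category(docs: list[Document]) -> dict[Category, list[str]]:
    """Return document titles grouped under every category, in catalog order."""
    grouped: dict[Category, list[str]] = {category: [] for category in Category}
    for doc in docs:
        grouped[doc.category].append(doc.title)
    return grouped


grouped = group_by_category(CATALOG)
```

Grouping up front means the interface can present categories first and let users narrow down to a document, rather than asking them to type a search query.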
Working Principle Investigation | Conventional RAG Framework
Limitations of Conventional RAG System
Understanding Challenges of Document Structure
Complex table notes structure
Atypical table layout
Images embedded in tables
Why Does Conventional RAG Not Work Here?
Final RAG Framework | Late Interaction Retrieval using ColQwen
Technical Innovation
This project implements late-interaction multi-vector retrieval using ColQwen-based embeddings, where each document token generates its own vector representation.
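At query time, the system compares each query-token vector against every document-token vector and sums the best match for each query token. This token-level matching improves precision over single-vector approaches, which compress an entire page into one embedding.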
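The token-level matching described above is commonly scored with the MaxSim rule used by ColBERT-style late-interaction models. The sketch below shows that rule in plain Python on toy 2-D vectors. The page names and vectors are made up for illustration, and a real system would use model-generated embeddings with many more dimensions.

```python
from math import sqrt

Vector = list[float]


def normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so dot products act as cosine similarity."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else v


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_vecs: list[Vector], doc_vecs: list[Vector]) -> float:
    """For each query token take its best-matching document token, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def rank_pages(query_vecs: list[Vector], pages: dict[str, list[Vector]]):
    """Return (page name, score) pairs sorted from best to worst match."""
    scored = [(name, maxsim_score(query_vecs, vecs)) for name, vecs in pages.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy embeddings: two query tokens, two hypothetical pages.
query = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
pages = {
    "fire_exits": [normalize([1.0, 0.0]), normalize([0.0, 1.0])],
    "parking": [normalize([1.0, 1.0])],
}
ranking = rank_pages(query, pages)
```

Here the page with a close match for each individual query token ranks first, while the page whose single vector is only a rough average of both tokens scores lower. That is the intuition behind preferring multi-vector retrieval for dense regulatory pages.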

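Jina Embeddings v4 is an openly released model built on the Qwen2.5-VL backbone, the same family of vision-language models that ColQwen builds on. It supports ColQwen-style multi-vector output and is aimed at further improving retrieval performance.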

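Weaviate was used as the vector database for memory-efficient storage of multi-vector embeddings, with fast, accurate retrieval powered by its implementation of the MUVERA algorithm, which encodes each set of token vectors as a single fixed-dimensional vector.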
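To give a feel for the idea behind MUVERA, the sketch below builds a heavily simplified fixed-dimensional encoding: token vectors are bucketed by the sign of their projection onto a hyperplane, query vectors are summed per bucket, and document vectors are averaged per bucket. It is not Weaviate's implementation. It uses one hand-picked hyperplane instead of random ones and omits the repetitions, empty-bucket filling, and final projections of the full algorithm.

```python
Vector = list[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def bucket_of(vec: Vector, planes: list[Vector]) -> int:
    """SimHash-style bucket index: one sign bit per hyperplane."""
    index = 0
    for plane in planes:
        index = (index << 1) | (1 if dot(vec, plane) >= 0 else 0)
    return index


def encode(vectors: list[Vector], planes: list[Vector], is_query: bool) -> Vector:
    """Collapse a set of token vectors into one flat vector of 2**k * dim values."""
    dim = len(vectors[0])
    n_buckets = 2 ** len(planes)
    sums = [[0.0] * dim for _ in range(n_buckets)]
    counts = [0] * n_buckets
    for vec in vectors:
        b = bucket_of(vec, planes)
        counts[b] += 1
        for i, x in enumerate(vec):
            sums[b][i] += x
    flat: Vector = []
    for b in range(n_buckets):
        # Queries keep per-bucket sums; documents use per-bucket averages.
        scale = 1.0 if is_query or counts[b] == 0 else 1.0 / counts[b]
        flat.extend(x * scale for x in sums[b])
    return flat


PLANES = [[1.0, -1.0]]  # fixed for reproducibility; normally drawn at random
query = [[1.0, 0.0], [0.0, 1.0]]
page_a = [[1.0, 0.0], [0.0, 1.0]]
page_b = [[1.0, 0.0]]

q_fde = encode(query, PLANES, is_query=True)
score_a = dot(q_fde, encode(page_a, PLANES, is_query=False))
score_b = dot(q_fde, encode(page_b, PLANES, is_query=False))
```

In this toy case a single dot product between the two flat vectors reproduces the token-level scores (2 for `page_a`, 1 for `page_b`). In general the result is only an approximation, which is why it is typically used to shortlist candidates quickly with an ordinary single-vector index.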
Orchestration & Scalability
LangGraph structures the retrieval pipeline as a directed graph. This graph-based approach enables conditional routing (e.g., switching strategies for single vs. multi-document queries) and robust state persistence across user sessions.
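The conditional routing described above can be illustrated with a small plain-Python stand-in. This is not LangGraph's API: the node names, the `State` fields, and the `run` loop are all hypothetical, chosen only to show how a router picks the next node from the current state. Session persistence is not covered.

```python
from typing import Callable, Optional, Union

State = dict  # e.g. {"question": str, "documents": list[str], "trace": list[str]}


def _visit(state: State, node_name: str) -> State:
    """Return a copy of the state with this node appended to the trace."""
    return {**state, "trace": state.get("trace", []) + [node_name]}


def single_doc_retrieval(state: State) -> State:
    return _visit(state, "single_doc_retrieval")


def multi_doc_retrieval(state: State) -> State:
    return _visit(state, "multi_doc_retrieval")


def generate_answer(state: State) -> State:
    return _visit(state, "generate_answer")


def choose_strategy(state: State) -> str:
    """Router: pick a retrieval node based on how many documents are selected."""
    if len(state["documents"]) == 1:
        return "single_doc_retrieval"
    return "multi_doc_retrieval"


NODES: dict[str, Callable[[State], State]] = {
    "single_doc_retrieval": single_doc_retrieval,
    "multi_doc_retrieval": multi_doc_retrieval,
    "generate_answer": generate_answer,
}

# Each edge is a fixed next node, a router function, or None to stop.
Edge = Union[str, Callable[[State], str], None]
EDGES: dict[str, Edge] = {
    "start": choose_strategy,
    "single_doc_retrieval": "generate_answer",
    "multi_doc_retrieval": "generate_answer",
    "generate_answer": None,
}


def run(state: State) -> State:
    """Walk the graph from 'start' until an edge resolves to None."""
    edge: Edge = EDGES["start"]
    current: Optional[str] = edge(state) if callable(edge) else edge
    while current is not None:
        state = NODES[current](state)
        edge = EDGES[current]
        current = edge(state) if callable(edge) else edge
    return state


single = run({"question": "Minimum stair width?", "documents": ["NBC"]})
multi = run({"question": "Parking rules near transit?", "documents": ["NBC", "TOD"]})
```

Keeping the routing decision in one function makes it easy to add further strategies later without touching the retrieval or answer nodes.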
User Experience & Interface
Layout & Hierarchy
A familiar layout is maintained, consistent with standard LLM interfaces such as ChatGPT. Preserving this format allows users to rely on their existing muscle memory for “talking to AI,” making the transition to our system seamless and intuitive
Document Root:
Single-click selection for main codes (NBC)
Nested Jurisdictions:
Collapsible menus keep main navigation clean
Visualizing Logic
Building codes are logic puzzles. We reduce the cognitive load of tracing that logic by synthesizing text into diagrams
Engineering the UX
Users
I recently showed the ArchiCheck prototype to some former colleagues and friends who work as architects for a large real estate developer. They offered thoughtful feedback and suggestions to improve the product
Builders
Contributions
We both collaborated closely on the RAG orchestration and the broader AI pipeline, shaping the system’s intelligence together.
I focused on the front-end development and overall user experience, while Angad managed the web deployment and performance optimization.
Logo
The logo combines the letter ‘A’ with a simplified document icon, linking “Architect” with the documentation workflow at the heart of the product





















