@Nielsbishere Nielsbishere commented Nov 18, 2025

Summary

This PR introduces a new pre-IR HLSL reflection system for DXC, allowing tools to inspect source-level constructs immediately after parsing and semantic analysis, before lowering to DXIL or SPIR-V. This provides access to information that is currently lost during IR generation (syntax sugar, typedef names, namespaces, un-lowered constructs, symbol extents, etc.).

This is a large feature and I do not expect it to land as a single PR. The intent is to provide a proof of concept and a concrete design that demonstrates how a unified, compiler-agnostic reflection layer could simplify a wide range of tooling workflows. I should also note that I will likely be unable to maintain this beyond December 1st.

Motivation

Many tools today need structured information about HLSL source before it is lowered into DXIL or SPIR-V:

  • IDE tooling / shader editors require navigation, symbol info, type layout, and semantic analysis.
  • Custom toolchains often implement their own HLSL front end solely to extract registers, types, annotations, and metadata that DXC discards.
  • Migration tools for legacy shader models need access to default initializers, semantics on non-entry parameters, and source constructs that no longer exist in modern IR.
  • Bindings generators need to parse user types, namespaces, and resources in a stable way.
  • Static analysis tools want a lightweight AST-like view without compiling to DXIL.

Providing this information directly in DXC avoids the need for downstream projects to patch, fork, or re-implement large parts of the HLSL front end.

Overview

This feature adds:

1. A new COM interface: IHLSLReflector

Lets users reflect from source strings or from a serialized blob.

2. A standalone reflection container: hlsl::ReflectionData

A compact, AST-inspired data model that captures symbols, types, nodes, and relationships.
It serializes to binary and JSON.

3. A new flag group and tool: dxreflector.exe

Provides -reflect-* flags mirroring typical DXC usage and produces human-readable JSON or binary results.

Reflection is performed after preprocessing and semantic analysis, but before any IR generation, ensuring access to source-level information.
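To make the "compact, AST-inspired data model" from item 2 more concrete, here is a minimal sketch of one way such a container could store a tree of nodes in a flat array. The names `SketchNode`, `descendantCount` and `DirectChildren` are hypothetical and are not taken from the PR; the actual layout of hlsl::ReflectionData may differ.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical node layout (not the PR's): nodes are stored in pre-order and
// each node records how many descendants follow it, so every subtree is one
// contiguous range of the array.
struct SketchNode {
  std::string name;
  uint32_t descendantCount; // all nested nodes, not only direct children
};

// Returns the indices of the direct children of `parent` by hopping over
// each child's own subtree.
std::vector<uint32_t> DirectChildren(const std::vector<SketchNode> &nodes,
                                     uint32_t parent) {
  std::vector<uint32_t> result;
  const uint32_t end = parent + 1 + nodes[parent].descendantCount;
  for (uint32_t i = parent + 1; i < end; i += 1 + nodes[i].descendantCount)
    result.push_back(i);
  return result;
}
```

With a layout like this, skipping a subtree is a single index addition, which is one reason flat pre-order storage is a common choice for serializable trees.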

What It Reflects

Depending on enabled features:

  • Basics: registers, constants, cbuffers
  • Functions: signatures, entrypoints, parameters
  • User-defined types: structs, enums, typedefs, interfaces, unions
  • Namespaces and scopes: full parent/child relationships
  • Annotations: custom metadata on symbols
  • Symbols: names, file paths, line/column spans (optional via flag)
  • Control flow scopes: if/else/switch/loops (optional)
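The feature groups above lend themselves to a bitmask. The sketch below is only an illustration: the enum `SketchFeature` and its bit values are assumptions, and only the names are borrowed from the "Features" array shown in the JSON output of this PR.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical flag values. The real enum in dxcreflect.h may use different
// names and bit positions.
enum SketchFeature : uint32_t {
  SketchFeature_Basics = 1u << 0,
  SketchFeature_Functions = 1u << 1,
  SketchFeature_Namespaces = 1u << 2,
  SketchFeature_UserTypes = 1u << 3,
  SketchFeature_Scopes = 1u << 4,
  SketchFeature_Symbols = 1u << 5,
};

// Converts a feature mask into the list of names a JSON writer could emit.
std::vector<std::string> FeatureNames(uint32_t mask) {
  static const char *const kNames[] = {"Basics",    "Functions", "Namespaces",
                                       "UserTypes", "Scopes",    "Symbols"};
  std::vector<std::string> names;
  for (uint32_t bit = 0; bit < 6; ++bit)
    if (mask & (1u << bit))
      names.push_back(kNames[bit]);
  return names;
}
```

Storing the mask alongside the data lets a consumer tell which categories were exported before it starts walking nodes.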

Example (COM)

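// Reflect from HLSL source; the result object holds the serialized output.
reflector->FromSource(
    source,
    (const wchar_t*)sourcePath.ptr,
    (LPCWSTR*)strings.ptr, U32(strings.length), // Arguments
    nullptr, 0, // Defines
    interfaces->includeHandler,
    &hlslReflectRes
);

// Fetch the serialized reflection blob from the result.
hlslReflectRes->GetResult(&reflectBinary);

// Load the blob back into a queryable reflection data object.
reflector->FromBlob(reflectBinary, &reflectionData);

// Read the top-level description, then enumerate every reflected function.
reflectionData->GetDesc(&reflDesc);

for (uint32_t i = 0; i < reflDesc.FunctionCount; ++i) {
    D3D12_HLSL_FUNCTION_DESC funcDesc;
    reflectionData->GetFunctionDesc(i, &funcDesc);
}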

Example (CLI)

Input (myShader.hlsl):

struct A { float a; };

Command:

dxreflector.exe myShader.hlsl

Output:

{
    "Features": ["Basics", "Functions", "Namespaces", "UserTypes", "Scopes", "Symbols"],
    "Children": [
        {
            "Name": "A",
            "NodeType": "Struct",
            "Children": [
                {
                    "Name": "a",
                    "NodeType": "Variable",
                    "Type": { "Name": "float" }
                }
            ]
        }
    ]
}
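A consumer of output shaped like the JSON above will typically walk the "Children" arrays recursively. The following sketch mirrors that shape with a plain struct instead of a JSON library and builds dotted paths such as `A.a`. The struct `JsonLikeNode`, the function `CollectPaths`, and the "Namespace" node type string are assumptions made for illustration; only "Struct" and "Variable" appear in the output above.

```cpp
#include <string>
#include <vector>

// Hypothetical in-memory mirror of one JSON node ("Name", "NodeType",
// "Children"); a real tool would populate it from a JSON parser.
struct JsonLikeNode {
  std::string name;
  std::string nodeType;
  std::vector<JsonLikeNode> children;
};

// Appends the path of `node` and of every descendant to `out`.
// Children of a namespace are joined with "::", everything else with ".".
void CollectPaths(const JsonLikeNode &node, const std::string &prefix,
                  const std::string &separator,
                  std::vector<std::string> &out) {
  const std::string path =
      prefix.empty() ? node.name : prefix + separator + node.name;
  out.push_back(path);
  // "Namespace" is an assumed NodeType string.
  const std::string childSeparator =
      node.nodeType == "Namespace" ? "::" : ".";
  for (const JsonLikeNode &child : node.children)
    CollectPaths(child, path, childSeparator, out);
}
```

Calling `CollectPaths` on a node for struct `A` with one child `a`, using an empty prefix, yields the paths `A` and `A.a`.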

Design Intent

The system is architected so that hlsl::ReflectionData is compiler-agnostic:

  • The AST-to-reflection walker (dxcreflection_from_ast.cpp) is isolated from the container format.
  • Alternative producers could be added in the future:
    • A Clang-HLSL-based frontend
    • A DXIL- or SPIR-V-based reflection backend (post-IR)
  • This allows the same serialized container format to act as a unified, long-term reflection layer across toolchains and languages.
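The separation between producers and the container can be pictured as an interface boundary. The sketch below is a toy under stated assumptions: `SketchContainer`, `SketchProducer` and both producer classes are hypothetical, and the "parsing" is a simple string split standing in for a real AST walker or IR reader.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the shared container: just a list of symbol names.
struct SketchContainer {
  std::vector<std::string> symbolNames;
};

// Anything that can fill the container implements this interface.
class SketchProducer {
public:
  virtual ~SketchProducer() = default;
  virtual SketchContainer Produce(const std::string &input) const = 0;
};

// Splits `input` on `delimiter`, skipping empty pieces.
static SketchContainer SplitInto(const std::string &input, char delimiter) {
  SketchContainer result;
  std::istringstream stream(input);
  std::string piece;
  while (std::getline(stream, piece, delimiter))
    if (!piece.empty())
      result.symbolNames.push_back(piece);
  return result;
}

// Two toy producers that read different input formats.
class SpaceSeparatedProducer : public SketchProducer {
public:
  SketchContainer Produce(const std::string &input) const override {
    return SplitInto(input, ' ');
  }
};

class CommaSeparatedProducer : public SketchProducer {
public:
  SketchContainer Produce(const std::string &input) const override {
    return SplitInto(input, ',');
  }
};

// Consumer code depends only on the container, never on a producer's internals.
std::size_t CountSymbols(const SketchProducer &producer,
                         const std::string &input) {
  return producer.Produce(input).symbolNames.size();
}
```

The consumer function works unchanged with either producer, which is the property the design aims for with a shared container format.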

Use Cases

  • Replacing ad-hoc or custom HLSL parsers in existing toolchains
  • More robust migration tooling for legacy shader models
  • Documentation or visualization utilities (graphs, call trees, type maps)
  • Static analysis and linting tools
  • IDE integrations requiring rich, source-level navigation
  • Automatic bindings generation to other languages or frameworks

This allows these workflows to avoid forking or modifying DXC.

Proposed Merge Strategy (splitting PRs)

To make this reviewable and practical, the feature can be split into incremental PRs:

  1. Public API: dxcreflect.h + unimplemented COM stubs
  2. Container model: skeleton of hlsl::ReflectionData
  3. JSON-only output (no binary serialization)
  4. Minimal AST walker: basics only, no scopes/statements
  5. Scopes & control-flow reflection
  6. Binary serialization & deserialization
  7. CLI tool and feature flags

This preserves reviewer sanity and limits scope per change.
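For step 6, the key property is that deserialization must reject malformed input rather than trust it. The sketch below shows that idea on a toy string table. The byte layout (a count, then length-prefixed strings, in host byte order) and all function names are assumptions for illustration and are not the PR's binary format.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Appends a 32-bit value in host byte order (an assumption of this sketch).
void AppendU32(std::vector<uint8_t> &out, uint32_t value) {
  uint8_t bytes[sizeof(value)];
  std::memcpy(bytes, &value, sizeof(value));
  out.insert(out.end(), bytes, bytes + sizeof(value));
}

// Reads a 32-bit value, failing if fewer than 4 bytes remain.
bool ReadU32(const std::vector<uint8_t> &in, std::size_t &offset,
             uint32_t &value) {
  if (offset > in.size() || in.size() - offset < sizeof(value))
    return false;
  std::memcpy(&value, in.data() + offset, sizeof(value));
  offset += sizeof(value);
  return true;
}

// Layout: [count][len0][bytes0][len1][bytes1]...
std::vector<uint8_t> SerializeStrings(const std::vector<std::string> &strings) {
  std::vector<uint8_t> out;
  AppendU32(out, uint32_t(strings.size()));
  for (const std::string &s : strings) {
    AppendU32(out, uint32_t(s.size()));
    out.insert(out.end(), s.begin(), s.end());
  }
  return out;
}

// Returns false on truncated input, oversized lengths, or trailing bytes.
bool DeserializeStrings(const std::vector<uint8_t> &in,
                        std::vector<std::string> &out) {
  out.clear();
  std::size_t offset = 0;
  uint32_t count = 0;
  if (!ReadU32(in, offset, count))
    return false;
  for (uint32_t i = 0; i < count; ++i) {
    uint32_t length = 0;
    if (!ReadU32(in, offset, length) || in.size() - offset < length)
      return false;
    out.emplace_back(reinterpret_cast<const char *>(in.data() + offset),
                     length);
    offset += length;
  }
  return offset == in.size();
}
```

Every length is checked against the remaining bytes before it is used, so a corrupted or hostile blob fails cleanly instead of reading out of bounds.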

Known Issues / TODO

  • Uninstantiated templates are skipped; instantiated templates behave correctly
  • Unit tests are present but incomplete given feature breadth
  • Function parameter nodeIds and statements are not currently in the COM API
  • Default initializers not yet included
  • Bitfields ignored
  • DXIL/SPIR-V backend not implemented
  • Requires a fully-formed preprocessed HLSL input
  • Function lookup by name does not handle overload resolution
  • CLI lacks bin→json conversion helpers and default .json output
  • Scope start/end offsets not yet emitted
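The overload limitation listed above can be worked around on the consumer side by treating a name lookup as returning a set of candidates. This is a minimal sketch assuming the caller has already copied function names into its own array; `SketchFunction`, `BuildNameIndex` and `FindOverloads` are hypothetical helpers, not part of the PR's API.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical caller-side record of one reflected function.
struct SketchFunction {
  std::string name;
  std::vector<std::string> parameterTypes;
};

// Maps each function name to the indices of all functions with that name.
std::multimap<std::string, uint32_t>
BuildNameIndex(const std::vector<SketchFunction> &functions) {
  std::multimap<std::string, uint32_t> index;
  for (uint32_t i = 0; i < functions.size(); ++i)
    index.emplace(functions[i].name, i);
  return index;
}

// Returns every overload of `name`, in declaration order.
std::vector<uint32_t>
FindOverloads(const std::multimap<std::string, uint32_t> &index,
              const std::string &name) {
  std::vector<uint32_t> result;
  const auto range = index.equal_range(name);
  for (auto it = range.first; it != range.second; ++it)
    result.push_back(it->second);
  return result;
}
```

Picking the right overload among the returned indices (for example by comparing parameter types) is left to the caller, since the reflection data itself does not perform overload resolution.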

Non-Goals

To avoid confusion:

  • Not a full AST serializer
  • Not a diagnostic system (errors/warnings remain in DXC)
  • Not a replacement for IR-level reflection (it complements it)
  • Not a transformation or rewrite engine (preprocessor is already applied)

Final Notes

This PR is intentionally large as a proof of concept.
I do not expect this to be merged in its current form, especially since I may be unable to maintain it beyond December 1st, but I hope it serves as a concrete reference and discussion starting point for a potential unified reflection system in DXC.

Nielsbishere and others added 30 commits July 15, 2025 19:44
…he IDxcRewriter, this will allow generating register ids for registers that don't fully define them.
…allow passing different rewriter flags. Also added support for cbuffer auto bindings.
…ently it can build a mini AST out of the clang AST. The goal of this AST is to provide a simplified AST that is extremely easy for users to iterate over and is quite efficient. This AST can be used to supplement reflection from DXIL/SPIRV since this data disappears while lowering to DXIL/SPIRV; such as unused registers, struct definitions, enum definitions, annotations as well as source ranges of the symbols to allow jumping to symbols. All of this allows external tooling to more easily report errors while doing validation with reflection and would allow editors to show a simplified view of the symbols defined in the current file. It opens up a lot of possibilities for HLSL editors because there is simply more information available and avoids everyone from writing their own parser that isn't properly HLSL compliant. As an example for my own projects; I use my own parser to collect functions with annotations so I can handle compiling the relevant entries automatically and in my professional project a custom parser is used to obtain lots of information since shaders are used to guide the render library (such as providing visual scripting nodes that need to be consistent even if registers are compiled out). Currently only supporting namespaces, resources/registers, enums. TODO: Variables, nested arrays for registers, annotations, functions, body for functions, targeted -reflect-hlsl functions like -reflect-hlsl-functions to avoid a large AST, expose the generated reflection as a serialized blob and be able to use it as an interface from C++, enums should store members of Enum as nodes to allow source code inspection. EnumValues should just be int64_t[]. EnumValue should be relative to start value of Enum.
… though that one is a bit tricky. Functions are now registered. Up next: CBuffer & annotations.
…emporary hack, required because unknown annotations [[custom::syntax("test")]] emit a warning and don't make it to the AST. With this enabled, it will turn these annotations into a clang annotation and preserves the entire syntax as intended for further parsing. Usage for this can vary from reflection info (example: providing a static sampler for the engine to create) to the pre-processor guiding how to compile the final binary (for example an alternative to [shader("vertex")] with oxc::stage to allow compiling it as a standalone entrypoint). These annotations eventually end up into the reflection data object for IDxcHLSLReflection and they are attached per node; each node has an annotation start and count. Also disabled temporary debug prints in RecursiveReflectHLSL. Added some temporary debug utilities to recursively print nodes and their annotations.
…BD: provide more information to allow a full reflect. Will apply the same thing to consistent-bindings
…g array ids to handle register multi dimensional arrays if there is more than just 1 array dimension; aka reflection can now maintain full array info of a register (up to 8 like spirv).
…o export (e.g. registers + cbuffer (basics), functions, namespaces, user-types, function-internals, variables) to keep the AST small if not more info is needed
…egister and started defining type, variable and 'constant buffer'.
… correctly, added some todos and empty string is now specified as string id 0. Now printing test through node iteration.
… no other data is required. Removed unused data from DxcHLSLType and added logging for a type. A buffer now recurses through members and makes nodes. Now handling printing types and member variables.
…uffer information such as StructuredBuffer/ConstantBuffer.
…of compiler & cpu. TODO: Node bitfields and make members of dxc hlsl type flat
…most everywhere for testing purposes. Added helpers for (de)serializing data. Added serialize/deserialize for DxcReflectionData to ensure a buffer can be output from the rewriter and can be loaded through the utility.
…on to DxcReflectionData with buffer (deserialize).
…nvalid user data is passed that may crash the caller. Also fixed a few issues that were found by writing this validator
…hat it does. Exposed the feature flags of reflection to the internal and external struct to allow tools to know what features were used when exporting.
…ST but you already have the nodeId -> names somewhere or you have no use for it (e.g. you only want to know what data is in 't0, space5', but no need for other type info). Split node and node symbol data as well as type name and member name which allows an AST without any names. Node and symbol are now (somewhat) tightly packed relative to cache size. Annotations are now stored in a StringsNonDebug list that gets preserved even when stripping debug data.
…k compatibility between two otherwise identical ASTs (except debug flag)
…nction names. For example myNamespace::_buf.$Element.myType.a.b. This currently doesn't distinguish function/template overloading though. The use is for simple auto complete
…tin' or non builtin. 'builtin' currently only supports the shader annotation, since others might contain more complex syntax that is easier to handle at IL level
…dxil lowering properly, it's similar to -flegacy-resource-reservation but distinct in some key ways: the option only works for reserved registers and we need it to work on any register that has been compiled out. We also don't necessarily need the register to stay at the end of DXIL generation, in fact we'd rather not. Because having the register present can cause unintended performance penalties (for example, I use this reflection to know what transitions to issue)
…at depends on the backend. Made D3D12_HLSL_ANNOTATION which has a name and whether or not it is a builtin (MS annotation or C++ annotation). Started implementing the COM frontend. Fixed issue with AST that includes a Texture2DMS since apparently it includes sample count. Started separating the reflector from the rewriter.
…ell as a non recursive child count. TODO: Structs, function parameters, constant buffers, types, arrays.
…dding INT64 to WinAdapter and fixing WinAdapter includes for d3d12shader.h. All reflection targets are now (almost) the highest error level. Fixed build issues.
…interpolation modes as well as function parameters
…or to IHLSLReflector. Added RunDxReflector (%dxreflector) to the file check to allow properly testing reflection, deserialization & serialization and json output. Added a basic unit test for reflection (resources only). dxreflector.exe now properly outputs to the console without printf, allowing the testing utility and externals to not have to generate a json.
…s to have members; which is true in C but not C++/HLSL. Fixed issue with dxreflector.exe if the HLSL is invalid and no -Fo is used.
…sorts of array sizes. Added unit test for 'empty' hlsl files
…ymbols, inheritance and functions/annotations/(in)(out)/interpolationMode. Fixed variable names of structs when stripping symbols
…st to show what is supported with templates/using and what isn't
…Removed redundant json creation in dxcreflector.cpp. Symbols printed as a child now only show the name in the child and the symbol as a separate object.
…se/default and in the future if/else if/else to be done through a single parent node. Added some unit tests for switch. Enum (data) type is now also used for storing case info. Changed the statement walking from a switch case to an explicit walk to properly go through the cases and default. Added validation for switch/case/default.
… first/else if/else nodes. Updated unit tests for stmts and made them easier to find issues with. Now properly handling case/default children by using get sub smt. Fixed some json serialization problems for raw_data_test.
…g annotations and other information from nodes. Also added interpolation mode to NODE to allow struct members to have interpolation mode as well.
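One of the commit messages above notes that unknown annotations such as [[oxc::stage("compute")]] are preserved as raw text so that external tools can parse them. A consumer might split that raw text into a name and an argument string along these lines. `ParsedAnnotation` and `SplitAnnotation` are hypothetical helpers, and the sketch assumes the input is the text between `[[` and `]]` with no nested attribute lists.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type: the annotation name and its raw argument text.
struct ParsedAnnotation {
  std::string name;
  std::string arguments; // text between the outermost parentheses, unparsed
};

// Splits raw annotation text such as `oxc::stage("compute")` into
// name = `oxc::stage` and arguments = `"compute"`.
// Text without a well-formed parenthesis pair is treated as a bare name.
ParsedAnnotation SplitAnnotation(const std::string &raw) {
  ParsedAnnotation result;
  const std::size_t open = raw.find('(');
  const std::size_t close = raw.rfind(')');
  if (open == std::string::npos || close == std::string::npos ||
      close < open) {
    result.name = raw;
    return result;
  }
  result.name = raw.substr(0, open);
  result.arguments = raw.substr(open + 1, close - open - 1);
  return result;
}
```

The argument text is deliberately left unparsed here, since its grammar is defined by whichever tool owns the annotation.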
@github-actions

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 67c9849bc44ecbdc8dce02a97d21a7ec0472ef33 c00fc3106fb5bafa78b38198a25271f4d7bd028c -- include/dxc/DxcReflection/DxcReflectionContainer.h include/dxc/dxcreflect.h tools/clang/tools/dxcreflection/dxcreflection_from_ast.cpp tools/clang/tools/dxcreflection/dxcreflector.cpp tools/clang/tools/dxcreflectioncontainer/DxcReflectionContainer.cpp tools/clang/tools/dxcreflectioncontainer/DxcReflectionJson.cpp tools/clang/tools/dxreflector/dxreflector.cpp include/dxc/Support/ErrorCodes.h include/dxc/Support/HLSLOptions.h include/dxc/Test/DxcTestUtils.h include/dxc/WinAdapter.h lib/DxcSupport/HLSLOptions.cpp tools/clang/include/clang/Basic/LangOptions.h tools/clang/lib/Sema/SemaDeclAttr.cpp tools/clang/tools/dxcompiler/dxcapi.cpp tools/clang/unittests/HLSLTestLib/FileCheckerTest.cpp
View the diff from clang-format here.
diff --git a/include/dxc/Support/ErrorCodes.h b/include/dxc/Support/ErrorCodes.h
index d49ab292..c34f4d68 100644
--- a/include/dxc/Support/ErrorCodes.h
+++ b/include/dxc/Support/ErrorCodes.h
@@ -158,4 +158,3 @@
 // model
 #define DXC_E_INCORRECT_PROGRAM_VERSION                                        \
   DXC_MAKE_HRESULT(DXC_SEVERITY_ERROR, FACILITY_DXC, (0x001F))
-  
\ No newline at end of file
diff --git a/include/dxc/dxcreflect.h b/include/dxc/dxcreflect.h
index 958caff8..d97d8f90 100644
--- a/include/dxc/dxcreflect.h
+++ b/include/dxc/dxcreflect.h
@@ -21,8 +21,8 @@
 #define interface struct
 #endif
 
-#include "d3d12shader.h"
 #include "./dxcapi.h"
+#include "d3d12shader.h"
 
 #ifdef _MSC_VER
 #define CLSID_SCOPE __declspec(selectany) extern
@@ -166,7 +166,7 @@ enum D3D12_HLSL_NODE_TYPE {
   D3D12_HLSL_NODE_TYPE_CASE,
   D3D12_HLSL_NODE_TYPE_DEFAULT,
 
-   D3D12_HLSL_NODE_TYPE_USING,
+  D3D12_HLSL_NODE_TYPE_USING,
 
   D3D12_HLSL_NODE_TYPE_IF_FIRST,
   D3D12_HLSL_NODE_TYPE_ELSE_IF,
diff --git a/tools/clang/lib/Sema/SemaDeclAttr.cpp b/tools/clang/lib/Sema/SemaDeclAttr.cpp
index 1ffbfe14..b0718339 100644
--- a/tools/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/tools/clang/lib/Sema/SemaDeclAttr.cpp
@@ -4586,17 +4586,19 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
   if (Attr.getKind() == AttributeList::UnknownAttribute ||
       !Attr.existsInTarget(S.Context.getTargetInfo().getTriple())) {
 
-    //HLSL change, language option to maintain unknown annotations.
-    //This is extremely useful for extending the language for providing extra reflection info.
-    //These annotations are accessible through IHLSLReflectionData.
+    // HLSL change, language option to maintain unknown annotations.
+    // This is extremely useful for extending the language for providing extra
+    // reflection info. These annotations are accessible through
+    // IHLSLReflectionData.
 
     const LangOptions &LangOpts = S.Context.getLangOpts();
 
     if (LangOpts.HLSL && LangOpts.PreserveUnknownAnnotations) {
 
-      //In the case of oxc::stage("compute") clang only maintains oxc::stage.
-      //We get around this by instantiating a lexer and finding the end of the annotation (]]).
-      //We don't do any cleanup and pass the inside of [[]] as is, so any external parsing can be done on it.
+      // In the case of oxc::stage("compute") clang only maintains oxc::stage.
+      // We get around this by instantiating a lexer and finding the end of the
+      // annotation (]]). We don't do any cleanup and pass the inside of [[]] as
+      // is, so any external parsing can be done on it.
 
       SourceRange AttrRange = Attr.getRange();
 
@@ -4611,10 +4613,10 @@ static void ProcessDeclAttribute(Sema &S, Scope *scope, Decl *D,
 
       Lexer Lex(SM.getLocForStartOfFile(FID), LangOpts, Buffer.begin(),
                 AttrData, Buffer.end());
-      
+
       Token Tok;
-      while (!Lex.LexFromRawLexer(Tok)) {       //Search until ]]
-      
+      while (!Lex.LexFromRawLexer(Tok)) { // Search until ]]
+
         if (!Tok.is(tok::r_square))
           continue;
 
diff --git a/tools/clang/tools/dxcompiler/dxcapi.cpp b/tools/clang/tools/dxcompiler/dxcapi.cpp
index e298ed84..5c6ac00d 100644
--- a/tools/clang/tools/dxcompiler/dxcapi.cpp
+++ b/tools/clang/tools/dxcompiler/dxcapi.cpp
@@ -20,8 +20,8 @@
 #include "dxc/Support/Global.h"
 #include "dxc/config.h"
 #include "dxc/dxcisense.h"
-#include "dxc/dxctools.h"
 #include "dxc/dxcreflect.h"
+#include "dxc/dxctools.h"
 #ifdef _WIN32
 #include "dxcetw.h"
 #endif
@@ -94,7 +94,7 @@ static HRESULT ThreadMallocDxcCreateInstance(REFCLSID rclsid, REFIID riid,
     hr = CreateDxcPdbUtils(riid, ppv);
   } else if (IsEqualCLSID(rclsid, CLSID_DxcRewriter)) {
     hr = CreateDxcRewriter(riid, ppv);
-  }  else if (IsEqualCLSID(rclsid, CLSID_DxcReflector)) {
+  } else if (IsEqualCLSID(rclsid, CLSID_DxcReflector)) {
     hr = CreateDxcReflector(riid, ppv);
   } else if (IsEqualCLSID(rclsid, CLSID_DxcLinker)) {
     hr = CreateDxcLinker(riid, ppv);
diff --git a/tools/clang/tools/dxcreflection/dxcreflection_from_ast.cpp b/tools/clang/tools/dxcreflection/dxcreflection_from_ast.cpp
index 23c00a87..2a8cdaf9 100644
--- a/tools/clang/tools/dxcreflection/dxcreflection_from_ast.cpp
+++ b/tools/clang/tools/dxcreflection/dxcreflection_from_ast.cpp
@@ -128,10 +128,11 @@ PushNextNodeId(uint32_t &NodeId, ReflectionData &Refl, const SourceManager &SM,
     }
   }
 
-  //There is a forward declare, but we haven't seen it before.
-  //This happens for example if we have a fwd func declare in a struct, but define it in global namespace.
-  //(only) -reflect-function will hide this struct from us, but will still find a function in the global scope.
-  //This fixes that problem.
+  // There is a forward declare, but we haven't seen it before.
+  // This happens for example if we have a fwd func declare in a struct, but
+  // define it in global namespace. (only) -reflect-function will hide this
+  //struct from us, but will still find a function in the global scope. This
+  // fixes that problem.
 
   if (!isFwdDeclare && fwdDeclare && fwdDeclare != DeclSelf &&
       FwdDecls->find(fwdDeclare) == FwdDecls->end()) {
@@ -1541,7 +1542,7 @@ struct RecursiveStmtReflector : public StmtVisitor<RecursiveStmtReflector> {
 
     if (LastError)
       return;
-    
+
     uint32_t loc = uint32_t(Refl.IfSwitchStatements.size());
 
     const SourceRange &sourceRange = If->getSourceRange();
@@ -1557,7 +1558,7 @@ struct RecursiveStmtReflector : public StmtVisitor<RecursiveStmtReflector> {
 
     Refl.IfSwitchStatements.push_back(ReflectionIfSwitchStmt());
 
-    std::vector<Stmt*> branches;
+    std::vector<Stmt *> branches;
     branches.reserve(2);
     branches.push_back(If);
 
@@ -1793,8 +1794,7 @@ struct RecursiveStmtReflector : public StmtVisitor<RecursiveStmtReflector> {
 
         nodeType = D3D12_HLSL_NODE_TYPE_CASE;
 
-      } else if (DefaultStmt *defaultStmt =
-                     dyn_cast<DefaultStmt>(child)) {
+      } else if (DefaultStmt *defaultStmt = dyn_cast<DefaultStmt>(child)) {
         switchCase = defaultStmt;
         hasDefault = true;
         nodeType = D3D12_HLSL_NODE_TYPE_DEFAULT;
diff --git a/tools/clang/tools/dxcreflection/dxcreflector.cpp b/tools/clang/tools/dxcreflection/dxcreflector.cpp
index 6cabaf3d..576d4f72 100644
--- a/tools/clang/tools/dxcreflection/dxcreflector.cpp
+++ b/tools/clang/tools/dxcreflection/dxcreflector.cpp
@@ -670,7 +670,7 @@ struct HLSLReflectionData : public IHLSLReflectionData {
   std::vector<CHLSLFunctionParameter> FunctionParameters;
 
   HLSLReflectionData() : m_refCount(1) {}
-  virtual ~HLSLReflectionData() = default; 
+  virtual ~HLSLReflectionData() = default;
 
   // TODO: This function needs another look definitely
   void Finalize() {
@@ -905,7 +905,8 @@ struct HLSLReflectionData : public IHLSLReflectionData {
             : "";
 
     *pDesc = D3D12_HLSL_ENUM_DESC{
-        name, uint32_t(Data.Nodes[enm.NodeId].GetChildCount()), enm.Type, enm.NodeId};
+        name, uint32_t(Data.Nodes[enm.NodeId].GetChildCount()), enm.Type,
+        enm.NodeId};
 
     return S_OK;
   }
@@ -1688,7 +1689,7 @@ public:
   DXC_MICROCOM_TM_CTOR(DxcReflector)
   DXC_LANGEXTENSIONS_HELPER_IMPL(m_langExtensionsHelper)
 
-  virtual ~DxcReflector() = default; 
+  virtual ~DxcReflector() = default;
 
   HRESULT STDMETHODCALLTYPE QueryInterface(REFIID iid,
                                            void **ppvObject) override {
diff --git a/tools/clang/tools/dxcreflectioncontainer/DxcReflectionJson.cpp b/tools/clang/tools/dxcreflectioncontainer/DxcReflectionJson.cpp
index 3bb66e83..1c9f0c8d 100644
--- a/tools/clang/tools/dxcreflectioncontainer/DxcReflectionJson.cpp
+++ b/tools/clang/tools/dxcreflectioncontainer/DxcReflectionJson.cpp
@@ -1281,12 +1281,13 @@ uint32_t PrintNodeRecursive(const ReflectionData &Reflection, uint32_t NodeId,
   case D3D12_HLSL_NODE_TYPE_IF_FIRST:
   case D3D12_HLSL_NODE_TYPE_ELSE_IF: {
 
-    const ReflectionBranchStmt &stmt = Reflection.BranchStatements[node.GetLocalId()];
+    const ReflectionBranchStmt &stmt =
+        Reflection.BranchStatements[node.GetLocalId()];
     uint32_t start = NodeId + 1;
 
     if (stmt.HasConditionVar())
       Json.Object("Branch", [NodeId, &Reflection, &Json, &start, &Settings,
-                         hasSymbols, &childrenToSkip]() {
+                             hasSymbols, &childrenToSkip]() {
         Json.Object("Condition", [NodeId, &Reflection, &Json, &start, &Settings,
                                   hasSymbols, &childrenToSkip]() {
           if (hasSymbols)
@@ -1349,7 +1350,8 @@ uint32_t PrintNodeRecursive(const ReflectionData &Reflection, uint32_t NodeId,
   }
 
   // Switch; turns into ("Condition"), ("Case": [])
-  // If(Root); is just a container for IfFirst/ElseIf/Else (no need to handle it here)
+  // If(Root); is just a container for IfFirst/ElseIf/Else (no need to handle it
+  // here)
 
   else if (nodeType == D3D12_HLSL_NODE_TYPE_SWITCH) {
 