Commit 16f2c95

[Very WIP] Rewrite the core of the binding generator.

TL;DR: The binding generator is a mess as of right now. At first it was fun (in a "this is challenging" sense) to improve on it, but this is not sustainable.

The truth is that the current architecture of the binding generator is a huge pile of hacks, so these few days I've been working on rewriting it with a few goals.

1) Have the hacks as contained and identified as possible. They're sometimes needed because of how clang exposes the AST, but ideally those hacks are well identified and don't interact randomly with each other.

   As an example, the current bindgen, when scanning the parameters of a function that references a struct, clones all the struct information; then if the struct name changes (because we mangle it), everything breaks.

2) Support extending the bindgen output without having to deal with clang. The way I'm aiming to do this is by completely separating the parsing stage from the code generation one, and providing a single id for each item the binding generator provides.

3) No more random mutation of the internal representation from anywhere. That means no more Rc<RefCell<T>>, no more random circular references, no more borrow_state... nothing.

4) No more deduplication of declarations before code generation. Current bindgen has a stage, called `tag_dup_decl`[1], that takes care of deduplicating declarations. That's completely buggy, and for C++ it's a complete mess, since we YOLO modify the world. I've managed to get rid of this by using the clang canonical declaration, and the definition, to avoid scanning any type/item twice.

5) Code generation should not modify any internal data structure. It can look things up and traverse whatever it needs, but not modify anything randomly.

6) Each item should have a canonical name and a single source of mangling logic, and that should be computed from the immutable state, at code generation.

   I've put some canonical_name stuff in the code generation phase, but it's still not complete, and should change if I implement namespaces.

Improvements pending before this can land:

1) Add support for missing core stuff, mainly generating functions (note that we parse the signatures for types correctly though), bitfields, and generating C++ methods.

2) Add support for the necessary features that were added to work around some C++ pitfalls, like opaque types, etc.

3) Add support for the sugar that Manish added recently.

4) Optionally (and I guess this can land without it, because basically nobody uses it since it's so buggy), bring back namespace support.

These are not completely trivial, but I think I can do them quite easily with the current architecture.

I'm putting the current state of affairs here as a request for comments... Any thoughts?

Note that there are still a few smells I want to eventually re-redesign, like the ParseError::Recurse thing, but until that happens I'm way happier with this kind of architecture.

I'm keeping the old `parser.rs` and `gen.rs` in tree just for reference while I code, but they will go away.

[1]: https://github.com/Yamakaky/rust-bindgen/blob/master/src/gen.rs#L448
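
Goals 2, 3, 5 and 6 from the message can be sketched roughly as follows. This is not the code from this commit: `ItemId`, `Item`, `ItemKind`, `Context`, `generate` and the `Struct_` prefix are hypothetical names picked for illustration, and the item model is reduced to two kinds. The point is only the shape: items live in one owning table, refer to each other by id, and code generation receives a shared reference.

```rust
use std::collections::BTreeMap;

/// Hypothetical stable handle to an item: plain data, cheap to copy.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
pub struct ItemId(usize);

/// Hypothetical, heavily simplified item kinds.
#[derive(Debug)]
pub enum ItemKind {
    /// A struct whose fields point at other items by id.
    Struct { fields: Vec<(String, ItemId)> },
    /// A C `int`.
    Int,
}

#[derive(Debug)]
pub struct Item {
    pub name: String,
    pub kind: ItemKind,
}

/// Owns every item. Items reference each other through `ItemId`,
/// so no `Rc<RefCell<T>>` and no circular ownership is needed.
#[derive(Debug, Default)]
pub struct Context {
    items: BTreeMap<ItemId, Item>,
    next_id: usize,
}

impl Context {
    /// Parsing stage: the only place where the context is mutated.
    pub fn add_item(&mut self, name: &str, kind: ItemKind) -> ItemId {
        let id = ItemId(self.next_id);
        self.next_id += 1;
        self.items.insert(id, Item { name: name.to_owned(), kind });
        id
    }

    /// Looks an item up by id. Panics on an id this context never issued.
    pub fn resolve(&self, id: ItemId) -> &Item {
        &self.items[&id]
    }

    /// Single source of name mangling, computed from immutable state.
    pub fn canonical_name(&self, id: ItemId) -> String {
        let item = self.resolve(id);
        match item.kind {
            ItemKind::Struct { .. } => format!("Struct_{}", item.name),
            ItemKind::Int => "::std::os::raw::c_int".to_owned(),
        }
    }
}

/// Code generation stage: takes `&Context`, so it can look things up
/// and traverse, but the borrow checker rejects any mutation.
pub fn generate(ctx: &Context, id: ItemId) -> String {
    match &ctx.resolve(id).kind {
        ItemKind::Struct { fields } => {
            let mut out = format!("pub struct {} {{\n", ctx.canonical_name(id));
            for (field, ty) in fields {
                out.push_str(&format!("    pub {}: {},\n", field, ctx.canonical_name(*ty)));
            }
            out.push_str("}\n");
            out
        }
        // Builtin types produce no declaration of their own.
        ItemKind::Int => String::new(),
    }
}
```

Because a struct's fields only hold ids, changing how a struct is named touches `canonical_name` alone; nothing has cloned the struct's information elsewhere.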
1 parent 2d94347 commit 16f2c95

23 files changed, +3042 -960 lines changed

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -16,6 +16,7 @@ quasi = { version = "0.15", features = ["with-syntex"] }
 clippy = { version = "*", optional = true }
 syntex_syntax = "0.38"
 log = "0.3.*"
+env_logger = "*"
 libc = "0.2.*"
 clang-sys = "0.8.0"

build.rs

Lines changed: 3 additions & 3 deletions
@@ -5,11 +5,11 @@ mod codegen {

     pub fn main() {
         let out_dir = env::var_os("OUT_DIR").unwrap();
-        let src = Path::new("src/gen.rs");
-        let dst = Path::new(&out_dir).join("gen.rs");
+        let src = Path::new("src/codegen/mod.rs");
+        let dst = Path::new(&out_dir).join("codegen.rs");

         quasi_codegen::expand(&src, &dst).unwrap();
-        println!("cargo:rerun-if-changed=src/gen.rs");
+        println!("cargo:rerun-if-changed=src/codegen/mod.rs");
     }
 }

src/bin/bindgen.rs

Lines changed: 8 additions & 0 deletions
@@ -2,6 +2,7 @@
 #![crate_type = "bin"]

 extern crate bindgen;
+extern crate env_logger;
 #[macro_use]
 extern crate log;
 extern crate clang_sys;
@@ -230,6 +231,13 @@ Options:
 }

 pub fn main() {
+    log::set_logger(|max_log_level| {
+        use env_logger::Logger;
+        let env_logger = Logger::new();
+        max_log_level.set(env_logger.filter());
+        Box::new(env_logger)
+    }).expect("Failed to set logger.");
+
     let mut bind_args: Vec<_> = env::args().collect();
     let bin = bind_args.remove(0);

src/clang.rs

Lines changed: 76 additions & 16 deletions
@@ -15,6 +15,13 @@ pub struct Cursor {
     x: CXCursor
 }

+impl fmt::Debug for Cursor {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "Cursor({} kind: {}, loc: {})",
+               self.spelling(), kind_to_str(self.kind()), self.location())
+    }
+}
+
 pub type CursorVisitor<'s> = for<'a, 'b> FnMut(&'a Cursor, &'b Cursor) -> Enum_CXChildVisitResult + 's;

 impl Cursor {
@@ -55,6 +62,12 @@ impl Cursor {
         }
     }

+    pub fn is_anonymous(&self) -> bool {
+        unsafe {
+            clang_Cursor_isAnonymous(self.x) != 0
+        }
+    }
+
     pub fn is_template(&self) -> bool {
         self.specialized().is_valid()
     }
@@ -77,10 +90,11 @@ impl Cursor {
         }
     }

-    pub fn raw_comment(&self) -> String {
-        unsafe {
+    pub fn raw_comment(&self) -> Option<String> {
+        let s = unsafe {
             String_ { x: clang_Cursor_getRawCommentText(self.x) }.to_string()
-        }
+        };
+        if s.is_empty() { None } else { Some(s) }
     }

     pub fn comment(&self) -> Comment {
@@ -165,12 +179,18 @@ impl Cursor {
         }
     }

-    pub fn enum_val(&self) -> i64 {
+    pub fn enum_val_signed(&self) -> i64 {
         unsafe {
             clang_getEnumConstantDeclValue(self.x) as i64
         }
     }

+    pub fn enum_val_unsigned(&self) -> u64 {
+        unsafe {
+            clang_getEnumConstantDeclUnsignedValue(self.x) as u64
+        }
+    }
+
     // typedef
     pub fn typedef_type(&self) -> Type {
         unsafe {
@@ -293,10 +313,19 @@ impl Hash for Cursor {
 }

 // type
+#[derive(Clone, Hash)]
 pub struct Type {
     x: CXType
 }

+impl fmt::Debug for Type {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "Type({}, kind: {}, decl: {:?}, canon: {:?})",
+               self.spelling(), type_to_str(self.kind()), self.declaration(),
+               self.declaration().canonical())
+    }
+}
+
 #[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
 pub enum LayoutError {
     Invalid,
@@ -378,6 +407,24 @@ impl Type {
         }
     }

+    pub fn fallible_align(&self) -> Result<usize, LayoutError> {
+        unsafe {
+            let val = clang_Type_getAlignOf(self.x);
+            if val < 0 {
+                Err(LayoutError::from(val as i32))
+            } else {
+                Ok(val as usize)
+            }
+        }
+    }
+
+    pub fn fallible_layout(&self) -> Result<::ir::layout::Layout, LayoutError> {
+        use ir::layout::Layout;
+        let size = try!(self.fallible_size());
+        let align = try!(self.fallible_align());
+        Ok(Layout::new(size, align))
+    }
+
     pub fn align(&self) -> usize {
         unsafe {
             let val = clang_Type_getAlignOf(self.x);
@@ -581,21 +628,25 @@ pub struct Index {
 }

 impl Index {
-    pub fn create(pch: bool, diag: bool) -> Index {
+    pub fn new(pch: bool, diag: bool) -> Index {
         unsafe {
             Index { x: clang_createIndex(pch as c_int, diag as c_int) }
         }
     }
+}

-    pub fn dispose(&self) {
+impl fmt::Debug for Index {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "Index {{ }}")
+    }
+}
+
+impl Drop for Index {
+    fn drop(&mut self) {
         unsafe {
             clang_disposeIndex(self.x);
         }
     }
-
-    pub fn is_null(&self) -> bool {
-        self.x.is_null()
-    }
 }

 // Token
@@ -609,6 +660,12 @@ pub struct TranslationUnit {
     x: CXTranslationUnit
 }

+impl fmt::Debug for TranslationUnit {
+    fn fmt(&self, fmt: &mut fmt::Formatter) -> fmt::Result {
+        write!(fmt, "TranslationUnit {{ }}")
+    }
+}
+
 impl TranslationUnit {
     pub fn parse(ix: &Index, file: &str, cmd_args: &[String],
                  unsaved: &[UnsavedFile], opts: ::libc::c_uint) -> TranslationUnit {
@@ -655,12 +712,6 @@ impl TranslationUnit {
         }
     }

-    pub fn dispose(&self) {
-        unsafe {
-            clang_disposeTranslationUnit(self.x);
-        }
-    }
-
     pub fn is_null(&self) -> bool {
         self.x.is_null()
     }
@@ -687,6 +738,15 @@ impl TranslationUnit {
     }
 }

+impl Drop for TranslationUnit {
+    fn drop(&mut self) {
+        unsafe {
+            clang_disposeTranslationUnit(self.x);
+        }
+    }
+}
+
+
 // Diagnostic
 pub struct Diagnostic {
     x: CXDiagnostic
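
The `Index` and `TranslationUnit` changes above replace an explicit `dispose()` method with a `Drop` impl, so the handle is released when the wrapper goes out of scope. A minimal sketch of that pattern, assuming a fake handle in place of libclang: `RawHandle`, the `Cell` counter and `dispose_counts` are hypothetical stand-ins so the example runs without clang, and only the `Index`/`Drop` shape mirrors the diff.

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a raw C handle such as `CXIndex`.
/// It only records how many times it has been "disposed".
pub struct RawHandle<'a> {
    disposed: &'a Cell<u32>,
}

/// Owning wrapper with the same shape as `Index` in the diff.
pub struct Index<'a> {
    x: RawHandle<'a>,
}

impl<'a> Index<'a> {
    pub fn new(disposed: &'a Cell<u32>) -> Index<'a> {
        Index { x: RawHandle { disposed } }
    }
}

impl<'a> Drop for Index<'a> {
    fn drop(&mut self) {
        // The real code calls `clang_disposeIndex(self.x)` here;
        // this sketch just bumps the counter.
        self.x.disposed.set(self.x.disposed.get() + 1);
    }
}

/// Creates and drops one `Index`, returning the dispose count observed
/// while it was alive and after it went out of scope.
pub fn dispose_counts() -> (u32, u32) {
    let counter = Cell::new(0);
    let while_alive = {
        let _ix = Index::new(&counter);
        counter.get()
    };
    (while_alive, counter.get())
}
```

With this shape callers cannot forget the cleanup call or run it twice through the safe API, which is presumably why the `dispose()` methods could be deleted outright.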

src/clangll.rs

Lines changed: 2 additions & 1 deletion
@@ -428,7 +428,7 @@ pub const CXCallingConv_X86_64SysV: c_uint = 11;
 pub const CXCallingConv_Invalid: c_uint = 100;
 pub const CXCallingConv_Unexposed: c_uint = 200;
 #[repr(C)]
-#[derive(Copy, Clone)]
+#[derive(Copy, Clone, Hash)]
 pub struct CXType {
     pub kind: Enum_CXTypeKind,
     pub data: [*mut c_void; 2],
@@ -1168,6 +1168,7 @@ extern "C" {
     pub fn clang_Cursor_getMangling(C: CXCursor) -> CXString;
     pub fn clang_Cursor_getParsedComment(C: CXCursor) -> CXComment;
     pub fn clang_Cursor_getModule(C: CXCursor) -> CXModule;
+    pub fn clang_Cursor_isAnonymous(C: CXCursor) -> c_uint;
     pub fn clang_Module_getASTFile(Module: CXModule) -> CXFile;
     pub fn clang_Module_getParent(Module: CXModule) -> CXModule;
     pub fn clang_Module_getName(Module: CXModule) -> CXString;
