Skip to content

New binary swiftdeps format #32131

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jun 11, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
The table of contents is too big for display.
Diff view
Diff view
  •  
  •  
  •  
1 change: 1 addition & 0 deletions include/swift/AST/FileSystem.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@

#include "swift/Basic/FileSystem.h"
#include "swift/AST/DiagnosticEngine.h"
#include "swift/AST/DiagnosticsCommon.h"

namespace swift {

Expand Down
15 changes: 11 additions & 4 deletions include/swift/AST/FineGrainedDependencies.h
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
//===----- FineGrainedependencies.h -----------------------------*- C++ -*-===//
//===----- FineGrainedDependencies.h ----------------------------*- C++ -*-===//
//
// This source file is part of the Swift.org open source project
//
Expand Down Expand Up @@ -632,6 +632,11 @@ class DepGraphNode {

const DependencyKey &getKey() const { return key; }

/// Only used when the driver is reading a SourceFileDepGraphNode.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good comment! Sorry it cannot be enforced.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be enforced with a 'friend' declaration -- but then I'd have to expose an implementation detail of the reader code in a public header. Either way, one or the other has to be public! And just having a public setter here doesn't seem like the worst idea in the world, since other data types here have public setters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds like there's no way to enforce it.

When I'm debugging an unexpected value in a data type, the smaller the scope of any setters, the easier it is to find the cause. Why would the existence of public setters for other types matter?

void setKey(const DependencyKey &key) {
this->key = key;
}

const Optional<StringRef> getFingerprint() const {
if (fingerprint) {
return StringRef(fingerprint.getValue());
Expand Down Expand Up @@ -684,15 +689,15 @@ class SourceFileDepGraphNode : public DepGraphNode {
/// True iff a Decl exists for this node.
/// If a provides and a depends in the existing system both have the same key,
/// only one SourceFileDepGraphNode is emitted.
bool isProvides;
bool isProvides = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the code, I wonder why false is the default. Would a comment explaining that be out of place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this the no-argument constructor produces an object where 'isProvides' has an undefined value. I'm not sure that warrants a comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I saw the "false" I figured that the isProvides field would only be set if it were true, and you were using the false because if never set, that meant it would be false. If such is the case, the code would be clearer to me with a comment.

If, on the other hand, it is the case that this field will be set to either truth value, and the initialization to false has no effect, then, the way I read code, I would find this initialization confusing without a comment. The initializer sets the sequence number to an illegal value so that initialized-but-never-set nodes can be detected.


friend ::llvm::yaml::MappingContextTraits<SourceFileDepGraphNode,
SourceFileDepGraph>;

public:
/// When the driver imports a node create an uninitialized instance for
/// deserializing.
SourceFileDepGraphNode() : DepGraphNode(), sequenceNumber(~0) {}
SourceFileDepGraphNode() : DepGraphNode() {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why get rid of the illegal sequence number? It was an extra check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's already initialized to ~0 above, so this initializer was redundant.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, sounds good!


/// Used by the frontend to build nodes.
SourceFileDepGraphNode(DependencyKey key, Optional<StringRef> fingerprint,
Expand Down Expand Up @@ -736,6 +741,9 @@ class SourceFileDepGraphNode : public DepGraphNode {
: "somewhere else");
}

SWIFT_DEBUG_DUMP;
void dump(llvm::raw_ostream &os) const;

bool verify() const {
DepGraphNode::verify();
assert(getIsProvides() || isDepends());
Expand Down Expand Up @@ -872,7 +880,6 @@ class SourceFileDepGraph {

void emitDotFile(StringRef outputPath, DiagnosticEngine &diags);

private:
void addNode(SourceFileDepGraphNode *n) {
n->setSequenceNumber(allNodes.size());
allNodes.push_back(n);
Expand Down
133 changes: 133 additions & 0 deletions include/swift/AST/FineGrainedDependencyFormat.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,133 @@
//===---- FineGrainedDependencyFormat.h - swiftdeps format ---*- C++ -*-======//
//
// This source file is part of the Swift.org open source project
//
// Copyright (c) 2014 - 2020 Apple Inc. and the Swift project authors
// Licensed under Apache License v2.0 with Runtime Library Exception
//
// See https://swift.org/LICENSE.txt for license information
// See https://swift.org/CONTRIBUTORS.txt for the list of Swift project authors
//
//===----------------------------------------------------------------------===//

#ifndef SWIFT_AST_FINEGRAINEDDEPENDENCYFORMAT_H
#define SWIFT_AST_FINEGRAINEDDEPENDENCYFORMAT_H

#include "llvm/Bitcode/RecordLayout.h"
#include "llvm/Bitstream/BitCodes.h"

namespace llvm {
class MemoryBuffer;
}

namespace swift {

class DiagnosticEngine;

namespace fine_grained_dependencies {

class SourceFileDepGraph;

using llvm::BCFixed;
using llvm::BCVBR;
using llvm::BCBlob;
using llvm::BCRecordLayout;

/// Every .swiftdeps file begins with these 4 bytes, for easy identification when
/// debugging.
const unsigned char FINE_GRAINED_DEPDENENCY_FORMAT_SIGNATURE[] = {'D', 'E', 'P', 'S'};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a reviewer/maintainer, it would be nice to have a comment right here explaining the purpose of a signature.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a comment (it's the magic number that the swiftdeps file begins with).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! Thanks.


const unsigned FINE_GRAINED_DEPENDENCY_FORMAT_VERSION_MAJOR = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense for this to be version 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's the version number of the new binary format, not the version number of the overall dependency format. (In which case, it would be 3, with the fine-grained format v2 :-) )

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I guess I was misled by the name "fine grained dependency format version", which implied to me that the number should be 2, since this is the second one such. But I can't think of anything that would break by leaving it as 1, since an old driver should reject this new format.


/// Increment this on every change.
const unsigned FINE_GRAINED_DEPENDENCY_FORMAT_VERSION_MINOR = 0;

using IdentifierIDField = BCVBR<13>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've always wondered where such numbers came from. Is this assuming that 2**13 is the max number of identifiers in a file? Is there an assertion somewhere? Is it just the number of leftover bits?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BCVBR is a variable-width encoding. 13 is the number of bits plus one which are encoded per chunk. I arrived at 2**12 empirically, but I don't think it matters too much.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen if the value ever exceeded that? Would it just be slower, or would it break? If the latter, what would the error message be? Could a bug somewhere cause the code to think the value were more than 2**12 and would the result be reasonable to debug?

If the 13 is because the code counts on there being less than 2**12 names in a file, a comment would be nice.


using NodeKindField = BCFixed<3>;
using DeclAspectField = BCFixed<1>;

const unsigned RECORD_BLOCK_ID = llvm::bitc::FIRST_APPLICATION_BLOCKID;

/// The swiftdeps file format consists of a METADATA record, followed by zero or more
/// IDENTIFIER_NODE records.
///
/// Then, there is one SOURCE_FILE_DEP_GRAPH_NODE for each serialized
/// SourceFileDepGraphNode. These are followed by FINGERPRINT_NODE and
/// DEPENDS_ON_DEFINITION_NODE, if the node has a fingerprint and depends-on
/// definitions, respectively.
namespace record_block {
enum {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment about what this enum is for would be nice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe the very next thing explains it?? Not sure.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On third thought, the top of this file could contain an overview of what's going on here and how it works. Or put that in a .doc file and a pointer to it at the top of this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is all standard boilerplate for the bitstream record stuff. I'm not sure it's worth explaining here.

METADATA = 1,
SOURCE_FILE_DEP_GRAPH_NODE,
FINGERPRINT_NODE,
DEPENDS_ON_DEFINITION_NODE,
IDENTIFIER_NODE,
};

// Always the first record in the file.
using MetadataLayout = BCRecordLayout<
METADATA, // ID
BCFixed<16>, // Dependency graph format major version
BCFixed<16>, // Dependency graph format minor version
BCBlob // Compiler version string
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Putting the compiler version in here. I like it!!

>;

// After the metadata record, we have zero or more identifier records,
// for each unique string that is referenced from a SourceFileDepGraphNode.
//
// Identifiers are referenced by their sequence number, starting from 1.
// The identifier value 0 is special; it always represents the empty string.
// There is no IDENTIFIER_NODE serialized that corresponds to it, instead
// the first IDENTIFIER_NODE always has a sequence number of 1.
using IdentifierNodeLayout = BCRecordLayout<
IDENTIFIER_NODE,
BCBlob
>;

using SourceFileDepGraphNodeLayout = BCRecordLayout<
SOURCE_FILE_DEP_GRAPH_NODE, // ID
// The next four fields correspond to the fields of the DependencyKey
// structure.
NodeKindField, // DependencyKey::kind
DeclAspectField, // DependencyKey::aspect
IdentifierIDField, // DependencyKey::context
IdentifierIDField, // DependencyKey::name
BCFixed<1> // Is this a "provides" node?
>;

// Follows DEPENDS_ON_DEFINITION_NODE when the SourceFileDepGraphNode has a
// fingerprint set.
using FingerprintNodeLayout = BCRecordLayout<
FINGERPRINT_NODE,
BCBlob
>;

// Follows SOURCE_FILE_DEP_GRAPH_NODE and FINGERPRINT_NODE when the
// SourceFileDepGraphNode has one or more depends-on entries.
using DependsOnDefNodeLayout = BCRecordLayout<
DEPENDS_ON_DEFINITION_NODE,
BCVBR<16> // The sequence number (starting from 0) of the referenced
// SOURCE_FILE_DEP_GRAPH_NODE
>;
}

/// Tries to read the dependency graph from the given buffer.
/// Returns true if there was an error.
bool readFineGrainedDependencyGraph(llvm::MemoryBuffer &buffer,
SourceFileDepGraph &g);

/// Tries to read the dependency graph from the given path name.
/// Returns true if there was an error.
bool readFineGrainedDependencyGraph(llvm::StringRef path,
SourceFileDepGraph &g);

/// Tries to write the dependency graph to the given path name.
/// Returns true if there was an error.
bool writeFineGrainedDependencyGraph(DiagnosticEngine &diags, llvm::StringRef path,
const SourceFileDepGraph &g);

} // namespace fine_grained_dependencies
} // namespace swift

#endif
1 change: 1 addition & 0 deletions lib/AST/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,7 @@ add_swift_host_library(swiftAST STATIC
Evaluator.cpp
Expr.cpp
FineGrainedDependencies.cpp
FineGrainedDependencyFormat.cpp
FrontendSourceFileDepGraphFactory.cpp
GenericEnvironment.cpp
GenericSignature.cpp
Expand Down
23 changes: 17 additions & 6 deletions lib/AST/FineGrainedDependencies.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -10,15 +10,14 @@
//
//===----------------------------------------------------------------------===//

#include <stdio.h>

#include "swift/AST/FineGrainedDependencies.h"

// may not all be needed
#include "swift/AST/DiagnosticEngine.h"
#include "swift/AST/DiagnosticsCommon.h"
#include "swift/AST/DiagnosticsFrontend.h"
#include "swift/AST/FileSystem.h"
#include "swift/AST/FineGrainedDependencyFormat.h"
#include "swift/Basic/FileSystem.h"
#include "swift/Basic/LLVM.h"
#include "swift/Demangling/Demangle.h"
Expand All @@ -31,6 +30,7 @@
#include "llvm/Support/Path.h"
#include "llvm/Support/YAMLParser.h"


// This file holds the definitions for the fine-grained dependency system
// that are likely to be stable as it moves away from the status quo.
// These include the graph structures common to both programs and also
Expand All @@ -53,11 +53,9 @@ Optional<SourceFileDepGraph> SourceFileDepGraph::loadFromPath(StringRef path) {
Optional<SourceFileDepGraph>
SourceFileDepGraph::loadFromBuffer(llvm::MemoryBuffer &buffer) {
SourceFileDepGraph fg;
llvm::yaml::Input yamlReader(llvm::MemoryBufferRef(buffer), nullptr);
yamlReader >> fg;
if (yamlReader.error())
if (swift::fine_grained_dependencies::readFineGrainedDependencyGraph(
buffer, fg))
return None;
// return fg; compiles for Mac but not Linux, because it cannot be copied.
return Optional<SourceFileDepGraph>(std::move(fg));
}

Expand Down Expand Up @@ -333,6 +331,19 @@ void DepGraphNode::dump(raw_ostream &os) const {
llvm::errs() << "no fingerprint";
}

void SourceFileDepGraphNode::dump() const {
dump(llvm::errs());
}

void SourceFileDepGraphNode::dump(raw_ostream &os) const {
DepGraphNode::dump(os);
os << " sequence number: " << sequenceNumber;
os << " is provides: " << isProvides;
os << " depends on:";
for (auto def : defsIDependUpon)
os << " " << def;
}

std::string DepGraphNode::humanReadableName(StringRef where) const {

return getKey().humanReadableName() +
Expand Down
Loading