Skip to content

Fix #6542: Pickle line sizes in TASTy #10363

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Nov 30, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 15 additions & 1 deletion compiler/src/dotty/tools/dotc/core/tasty/PositionPickler.scala
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,21 @@ class PositionPickler(
(addrDelta << 3) | (toInt(hasStartDelta) << 2) | (toInt(hasEndDelta) << 1) | toInt(hasPoint)
}

def picklePositions(roots: List[Tree], warnings: mutable.ListBuffer[String]): Unit = {
def picklePositions(source: SourceFile, roots: List[Tree], warnings: mutable.ListBuffer[String]): Unit = {
/** Pickle the number of lines followed by the length of each line */
def pickleLineOffsets(): Unit = {
val content = source.content()
buf.writeNat(content.count(_ == '\n') + 1) // number of lines
var lastIndex = content.indexOf('\n', 0)
buf.writeNat(lastIndex) // size of first line
while lastIndex != -1 do
val nextIndex = content.indexOf('\n', lastIndex + 1)
val end = if nextIndex != -1 then nextIndex else content.length
buf.writeNat(end - lastIndex - 1) // size of the next line
lastIndex = nextIndex
}
pickleLineOffsets()

var lastIndex = 0
var lastSpan = Span(0, 0)
def pickleDeltas(index: Int, span: Span) = {
Expand Down
13 changes: 13 additions & 0 deletions compiler/src/dotty/tools/dotc/core/tasty/PositionUnpickler.scala
Original file line number Diff line number Diff line change
Expand Up @@ -15,12 +15,20 @@ import Names.TermName
class PositionUnpickler(reader: TastyReader, nameAtRef: NameRef => TermName) {
import reader._

private var myLineSizes: Array[Int] = _
private var mySpans: util.HashMap[Addr, Span] = _
private var mySourcePaths: util.HashMap[Addr, String] = _
private var isDefined = false

def ensureDefined(): Unit = {
if (!isDefined) {
val lines = readNat()
myLineSizes = new Array[Int](lines)
var i = 0
while i < lines do
myLineSizes(i) += readNat()
i += 1

mySpans = util.HashMap[Addr, Span]()
mySourcePaths = util.HashMap[Addr, String]()
var curIndex = 0
Expand Down Expand Up @@ -60,6 +68,11 @@ class PositionUnpickler(reader: TastyReader, nameAtRef: NameRef => TermName) {
mySourcePaths
}

private[tasty] def lineSizes: Array[Int] = {
ensureDefined()
myLineSizes
}

def spanAt(addr: Addr): Span = spans.getOrElse(addr, NoSpan)
def sourcePathAt(addr: Addr): String = sourcePaths.getOrElse(addr, "")
}
9 changes: 7 additions & 2 deletions compiler/src/dotty/tools/dotc/core/tasty/TastyPrinter.scala
Original file line number Diff line number Diff line change
Expand Up @@ -139,9 +139,14 @@ class TastyPrinter(bytes: Array[Byte]) {
private val sb: StringBuilder = new StringBuilder

def unpickle(reader: TastyReader, tastyName: NameTable): String = {
val posUnpickler = new PositionUnpickler(reader, tastyName)
sb.append(s" ${reader.endAddr.index - reader.currentAddr.index}")
val spans = new PositionUnpickler(reader, tastyName).spans
sb.append(s" position bytes:\n")
sb.append(" position bytes:\n")
val lineSizes = posUnpickler.lineSizes
sb.append(s" lines: ${lineSizes.length}\n")
sb.append(posUnpickler.lineSizes.mkString(" line sizes: ", ", ", "\n"))
sb.append(" positions:\n")
val spans = posUnpickler.spans
val sorted = spans.toSeq.sortBy(_._1.index)
for ((addr, pos) <- sorted) {
sb.append(treeStr("%10d".format(addr.index)))
Expand Down
7 changes: 6 additions & 1 deletion compiler/src/dotty/tools/dotc/core/tasty/TreeUnpickler.scala
Original file line number Diff line number Diff line change
Expand Up @@ -1363,8 +1363,13 @@ class TreeUnpickler(reader: TastyReader,
def sourceChangeContext(addr: Addr = currentAddr)(using Context): Context = {
val path = sourcePathAt(addr)
if (path.nonEmpty) {
val sourceFile = ctx.getSource(path)
posUnpicklerOpt match
case Some(posUnpickler) =>
sourceFile.setLineIndicesFromLineSizes(posUnpickler.lineSizes)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a potential issue here: the relative path of a file is not a unique identifier of the file, there might be path conflicts in the Scala ecosystem.

If we also store the hash of source files, that would be better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would you use that hash?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe ctx.getSource(path) can take hash as an argument? I haven't thought through the details.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash would be on the contents of the source? Then we would need to read all the sources eagerly when we add them to the context. That may be quite expensive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we compile files, we already have the file contents in memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, it is only loaded when we first access something the depends on the source https://github.com/lampepfl/dotty/blob/master/compiler/src/dotty/tools/dotc/util/SourceFile.scala#L43.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we compile Scala files, the contents for the Scala files will be forced.

case _ =>
pickling.println(i"source change at $addr: $path")
ctx.withSource(ctx.getSource(path))
ctx.withSource(sourceFile)
}
else ctx
}
Expand Down
2 changes: 1 addition & 1 deletion compiler/src/dotty/tools/dotc/quoted/PickledQuotes.scala
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,7 @@ object PickledQuotes {
if tree.span.exists then
val positionWarnings = new mutable.ListBuffer[String]()
new PositionPickler(pickler, treePkl.buf.addrOfTree, treePkl.treeAnnots)
.picklePositions(tree :: Nil, positionWarnings)
.picklePositions(ctx.compilationUnit.source, tree :: Nil, positionWarnings)
positionWarnings.foreach(report.warning(_))

val pickled = pickler.assembleParts()
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -147,7 +147,7 @@ trait MessageRendering {
val sb = mutable.StringBuilder()
val posString = posStr(pos, diagnosticLevel, msg)
if (posString.nonEmpty) sb.append(posString).append(EOL)
if (pos.exists && pos.source.file.exists) {
if (pos.exists) {
val pos1 = pos.nonInlined
val (srcBefore, srcAfter, offset) = sourceLines(pos1, diagnosticLevel)
val marker = columnMarker(pos1, offset, diagnosticLevel)
Expand Down
2 changes: 1 addition & 1 deletion compiler/src/dotty/tools/dotc/transform/Pickler.scala
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@ class Pickler extends Phase {
treePkl.compactify()
if tree.span.exists then
new PositionPickler(pickler, treePkl.buf.addrOfTree, treePkl.treeAnnots)
.picklePositions(tree :: Nil, positionWarnings)
.picklePositions(unit.source, tree :: Nil, positionWarnings)

if !ctx.settings.YdropComments.value then
new CommentPickler(pickler, treePkl.buf.addrOfTree, treePkl.docString)
Expand Down
24 changes: 21 additions & 3 deletions compiler/src/dotty/tools/dotc/util/SourceFile.scala
Original file line number Diff line number Diff line change
Expand Up @@ -82,7 +82,9 @@ class SourceFile(val file: AbstractFile, computeContent: => Array[Char]) extends

def apply(idx: Int): Char = content().apply(idx)

def length: Int = content().length
def length: Int =
if lineIndicesCache ne null then lineIndicesCache.last
else content().length

/** true for all source files except `NoSource` */
def exists: Boolean = true
Expand All @@ -105,7 +107,8 @@ class SourceFile(val file: AbstractFile, computeContent: => Array[Char]) extends
def positionInUltimateSource(position: SourcePosition): SourcePosition =
SourcePosition(underlying, position.span shift start)

private def calculateLineIndices(cs: Array[Char]) = {
private def calculateLineIndicesFromContents() = {
val cs = content()
val buf = new ArrayBuffer[Int]
buf += 0
var i = 0
Expand All @@ -120,7 +123,22 @@ class SourceFile(val file: AbstractFile, computeContent: => Array[Char]) extends
buf += cs.length // sentinel, so that findLine below works smoother
buf.toArray
}
private lazy val lineIndices: Array[Int] = calculateLineIndices(content())

private var lineIndicesCache: Array[Int] = _
private def lineIndices: Array[Int] =
if lineIndicesCache eq null then
lineIndicesCache = calculateLineIndicesFromContents()
lineIndicesCache
def setLineIndicesFromLineSizes(sizes: Array[Int]): Unit =
val lines = sizes.length
val indices = new Array[Int](lines + 1)
var i = 0
val penultimate = lines - 1
while i < penultimate do
indices(i + 1) = indices(i) + sizes(i) + 1 // `+1` for the '\n' at the end of the line
i += 1
indices(lines) = indices(penultimate) + sizes(penultimate) // last line does not end with '\n'
lineIndicesCache = indices

/** Map line to offset of first character in line */
def lineToOffset(index: Int): Int = lineIndices(index)
Expand Down
12 changes: 6 additions & 6 deletions compiler/src/dotty/tools/dotc/util/SourcePosition.scala
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ extends SrcPos, interfaces.SourcePosition, Showable {

def point: Int = span.point

def line: Int = if (source.length != 0) source.offsetToLine(point) else -1
def line: Int = source.offsetToLine(point)

/** Extracts the lines from the underlying source file as `Array[Char]`*/
def linesSlice: Array[Char] =
Expand All @@ -45,16 +45,16 @@ extends SrcPos, interfaces.SourcePosition, Showable {
def beforeAndAfterPoint: (List[Int], List[Int]) =
lineOffsets.partition(_ <= point)

def column: Int = if (source.content().length != 0) source.column(point) else -1
def column: Int = source.column(point)

def start: Int = span.start
def startLine: Int = if (source.content().length != 0) source.offsetToLine(start) else -1
def startColumn: Int = if (source.content().length != 0) source.column(start) else -1
def startLine: Int = source.offsetToLine(start)
def startColumn: Int = source.column(start)
def startColumnPadding: String = source.startColumnPadding(start)

def end: Int = span.end
def endLine: Int = if (source.content().length != 0) source.offsetToLine(end) else -1
def endColumn: Int = if (source.content().length != 0) source.column(end) else -1
def endLine: Int = source.offsetToLine(end)
def endColumn: Int = source.column(end)

def withOuter(outer: SourcePosition): SourcePosition = SourcePosition(source, span, outer)
def withSpan(range: Span) = SourcePosition(source, range, outer)
Expand Down
1 change: 1 addition & 0 deletions project/scripts/cmdTests
Original file line number Diff line number Diff line change
Expand Up @@ -52,6 +52,7 @@ cp tests/neg/i6371/B_2.scala $OUT/B.scala
rm $OUT/A.scala
"$SBT" "scalac -classpath $OUT1 -d $OUT1 $OUT/B.scala" > "$tmp" 2>&1 || echo "ok"
grep -qe "B.scala:2:7" "$tmp"
grep -qe "This location contains code that was inlined from A.scala:3" "$tmp"

echo "testing -Ythrough-tasty"
clear_out "$OUT"
Expand Down
13 changes: 9 additions & 4 deletions tasty/src/dotty/tools/tasty/TastyFormat.scala
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ Macro-format:
Note: Unqualified names in the name table are strings. The context decides whether a name is
a type-name or a term-name. The same string can represent both.


Standard-Section: "ASTs" TopLevelStat*

TopLevelStat = PACKAGE Length Path TopLevelStat* -- package path { topLevelStats }
Expand Down Expand Up @@ -226,7 +227,10 @@ Note: Tree tags are grouped into 5 categories that determine what follows, and t
Category 4 (tags 110-127): tag Nat AST
Category 5 (tags 128-255): tag Length <payload>

Standard-Section: "Positions" Assoc*

Standard-Section: "Positions" LinesSizes Assoc*

LinesSizes = Nat Nat* // Number of lines followed by the size of each line not counting the trailing `\n`

Assoc = Header offset_Delta? offset_Delta? point_Delta?
| SOURCE nameref_Int
Expand All @@ -244,7 +248,8 @@ Standard-Section: "Positions" Assoc*

All elements of a position section are serialized as Ints

Standard-Section: "Comments" Comment*

Standard Section: "Comments" Comment*

Comment = Length Bytes LongInt // Raw comment's bytes encoded as UTF-8, followed by the comment's coordinates.

Expand All @@ -254,8 +259,8 @@ Standard-Section: "Comments" Comment*
object TastyFormat {

final val header: Array[Int] = Array(0x5C, 0xA1, 0xAB, 0x1F)
val MajorVersion: Int = 25
val MinorVersion: Int = 1
val MajorVersion: Int = 26
val MinorVersion: Int = 0

final val ASTsSection = "ASTs"
final val PositionsSection = "Positions"
Expand Down