Chen
We are currently working on translating the binary operators in Swift to their corresponding representations in WALA. Attached is the SIL output we obtained from compiling this Swift code:
var a = true && false
We understand that && and || are special binary operators because they short-circuit. However, the output is very confusing and goes well beyond what we would expect from a simple implementation of short-circuiting. Our current understanding is that:
- The TryApplyInst calls Swift.Bool.|| infix(). This function takes a boolean value as input. When the boolean is true, it does nothing. When the boolean is false, it throws an exception. Essentially, it "guesses" whether the input can be the result of ||.
- TryApplyInst jumps to different basic blocks depending on whether or not an exception is thrown.
We are not sure about:
- Why and how closures are involved
- What "@error Error" means
- Why and how ThinToThickFunctionInst is involved
- Why TryApplyInst jumps to UnreachableInst when an exception is thrown
- Why more than one SILFunction is involved
At your convenience, we think it will be most helpful if you can explain all the SIL instructions in the attached file line by line, in plain English.
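For reference, the Swift standard library declares the operator as `static func && (lhs: Bool, rhs: @autoclosure () throws -> Bool) rethrows -> Bool`, so the right operand reaches the operator as a closure rather than as a value. That would probably account for the closure, for the extra SILFunction (the closure body), and for the "@error Error" in the callee type. The sketch below models the same laziness in C++. `logicalAnd` and `logicalOr` are hypothetical names used only for illustration; they are not taken from the SIL output.

```cpp
#include <functional>

// Hypothetical model of a short-circuiting operator: the right operand is
// passed as a closure and is invoked only when its value is needed.
bool logicalAnd(bool lhs, const std::function<bool()>& rhs) {
    if (!lhs) {
        return false;  // short circuit: rhs is never evaluated
    }
    return rhs();
}

bool logicalOr(bool lhs, const std::function<bool()>& rhs) {
    if (lhs) {
        return true;   // short circuit: rhs is never evaluated
    }
    return rhs();
}
```

For `var a = true && false`, the left operand is `true`, so under this model the closure wrapping `false` is invoked exactly once.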
- I worked on separating built-in unary/binary operators from regular function calls.
- In an ApplyInst instance, the "arguments" can sometimes be confusing. For example, in the following code:
var a = ~2
The unary operator ~ (sometimes? not always?) has three arguments: a GlobalAddrInst, an integer, and a metatype. I am not sure why GlobalAddrInst is passed to this function application.
- I handled the difficulty discussed above as follows:
- In a unary operation, use only the second-to-last argument
- In a binary operation, use only the third-to-last and second-to-last arguments
- I'm going to discuss this with Miss Wu.
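The workaround above can be sketched as follows. `selectOperands` is a hypothetical helper, and the assumption that the last argument is always the metatype is based only on the examples seen so far.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the `arity` arguments that sit immediately
// before the last argument, which is assumed to be the metatype.
template <typename Arg>
std::vector<Arg> selectOperands(const std::vector<Arg>& args, std::size_t arity) {
    assert(arity >= 1 && args.size() >= arity + 1);
    const auto metatype = args.end() - 1;
    const auto first = metatype - static_cast<std::ptrdiff_t>(arity);
    return std::vector<Arg>(first, metatype);
}
```

With the three arguments reported for `~2` (a GlobalAddrInst, an integer, and a metatype), an arity of 1 keeps only the integer.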
- I finished roughly half of the ApplyInst handling. A CAst can be built when the call is to a "real" function and all the argument CAst nodes have already been created. However, when the call is actually to a built-in unary/binary operator, I cannot yet distinguish it from a regular function call.
- I took a quick look at (global) variable declaration, definition, and usage.
- I worked on if statements.
- It is not easy to separate calls to built-in unary/binary operators from calls to regular functions.
- The swift ASTNode appears to have some information about whether the node is a built-in unary/binary operator (in class Decl). I probably have to somehow retrieve a Decl* from a SILFunction*. (I can do this now with Jeff's help)
- Global variables are declared and defined using 3 instructions: AllocGlobalInst, GlobalAddrInst, StoreInst. They are used via a couple of "Access Instructions".
- SILFunction is a chunk of SIL code that can be called. It is also a container of SILBasicBlocks. A SILBasicBlock is a container of SILInstructions.
- I'm going to continue working on separating built-in binary/unary operators from regular function calls
- I may become less productive this upcoming week, as I have some course projects to do. I will see what else I can do after finishing my course projects.
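The containment described above (a function holds basic blocks, and a basic block holds instructions) can be sketched with stand-in types. `Function`, `BasicBlock`, `Instruction`, and `collectKinds` below are hypothetical models written for illustration; they are not the Swift compiler classes.

```cpp
#include <string>
#include <vector>

// Stand-in types (not the real Swift compiler classes): a function holds
// basic blocks, and a basic block holds instructions.
struct Instruction { std::string kind; };
struct BasicBlock { std::vector<Instruction> instructions; };
struct Function { std::vector<BasicBlock> blocks; };

// Visits every instruction in order and collects its kind name.
std::vector<std::string> collectKinds(const Function& fn) {
    std::vector<std::string> kinds;
    for (const BasicBlock& block : fn.blocks) {
        for (const Instruction& inst : block.instructions) {
            kinds.push_back(inst.kind);
        }
    }
    return kinds;
}
```

For a global variable, a walk like this would be expected to meet AllocGlobalInst, GlobalAddrInst, and StoreInst in that order.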
- I moved the giant switch statement into a separate class: InstrKindInfoGetter. The CMakeLists.txt files were updated to reflect this change.
- I created a hash map to store the CAst nodes created in WALAWalker. The key is a pointer to a SILInstruction, and the value is a reference to a CAst node.
- In ApplyInst, I figured out how to fetch the original SILInstruction for each argument. This allows me to look up the hash map I created when I am building the apply function node.
- It was the first time that I had modified a CMakeLists.txt file. When I first created a new class, I did not know I had to update CMakeLists.txt, so the compilation initially failed.
- I spent most of the time deciding the type of the key in my hash map. The key must be accessible from a SILInstruction instance, which is an extremely complex class. I spent a lot of time figuring out how SILInstruction, SILValue, ValueBase, etc. relate to one another.
- I understand now how to modify CMakeLists.txt when I want to create a new source file or delete an existing one.
- For an ApplyInst instance, I can obtain its arguments through castInst->getArgument(i). The return value is of type SILValue, which is a wrapper around a ValueBase pointer. I can get the underlying ValueBase pointer through SILValue::getOpaqueValue(). The class SILInstruction inherits from the class ValueBase. It turns out that from castInst->getArgument(i).getOpaqueValue(), I can obtain the original SILInstruction that creates this argument.
- I realized (actually two weeks ago) that I was progressing very slowly in this project. This is partly because I always have a lot of other distractions: sometimes the GRE, sometimes the TOEFL, sometimes job interviews, and sometimes assignments and exams. This is pretty bad. Lydia and I decided to set aside fixed times each week to work together: Tuesday after the meeting, and Saturday afternoon.
- Next week I am going to finish the ApplyInst. Lydia and I are still waiting for the code for generating integer literal CAst nodes. They are required since most functions we encounter so far take integers as (part of) input.
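The hash map and the argument lookup described above can be sketched as follows. `CAstNode`, `NodeMap`, and `collectArgumentNodes` are hypothetical stand-ins: the real value type stored in WALAWalker is not shown in these notes, and the `const void*` key is only meant to mirror the opaque pointer obtained from a SILValue.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for a CAst node; the real value type is an assumption here.
struct CAstNode { std::string label; };

// Maps the opaque pointer of the instruction that produced a value to the
// CAst node built for that instruction.
class NodeMap {
public:
    void record(const void* producer, CAstNode node) {
        nodes_[producer] = std::move(node);
    }
    // Returns nullptr when no node has been recorded for this producer.
    const CAstNode* find(const void* producer) const {
        auto it = nodes_.find(producer);
        return it == nodes_.end() ? nullptr : &it->second;
    }
private:
    std::unordered_map<const void*, CAstNode> nodes_;
};

// Collects the nodes for every argument of a call. Returns false and leaves
// `out` empty when any argument has no node yet.
bool collectArgumentNodes(const NodeMap& map,
                          const std::vector<const void*>& argumentProducers,
                          std::vector<CAstNode>& out) {
    out.clear();
    for (const void* producer : argumentProducers) {
        const CAstNode* node = map.find(producer);
        if (node == nullptr) {
            out.clear();
            return false;
        }
        out.push_back(*node);
    }
    return true;
}
```

The false return corresponds to the case mentioned above where an argument node (for example an integer literal) has not been generated yet, so the apply node cannot be built.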
I don't want to lie here... I did not do anything related to our project in the past week.
I apologize for my low productivity last week. I had a GRE test on Oct 1 (Sunday), so I spent most of my time preparing for it. Essentially, what I did was:
- I wrote some new tests for the string literals. The results are good. The existing code passes these tests perfectly.
- I made some attempts at supporting string operations. When I compile the following Swift code:
var str = "a" + "b"
three ApplyInst instructions are generated: two calls to string constructors and one call to the "+" function. When I compile the following code:
func appendA(str: String) -> String {
    return str + "A"
}
var str = "a" + "b"
var appendedStr = appendA(str: str)
I get an additional ApplyInst for my user-defined function, which also takes a string as input. I can trace the ApplyInst in "case ValueKind::ApplyInst" in the giant switch statement. However, I have trouble obtaining useful information from that instruction object. (Please see my second and third points under "Difficulties")
- In the Swift compiler, a string literal can have the encoding "UTF16". As I understand it, a plain-text source file uses either UTF-8 or UTF-16 for its entire content. However, the Swift compiler seems to have trouble with UTF-16 encoded source files (even the official release). Meanwhile, it treats East Asian characters in a UTF-8 encoded file as UTF-16 characters.
- In Swift, all string operations are translated into function calls. However, in WALA, some of them are treated as built-in statements. Where can I find a list of all built-in statements in WALA?
- How can I tell from an ApplyInst object whether the callee is a built-in WALA statement? An object for string concatenation looks like this, where %10 and %16 are string constructors, and arguments #0 and #1 (%11 and %17) are string objects:
<< ApplyInst >>
[ARG] #0: %11 = apply %10(%6, %7, %8, %9) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %18
[ARG] #1: %17 = apply %16(%12, %13, %14, %15) : $@convention(method) (Builtin.RawPointer, Builtin.Word, Builtin.Int1, @thin String.Type) -> @owned String // user: %18
[ARG] #2: %5 = metatype $@thin String.Type // user: %18
- In "case ValueKind::ApplyInst", I believe we only need to worry about built-in WALA statements now. Everything else should be translated into WALA as regular function calls. This can be done without understanding the types of the arguments, return value, and functionality.
- I will try to look into WALA and find out exactly which statements are built in.
- Once I find the list for the built-in statements, I will continue working on those related to strings.
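One possible shape for the split between built-in statements and regular calls is a lookup table keyed by the callee's operator spelling. This is only a sketch under assumptions: `CallKind` and `classifyCallee` are hypothetical names, the table entries are placeholders rather than the actual list of WALA built-ins, and how to obtain the callee name from an ApplyInst is not covered here.

```cpp
#include <string>
#include <unordered_map>

enum class CallKind { BuiltinUnary, BuiltinBinary, Regular };

// Hypothetical classifier: anything not in the table is treated as a regular
// function call. The entries are placeholders, not WALA's actual list.
CallKind classifyCallee(const std::string& calleeName) {
    static const std::unordered_map<std::string, CallKind> table = {
        {"~", CallKind::BuiltinUnary},
        {"+", CallKind::BuiltinBinary},
        {"&&", CallKind::BuiltinBinary},
        {"||", CallKind::BuiltinBinary},
    };
    auto it = table.find(calleeName);
    return it == table.end() ? CallKind::Regular : it->second;
}
```

Under this scheme a user-defined function such as appendA falls through to the regular-call path without any knowledge of its argument or return types.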