Experimental Type Inferencer -- My misinterpretations of life -- Michael Lucas-Smith

2009-03-21

RoelTyper lives in the public store. It's aging a little now, and I'm not sure it still loads cleanly into the system. It was a very interesting experiment by Roel that uses the bytecode to analyze the instance variables in the system and infer their types. One of the biggest problems with type inferencers is that they're usually very slow. Roel's approach was not slow, but it only analyzed instance variables.

Because it only analyzed instance variables, it had some limitations: you couldn't tell what the return type of a method was, or the types of method arguments and temporary variables. Many instance variables also remained unknown, because only messages sent directly to the instance variables were analyzed.

So I wondered: what if we used the same technique but expanded the scope to record messages sent to the results of other messages, the last message sent in a method, and so on? I ended up with three pieces of information: types, constraints and equivalences.

Types are obvious: if you assign a constant to a variable or return a constant, then we know the class immediately. Constraints are interesting because they let us infer a type by finding the classes that implement all of the messages sent to a "thing", and by thing I mean another selector or a variable. Equivalences happen when a variable is assigned something: we now know the variable has the same type as that something else. The same goes for returning from a method: if a variable is returned, the method's type is equivalent to the variable's type, and likewise with selectors: the type of the last selector sent in a method is equivalent to the method's type.
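To make the three kinds of facts concrete, here is a minimal sketch in Python rather than Smalltalk, so it can run on its own. It is not the inferencer's actual code: the class table, the `Foo` methods, the function names and the rule that a known type wins over constraints are all assumptions made for illustration.

```python
# Toy class table: class name -> selectors it implements. This is hypothetical
# data standing in for the method dictionaries of a real image.
CLASS_TABLE = {
    "UIBuilder": {"window", "source", "add:"},
    "ApplicationWindow": {"label", "open", "close"},
    "OrderedCollection": {"add:", "do:", "size"},
    "Set": {"add:", "do:", "size"},
}


def classes_implementing(selectors):
    """Constraint: every class that implements all selectors sent to a thing."""
    wanted = set(selectors)
    return {name for name, implemented in CLASS_TABLE.items() if wanted <= implemented}


def infer(thing, types, constraints, equivalences, seen=None):
    """Return the set of candidate class names for a variable, selector or method."""
    seen = set() if seen is None else seen
    if thing in seen:  # equivalences can form cycles, so visit each thing once
        return set()
    seen.add(thing)
    if thing in types:  # assumption: a directly known type wins outright
        return {types[thing]}
    candidates = set()
    if thing in constraints:
        candidates |= classes_implementing(constraints[thing])
    for other in equivalences.get(thing, ()):
        candidates |= infer(other, types, constraints, equivalences, seen)
    return candidates


# Hypothetical facts recorded for an imaginary class Foo.
types = {"count": "SmallInteger"}               # count := 0
constraints = {"items": {"add:", "do:"}}        # items add: x. items do: [...]
equivalences = {
    "Foo>>items": ["items"],                    # ^items
    "Foo>>count": ["count"],                    # ^count
}

print(sorted(infer("Foo>>items", types, constraints, equivalences)))
print(sorted(infer("Foo>>count", types, constraints, equivalences)))
```

Here `items` is never assigned a constant, so its candidates come only from the constraint, and the method `Foo>>items` picks them up through the equivalence created by returning the variable.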

The inferencer is still very fast and is already giving me back some surprising results. For instance, if we look at the method Integer>>* we can infer that the type of the argument aNumber is ArithmeticValue; however, it currently cannot figure out the return type of that method.

Another very interesting case is ApplicationModel>>mainWindow, and here's the code:

mainWindow
    ^builder isNil
        ifFalse: [builder window]
        ifTrue: [builder]

This is a remarkably pathological case. What programmer wouldn't simply return nil, given we know that builder is nil in the true branch? And what happens if builder suddenly isn't nil between the isNil and the ifFalse:ifTrue: call? Well, you'll probably have the method return something other than a Window.

When my inferencer ran into this particular method, I expected it would say the answer is ApplicationWindow and that's it. But it wasn't to be fooled: it recognized that the return type of this method is the type of things that respond to #window -or- the type of the instance variable builder. The inferencer correctly returned: ApplicationWindow, UIBuilder.
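The two-class answer falls out naturally if each branch's last expression contributes an equivalence to the method's return type. The following is a rough Python sketch of that idea, not the inferencer's real data model: the fact tables, the string keys and the `return_types` helper are hypothetical, and it simply assumes that #window has already been resolved to ApplicationWindow and builder to UIBuilder.

```python
# Hypothetical facts an analysis pass might have recorded by this point.
types = {
    "#window": {"ApplicationWindow"},   # assumed result type of sending #window
    "builder": {"UIBuilder"},           # assumed type of the instance variable
}
# One equivalence per branch: [builder window] and [builder].
equivalences = {
    "ApplicationModel>>mainWindow": ["#window", "builder"],
}


def return_types(method):
    """Union the types of everything the method's return is equivalent to."""
    found = set()
    pending = [method]
    visited = set()
    while pending:
        thing = pending.pop()
        if thing in visited:  # guard against cyclic equivalences
            continue
        visited.add(thing)
        found |= types.get(thing, set())
        pending.extend(equivalences.get(thing, ()))
    return found


print(sorted(return_types("ApplicationModel>>mainWindow")))
```

Because the sketch never looks at the isNil test, it cannot tell that the second branch only runs when builder is nil, which is why UIBuilder shows up alongside ApplicationWindow.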

I was surprised and confused until I analyzed why. I think the next thing I need to do is make it able to explain why a type has been inferred the way it was, sort of like the New Prerequisite Engine does.

My ultimate goal here is to integrate it into the next iteration of Searchlight, so that finding senders/implementors takes you as directly to the real method as possible.