Eric Lippert’s name is synonymous with C#. Having been a Principal Developer at Microsoft on the C# compiler team and a member of the C# language design team, he now works on C# analysis at Coverity.
If you know C# then the name Eric Lippert will be synonymous with clear explanations of difficult ideas and insights into the way languages work and are put together.
Here we host a summary of the highlights of the interview, ranging over topics as diverse as the future of C#, asynchrony versus parallelism, Visual Basic and more (the link to the full interview on i-programmer can be found at the end of this page). Read on, because you will surely find something to interest you about C#, languages in general or just where things are heading.
NV : So Eric, after so many years at Microsoft you began a new career at Coverity. Was the ‘context switch’ easy?
EL : Yes and no. Some parts of it were very easy and some took some getting used to.
For example, re-learning how to use Unix-based development tools, which I had not touched since the early 1990s, took me a while. Git is very different than Team Foundation Studio. And so on. But some things were quite straightforward.
Coverity’s attitude towards static analysis is very similar to the attitude that the C# compiler team has about compiler warnings, for instance. Though of course the conditions that Coverity is checking for are by their nature much more complicated than the heuristics that the C# compiler uses for warnings.
Switching to taking a bus downtown every day instead of taking a bus to Redmond every day was probably the easiest part!
NV: I guess that from now on you’ll be working in the field of static analysis. What exactly does static analysis do?
EL: Static analysis is a very broad field in both industry and academia. So let me first start very wide, and then narrow that down to what we do at Coverity.
Static analysis is analysis of programs based solely on their source code or, if the source code is not available, their compiled binary form. That is in contrast with dynamic analysis, which analyses program behavior by watching the program run. So a profiler would be an example of dynamic analysis; it looks at the program as it is running and discovers facts about its performance, say.
Any analysis you perform just by looking at the source code is static analysis. So for example, compiler errors are static analysis; the error was determined by looking at the source code.
So now let’s get a bit more focused. There are lots of reasons to perform static analysis, but the one we are focused on is the discovery of program defects. That is still very broad. Consider a defect such as “this public method violates the Microsoft naming guidelines”. That’s certainly a defect. You might not consider that a particularly interesting or important defect, but it’s a defect.
Coverity is interested in discovering a very particular class of defect.
That is, defects that would result in a bug that could realistically affect a user of the software. We’re looking for genuine “I’m-glad-we-found-that-before-we-shipped-and-a-customer-lost-all-their-work” sort of bugs, not something like a badly named method that the customer is never going to notice.
NV: Do code contracts play a role, and will the introduction of Roslyn affect the field of static analysis?
EL: Let me split that up into two questions. First, code contracts.
So as you surely know, code contracts are annotations that you can optionally put into your C# source code that allow you to express the pre-condition and post-condition invariants about your code. So then the question is, how do these contracts affect the static analysis that Coverity does? We have some support for understanding code contracts, but we could do better and one of my goals is to do some research on this for future versions.
One of the hard things about static analysis is that the number of possible program states and the number of possible code paths through the program are extremely large, which can make analysis very time consuming. So one of the things we do is try to eliminate false paths, that is, code paths that we believe are logically impossible and therefore do not have to be checked for defects. We can use code contracts to help us prune false paths.
A simple example would be if a contract says that a precondition of the method is that the first argument is a non-null string, and that argument is passed to another method, and the second method checks the argument to see if it is null. We can know that on that path – that is, via the first method – the path where the null check says “yes it is null” is a false path. We can then prune that false path and not consider it further. This has two main effects. The first is, as I said before, we get a significant performance gain by pruning away as many false paths as possible. Second, a false positive is when the tool reports a defect but does so incorrectly. Eliminating false paths greatly decreases the number of false positives. So we do some fairly basic consumption of information from code contracts, but we could likely do even more.
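To make the false-path example concrete, here is a minimal sketch of the code shape being described. It is written in Python purely for illustration: the interview concerns C# code contracts, the names `first` and `second` are hypothetical, and an explicit check stands in for a real contract annotation. It shows only the code under analysis, not how Coverity’s analyzer works.

```python
def second(name):
    """Defensively checks its argument, like the second method in the example."""
    if name is None:
        # Reached via first(), this branch is the "false path": the
        # precondition in first() has already ruled out None.
        return -1
    return len(name)


def first(name):
    """Hypothetical method whose precondition is a non-null string."""
    # Stand-in for a code contract precondition (an assumption of this sketch).
    if not isinstance(name, str):
        raise ValueError("precondition violated: name must be a non-null string")
    return second(name)


if __name__ == "__main__":
    print(first("toast"))
```

An analyzer that trusts the precondition can skip the `name is None` branch whenever `second` is reached through `first`, while still checking that branch for callers that make no such guarantee.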
Now to address your second question, about Roslyn. Let me first answer the question very broadly. Throughout the industry, will Roslyn affect static analysis of C#? Absolutely yes, that is its reason for existing.
When I was at Microsoft I saw so many people write their own little C# parsers or IDEs or little mini compilers or whatever, for their own purposes. That’s very difficult, it’s time-consuming, it’s expensive, and it’s almost impossible to do right. Roslyn changes all that by giving everyone a library of analysis tools for C# and VB which is correct, very fast, and designed specifically to make tool builders’ lives better.
I am very excited that it is almost done! I worked on it for many years and can’t wait to get my hands on the release version.
More specifically, will Roslyn affect static analysis at Coverity? We very much hope so. We work closely with my former colleagues on the Roslyn team. The exact implementation details of the Coverity C# static analyzer are of course not super-interesting to customers, so long as it works. And the exact date Roslyn will be available is not announced.
So any speculation as to when there will be a Coverity static analyzer that uses Roslyn as its front end is just that — speculative. Suffice to say that we’re actively looking into the possibility.
EL: Some of those more than others.
Let me start by taking a step back and reiterating what Roslyn is, and is not. Roslyn is a class library usable from C#, VB or other managed languages. Its purpose is to enable analysis of C# and VB code. The plan is for future versions of the C# and VB compilers and IDEs in Visual Studio to themselves use Roslyn.
So typical tasks you could perform with Roslyn would be things like:
- “Find all usages of a particular method in this source code”
- “Take this source code and give me the lexical and grammatical analysis”
- “Tell me all the places this variable is written to inside this block”
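
For a feel of what such a query looks like in code, here is a small sketch of the first task. Roslyn is a .NET library and its API is not shown here; as a stand-in, the sketch uses Python’s standard `ast` module, and the sample source and the `find_calls` helper are made up for illustration. The sketch is purely syntactic: it matches calls by name, whereas a real tool would also need to work out which method each name actually refers to.

```python
import ast

# Made-up sample program to analyze; it is parsed, never executed.
SOURCE = """
def save(doc):
    return doc

def main():
    save("a")
    print(save("b"))
"""


def find_calls(source, function_name):
    """Return the (line, column) of every direct call to `function_name`."""
    tree = ast.parse(source)  # lexical and grammatical analysis of the source
    hits = []
    for node in ast.walk(tree):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == function_name):
            hits.append((node.lineno, node.col_offset))
    return sorted(hits)


if __name__ == "__main__":
    for line, column in find_calls(SOURCE, "save"):
        print(f"save() called at line {line}, column {column}")
```

Because the program is only parsed and never run, this is static analysis in the sense described earlier in the interview.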
Let me quickly say what it is not. It is not a mechanism for customers to themselves extend the C# or VB languages; it is a mechanism for analyzing the existing languages. Roslyn will make it easier for Microsoft to extend the C# and VB languages, because its architecture has been designed with that in mind. But it was not designed as an extensibility service for the language itself.
You mentioned a REPL. That is a Read-Eval-Print Loop, which is the classic way you interface with languages like Scheme. Since the Roslyn team was going to be re-architecting the compiler anyway they put in some features that would make it easier to develop REPL-like functionality in Visual Studio. Having left the team, I don’t know what the status is of that particular feature, so I probably ought not to comment on it further.
One of the principal scenarios that Roslyn was designed for is to make it much easier for third parties to develop refactorings. You’ve probably seen in Visual Studio that there is a refactoring menu and you can do things like “extract this code to a method” and so on.
Any of those refactorings, and a lot more, could be built using Roslyn.
There is to my knowledge no plan for that sort of very dynamic feature in C#. However, there are things you can do to solve the simpler problem of generating fresh code at runtime. The CLR of course already has Reflection Emit. At a higher level, C# 3.0 added expression trees. Expression trees allow you to build a tree representing a C# or VB expression at runtime, and then compile that expression into a little method. The IL is generated for you automatically.
If you are analysing source code with Roslyn then there is I believe a facility for asking Roslyn “suppose I inserted this source code at this point in this program — how would you analyze the new code?”
And if at runtime you started up Roslyn and said “here’s a bunch of source code, can you give me a compiled assembly?” then of course Roslyn could do that. If someone wanted to build a little expression evaluator that used Roslyn as a lightweight code generator, I think that would be possible, but I’ve never tried it.
It seems like a good experiment. Maybe I’ll try to do that.
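
The idea of building an expression as a tree at runtime and then compiling it into a small method can be sketched as follows. This is not the C# expression tree API: it is a loose analogue using Python’s standard `ast` module, and the `build_adder` helper is hypothetical. The point is only the shape of the technique, in which the tree is assembled node by node and the code generation is left to the compiler.

```python
import ast


def build_adder(constant):
    """Build the tree for `lambda x: x + constant` by hand, then compile it."""
    body = ast.BinOp(
        left=ast.Name(id="x", ctx=ast.Load()),
        op=ast.Add(),
        right=ast.Constant(value=constant),
    )
    parameters = ast.arguments(
        posonlyargs=[], args=[ast.arg(arg="x")],
        kwonlyargs=[], kw_defaults=[], defaults=[],
    )
    tree = ast.Expression(body=ast.Lambda(args=parameters, body=body))
    ast.fix_missing_locations(tree)  # fill in the line/column info compile() needs
    code = compile(tree, filename="<expression tree>", mode="eval")
    return eval(code)  # evaluating the lambda expression yields a callable


if __name__ == "__main__":
    add_five = build_adder(5)
    print(add_five(37))
```

No source text is involved at any point; the callable is produced directly from the tree, which is the property the answer above attributes to expression trees.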
NV: Although the TPL and async/await were great additions to both C# and the framework, they also caused a lot of commotion, generating more questions than answers:
What’s the difference between Asynchrony and Parallelism?
EL: Great question. Parallelism is one technique for achieving asynchrony, but asynchrony does not necessarily imply parallelism.
An asynchronous situation is one where there is some latency between a request being made and the result being delivered, such that you can continue to process work while you are waiting. Parallelism is a technique for achieving asynchrony, by hiring workers – threads – that each do tasks synchronously but in parallel.
An analogy might help. Suppose you’re in a restaurant kitchen. Two orders come in, one for toast and one for eggs.
A synchronous workflow would be: put the bread in the toaster, wait for the toaster to pop, deliver the toast, put the eggs on the grill, wait for the eggs to cook, deliver the eggs. The worker – you – does nothing while waiting except sit there and wait.
An asynchronous but non-parallel workflow would be: put the bread in the toaster. While the toast is toasting, put the eggs on the grill. Alternate between checking the eggs, checking the toast, and checking to see if there are any new orders coming in that could also be started.
Whichever one is done first, deliver first, then wait for the other to finish, again, constantly checking to see if there are new orders.
An asynchronous parallel workflow would be: you just sit there waiting for orders. Every time an order comes in, go to the freezer where you keep your cooks, thaw one out, and assign the order to them. So you get one cook for the eggs, one cook for the toast, and while they are cooking, you keep on looking for more orders. When each cook finishes their job, you deliver the order and put the cook back in the freezer.
You’ll notice that the second mechanism is the one actually chosen by real restaurants because it combines low labour costs – cooks are expensive – with responsiveness and high throughput. The first technique has poor throughput and responsiveness, and the third technique requires paying a lot of cooks to sit around in the freezer when you really could get by with just one.
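
The three kitchen workflows can be sketched in code. The sketch below uses Python rather than C# (`asyncio` for the single-worker asynchronous version, plain threads as the “cooks”), and the cooking times and function names are invented for illustration. Sleeping stands in for waiting on the toaster or the grill.

```python
import asyncio
import threading
import time

COOK_TIMES = {"toast": 0.02, "eggs": 0.05}  # seconds; made-up latencies


def synchronous(orders):
    """One worker, one order at a time; the worker idles during each wait."""
    delivered = []
    for order in orders:
        time.sleep(COOK_TIMES[order])  # blocked: nothing else can happen
        delivered.append((order, threading.current_thread().name))
    return delivered


async def asynchronous(orders):
    """One worker interleaves the waits; no extra threads are created."""
    delivered = []

    async def cook(order):
        await asyncio.sleep(COOK_TIMES[order])  # hands control back while waiting
        delivered.append((order, threading.current_thread().name))

    await asyncio.gather(*(cook(order) for order in orders))
    return delivered


def parallel(orders):
    """One thread (a "cook") per order; each cook works synchronously."""
    delivered = []
    lock = threading.Lock()

    def cook(order):
        time.sleep(COOK_TIMES[order])
        with lock:
            delivered.append((order, threading.current_thread().name))

    cooks = [threading.Thread(target=cook, args=(order,), name=f"cook-{order}")
             for order in orders]
    for thread in cooks:
        thread.start()
    for thread in cooks:
        thread.join()
    return delivered


if __name__ == "__main__":
    orders = ["eggs", "toast"]
    print("synchronous: ", synchronous(orders))
    print("asynchronous:", asyncio.run(asynchronous(orders)))
    print("parallel:    ", parallel(orders))
```

Each result records which thread delivered each order. In the synchronous version the eggs come out first simply because they were ordered first; in the other two the toast is delivered first because it finishes first, but only the parallel version needed extra threads to achieve that.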
NV: If async does not start a new thread in the background, how can it perform I/O-bound operations and not block the UI thread?
EL: No, not really.
Remember, fundamentally I/O operations are handled in hardware: there is some disk controller or network controller that is spinning an iron disk or varying the voltage on a wire, and that thing is running independently of the CPU.
The operating system provides an abstraction over the hardware, such as an I/O completion port. The exact details of how many threads are listening to the I/O completion port and what they do when they get a message, well, all that is complicated.
Suffice to say, you do not have to have one thread for each asynchronous I/O operation any more than you would have to hire one admin assistant for every phone call you wanted answered.
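
A small sketch can illustrate the last point, that many pending operations do not need many threads. It uses Python’s `asyncio`, with `asyncio.sleep` as a stand-in for a real disk or network wait; the `fake_io` and `run_requests` names are hypothetical, and the operating-system machinery mentioned above (such as I/O completion ports) is not modeled at all.

```python
import asyncio
import threading


async def fake_io(request_id, delay=0.01):
    # asyncio.sleep stands in for a disk or network wait (an assumption).
    await asyncio.sleep(delay)
    # Report which thread resumed this request once the wait was over.
    return request_id, threading.get_ident()


async def run_requests(count=100):
    """Start `count` waits at once and collect the thread each resumed on."""
    results = await asyncio.gather(*(fake_io(i) for i in range(count)))
    thread_ids = {thread_id for _, thread_id in results}
    return len(results), thread_ids


if __name__ == "__main__":
    completed, thread_ids = asyncio.run(run_requests())
    print(f"{completed} requests completed on {len(thread_ids)} thread(s)")
```

All of the waits overlap, yet every request is resumed on the same thread that started it.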
NV: What feature offered by another language do you envy the most and would like to see in C#?
EL: Ah, good question.
That’s a tricky one because there are languages that have features that I love which actually, I don’t think would work well in C#.
Take F# pattern matching for example. It’s an awesome feature. In many ways it is superior to more traditional approaches for taking different actions on the basis of the form that some data takes. But is there a good way to hammer on it so that it looks good in C#? I’m not sure that there is. It seems like it would look out of place.
So let me try to think of features that I admire in other languages but I think would work well in C#. I might not be able to narrow it down to just one.
Scala has a lot of nice features that I’d be happy to see in C#. Contravariant generic constraints, for example. In C# and Scala you can say “T, where T is Animal or more specific”. But in Scala you can also say “T, where T is Giraffe or less specific”. It doesn’t come in handy that often but there are times when I’ve wanted it and it hasn’t been there in C#.
There’s a variation of C# called C-Omega that Microsoft Research worked on. A number of features were added to it that did not ever get moved into C# proper. One of my favorites was a yield foreach construct that would automatically generate good code to eliminate the performance problem with nested iterators. F# has that feature, now that I think of it. It’s called yield! in F#, which I think is a very exciting way to write the feature!
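
The nested-iterator problem arises when a recursive iterator re-yields every item produced by an inner iterator: each item is handed up through every enclosing level, so the cost per item grows with the nesting depth. The sketch below shows the two shapes side by side in Python, whose `yield from` plays a role similar to the constructs mentioned above. The `Node` type and both traversal functions are made up for illustration, and the sketch shows only the form of the construct; it makes no claim about the code any particular compiler or runtime generates for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def in_order_nested(node):
    """Re-yield each item from the inner iterators with explicit loops."""
    if node is None:
        return
    for value in in_order_nested(node.left):
        yield value
    yield node.value
    for value in in_order_nested(node.right):
        yield value


def in_order_delegating(node):
    """Same traversal, delegating to the inner iterators directly."""
    if node is None:
        return
    yield from in_order_delegating(node.left)
    yield node.value
    yield from in_order_delegating(node.right)


if __name__ == "__main__":
    tree = Node(2, Node(1), Node(3))
    print(list(in_order_nested(tree)))
    print(list(in_order_delegating(tree)))
```

Both functions produce the same sequence; the delegating form simply states the intent (“yield everything from this other iterator”) in one line, which is what gives an implementation room to handle it efficiently.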
I could go on for some time but let’s stop listing features there.
NV: What will the feature set of C# 6.0 be?
EL: I am under NDA and cannot discuss it in detail, so I will only discuss what Mads Torgersen has already disclosed in public. Mads did a “Future of C#” session in December of last year. He discussed eight or nine features that the C# language design team is strongly considering for C# 6.0. If you read that list carefully (Wesner Moise has published a list), you’ll see that there is no “major headliner” feature.
I’ll leave you to draw your own conclusions from that list.
Incidentally, I knew Wesner slightly in the 1990s. Among his many claims to fame is he invented the pivot table. Interesting guy.
NV: Java, as tortured as it might be, is revitalizing itself thanks to Linux and the popularity of mobile devices. Does the future of .NET and C# depend on the successful adoption of Windows on mobile devices?
EL: That’s a complicated question, as are all predictions of the future.
But by narrowly parsing your question and rephrasing it into an — I hope — equivalent form, I think it can be answered. For the future of technology X to depend on the success of technology Y means “we cannot conceive of a situation in which Y fails but X succeeds”.
So, can we conceive of a situation in which the market does not strongly adopt Windows on mobile devices, but C# is adopted on mobile devices? Yes, absolutely we can conceive of such a situation.
Xamarin’s whole business model is predicated on that conception. They’ve got C# code running on Android, so C# could continue to have a future on the mobile platform even if Windows does not get a lot of adoption.
Or suppose Microsoft fails to make headway on Windows on mobile and Xamarin also fails to make headway on C# on Android, and so on. Can we conceive of a world in which C# still has a future? Sure.
Mobile is an important part of the ecosystem, but it is far from the whole thing. There are lots of ways that C# could continue to thrive even if it is not heavily adopted as a language for mobile devices.
If the question is the more general question of “is C# going to thrive?” I strongly believe that it is. It is extremely well placed: a modern programming language with top-notch tools and a great design and implementation team.
NV: Do you think that C# and the managed world as a whole could be “threatened” by C++11?
EL: Is C# “threatened” by C++11?
Short answer: no.
There’s a saying amongst programming language designers – I don’t know who said it first – that every language is a response to the perceived shortcomings of another language.
C# was absolutely a response to the shortcomings of C and C++. (C# is often assumed to be a response to Java, and in a strategic sense, it was a response to Sun. But in a language design sense it is more accurate to say that both C# and Java were responses to the shortcomings of C++.)
Designing a new language to improve upon an old one not only makes the world better for the users of the new language, it gives great ideas to the proponents of the old one. Would C++11 have had lambdas without C# showing that lambdas could work in a C-like language? Maybe. Maybe not. It’s hard to reason about counterfactuals.
But I think it is reasonable to guess that it was at least a factor.
Similarly, if there are great ideas in C++11 then those will inform the next generation of programming language designers. I think that C++ has a long future ahead of it still, and I am excited that the language is still evolving in interesting ways.
Having a choice of language tools makes the whole developer ecosystem better. So I don’t see that as a threat at all. I see that as developers like me having more choice and more power tools at their disposal.
NV: What is your reply to the voices saying that C# has grown out of proportion and that we’ve reached the point where nobody except its designers can have a complete understanding of the language?
EL: I often humorously point out that the C# specification begins with “C# is a simple language” and then goes on for 800 dense pages. It is very rare for users to write even large programs that use all the features of C# now. The language has undoubtedly grown far more complex in the last decade, and the language designers take this criticism very seriously.
The designers work very hard to ensure that new features are “in the spirit” of the language, that design principles are applied consistently, that features are as orthogonal as possible, and that the language grows slowly enough that users can keep up, but quickly enough to meet the needs of modern programmers. This is a tough balance to strike, and I think the designers have done an exceptionally good job of it.
NV: Where is programming as an industry heading? And will increasingly smart compilers that make programming accessible to anyone, something very positive of course, also pose a threat to the professional programmer’s job stability?
EL: I can’t find the figure right now, but there is a serious shortage of good programmers in the United States. A huge number of jobs that require some programming ability are going unfilled. That is a direct brake upon the economy. We need to either make more skilled programmers, or make writing robust, correct, fully-featured, usable programs considerably easier. Or, preferably, both.
So no, I do not see improvements in language tools that make it easier for more and more people to become programmers as any kind of bad thing for the industry. Computers are only becoming more ubiquitous. What we think of as big data today is going to look tiny in the future, and we don’t have the tools to effectively manage what we’ve got already.
There is going to be the need for programmers at every level of skill working on all kinds of problems, some of which haven’t even been invented yet. This is why I love working on developer tools; I’ve worked on developer tools my whole professional life. I want to make this hard task easier. That’s the best way I can think of to improve the industry.