O'Reilly's open source conference, OSCON, is valued just as much for the intelligent conversations that take place in the hallways and over meals as it is for the tutorials, sessions, and keynote speeches. In this video, Terry Camerlengo sat down with Damian Conway, author of Perl Best Practices and Perl Hacks, (we won't say he cornered him) to get his thoughts on a wide variety of subjects, including the what-and-when of Perl 6 and what he thinks is important for the next generation of computer scientists.
Damian Conway: I'm a private consultant. I travel around the world wherever my clients are, but mainly in Europe and North America. I teach Perl, Vim, presentation skills—basically whatever people want. Then, much of the year I'm in Australia (as you can tell from the accent). And I work from home; I telecommute. I worked a lot on the Perl 6 Project, and writing—stuff like that—so telecommuting is a good way to work.
Terry: Are you also a University lecturer?
Damian: I do hold an Honorary Associate Professorship at Monash University in Australia. I actually put in my ten years as a university lecturer but then I got let off for good behavior, but I've kept my contacts there and I come back to give guest lectures and do some lecture training there as well.
Terry: Tell me your academic background. What did you study and what degrees did you obtain?
Damian: I have a couple of degrees: a Bachelor's in Science and then a PhD in Computer Science. For the PhD, I was doing computer graphics stuff back in the day before we had the hardware to do it really easily. And, as happens with your PhD, I've never done anything in graphics again; it kind of kills you for whatever your topic was. Since then my main problem has been that I'm doing research in like eight different directions at once. This is why I didn't pursue an academic career—it doesn't really build a body of work. But, you name it, and I've done it: user interface design, computer programming language design, synthetic molecular biology, and so on.
Terry: So what do you know about fast three-dimensional rendering using iso-illuminates?
Damian: Oh my God, no, no; it comes back to haunt me! That was my PhD topic. It was a technique for drawing relatively simple shapes but back before we had the hardware to do that with any kind of speed. And the whole idea was nowadays when people draw stuff they draw the surface and they subdivide the surface into very small polygons and render those very accurately. But back in the day when it was expensive to draw a single polygon I came up with a different technique which is—I said, well, if you can mathematically compute the contours of brightness on the surface then all you've got to do is draw each of those contours in a constant color in a single line, and of course line-drawing was much cheaper than area-drawing at the time, so you'd just draw all these lines and it turns into a smooth surface.
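As a rough illustration of the idea of drawing contours of constant brightness, here is a minimal Python sketch. It is not Damian's algorithm: it assumes the simplest possible case, a unit sphere with Lambertian (diffuse) shading lit from a single distant direction, where brightness is the dot product of the surface normal and the light direction. Under that assumption each contour of equal brightness is a circle around the light axis. The function name `isophote_on_sphere` and its parameters are hypothetical, and projection to the screen and hidden-line handling are left out.

```python
import math

def isophote_on_sphere(light, brightness, n_points=64):
    """Points on the unit sphere where normal . light == brightness.

    Assumes Lambertian shading and a distant light; `brightness` is in [0, 1].
    """
    # Normalise the light direction.
    length = math.sqrt(sum(c * c for c in light))
    l = tuple(c / length for c in light)

    # Choose a helper vector that is not parallel to l.
    helper = (1.0, 0.0, 0.0) if abs(l[0]) < 0.9 else (0.0, 1.0, 0.0)

    # u = l x helper (normalised), v = l x u: two unit vectors spanning
    # the plane perpendicular to the light direction.
    u = (l[1] * helper[2] - l[2] * helper[1],
         l[2] * helper[0] - l[0] * helper[2],
         l[0] * helper[1] - l[1] * helper[0])
    u_len = math.sqrt(sum(c * c for c in u))
    u = tuple(c / u_len for c in u)
    v = (l[1] * u[2] - l[2] * u[1],
         l[2] * u[0] - l[0] * u[2],
         l[0] * u[1] - l[1] * u[0])

    # The contour is a circle centred at brightness * l with this radius.
    radius = math.sqrt(max(0.0, 1.0 - brightness * brightness))

    points = []
    for k in range(n_points):
        t = 2.0 * math.pi * k / n_points
        points.append(tuple(
            brightness * l[i] + radius * (math.cos(t) * u[i] + math.sin(t) * v[i])
            for i in range(3)
        ))
    return points
```

Joining the returned points with line segments in a single colour, and repeating for a range of brightness values, gives the "many lines that blend into a smooth surface" effect described above.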
Terry: Okay; wow.
Damian: It's pretty cool and it would have been useful except that Moore's law caught up with me and processing power became so powerful that it was not necessary to do these kinds of optimization anymore.
Terry: How interesting; you also mentioned you did some visualization work with biology.
Damian: That's right.
Terry: What sort of work did you do?
Damian: I was supervising a PhD mainly and that was on looking at the mechanisms that underlie the self-assembly of certain kinds of very small structures. In each of the cells in your body you have these very small sort of connecting tubes through which nutrients and chemical messages pass. And we were looking at, how in the heck does that get built correctly? You know, what makes it build into a tube rather than something squiggly that you can't pass stuff through? And so we were looking at simulating the physics of this. The problem at the time was that the technology that we had to image these tiny structures really wasn't good enough to actually see them very well, because they're very, very delicate.
Terry: So are these protein channels...?
Damian: No, they're literally little tubes. We couldn't really work out the exact structure of them because the technology at the time couldn't resolve them well enough to do that. So we took another approach. We said, there are three leading theories about what the shape is so if we simulated each of those three theories, the behavior of the constituent proteins in those three theories and then we simulated the process of electron microscopy, which was what was being used at the time, then we'll get synthetic micrographs of these and if any of them actually looks like the real micrographs that's evidence in favor of that particular theory. If you've got a theory and you've got an accurate simulation that produces something that looks nothing like the real picture, then there's possibly a problem in the theory. That was the basis of the PhD. It was really interesting stuff and, you know, I know very little about biology but it was very nice to be able to bring the stuff that I knew about—visualization and simulation and computing—to that process.
Terry: That's really fascinating. To leave that topic and talk about something slightly different, since you are a lecturer, there have been some people that have suggested that using Java as a language for teaching CS students is watering down the quality of the CS students. Maybe Java is not the ideal language to do this. A lot of people feel like you should be using something like C++ or Perl or C. Do you have any opinions on this?
Damian: I don't buy the argument that because Java is relatively abstracted from the middle of the spectrum, it isn't a good choice for a first programming language. A lot of people came out with very, very good computing skills whose first language was Lisp, and that's as abstracted from the middle as you can possibly be. When I was going through—back in the Jurassic age—we were actually taught both ends of the spectrum as well. We were taught Pascal as our first programming language, which again is a very abstracted kind of language, but at the same time we were also taught microprogramming. We were taught assembler and below that microprogramming and we would be down doing the hardware, as well. Back in those days, computer science really was the science of building computers from the electrons up. I think the problem with teaching Java as the first and often the only programming language is the only bit of it—what it's teaching is that there is one tool—one hammer in your toolkit—and you have to treat everything as a nail. What I see in a lot of curricula nowadays is a lack of diversity, a lack of experiencing different ways of thinking about computation, different ways of solving problems, different architectures, different language paradigms, different models of thinking, and that to me is the big problem. I don't mind that you teach Java first; I wonder sometimes whether Java isn't too complicated to teach first since you really have to have all of the ideas simultaneously.
When you're first learning to program, there are so many different levels that you need to think about simultaneously. You need to think about the algorithmic level, you need to think about the syntax level, you need to think about the semantics of what's going on in the code that you're writing, you need to think about the data structures and the layout of the code, you need to try and remember all of the bits of things that you were only taught last week, and put them into practice. I think adding on extra layers of abstraction, in the sense of the need to use libraries to do basically anything in Java—even just the object orientation, which really relies on a lot of fairly deep understanding of what computation is—is a bit much if it's your first go.
Now maybe nowadays no one comes into a CS degree and they haven't done programming before. And maybe therefore you're just fitting them out for industry by teaching them the 500-pound gorilla.
Terry: But you can't assume that.
Damian: But no; you can't assume that and my experience even just a decade ago was that fifty percent of our kids were coming in and they had no programming experience at all. They were taking the course because it looked like it was going to be a good career path or they were interested but they never had the opportunity. So my concern is not what language you teach first, but what kind of diversity, what kind of ecosystem are you teaching that into. I always bring up the example of bananas. The problem with commercial bananas is that they're not grown from seeds. I mean bananas are famous for not having seeds in them and the problem there is that they're all effectively clones. They're sterile and you get new banana plants by cloning existing banana plants which means that modulo their genetic diversity that just happens from mutation—they're all the same plant; they're all the same genome. So when something like Black Sigatoka fungus comes along, if it affects one banana, it affects all of them. And if you want to take this metaphor up—that's the problem with Microsoft and the Windows operating system dominating the entire market, and that's why we have so many problems with viruses, because if a virus will hit one machine it will hit probably all of them.
But the same thing is true when you're teaching. If you're going to teach them just one set of skills then they have to apply those skills to problems where that set of skills is not the set of skills to solve the problem, and you end up with lousy software because the software has to be bent and mangled and beaten into a shape to—
Terry: To make it do what you want?
Damian: That implements in Java. So my problem is I want you to teach me ten programming languages in the three years of my degree so that when I get out I understand functional programming, I understand declarative programming, I understand constraint programming, procedural, object-oriented, aspect-oriented; you name the paradigm; I've had some exposure to it.
Terry: Wow; excellent. So my next question probably makes no sense at all now given what you just said, but I was going to ask if you could recommend a great introduction to programming book that you—just one book that you think could teach a beginning computer scientist—someone starting out in school—most of what they would need to know. I guess there isn't such a thing?
Damian: No. That's my message. If there was one such book then by the nature of it being "one such book" it's not the right answer. It's like saying you know can you recommend one political party? No; the whole point of having political parties is to have different ones so we get a diversity and a richness.
Terry: Okay; great. So, can we move onto Perl and talk a little bit about Perl? First of all, I'm curious. When can we expect to see Perl 6?
Damian: Oh man; this question—the answer I always give when people ask this question is to say "by Christmas."
Terry: Okay; fantastic.
Damian: But I'm very careful not to say which Christmas.
Terry: It's July now so by—
Damian: Yeah, by "a" Christmas.
Damian: The problem with giving you a sensible answer to that is that like many open source projects, Perl 6 development is totally—well, almost totally—volunteer-driven; we do have sponsors and people have been very generous in donating to the development process, but you know we don't have a foundation of fifty programmers sitting in a room implementing Larry's ideas. What we have is a small number of people, a few of whom are supported to do the work maybe one day a week, but most of whom are doing it out of the love of the language and in their own time. And that includes Larry.
The problem we have is that when you have an almost entirely volunteer organization, you have carrots but you don't have sticks, so you can't really drive a process in that way. I mean, I think they've been doing incredibly well driving the process on timelines and getting releases out of Alpha components, but if you ask me how long is it going to take, I can't tell you that because I don't know who is going to drop out, or have things that they have to do, or have to—God help us—go and get a real job and earn some money; so I can't say. To guess, I think next year.
Terry: Okay; well this is 2008, so sometime in 2009?
Damian: Well, see I didn't want to put the year on it, so when people are watching this in 2015, they'll say, "Oh, next year it will be out!"
Terry: I hope people are still watching this in 2015! So you've been termed as Larry Wall's interlocutor; what does that mean exactly?
Terry: What role do you play in Perl?
Damian: So my role—I've kind of had a couple of roles in the Perl 6 development. One has been kind of like Larry's evil nemesis—so Larry comes up with great ideas, but when you come up with great ideas you need someone to come and say well, maybe this would be a great idea instead or maybe this would be a different way of doing that. The richness of the language is that when you throw ten or fifteen solutions to a particular design problem into the pot, and you argue them back and forth, you need another set of eyes that's coming from a very different perspective so that those eyes can see the things that you missed. Especially if you're developing the ideas, you may sometimes not see some of the ramifications of it. So I see that as being one of my roles. But, also in the reverse sense of throwing ideas in the pot, as well, so Larry can then adapt and modify them to his needs, and we get the best of both worlds.
In addition, I guess one of my skills is in communicating, and so I've seen my role as being a way of bringing all of the stuff that's going on in the development process and boiling it down into a form that people can relate to, can see why it's important, can understand why it's taking so long and why it's so hard to get it right. In some sense, being the public face of it, the guy that explains it to the world. The design documents are terrific but they're very highly technical and they're very focused and truncated on just getting the details out. For most people, that's not enough to pick up the subtleties and the nuances of the design so they'll say, "Oh yeah—I could really use that!" So part of my job has been to find ways of bringing the relevance of what we're doing back to the community and keep them informed of it.
Terry: Excellent; could you explain the differences between Rakudo and Pugs? Did I say that correctly—Rakudo?
Damian: Rakudo; yeah, well we probably didn't say it correctly because we didn't say it with a Japanese accent, and I'm sure I won't try and butcher it by doing that.
Perl 5 has been almost famous for the fact that it only ever had one implementation and that's been a weakness in some regards along the lines of the genetic lack of diversity I was talking about earlier, but it's also been a tremendous strength because if it works in one place it will almost certainly work anywhere else. With Perl 6, we're changing that model quite a bit. We're saying we want to have multiple implementations out there if we possibly can and anything that passes the spec can call itself Perl 6. So the very first implementation that was being created was organized and largely driven by Audrey Tang who's an incredibly talented Perl programmer. She's based in Taiwan. The first implementation was actually implemented in Haskell, and was an extremely sophisticated and complete implementation of a prototype of Perl 6. That project has kind of stalled a little bit as Audrey has moved onto other projects and interests, and the long-term goal was always to create an implementation to run on the Parrot Virtual Machine.
Initially that one was just called Perl 6, but when we started to realize that we wanted a whole ecosystem of implementations, you can't just make one of them called Perl 6 because that's not fair. So we needed to come up with another name, and a name that I had come up with several years ago was Rakudo. It was kind of an abbreviation of Rakuda-Do, which in Japanese means way of the camel, which kind of seemed to fit.
But the abbreviation to Rakudo—well Rakudo itself means paradise—so we thought well, you know, that's a pretty high target but that's what we'd like to achieve. So that's what it's called; it's basically simply the implementation of Perl 6 that runs on top of the Parrot Virtual Machine. So the differences in terms of practicalities of them is that Rakudo is currently undergoing very, very rapid development and advancement and change; Pugs is pretty much static now. It's not really being changed at all. They're about comparable in the percentage of Perl 6 they implement, but they have other differences too. Because Rakudo runs on top of Parrot, once it's compiled it runs very quickly because the Parrot is a very effective virtual machine, but we still haven't optimized the front layer of it so that it compiles quickly. On the other hand, Pugs compiles extremely quickly but because it runs interpreted on Haskell, which runs interpreted itself, it runs comparatively slowly.
Terry: Oh interesting; are there plans to incorporate other languages onto the Parrot VM?
Damian: That is one of the defining characteristics. Well one of the major specifications that we had for Parrot was that it had to be easily target-able by pretty much any dynamic language. It's really a virtual machine that's aimed at dynamic languages. The big virtual machines out there, the JVM, the .NET framework are mainly targeted at languages that are predominantly statically typed and you can simply implement dynamically typed languages on them as J-Python, as Jython, has demonstrated very effectively but they're not really optimized for it. So the Parrot runtime is optimized for it and there are—I couldn't list them all for you, but there are like a dozen languages that we currently have prototype implementations for. And of course the other thing is by having a common runtime layer underneath—and it's a very high level abstract runtime—we're hoping that we're going to get a lot better interoperability between those languages, so that you can take, for example, a Java library that you would like to use in Perl 6 and just use it. And the objects that are coming out of the Java library can be treated as being Perl 6 objects, within the limitations of their semantics.
Terry: Wow, that's fascinating.
Damian: It's a very exciting thing.
Terry: That's like dogs sleeping with cats, you know.
Damian: Yeah—that kind of sign of the apocalypse coming.
Terry: Could you talk a little bit about the dynamic type-system in Perl 5?
Damian: Sure. The idea of a dynamic type system is that variables are just containers; they're generic containers, so you go down to the container box store and you buy a box. You don't go down there and buy a box that can only be used to store jumpers, or a box that can only be used to store records, or something like that. You go down and buy a box, and of course what you put in the box determines what kind of box it is. It's the sweater box or it's the record box or it's the junk box or it's the scrap box. Dynamic languages work that way. It's the values in dynamic languages that have types, just as they do in static languages, but when you put a value with a type into a variable, that variable assumes in some way the type constraints of that value, and the only things you can do now with that variable are the things that you could do with that kind of value. But if you take that out of the box and put a different kind of value in, the box changes and has a different behavior. That turns out to be very powerful because it's a kind of late binding, so you don't have to make your decisions statically at compile time and end up sometimes with bad decisions. And in many respects when we're talking, for example, about object orientation, it seems to work better if you do have that ability to be late-bound and to defer decisions until you actually have the object and you call a method on it and it does whatever.
Now in object-oriented languages, which are predominantly statically typed, I'm thinking here for example of C++ and to a lesser extent Java, you can very occasionally get bugs that come in because the static type of the variable that you put a dynamic value in is compatible but not identical, and that can lead to occasionally very strange behaviors going on. Dynamic languages—everything in it is dynamically determined. When you do something that's when you look at the type and that's when you decide whether you're allowed to do it or not. Now that has the advantage of being less susceptible to mismatches at the compile stage and it has the disadvantage that you tend to get your error messages at runtime rather than at compile time, which puts a lot more emphasis on good testing.
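The "box" picture and the runtime-error trade-off can be seen in any dynamically typed language. The following is a small Python sketch offered as an analogy only; it is not Perl code, and the names `kind_of` and `shout` are made up for the illustration.

```python
def kind_of(box):
    """Report the type of whatever value is currently in the box."""
    return type(box).__name__

def shout(value):
    # Nothing checks in advance that `value` has an upper() method;
    # the lookup happens only when this line actually runs.
    return value.upper()

box = 42
first = kind_of(box)       # the value in the box is an int
box = "forty-two"
second = kind_of(box)      # same variable, now holding a str

try:
    shout(42)              # an int has no upper() method
    failed_at_runtime = False
except AttributeError:
    failed_at_runtime = True
```

The mistake in `shout(42)` only surfaces when the call executes, which is why dynamic languages lean more heavily on testing.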
Terry: Okay; so you're introducing static typing in Perl 6?
Damian: We are.
Terry: And why are you doing this? Was there a sort of a backlash? What's the motivation?
Damian: The motivation for it is—
Terry: And will you keep dynamic typing as well?
Damian: Oh yeah; the first thing to say is Perl 6 is a dynamically typed language. There is no question about that; the type information resides with the values. Perl 5, if you look at it the right way, you've kind of got to squint your eyes and tilt your head a little bit, but if you look at it the right way it's also a statically typed language. So in Perl 5 you have basically three kinds of variables: you have a scalar variable, you have an array, and you have a hash, which is an associative array or a dictionary in other languages. And the only things you can put into scalar variables are scalars. Now the point is that there are strings and numbers and references and all kinds of scalars, but you can only put a scalar in there. And there are operations, for example, that you can only do on an array and if you try and do one of those operations and give it a scalar it will give you a compile-time error. So Perl 5 already has this static typing but it's a very indiscriminate kind of static typing. There are only three choices.
So, what we've decided is, well, static typing has advantages too in the sense that if you know what kinds of values you want to put in a variable, then it can be very nice to get compile-time warnings when you don't. So, it would be very nice to have the option of adding static typing, but of course optional static typing is almost an oxymoron. It only works if it's always there.
So what we think of Perl 6's type system is—it is a statically typed system with defaults, so you don't have to put a type on a variable, but if you don't, then it defaults to its universal type, which again is scalar or array or hash. But you can say no; I want to be more specific about that not just a scalar, but a scalar that can only store numbers. So, if you need it, it's there; if you don't, it's not going to get in your way.
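A loose way to picture "typed with defaults" is a container that accepts anything unless you narrow it. The Python sketch below is an analogy under stated assumptions, not Perl 6 semantics: the `Box` class is hypothetical, and its check happens at runtime when a value is stored, whereas the interview describes compile-time warnings.

```python
import numbers

class Box:
    """A container that accepts any value unless given a type constraint."""

    def __init__(self, allowed=object):
        # Default constraint: the most general type, so anything fits.
        self.allowed = allowed
        self.value = None

    def put(self, value):
        # Reject values that do not satisfy the declared constraint.
        if not isinstance(value, self.allowed):
            raise TypeError(
                f"expected {self.allowed.__name__}, got {type(value).__name__}"
            )
        self.value = value
        return self

anything = Box()                      # no type given: stores any value
anything.put("a string").put(3.14)

numbers_only = Box(numbers.Number)    # narrowed: numbers only
numbers_only.put(7)
```

If you never pass a constraint, the box stays out of your way; if you do, storing the wrong kind of value raises an error.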
Terry: Excellent; that's great. So, parameter passing modes in Perl 6, there's positional, named, and something called slurpy. Most importantly, what is slurpy?
Damian: Well, let me go back to this: the most important thing to say is that Perl 6 subroutines have parameters. Perl 5 ones don't; everything just comes in in one big array and you have to extract it yourself. It's only been forty years since the concept was first used in programming languages, so we're getting there eventually.
We're all familiar with positional parameters; positional parameters just say look, the first argument has to be the name. The second has to be the rank. And the third has to be the serial number. And that works great for name, rank, and serial number because that's the order you always put them in anyway. Most languages really only provide positional parameter lists or arguments. The problem with that is that it works great when you've got one or two or maybe three parameters, but some subroutines need like eight or ten or fifteen to do their job because they're just so configurable. And when you've got eight, how do you remember the order of them? The answer is, you don't; you have to go and look them up every time you want to use them. So an alternative way of passing parameters is by saying I'm going to pass you the eight arguments here but I'm going to give each of them a name. I'm going to label each of them and if they've got a label it doesn't matter what order they come in because I can look at the label and say "Oh, so that's the name and that's the rank and that's the serial number."
That sort of named parameter passing isn't really well supported in most languages and certainly not at the compiler level. Most people will pass the dictionary or hash or an associative array, which has names, etc., but then there's no type checking or sanity checking on that. You have to install it yourself.
So Perl 6 has that built in. Every parameter obviously has a name, but what you can do when you call the subroutine—you can just say the name of the parameter and then the value, and it will get assigned to that even if it's in the wrong order. And there's a slightly different syntax that says this is a named one so it knows to rearrange them.
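To make the difference between positional and named passing concrete, here is a short Python sketch. Python is used only as an analogy, since it also lets callers label arguments; the Perl 6 call syntax differs, and the function `enlist` is invented for this example.

```python
def enlist(name, rank, serial_number):
    """Format a soldier's details; parameter order is name, rank, serial."""
    return f"{rank} {name} (#{serial_number})"

# Positional: the caller must remember the declared order.
by_position = enlist("Smith", "Sergeant", 12345)

# Named: each argument carries its label, so the order no longer matters.
by_name = enlist(serial_number=12345, name="Smith", rank="Sergeant")
```

Both calls produce the same result, because the labels tell the language which parameter each value belongs to.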
Now the problem with that is that there's one other thing that you want to do when you're passing parameters. A lot of subroutines will have a fixed number of usually one or two parameters that tell them what to do and then you often have an arbitrary number of parameters that follow that say this is the data to do it on. A simple example of that is a map or apply operation where you say here's a little function that I want you to apply to each of the following values and it's going to be a list of values—it could be one of them or there could be 100,000 of them. Of course I don't want to have to define 100,000 named parameters to do that; I want to have something that, just having taken the expected first parameter, simply sucks up all the remaining parameters into a single array structure. And that's a slurpy parameter; it slurps up all the remaining arguments and presents them to you in one container, which you can then process in a sensible sort of way.
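The slurpy idea has a close cousin in Python's `*args`, which makes for a compact illustration. This is a sketch by analogy, not Perl 6 syntax; `apply_to_all` is a hypothetical name, and Python collects the leftovers into a tuple rather than an array.

```python
def apply_to_all(func, *values):
    # `func` is the one fixed parameter; `values` soaks up every
    # remaining argument, however many there are, into one tuple.
    return [func(v) for v in values]

squares = apply_to_all(lambda x: x * x, 1, 2, 3)
nothing = apply_to_all(str.upper)        # zero data arguments is fine too
```

The caller never declares how many values follow the function; the one collecting parameter handles one argument or 100,000 alike.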
Terry: Interesting; so that's an extremely functional feature of Perl, I would say, right?
Damian: Yes, and for Perl 6 we are bringing far better support for functional styles of programming into the language.
Terry: So currently it has support for map and apply...are there any new—?
Damian: Well the most important new one is Reduce.
Damian: And without that you really can't get to the end of your functional programming on list operations because you could never boil it down to the one value you want. But most of the extra features that were required were not so much built-in functions, but the ability to have these sophisticated parameter specification mechanisms and also to have multiple subroutines with the same name but different parameter lists so you don't have to cater for every parameter possibility and then do an if statement. Functional languages, most of them will allow you to say there will be three different versions of the head function, the first of which if you give it an empty list, it will give you back an empty list; if you give me a single value I'll give you back that value; if you give me a list I'll give you the first one and then do the head on the rest. So we're going to support that as well in Perl 6.
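Both ideas, boiling a list down to one value and having several routines share a name, can be sketched in Python. This is an analogy only: `functools.reduce` plays the part of a reduce operation, and `functools.singledispatch` is a loose stand-in for same-named subroutines, since it selects a version by the type of the first argument only rather than by the whole parameter list. The function `describe` is invented for the example.

```python
from functools import reduce, singledispatch
import operator

# Reduce: fold a whole list down to a single value.
total = reduce(operator.add, [1, 2, 3, 4], 0)
longest = reduce(lambda a, b: a if len(a) >= len(b) else b,
                 ["perl", "haskell", "c"])

# Several versions under one name, chosen by argument type
# instead of an if/else chain inside a single function.
@singledispatch
def describe(value):
    return "something else"

@describe.register(int)
def _(value):
    return "an integer"

@describe.register(list)
def _(value):
    return "a list"
```

Calling `describe(3)` and `describe([1, 2])` runs different bodies even though the caller uses a single name.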
Terry: The sigils, can you just talk a little bit about what they are, if I pronounced that correctly and is that being removed for Perl 6?
Damian: I can never tell whether someone from North America is pronouncing something correctly because you get everything wrong as far as I can tell. The word I use is sigils and the sigils are the line noise at the start of variables in Perl. Now, this is not an idea that Perl invented; if you use any kind of shell, then you know that the environment variables have dollar signs at the start of them. Well, in Perl, any scalar variable has the dollar signs at the start of it; if it's an array it has an @ sign, if it's a dictionary or a hash, it has a percentage sign. And those then give the ability to have both a scalar variable and an array that have the same identifier name but have different purposes. Perl has had these from its very beginning, and there's always been a problem with them. The problem with them has been that they are like grammatical inflections, so that if I have an array I put an @ sign in front of it, which is roughly the same as saying "these things." But when you want to talk about one of those you have to say "this thing"; you can't say "these things." So, in Perl 5 what you do to make that work, if you want to refer to one element of an array, you change the sigil. You change it from the @ sign which means these to the dollar sign which means this. And then you put a lookup in it—square brackets—to say which index you want. Linguistically this is a very elegant idea and it gives you the ability to specify quite subtle distinctions of behavior. But it turns out that in practice, ninety percent of the programming population just can't get that idea into their heads. And it's not because they're dumb; it's just because it's not a natural way for them to think. The natural way to think about sigils is if it's a dollar then it's a scalar. If it's an array then it's an at-sign. If it's a percentage sign then it must be a hash.
And the problem is that when you have to change the sigil when you change the way you use it that confuses people—especially people who speak English because we don't do that very much.
Other languages have a lot more inflection when you change how you're using a word—not so for English. So we found that a lot of users of this just were not capable of dealing with the "Oh, you have to change the sigil even though it's the same variable." So in Perl 6 we changed that; in Perl 6, the line noise at the front of the variable always stays the same. If it's an array it stays @ sign no matter what you're doing with it—whether you're looking it up, whether you're taking a sublist of it, or whatever. That's been a really important change in Perl 6; it's going to be challenging for experienced Perl 5 programmers, but once you get the hang of it you never want to go back. It's very much better.
Terry: I asked you earlier what books you think would be great for an aspiring computer science student but more general from that, what general advice would you give to a young aspiring programmer? And I think you sort of touched on that earlier.
Damian: I think that my advice if I'm going to be consistent would echo what I've said and that is—it's good to be deep as a programmer, it's good to be deeply talented in one or two fields, but it's more important to be broad. It's more important to have at least a general understanding of a great range of ways of doing things, a great many styles of programming, a great number of programming languages and to just be familiar with a lot of different algorithmic approaches to things because that gives you repertoire. It gives you the ability to—in a particular situation when it's crunch time and there's not time to come up with a clever technique just out of your head—you've got something to fall back on. You say, well if we were using Haskell for this then we would just do this thing; can we adapt that to what we're doing here? Not saying you have to use Haskell to solve it, but could you use that kind of approach? I think that's really important. To be a good programmer, you have to be a broad programmer.
And the other thing that I would say is that to be a good programmer you have to actually program. And this is something that doesn't happen. You know we go through our schooling, we start out and we learn all these things and we're constantly doing exercises and assessments and so forth and then you get out and you start going to meetings and doing design, and all the rest of it, and you stop coding. And if you get promoted, then you're literally promoted out of the opportunity to do any coding, and I think that's a problem. If you want to be a really good tennis player, you will go out and practice every day. If you want to go and be a great martial artist you'll be in the dojo every single day. If you want to be a great programmer, you will code every day—even if you have to find time on your own to do that. If you're up at 11 o'clock at night anyway, or 3 o'clock in the morning, some of that time at least has got to be spent coding because as soon as you get rusty, you start dying as a coder.
Terry: Fantastic. Well thank you very much. This has been Damian Conway; thank you very much.
Damian: Thank you.