TeXtalk: an interview with David Carlisle

Welcome to the TeXtalk! We have a very special guest for today: our friend David Carlisle, member of the LaTeX3 project, editor of the MathML spec, a very active member of TeX.sx, 16k+ rep, 104 badges, and 325+ answers so far. Get ready for this awesome interview!

Paulo Cereda: Could you tell us a bit about you?

David Carlisle: Well I’m 50, living in England, and I started using TeX in 1987, but these days I’m more of an XML person, working for NAG in Oxford.

Joseph Wright: You started with TeX in 1987 – was that with Plain, LaTeX or ‘something else’, and what was it that led you to TeX?

David: LaTeX pretty much from the start. I came back to Manchester to work in the Computer Science department (after a 2-year postdoc at Cambridge and the previous 6 years as a student in Maths at Manchester). The CS department was split into the FrameMaker and LaTeX camps, but from the start the idea of an open document format appealed, so someone lent me a TeX book while I was waiting for my machine to arrive and I accidentally read to the end before I got the machine set up.

Ariel: How important do you think a mathematical background is to learning the ins and outs of LaTeX? I see that most of the core people in the project have a substantial connection to maths. For example, did elements of your maths training come in handy writing the longtable package?

David: I don’t think it helps much really with using TeX (it does help to have some idea of a compile-edit-run loop; if you are only used to a word processor TeX is difficult) but a math background, especially then, helps enormously in motivating people to learn. At that time there basically was no alternative; people had been happy with IBM golfballs etc., but once you saw TeX output you were no longer satisfied, so more or less every working mathematician was forced to learn TeX even if they had never used a computer before. It’s a bit different now.

N.N.: What editor did you use for TeX and LaTeX as you started using it? Were you happy with how it handled such work?

David: Is there more than one editor?

I started using emacs at the same time and used it for everything: email, news reading, web browsing sometimes… I did recently switch to Thunderbird for email, but I still miss rmail really.

Werner: I see that you’ve been a member of TeX.sx almost from its inception (Sep 2010), but only recently started contributing (around Jan 2012) – at quite a staggering pace… What was the cause for the delay?

David: I’m an editor of MathML and have a Google alert. Mostly I ignore them but like to see them going by, but as that one was a TeX-related forum and from someone at my old university at Manchester I signed up and answered the question but didn’t visit the site again. Then around the start of this year Frank [Mittelbach] pointed me at a question on one of my packages and suggested I might like to answer.

Werner: It’s always good to have package authors hunting and pecking around this site. :)

Ariel: I was frankly quite amazed when I made several longtables and then when I ran into issues, David himself helped me out! I still send links to people to point out how responsive and awesome the LaTeX community here is – thanks to your untiring answers and solutions to every tough longtable question!

N.N.: How could you accidentally read The TeXbook to the end? What caught you?

David: Well, I had a machine on order but had never worked in a CS department at all, so I was wondering what to do, and reading that seemed as good as anything else. If you read for 24 hours a day it doesn’t take long (I don’t remember how long; not that long, as I recall).

Joseph: You did a lot of work on tables, one of the most complex parts of TeX (surpassed only by the Output Routine, in my experience). Was there a particular reason for that, and is there any area where you feel (La)TeX table support is still lacking?

David: It was all a mistake. I’d read The TeXbook so I knew that \halign constructs broke over pages, but I decided to be a LaTeX user rather than plain, so I was just trying to work out why tabular didn’t break over a page, so opened up the sources and saw it was boxed, so wondered what would happen if you took the boxes out and wrote longtable by mistake.

I posted it to comp.text (or .tex) and a few people used it and sent bug reports etc., and things just grew.

Joseph: Ah yes, I recognise the ‘I wrote a package by mistake’ problem: that’s how I wrote siunitx and then ended up involved in LaTeX3. How did you get involved in the LaTeX kernel?

David: I had started answering questions on c.t.t (mostly to teach myself TeX: force yourself to answer questions in areas you never use) and I’d had some email contact with Frank over a variant of his array.sty that I’d made. Then Frank and Chris [Rowley] offered me a trip to Hamburg one day (it was the DANTE meeting that launched NTS and they were having a 2-day LaTeX3 workshop afterwards), so I went there and got the LaTeX3 kernel afterwards and so joined the gang in 1992.

Stephan Lehmke: I was also at the DANTE meeting in Hamburg, btw 🙂 Another thing that started there is NTS. Later there was ExTeX. Were you ever involved with TeX at that level? Is LuaTeX the only way into this kind of future?

David: Not really. Frank and Chris were the ones on the LaTeX3 team who really understood TeX-the-program; I always really worked from the level of The TeXbook up. I think a Unicode version has to be the future but I wish I knew enough to know why there are two at the moment.

Ariel: Do you still add features to your longtable package based on questions from here?

David: The thing about TeX is if you change anything you break something. TeX is like lisp in that the program is available as data and people can and do patch commands at arbitrary places, so if you leave a package stable for a while (say 20 years) you just know that some packages rely on the third token of \LT@some@internal@mess being \relax. This makes changing things hard. I did change colortbl in response to some issues here and I already have reports of things that broke (which I’m not sure what to do about yet).
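
A toy model of the fragility described above, written in Python rather than TeX. The macro body, the token names and the patch function are all hypothetical; only the idea (third-party code that assumes the third token of an internal macro is \relax) comes from the answer.

```python
# Toy model: a macro body is just a list of tokens that anyone can inspect.
# Every token name below is made up for illustration.
RELAX = r"\relax"

def third_party_patch(body):
    """Insert a hook after the third token, assuming that token is \\relax."""
    if len(body) < 3 or body[2] != RELAX:
        raise ValueError("patch failed: third token is not \\relax")
    return body[:3] + [r"\my@hook"] + body[3:]

# The body as it has been for 20 years: the patch applies cleanly.
old_body = [r"\begingroup", r"\demo@setup", RELAX, r"\demo@output", r"\endgroup"]
patched = third_party_patch(old_body)

# A harmless-looking maintenance change (dropping the \relax) breaks the patch.
new_body = [r"\begingroup", r"\demo@setup", r"\demo@output", r"\endgroup"]
try:
    third_party_patch(new_body)
    broke = False
except ValueError:
    broke = True
```

The maintainer's change is invisible to ordinary users of the macro, yet any package that patched by position stops working.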

Joseph: You mentioned that nowadays you are mainly focussed on XML. Do you see that as a long-term complement to (La)TeX syntax, or as a replacement? For example, MathJax is making LaTeX-like syntax more, rather than less, popular on the web (at least that is how it looks to me).

David: Both. In 1998, of course, XML was going to take over the world, replace HTML, etc. It didn’t quite work out that way: people didn’t want to switch and browser implementers hate XML with a vengeance, so HTML and XHTML (from a specification point of view) stagnated for a decade but then picked up with the (non-XML) HTML5. But of course XML (or at least an XML-compatible DOM tree) underlies that and so MathML can (and does) slot into the HTML5 world.

Real people though will never type documents in XML syntax in emacs (I’m not sure why).

Yiannis Lazarides: Where do you see TeX/LaTeX going?

David: Hard to say. At the height of the XML boom I’d have said that author syntax would move to XML (probably hidden by the editor) and TeX would take a back-end typesetting role, but now I think that is only one path. MathJax shows again that, given the choice, people prefer to use a TeX syntax even in a MathML-based system.

Yiannis: How about LaTeX3: do you think it is going in the right direction?

David: I hope so, I haven’t been paying attention :-) I get all the internal mails but I had to drop out of the loop a bit (you can only really work on designing one language at a time), but it has been encouraging to see Bruno, Joseph, etc. kick-starting the project again.

Paulo: You are one of the principal authors of the world-known Mathematical Markup Language – MathML. How did you get involved?

David: Unlike TeX, MathML was actually the day job. When I came down to Oxford to work at NAG it was initially on a European Union funded project mainly of universities and a few small companies to develop the OpenMath standard for math markup. OpenMath and MathML are closely linked and I ended up bugging the Math Working group so much they invited me onto it, just at the time that the MathML1 spec came out.

Ariel: Does LaTeX3 change anything from LaTeX2e for complete beginners who have not started to cut open and hack packages yet?

David: That’s the wrong question really. LaTeX3 is being designed from the bottom up, so low level data structures and programming constructs, with which you could design a user interface for defining document level commands with which you could define document commands, which a user could use in a document. But really only the first layer is there yet.

There was a complete stand-alone LaTeX3 system back in 1992 when I joined, but it struggled on the machines of the time and wasn’t really usable except for those interested in TeX programming (rather than just typesetting their documents).

Paulo: One of the most recurring questions on the main site is about converting TeX to HTML. Do you think the existing approaches can cover most of the cases or maybe there’s still a lot of room for improvement, especially with the emerging formats?

David: The current approaches are getting a lot better. LaTeXML from Bruce Miller at NIST is particularly impressive (he told me he uses my xii.tex as one of his test examples; the only other converter that I can see handle that is tex4ht). The openness of TeX is fundamentally a problem though: anyone can come along and make a package and extend the language and somehow converters are expected to cope.

If you start off with 6 lines of Perl to make a blank line into <p> then your system gets stressed if the first document you are given is a UTF-16 document in Chinese using TikZ.
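
To make the "6 lines" point concrete, here is a minimal sketch of such a converter, in Python rather than Perl. The function name and the sample inputs are invented for illustration; it is not any real converter, and it deliberately knows nothing about TeX beyond blank lines.

```python
import re

def naive_tex_to_html(source):
    """Wrap each blank-line-separated chunk in <p>...</p>; nothing else."""
    paragraphs = re.split(r"\n\s*\n", source.strip())
    return "\n".join(f"<p>{p.strip()}</p>" for p in paragraphs if p.strip())

# Plain prose converts as hoped.
plain = "Hello world.\n\nA second paragraph."
html = naive_tex_to_html(plain)

# Markup the converter has never heard of passes straight through as text.
tikz = "Before.\n\n\\begin{tikzpicture}\\draw (0,0) -- (1,1);\\end{tikzpicture}"
html_tikz = naive_tex_to_html(tikz)
```

The second result still contains the raw tikzpicture source inside a <p> element: the converter has no way of knowing that a package extended the language.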

Paulo: CTAN lists several of your packages. Could you tell us a bit about your first package?

David: I think it was probably longtable, I remember mailing someone asking if they could add it to the ymir archive in the States (there was no CTAN then and the UK wasn’t on the internet so even mailing outside the country was something of an adventure) and you had to use the TeX-friendly vvencoding as plain text or even uuencoded text was likely to get mangled in janet/bitnet/internet gateways.

Most other packages came out of answers on c.t.t. Some of them I think Robin [Fairbairns] just took from c.t.t and put on the archive with my name on; there were no worries about licences and stuff in those days.

Paulo: 16k+ rep, 104 badges, 325+ answers so far. What’s your secret? :)

David: Well it was an interesting challenge to myself to see if I could play this game even not really having used TeX for a decade. It’s like riding a bike really: you never forget the principles (and the questions never change :-)). But I’ve had some experience at typing badly typed answers on help forums: between c.t.t. and tex.sx I posted nearly 11,000 messages to xsl-list, so basically I’ve been doing this since the start. After a while you lose your shyness and don’t take total ignorance as a reason for not posting an answer.

Ariel: What is your favourite brand of tea? And loose leaf or bags?

David: I’m not really a tea connoisseur: anything warm and wet (coffee more often actually).

Yiannis: LaTeX2e has a very complicated architecture, were you involved with the design? For example the vertical skips for the lists are hidden in the .clo files, one would think that all the list variables would have been collected in one place.

David: Well yes I suppose so (although of course Frank mainly should get the credit for the good bits and any bad bits naturally we blame on compatibility with 2.09).

Yiannis: :)

David: I think it’s natural (as far as anything is natural in TeX) for explicit dimensions to be class options: LaTeX sets up the framework, the class sets up what list and display environments are available but it is not until you handle [12pt] that you can set explicit dimensions.

egreg: I’m late for the party, I know. I started using TeX for writing maths (the final “s” is a homage to David, of course); at the time the only way was AMS-TeX. Why did you choose LaTeX?

David: Well I was working in a CS department that had some very advanced LaTeX users but secretly I was wanting to get some math[s] papers finished off, then I just got interested in TeX for its own sake. So I had access to the psLaTeX developed at Manchester from the start with scalable fonts, separate formats for Times, Helvetica, Bookman. One of my earliest codings was the pspicture environment to make postscript lines to fit with those formats.

I never actually used AMS-TeX (but did look at LAmsTeX some years later).

percusse: Though hindsight is 20/20, is there anything from the XML world, something that we don’t get to hear from you much, that feels like it’s missing from TeX and friends? Or any emerging tech ideas to bring in for increased flexibility? The example I have in mind is the <key>=<value> usage (via many packages: pgfkeys, xkeyval, etc.), which made my life much easier when handling and taming tedious option sets. I don’t know if they are really recent but I think they gained popularity relatively late (probably due to TeX.SX too).

egreg: Don’t forget that David is the author of keyval, the first package implementing the key-value method in LaTeX.

percusse: Yep I tried to soften it after I posted, I feel dizzy from the immediate blush…

David: The main thing that XML systems offer (and the main thing that Leslie [Lamport] really tried to emulate as far as possible in LaTeX) is a separation of the syntax from the application. But in TeX, even in LaTeX, that is never possible. In (some version of) an ideal world there would be a high level declarative description of the allowable document markup and one or more implementations of that.

But in that world you can’t answer a question by some trick use of 31 \expandafters to poke a token into an internal command, so the ideal world sounds better but it is less fun.

Actually I think Timothy van Zandt gets that honour with PSTricks, but I think keyval may have been the first to offer a system that allowed you to apply the parser to different packages.

We did have an extensive keyval template system for LaTeX3 customisations of document class parameters. I’m not sure it ever got out though. Joseph Wright would have a better idea of its current state.

Joseph: I think you are referring to the template system. That now works quite well, after quite a lot of revision. There is also l3keys, which I wrote inspired by pgfkeys but without the object-oriented stuff. l3keys is the LaTeX3 version of the keyval package, more or less, but uses a keyval interface to set up complex keys, such as choices.

l3keys is how I accidentally joined the LaTeX3 Project.

David: Schedule an interview here in 20 years time, to see how you liked the experience.

Joseph: 🙂

egreg: Do you think we’ll have LaTeX3 by then? /just joking

David: You’re the working mathematician: how big can epsilon get? 2e is 2-and-a-bit waiting for 3 to come along.

egreg: Usually \varepsilon is small. 🙂 But when \LaTeX2$_\varepsilon$ came out it was a big achievement.

Paulo: Would the separation be like applying XSLT to the content and get the implementation?

David: Yes, perhaps. Which is what we do a lot here at NAG, but we are a closed world where we have complete control (over our own documentation) but it tends to fall apart if you want to strictly validate the allowed constructs in the input document and be able to extend the syntax on the fly to answer random questions on TeX.sx. It is not clear yet what the right balance is.

percusse: OK somebody has to ask this: I know you are reluctant to see any TikZ code floating around. What do you think of the TikZ/PGF/Beamer packages which I tend to see as a product family? And if you like you can put PSTricks into the game, making it a graphics question. Do you see them as fancy extras or things that filled a gap?

David: I’m not reluctant to see TikZ. I think it’s a good thing (I used to know the innards of PSTricks quite well, and as I mentioned above I did one of the first re-implementations of picture mode for a postscript back end); the running gag about TikZ is that it’s one of the few things that’s completely alien to me.

A lot of questions and more answers use it, but I had never even heard of it until I started on this site this year.

percusse: Excuse my teaser 🙂

Paulo: How do you see the new gems like XeTeX and LuaTeX?

David: LuaTeX and XeTeX are different: I knew about those but had not actually used either. I have run them now (so long as I get a MWE from the OP). The one time I tried to make a XeLaTeX file when not presented with a MWE I messed up fontspec completely 🙂

Paulo: How do you feel about TeX.sx?

David: It’s been fun, and might have got my TeX activity back to the point I start reading rather than archiving the LaTeX3 internal mails, we’ll see…

Paulo: Do you have a favorite answer of yours? 🙂

David: Not sure, I suppose this one just about typifies my preferred answering style: How do you swap the commands for two symbols?

Werner: Yes, that was… is epic!

egreg: I’ve used 15 \expandafter‘s in a row; what’s your career high? 🙂

Werner: Here is perhaps another career high in Philippe Goutet’s answer: Why do we need eleven expandafters to expand four tokens in the right order?

David: When not trying to use as many as possible to avoid work I’m sure I once used 31 to do something useful, but I can’t find it now.

Of course you get bonus points if you do something with a long line of \expandafter that isn’t 2^n-1 in length 🙂

egreg: Fully agreed!
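
The 2^n-1 figure comes from a simple recurrence: to expand the token after some head token one more time, every \expandafter already in the row needs another \expandafter in front of it, and so does the head token, so c(n+1) = 2c(n) + 1 with c(1) = 1. The small Python simulator below is only a simplified model (tokens are strings, macros take no arguments, and the macro names \m0, \m1, ... are hypothetical); it is not TeX itself.

```python
EA = r"\expandafter"

def expand_once(tokens, macros):
    """Perform one expansion step on the first token, loosely TeX-style."""
    if not tokens:
        return tokens
    head, rest = tokens[0], tokens[1:]
    if head == EA:
        # Set the next token aside, expand whatever follows it once, put it back.
        saved, following = rest[0], rest[1:]
        return [saved] + expand_once(following, macros)
    if head in macros:
        return macros[head] + rest
    return tokens  # unexpandable token: nothing happens

def run(tokens, macros):
    """Keep expanding while the list still starts with \\expandafter."""
    while tokens and tokens[0] == EA:
        tokens = expand_once(tokens, macros)
    return tokens

# Hypothetical chain of macros: \m0 -> \m1 -> \m2 -> ...
macros = {rf"\m{i}": [rf"\m{i + 1}"] for i in range(6)}

def expand_n_times(n):
    """Put 2^n - 1 \\expandafters before an unexpandable X to expand \\m0 n times."""
    chain = [EA] * (2**n - 1)
    return run(chain + ["X", r"\m0"], macros)
```

For example, expand_n_times(3) starts with seven \expandafters and ends with the two tokens X and \m3, so \m0 has been expanded exactly three times before X is ever looked at.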

Paulo: Can you name something you really like in LaTeX? And is there something you dislike?

David: I really dislike the pressure to type it in camel case. What I liked about it originally, and is still I think the main point, is that it makes my mathematics look readable (my handwriting is worse than my typing, so this is no small thing).

Ariel: Any hidden hobbies apart from stalking TeX.stackexchange and bringing World peace through longtable solutions?

David: Well hobbies and things got a bit re-aligned 8 years ago (not coincidentally the same time as me having less time on the LaTeX3 work) because of the other computer user in this room (who’s navigating the lego website if I recognise the tune 🙂

egreg: Does your kid use LaTeX? 🙂

David: 8’s probably a bit young for LaTeX but he does program in Scratch. For any of you wanting to introduce programming to children, I couldn’t recommend Scratch highly enough: scratch.mit.edu

And he does use the program that is probably the most widely installed Unicode-aware XML editor that also implements most of the math layout rules from Appendix G of The TeXbook. I dare not mention its name in this forum.

Ariel: Didn’t know about that! Do you think it’s geared towards any particular type of language e.g. say, Perl?

David: Scratch? It’s really nice: you make while loops and message-passing multi-threaded programming style just by pulling down graphical icons and slotting them together.

Ariel: Good to know. Programming should be made mandatory for all kids these days… as opposed to say learning about logarithms. It’s an uphill task getting useful stuff done without programming skills and just sticking to the click-everywhere culture.

egreg: Does he like cricket? Of course cricket had to be mentioned in the interview.

David: Sadly he seems to prefer football.

Paulo: Which team? 🙂

David: Playing rather than watching 🙂

Paulo: ah! 🙂

Jasper Joy: By the way shouldn’t ConTeXt be pronounced like contact since TeX is pronounced like tech?

David: Leslie gave a talk on LaTeX4 at a TUG meeting some years ago, he said LaTeX about 50 times during his talk and pronounced it differently each time.

Jasper: Haha.

Ariel: How do you pronounce it: Laytech or Laaaaahtech or some other variant?

David: Laytech, more or less, but after 25 years of getting offered interesting clothing options, I’d quite happily call it something else.

egreg: Looking forward to the planned TeX.SX meeting in Oxford. Prepare a good list of pubs. 🙂

Martin Schröder: One of your mostly unsung heroics is XMLTeX. It’s actually what’s under the hood of Stephan Lehmke’s DocScape and a typesetting company in Leipzig uses it extensively. Any chance of a LuaTeX version of that?

David: XMLTeX is a strange beast. I wrote almost all of it in a couple of weekends so it was never any long term planned project. Sebastian used it in passivetex (which was why I wrote it) but then I thought no one used it and a decade goes by and then this year I have found two commercial typesetters using it, plus some questions from people wanting to use it in a student project, all come (to me) within the space of two or three months.

A large part of XMLTeX is messing with encodings so using a Unicode back end would simplify it a lot, but on the other hand parsing XML with TeX is a bit like the swapping two definitions answer I mentioned above: it’s an interesting TeX challenge but kind of pointless when you can use a real XML parser and XSLT to parse the XML and write out TeX-syntax TeX.

Martin: And I have to ask about the elephant: What do you think of ConTeXt?

David: I’ve never actually used ConTeXt but Hans was developing around the same time, and we met at several TeX meetings and had one specific workshop just for the ConTeXt, LaTeX and e-TeX teams (somewhere in Germany, I can’t remember). ConTeXt came after LaTeX so could benefit from that experience. It’s a nice system but I think the open package extensibility of LaTeX will always make it more popular, if more chaotic.

But it is good to have an industrial strength format that is pushing the development of the TeX engines more than LaTeX can, as LaTeX’s user base means that compatibility and stability are more important I think.

Martin: Oldenburg 1997. I was there. 🙂

David: Yes that’s the one!

Martin: Did you already have a look at Patrick Gundlach’s publisher (speedata.github.com/publisher)? It’s basically DocScape done in Lua, but lacks the access to LaTeX given by XMLTeX.

David: Not really (I’m not really using TeX these days 😉

Martin: Any chance of you working at the LaTeX3 output routine again? I think the current version is still the one you did with Frank in 2002. 🙂

Joseph: Give me a chance to finish the review first (not that I know what I’m doing!)

David: Never say never. (Frank and Chris were the main ones that worked on the 2e routine. I was working on some extensions to xor around the time we were doing the LaTeX Companion but then got diverted).

Martin: And any chance of seeing you at a TeX conference again? The next BachoTeX will be very special (20 years of BachoTeX)…

David: Conference travel isn’t so easy now as it’s hard to claim it’s work related, so possibly not, we’ll see.

Martin: So we’ll have to wait for a TeX conference in Oxford. 🙂

Joseph: As luck would have it, we are planning the UK-TUG AGM at the moment, and it may well be in Oxford, some time in the autumn 🙂

Paulo: What do you recommend for a newbie eager to learn TeX, LaTeX and friends?

David: Start with a complete document that you actually want to make, and go with the flow, don’t start off by asking questions like “I started using LaTeX yesterday how can I use lfgwhqhdq font at 12.8888 pt in alternate headers on a translucent pink background”. Save those questions for your second document.

Paulo: Any favorite resources? The TeXbook, The LaTeX Companion

David: Both those, but my main resource is to never close emacs and always have a buffer with latex.ltx in it.

Paulo: Thanks a million for this great interview!


Stay tuned for the next episode of TeXtalk!
