Irreal

Tuesday, April 26, 2011

Moving Day

I'm moving the Irreal blog to my new Web site at irreal.org. I like Blogger and have been happy with it, but I want to host the blog under my own domain. Since I'll be using WordPress, my workflow will also be easier because I can use org2blog to post directly from Emacs.

I'll leave this site up at least until I get all the posts here moved to the new site and probably longer than that. I hope you'll visit the Irreal blog at its new home.

Saturday, April 23, 2011

Supreme Court Briefs Before Technology Took Over

In view of my post about writing theses before TeX, this article in Forbes by Ben Kerschberg on Remembering The Magic of Supreme Court Briefs Before Technology Took Over really resonated with me. Kerschberg remembers what it was like preparing briefs for the Supreme Court in the “old days” of 1995. In those times, briefs were still printed using hot-metal typesetting, an expensive and time-consuming process. Although Kerschberg doesn't mention it, the Court had its own hot-metal press that it used to print opinions and even memos between the justices.

Kerschberg describes going to the printers when briefs came off the press to proof them. They always dressed casually because the proofs were still wet from the ink and they invariably got the ink all over themselves. As with the technical typing I described in my post, errors were extremely expensive because a single typo would mean that a whole line or even paragraph would have to be reset and reprinted. This, in turn, meant they would have to be reproofed for any additional errors that got introduced.

The article is interesting and worth a read. It's another reminder of how much technology has made our lives better. It's easy to bemoan the loss of the beautiful printing that hot-metal presses produced but no one misses the extra work it resulted in, even for those not involved with the printing itself.

First Draft of R7RS For The Small Language

As I recounted in my Two Schemes post, the Scheme Steering Committee decided to split Scheme into large and small languages and formed working groups for each. The small language working group has just published their draft R7RS for the small language. I've read over the draft and am very pleased with it. At 67 pages it remains small and concise and, in fact, it retains the flavor of R5RS as promised.

So what's different? There's a complete list of changes at the end of the draft, and also in the post announcing the draft, but here are some that appealed to me:

Modules are now part of the language definition. This is important because it was one of the major sources of incompatibilities between different implementations.
Records à la SRFI 9 are now included as part of the language.
Parameters, sort of like dynamic variables from Common Lisp, are introduced as global variables that can be safely dynamically rebound. Objects such as current-input-port, current-output-port, and the newly introduced current-error-port are defined as parameters.
Blobs, homogeneous vectors of integers in the range [0..255], are a new disjoint type. This is to allow Scheme to better handle binary data. Ports can now be defined as character or binary.
letrec* is now part of the language and internal defines are specified in terms of it.
case now supports a ‘=>’ syntax analogous to cond's.
case-lambda is now part of the base library as a way of dispatching on the number of arguments.
string-map, string-for-each, vector-map, and vector-for-each round out the sequence operations.
Character escaping in strings is expanded.
Case folding in the reader is configurable with no folding as the default. Generally I don't care about this because I always use lower case anyway and don't understand why anyone would do otherwise. As an Emacs user I already do enough chording without worrying about capital letters. Still, it does raise an issue: as soon as Louis Reasoner sees that case matters he starts using really ugly camel case identifiers like GeeIWishIWereCodingTheWindowsKernel. Now Louis is welcome to do whatever he likes as long as I can ignore it, which I can if there is case folding. Of course, once Louis gets used to case mattering, we get stuff like
```
(define sum-odds
  (lambda (lst)
    (define SUM-ODDS
      (lambda (lst result)
        (if (null? lst)
            result
            (if (odd? (car lst))
                (SUM-ODDS (cdr lst) (+ (car lst) result))
                (SUM-ODDS (cdr lst) result)))))
    (SUM-ODDS lst 0)))
```
Mostly, I just wish this issue would magically go away. Or better yet that people would be sensible in their coding conventions.
No Unicode support is required but the set of characters used by an implementation is required to be consistent with the latest Unicode standard to the extent that the implementation supports Unicode.

Most of this is already in R6RS—indeed, what isn't in R6RS—and almost all of it can be added by the user. For example, we've all written code like

(define string-map
  (lambda (f str)
    (list->string (map f (string->list str)))))

as a quick and dirty string-map but now we have a, presumably, more robust version that handles multiple strings. Even the reference implementation of parameters can be done at the user level; this is, after all, one of the strengths of Scheme. What's nice is not that a lot of the work is done for us (well, OK, that is nice) but that these things are now standardized and portable. There's more than I've mentioned here so be sure to check out the draft itself or at least the list of changes from R5RS.

I'm looking forward to the large language R7RS draft. It will be interesting to see if the working group does indeed manage to produce the “happier outcome” that the steering committee promised.

Monday, April 18, 2011

More on Passwords

The other day, I saw this little piece of fluff positing that passwords such as “this is fun” are more secure than passwords such as, say, “j4fS<2” because they're longer and easier to remember. In fact, the author says that “this is fun” is 10 times more secure than “j4fS<2”. There's a lot wrong with this article and I was going to blog about it but Troy Hunt beat me to it with this excellent post.

There's a lot of confusion and, let's face it, just plain cluelessness about choosing safe passwords so it's good to have articles like this one shed some light on the matter. As it turns out, Hunt has written a series of posts on passwords, all of which you can see here. I urge you to read these; the examples he gives of banks, airlines, and others using horribly insecure password policies will take your breath away. Think I'm exaggerating? How about ING using a 4-digit PIN as a password for online banking? There are plenty of others too.

Hunt was rightly puzzled by all this so he decided to query some of the worst offenders as to the reasons for their policies. He describes his results in The 3 reasons you're forced into creating weak passwords. I won't ruin the surprise answers that came back—you should read them yourself. Of course, the answers are sadly all too predictable for there to be much surprise but you should still read them.

It's not all fun and games, though. There's a lot of useful information in these posts. For example, I didn't realize that passwords over 14 characters are essentially safe from rainbow table attacks. In view of that, I'm considering having my password generator output 15 characters instead of 10.
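To put rough numbers on that change, here's a minimal sketch in C. It assumes every character is drawn uniformly at random from a 64-symbol alphabet, as in the makepw generator; the function names bits_per_char and password_bits are made up for illustration. Under that assumption each character contributes 6 bits, so the arithmetic is just multiplication.

```c
/* Whole bits needed to index an alphabet of the given size,
 * i.e. floor(log2(alphabet_size)). Returns 0 for sizes below 2. */
unsigned bits_per_char(unsigned alphabet_size)
{
    unsigned bits = 0;

    while (alphabet_size > 1) {
        alphabet_size >>= 1;
        bits++;
    }
    return bits;
}

/* Size, in bits, of the space an exhaustive search has to cover for a
 * password of `length` characters. Assumes (hypothetically) that each
 * character is chosen uniformly at random from the alphabet. */
unsigned password_bits(unsigned length, unsigned alphabet_size)
{
    return length * bits_per_char(alphabet_size);
}
```

With a 64-symbol alphabet, password_bits(10, 64) is 60 and password_bits(15, 64) is 90. This says nothing about rainbow tables in particular; it only measures how much bigger the search space gets.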

Saturday, April 16, 2011

Lisp Is Not An Acceptable Java

Today (April 16) on the Lisp reddit I saw a link to Lisp Is Not An Acceptable Java, a post that repeats every known myth and piece of misinformation about Lisp. Even the title is snort-worthy. After reading the post and shaking my head sadly, I moved on to other things but my mind kept jumping back to the article and I thought that, really, this shouldn't go unanswered. So I got all fired up to give it a good Fisking. But when I read it again, I thought, “this guy is obviously a troll and shouldn't be fed.” Then I noticed that the date on the post was 2011-04-01. OK, an April Fools' joke and a good one. The author, Matthias Benkard, is, of course, a Lisper.

What's sad about this is that it works so well. The attitudes and misinformed opinions that Benkard parodied are common and you do see them expressed all the time. Just last week I blogged about LispWorks' article Common Lisp—Myths and Legends, which was written to rebut the very things that Benkard was satirizing.

I'm not sure what to do about all this. Nobody, it seems, is. All we can do is ~~smack down~~ gently correct those who repeat these misconceptions and hope that the meme of Lisp's irrelevance and manifest shortcomings is not too firmly established to be destroyed.

UTF-8

We all know what UTF-8 is, sort of. It's a character encoding that vastly expands the number of available characters (the 4-byte form carries 21 bits of payload, although Unicode itself stops at U+10FFFF) yet is still compatible with ASCII. This is a good trick and I occasionally wondered how it worked. I tried reading Pike and Thompson's paper about it when they first introduced it in Plan 9 but while the paper is good on history and “whys,” it doesn't go into much detail on how the encoding works.

Happily, via reddit, I just stumbled across the best explanation of UTF-8 that I have ever seen. It's short and simple and explains everything in a very clear way. If you don't already know all about UTF-8, you should definitely spend 5 minutes with this article.

In conjunction with the above, you might also want to read Xah Lee's excellent post on Emacs and Unicode Tips. It shows you how to get access to these extra characters in Emacs and gives some guidance on integrating UTF-8 into your day-to-day work with Emacs. For that matter, Lee has a Unicode Tutorial page that you might find useful.
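For a rough feel for the mechanics, here's a minimal sketch in C of the encoding as Unicode defines it today: a code point becomes 1 to 4 bytes, the leading byte announces the length, and every continuation byte starts with the bits 10. The function name utf8_encode is my own and doesn't come from any of the articles above. The sketch rejects surrogates and anything beyond U+10FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-8 into out (room for 4 bytes).
 * Returns the number of bytes written, or 0 if cp is not encodable. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {                      /* 0xxxxxxx: plain ASCII */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not characters */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, U+20AC (the euro sign) comes out as the three bytes E2 82 AC, while plain ASCII such as 'A' stays a single unchanged byte, which is where the ASCII compatibility comes from.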

Friday, April 15, 2011

Generating Strong Passwords

I'm working on a project that requires several passwords that will be used only occasionally but are not “throw away” passwords. I also need them to be reasonably strong. I could just make some up and save them in an encrypted file, of course, but that would mean just one more file to update and maintain. What I really want is some way to generate them on the fly whenever I need them. The other day I saw a password generator, SHA1_Pass, that does pretty much what I need. What I don't need, though, is a heavy duty app with GUI dialog boxes and all that.

So in the DIY/NIH spirit I rolled my own. I want to just type in a name such as “Joe” or “Mary” and get back a strong password. A simple way of doing that is to hash the name with SHA1 and map the result into printable characters. For a little extra security, I concatenated some random characters to the name before I hashed it. Here's the resulting C code:

 1:  /*
 2:   * makepw.c -- generate strong passwords given a key phrase
 3:   *
 4:   * Compile with:
 5:   *    gcc -Wall -o makepw makepw.c -lcrypto
 6:   *
 7:   */
 8:  
 9:  #include <openssl/sha.h>
10:  #include <stdio.h>
11:  #include <stdlib.h>
12:  #include <string.h>
13:  #include <stdint.h>
14:  
15:  int main (int argc, char **argv)
16:  {
17:      int i;
18:      int j;
19:      uint32_t wd;
20:      char xlat[] = "abcdefghijkmnopqrstuvwxyzABCDEFGHIJKLM"
21:          "NOPQRSTUVWXYZ023456789.!-$";
22:      char buf[ 256 ] = "Some-Secret-Key";
23:      union
24:      {
25:          uint32_t s1[ 5 ];
26:          unsigned char md[ 20 ];
27:      } u;
28:      
29:      if ( argc != 2 )
30:      {
31:          puts( "wrong number of arguments" );
32:          exit( 1 );
33:      }
34:      strcat( buf, argv[ 1 ] );
35:      SHA1( (unsigned char *)buf, strlen( buf ), u.md );
36:      for ( i = 0; i < 2; i++ )
37:      {
38:          wd = u.s1[ i ];
39:          for ( j = 0; j < 5; j++ )
40:          {
41:              putchar( xlat[ wd & 0x3f ] );
42:              wd >>= 6;
43:          }
44:      }
45:      exit( 0 );
46:  }

As you can see, the code is straightforward. The random characters are already sitting in buf so I just concatenate the input into buf and call SHA1 to hash the result (lines 34–35). In lines 36–44, I translate the result into printable characters 6 bits at a time using xlat and output each character as it's produced. The upper 2 bits of each word are discarded. The astute reader will notice that the union introduces an endian issue so if you are going to run this on machines with different architectures you will have to compensate for that. An easy way is to replace line 38 with

wd = htonl( u.s1[ i ] );

Note that htonl is declared in <arpa/inet.h>, so that header has to be included too. Also notice that there's a buffer overflow vulnerability because the size of the input isn't checked, but this is just a quick hack for my own use so I kept it simple.
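If you'd rather not live with that overflow, one way to close it is to let snprintf do the concatenation and report when the result would not fit. This is only a sketch: build_input is a hypothetical helper of my own naming, not part of makepw.c above.

```c
#include <stddef.h>
#include <stdio.h>

/* Write key followed by name into buf, which holds bufsize bytes.
 * Returns 0 on success, or -1 if the result (plus the terminating
 * NUL) would not fit, in which case buf should not be used. */
int build_input(char *buf, size_t bufsize, const char *key, const char *name)
{
    int n = snprintf(buf, bufsize, "%s%s", key, name);

    if (n < 0 || (size_t)n >= bufsize)
        return -1;      /* encoding error or truncated output */
    return 0;
}
```

In makepw.c this would take the place of the strcat call, with the program exiting when build_input returns -1.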

If you need more (or fewer) characters in the password, just adjust the end count of the i for loop: you get 5 characters each time through the loop, up to a maximum of 5 iterations because the SHA1 digest holds only five 32-bit words.
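Another way around the endian issue is to skip the union altogether and assemble each word from the digest bytes by hand. The sketch below is a variation under stated assumptions, not the code that produced the password shown in this post: load_be32 and digest_to_password are hypothetical helper names, and the words are read big-endian, so the output matches the htonl variant rather than what the original produces on a little-endian machine.

```c
#include <stdint.h>

/* Same 64-character translation table as makepw.c. */
static const char xlat[] = "abcdefghijkmnopqrstuvwxyzABCDEFGHIJKLM"
                           "NOPQRSTUVWXYZ023456789.!-$";

/* Read 4 bytes as a big-endian 32-bit word, whatever the host byte order. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Turn the first nwords words (1..5) of a 20-byte digest into 5
 * characters each. out needs room for 5 * nwords + 1 bytes.
 * Returns 0 on success, or -1 if nwords is out of range. */
int digest_to_password(const unsigned char md[20], int nwords, char *out)
{
    int i, j;

    if (nwords < 1 || nwords > 5)
        return -1;
    for (i = 0; i < nwords; i++) {
        uint32_t wd = load_be32(md + 4 * i);
        for (j = 0; j < 5; j++) {
            *out++ = xlat[wd & 0x3f];   /* low 6 bits pick a character */
            wd >>= 6;                   /* top 2 bits of each word go unused */
        }
    }
    *out = '\0';
    return 0;
}
```

Passing nwords as 2 gives the same 10-character length as the program above, and 3 gives 15.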

When I run makepw with input jcs I get

WRQgWEN-2Z

This works well for me but if you want to do something like this you may want to replace the Some-Secret-Key in line 22 with a key that you enter instead. Then you would call it as

makepw secret-key jcs

That provides a more secure application at the expense of having to remember and enter a master key.

On the other hand, you could make it more like SHA1_Pass and not worry about a secret key at all. Instead of entering just a name like “Joe” or “Yahoo!” you would enter some random phrase like “My sister Mary's password for Yahoo!”.

We can refine this a little further. One nice thing about SHA1_Pass is that it puts the password into the clipboard for you so that all you have to do is paste it into wherever you need it. I'm too lazy to figure out how to do that programmatically and since this is running on a Mac I'd probably have to use Objective-C. Fortunately, there's an easier way. I just wrote a tiny script that pipes the password into pbcopy, an Apple utility that copies its input to the paste buffer. X users can use the xclip utility instead.

#! /bin/bash
makepw "$1" | pbcopy

Now just name this script make-pw, make it executable, and put it in your execution path and you're good to go. If you're a Mac user, you can use Automator to turn it into a service but if you're going to do that you might as well use (the free) SHA1_Pass.

A final note: The above code was run as a Babel code block and directly generated the resulting password. The #+begin_src line is

#+begin_src c -n :flags -lcrypto :cmdline jcs :exports both

The more I use Babel, the more I like it.

Wednesday, April 13, 2011

The Setup

The other day I got a pointer from Hacker News to an interview with Russ Cox on The Setup. Russ is always interesting and I really enjoyed the interview. Then I made the mistake of clicking on the Archives link and discovered many, many more. It's an excellent black hole for any spare (or not so spare) cycles that you may have. I've already wasted half a day on it.

Each interview asks 4 questions:

Who are you and what do you do?
What hardware are you using?
What software are you using?
What would be your dream setup?

That doesn't sound too promising but it turns out that those questions generate some pretty interesting interviews. I find it fascinating to hear what tools and hardware people are using and why. After you read a few of the interviews, the contrasts between the answers become another interesting aspect.

Many of the interviewees are software engineers as you would expect but there are all sorts of people represented: artists, photographers, scientists, writers, designers, and many others.

It's definitely worth a look but be warned: it can be addicting.

Tuesday, April 12, 2011

Org-mode and LaTeX

Speaking of Org-mode, which I seem to be doing frequently lately, the indispensable Emacs-Fu has a very nice post on producing PDF documents with Org-mode and LaTeX. Org-mode has always exported its documents to LaTeX (at least as long as I've used it) so this isn't anything new but djcb tells us how to change the fonts a bit so that your documents don't look like every other LaTeX document in the known universe.

That sameness in document look and feel is one of the reasons that I prefer to do my writing with Groff, but take a look at the sample document that djcb produces and see if you don't agree that it looks very nice. As usual, it's hard to read Emacs-Fu without learning something useful.

Monday, April 11, 2011

Using Scheme with Babel

As I mentioned in my Blogging and Babel post, Babel supports many languages that produce things other than pictures. Naturally, I wanted to try it out with Scheme. Here's a toy problem that demonstrates some of the things you can do.

Suppose we have some input arguments in a table


| 1 | 2 | 3 | 4 | 5 |

and we want to calculate the values of various functions of those arguments. We start with the definition of the input table and a Scheme source code block:

#+tblname: input
|1|2|3|4|5|

#+source: func-tab
#+begin_src scheme :var args=input :exports results
  (define fib
    (lambda (n)
      (let fs ((n n) (a 0) (b 1))
        (if (zero? n)
            a
            (fs (- n 1) b (+ a b))))))
  
  (map (lambda (n) (list n (* n n) (expt 2 n) (fib n))) (car args))
#+end_src

Babel delivers tables to Scheme as a list of rows so the input to the code block is ((1 2 3 4 5)). Likewise, if we want a table as output we should return a list of rows. That should help you understand what the code is doing. As you can see, the map produces a list of rows where each row is (n n² 2ⁿ fib_n). All of the functions except fib are built in, but I included fib to show that you can define functions in the code block if you need to.

Notice that we named the input table with the #+tblname: input line just before the table. Similarly, we named the code block with the #+source: func-tab line; we'll use that name a bit later.

When we run the func-tab code block by typing C-c C-c with the point in the block, we get the output table:


| 1 |  1 |  2 | 1 |
| 2 |  4 |  4 | 1 |
| 3 |  9 |  8 | 2 |
| 4 | 16 | 16 | 3 |
| 5 | 25 | 32 | 5 |

The table gives the correct results but suffers from the fact that it doesn't have a heading telling what each column is. We can fix that by pushing the list ("n" "n^2" "2^n" "fib_n") onto the front of the result of the map function in the func-tab code block but then we end up with


| n | n^2 | 2^n | fib_n |
| 1 |   1 |   2 |     1 |
| 2 |   4 |   4 |     1 |
| 3 |   9 |   8 |     2 |
| 4 |  16 |  16 |     3 |
| 5 |  25 |  32 |     5 |

which is better but still not what we want. The problem is that there is no hline under the headings so when we export the table to html no <th> tag is generated and we can't use CSS to format a nice looking table. What we want should look like

| n | n^2 | 2^n | fib_n |
|---+-----+-----+-------|
| 1 |   1 |   2 |     1 |
| 2 |   4 |   4 |     1 |
| 3 |   9 |   8 |     2 |
| 4 |  16 |  16 |     3 |
| 5 |  25 |  32 |     5 |

in Emacs. There's no documentation on how to do that but by trawling through the Org-mode mailing list I found some hints. At least in elisp and Scheme you can represent the hline by including the symbol hline as a row where you want it to appear. Thus we want to push

'("n" "n^2" "2^n" "fib_n") 'hline

onto the front of the result of the map in the func-tab block. We could do that by adding the code to the func-tab block but I want to demonstrate something else instead. In general, if we want to post process output from a block, we can merely call the block from our post processing code—that's why we named the func-tab block.

In our case we merely want to push a couple of new rows onto the result of the func-tab block but we could do any processing at all. To see how that works, we define a new block to do the processing:

#+begin_src scheme :var tab=func-tab :exports results
  (define cons2
    (lambda (a b c)
      (cons a (cons b c))))
  
  (cons2 '("n" "n^2" "2^n" "fib_n") 'hline tab)
#+end_src

Notice how we set our input, tab, to be func-tab which causes this block to use the output of the func-tab block. Now the neat part is that when we get ready to export to HTML we evaluate only this last block. This, in turn, will cause the evaluation of func-tab but no intermediate table will be produced so we get only the final desired result. If for some reason we want the intermediate results too, we merely evaluate the func-tab block directly.

Another nice feature of this method is that the blocks can be written in different languages. In fact, the final code block was written in elisp until I discovered that the same hline trick worked in Scheme.

When we evaluate the last block and export to HTML we get our final result. I've added a little CSS sugar to show how we can improve the output when we have an actual <th> … </th> heading.


| n | n^2 | 2^n | fib_n |
|---+-----+-----+-------|
| 1 |   1 |   2 |     1 |
| 2 |   4 |   4 |     1 |
| 3 |   9 |   8 |     2 |
| 4 |  16 |  16 |     3 |
| 5 |  25 |  32 |     5 |

I'm still learning about Babel and will probably post about it again as I learn new things.