The number of hits to this blog has skyrocketed, from an average of 0 hits per hour to 27 hits per hour. I guess this means that it’s getting real: the book is coming out this summer! See the AMS Bookstore for the latest details and ordering information. Thanks to the American Mathematical Society for their great efforts in production and marketing. I worked with the production team on a cover design, and here it is…

What remains, on my side, are final details for production. I’ll be receiving the proofs in the next day or two, which gives me a few weeks to check that everything is in good shape. It also gives me time to make some important minor edits (e.g., changing “any” to “every”, and being more careful when talking about sets) — thanks to the reviewers for the many constructive suggestions. The bulk of the work will pass to the AMS teams who are working hard to publish this book.

I haven’t posted on this blog for a long time, and I don’t plan to post here in the near future. I will leave the blog up on the internet though, and I’ll have more to say when I see the text in hardcover for the first time.

Over the summer, I plan on developing a website to accompany the book: that’s where I hope you’ll find Python notebooks, interactive visualizations (developed with d3.js), the inevitable errata, and more.

Thank you for following, and I hope you enjoy the book!

I’m back in Santa Cruz again, where the Fall quarter just began. Fall in Santa Cruz means warm weather, busy surf on weekends, and time to Git Teaching!

In the programming community, Git and GitHub are popular tools for version control and sharing. As a pair, they accelerate project development and collaboration. I didn’t use them for writing the book (regrettably), but I decided to use them to create and share teaching materials. Thanks to the Udacity course on Git/GitHub, I was able to pick up the basics in about 5 hours. So what do I mean by using Git for teaching? I now have two public Git repositories on GitHub.

Git Lesson Plans

The first contains my lesson plans for the quarter. Since this is at least the 10th time I’ve taught this material, I usually prepare for class by scribbling down a half-page of notes to myself. I wanted something almost as quick as scribbling, but which would be easy to share with the world (and perhaps to polish later). So I made a little lesson plan template, made 20 copies (for 20 lectures), and put it up on GitHub. Using Markdown is almost as quick as scribbling, since I can edit it right in my web browser at GitHub (then print a copy for class). You can see my first lesson plan, from last Thursday.

Git Assessment Generator

The second repository is more ambitious. Since I’m teaching 70+ students, I want to create quizzes with some randomly generated questions. I considered ad-hoc solutions with PythonTeX, started poking around, and realized that there isn’t a sufficiently general Python package for creating and rendering questions for math assessments. WeBWorK has PGML and MathObjects, but it is Perl-based, oriented towards calculus and online assessments, and not as flexible as I wanted in the creation and rendering of questions and answers. I found myself a bit over my head in programming, but a contribution from Janis Lesinskis got things off the ground. He began the project at his GitHub repository. I forked the project at my repository and added code for flexible Python-generated questions. It’s not close to ready for the public, but I think Janis set up a great foundation on which I can build something useful. I’ll build it further as I write the first quizzes for my class, and I hope to leverage the power of LaTeX, Python, and Jinja for templating.
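To make the idea of randomly generated questions concrete, here is a minimal sketch in plain Python. It is not the code in either repository: the function names, the gcd question type, and the template string are all hypothetical, and `str.format` stands in for the Jinja templating mentioned above so that the example needs only the standard library.

```python
import math
import random

# Hypothetical question template; a real project might render this with Jinja and LaTeX.
QUESTION_TEMPLATE = "Compute gcd({a}, {b})."


def make_gcd_question(rng):
    """Return a (question_text, answer) pair built from two random integers."""
    a = rng.randint(100, 999)
    b = rng.randint(100, 999)
    return QUESTION_TEMPLATE.format(a=a, b=b), math.gcd(a, b)


def make_quiz(seed, n_questions=3):
    """Build a reproducible quiz: the same seed always gives the same questions."""
    rng = random.Random(seed)
    return [make_gcd_question(rng) for _ in range(n_questions)]


# Example: one quiz variant, keyed by an arbitrary seed (e.g. a student number).
for text, answer in make_quiz(seed=2016):
    print(text, "Answer:", answer)
```

Seeding the generator per student or per section is one way to hand out different variants while still being able to regenerate the answer key later.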

How are these related to the Illustrated Theory of Numbers?

The Illustrated Theory of Numbers is a text, designed primarily for print media. It is not open-source, though I am sharing some excerpts and some methods I used to create it. On the other hand, I want to build an open community to share resources related to teaching number theory — lesson plans, assessments, and resources outside of the text I’m writing. My own lesson plans might be helpful to instructors who wish to teach out of the book I’m writing, and so I’m happy to post them for the world to see. Similarly, writing quizzes is time-consuming, and I think that sharing assessments is a good way to build a community.

Unlike the calculus textbook industry, I don’t want to put out new editions every year, charge for online accessories, etc. I hope that by the time the book appears in print (mid-2017?), students and faculty can find a great set of free complementary resources online.

Today I received a very nice package from the AMS — a complete color printout of a draft of the book! This means it’s time to proofread, and think of all the things I should have done along the way.

So here is a retrospective look at what I should have done.

Created a stylesheet as I was writing. For example, when abbreviating circa (as in Euclid lived c.300 BCE), is there a space as in “c. 300” or not as in “c.300”? I’ll look it up… but my life would have been easier if I had kept a running list of style details for consistency throughout the manuscript. Credit goes to Ellen Muehlberger for this advice: when I told my wife about this problem, she said that her advisor (E.M.) told her to maintain such a stylesheet throughout her dissertation. Great advice!

Used a version control system throughout. I just took the Udacity course on Git and GitHub yesterday (better late than never). It’s an excellent quick class, and I’ll be using GitHub for producing a webpage for the book, sharing teaching materials, etc. Unfortunately, my current book files are organized like my camping gear… randomly stashed in large not-quite-clean tubs.

Considered CMYK color issues from the outset. I had read and heard about color issues before, but didn’t do anything about them. So now I have a large book, and of course the colors look great on a monitor (in RGB) and varying stages of terrible in print (in CMYK). Fortunately, I have been using xcolor throughout, which gives me precise control over colors in the CMYK (or any other) colorspace. I was a bit surprised at how different things ended up in print. For example, a standard red corresponds to 0% Cyan, 100% Magenta, 100% Yellow, 0% Black in CMYK space. I wanted to desaturate the red when filling large areas; the xcolor setting red!50 yields the result 0% Cyan, 50% Magenta, 50% Yellow, 0% Black, which makes sense. But on paper, this looks distinctly orange, in a sort of carrot-left-in-the-freezer-too-long way. Similarly, blue!50 comes out pale purple. The solution is fun, if a bit time-consuming. I printed a CMYK color chart which I found via Stack Exchange, asking the local print shop to use their nicest inkjet and book-quality paper. I’m using this printed color chart to choose my colors now. So, instead of using a command like blue!50, I’ll define my own color (blueB perhaps) as 60% Cyan, 10% Magenta, 0% Yellow, 7% Black, and use this wherever I used blue!50 before. I’ll probably give the local print shop some business with color experiments over the next month. There’s really no way for me to match what the eventual offset printer will do, but I’m hoping to get as close as possible.
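The arithmetic behind red!50 can be sketched in a few lines of Python. This only mimics the linear blend with white described above; it is not xcolor itself, the helper names are hypothetical, and the CMYK values for pure blue are an assumption based on the usual RGB-to-CMYK conversion. The last helper writes out a hand-picked color in the form of an xcolor `\definecolor` line with the `cmyk` model.

```python
def mix_with_white(cmyk, percent):
    """Blend a CMYK color with white (0, 0, 0, 0), keeping `percent` of the color."""
    t = percent / 100
    return tuple(round(t * x, 4) for x in cmyk)


def definecolor(name, cmyk):
    """Format a CMYK tuple (components in 0..1) as an xcolor \\definecolor line."""
    c, m, y, k = cmyk
    return f"\\definecolor{{{name}}}{{cmyk}}{{{c:g},{m:g},{y:g},{k:g}}}"


RED = (0.0, 1.0, 1.0, 0.0)   # 0% C, 100% M, 100% Y, 0% K, as in the text
BLUE = (1.0, 1.0, 0.0, 0.0)  # assumption: pure RGB blue converted to CMYK

print(mix_with_white(RED, 50))   # half magenta, half yellow
print(mix_with_white(BLUE, 50))  # half cyan, half magenta
# The hand-picked replacement from the color chart, named blueB in the text:
print(definecolor("blueB", (0.60, 0.10, 0.0, 0.07)))
```

The blend only halves the ink percentages; nothing in it accounts for how those inks look on paper, which is why a printed chart is the more reliable guide.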

As of 9am this morning, the Illustrated Theory of Numbers is under contract with the American Mathematical Society! So now is a good time to write about the process of choosing a publisher and settling on contract details. My sample size of book contracts is now 1, so I wouldn’t extrapolate too much from my personal experience.

Why the AMS?

I spoke with two other publishers along the way, and ended up with the AMS for a few reasons. First, the AMS Mission is “To further the interests of mathematical research, scholarship and education, serving the national and international community through publications, meetings, advocacy and other programs.” The AMS (registered as a 501(c)(3) not-for-profit) represents the research and educational interests of mathematicians like myself, and I doubt that this is the case for the large textbook publishers (e.g., Pearson, McGraw Hill, Elsevier).

Other reasons that I am excited to work with the AMS are the following:

I believe that the production quality of the book will be good, while the price will be about half that of the market-leading textbooks in elementary number theory.

The editor, Sergei Gelfand, and others were at all times professional, responsive, and helpful.

The AMS seemed to understand my goals for the book, and their goals and mine seem very close.

The AMS was responsive to my pickiness about design and layout, while reasonable about their own capabilities.

I trust the AMS as an organization — that if something goes awry, I have more support than strictly provided by the contract.

Negotiables

I consider some of the illustrations in the book to be a form of artistic output. I may adapt some of them and create posters, clothing, mugs, etc. So it was important to me that the AMS maintain the rights to publish the book and derivative works, except that I would maintain the right to produce (and possibly sell) artistic works. This is described in the contract.

I have some worries about electronic distribution of the text. As a reader, I appreciate that I can download whole books from Springer through my library. But as an author, I don’t really want a downloadable publication-quality PDF of the whole book to circulate freely on the internet. I think that some texts (e.g., the vast majority of research literature, publicly funded research and education projects) should circulate freely or very close to freely. But since the Illustrated Theory of Numbers is a personal project, not funded by the NSF or anyone else, and contains original artwork, I think it’s fair to protect copyright a bit. I also hope to make enough money for a little vacation.

I’m not 100% satisfied with the language around electronic distribution in the contract, but I think it’s about as good as it gets. Due to the rapidly changing landscape of electronic publication, the contract gives the publisher flexibility. One place where I requested a change was in electronic sales upon termination of the agreement. If the contract is terminated at some stage, rights to sell e-books also terminate shortly after. Although the language is not entirely to my satisfaction, I am placing my trust in the AMS to represent my interests as an author and their financial interests in protecting copyright.

Royalties with the AMS are based on a percentage of “net income”. This (apparently standard) term refers to the amount of money the AMS takes in from selling the book, minus some costs for returns. It’s thankfully not the same as “net profit” — the costs of production are not deducted. I found the percentage reasonable and generous.

In the Illustrated Theory of Numbers, the pictures serve different purposes. Some lend geometric insight to proofs. Others display logical flow. Others render an abstract concept. This post is about those which are data visualizations.

Using the term “data visualization” automatically increases web traffic, but I’m not just doing it for the hits. Instead, I think it’s time that the best data visualization practices are directed towards the most interesting data in number theory: the prime numbers being the prime example. While the data visualization community typically studies people and places and money and the natural world, the Illustrated Theory of Numbers gets its data from numbers themselves. It is one of the fascinating things about number theory that the data is entirely deterministic while at the same time obeying heuristics for random variables.

In this spirit, I’ve provided two drafts of a data visualization below, displaying the distribution of prime numbers up to 5 million. I’ll explain the editing process that led me from left to right.

My goal in this image is to provide the reader with a sense of the microscopic irregularity and the macroscopic regularity of the prime numbers. In the leftmost column, the prime numbers are thick bars (10 points, I think). Each column displays a range of prime numbers: the first displays primes up to 50, the second the primes up to 500, etc. The rightmost column displays the primes up to 5 million. In some ways, this is the simplest kind of data set: a one-dimensional distribution.

From the beginning, I decided on this basic layout of columns, so that by the rightmost column the image would appear smooth, growing gradually lighter towards the top as the primes spread out. The numbers which represent primes on the far left are replaced by densities on the far right. A number near 5 million has about a 6.5% chance of being prime.
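The 6.5% figure matches the standard prime number theorem heuristic, under which a number near n is prime with probability roughly 1/ln(n). A quick check in Python follows; the function name is hypothetical, and the list of column tops assumes the six columns run from 50 up to 5 million by factors of ten.

```python
import math


def heuristic_prime_density(n):
    """Approximate chance that a number near n is prime, by the 1/ln(n) heuristic."""
    return 1 / math.log(n)


# Assumed tops of the six columns: 50, 500, ..., 5 million.
for n in (50, 500, 5_000, 50_000, 500_000, 5_000_000):
    print(f"{n:>9,}: {heuristic_prime_density(n):.1%}")
```

This is only the heuristic density at a single point, not a count of actual primes.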

I made a lot of changes to this image, starting with the draft on the left (from a few years ago) and ending at the draft on the right (a few weeks ago). First, I pushed the prime number labels onto the bars. There might be some printing/clarity risks with white text on black bars, but it gets across the idea that the bars are the prime numbers and it reduces the chance of confusion that the same “ticks” apply to all columns.

In that spirit, I narrowed and separated the columns. This, I think, lightens the whole page, saves ink, and increases clarity. The red lines now indicate how each column is effectively contained in a tenth of the column to its right. I’ll admit there’s a bit of influence from the cover of Tufte’s Visual Display of Quantitative Information, though the subject matter is completely different. I hope the red lines also break the tendency to scan directly left-to-right, and indicate how data is squeezed into shorter intervals.

Also lightening the page, I changed the shading in columns 3-6. In the first two columns, solid black bars are used to represent prime numbers. But in columns 3-6, a shade of gray is used according to the density of primes in each bin. Among the numbers between 4000 and 4499, there are 60 prime numbers. Since 60/500 = 12%, I used a line segment at 12% black in the later draft. (With TikZ, this is accomplished by setting the color to black!12.)
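The bin-by-bin shading can be reproduced with a small sieve. The following is a sketch rather than the script used for the book: the function names are hypothetical, and the choice of ten bins of width 500 for the column up to 5000 is an assumption inferred from the 4000 to 4499 example.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes: return the list of primes p with p <= n (n >= 1)."""
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            # Cross out the multiples of p, starting at p * p.
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [i for i, flag in enumerate(sieve) if flag]


def bin_shades(limit, bin_width):
    """Return (bin_start, prime_count, percent_black) for each bin below `limit`."""
    counts = {start: 0 for start in range(0, limit, bin_width)}
    for p in primes_up_to(limit - 1):
        counts[p - p % bin_width] += 1
    return [(start, c, 100 * c / bin_width) for start, c in counts.items()]


for start, count, percent in bin_shades(5000, 500):
    # A TikZ/xcolor mix expression such as black!12 for the bin 4000-4499.
    print(f"{start:>4}-{start + 499}: {count} primes -> black!{round(percent)}")
```

Rounding to a whole percent is a convenience here; the rounding used in the actual figure is not stated.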

At first I was concerned that this would be too light, and I’ll see how it all looks when it’s printed professionally. But on the Ricoh printers here, the result looks good — even at 6.5% black (the density around 5 million), the gray is easily distinguishable from the white paper. And this fits with the principle of “smallest effective difference” described in Tufte’s Visual Explanations. It’s a bit hard (though not impossible) to see the primes spread out, as their density goes from 8.5% to 6.5% in the rightmost column. But that’s also part of the honest representation — it would be dishonest to the data to exaggerate the image to make the primes appear to spread out more quickly. The table of densities at the far right exhibits the gradual spreading unambiguously with numbers.

A note to the reader — the images tend to render with horizontal stripes on a computer monitor! Another reminder to print on a regular basis.

There’s probably a bit more tuning to do before publication. The primes deserve the effort.

To keep it short, without excuses: I’m back to writing!

I’m moving back to the United States from Singapore this summer. What this means, practically speaking, is that I have a lot of writing time between now and mid-September. During the 5-6 weeks when my life’s possessions are in a shipping container, I’ll be on writing retreat at an undisclosed location below.

I’m committed to having a first draft for the referees by June 4, and a final draft by mid-September, and everything is going according to schedule so far. I’ll update this blog periodically over the summer… so stay tuned for more shortly.

Like most blogs, the Illustrated Theory of Numbers blog went on a long hiatus. Unlike most blogs, this one has returned! Really, there has not been too much to show over the past few months, but I’ve gotten back to work on Part II of the book, covering binary quadratic forms (including all discriminants, Pell-like equations, a bit on SQUFOF factorization perhaps, the class number and the “Siegel bound”). This might strike the experts as a bit out of order — where is modular arithmetic already? — but I’m sticking with “global” number theory until Part III of the book. Of course, some readers might skip Part II and go directly to Part III, but I hope they will return to Part II to learn Conway’s beautiful “topographic” approach to binary quadratic forms.

I was heavily influenced by Conway’s visual approach to binary quadratic forms, found in Chapter 1 of “The (sensual) Quadratic Form.” It’s amazing how far you can go with his topographs. I think I can go through reduction theory, symmetries (a.k.a. orthogonal groups), and finiteness of the class number, without ever needing to multiply two matrices. As recent work of Savin and Bestvina illustrates, along with recent work of my PhD student Chris Shelley, Conway’s approach generalizes in interesting ways to binary Hermitian forms and beyond.

One thing I use often in Part II is the determinant. Fortunately, for two-by-two determinants, the geometry is relatively simple. Below is a two-page spread introducing the geometric interpretation of the determinant. A helpful bonus, presented in parallel, is the discrete version: Pick’s theorem for lattice parallelograms.

From mathematical and design perspectives, I like the idea of presenting parallel proofs in visual parallel on opposite pages. The left displays a theorem in continuous Cartesian geometry; the right displays a theorem in discrete geometry. The idea of dissection is the same, but the discrete version requires a bit more care.

I’m not exactly sure what to call the theorem on the right side of the page. It certainly falls under the purview of Pick’s Theorem but really, someone must have proved it before Pick, at least for parallelograms! I wouldn’t be surprised to see it in the work of Gauss or Eisenstein, if not earlier. Unfortunately, my German is not so good (nicht so gut?), though I can recognize the frequency of “Gitterpunkt” (grid-point) and “Gitterpolygon” (polygon with vertices on grid-points) in 19th century sources. Any reader who can find an earlier reference for Pick’s theorem, even for parallelograms (rectangles don’t count on their own!) gets an acknowledgment in the book!

Pick’s Theorem is useful here because it gives a cute geometric proof of a fact well-known to algebraists: a pair $v$ and $w$ of integer vectors forms an integer basis of $\mathbb{Z}^2$ if and only if $\det(v, w) = \pm 1$. Indeed, a grid-parallelogram of area one cannot cover any grid-points except its corners, by Pick’s Theorem. This avoids any mention of matrix inversion, for example.
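The geometric statement is easy to test by brute force. In the sketch below (hypothetical function names, not from the book), v and w are the two integer vectors, and the code counts the grid-points in the closed parallelogram with corners 0, v, w, and v + w. When the determinant is ±1, only the four corners should appear.

```python
from itertools import product


def det(v, w):
    """Two-by-two determinant of the integer vectors v and w."""
    return v[0] * w[1] - v[1] * w[0]


def lattice_points_in_parallelogram(v, w):
    """Count grid-points in the closed parallelogram with corners 0, v, w, v + w."""
    d = det(v, w)
    if d == 0:
        raise ValueError("v and w must be linearly independent")
    sign = 1 if d > 0 else -1
    xs = [0, v[0], w[0], v[0] + w[0]]
    ys = [0, v[1], w[1], v[1] + w[1]]
    count = 0
    for p in product(range(min(xs), max(xs) + 1), range(min(ys), max(ys) + 1)):
        # Write p = s*v + t*w; then s = det(p, w)/d and t = det(v, p)/d.
        s_num = sign * det(p, w)
        t_num = sign * det(v, p)
        if 0 <= s_num <= abs(d) and 0 <= t_num <= abs(d):
            count += 1
    return count


def covers_only_corners(v, w):
    """True when the parallelogram contains no grid-points besides its 4 corners."""
    return lattice_points_in_parallelogram(v, w) == 4


print(det((2, 1), (1, 1)), covers_only_corners((2, 1), (1, 1)))  # area one
print(det((2, 0), (0, 1)), covers_only_corners((2, 0), (0, 1)))  # area two
```

The coordinates s and t are compared as integer numerators against |d|, so the check uses no division and no floating point.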

From a design perspective, this two-page spread was a lot of fun (and a bit of work). A combination of \foreach and scoped \clip commands in TikZ allowed for the easy creation of dot-textures on the right page. Perhaps the toughest decision (and one that isn’t final) was the choice of four colors; they are called “pinkish”, “blueish”, “greenish”, and “orangeish” in my source file. Following some technical color-theory advice, I worked with a color tetrad: literally a rectangular arrangement in the HSV color wheel, converted for LaTeX via the xcolor package. Analogous regions, such as the triangles, are in the nearby colors blue and green. Less saturated colors are on the left page, where there are large regions of solid color. Fully saturated colors are on the right page, where the colors are in small dots. The real test of color will come when I print this out, along with another half-dozen copies with other rectangular tetrads of color, and see what looks the best.
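A rectangular tetrad is simple to compute: pick a base hue, a second hue a fixed offset away, and the two hues directly opposite them on the wheel. The sketch below uses Python’s standard colorsys module. The base hue, the offset, the saturation values, and the color names are all placeholder assumptions, not the values used in the book.

```python
import colorsys


def rectangular_tetrad(base_hue, offset=1 / 6, saturation=1.0, value=1.0):
    """Return four RGB triples whose hues form a rectangle on the HSV wheel.

    Hues are fractions of a full turn (0.0 to 1.0), as colorsys expects.
    """
    hues = [base_hue, base_hue + offset, base_hue + 0.5, base_hue + 0.5 + offset]
    return [colorsys.hsv_to_rgb(h % 1.0, saturation, value) for h in hues]


def as_definecolor(name, rgb):
    """Format an RGB triple as an xcolor \\definecolor line in the rgb model."""
    r, g, b = rgb
    return f"\\definecolor{{{name}}}{{rgb}}{{{r:.3f},{g:.3f},{b:.3f}}}"


# Fully saturated for small dots, less saturated for large solid regions.
dots = rectangular_tetrad(base_hue=0.0)
fills = rectangular_tetrad(base_hue=0.0, saturation=0.5)
for i, rgb in enumerate(dots):
    print(as_definecolor(f"tetrad{i}", rgb))
```

Generating several tetrads by varying the base hue and offset would be one way to produce the half-dozen printed variants for comparison.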