Thursday, November 16, 2017

Harvard CS Concentrators Jump Again

During my time as Area Dean for computer science, the number of computer science majors at Harvard more than doubled.  Growth has continued, and according to the Crimson, we're now clearly the second largest concentration at Harvard.  (It looks like we just barely surpassed Government last year, and now we're clearly larger.  Economics is still bigger.)

When you consider that Applied Math is the 4th largest major group, and that's a group that is largely composed of people who do some mix of Math/Computer Science/Economics, I think the numbers actually underestimate the growth in CS at Harvard. 

It's been an exciting time since I started at Harvard in 1999.  One could see what was coming.  Though, sadly, the administration still seems far from having caught up -- we (like many other CS departments) really could use some more resources coming our way.   Universities are a bit slow to respond (sometimes for good reasons), and the speed of these changes is, in context, just remarkable.  And, I'd have to say, have added to the fun of being a CS professor.  Back in 1999, CS was incorrectly perceived to be a backwater at Harvard, small and unnoticed, but where there was a lot of excitement and energy.  Fast forward less than two decades, and now we're clearly at the core of Harvard and its future. 

Thursday, November 09, 2017

BARC, Copenhagen

A few summers ago, I had an opportunity to visit Copenhagen, and work with Rasmus Pagh for a month.   I (and the family) liked it so much we went back the next summer for a hashing workshop.  While algorithms and data structure people were already familiar with Denmark's Aarhus and MADALGO, it was clear to me that Copenhagen was on the rise as an algorithmics center.

That's now been formalized, thanks to a huge new grant for algorithms research centered around a collection of fantastic researchers in Copenhagen -- and they're looking for graduate students, post-docs, visitors, and faculty.  Rasmus sent me a blurb that I'm posting below, introducing the new Basic Algorithms Research Copenhagen, or BARC.

This is wonderful in so many ways.  First, it is always great generally for our community when algorithmic work is recognized and funded.  Second, this is a fantastic group of researchers -- for example, obviously I am biased, but Rasmus and Mikkel Thorup, both individually and together, have just been knocking it out of the park over and over again for years in algorithms and data structures (particularly with regard to hashing).  Third, for many, this is a marvelous opportunity.  Copenhagen is just a wonderful city, and the chance to go and work there with the top people is something many people will enjoy.  (And don't we do better work when we're someplace we enjoy being?)

With all that, let me put up the blurb on BARC, and point you to the home page at barc.ku.dk and the Facebook page at http://fb.me/barcdk.  (Note there are deadlines coming up for some of the positions.)


BARC (Basic Algorithms Research Copenhagen) is new center for foundational research in design and analysis of algorithms. The center, which will be inaugurated onDecember 1, is led by Mikkel Thorup. It aims to "attract top talent from around the world to an ambitious, creative, collaborative, and fun environment” in Copenhagen, Denmark. Other core scientists in the center are Stephen Alstrup, Thore Husfeldt, and Rasmus Pagh. The center is part of a larger effort increasing the number of faculty members in algorithms research, and making Copenhagen a hub for algorithms research. The BARC center has received funding of over 6 million USD for the next 6 years from VILLUM Fonden to employ graduate students and post-docs, and to attract a stream of long- and short-term visitors. People interested in BARC should visit the web pages at barc.ku.dk or follow @barcdk on social media — the first assistant/associate professor, post-doc, and PhD student positions are already announced, with deadlines in November and December.



Wednesday, September 06, 2017

Simulated Annealing for JPEG Quantization

I have a new paper on the arxiv (web page describing results, and full version** available) with two Harvard undergrads, Max Hopkins and Sebastian Wagner-Carrena (who may both be applying for grad school, and who are both great, hint, hint everybody...), titled "Simulated Annealing for JPEG Quantization", on finding better default JPEG quantization tables.   This work started as their class project for my graduate class on "Algorithms at the end of the wire" last year, and they stuck with it through the year and into the summer to hone it to a paper.  I'm always excited when class projects to turn into papers, especially when it's done by undergrads.

Here's the story of the paper.  To start, as the wikipedia page on JPEG says, in JPEG after you transform 8 by 8 pixel blocks into 64 coefficients in the frequency domain, you "quantize" the coefficients -- that is, divide each coefficient by a fixed number (with a different number for each coefficient), and then round.  This reduces the amount of information needed to represent the block, at the expense of some loss of fidelity.  In particular, high frequency coefficients are quantized with much higher numbers, so that most high frequency coefficients become 0, further improving available compression.  This is generally OK because reducing the higher frequency components has less of an effect on the human interpretation of the image than reducing the lower frequency components.  

There are "default" quantizations tables built into the JPEG standard.  Back when the powers that be developed the original JPEG quantization tables, they were essentially choosing them "by hand", with much less to go on today about models for the human visual system, and much less computing power available.  These days, we have better models, or better said actual metrics, of how "good" a compressed image is with respect to the original in terms of the human visual system, and a great deal more computing power.  This suggests we should be able to effectively explore the very large space of possible JPEG quantization matrices using heuristic techniques, such as simulated annealing.  This idea has been considered before, but we seem to be at a much better point in terms of metrics and computation now than with past attempts.   

Of course it's still not so easy (it is a 64-dimensional space, and the metrics have their own issues), and there was a good amount of work and testing to be done (and some insight to be gained).  But we found what we think are clearly some better "default" JPEG tables (that we've made available at the web page above), and because of the way JPEG works, you can add them in to your JPEG file to use in place of the "standard default".  So it's all backward-compatible, ready-to-use, no need to change JPEG, which is ubiquitous enough that actual changes are unlikely to happen.  Any experts out there, go ahead and try them out and let us know what you think.  Also, I think the work generally opens the door to others to aim for additional improvements, using different metrics and/or different heuristic techniques.

** The arxiv version is, strangely, not the full version.  Apparently, arxiv has "size limits", and because we're doing side-by-side comparison of many images, apparently we're above the size limit, and they wouldn't give us an exception.  I hadn't known that could happen on arxiv, something to keep in mind in the future.  

Tuesday, September 05, 2017

Grace Hopper College

I hadn't seen the news about Yale renaming Calhoun College to Hopper College;  it just popped into one of my newslines here in the New York times.  I guess I hadn't seen that it was in the works months and months ago, as described in this Yale News article.  Consider me behind the times, but wanting to spread the news in case others hadn't seen it.

That's just awesome.

Saturday, August 12, 2017

Best Post I've Read on the Google Memo

After the shout-out to Meena, she suggested I might have more to say on the issue of the Google memo.  I (like I imagine so many others) have been following the array of follow-up articles and posts since this issue erupted.  There is so much to that needs to be said, and I feel that I am not the best person to say it -- what so many others have said in response.  In fact, I'm sad and angry that collectively so many of us are spending so much time thinking about this memo -- to counter that, my hope is that just having the issue of diversity prominently in the news and in the tech-sphere mindset will lead to more discussions and action that will result in actual progress.  

But maybe I have a few things to say, and a pointer to someone who I think has said it best.

1)  Anyone who somehow thinks there doesn't remain active bias against women in mathematics, engineering, and computer science hasn't been talking to actual women in these fields.

2)  In my opinion (and I owe this expression of thought to David Andersen, although these are my words), this memo wasn't about "the science".  My opinion is that the memo author went in with a prior belief of the truth, and then found some science that (at best, does a very poor job) of justifying what he believes is the truth.  That is not how science works.  (David called this "confirmation bias in action".)

3)  While there have been many good posts on the topic, I found this post today says it best, and encourage you all to read it.

Saturday, July 22, 2017

Current Harvard Oddness

It's summer.  And I'm now on sabbatical.  So perhaps I shouldn't care about strange Harvard politics goings-on, but I can't help it.

Here's the tl;dr version, which is already too long.  (What's the recursive form of tl;dr?)  Sometime back, the powers-that-be at Harvard decided that they didn't like the Harvard final clubs (Harvard's kind-of-like fraternities, "social clubs" that have been around for ages, but that are not in any official way affiliated with Harvard).  There's plenty of reason not to like them, but at least initially concerns about sexual assault seemed to be the motivating factor.  So the powers-that-be decided that if you belonged to some private single-sex organization, they would not let you be captain of a sports team, or be approved by Harvard for a Rhodes fellowship, or things like that.  A number of faculty -- perhaps most notably, Harry Lewis -- objected to this policy, on multiple grounds.  (Perhaps one large one is that there are many private single-sex organizations that are quite positive, and it seems odd to put all these organizations under the same blanket policy.)  So after it was clear that there was some significant faculty objections, for a bit it was temporarily shelved, and a new committee put in place to make recommendations.

Several weeks deep in the summer, the report comes out, suggesting policies even harsher and more draconian than the original plan.  And the reasons for this outcome, from the latest-breaking reporting, seem at least somewhat confused.

There's now two issues, seemingly only tangentially related, but that have come together here.  The first relates to the suggested policies themselves.  But the second relates to how Harvard is governed -- can these types of disciplinary regulations for the students come into being without a vote of the full faculty.  The first, rightly, gets more attention, but as a member of the faculty, the second is of significant importance to me, and relates to a historical trend of taking power (or at least trying to take power) away from the faculty that seems consistent since I've arrived at Harvard.

I could write pages and pages about this, but fortunately for me (and you), the Crimson and Harry Lewis already have.  So for those who care enough to find out more, here are links.  The most recent Crimson piece, Seven Votes, suggests there's more behind the latest report that is worth knowing about, and probably is as good a starting point as any.  All of Harry's posts are interesting, but perhaps the best are where Harry relays his remarks from faculty meetings -- they're great to read, but I admit, it's much more fun to watch him deliver them in person.

Crimson articles (that last week or so)

Key Events in Harvard's Social Groups Sanctions Policy
Faculty Committee Recommends Social Groups be 'Phased Out'
Social Group Ban Revives Faculty Concerns Over Governance
Seven Votes    **** [most recent, and controversial]

Harry's blog (chronological order, back to about November).

My remarks to the faculty in support of the nondiscrimination motion
Why History Matters
My remarks to the FAS meeting of December 6th, 2016
Further Q&A on the motion for the December 6th meeting
The State of the Debate
Withdrawing the motion
Guest post on nondiscrimination
Where the rubber meets the road
How it would probably work in practice
Professor Haig's motion against compelled oaths
More about the implementation
I know it's a dumb question but
This week's development in USGSO policy
Harvard's nondiscrimination policy
The new policy about social clubs
Further comments on the social club policy

Friday, July 07, 2017

Mitzenmacher and Upfal, 2nd Edition

The word is that the 2nd edition of our book is now (finally) available/in stock at Amazon.  You can tell it's the 2nd edition, because the "Alice cover" is now in blue -- I think it's a good look. But even more than that, there's 100+ pages of new material, on things like VC dimension and Rademacher complexity, power laws, normal distributions, cuckoo hashing, algorithmic versions of the Lovasz local lemma, and more.  More exercises and such, too. (Of course, it's still got the good old stuff, including standards like Chernoff bounds, balls and bins, the probabilistic method, and so on.)  So if you've never bought it, now is the time to buy, and if you have bought it, give the old edition to a friend and buy the new version for yourself.

If you're getting it from Amazon, here's a friendly link below, which will give me some Amazon credit.  Thanks!


Monday, June 26, 2017

STOC General Comment Page

The 2017 STOC is over, and I thought it went very well.  The new format ran with what seemed to me to be minimal to non-existent glitches, and overall it sounded like people enjoyed it.  The local arrangements were terrific -- much much thanks to Hamed Hatami and Pierre McKenzie who made the whole thing look easy.  (It's not.)   I'd have liked a few dozen more people, but I'm hoping we'll see some positive momentum going into next year.

A heads-up to mark you calendars now that STOC 2018 will be held June 23-27, in Los Angeles.

I'm putting this post up to see if anyone wants to make general comments about the TheoryFest/STOC 2017 experience.  Feedback is always useful, and if there's any constructive criticism and/or wild enthusiasm for any parts of the 2017 STOC, we'll keep that in mind as we go forward next year.  Please, however, be respectful to those who did the work of putting everything together.

And for those who went commenting here doesn't absolve you from filling out the survey that will be sent out, though!