1. I've been very gradually upgrading this site back to life for a few years now. Very gradually #amirite. However, after earlier this year having found myself accidentally on the front page of Reddit, HN etc. with my post about building the IMDb boards, I found myself slightly embarrassed, not only by the amount of attention (40k+ uniques in the first two days, holy shit!), but also by people pointing out how clunky the site is to read. Often several times a day.

    The styling on the blog section, much like the rest of the blog section, wasn't in a terribly well developed state of completion. I just threw together some hand-written CSS to approximate the look and colours of my last existing Wordpress theme, which I had been fairly happy with. Now that theme was set up maybe ten years ago, and my initial port over to this 'new', self-built CMS is maybe four or five years old itself, and I had given no thought at all to mobile, or in fact any screen device very much different from my own laptop display. And my main laptop display is a 1024x768 pixel non-IPS Lenovo ThinkPad x220. That is probably a significantly worse screen than your phone has.

    In 2017 it's pretty stupid to build web pages just to be viewed by desktop browsers, so today I'm pushing out a rebuild of the display layer and theme, that hopefully works a little more responsively across varied devices. It should also be easier for me to evolve. I hope it improves things for my handful of select readers. I'm not terrifically good at front-ending, and my heart isn't often in it, but I have tried my best.

    I'd like to be updating this site more frequently again, he writes, like one of those bloggers apologising for never blogging, but a large part of getting any kind of schedule working there, is streamlining the publishing workflow. To that end, as well as a more modernised front-end and theme, today I've also released a new site deployment system, that allows me to update the site software more easily. This is clunky, but at least automated. Previously everything was just checked out into a home directory, hand compiled and run on the server. Now that's mostly still happening, but now it's all scripted with configuration management tools so I can release updates like this without having to remember exactly how to set it all up again by hand from first principles.

    Of course, for writing articles, I'm still shelling into the server and hand writing html files like a farmer, but it's all steps in the right direction. Sometimes I don't shell in to the server, I author the posts directly using emacs tramp-mode which practically counts as using a GUI round here.

    posted by cms on 2017-07-16
  2. ...as if millions of voices suddenly cried out in terror and were suddenly silenced. I fear something terrible has happened.

    Download an EPub edition of this post courtesy of redditor agonnaz

    Update: My erstwhile colleague Mathias wrote up his thoughts about his role in this story

    scribbled design notes

    Some time on Friday, IMDb announced that they intended to shut down their message board system, permanently. I don't find this to be a particularly surprising decision. I'm more surprised that the message boards are still there, in 2017, seemingly essentially unchanged for the last fifteen or so years. They've had a few coats of paint, and a handful of feature improvements, but they largely seem to be backed by the same system design developed by the in-house tech team, way back at the dawn of the century. And for the bulk of that early development time, I was the primary developer. As it has said on my homepage for many years, 'you can blame me for the message boards'.

    A long time ago in a galaxy far, far away

    I was incredibly excited to be asked to join the IMDb developer team at the end of 2001. Aged 30, with almost a decade of professional software development under my belt already. Although 2001 sounds today like it was the relative stone-age of the modern web, which of course in many ways it was. At this point I had already spent several years working on basic web applications in the original dot-com boom, and I was in awe of the IMDb, which even back then was a somewhat venerable internet institution. Founded in 1990, it thus predates the invention of the World Wide Web by several years, having started out as lists of data shared via USENET posts. At the time I joined, they were a couple of years into their Amazon ownership, and starting to expand the team.

    As I started, they were just on the cusp of launching IMDbPro and had an ambitious roadmap to completely rebuild the main website from the inside out, using the shiny new technology stack the small development team had built from the ground up to power the IMDbPro application server. This, I thought was a very clever hack - imdb.com was a hugely popular website, and this approach of adding industry focused features to a subscription remix of the site built on top of the same data feeds (still basically formatted text lists, using the conventions of the old USENET based tools) meant that in effect we could use the far smaller user base of the pro site as a test-bed for the new tech, and gradually port sections of this across to the terrifyingly high volume 'consumer' site, without having to do a rewrite and a relaunch. To further sweeten the deal, if you look at this arrangement, this meant that the test-bed users would actually be paying to break in the newer software, and helping you iron out the bugs.

    In 2001, a shiny new high performance web stack meant perl. Apache 1.3.x running mod_perl to be more precise. In case you don't know what mod_perl is, it's a piece of semi-deranged brilliance that wraps the perl language interpreter into an apache module as a persistent runtime and exposes the internal API of the HTTP server to it. This lets you write applications that are now effectively themselves apache webservers, with direct access to every part of the HTTP serving lifecycle. Furthermore, by using the other neat hack, Registry.pm, you could use modules or scripts that had been designed to work as CGI scripts, and get some of the same speed boosts, unmodified. With these techniques, you could write perl applications that went almost as fast as Apache could, and in the late 90s/early 00s it was this or PHP. PHP back then was pretty grotty, I thought, and the cool kids were all using perl. Perl had libraries, and excelled at gluing existing bits of UNIX together. This meant you had to write far less of the application by hand. Yup, by hand. Let me dig into that a little bit.

    It's the pictures that got small

    Writing web software back then was a fairly different prospect. In my circles, we didn't really have much in the way of frameworks. There were a few enterpris-ey things floating around that converted your big IBM and Oracle and Microsoft client/server application into some kind of terrible intranet suite that required ActiveX support to load any pages, and I'd poked around with Zope with some interest, but by and large if you were doing anything interesting, you used FreeBSD, or linux (2.2, with SMP support!).

    You'd most likely use Apache 1.3, forking, and write your site as a combination of static pages, server side templating and CGI exec-d programs, in some kind of UNIX scripting language (usually perl, but any of the usual suspects were relatively common, including actual honest-to-god shell scripts), or maybe you'd write a performance critical CGI as binary in C.

    For data processing, you might connect your application directly to a pre-existing company RDBMS, if you had such a thing and your DBA, if you had such a thing, let you, or you might deploy a SQL db on or nearby to your web host - usually MySQL 3.22 with ISAM and a quasi-religious intolerance for foreign key support, but that was OK, you could do all the data validation in application code. (A bit like JavaScript databases in 2017.)

    We had libraries for common tasks, like parsing wire protocols and file formats, and wrapping utilities to do things like generate or resize graphics, but you'd stitch a selection of these together in an ad-hoc fashion to make a 'system'. A typical web stack would be table-based HTML with attribute styling and inlined images for typography and spacing, possibly pre-rendered, but maybe dynamically generated, then some CGI scripts for user management full of hand coded cookie and session tracking. A relational database for persistence, using hand coded SQL and a custom database schema. Page generation via a self-written templating system, gluing skeletons of layout-oriented HTML around variable interpolation with inline conditionals. This part would often run as server-side includes, but sometimes this would also have just been handled by CGI scripts.

    Maybe you'd have a hand built filesystem cache in front of this. 'Front-end' back then would often build static page representations, first in Photoshop or Illustrator, which would then be converted into single HTML page masters in Dreamweaver or FrontPage and then handed over to the back-end coders to clean up and crack apart into templated fragments, by hand. Single byte string encodings through-out, no threading, a light veneer of Object Orientation over internal data structures - you'd have a small cluster of actual physical servers, perhaps in a data center, but often on-premises, sometimes in racks, sometimes actual tower servers in the corner, directly connected to an internet router of some pitiful capacity. Sometimes your cluster was as small as one machine.

    Architecturally you'd have a webserver, perhaps two if you wanted to split 'heavy' dynamic serving from lighter or static content. Your database might end up on its own box with better IO and networking. If you had enough web servers you might put some kind of load balancer in front, perhaps a HTTP reverse proxy as an accelerator cache (often another Apache, sometimes Squid). In 2001 I'm not sure I fully understood what a CDN even was. You'd deploy with FTP or maybe rsync, sometimes the production filesystems were locally mounted via NFS or SMB and you'd just copy stuff over, or edit it in place. Version control, if you even had any might just be renaming files, perhaps SCCS or RCS. Advanced users might have CVS. Designers might have a pre-OS X Macintosh, suits would use Windows, developers had something more of a free-for-all - windows 2k, desktop linux, I used BeOS for several years whilst that was still a thing, and seemingly everybody, but everybody used emacs to write code - GNU emacs was common, but the cool kids were using XEmacs. Sometimes a remote XEmacs client on your deploy host attached to your local X11 server over the wire. Crazy days.

    My God, it's full of stars

    So that's the scene in 2001 when I joined the amazon.com family as an SDE, working on the new IMDb platform. I was a fairly hot perl programmer, having spent a good few years designing and rewriting custom web 'frameworks' and optimising mod_perl architectures. I was really good at SQL, at least I thought I was in comparison to most of my peers, and I had developed a particular fondness for the then slightly uncommon PostgreSQL database engine. I'd done quite a few web things - early corporate intranet portals, hobby sites, moderately popular dot-com publishing houses, but this was a step change into an entirely bigger league.

    In reality, especially as I look back with hindsight, I can see I had very little idea what I was doing, but hardly anyone did. There wasn't a lot of published material on architecture - everyone read Greenspun, but there was nothing like the modern tech web, scalability porn, conference circuit. No HN, no Reddit, no twitter, no Facebook, and looking things up on StackOverflow was still almost a decade away. It wasn't even that easy to find what scant information there was, you have to remember that Google was barely yet a thing. Information sharing tended to happen on mailing lists, using actual email, or maybe still on USENET. (Paul Graham hadn't yet written 'A plan for spam', and we didn't really have functional automated spam filtering).

    IMDb had an unusual working setup for the day, as befitted its birth from a federation of USENET correspondents. Everyone worked completely remotely, scattered around the world. At the time I joined, there was an express preference for staff who could attend a weekly company meeting over lunch, near Bristol (in a cafeteria, attached to a swimming pool), and the majority of the tech team building the software was now based around this area. Home Internet connectivity was still largely 56kbps or lower dial-up, possibly metered, although I was lucky enough to be in a part of Bristol eligible for an insanely fast 1Mbps cable connection.

    Anticipating having to work on significant amounts of DP, potentially offline, I asked if I could be provided with a small server with SMP and RAID capacity, and was rather surprised by a small tower HP Proliant rig turning up at my house, cocooned onto a loading pallet too big to fit through the front door. I had to unglue it piece by piece and carry it up to my 'home office', a box bedroom full of IKEA tables, slightly too tall to be comfortable desks, and assemble it in place. I christened it mavis.imdb.com, and installed Debian stable on it, which involved most of a day figuring out the hardware RAID drivers, and from that point on its shrieking fans and disks were a constant part of my daily life for the next half-decade. Eventually a house move allowed me to get it into a makeshift server cupboard where I could deaden this persistent din behind a door and blankets and curtains. I occasionally wonder now, in my middle-age, if I have a frequency gap in my hearing to match that particular pitch, but if so, it's not affected me enough to care to get it measured. As the noise tended to interfere with music, for the first few years I developed a habit of listening to BBC Radio 4 morning to midnight, and therefore, when there wasn't a test match to listen to, for a brief period of my life I developed an unusual degree of expertise in the comings and goings of 'The Archers'.

    One consequence of the remote working, and patchy connectivity was that the development work in the tech team was informally silo-ed up into sub-systems that individual engineers had ownership over. The very first task I worked on, after getting a working build of the entire stack onto mavis, was porting the statistics page across to the new web stack (internally known as 'mayhem', after project mayhem, everyone was big on movie references, naturally) by way of familiarising myself with the application and infrastructure. I made a perfunctory stab at that, and then I was searching around for something more substantial to own. The forums, or 'message boards' seemed to be a natural candidate.

    The most recent piece of work I'd done at my previous gig had been to contribute a threaded discussion system to our general purpose content management system, which allowed a tree of conversations to be attached to any content id in the catalogue, so the site users could have a threaded comments section attached to any content. This had worked out pretty well. By contrast, IMDb had a pretty threadbare generic forum system, a standalone phpbb installation, almost entirely isolated from the rest of the system, organised into a few dozen general purpose boards, with I think even a separate login system.

    A business goal for the next year was to drive up user registrations, and the forums system seemed like a good feature to assist with this. It offered additional site value that was only available to registered users. Another target was to integrate the boards system more directly into the movie database, allowing people to have conversations directly attached to the pages for movies and shows. Another important requirement was to allow for a system that would let the data contributors directly communicate with the data management team. So I was tasked to do something with the forums to meet these broad goals, and the implementation and design of it was largely up to me, informed by regular feedback from the wider team onto weekly progress reports and via the team lunch meeting.

    We're going to need a bigger boat

    I considered a number of approaches.

    • I could have extended the PHP forum system as was, to support the new features, but I didn't really consider that for more than a couple of minutes - it was PHP, which I didn't know terribly well, and disliked, and would be harder to tightly integrate with the rest of the mayhem app, which was a domain optimised mod_perl web service.
    • I wondered about wrapping a USENET service, which had a lot of appeal, in as much as a lot of the base mechanics of hierarchy would be already covered, and it was a highly scalable architecture built on a portable standard with several existing back-end implementations. I really liked this idea a lot, but I rejected it eventually when I realised that it would be difficult to build an integrated web front end that offered as much functionality as a stand-alone newsreader. If I had been able to find a decent open-source web NNTP client I might very well have done this.
    • Another alternative would have been to find an alternative forum system that was more amenable to customisation. I considered using the slash system that powered slashdot.org, but I rejected that because at that time it had a reputation for poor performance and uptime, and was struggling to cope with trolls. I really should have paid more attention to these ideas, both of which would come back to haunt me.
    • Eventually, using a mixture of naivety, hubris, ego, enthusiasm and pragmatism, I decided I'd build something custom, scaling up over the ideas I'd used for the comments module in my previous job.

    The basis for that system was something I was quite proud of, and in some senses it was quite a clever hack. We had wanted threaded discussions, but it's famously tricky to model trees in SQL. My first attempt, with hydrating flat lists into trees at runtime from a SQL result set was computationally a little bit expensive for the hardware of the time, and slowed up page rendering in the articles with comments.
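    For context, that first attempt can be sketched roughly as follows. The post doesn't say how the flat rows were keyed, so this assumes the usual parent_id adjacency list; build_tree and the sample rows are hypothetical names for illustration, not the original code.

```python
from collections import defaultdict


def build_tree(rows):
    """Turn flat (comment_id, parent_id, body) rows into nested dicts.

    parent_id is None for a top-level comment (an assumption of this sketch).
    """
    children = defaultdict(list)
    for comment_id, parent_id, body in rows:
        children[parent_id].append((comment_id, body))

    def attach(parent_id):
        # Recursively collect the replies hanging off parent_id.
        return [
            {"id": cid, "body": body, "replies": attach(cid)}
            for cid, body in children[parent_id]
        ]

    return attach(None)


# Rows as they might come back from a single SQL result set.
rows = [
    (1, None, "first"),
    (2, 1, "reply to first"),
    (3, None, "second"),
    (4, 2, "reply to reply"),
]
tree = build_tree(rows)
```

    Every page view pays for this rebuild, which is the read-side cost the next scheme was trying to avoid.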

    So I came up with an ingenious scheme. I'd store several sort fields against the comment records - one representing the vertical position in the thread, and one representing the indentation level, and every time a reply was inserted into a comment thread, I'd compute the correct indent level by adding one to the parent reply, set the vertical position to one larger than the parent, and then update every larger sort sequence to increment it by one; so that they were sequentially stored in thread order when read by that index. As I was storing the timestamp, and a sequential post id, I thereby had a system that could trivially read back conversations by order of time, order of posting or order of reply. This meant that posting was relatively computationally expensive, but only on the database server, whereas reading was simple and fast. I reasoned that reads were many times more frequent than writes, and biasing the system this way would optimise it for the common case, and avoid the need to build a cache invalidation system.
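    A minimal sketch of that insert rule, using Python's built-in sqlite3 as a stand-in for PostgreSQL. The table layout, column names and the post and thread_order helpers are all assumptions made for illustration; only the rule itself (indent is the parent's plus one, position is the parent's plus one, everything at or below that position shifts down) comes from the description above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE comments (
        id       INTEGER PRIMARY KEY,  -- sequential post id
        content  INTEGER NOT NULL,     -- the article the thread hangs off
        position INTEGER NOT NULL,     -- vertical position in the thread
        indent   INTEGER NOT NULL,     -- nesting depth
        body     TEXT NOT NULL
    )
""")


def post(content, body, parent_id=None):
    """Insert a comment, keeping `position` in thread order."""
    with db:  # one transaction per post
        if parent_id is None:
            # Top-level comment: append to the end of the thread.
            (last,) = db.execute(
                "SELECT COALESCE(MAX(position), 0) FROM comments WHERE content = ?",
                (content,),
            ).fetchone()
            position, indent = last + 1, 0
        else:
            parent_pos, parent_indent = db.execute(
                "SELECT position, indent FROM comments WHERE id = ?",
                (parent_id,),
            ).fetchone()
            position, indent = parent_pos + 1, parent_indent + 1
            # The expensive part: shift every later row down by one.
            db.execute(
                "UPDATE comments SET position = position + 1 "
                "WHERE content = ? AND position >= ?",
                (content, position),
            )
        cur = db.execute(
            "INSERT INTO comments (content, position, indent, body) "
            "VALUES (?, ?, ?, ?)",
            (content, position, indent, body),
        )
        return cur.lastrowid


def thread_order(content):
    """Reading is a single indexed sort, no tree building needed."""
    return db.execute(
        "SELECT indent, body FROM comments WHERE content = ? ORDER BY position",
        (content,),
    ).fetchall()


a = post(7, "opening post")
b = post(7, "first reply", a)
c = post(7, "second top-level post")
d = post(7, "reply to reply", b)
e = post(7, "later reply to opening", a)
```

    One side effect of setting the position to one larger than the parent is that the newest reply sorts directly under its parent, above older siblings. Ordering by id instead gives the order of posting from the same rows.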

    This system actually had worked out pretty well in practice, at least for Accounting Web comments sections. Although it's conceptually neat, it's also actually pretty fucking dumb for a couple of reasons.

    1. updating records has a high overhead in PostgreSQL because of the mechanics of its concurrency implementation: under MVCC every update writes a whole new version of the row
    2. this system means that adding a comment becomes linearly more expensive as its thread grows in size, so the total work to build up a long thread grows quadratically. The more popular a system gets, the more work each individual comment needs
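    To put rough numbers on the second point, here is a back-of-envelope sketch. It assumes the worst case for this scheme, where every reply targets the opening post and so shifts every earlier reply; the function names and thread sizes are made up for illustration and are not the author's original calculations.

```python
def rows_shifted(thread_size, parent_position):
    """Rows the UPDATE touches when replying to the post at parent_position.

    Positions are 1-based; everything below the parent moves down by one.
    """
    return thread_size - parent_position


def total_shifts_worst_case(n_posts):
    """Total rows rewritten while a thread grows to n_posts.

    Worst case: each of the n_posts - 1 replies targets the opening post.
    """
    return sum(rows_shifted(size, 1) for size in range(1, n_posts))


small = total_shifts_worst_case(30)    # 406 row updates
large = total_shifts_worst_case(3000)  # 4495501 row updates
```

    A thread a hundred times longer costs roughly eleven thousand times as many row updates in this worst case, which is why sizing the design from the old forum's thread lengths was so risky.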


    I wasn't entirely stupid, I had calculated this downside, and I'd done some scaling calculations on paper to see what the cost of implementing this for the IMDb would be, and here I made my first actually stupid mistake: I used the metrics of the existing forum system to try and predict the capacity of the new one. I can't remember the exact numbers now, and I've long misplaced the notebooks, but it was something lower than a thousand posts a day, and the average thread length was a few dozen posts. Amazon could afford a useful database server, and it seemed like I easily had a couple of orders of magnitude of headroom. Telling myself that premature optimisation was the root of all evil, and conveniently ignoring the fact that this design was literally entirely born of an optimisation hack, I decided to proceed with this scheme.

    Show me all the blueprints

    I gave the design a lot of thought. I had been a USENET user back in the glory days before spam and binaries had rendered it toxically uninhabitable. I adored slashdot. I'd used a lot of shitty web forums since then, and I had designed a flexible engine that could handle any kind of post based discussion grouping. I thought this was a great opportunity to design a discussion system that I'd want to use myself. Scratch your own itch. I think I already mentioned, I didn't really have much idea what I was getting myself into. Ah, youth.

    I thought that most of the grief and spam I'd seen in other systems, was primarily because of the cheapness and disposability of user identity. I figured we could tie that down by disallowing anonymous posts, which was aligned with the goal of increasing user registrations already - maybe ultimately we could link them into amazon.com accounts, and therefore real identities. I wanted to give the users the ability to personalise and curate their site home page, so they'd have an investment in a community they valued, and would be publicly accountable to.

    Another thing I'd noted about other forums was how quickly they stagnated into a dominant clique, and deterred new joiners. I decided this was in part because of the permanent record; the conversations got stale because everything had already been said, and the groups then tended to be dominated by handfuls of high-status members with visible post-history. Groupthink dominates, outsiders are shunned, filter-bubbles prevail. I thought that an interesting solution to this would be to actively expire user posts. IMDb already had a system of user reviews for more static user content attached to database entries. The boards were for conversation - so we'd just periodically remove older content, and make no secret about it. This should stop the entropy lock-down, and also give us a mechanism to keep a lid on the database / thread size to help with performance. Everything should stay fresh and sparkling and self-rejuvenate.

    I know lots of this was naive thinking and with 2017 hindsight, it's easy to see the flaws. In 2001 though, there was much less experience of online community management. We thought we knew about trolling, because we'd experienced previous communities, but I don't think anyone yet had a handle on the scale and the scope of it in a significantly mass-medium consumer Internet.

    I really wanted nested threading, which is a very good, perhaps too good, way to promote reply-oriented posting and reading. For that same reason, I didn't want threading to be the implied default mode, because I thought it promoted point-by-point refutation, which led to arguments and flame-wars. So I envisioned a system that could seamlessly move between a flat or a nested view, with a cookie to fix it to your individual preference.

    Each post would have two actions - a new top level post in the thread, or a reply to the particular post, and the different view options would allow you to see how the thread timeline fitted together from each point of view. I felt this would encourage replies, without mandating them as the only form of discourse. This meant that the organisational system was a topic (either a generic one, or a database object), consisting of threads, each of which was defined by the opening post made by any user at the topic level. This then collected numerous replies, which themselves could have sub-threads of reply.
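    The flat and nested views fall out of the stored sort fields described earlier, which is roughly why switching between them could be a per-user preference. A small sketch, assuming each post carries a sequential id, a thread position and an indent level; the Post type, the render function and the sample data are hypothetical.

```python
from typing import NamedTuple


class Post(NamedTuple):
    post_id: int   # sequential id, i.e. order of posting
    position: int  # vertical position in thread order
    indent: int    # nesting depth
    body: str


def render(posts, nested):
    """Return display lines for one thread in the chosen view mode."""
    if nested:
        # Nested view: thread order, indented by reply depth.
        ordered = sorted(posts, key=lambda p: p.position)
        return ["  " * p.indent + p.body for p in ordered]
    # Flat view: plain order of posting, no indentation.
    return [p.body for p in sorted(posts, key=lambda p: p.post_id)]


thread = [
    Post(1, 1, 0, "opening post"),
    Post(2, 3, 1, "first reply"),
    Post(3, 4, 0, "second top-level post"),
    Post(4, 2, 1, "later reply to opening"),
]
flat_view = render(thread, nested=False)
nested_view = render(thread, nested=True)
```

    In a real handler the nested flag would come from the preference cookie; here it is just a parameter.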

    Mindful of the fact that this was still an era of expensive and slow dial-up and low end computers, I wanted the ability to view in narrow or expanded views. I didn't want to force people to download gigantic pages of browser and modem-choking deeply-nested table layouts, so we would flip between outline and expanded views as well as flat or nested. I wanted people to have a static, but customisable home page that they could add content, style and flair to; hoping to give them a sense of curation and ownership and identity, that should help act as a brake on too much antisocial or negative behaviour. I'm not sure I was even smart enough to wonder if people would use their home page to host offensive content. (Of course, some did).

    So I started to build it. Initially it went really well. On the data model and storage engine side of things, I was on a pretty solid footing, it was familiar ground. I carried on using PostgreSQL, and we specified a decent (for the times) server to host it on. No H/A or replication at all. I'm shocked at that idea now, but at the time I had reasoned that we were building an ancillary, purposefully ephemeral side-car discussion system with a different storage layer to the main site, and we'd be fine with regular hot backups - in the case of disaster we could shut them down without affecting the main site, and restore from backup. In the case of total and utter catastrophe, we could just reset them to zero and start again, they weren't designed for permanence anyway. Feedback about the design and features from the rest of the team was positive, with plenty of enhancements and suggested tweaks, and the system started to take shape.

    The UX layer was way harder than I'd anticipated, and because of this, I started to get a bit bogged down in the 'second 90%' of the first deliverable. The mayhem engine that the team had built (a really clever piece of software design, that I don't really have time to do justice to here) had never yet really had to cater to highly dynamic pages - its core purpose was to serve flexible views of an almost read-only statically compiled dataset of movies and people. It was originally built around doing that in a particularly optimised way.

    I had to build up my own HTTP POST and form handling layers that would integrate with the existing HTTP handlers, from a somewhat lower starting point than I was used to doing, and this soaked up quite a bit of testing and debugging time. Even worse was the display code. We didn't really have much facility for dynamic page layout in the templating system - which was both highly customised, and complex; the site page templates were used to drive the static build system, via a custom compiler - the markup in the template specified what data views would be generated by the build, which directed the data builders that compiled the binary movie database - the pages were effectively just compiled to a stub handler for a specific route which would seek to the object index in a particular data index, and then basically sprintf the data out port 80 as a hydrated web page. This was a fast way to serve varying pages with identical structure, but not immediately well suited to highly adaptive constantly updated live pages or submission forms. Still, I wanted the boards system integrated into the existing stack as well as I could manage, and so I laboured to build the missing features into my system in a way that could integrate well, which involved at least one complete abandoning and rewriting of the internal API.

    The actual boards display templates themselves were a significant time soak. We had a great designer, who took my ugly box tables prototype output, and turned out nice looking blueprint designs for all the various view modes and forms as static web pages. This was of course the era of the browser wars, and we were expected to support a bewildering array of user agents from the Netscape 3.x era onward, inclusive of weird-ass things like AOL clients and MSN web-tv set top boxes and goodness knows what else...

    Busting these intricate table-based views apart and back together again into a cryptic markup and logic language, adding the various (session global) mode flags such that all the different view combinations rendered as functional pages that degraded gracefully took me weeks. I was slipping past shipping dates and entering a terrible crunch death-march to just try and get something out of the door. Unhelpfully, this was all happening at a time when I was having a few strains in my family life, and also struggling a little bit to balance this into a sensible routine of working from home. I was ping-ponging between getting distracted away from 9-5 and then overcompensating by working across nights and weekends. Eventually we had to pull out features to ship.

    I drastically cut back the home page customisation, abandoned all the planned but unstarted work for a search index, and only had time to add the most rudimentary admin features. I had wanted to migrate the existing posts across to the new system, but I'd not even begun to start on that, and that also hit the cutting room floor. With a lot of assistance from the rest of the tech team to get it over the line, we hit publish on the initial TNG boards system some time in the summer of 2002, later than planned by some months. This pattern of the message boards being more work than expected for all parties that touched them would be the prevailing tone for the next several years.

    A test designed to provoke an emotional response

    User feedback was immediately negative, and highly vocal. Lobbying started instantly for the reinstating of the previous system. People complained about the new designs, the complexity of the new display options, the inevitable launch bugs. I was silly enough to join in the conversation to help explain the launch and solicit feedback, and from that point on I had an onslaught of direct contact messages and emails, occasionally positive and friendly, but more often than not weird and offensive, sometimes abusive. You do try to tell yourself that you can just ignore the trolls, but in truth it is quite difficult to remain completely unaffected by emails that compare you to a child rapist and call for your death in offensive terms, even if it was only provoked by you breaking a font size in a particular version of Internet Explorer 3. You never quite get used to that, I find. I was pretty crestfallen with all the negativity after all that work, although the team were positive and assured me that some of the board users could be like that, and that in general people are more vocal when they're complaining, and are naturally somewhat resistant to change. I still felt pretty down.

    My mood did soon change after a few weeks. The new boards were kind of a hit. Maybe a smash hit. They quickly overshot my scribbled calculations of scale in a slightly worrying manner. With some judicious database tuning, the performance stayed OK though. For now. Then we added links from every title page (IMDb pages were sub-grouped into title pages, for tv shows and movies, identified by a key called a tconst which looked like tt1234567, and name pages, for people, robots, animals etc. from cast and crew, which were identified by a key called an nconst which looked like nm1234567; top level boards un-linked to other database objects therefore got a new key type called a bdconst, somewhat inconsistently; these looked like bd1234567 and didn't matter very much because there were only ever a few dozen standalone boards) and the numbers started to properly hockey stick.
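    The three key shapes are regular enough to classify with one pattern. This is only a sketch built from the examples in the text: the classify function and KEY_KINDS table are hypothetical, and the pattern accepts any run of digits because the post only shows seven-digit examples without stating a fixed length.

```python
import re

# Prefixes taken from the examples above: tconst, nconst and bdconst.
KEY_KINDS = {"tt": "title", "nm": "name", "bd": "board"}
KEY_PATTERN = re.compile(r"^(tt|nm|bd)(\d+)$")


def classify(key):
    """Split a key like 'tt1234567' into (kind, number)."""
    match = KEY_PATTERN.match(key)
    if match is None:
        raise ValueError(f"not a recognised key: {key!r}")
    prefix, digits = match.groups()
    return KEY_KINDS[prefix], int(digits)


title_key = classify("tt1234567")
name_key = classify("nm1234567")
board_key = classify("bd1234567")
```

    A boards thread could then be attached to whichever kind of object the key names, which is all the title-page links needed.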

    At the time we used to compute the page views in a weekly report which broke out the top N subsections according to first level directory. We never shared traffic numbers publicly, and so even after all this time I will be respectfully coy, but the highest chart topping positions were obviously things like /title, /name, /search, /news, /chart etc. At launch, the boards were lurking down the bottom, nowhere to be seen, but after we started the title conversations they were solidly into the top five, where they remained with ever-accumulating numbers, and user registrations clocking up correspondingly.

    From that point on, I spent a significant amount of my waking life 'doing the boards' for the next several years. Initially I was scrambling to put in the missing features we'd pulled before launch - post editing, markup for posts and then profiles in a hand-rolled version of BBCode; again with a stupid insistence on display time optimisation, I converted this to HTML at write time, which meant that when we added post editing, I had to backward parse the HTML back into BBCode to be re-edited, all with a misconceived series of chained regular expressions. This led to an endless sea of parse bugs that pretty much guaranteed that the markup and emoji set (although they weren't called that yet, we called them 'smileys'), once fixed, would be effectively sealed forever, even though I'd taken the trouble to add an admin edit tool that allowed for updates to markup to be made by non-developers through the CMS API.
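
    The trap described above is easier to see in code. What follows is a minimal sketch of the alternative, not the code the boards ran: keep the BBCode exactly as the user typed it and render HTML from it at display time, so that editing never has to parse HTML backwards. The tag set, the `render_bbcode` function and the `Post` class are all invented for illustration.

```python
import html
import re
from dataclasses import dataclass

# Hypothetical tag set for the sketch: BBCode tag -> HTML element.
SIMPLE_TAGS = {"b": "strong", "i": "em", "u": "u"}


def render_bbcode(source):
    """Render a tiny BBCode subset to HTML."""
    # Escape first so user-supplied HTML is never passed through raw.
    out = html.escape(source)
    for bb, element in SIMPLE_TAGS.items():
        pattern = re.compile(r"\[%s\](.*?)\[/%s\]" % (bb, bb), re.DOTALL)
        out = pattern.sub(r"<%s>\1</%s>" % (element, element), out)
    return out


@dataclass
class Post:
    source: str  # the BBCode as typed; the only thing that is stored

    def html(self):
        # HTML is derived from the source on demand.
        return render_bbcode(self.source)

    def edit(self, new_source):
        # Editing replaces the stored source; no HTML-to-BBCode step exists.
        self.source = new_source
```

    The price is a render on every read, which is exactly what converting at write time was meant to avoid. Keeping a cached copy of the rendered HTML next to the source would be one way to have both.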

    I'd thrown together a naïve search API, entirely based on un-indexed SQL substringing, which I'd fully intended to replace after launch. It never worked, and the system filled up so quickly that it killed the page cache entirely by constantly table scanning the texts, so much so that I spiked it in the first week, and never got a chance to work on its planned replacement. I was still getting emails complaining about that five years later after I'd left.

    With the surging popularity came increasing amounts of negative user behaviour, and I had to increasingly devote development time to adding abuse processing tools for our small moderation team, onto what had only ever been an afterthought of an admin system. We never proceeded to link up the user accounts to Amazon accounts, and I'd never planned to add user-driven moderation. My quixotic hopes for user killfiles (renamed to 'ignore lists', which is a far better and kinder name) and global killfiles (known as the 'Phantom Zone', because I love Superman) with account history purging and deletion weren't enough on their own, and the tooling for processing abuse reports was too clunky and slow, largely because I hadn't planned enough for it from the outset.

    I was now fighting a constant war on two fronts. With the popularity of the system way beyond my original estimate of a few thousand posts a day, we quickly escalated to a point where the really popular off-topic boards were ersatz real-time chatrooms, accepting hundreds of posts a second at peak times. All of this in a cursor-pooled synchronously blocking database directly attached to the HTTP display servers. I spent a great deal of my work time just constantly rewriting sections of it all to squeeze efficiency out of this setup. First with indexes and schema changes, then with hardware upgrades and tuned and profiled system software, then with a complete rewrite of all of the database logic to use stored procedures, and finally a long overdue table sharding so we could cluster boards between different tables and tablespaces to balance the IO and garbage loads. At the same time on the other front we were trying to come up with ways to lower the proportionally increasing cost of trolling and abuse.
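
    For anyone who hasn't met it, the table sharding mentioned above boils down to a stable mapping from a board to the table its posts live in. This is only a sketch of the idea: the hash-based assignment, the shard count and the `posts_NN` table names are my assumptions for illustration, not a record of how the boards were actually split.

```python
import zlib

SHARD_COUNT = 16  # hypothetical; the real number of shards isn't stated


def shard_table(board_key, shard_count=SHARD_COUNT):
    """Map a board key such as 'tt1234567' to a posts table name."""
    # crc32 gives the same answer in every process, unlike Python's
    # built-in hash() for strings, so a board always lands in one table.
    bucket = zlib.crc32(board_key.encode("utf-8")) % shard_count
    return "posts_%02d" % bucket
```

    A hash spreads boards evenly but knows nothing about which ones are busy. An explicit lookup table from board to shard would allow the hot boards to be placed by hand, at the cost of another thing to maintain.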

    My partner was temporarily stationed away in London by this point, so I was home alone, aside from the dog. Workdays at this point quite often consisted of walking 12 paces from the bedroom, still brushing my teeth at about 09:30, getting a support email, starting to poke at something interesting with the boards, and then not giving up until the small hours of the next morning. I was fairly obsessed with all of it, and my health was suffering, although I was too close to all of it to properly see this at the time. I developed a weird collection of neurological symptoms which stubbornly refused diagnosis, and subsequently appear to have been entirely stress-induced.

    We still were choking out at peak load times, and it was starting to have a knock-on effect on the rest of the site. Eventually, a super-talented colleague helped me out by implementing a workable version of my poorly articulated designs for a caching database proxy; implemented seemingly overnight by him in C, it spoke the PostgreSQL wire protocol and cached result sets in a filesystem that we mounted on ramdisk. Kind of a home-brewed combination of memcached and pgbouncer. The simplicity and effectiveness of this just took my breath away, just as much as the lesson that if a software thing doesn't exist, you can just make it yourself. Everything is just ones and zeroes, as I am very fond of saying to this day.
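
    The core trick of that proxy fits in a few lines, even if the real thing did not. This is a loose sketch of result-set caching only: it keeps entries in a dictionary rather than on a ramdisk, speaks no wire protocol, and the `QueryCache` class, its `run_query` callable and the five second lifetime are all made up for the example.

```python
import hashlib
import json
import time


class QueryCache:
    """Cache result sets, keyed by query text plus parameters."""

    def __init__(self, run_query, ttl_seconds=5.0, clock=time.monotonic):
        self._run_query = run_query  # callable(sql, params) -> rows
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}           # key -> (expires_at, rows)

    @staticmethod
    def _key(sql, params):
        raw = json.dumps([sql, list(params)]).encode("utf-8")
        return hashlib.sha256(raw).hexdigest()

    def fetch(self, sql, params=()):
        key = self._key(sql, params)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]            # fresh entry: the database is not touched
        rows = self._run_query(sql, params)
        self._entries[key] = (now + self._ttl, rows)
        return rows
```

    Even a lifetime of a few seconds goes a long way when thousands of readers are asking for the same busy thread, because identical queries collapse into one trip to the database per interval.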

    With this addition we got to a place where the system was in enough of a steady state. We implemented more banning and reporting, added a reputation score based system that slowed the rate of posting for users with lower reputation scores, which also helped reduce the saturation write loads at peak. Eventually we added an automatic moderation robot with a learning capacity and pluggable rulesets. I called him Spike. He worked fairly well, if a little bluntly at some times.
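
    The reputation throttle is simple to sketch. The shape below is a guess at the general idea rather than the rules we shipped: I'm assuming a score between 0.0 and 1.0, and the function names and the five second and five minute limits are invented.

```python
def min_post_interval(reputation, base_seconds=5.0, max_seconds=300.0):
    """Seconds a user must wait between posts; lower reputation waits longer.

    Assumes reputation is a score from 0.0 (worst) to 1.0 (best).
    """
    reputation = min(1.0, max(0.0, reputation))  # clamp out-of-range scores
    return base_seconds + (1.0 - reputation) * (max_seconds - base_seconds)


def may_post(reputation, last_post_at, now):
    """True if enough time has passed since the user's previous post."""
    if last_post_at is None:
        return True  # first post: nothing to wait for
    return now - last_post_at >= min_post_interval(reputation)
```

    The pleasant side effect, as noted above, is that the same check that slows down troublemakers also takes the edge off the write load at peak.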

    I hope I'm not giving out the impression here that it was all entirely negative. It was definitely a rollercoaster few years. Exhilarating, and also very entertaining. The boards were a living thing that had sprung out of nowhere, literally something I'd created in my spare bedroom. It sort of felt like a Pacific-Ocean sized colony of sea-monkeys eternally fizzing away with unexpected activity right there in my spare room.

    Although they were often frustrating, the users were also inspiring, and creative, and surprising, and occasionally pretty funny, even some of the (gentler) trolls. On top of an understandable level of frustration and annoyance, I generally found I felt a sense of sympathy for them, and their complaints and frustrations with the system. All of this was before the age of 'social media', and I could almost feel the shape of it hanging there, slightly beyond where we were heading, off-piste and in a direction we probably shouldn't venture into.

    A consistent surprise was the amount of effort people put into curating their limited patch of profile space, and how social and to us off-topic, it all was. We were constantly running into people trying to use the boards for personal social spaces - I argued for providing individual personal boards for every user at one point, but the management team explained that we weren't really in the core business of general social networking. It confused me at the time, and I had to think about it for a while, but I think that was correct thinking, and there's a lot of wisdom there. You simply can't do all possible things well. With a small team, and a big world, you benefit most from focusing entirely on the things you're best at and the things you want to be better at.

    A few of the sillier trolls stand out. There was one early griefer, who we very easily IP traced to a school library, I think based in Canada. We waited until he was in mid-session one afternoon, and then if I recall correctly, management called his head teacher, who was then able to apprehend him in the act. There was another, very silly catfish troller called tabitha_cyeg, with an obviously manufactured identity. Their M.O. was posting bizarre conspiracy theories about the site technology, and myself, which they claimed to have hacked into using l33t-sounding but completely irrelevant NetBIOS vulnerabilities, replete with faked server logs, and on one occasion 'hacked' emails from myself revealing my true name to be something along the lines of 'Claude M. Savoire'. Quite a few users were seemingly entirely convinced, but to me it was Pythonesque.

    Getting contacted by the Feds to deal with users who'd been posting death threats about President Bush was weird, at least it was the first time, and I got a few PMs and emails from actual industry figures, which was always quite exciting. I personally banned a moderately famous Hollywood producer this one time, for abusive posting, which is something of a curiosity. I remember going to watch Jay and Silent Bob Strike Back at the cinema around this time frame, and getting a particular kick from the sub-plot where they individually visit all the internet forum posters who have been rude about their previous films.

    I watched people fight and friend. Saw a few romances and a marriage or two emerge from the regulars. I read, and occasionally got involved, against my better judgement, in fascinating and productive conversations. I still bump into people IN REAL LIFE who reminisce about the boards and are to this day impressed with me when I tell them I had a big hand in their genesis. I once spent an evening in a darkened restaurant patio overwhelmed to tears as a kind man explained to me his young daughter, hospital-bound and dying of cancer, had used the Harry Potter IMDb boards as her main social life in her last year, and how much that had meant to him and her. Stories like that are just a profound privilege to have had even the most tangential involvement in.

    And I learned so much. Working with such a smart team, on such a great and special piece of the internet. Learning about every aspect of scaling a web stack from the disk blocks up to the network and back down again. This era was still 32-bit Intel hardware, and I learned a huge amount about that, and UNIX profiling, and the Linux virtual memory system and file system, and HTTP caching. I made so many mistakes, because there just wasn't any other way for me to learn, and I did figure out how to fix or improve on many of them.

    I learned about PostgreSQL internals from the wire protocols all the way down to the storage models in some detail, and to this day I'm a pretty great PostgreSQL DBA, when I need to be. I learned a lot about UX influence and steering behaviour, albeit by mostly getting it wrong. I learned about building search engines, and service orientated architecture, and why you really shouldn't hang responsive systems off of blocking I/O, and maybe message queues are useful. I learned how to measure system performance all the way down to the CPU cache level. I learned how to keep focused on problems I didn't yet know how to solve, or perhaps didn't yet understand. I learned lots and lots of things about movies and cinema history, much of it just by osmosis poring over the data sources. I learned how to better manage my own time and projects, and I learned what it feels like to burn out, and what you should do about it when you know that you are. Since I left Amazon.com, I've had a great and varied career, and I think at least 75% of the useful things I know how to do well I learned first-hand on that gig, and I've always treasured, and respected that.

    Always. Be. Closing.

    And now they're shutting the boards down. I first heard about it via text message, oddly enough; but shortly after that it was all over my news feeds followed by a slow stream of emails, checking in. Friends, ex-colleagues, some of them from former boards users. I felt an odd sense of shock about it, in a way, and slightly emotional. Sixteen years is a ridiculously long time in Internet years. The web itself wasn't sixteen years old when I joined Amazon, and nor was the even older still IMDb. I don't use the boards myself any more, although I do occasionally look over them, perhaps once or twice a year. It's been clear for a while that they're not getting a fraction of the use that they once did, and that's fair. The web is a different thing in 2017 entirely, and that's also a good thing.

    Communications technology evolves, and hopefully improves all the time. People have all kinds of social networking now for communicating, and the bulk of this is happening on different, smaller screens than anything I could have envisioned when I was first sketching out some pencil ideas in a gridded notebook. An actual Filofax I believe. It was very humbling to see the amount of twitter traffic noting the IMDb announcement, as well as the number of actual proper news sites that wrote this up as something significant. The Verge report seemed to think the IMDb message boards were era-defining. That's something, I guess. All things must pass.

    There's just one more thing that's bothering me

    'Mjeyds'. In the IMDb board BBCode syntax, there's a particular smiley that you mark up using this bizarre word. People occasionally ask what the term means, and I've always enjoyed the mystery, being one of very few people in the world to have any claim to know the answer. I guess it's now or never for the reveal.

    The emoticon set was curated, uploaded and configured by my erstwhile designer colleague. He took responsibility for naming them. He wasn't English, hailing from Denmark, I believe via several other countries. When I pressed him for an explanation of 'mjeyds', he said it was supposed to be an onomatopoeic rendering of the way the late Graham Chapman said a languorous 'yes' whilst sucking on a pipe in a scene from Monty Python's The Meaning of Life. If it is, I guess it works better if you're using a Danish alphabet? If you've got all this way through this post only to find out the answer to that question, then I am sorry if it is an anticlimax, but thank you for reading. Maybe some things are better left mysterious. Another lesson learned.

    Crazy Credits

    This is a personal web page, and an entirely personal and subjective retelling of my own experience building and maintaining a small section of IMDb.com a long time ago. Whilst I'm happy to take personal responsibility for a large amount of the boards' creation and inspiration, I don't want people to get the impression that this was in any way a solo effort. All of the work outlined above was produced in the context of a small dedicated team, and although I've refrained from naming names and attributing ideas elsewhere, this is borne more from a desire not to miss anyone out - after this amount of time there's simply no way I can credit individuals for parts I can remember without failing to attribute others for equally important contributions I have forgotten. I've done my best to be honest about facts and timelines, and tried not to infer too much about third party motivations, but I know I've forgotten things and misremembered others. Working from memory, after this amount of time, such errors are only human. If you spot anything terribly wrong, or have any questions or corrections, please get in touch. I'd like to thank the entirety of the IMDb team 2001-2005 for working with me on all the aforementioned things, and more. Great team, great times.

    posted by cms on 2017-02-05
    tagged as
  3. And just like that we're back. What happened cms?

    It was never entirely my intention to go offline for such an extended hiatus. Even though the web is intrinsically brittle and ephemeral, I like to do my bit to keep my little backwater serving 200 OKs to the half-dozen people who stop by to check in regularly, and the couple of dozen who linked to something I put up at some point. It's basic web-citizenship as far as I'm concerned.

    Before we went fully dark, I'd not posted for a long time already. And before that I'd slowed my posting down to something of a crawl. I think there's a few reasons for that. It's easy to get bored with blogging for the sake of blogging, especially in our current age where everyone shares profligately across many social platforms. It's fairly common to see blogs that have fallen into a recursion of no posts for months, then a post apologising about that, and then further disuse. I don't think this is one of those, but the proof is in the posting I suppose.

    There's certainly been less time in real life for auxiliary pursuits like online rambling, and that's a big part of the reason. No time for any proper content posts, concomitant with a surge of alternative social platforms to play around with, meant it often seemed a bit redundant to post arrays of short-links, when I could just throw them up on twitter/adn/diaspora*/flickr/ello/imzy/whatever, with a bigger audience, and more interaction.

    I was also feeling a bit self-conscious about standing up in public. After leaving last.fm (fairly amicably, as these things go, fwiw, albeit with a slightly battered heart), which felt like a fairly visible shift sideways, I was quite deliberately courting more obscure, maybe more unexpected job roles, and I remember feeling like I really didn't want to bare my thoughts to the internet judgement machine whilst I wasn't even entirely sure what I was doing myself a good deal of the time. Also busy! Young family plus startups really left little time for anything much else.

    I also was really feeling the pain of WordPress. I never quite managed to find an authoring approach to use with it that didn't make writing anything seem like far harder work than it ought to be. Also, because I always insist on self-hosting, there was the sheer weight of it for maintenance and security updates, and backups, and DBA-ing, and having to write PHP or perhaps even plugins to do the customisations someone like myself inevitably finds themselves suckered into doing. So WordPress was a drag, which was feeding my reluctance to contribute much of substance. So I decided to pause on updating whilst, in true wannabe-hacker style, I whipped together some kind of alternative content publishing system.

    I'll just take a paragraph out to stress that I actually admire WordPress a great deal. It's a very sophisticated and flexible web platform, and a great choice for site management, in either managed or self-hosted configuration. It kept this site ticking along for years. It just isn't a particularly good fit for my requirements, which are extremely simple.

    I thought about using another off-the-shelf blogging system, which would have been the sensible route, but I figured that would just lead to a similar frustrated stalemate. So I started to sketch out an application that would allow me to quickly fling out tagged and dated content without much overhead of hosting or writing. And I carried on intermittently piecing this app together, often on trains, for a couple of years. As an exercise in procrastination, it worked out better than I expected, and I carried on posting short content to twitter and others, reasonably happy to continue to defer the responsibility.

    But then the site went dark. I was hosting it all on a linode instance. I've been a very enthusiastic linode user for perhaps ten or more years, I think they have an excellent product, offering well-provisioned VPS instances, inexpensively, with an easy to use management site. Generally I've been very happy with them to date.

    This changed somewhat last year, and my confidence deflated a little. There was an extended outage of service across linode in December 2015, apparently as a result of a targeted DDoS. This lasted for many days, and the communications about it from linode were muted and suspiciously vague. This isn't really what I expect from a first-tier ISP. I came away with the impression that there were some significant architectural problems with their infrastructure, probably from accrued technical debt, and potentially some exploitable vulnerabilities in their public facing application software. I decided it was time for a change.

    I did some research and rented a couple of new hosts. This time I've gone for low end, physical servers. This represented another procrastination opportunity, because when I originally set up the beatworm.co.uk linodes, almost ten years ago, I just hand configured everything by remote shell. Now I like to use the ansible configuration management system to set up hosts, and I took this opportunity to port my public infrastructure across to use repeatable playbooks. This turned into another major yak-shave, because there was slightly more to it than just a WordPress deployment; I was hosting mail, calendars, media streaming, IM, DNS, the works. After getting lost in this tarpit for a couple of months, I decided to move the application tier over to use the playbooks from the sovereign project, which covers much of the same ground, but is already written, and uses more modern components. Of course it wasn't entirely straightforward to integrate these plays over my existing base provisioning, and I ran into a couple of glitches and gotchas with some of the choices they'd made for configuration, but it only took a couple of weekends' worth of fiddling to get it all running in a fairly acceptable shape. I moved the DNS across, at which point the WordPress site was left behind, and everything went dark.

    I was surprised at how much this bothered me.

    I like an outlet for sharing things. I enjoy the idea of having a stable internet identity. I don't like the way the modern web has folded these ideas into a handful of consumer products run by just a couple of corporate gatekeepers. That's not the web I grew up with, and it's not the web I want to see either. A very loosely federated ecosystem of ad-hoc resources, all mixed together as hypermedia, aggregated and accessed via an assorted bag of user-agents. That's how it works best. I like to write, because I like the practice and discipline of working toward articulating my thoughts for a general reader.

    I like being able to curate an archive, and keep control over how that information persists and is presented. This is hard enough to do when you have primary jurisdiction over the medium and material (there is plenty of bitrot on view in my archive, particularly in the really old material, which has been migrated across multiple publishing platforms now), and basically impossible if you're relying on a third party service, which periodically re-invents itself to better serve its own objectives, which are only ever going to be tangentially aligned with your own, at best.

    I don't like the sense of obligation I get from formal social media platforms. There's a subliminal sense of pressure to perform, to update, to observe the conventions, to consider and measure the implied audience. I'm not a joiner by nature. I just end up gently resenting the throng. I like to feel like I have a voice, but I don't want, or even expect to reach, an automatically provided audience.

    So, I picked back up my now-neglected website platform experiment, and knocked it together enough to get an MVP out of the door. It serves HTML over HTTP. It has a relatively minimal set of style rules that should allow it to work gracefully across various screen dimensions. It has rudimentary support for RSS (not that many people use newsreaders any more). It's simple to run in a staging environment, and I can write posts in plain text in emacs, and edit and post them without much extra grief. It's only got about 22% of the functionality I had originally planned, but I feel the urge to ship it, use it, and hopefully I'll refine it in production.

    There's a couple of interesting quirks to this new hosting setup. It's an ARM-based micro-blade, hosted on a Scaleway C1. The blogging software is semi-static, in as much as it serves generated content from the filesystem. It's written in Common Lisp, and deployed in a different lisp to the one it's developed on. There's no frameworks (aside from using Zurb Foundation classes to base the CSS). There's no database. There's no comments, because I haven't yet decided on a productive way to support them.

    posted by cms on 2016-09-04
    tagged as
  4. I already mentioned in passing, St. Vincent, the band-shaped solo project brand thing of the super-engaging Annie Clark, was by far the best act I saw at Primavera Sound 2014. It was also the act I was most looking forward to seeing going in, it’s always nice when those line up.

    I guess I’m a super-fan. I first spotted Annie playing with Sufjan Stevens' touring band. I next encountered her playing solo support for the National, touring her first St. Vincent release, upon which occasion I bolted out of the auditorium by the third song, in order to make sure I got a copy of the CD she was plugging from the merch stall before she packed away. I saw another couple of shows in Bristol, with the full band, and bought all the records, including an interesting collaboration with David Byrne.

    Last weekend, while idly browsing the Glastonbury live blog, I noticed that they’d just updated their description of the current iPlayer feeds to include St. Vincent streaming on the iPlayer from the park stage. I’d been avoiding the Glastonbury video feeds due to a combination of not being in the mood, and the dullness of the tv schedules, but I wasn’t going to miss out on this, so I whacked it on the TV. True to form, it was a great set, live, risky, and peppered with amusing crowd-surfing and hat theft. Even with a bit of sound problem, and some streaming glitches I enjoyed myself, and was amused to see my enthusiastic tweeting duly included in the Guardian live feed on the next page refresh.

    “That was a really good set”, I thought to myself, afterwards, “but it wasn’t nearly as exciting as the Barcelona one. True, that lacked crowd invasions, and nobody lost a hat, but the lighting, and the sound, and the staging, and the lack of daylight, and the crowd being really into it… A pity there’s no TV-broadcast quality stream of that night archived away somewhere”.

    Yes, I do really talk to myself like that sometimes. Especially when I’m pretending to transcribe my inner voice for a blog.

    And then, I ran into this on YouTube.


    Full set, multiple cameras, properly mixed sound, pretty good video quality. I have not yet watched it enough times to see if I can see myself ( front of house, stage left, VIP pen ) in the crowd, but I expect I will. 

    posted by cms on 2014-07-07
    tagged as
  5. There’s been a little flurry of le Carré activity in the British press this week, following on from the release of MI5 archive files that indicate that an MI5 agent, known as Jack King, ran a network of UK Nazi collaborators during WWII. Highly fortunate timing for the British spooking establishment to garner some positive press, some might say. The last couple of months the news reports about them have mostly been about illegal mass surveillance techniques attempting to record and analyze all internet traffic at source, and creepy write ups of mass automated collation of private video chats. Some of them intended to be particularly private, no doubt.

    Journalists had a bit of fun trying to retrospectively finger the real Jack King. The Telegraph decided King was probably John Bingham, Lord Clanmorris, whose name is usually mentioned in passing in press stories about ‘le Carré’, itself a pen-name for David Cornwell, who often mentions that Bingham is one of the component inspirations behind his super-famous fictional master spycatcher, George Smiley. The Telegraph also span off an article about Bingham’s sense of disapproval of his protégé's literary exploits. Mr Cornwell, writing under his given name, sent in a marvellously succinct letter by way of reply.

    Bingham was of one generation, and I of another. Where Bingham believed that uncritical love of the Secret Services was synonymous with love of country, I came to believe that such love should be examined. And that, without such vigilance, our Secret Services could in certain circumstances become as much of a peril to our democracy as their supposed enemies. John Bingham may indeed have detested this notion. I equally detest the notion that our spies are uniformly immaculate, omniscient and beyond the vulgar criticism of those who not only pay for their existence, but on occasion are taken to war on the strength of concocted intelligence

    Navigating around the little flurry of reportage about this little back and forth, I found this engrossing older Q&A with le Carré, from the Paris Review, held at the time of the US publication of “The Tailor Of Panama”, back in the late 1990s. It is a marvellous read, concerning the mechanics, circumstances and techniques of his fictional writing, and touches into politics. This quote leapt off the page at me.

    My definition of a decent society is one that first of all takes care of its losers, and protects its weak.

    Quite. He’s quite a writer, that Mr. le Carré. If all you know of his work are the mostly excellent TV and motion picture adaptations of his more famous works, you might do yourself a favour, and read a few of the source novels. They work best tackled in publication order.

    posted by cms on 2014-03-09
    tagged as
  6. Top Skaters

    This is what my final day at last.fm looked like.

    In the morning, this.

    Last.fm 720° team

    In the evening, this.

    Yes, I'm working on getting a MAME cab smuggled into Moonfruit.

    posted by cms on 2013-07-06
    tagged as
  7. A-list iOS developer shop Tapbots today released a remix of their excellent twitter client (Tweetbot), focused on tiny pay-subscription social network platform app.net. I think Tweetbot is probably my favourite thing about my iPhone, and so I immediately purchased it. No obvious disappointments, all the slick performance I like is there, and it brings across some features I've been lacking in ADN for a while, like the ability to swiftly upload photos. I promptly celebrated by taking photos of every last.fm staff member with an ADN I could track down. I think this will probably increase my use of ADN moderately. Mobile is an essential component of gathering the off-the-cuff asynchronous status updates a service like this is built upon.

    I'm not sure that it will gigantically increase my engagement with ADN alpha. I was a bit suspicious of all the frothy cliques, with an intangible unease that I struggled to define, at least until I suddenly realised it was a cogent reminder of the very earliest days of bootstrapping the IMDb message boards. That left me feeling more comfortable with what the thing was, but no more inspired to engage. I'm still in love with the idea and the ideals of the place, and I'm reasonably confident it hasn't yet fallen into its proper, more useful place. I'm shallow enough to enjoy my sexy low user id on some level that even I don't properly understand.

    Has App Dot Net "arrived"? I think not yet. Netbot feels like a threshold event of some kind, in as much as serious developers are prepared to put enough effort into the ADN platform to produce fully realised software harnessed to it, and this degree of finish does not come cheap. ADN seems to be on a little draught of second wind recently; there's been a couple of fun toy apps, some positive press, and the recent price drop, bringing a wave of fresh users in. I'm still very positive about ADN as a concept, an indicator that there's now a long tail of internet folk interested enough in paying for stuff to make services like this potentially viable. I won't be really excited about ADN until I see the first compelling application built over it that is some mostly new and useful thing, rather than a new skin on an old one.

    posted by cms on 2012-10-03
    tagged as
  8. It's not exactly the done thing on today's web, but I'm a huge believer in paying for web services. I've never been comfortable with the ad-supported web. When pure advertising is the only revenue stream supporting a product or service I worry about the deleterious effect upon that product or service.

    I don't like the implication that they're really working for their sponsor's interests ahead of mine. I don't like the mental effort of hunting down all the opt-outs, of second-guessing potential consequences of the creepy data-mining and covert information sharing with networks of 'trusted partners'. More straightforwardly, for many cases, I suspect the numbers don't really balance; I find it difficult to rely heavily on something with a potentially precarious revenue stream. I don't want to push too much content into, or build infrastructure around things that won't necessarily be around in a year or two.

    Paying directly for things makes everything seem more explicit and straightforward. I'm the customer. I can make informed decisions about the cost and usefulness of the thing. It's in the better interests of the service provider not to abuse the relationship. A product unspoilt and unhindered by commercial marriages should stand a better chance of evolving towards its essential form. So I'm a relatively easy sell as a consumer. Offer me a useful service, at a reasonable price, and I'm quite likely to pay you for it.

    The flipside of this is that I'm really cautious about the reverse. Purely ad-supported sites, especially ones that seem to be offering far too much for free without being noticeably saturated with advertising make me feel slightly paranoid. I like to see which way the money flows.

    Here's a list of the sort of internety things I currently pay for, and will happily endorse. 

    • Spotify - I'm a long-time tenner a month customer. I think it's too expensive, but I somehow never quite unsubscribe.

    • Flickr - I have a pro account for photo hosting. 

    • DynDNS - I have a paid account, which gets me DNS zone hosting as well as a dynamic hostname

    • Pinboard.in - I like this bookmarking service. I was a very early adopter, and therefore my account cost a pittance due to the unique way pinboard is funded. 

    • Lastpass - I like this service so much I subscribed, just to do my bit to ensure they stay in business

    • Linode - my internet hosts are linux virtual machines hosted with this service. Linode is excellent. 

    • Word Podcast - I subscribed to the (now sadly folded) Word Magazine, primarily to access their very enjoyable podcast.

    • Metafilter - I don't use this site very much any more, but back in the old days, I got so much surfing out of it, I eventually bought a paid account just to contribute back.

    • Reddit - Similarly, I bought a founder Reddit Gold account when they appealed for cash, because I really enjoyed Reddit back before the eternal September.

    • iTunes - I use iTunes for quite a lot of things, apps, movie rentals and purchases, music purchases, and I have an iTunes Match subscription. If you have enough Apple gear to make an 'ecosystem', it's a good service.

    • Amazon Prime - I love Amazon. Some days, I wish I still worked for them.

    • Netflix - Most of my TV watching these days is netflix via Apple TV

    • App.net - I signed up for an app.net account the second I heard about it.

    It's not a huge list. I'd like it to be larger. There's whole categories of things I'd probably cheerfully pay for should they exist. I'd pay a subscription for a decent search engine that wasn't a front for a creepy advertising juggernaut. I might pay for a subscription 'social' network, maybe something like a family-focused Yammer. I'd love something like a cheaper netflix that just focused on pre-1960s movies and archive TV. I'd like something like the old programming.reddit or hacker news. I'd love a smart news aggregator, and if I can't find one to pay for soon, I may have to invent one.


    In the olden times, there was a lot of talk about internet micropayments, and about how they couldn't possibly work, or how they were imminent and essential to safeguard the future of the web. They never really quite happened, and the shiny allure of the internet as a huge content pipe of free everything triumphed over all, but lately it feels to me like the mood is perhaps shifting a little.


    People seem to be wising up to some of the privacy considerations of infinitely free stuff that is only ever paid for covertly. The mobile app store culture has engendered a user community more acclimatised to fee-paying for services. Kindle is powering a minor revolution in self-publishing. Finally, there's Kickstarter, which is perhaps the most interesting current development in internet financing.


    There's nothing particularly new about the thinking behind Kickstarter. Through a combination of great execution and timing, it seems to have hit critical mass over the last 12 months. In the midst of all the long-tail nerd-bait (I recently signed on for my first funding) and snake oil there are signs of some interesting funding efforts converging towards the mainstream. Champion self-publicist Amanda Palmer recently powered her project past the magical $1,000,000 mark, to flurries of 'old media' press interest.


    App.net is a manifest demonstration that I'm not completely alone in this line of thinking. Launched slightly before twitter's recent frantic, shark-jumping repositioning of its terms of service, it seemed a futile, quixotic gesture when I signed up to fund it on its kickstarter-esque signup page (apparently kickstarter's TOS precludes funding things like ongoing businesses, so they rolled their own thing). I fully expected it to fall short of its goal, but maybe pick up some positive news coverage as it flamed out, much like Diaspora did before. To my surprise it charged past the funding target ahead of the deadline, and closed way ahead of the target figure. Since then, they've launched the API, and built a sort of twitter clone on top of it at alpha.app.net, which is busy enough to be an almost useful, slightly cliquey chit-chat network of its own. It seems like app.net has the potential to self-host itself as at least a niche social network for privacy nerds and web developers. For some, that might be good enough, but I suspect the real power of app.net lies within its potential to become a kind of ad-hoc real-time message bus for higher layered services over its API. It remains to be seen if it can gather enough developer / user mindshare to deliver on the potential.


    The most high-profile campaign I've yet seen is Penny Arcade Sells Out. High profile, high traffic funny-picture sites are the gold-standard of high volume ad serving, with content that massive audiences enjoy, but are used to reading for "free". Although they fell short of their more extravagant targets, including the 'complete ad removal', they hit their funding target, and raised half a million dollars. An A-lister website demonstrating the ability to generate competitive income with top level ad-sales entirely from direct user funding? Nearly. Is the tide turning? I don't know, but I can feel it pull.

    posted by cms on 2012-09-08
    tagged as
  9. tee hee hee

    elfm.el is a rudimentary last.fm radio client implemented in emacs lisp. I wrote this at work to present at our internal "Radio Hackday", dedicated to encouraging staff to experiment with the radio services and API, and make something with them in a day and a half for show-and-tell. Kind of 20% time distilled right down to an essence.

    I wasn't sure if I was going to have enough time to contribute anything, so I wanted to focus on something I could hack on by myself, because I didn't want to hold a team back if I got called away. So I picked something jokey, inessential, yet hopefully thought-provoking, as per my usual idiom.

    I had a real blast participating. I don't usually get time to attend things like proper hack days, being all old and family-bound. I really enjoyed the atmosphere of inspiration and industry. All the other hacks were amazing, and waiting for my turn to demo I felt quite embarrassed about my stupid cryptic toy, but it worked perfectly in the spotlight. I got almost all the laughs, and all of the bemusement I was aiming for.

    The code is here. It is awful. I haven't written any coherent lisp on this scale for many years. It uses too many global variables and special buffers. It doesn't scrobble. I had to rewrite all my planned asynchronous network event machine halfway through implementation, when I re-discovered the lack of lexical closures in elisp. (I've been reading too many common lisp books in the interim, I suspect.) I think there's enough of the germ of a useful idea in there that I might just clean it up and try and extend it into a proper thing.
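    For anyone who hasn't tripped over this, here is a minimal sketch of the closure problem. It is not code from elfm.el: the function names are made up for illustration, and it assumes the forms are evaluated with dynamic binding (the only mode Emacs 23 has) and that no global variable called count exists. The workaround shown, lexical-let from cl.el, is one option that was available at the time, not necessarily the one the hack used.

    ```elisp
    ;; Hypothetical illustration only; these names are not part of elfm.el.

    ;; Under dynamic binding the lambda does NOT capture `count'.
    ;; By the time the returned function is called, the `let' has exited,
    ;; so evaluating `count' signals a void-variable error.
    (defun elfm-demo-make-counter ()
      (let ((count 0))
        (lambda () (setq count (1+ count)))))

    ;; `lexical-let' from cl.el emulates a lexical binding, so the
    ;; returned function keeps its own private `count' between calls.
    (require 'cl)

    (defun elfm-demo-make-counter-lexical ()
      (lexical-let ((count 0))
        (lambda () (setq count (1+ count)))))
    ```

    Calling the result of elfm-demo-make-counter-lexical twice with funcall should count up from 1, whereas the first version fails on its first call under dynamic binding. A callback for an asynchronous network request is exactly the sort of function that wants to remember state this way.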

    I built and ran it using GNU Emacs 23.4.1. I used an external library for HTTP POST, which I found on emacswiki (HTTP GET I glued together using the built in URL libraries). I've also put a copy of the version I used in the distribution directory. I used mpg123 for mp3 playback, which I installed using MacPorts. The path to mpg123 is hardcoded in the lisp somewhere, probably inside play-playlist-mpg123.

    Here's my demo script, which I evaluated in a scratch buffer. Evaluating these forms in sequence will authorise the application, tune in the radio, and then fetch a playlist of five tracks and start playing them.

    ;;;; -----DEMO , this example code is out of date, see README 

    ; will open a browser to authorise application


    ; authenticate a user session


    ; tune the radio to this URL

    (radio-tune "lastfm://user/colins/library/") 

    ; refresh the playlist 

    (get-request (get-playlist-url)) 

    ; filter the playlist response to sexps, play the list

    (play-playlist-mpg123 (reduce-playlist)) 

    There is only one playback control at the moment: stop, which you can manage by killing the buffer *lastfm-radio*, which has the playback process attached to it. You can retune the radio with any lastfm:// URL format by re-evaluating radio-tune, and then refreshing and playing the playlist, i.e. repeating the last three steps in sequence.

    The internal hackday was a cracking idea. Most of the hacks were focused around radio enhancements with broad-ranging appeal, and the vast majority of them looked practically useful. I suspect most of the work will filter out into site and product updates. In addition to this, and perhaps more valuably, it worked really well as a community exercise, evolving knowledge-sharing, cross-team working, and enthusiasm, and converting them into inspiration, craft, and art. More of this sort of thing, everywhere!


    I've iterated on the original hack quite a lot to make it slightly less brain-damaged, and a bit cleaner to import into anyone else's emacs. Updated code is here and so is a README file with updated running instructions. It's still not really in a usable state for anyone else, but it's amusing me to fiddle with it, and I vaguely plan to get it to a releasable alpha state, at which point I will publish a repository.

    posted by cms on 2012-04-28
    tagged as
  10. My friend Jim won 15 quid by solving the New Scientist Enigma Puzzle. The really neat thing is he did it 32 years after the fact. Read all about it here, in his own words.

    Would anybody with a working BBC like to contribute a real world run time for his BBC BASIC based solution?

    Jim runs the Enigmatic Code blog about his hobby of solving New Scientist's Enigma puzzles using short python programs, which anyone can play along with at home.

    posted by cms on 2012-02-23
    tagged as
  11. I was churlishly unimpressed by the iTunes "12 days" Christmas promotion this year. However whilst subsequently browsing the iTunes Store home page I did find one app that impressed me enough to blog about.

    There's a store section called "Apps Starter Kit" which lists a dozen or so applications that Apple are promoting as "must have" installs for new iOS users. I installed a handful of these to my iPhone 3GS, but the one that has most impressed me so far is the iOS edition of DragonDictate.

    It's a "split brain" app, by which I mean it uses "the cloud" to perform the speech-to-text conversion. So far I have been quite impressed with the accuracy of the process, in fact I have created this blog post by dictating while walking the dog, with just a little editing afterwards for tidy up and to add hyperlinks. I suppose it is a little like a poor man's edition of Siri, minus the pretend A.I. and the search and reminders integration.

    You can get text by dictating into a text box within the application and there is a quick menu of options that allow you to create an SMS or an e-mail or copy the text to the system clipboard easily for use in other applications. This collaboration isn't too clunky and although dictating text into your phone is a little stilted it doesn't seem to be significantly less effective than my relatively crappy typing on the iPhone on-screen keyboard.

    The app was free; presumably it's intended as a promotional device to introduce users to the Dragon family of software applications. Obviously there are some privacy concerns raised by having the voice processing performed on a remote server, but the terms and conditions include a privacy policy which guarantees to preserve your anonymity and keep your data private. The application even prompted me to ask if I wanted all of my contact names uploaded to the remote service for better name recognition, and took pains to explain that this would only include name fields from my contacts database and no other personally identifying information or contact details.

    I am not sure I would make a habit of using it for writing long articles or even blog posts like this but I think it could prove to be quite useful for such purposes as short e-mail replies or even sending SMS messages in situations where it's inconvenient to type.


    posted by cms on 2012-01-29
    tagged as
  12. According to wikipedia, the term "Churnalism" was first coined by a BBC journalist. I think they may still have journalists working there.

    See how many items of product placement you can see in this proud piece of presumably PR-led "pop sci" about smart vending machines. I found it, prominently linked, on the BBC news home page on Boxing Day. The entire notion has a whiff of that old-school classic of white elephant puffery, the internet fridge, about it.

    I don't know if I'm alone in finding this sort of thing repellent. The motivation to whip up this kind of nearly content-free guff into page length pieces must come from somewhere, which means a degree of specific intent. There's the skeleton of an interesting piece on mechanical learning and commercial interests buried in there somewhere, but I find it difficult to read when I keep being stabbed in the eyes by blatant marketing copy, much of which I uncharitably suspect of being pasted in directly from the source press-release. The focus of the piece ought to be on the science, perhaps some of the biometrics and algorithms supporting the interesting sounding audience impression metric (AIM) software, but that's given a throwaway mention; instead the article's centre of gravity seems distorted to orbit around some recently launched consumer products, with little depth of story. Weird details leave unanswered questions hanging. In what way is a new Jell-O SKU "Just for adults" to the extent that it requires a screening interview by femputer? Titillating teaser questions like this are familiar marketing devices used to capture and exploit base curiosity, but seem out of place in a news piece without any resolution. How does the system handle adults whose body shape diverges strongly from their defined four age brackets? What the merry heck is a general manager of personal solutions anyway?

    I gave up counting the product placement incidents after the first couple of paragraphs. Only someone with intimate knowledge of the BBC house style rules would know just how many direct repetitions of the properly capitalized brand names Kraft and Intel are strictly necessary, but there seem to be an awful lot of them littering the piece. There's a lovely Intel i7 box graphic three-quarters of the way down the piece; it seems to me only tangentially related to the story, yet conveniently re-uses the branding iconography supporting their current consumer-targeted CPU line.

    Like many a British licence-fee payer, I have a peculiar, combative, slightly proprietorial relationship with the BBC; being in some weird sense a stake-holder in this unique broadcasting organisation; pride mingles with a misplaced sense of ownership, disappointment tangles with admiration. Once upon a time I viewed their web initiatives as exemplary, inspirational and essential. These days they seem increasingly overcooked, irrelevant, and misguided.

    I realise, in a sense, I'm a grumpy old man ranting at the telly, but I think this tapering off of content quality provided by BBC online is a real thing. If so, a really worrying trend; added to this we have an effectively Conservative administration, who I'm sure would love to see the BBC, already in retreat, broken up further. Spreading out the more lucrative parts of the special quasi-monopoly, to their chums in commercial broadcasting whilst binning even more of the less lucrative parts in the name of austerity would fit in well with their principles of government.
    posted by cms on 2011-12-29
    tagged as
  13. Some time in 1997 I decided to get a modem for my home computer and try and get back on the internet. I hadn't really been online for a couple of years by this point. I'd spent a good 60% of the time I was supposed to be at university exploring the net, at approximately the same time the world-wide-web was being invented. Subsequently, a few of the offices I'd done contract work in were high-tech enough to have an internet pipe, but the majority were not, and by 1997 I was a year or two into the embryonic stages of what I then imagined to be a high-flying enterprise IT career. There were a few dial-up terminals in the office, but they were proper walled-garden, pretend the web isn't happening, CompuServe accounts, and I mostly ignored them.

    By the time 1997 came around, the internet was seriously encroaching upon the real world. URLs on product billboards, mainstream magazine articles, entirely dedicated consumer magazines, even. Java hype was everywhere in the trade media, and was getting a further boost up from the growing sense of discomfort about the disproportionate amount of influence Microsoft now wielded over the PC industry. I was pretty grumpy about Windows by this point. I'd cheerfully embraced its third generation, as a standard way to build what were for the time fairly advanced interfaces for DOS, with a built-in graphical toolkit, and I was making my living building client/server applications for businesses, using a 4GL called 'Gupta SQLWindows', and a smattering of C and Visual Basic. The IDEs and the Win16 API were probably rudimentary, but I didn't know much better, and it was the closest thing to NEXTSTEP I'd found in a professional context. Then came Windows 95, which promoted itself from a graphical shell for DOS, to a full-blown OS, which I found tremendously exciting until I'd worked with it for six months. All my tools and APIs were now yesterday's thing, and this new shiny Windows came with ridiculously inflated hardware requirements, and was frustratingly unstable. The joke term "Blue Screen Of Death" started to grate with familiarity. I grew insufferably contemptuous of Microsoft and everything it stood for.

    At home I'd been running a linux system for a year or two. Linux had grown up fast since I'd first encountered it as a barely installable joke UNIX passed around the office one day on a handful of floppies. I'd spent a day installing it on a COMPAQ laptop then, and quickly judged it to be no competition for SCO. It improved and spread rapidly, and within a couple of years I was sufficiently inspired by reports to acquire a cheap PC clone and install, break, reinstall a succession of linux distributions, starting initially with a Slackware 2.something from a magazine coverdisc (Computer Shopper, I suspect). Now I had a religion; I'd periodically switch distributions, usually from a CD/Book bundle in the bargain bucket of the local Waterstones, sometimes from a CD set ordered by mail. No net connection at home at all. Well, hardly anyone did, and there weren't yet any flat-rate or free dial-up systems.

    By 1997 though, I felt I was ready. I bought a discounted 33.6 external modem, subscribed to an ISP that sounded platform neutral, and didn't rely on bundling DOS or Windows software dialers (Direct Connection, as was), and spent a surprisingly effortless afternoon figuring out how to connect my little linux system to the internet. This seems like it ought to have been a frustrating process, given that this was RedHat 2.x or whatever I was running by this point, and I had no internet to search for help, and no local experts to ask, but I seem to remember it being fairly trivial to set up and script a PPP connection. I think the first thing I downloaded was Netscape Navigator. Or maybe Doom. I remember setting up an offline USENET server, and then feeling my way around the web, hungry for more linux information. I would download any interesting software source code bundle I could find, and try and build it. I periodically toasted my linux box this way, inexpertly installing new homebuilt versions of libc or XFree86 with little attention to package management or change control, and not much more appreciation for the software build process. Outside of USENET the linux web community seemed disjointed. Little islands of conflicting information, often hanging off university home pages.

    One day I found this amazing sort of crowd maintained combination of a news feed and a bulletin board, already populated with a peer group almost custom-fit for me. I think I can remember how I found it. I was using a little desk applet for the Afterstep window manager called asmodem that let me toggle my modem. I was very big on customising my desktop then. I looked up the author's home page, to see if there were any good links to other AS wharf applets. One of the links there was to this other place. I remember I spent a couple of hours there, browsing around what passed for the archives. It wasn't just linux and X, there were other nerd-friendly topics. I don't remember much about the content. I remember being engrossed, and following stories and commentary back and forth, drinking in content. Unluckily I didn't make a bookmark, and a couple of days later I realised I couldn't remember what the site was called.

    I think it took me as much as a couple of weeks to find it again. It had a stupidly hard to remember URL. http://slashdot.org/. I re-visited it frequently. It had a clever page construction, where the updates floated to the top, like a reverse INBOX. It aggregated interesting content, seemingly focused around linux, and GNU and other cool Free software like this new nuclear-mega-awk scripting language called Perl, and other nerdly content about movies, and sci-fi, and super-computers, and spaceships and BeOS. Stories were posted, usually based around a couple of links with commentary, and the users could add their own discussion in a threaded hierarchy, unmoderated, uncensored and even fully anonymously. I quickly became a compulsive visitor. Soon it was the first site I'd load after dialling up to the net.

    The anarchic commenting community sort of worked. You'd recognise the same usernames in discussions. Actually, I'd recognise sigs before names. Most of the discussion was lucid and informative. I'd usually get as much from links in the comments as I would from the submission or editorial. Even the trolls seemed funny and community-minded. It had a sense of culture, of community. First Post! Duplicate submissions on the front page, Hot grits down your pants, The naked and petrified guy, Mae Ling Mak, Natalie Portman, the caveman user I'm struggling to recall the name of (urk?), In Soviet Russia, a Beowulf cluster, and all the rest. Memes, I suppose, but we didn't really call them that much then. The 'slashdot effect'. I remember every time there was a stable linux kernel point release, which was pretty frequently, they'd post a story about it, and I'd dutifully download the source, spend a couple of hours compiling it, and then install it, ruining my precious uptime in the process. JonKatz and his floundering attempts to become one of the gang.

    I remember frequent stories about all these futuristic new desktop interfaces that were in the pipeline. GNUstep was well on the way to bringing my idolised NEXTSTEP frameworks into my home, cost-free. Futuristic new graphics display technologies (Berlin, Fresco). The amazing (and almost functional) eye-candy of the Enlightenment WM with its realtime miniwindow pagers and overlayed virtual desktops. Some new initiative called GNOME which was going to bring a CORBA-based networked component GUI desktop framework to run on top of traditional UNIX some day. Funny submissions, hoax submissions. Disappointingly frequent pseudo-science stories about perpetual motion machines and cold fusion, and the like. Crack dot Com were writing their new game "Golgotha" that would blend the large scale RTS wargame with the cutting edge first-person mouselooked shooting genre, and they were targeting linux as a first class platform at launch. It was all intoxicating stuff, and I spent hours immersed in it, genuinely feeling some part of a community.

    I was never a frequent poster. Initially I lurked, and dabbled with anonymity. I was very cautious about revealing too much of my personal information online in those days. I remember feeling really regretful for ages that I'd held off registering once I realised that people were competing over low UIDs. Still, here I am - user 24640 - 5 digits, not too bad. "scrutty" was the character I used to use on Perilous Realms MUD in my polytechnic days. I can't see any easy way to find my earliest comment by this account, and I can't remember what it was. Probably something embarrassing.

    I remained pretty obsessed with the site for years. My friend Tim was reminiscing on Twitter yesterday about my introducing him to it. I can remember coming home from holiday abroad, internet-free of course, and deliberately reading the previous seven days' submissions to make sure I hadn't missed anything. I quit my boring career and got a job at a cool dot com startup, just as things were bubbling up. Everyone there seemed to read slashdot, reloading dozens of times a day. Important technology stories broke there hours before the mainstream news sites got hold of any of it; we were always days ahead of the 'suits' with these information nuggets. Famous people had accounts and posted amongst us (John Carmack! ESR! Bruce Perens! Neal Stephenson! Wil Wheaton!) which seemed really bizarre in those days long before twitter or official facebook accounts. Comment moderation arrived, and I remember submitting comments and then reloading frequently to check my karma score, which used to be visible numerically. Karma whoring inevitably arrived, and brought meta-moderation along with it. I was the first in our office to be selected as a meta-mod, and I remember feeling proud or cool or a massive nerd, or some composite emotion made of all three. I loved that the site was billed as news for nerds, a term I felt far more comfortable with than the more US-specific 'geek', which still grates on my ears a little.

    I remember their IPO conducted in some kind of interestingly nerdy dutch auction system. I remember watching the stories of subsequent corporate ownership and acquisition and nervously watching the site for signs of imported cultural spoilage. I remember the Slashdot PT Cruiser. Slashdot was just a daily part of life, reflexively checked and rechecked. I submitted a handful of stories, but I don't remember ever getting one accepted. I remember Jim chuckling one day across the desk from me, because whilst running HEAD requests against slashdot.org to test a proxy server or something, he spotted that slashdot was inserting Futurama quotes into its HTTP responses, as X-Fry or X-Bender headers. I remember feeling I was drifting a little out of touch with the herd when they posted their famous iPod launch story.

    I particularly remember that infamous afternoon in September, TeeJay looking over his screen at me and saying something about the Net being broken, and the World Trade Centre. All the news sites were down, but Slashdot just about stayed up enough for me to read about what was happening in New York city, and dash to the office kitchen to remain clamped, open-mouthed to the BBC news feed.

    When I was formulating the boards at IMDb, slashdot was a gigantic influence on my design. Most obviously in the nested table thread structure, and the view options, but in some other subtler ways, that led me to eschew the fiddly point scoring and filtering, and implement constant post expiry to try and prevent the conversation ossifying around the earliest, most repeated subset of views. We inadvertently spawned the GNAA, who went back to slashdot, forming a particularly weird and unpleasant slashdot troll subculture. The first time I watched as IMDb was in a slashdot home page story (probably LotR or a Star Wars prequel) I remember my disappointment at the somewhat smaller than I'd imagined size of the slashdot effect; I don't think they even made it into our top 100 referrers report. I was already visiting the site less often, I had my own enormous forum to worry about, and I'd switched back to using a Mac (which had become consumed by the latest iteration of my beloved OPENSTEP). I was still probably reading it most days a week, but posting far less.

    I never quit completely. These days I'm probably down to a couple of visits a month, perhaps less than that. It still feels like an important part of my life, and I think it also represents an under-appreciated contribution to internet culture. It was the first blog-formatted site I recall ever seeing, although nobody called it that for years. It was the first successful news aggregation site to find a mainstream audience, and it unquestionably forged the user-sourced content and discussion model template used by subsequent sites like Digg, Reddit and HN. I think it was a peer group for a huge number of people much like myself, and an important bridging stage for internet community culture in between USENET and the all-encompassing web. It was "Web 2.0" and "Social" years before they arrived. It really promoted a sense of belonging. I have never met Rob Malda, but I remember feeling elated all day, when he used slashdot to successfully propose marriage to his girlfriend, and yesterday when the surprising news broke about his resignation from the job he invented at the site he founded, it gave me far more pause than the more famous, wealthier man who grabbed all the headlines by resigning the same day.

    Slashdot will endure, and I expect I will still visit it, sporadically. I'm not going to pretend it's as important to me today as it was even five years ago. I only just realised yesterday, that Rob Malda is one of my heroes, and I never even said "Thank You". Well, I have done now.

    posted by cms on 2011-08-26
    tagged as