“Programming is like writing a book… except when you miss a single comma on page 126, the whole thing makes no sense” – Programmer humour
Many talented publishers think that they could never learn to code. They might think that it is too technical or too dry. They might assume that their proficiency with language has no application to computers. But programmers and publishing people have more in common than you might think. Programmers and writers both need to distill complex problems into clear, readable and actionable messages. Coding can be as aesthetically driven as any other art form and, like writers, the best programmers have technical ability and creative flair. So, what are some of the most common ways that coding overlaps with publishing?
On Friday 22 November 2019, before this year’s Futurebook conference, 40 delegates from across the publishing industry will attend the very first Day of Code at Hachette HQ in London. There, under the guidance of a team of 15 coaches, they will work in teams to build their own websites, and discover the power of code in practice. In this interview, Day of Code coach John Pettigrew previews the event and explains why coding is important for today’s publishers.
Git is a free-to-use distributed version control system for tracking changes in software development. However, the benefits of using a fully-fledged version control system like Git don’t just apply to writing software. Whether it’s a bug fix or a paperback edition, both publishers and software developers are constantly shipping new products and revising older ones. This means that every day we perform similar sorts of
EPUB is the open standard that defines exactly what an ebook can be, and EPUB 3.2 – the first real update to the EPUB format since EPUB 3 in 2012 – has just been approved by the W3C Community Group and Business Group (the clever people that decide on such things).
If you are the sort of person who enjoys reading specification documentation then you can get up to speed with the full list of changes here. Otherwise, let me explain the main points of what has changed…
This is a guest post by Anna Cunnane. Anna is Systems and Data Manager at Abrams & Chronicle Books. She won the Trailblazer Awards 2018, is part of BookMachine Team Unplugged and was Chair of the Society of Young Publishers (2015-16).
Abbie Headon runs Abbie Headon Publishing Services, and offers a range of skills including writing, editing and commissioning, alongside social media, website development and publishing management. She champions fresh approaches to solving the industry’s challenges and can be found mingling at most publishing events. Abbie’s also BookMachine’s Commissioning Editor and sits on the BookMachine Editorial Board.
Francesca Zunino Harper is a linguist, translator, and publishing professional. She worked for several years in British and international academia, researching comparative literatures, translation, and women’s and environmental humanities. She now works in the Humanities and Social Sciences area of publishing. You can follow her @ZuninoFrancesca.
Sara O’Connor worked in children’s publishing for 13 years and is now a full-time web developer. Frustrated with the cost and creative limitations of outsourcing good digital ideas (check out this article), she decided to retrain as a programmer. She now works with Emma Barnes at Consonance (the new name for Bibliocloud), helping to build the software she wished she had when she worked in publishing. Sara will be joining our panel at BookMachine Unplugged 2018: Talking Tech.
Janneke Niessen is a serial entrepreneur, angel investor, board member and mentor for startups. She has started and sold 2 international tech companies and is currently working on her third: Berlage. She is co-initiator of InspiringFifty, an initiative that aims to increase diversity in tech by making female role models more visible. As part of the InspiringFifty initiative, Janneke has published Project Prep, a novel for young girls, in conjunction with an award-winning children’s book author. Janneke will be joining our panel at BookMachine Unplugged 2018: Talking Tech.
Nick Barreto works where books and technology intersect. He’s managed and built apps, and is an expert on ebook formats, metadata and workflows. He is committed to automating all the repetitive tasks to free up more time for the work that matters. Nick is one of Canelo’s co-founders and the Technology Director. @nickbarreto
This article is by Ken Jones of Circular Software. Ken is running the Understanding eBooks day on 25th April 2018.
I’ve been involved in making beautiful and interactive fixed layout ebooks since before there was a standard for such things. But trust me, this one is different… It is truly the finest example of interactive children’s storytelling I have ever seen. It contains custom movies on every spread, background audio, professional narration and read-aloud text highlighting, placed web code, personalisation, interactive animations and puzzles!
We’re fortunate to have so many book discovery tools and techniques available to us, but leveraging them effectively can be challenging. In this post I’ll share some insights on working strategies, drawn from experience building search and recommendation engines, and from helping publishers connect with readers through keywords.
John Chelsom started the XML Summer School in 2000, and continues as a board member and lecturer at this annual event. Since 2010 he has been the lead architect of the open source cityEHR product – an XML-based electronic health records system which combines clinical data with medical knowledge bases and is currently used in a number of hospitals in England.
2016 saw the unearthing of the oldest written document yet found in the British Isles – on a wooden Roman tablet from about 50 AD. We put huge effort into creating the digital text published today, but how much of it will still be around to read 2000 years from now?
At the XML Summer School over ten years ago, someone asked our expert panel to give guidance on the best way to archive digital publications. The most creative answer, from Robin Cover (a renowned digital archivist!) was that we should carve our text on tablets of stone, marked up in XML. I can still remember laughing heartily at his suggestion, but now I’m beginning to think he was right.
Robin’s argument was that text alone is not good enough for representing the information we want to preserve – we also need some representation of structure and metadata. For that he proposed XML – its encoding is just plain text and, given a sufficiently large sample, its logic can be decoded without having the original specifications. His proof was to go back to digital assets of the 1960s – how many of us would have software available that could read documents created way back then? Well, if those documents were marked up in GML, the Generalized Markup Language invented by Charles Goldfarb at IBM in 1969, we would find they could still be read by any software that handles XML, the Extensible Markup Language descended from GML. Such software is all around us and much of it is free, including any web browser or plain text editor. Try doing the same with a proprietary word processing, desktop publishing or typesetting format, where the original application ceased to exist even ten years ago.
As for the tablets of stone, if we found a 50-year-old GML file what are the chances we’d be able to read the media it was stored on? Preservation of our digital assets is dependent on the technology used to store them, and even in fifty years we have seen many technologies come and go: paper tape streamers, tape drives and floppy disk drives will no doubt be followed into the dustbin of technology by DVDs and USB sticks over the next fifty years. So in thousands of years’ time, when the electricity has been switched off and archaeologists are picking over the debris of our silicon age, it’s still more likely they will find the text written on stone, rather than tapes, disks or chips. You may think this sounds a little crazy, but the Memory of Mankind project in Austria is aiming to do exactly that – preserving contemporary human knowledge on stone, buried deep in a salt mine for future generations to find.
I sometimes tell people that publishing should be viewed as an investment in digital assets – the more value we can create in those assets and the more we can reuse them, the greater will be the return on our investment. Our most cherished digital assets deserve to be preserved for the future, if only to protect our investment in producing them. And though many of us won’t be ready just yet to carve our documents in stone, we should at least be thinking about the first step of representing those documents in XML.
Take a look at their website for more information on the XML Summer School and their events.
Emma Barnes taught herself to code after founding her own independent publisher, Snowbooks. She went on to build Bibliocloud, the next-generation publishing system. Now she’s on a mission to promote tech skills within the publishing industry and beyond. Emma is also on the newly-formed BookMachine Editorial Board.
6.50am Wake up, wonder what day it is and remember – great! It’s the one day this week that I can dedicate to programming. I’m the MD of the indie publisher Snowbooks, and I’m CEO of Bibliocloud, responsible for sales, finance, and customer success, so each day is very different. But I reserve at least one day a week for slipping the needle in and luxuriating in single-minded programming. It so happens that it’s a Saturday, but that’s when the emails stop… context switching is my biggest foe.
8am First coffee, and a read through the opening chapters of the new Sandi Metz book about object-oriented programming in Ruby. It’s great when you find a book that directly addresses the real-world problems you’re facing. I click through to a podcast that she’s on to hear more.
11am Tests. Yesterday I discussed a piece of code that needs some attention with my colleague, Andy. The code is a method which returns a collection of external URLs that gets displayed in Bibliocloud. The URLs take you to a book’s Amazon.co.uk page, or Amazon.com page, or Wordery page, or British Library page, and so on — a handy and quick way to check what data is out there in the wild. The method doesn’t have automatic test coverage yet, so I’m going to start by documenting current behaviour. I do this using an integration test which mirrors what a user would do. We use Cucumber which gives us a common language between non-technical team members and programmers. I start by creating a new branch of the code based on our master branch, and create a Cucumber feature which literally reads “When I visit the ‘Autodrome’ page in Bibliocloud, and I click on the Amazon.com link, then I should be taken to the ‘Autodrome’ page on Amazon.com”. I then write some code to translate that into automatic test steps.
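A minimal sketch of what "documenting current behaviour" can look like in plain Ruby, without Cucumber. The method name `external_links`, the constant `EXPECTED_TODAY`, the helper `changed_destinations`, the sample ISBN and the `.example` URL patterns are all assumptions made for illustration; none of them is Bibliocloud's real code or a retailer's real URL scheme.

```ruby
# Hypothetical stand-in for the untested method: it returns a hash of
# destination name => URL. The URL patterns are placeholders.
def external_links(isbn13)
  {
    "Amazon.com"   => "https://amazon.example/com/#{isbn13}",
    "Amazon.co.uk" => "https://amazon.example/co.uk/#{isbn13}",
    "Wordery"      => "https://wordery.example/#{isbn13}"
  }
end

# What the method returns today for one sample ISBN, recorded before any
# refactoring starts.
EXPECTED_TODAY = {
  "Amazon.com"   => "https://amazon.example/com/9780306406157",
  "Amazon.co.uk" => "https://amazon.example/co.uk/9780306406157",
  "Wordery"      => "https://wordery.example/9780306406157"
}.freeze

# Lists every destination whose URL no longer matches the recorded one.
# An empty list means the behaviour is unchanged.
def changed_destinations(isbn13)
  actual = external_links(isbn13)
  (EXPECTED_TODAY.keys | actual.keys).reject do |name|
    EXPECTED_TODAY[name] == actual[name]
  end
end

puts changed_destinations("9780306406157").inspect
```

Running this before and after each refactoring step gives a quick signal that nothing visible to the user has moved, which is the same job the Cucumber feature does at the browser level.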
1pm The grand refactor. The Sandi Metz book has given me a couple more clues as to how this method could be improved, and I’m trying to hold all the concepts in my head so I can look at the problem squarely. Sandi Metz talks about finding the right level of abstraction, so I’m trying to think about which objects this problem is actually concerned with. Is it the validity of the ISBN that is key? Or the destinations themselves? Or the structure of the URLs? Some are built using the ISBN10, others with the ISBN13. Will there be a future case where the URL is built using an ISSN, or a DOI, or an ASIN, or an ISTC, or an ISNI, or an ORCiD iD? If a book belongs to a series, can we say that the book has an ISSN? If its authors have ORCiD IDs, can we use those to create external links for the book? What about linking to the client’s own website?
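Since some destinations want the ISBN10 and others the ISBN13, one piece of the puzzle is deriving one from the other. The sketch below uses the standard ISBN-10 check-digit rule (weights 10 down to 2, modulo 11, with 10 written as X); the method name `isbn13_to_isbn10` is an assumption for illustration, and it deliberately does not validate the ISBN-13 check digit.

```ruby
# Convert a 978-prefixed ISBN-13 to its ISBN-10 equivalent.
# Returns nil when the input is malformed or has no ISBN-10 form
# (for example, 979-prefixed ISBNs).
def isbn13_to_isbn10(isbn13)
  digits = isbn13.to_s.gsub(/[\s-]/, "")
  return nil unless digits.match?(/\A978\d{10}\z/)

  # The nine digits between the 978 prefix and the ISBN-13 check digit.
  core = digits[3, 9]

  # Weighted sum with weights 10, 9, ..., 2.
  sum = core.chars.each_with_index.map { |ch, i| ch.to_i * (10 - i) }.sum
  check = (11 - sum % 11) % 11

  core + (check == 10 ? "X" : check.to_s)
end

puts isbn13_to_isbn10("978-0-306-40615-7") # prints 0306406152
puts isbn13_to_isbn10("9780804429573")     # prints 080442957X
```

Returning nil rather than raising keeps the "nope, I don’t have one" case cheap to handle for whatever code builds the URLs.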
Or is this a case of YAGNI (‘you ain’t gonna need it’)? All this matters because I want to put the code in the right place, named properly, so that we can find and change it easily later. Maintainability, in a large, active system such as Bibliocloud, is probably the most important thing. I start by working with David to sketch out the problem (see the picture), then create a new Ruby class by adding a text file to my local code repository called external_links.rb.
Like the common language provided by Cucumber, the challenge so far has been approached not with code, but with language, reading, grammar, discussion, and story. I reflect — not for the first time — on how relevant publishers’ skills are for programming.
2pm Lunch and back to the other Sandi Metz book I’m reading: Practical Object-Oriented Design in Ruby. There’s a good bit on page 93 where she talks about duck typing, which I suspect might be relevant. The idea behind duck typing is that “if it looks like a duck and quacks like a duck, it’s a duck”. So my ExternalLinks class doesn’t actually need to be handed a book object in order to build the URL. It only expects to be able to get an answer when it asks “what’s your ISBN?” (even if it’s “nope, I don’t have one”). I could similarly give ExternalLinks a display spinner, or a CD, or a cassette audiobook: just so long as it can say what its ISBN is. I’m going to use this idea to write ExternalLinks so that it’s not tightly coupled to the Book class itself – though I’m a bit worried that this is another case of YAGNI. I commit this code to my local branch, glad that I’ve named it “spike/external_url_refactor” so that I can discuss this approach with my colleagues before considering it for a merge into our production system.
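The duck-typing idea can be sketched in a few lines. Only the class name ExternalLinks comes from the day described here; the `DESTINATIONS` table, the `.example` URL templates, the `urls` method and the three little product types are assumptions made for illustration, not the real implementation.

```ruby
# A duck-typed ExternalLinks: it accepts anything that responds to #isbn.
class ExternalLinks
  # Hypothetical destinations with placeholder URL templates.
  DESTINATIONS = {
    "Retailer" => "https://retailer.example/product/%s",
    "Library"  => "https://library.example/search?isbn=%s"
  }.freeze

  def initialize(product)
    @product = product
  end

  # Returns destination name => URL, or an empty hash when the product
  # answers "nope, I don't have one".
  def urls
    isbn = @product.isbn
    return {} if isbn.nil? || isbn.empty?

    DESTINATIONS.transform_values { |template| format(template, isbn) }
  end
end

# Three unrelated types; the only thing they share is an #isbn method.
Book           = Struct.new(:title, :isbn)
Audiobook      = Struct.new(:narrator, :isbn)
DisplaySpinner = Struct.new(:isbn)

puts ExternalLinks.new(Book.new("Autodrome", "9780306406157")).urls.inspect
puts ExternalLinks.new(DisplaySpinner.new(nil)).urls.inspect
```

Nothing in ExternalLinks mentions Book, so a new product type needs no change to the class as long as it can report an ISBN. Whether that flexibility is worth having yet is exactly the YAGNI question.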
3pm Iteration. I run the test that was passing earlier and it fails. Huh. I abandon the integration test and start unit testing at a deeper level of the code. I realise that there’s a requirement I hadn’t understood: some of the destinations are dependent on format, as well as ISBN type. Writing the tests illuminates some of the nuances of the domain and I jump between revising the tests and revising the code (avoiding doing both at the same time, which is a recipe for misery).
4pm Leave to pick up my son, as I do every day of the week. Programming allows for flexible hours. It’s the sort of job that benefits from a bit of percolation, and fitting it around family makes me happy that I can experience life and motherhood as it happens, rather than only working hard for some imaginary future.
8pm Share today’s programming. Bedtime is done, and I look at the code again, but I think I’ve got as far as my brain will take me today, so I push the code to a branch on Bitbucket, our remote code repository, and raise a pull request with my colleagues. I’ll look forward to discussing this approach with them on Monday and seeing if they notice any glaring or subtle errors, or can suggest better ways to structure the code. [Postscript from the future: on Monday, we found no errors as such, but we improved the test suite and I got a lot of clarity about separation of concerns from my code review with Andy.]
10pm Bit more of that Sandi Metz book. It really is very moreish.