2017-08-02

What I've been doing: Creation of omnibus eStory volumes.

I read Internet Fiction. A lot of it. It's what I spend most of my time doing these days. I expect to keep doing this for a long time. The thing to note about Internet Fiction is that you need an active Internet connection when you are reading. A lot of the host sites allow you to download the stories for off-line reading, but on the free sites, with a few exceptions, the files are plain text files that aren't very pretty. In fact, some are downright ugly. But if you don't have an active connection, they're better than nothing.

Looking ahead, I can see the time coming where I'm in a care facility. It may not come to that, depending upon how my health deteriorates, but I'd be wise to plan for it in advance. What I need to plan for is being in a care facility because of physical problems, but while my mind is still working. I don't think many care facilities provide Internet access for those in their care. At least, that's the assumption I'm making. It's unlikely I'd have a desktop computer in a care facility, but a laptop or tablet seems reasonable. So, if I'm going to have anything to read, I'll need to have amassed a collection of files, preferably eBooks, of the stories I enjoy reading. While some of the authors who post on the Internet have gone to the effort to repackage their stories for sale via Nook, Kindle, Smashwords, Lulu, and a growing plethora of related sites, many have not. Which means that for many of the stories that I enjoy reading, if I want something other than raw text or a downloaded web page, I have to create it myself. There are eBooks out there about this; I even have two of them in my collection, although I have to admit I haven't read them. There are discussions of this subject in various online forums frequented by authors; I've casually monitored the discussions in the Authors section of the Stories Online Forum. Mostly I've followed the examples of eBooks that I've purchased. The thing about the discussions on the author forums is, they pretty much assume you've already got a clean, complete document in .docx, .rtf, .odt, or some other accepted standard more advanced than .txt. They don't talk about what to do if you're starting from downloaded .txt files with lots of line feeds to keep the lines short, such as those available from Project Gutenberg or FictionMania.
They don't talk about starting from downloaded web pages, with all the nasty .html artifacts that can make a standard word processor choke, and which can cause problems with the more basic (read: free) .html editors. So I've had to do a lot of learning by trial and error, sometimes ending up with such a mess that I deleted the working document and started over from the original downloaded file; one thing I learned very early is that you don't edit the original file, you make a copy first and edit that.

.txt files have no formatting. No italics, no bold, no underlining, nothing. Sometimes authors use non-alphanumeric characters, such as - _ /\[]() to indicate formatting; this started with Usenet and BBS posts, where it was the only way. The Usenet/BBS crowd developed a fairly standard definition for what these non-alphanumeric characters were intended to convey, but if you didn't grow up on Usenet it's not intuitive. If you start with a file that has formatting indicated in this manner, you have a big job ahead of you. First, you have to import the file into your favorite document editor that supports modern WYSIWYG formatting. Then you have to determine what the author intended with the symbols he used, apply the WYSIWYG formatting, and remove the characters used to imply that formatting. While there may be document editors out there that allow you to search for text surrounded by certain characters, and then replace those characters with modern formatting, I haven't come across them. So if you start with a document like that, it's going to be very labor-intensive to bring it up to modern standards, as you will have to go through it character by character. You will also need to look for foreign language words with non-English characters; I'm constantly replacing deja-vu with déjà vu, for example. And then you have to go through, line by line, adding a space to the end of each line and deleting the line feed so that paragraphs flow together as one unit. In the early days of Usenet and BBSs, lines didn't wrap around the screen, they ran off the end, and you had to manually insert line feeds into the document to keep the lines from running off the screen. Modern technology handles wrapping text just fine, and the display width is much greater, so a sentence that might take three lines with line feeds may take just one line with them removed.
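
For anyone comfortable with a little scripting, the two mechanical chores described above (removing the hard line feeds and swapping marker characters for real formatting) can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not a finished tool: it assumes paragraphs are separated by blank lines, that *asterisks* mean bold and _underscores_ mean italics (a given author's usage may differ), and that HTML tags are an acceptable intermediate form to import into a document editor. The function names are made up for illustration.

```python
import re


def unwrap_paragraphs(text):
    """Join hard-wrapped lines into one line per paragraph.

    Assumption: a blank line marks a paragraph break; every other
    line feed is just there to keep lines short.
    """
    text = text.replace("\r\n", "\n")
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # split()/join collapses each line feed (and any run of spaces)
    # inside a paragraph into a single space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)


def markers_to_html(text):
    """Replace *bold* and _italic_ markers with HTML tags.

    Assumption: the marker meanings above; adjust the two patterns
    if the author used the characters differently. Markers inside
    words (like snake_case_name) are left alone.
    """
    bold = r"(?<!\w)\*([^\s*](?:[^*\n]*[^\s*])?)\*(?!\w)"
    italic = r"(?<!\w)_([^\s_](?:[^_\n]*[^\s_])?)_(?!\w)"
    text = re.sub(bold, r"<b>\1</b>", text)
    text = re.sub(italic, r"<i>\1</i>", text)
    return text


if __name__ == "__main__":
    sample = (
        "It was a _really_\n"
        "long night, and *nobody*\n"
        "slept.\n"
        "\n"
        "The next day was worse.\n"
    )
    print(markers_to_html(unwrap_paragraphs(sample)))
```

Unwrapping first matters, because a marked phrase can straddle a line feed in the original file. The result still needs a read-through by eye; a script like this can't tell a stray underscore from an intended one, and it won't catch things like deja-vu.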
What I do now, when I come across such a file, is do an Internet search to see if it's been reposted with this reformatting already done. A good example of this is the stories by The Professor, which were originally posted at FictionMania, without formatting. In 2010 PS obtained permission from The Professor to repost many of his stories at BigCloset TopShelf. The reposted stories had The Professor's intended formatting. They were also .html documents. This was good and bad. Good, in that the character formatting had been done. Bad, because .html documents have frames and all sorts of other stuff that mess up non-.html documents in text processors. BigCloset allows you to create a printer-friendly document that doesn't have all the site advertising, sidebars, menus, etc. that their web pages are cluttered with. You can download that page, which gives you a much nicer document to start with in an .html editor. But I quickly discovered that weird shit happens when editing .html files: doing something that seems completely innocuous will cause a section of text to suddenly change font and font treatments for no reason I can fathom, and the damage proves to be beyond the undo function to handle. So I gave up on editing .html files as the path to nifty eBooks. I had to; I was getting too angry and frustrated. The next thing I tried was to copy/paste the entire text of the printer-friendly file in one fell swoop into a LibreOffice Writer document. This worked, after a fashion, but introduced some .html formatting elements, such as frames, into the document. These elements somehow interfere with some of LibreOffice's formatting tools; I kept finding myself unable to insert horizontal lines between sections of text to indicate breaks in action, in place of the short runs of dashes that had been used. And I really wanted those horizontal lines; they look much nicer than short runs of dashes.
What I'm now doing, which is somewhat time-consuming and repetitive, is copying/pasting text from within an individual frame; this way there are no .html artifacts to interfere with my document editor. It takes time, but is still so much faster than starting from a .txt file that it isn't funny.

Formatting aside, stories are posted in different ways. Sometimes the entire document is posted at once, sometimes it is posted in sections. Depending upon the host site, files may be limited in size, with larger files having to be broken down into parts. If posted via a mailing list, short stories may be posted complete, but longer works will be split up. If the author's mailing list has a host site with file storage capabilities, he may store the complete story as a single document at that site, and that single document will be what gets posted at other sites that can handle files that size. That's how Morpheus does things. Usually. He used to post his stories as serial emails to his Yahoo! Group, then post the complete story as a single file at BigCloset. Recently he's posted them at BigCloset at the same time he's sent them to his mailing list, and not posted a complete doc at BigCloset when done. The Academy was the last story in his Were universe that he posted at BigCloset as one file. The next, Touching the Moon, was posted in 62 parts! If I'm creating eBooks to read in a care facility, I don't want to have 62 eBooks to read one novel; that's just not going to happen. FictionMania readers were lucky: he posted it as one document there, but since it was done as a .txt file, it has no formatting and lots of line feeds. Touching the Moon was a straightforward cut/paste of 62 text blocks into an .odt doc; .odt is LibreOffice's default document format. However, I don't have an .odt document of Touching the Moon. Rather, I have one .odt doc of all the Were universe stories to date. With a cover page, a title page, a table of contents with internal links to each story, and at the beginning of each story, right under the title of the story, links to the files at FictionMania and BigCloset. And an About the Author section at the end, with a link to the copy of a chat session interview with him stored at FictionMania.
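
When a story arrives in dozens of parts, stitching the saved parts together is another step that could be scripted rather than done by hand. The sketch below is a hypothetical illustration only, not how the Were universe document was actually built (that was manual cut/paste into LibreOffice). It assumes each part has already been saved as a UTF-8 plain text file, and that the file name contains the part number as its first run of digits, for example a made-up name like part-07.txt.

```python
import re
from pathlib import Path


def part_number(path):
    """Return the first integer in the file name (without extension), or 0."""
    match = re.search(r"\d+", Path(path).stem)
    return int(match.group()) if match else 0


def merge_parts(paths, separator="\n\n* * *\n\n"):
    """Concatenate part files in numeric order.

    Sorting by number rather than by name keeps part 10 after
    part 9 even when the names aren't zero-padded. The separator
    is an arbitrary placeholder for a break between parts.
    """
    ordered = sorted(paths, key=part_number)
    texts = [Path(p).read_text(encoding="utf-8").strip() for p in ordered]
    return separator.join(texts)


if __name__ == "__main__":
    # Hypothetical folder and output names, for illustration only.
    parts = list(Path("parts").glob("*.txt"))
    Path("complete.txt").write_text(merge_parts(parts), encoding="utf-8")
```

The merged file still has to be imported into a document editor for the cover page, table of contents, and links; the script only saves the repetitive pasting. A file name with an earlier number in it (a date, say) would confuse the sort, so the naming assumption needs checking first.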

I've created a number of documents like that. And using Calibre, an eBook management/conversion program, I've created eBooks from those documents in a number of file types, for ease of reading. I've also had the thought that after all the effort involved in creating these documents, it would be nice if it benefited more than just myself.  Since I don't own the rights to the stories, I can't distribute them without the permission of the author. The author may prefer to handle distribution themselves; while I did the packaging, I consider the documents their property to utilize as they see fit. If they want to sell copies, fine by me. If they want to make them freely available, well, that's pretty cool.
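
Calibre also installs a command-line converter, ebook-convert, so producing several file types from one master document can be scripted instead of clicked through. The following is a minimal sketch, assuming Calibre is installed and ebook-convert is on the PATH; the choice of output formats, the file name, and the function names are illustrative assumptions, not a record of what I actually ran.

```python
import subprocess
from pathlib import Path


def build_commands(source, formats=("epub", "mobi", "azw3"),
                   title=None, authors=None):
    """Build one ebook-convert command line per output format.

    Basic usage is: ebook-convert INPUT OUTPUT [options]
    The output type is taken from the output file's extension.
    """
    source = Path(source)
    commands = []
    for fmt in formats:
        cmd = ["ebook-convert", str(source), str(source.with_suffix("." + fmt))]
        if title:
            cmd += ["--title", title]
        if authors:
            cmd += ["--authors", authors]
        commands.append(cmd)
    return commands


def convert(source, **options):
    """Run each conversion, stopping at the first failure."""
    for cmd in build_commands(source, **options):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical master document name, for illustration only.
    convert("omnibus.odt", title="Omnibus", authors="Author Name")
```

Keeping the command building separate from the running makes it easy to print the commands and look them over before converting anything.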

I've only contacted one author about this so far. With very positive results. With the permission of The Professor, I've uploaded eBook versions of The Complete Ovid Stories to the Internet Archive. While I haven't looked into what would be involved in making them available through the Nook and Kindle storefronts as free eBooks, I have The Professor's permission to do so; it's just a matter of working with those sites to make it clear that while it is not my intellectual property, I have been authorized to act as The Professor's agent in placing copies in the wild.

I've got to say I feel pretty good about this. While the majority of Internet Fiction is dreck, Sturgeon's Law holding true, there's some pretty good stuff that risks getting lost when the host site closes, as happened when EWP went under; in that case we were fortunate that the Internet Archive's Wayback Machine had archived the site, and that someone checked while we still remembered the URL of EWP, since the Wayback Machine indexes by URL. Unlike the print publishing industry, where using your legal name as the author is the norm, Internet Fiction is almost entirely published under pseudonyms. The heirs to print industry authors generally know that so-and-so is an author, and what he's published, and can take action to keep those items in print, so that they get the revenue. The vast majority of the time, Internet Fiction authors' relatives have no clue that they write Internet Fiction, nor how to obtain access to their accounts. I was fortunate that twenty years after the last Ovid story was posted, The Professor was still monitoring the message board at FictionMania, and answered my message asking if anyone knew how to contact The Professor, if he was still alive. The last person I knew to be in contact with him, PS, in 2010, hadn't posted at BigCloset since 2013, and the last contact Angharad had with PS had been several years ago, when he was in ill health. In the print community, publishers generally find out when their authors die. On the Internet, unless someone in contact with them outside the Internet finds out and posts the information, an individual could be dead for decades and no one would know it; they'd just know it had been a while since they'd heard from him. Without knowing legal names, you can't search for obituaries or go through the Social Security Death Index.
This can sometimes be disastrous for an Internet community, when the person managing the web hosting dies and the first anyone knows is when the site is shut down for non-payment of maintenance fees. BigCloset, The Crystal Hall, and Stardust all had that start to happen to them, when Bob Arnold died. He'd not only handled hosting those web sites, the server's physical location was his home. When the power was turned off, the sites went black. In this case, his family knew what he had been involved with, and approved, which is pretty incredible since the primary genre posted to those three sites is Transgender Fiction; Bob wasn't Trans himself, but had an interest in Transformation and Gender-Bender fiction, which heavily overlaps TG fiction. The admins at BigCloset were able to contact Bob's family, and arranged for the power to go back on, and then purchased and relocated the servers. Stardust was Bob's baby, and is being maintained in his memory. The Crystal Hall has since set up shop on its own, but maintains close ties with BigCloset. But if Bob's family hadn't approved of what he was doing, and if Erin and the other admins at BigCloset hadn't known how to contact them, all three sites would have been lost forever, along with any stories not backed up elsewhere. BigCloset has set up a corporation to administer the site, so there won't be one key individual whose loss will bring it down. BigCloset also has a memorial wall, listing the names of those members known to have died. There is a forum thread at Beyond The Far Horizon dedicated to information on the status of authors, but I don't know what arrangements Gina Marie Wylie has made for maintenance of the site when she becomes unable to do so; she's already had to change the domain type in the URL because someone sniped the domain renewal on her.
Stories Online, and its sister sites, Fine Stories and SciFi Stories, are managed by World Literature Publishing Company, but as far as I know that organization is wholly owned by Lazeez Jiddan, and I don't know what arrangements he's made for their continuation when he's no longer up to it. He's a very hands-on sysadmin, with lots of hand coding of the site infrastructure; it would be very difficult for someone to come in cold and keep it going.

Potentially, I could be making master documents and talking with authors about getting them archived for a very long time. I'll keep making the documents since they meet a need that I have. And I'll keep offering them to the authors because it would be criminal, in my mind, to keep the results of that effort to myself.

This isn't the first time I've done something like this. At my Academia site are stored .pdf files of

Di Grassi his true Arte of Defence modernized v1 2

Vincentio Saviolo, His Practise in two books, modernized typeface, annotated vocabulary

which I produced several years ago. The copies available were all unmodified images of the original publications, and man, were they hard to read. So I ran them through OCR, corrected all the OCR errors, annotated them, created new .pdf files, posted them to Academia, and spread the word through the SCA Rapier community and the HEMA community. I didn't update the spelling; they're a strict transcription formatted to match the original. The one major alteration was replacing the illustrations from the Di Grassi English edition with the much better ones from the original Italian edition, which I did at the suggestion of one of the HEMA types, who provided the URL for the images. I should probably get them uploaded to the Internet Archive as well. I need to modify them anyway, since my email contact information in them is now incorrect.

Addendum, 8/29/2017: OK, I hadn't looked at the text of the fencing manuals since I created them in 2013, so I was in error. I did standardize, and modernize, the spelling, and in some cases, the words themselves, substituting modern equivalents where the intended meaning was no longer what the word means in Modern English. I'm currently creating a non-normalized EModE version of Saviolo, and plan to then create a normalized version, which will make the text of use to researchers who want it in the original language. I need to revise the modernized text, as I've found places where I misread the original text the first time through, and I want to rethink EModE/Modern word equivalencies.
