TransHum-D23
07.06.17

 
 
 
NOTE:
This open standard is undergoing revision, in order to harmonize with the first implementation: XanaduSpace(tm), which may be downloaded from xanarama.net.
/ 2007.06.17

 
 
TransHum-D22z
05.10.22#
This is currently the Transliterature main page.
TRANSLITERATURE*
A Humanist Format for Re-Usable Documents and Media
• Deep  • Open  • Re-User-Friendly
• Free-Form  • Nonhierarchical  • Profusely Connectable

 BEING DEVELOPED OPEN SOURCE AT:
University of Oxford • University of Southampton
• Project Xanadu® • Xanadu Australia
• Liquid Information, London  • University College, London
Theodor Holm Nelson
Oxford Internet Institute and Project Xanadu
electromail address grebnetug[å]xanadu.net

        Note: Your questions and emails may be posted.
        Note: We are looking for a coordinator, one person who can understand all the details and explain them to participants.  Repeatedly.


* "Transliterature" is trademarked not for commercial purposes but to avoid semantic creep. Our trademarked terms may be used only for what exactly fits our specs-- with no additional features.  (Software compliant with our specs but having additional features may be called Transliterature-compliant.  [To be detailed in open source license not yet decided.])  The same applies to Transquoter™, TransLit™, LUSTR™, Transversioner™.

Note that for trademark purposes this Transliterature design itself counts as goods distributed in commerce.

This work derives from a simple question we asked long ago: "How can computer documents– shown interactively on screens, stored on disk, transmitted electronically– improve on paper?"  Our answer was: "Keep every quotation connected to its original source."  We are still fighting for this idea, and the great powers it will give authors and readers.  (Others would later ask a very different question: "How can computers SIMULATE paper?"– the wrong question, we believe, whose mistaken pursuit has brought us to the present grim document world.)

One part of this project is available already: The Xanadu® Transquoter™, which does indeed keep quotes connected to their origins.
 

First, to My Fellow Humanists:
To laymen and outsiders, the world of computer media seems immutable.

You have been taught to use Microsoft Word and the World Wide Web as if they were some sort of reality dictated by the universe, immutable "technology" requiring submission and obedience.

But technology, here as elsewhere, masks an ocean of possibilities frozen into a few systems of convention.

Inside the software, it's all completely arbitrary.  Such "technologies" as Email, Microsoft Windows and the World Wide Web were designed by people who thought those things were exactly what we needed.  So-called "ICTs"-- "Information and Communication Technologies," like these-- did not drop from the skies or the brow of Zeus.  Pay attention to the man behind the curtain!  Today's electronic documents were explicitly designed according to technical traditions and tekkie mindset.  People, not computers, are forcing hierarchy on us, and perhaps other properties you may not want.

Things could be very different.

Instead of looking at deeper possibilities for electronic literature, the designers of today's electronic documents have imposed hierarchy and simulated paper.  This has drastically limited us.  We cannot annotate, we cannot publish side-by-side commentaries, we cannot legally quote at length, we cannot easily see the original contexts of quotes.*

* I must concede that there are many disparate attempts to do these things, and others mentioned later, but the increasing tangle of today's systems probably dooms most of them to only local success.


All this the Transliterature Project wishes to change.
 

FIRST, A FEW PICTURES
How about--
allowing any two documents or versions to be compared side-by-side, showing their links and/or transclusions (if any).

allowing anything to be annotated side-by-side by anyone, attaching comments to any document and publishing them.

keeping quotations connected each to its original context, viewable side by side; facilitating the legal re-use of content (method and permission for doing it)

permitting profuse and varied links of many types by different authors, overlapping freely

shared workgroup writing-spaces where different contributions are recognizable


allowing alternative views of the same document

allowing easy management of many simultaneous versions

allowing user choice of views and interfaces

and much more.

We believe many people have yearned for such capabilities but haven't figured out how to do them.  It's by no means obvious; we have worked on the problem for a very long time, and propose a simple generalized method.

To be sure, for each of the objectives I have enumerated, there is someone trying to do it on the World Wide Web.  But such efforts must perforce be entangled with web methods, designed for other purposes, and so they tend to be limited, incompatible, and hard to use.

We believe now is the time to start over.


BETTER TOOLS

To satisfy deeper literary needs, we want to provide a deep new system for editing and rich document management.

Instead of the "clipboard," which loses all identity of its contents, we propose pullacross editing, where the user pulls a portion from some source and carries it visibly to its new context.

We further intend that you can continue to see, at any time, where each portion came from--
GENERALIZING THE DOCUMENT
We also generalize to new forms of document.  Think of generalizing the magazine layout, for instance--
The magazine layout isn't about paper.  It's about showing related parts interestingly.  The magazine layout deconstructs into related parts which can  be seen in new spaces.


Generalizing this even further, we hope to provide new kinds of hypertext the world has never seen (flying and floating), doing and showing far more than can be done in today's documents.

We should be able to fly documents in 3D gaming space (using today's incredible gaming graphics)--

With transliterary data structure, however, all this can be linked to any degree, and show origins of each portion as well.

As a system of on-line publishing, this should present a real alternative to the World Wide Web but interconnectable with web documents.  (We can point at theirs and vice versa.)
 

TOWARD A DEEP ELECTRONIC LITERATURE
Tekkies think that electronic documents and the World Wide Web are something completely new and that they own it, exactly the way every generation of teenagers thinks they've invented sex and it's their secret.

But it's not new and they don't own it.  Word processing and the World Wide Web are not intrinsically new.  They are literature.

What is literature?  Literature is (among other things) the study and design of documents, their structure and connections.  Therefore today's electronic documents are literature, electronic literature, and the question is what electronic literature people really need.

Electronic literature should belong to all the world, not just be hoarded by a priesthood, and it should do what people need in order to organize and present human ideas with the least difficulty in the richest possible form.

A document is not necessarily a simulation of paper.  In the most general sense, a document is a package of ideas created by human minds and addressed to human minds, intended for the furtherance of those ideas and those minds.  Human ideas manifest as text, connections, diagrams and more: thus how to store them and present them is a crucial issue for civilization.

The furtherance of the ideas, and the furtherance of the minds that present them and take them in, are the real objectives.  And so what is important in documents is the expression, reception and re-use of ideas.  Connections, annotations, and most especially re-use-- the traceable flow of content among documents and their versions-- must be our central objectives, not the simulation of paper.

Those who created today's computer documents lost sight of these objectives.  The world has accepted forms of electronic document that are based on technical traditions, and which cannot be annotated, easily connected or deeply re-used.  They impose hierarchy on the contents and ensnare page designers in tangles only a few can manage.

"Technology" must no longer be the emphasis, but literature.  "Hypertext"-- a word I coined long ago-- is not technology but potentially the fullest generalization of documents and literature.  Text on paper was the best way to present ideas in the paper era, when there was no other way; but now we see fantastic movies and commercials to imitate, and we have super-power graphics cards that can enact swoops and zooms hitherto scarcely imaginable.  Tomorrow's true hypertext can give us far more powerful ways to show, integrate and embellish ideas-- leaving behind the imitation of paper represented by word processing and the web.  It's time for a new flying cinematic literature to represent and present tomorrow's ideas.

(Indeed, if we stop imitating paper, the long-doubted "paperless office" may still be possible :)

 
The design of electronic documents is not "technology,"
dictated by necessity like plumbing or aerodynamics.
It is design, a system of conscious decisions
about the way things should be--
like architecture, music or game design.
The design of our electronic documents has shaped
today's world.
And so far it has been simpleminded, shallow
and darkly limiting.
TRANSLITERATURE, the Genre;
TransLit, the Software
What follows is not a promise but an agenda and a fairly complete sketch.  It has no delivery dates.  It could take decades or it could take months, depending on who else cares.  It may not happen in my lifetime.  But the important thing is to start.

"Transliterature" is our name for a proposed new universal genre intended to unify electronic documents and media, erasing format boundaries and easing the copyright problem.

It is an extremely simple design, intended to correct many things that are wrong with today's computer world and liberate our use of media.

It should make possible a new crossover medium-- transpathic documents-- allowing you to step from content in one document to the same content in another document (which could be a movie, or radio show, or new media construction).  This should bring new insights, new forms of anthology, and new forms of copyright and media commerce (see Transcopyright.org).

Underneath, Transliterature uses a method of open media packaging (LUSTR), described below-- an extremely simple new infrastructure.  (Reviewed and summarized at end.)
 

OBJECTIVES OF THE TRANSLITERATURE PROJECT
What follow are not small objectives, but the steps are relatively small and the methods simple now.
We want to provide a principled alternative to today's electronic formats and enclosed, canopic-jar document conventions.
We want to show far deeper hypertext than is possible on the web.
We want to unify hypertext with word processing, audio and video, email, instant messaging and other media.
We want to offer a principled new form of rewritable, reworkable content.
We want to make the processes of work simpler and more powerful.


NEW FORMS OF RENDERING AND INTERACTION

Rendering, or actual presentation of the document, is not locked to a particular view, as in today's tradition of paper simulation (WYSIWYG, on which most current formats are based).

Fundamental view: Parallel tracks.  At all times a multitrack view is to be available, showing the virtual media stream(s) and their parallel relations of link and transclusion.  This multitrack view is to be the basic transliterary view, in the same way that a paper-simulation view (WYSIWYG) has been the basic view of today's conventional electronic documents.

Multitrack views are already familiar in many contexts, e.g. audio editing, parallel timelines.

Game-space views.  Using such mechanisms as 3D and Flash, we can show documents in many radical new ways, including flying islands, crawls, escalators, Matrix rain, etc.  (We call this WYSIWYNC, What You See Is What You Never Could-- before.)  All content so presented remains subject to profuse link and transclusion following.

Multiple views.  Authors may recommend more than one view of a document, and users may override the author's recommendations.  (We call this principle WYSIWYL, What You See Is What You Like.)  You can of course simulate paper if you like; paper simulation can be just one among the available views.  It should be possible to show and print transliterary documents in the conventional ways, simply returning them to the selfsame legacy viewers from which they were received.
 

INTERACTION
Links to and from all portions of a document must be followable, as well as transclusion paths to origins (transpathic connection). Transpathic stepping to the original context, or other currently resident document, must be possible.

Since we want to make possible many or myriad overlapping links (impossible on the web), there need to be ways to riffle through them interactively.
 

SIMPLICITY AND CLARITY
Unlike most other document representations, transliterature is simple and transparent, based on the open parallelism of contents, markup and connections.

It may be to a degree obsolescence-proof, in that certain basic structures can be permanently settled and always visible on request.
 

AVAILABLE AND WORKING NOW
The Xanadu Transquoter, explained at "Xanadu Transquoter in Brief," may be downloaded and used to create indirect (referential) documents, where every portion is transpathically connected to its source.  The Transquoter is effectively our invitation into the larger transliterary system.

We are hoping that competent and compatible people, feeling inspired, intrigued or amused by this alternative universe, will show up and join in.


A NOTE TO THE HUMANISTS WHO HAVE TO STOP HERE

What follows has to be technical.  But to those humanist readers who care about these issues we say: this will require new editing and viewing programs and a good deal more.  It could be done in months if we had resources, or it may take years or decades.
• if successful, this should make linking and annotation of any screen documents easy
• if successful, this should make deep quotability easy and widespread (for a preview that already works, see the Xanadu Transquoter, explained further in "The Xanadu® Transquoter™ in Brief").
• if our copyright idea catches on, this could create a growing pool of legally re-usable content, always brought from each respective publisher.
Thank you for your support.

Now, for Those Who Are Technically Knowledgeable and Open-Minded--
INTERNALS OF TRANSLITERATURE

  
It is important to understand that these internals
 should not be seen by most users.  With proper
 software for editing and presentation,
 Transliterature should be as easy to use as any
 conventional document system.

 
These methods may seem complicated and unnecessary 
 if you lose sight of what they accomplish.  The overhead 
 may best be compared to the huge headers of email or
 the trillions of lost packets on the Internet– or even our
 dark DNA– seeming inefficiencies, hidden from our eyes,
 that make a great deal possible.
  
WHERE TRANSLITERATURE CAME FROM
This is an adaptation of the reference version of the Xanadu® hypertext system (xu88, now "Udanax Green"), designed by Roger Gregory, Mark Miller and Stuart Greene.  I have made numerous adaptations to present-day conditions, including the exposure of conventional files, adaptation to current forms of addressing, and deconstruction of ambient formats.  (I have had to drop the more powerful and obscure features, such as permutation matrices of transfinite span addresses, so only one-level transclusion is supported.  Not to mention enfiladics.)

Meanwhile, Roger Gregory says the Green server has been debugged.  Jeff Rush says he has converted it to Python.  We look forward to merging these efforts.

A CLIENT SYSTEM
Transliterature is the system of documents; the program to run it is to be called TransLit.  TransLit is to be a client program, almost entirely carried out in users' machines, but with server boosts welcome--
- especially from the EPrints portion and context services (discussed in "The Xanadu® Transquoter™ in Brief").

- from the Transversioner, a generalization of Ken'ichi Unnai's implementation of the hypertime editing design (ca. 1996).

Keeping the core functions in the client may help somewhat in a number of copyright issues (though we offer a generalized solution for the on-line copyright problem; see Transcopyright.org.)


TRANSLITERARY INTERNALS:
LUSTR (Level of Universal STRucture)

"Any problem in computer science
can be solved by
one more level of indirection."
–Old joke
This brings us to the general design for the transliterary data system.  (LUSTR, Level of Universal Structure.)

It is not hierarchical and not encapsulated.  It consists of addressable content streams and structures to be applied to them.

 
A transliterary document consists of a list of selected content
 and a list of selected clinks. 

The content and clinks may be brought from anywhere.

The content is brought in and the clinks are applied to it.

Content elements are individually addressable and clinks
 are individually addressable.
  


With this you can do anything.


The transliterary internals are extremely simple and minimalist but very different from conventional methods.  Once you understand the data structure, the mechanics become relatively trivial.

The transliterary document is inside-out from a conventional document (such as a textfile or a web page).  Instead of a document being a lumpfile of content and markup, it is maintained as a list and fulfilled in the client, which sends for the contents, the connections and markup.

Essentially, we assign stabilized network addresses to all text (later, other fluid media-- audio and video, instant messaging, persistent streaming, etc.).  For instance, we obtain the indexed positions of a web page through Andrew Pam's algorithm.

Then we deal with the content indirectly, referring to it at all times by these stabilized addresses-- with referential editing*, markup, delivery and packaging.  That's all there is to it.  However, God is in the details.**

* Note: Bill Duvall, an alumnus of both Doug Engelbart's NLS group and Xerox PARC, told me recently that he had actually tried referential editing on the Alto, based on my proposal for referential editing decades ago.  As I understand it, Bill implemented an NLS lookalike on the Alto as an experiment-- the full instruction-set of Doug's system.  However, it was referential like Transliterature, manipulating pointers to cumulative text.  He was surprised that it actually ran faster than the native NLS!  (It should still be available on somebody's Alto disk, he thinks.)

Speed was never the point; the ideas behind referential editing were interconnection and backtrack and structural clarity.  But this is a fine little piece of history.

** I believe this was the original Shaker motto in the nineteenth century, but somehow it has been assigned to another Being in recent usage.

Then we apply markup and structure (clinks) as external pointers to the content.

This approach is counterintuitive, and powerful, and in principle can solve many issues of linking, versioning, copyright, origin.
 

INDIRECT DOCUMENTS, SOURCE DOCUMENTS
Conventional electronic documents are direct, carrying the actual text (and other media contents) inside some packaging file (.txt, .rtf, .pdf, .html, etc.).

A transliterary document is indirect, meaning that we edit it and distribute it as a system of pointers.  One kind of pointer (the content pointer) brings in a span of content, another kind of pointer (the clink) attaches or decorates contents.  They are internally similar, but have very different functions.

A transliterary document uses conventional documents as a source pool, as well as permascrolls-- cumulative files which have not been arranged into presentable documents.  (See "Transliterary Content Game.")

Within the transliterary paradigm, the main act of publication is to distribute an indirect document by sending out a package of these pointers, called an EDL (Edit Decision List).  Any new contents to be included need to be put out on the net first in source documents or permascrolls, so they too can be pointed to by the EDL.  Otherwise the chain of uniform reusability will be broken.
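
As a rough illustration of what such a package of pointers might look like in memory, here is a minimal Python sketch.  The class and field names (SpanPointer, ClinkRef, EDL) are ours for this example only and are not part of any spec; the example.org addresses are placeholders, and zero-based offsets are an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class SpanPointer:
    """Content pointer: brings in a span of content from a stable address."""
    source: str   # URL of a source document or permascroll
    start: int    # offset of the first character (assumed zero-based)
    length: int   # number of characters in the span


@dataclass(frozen=True)
class ClinkRef:
    """Pointer to a clink, which attaches to or decorates content."""
    clinkpage: str  # URL of the page holding the clink
    number: int     # which clink on that page


@dataclass
class EDL:
    """An indirect document: a content list plus a list of clinks."""
    content: List[SpanPointer] = field(default_factory=list)
    clinks: List[ClinkRef] = field(default_factory=list)

    def sources(self) -> List[str]:
        """Distinct sources the client must send for, in order of first use."""
        seen: List[str] = []
        for span in self.content:
            if span.source not in seen:
                seen.append(span.source)
        return seen


# Hypothetical document: two spans from a source file, one from a permascroll.
doc = EDL(content=[
    SpanPointer("http://example.org/source.txt", 0, 10),
    SpanPointer("http://example.org/permascroll.txt", 16, 7),
    SpanPointer("http://example.org/source.txt", 15, 4),
])
print(doc.sources())
```

Nothing in the EDL holds text.  The client learns from sources() what to fetch, and the content is filled in only at assembly time.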


TRANSLIT INTERNALS:
1.  THE CONTENT SIDE OF THE EDL,
without links or transclusion

We will first discuss the content list (part of the EDL) and how it is edited.

Detailed Examples of Document Representation, Editing, The Transclusion Operation

This section is intended to clarify key issues of indirect document structure to those who are interested.

Many find the transliterary approach totally counterintuitive.  People have many vital questions about transliterary structure-- especially indirect documents and editing.  It makes no sense until various questions are answered.  The following examples should clarify most of the basic issues, and clear up the common misunderstandings, about transliterature.

We will first discuss the EDL without clinks.

We will look at small-scale examples, trivial by themselves, intended to illustrate technicalities which will be useful with more significant content portions.

You may want to install the Xanadu® Transquoter to study these examples.

If you want a quicker read, you may be able to follow the examples anyway.
 

SOME DETAILED EXAMPLES USING THE XANADU TRANSQUOTER
Understanding requires examples.  In what follows we will indicate
• The fundamental operations, demonstrating on plain textfiles (using the Xanadu Transquoter)--
•• rearrangement
•• insertion of new content
•• deletion
• The extended design of our structure as an alternative for decorated and connected text (such as HTML)--
•• how 'tags' can be converted to clinks
•• how clinks work with content
We will use the Transquoter, which already handles plain text according to the transliterary principles, as a client program that assembles an indirect document into a web page.  (But because it works in a web browser, it is much more limited than the Transliterature design.)  It has its own special adaptations particular to the web browser; see http://www.translit.org/transquoter/.  But it can be put to good use demonstrating the fundamentals of our approach.

Let's take some examples.

THE BASIC TEXT WE'LL WORK WITH
Let us use the following classic text, widely known to typists and journalists, because it compactly uses every letter of the English alphabet.
THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG
For use in our exercises, we have stored this all-caps textfile at
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT
First let us send for its contents with an EDL.  We put the following spanpointer into a .EDL file and send it to the Transquoter.  (I.e., if using Windows, we put it in a textfile, change the suffix to .EDL and double-click on it, or drop it on the Transquoter code.)
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:0/44
This opens the content text in a browser.

So far not much.

The text happens not to have any markup, but even if it did the result would be the same, since the Transquoter strips markup from web pages.
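
For readers who want to pick such a spanpointer apart by program, here is a small Python sketch.  It assumes, from the examples in this document only, that the locspec has the form charrange:START/LENGTH; the function and type names are hypothetical.

```python
from typing import NamedTuple
from urllib.parse import parse_qs, urlsplit


class Span(NamedTuple):
    source: str   # address of the source document, without the query part
    start: int    # first number after "charrange:"
    length: int   # second number after "charrange:"


def parse_span_url(url: str) -> Span:
    """Split a span URL into its source address and character range."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    locspec = query["locspec"][0]
    kind, _, value = locspec.partition(":")
    if kind != "charrange":
        raise ValueError(f"unsupported locspec: {locspec!r}")
    start, _, length = value.partition("/")
    source = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return Span(source, int(start), int(length))


span = parse_span_url(
    "http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT"
    "?xuversion=1.0&locspec=charrange:0/44"
)
print(span)
```

The sketch ignores the xuversion parameter; a real client would presumably check it.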

Now let's put our fox to work.

Virtual Rearrangement: An Example
  
It is important to keep saying that Transliterature is in
 pre-prototype. Eventually users should see an "ordinary"
 text editor.   Indeed, it could be an ordinary editor with
 an output option to transliterary format.

However, the internal results of all editing are indirect
 (virtual, referential), maintained as content lists.  Thus
 we work with and maintain the document principally as 
 a content list and clinks (to be discussed).

Let's see what this indirect editing looks like internally.  We'll rearrange the above textfile indirectly into
OVER THE LAZY DOG JUMPS THE QUICK BROWN FOX
Let's create an EDL that will rearrange it referentially.

Here are three span URLs, pointing respectively to "OVER THE LAZY DOG ", to "JUMPS", and to "THE QUICK BROWN FOX", with appropriate spaces included.

http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:26/19
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:19/7
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:0/20
If you have the Transquoter installed in Windows, you can make an EDL of these spanpointers.  (Put them in a textfile, change the suffix to .edl.)

Now click on this .edl file, causing Windows to send the EDL to the Transquoter.  See if you don't get--

OVER THE LAZY DOG JUMPS THE QUICK BROWN FOX
(If you don't have the Transquoter installed, you'll just have to trust us :)
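
Those who prefer not to trust us can approximate the assembly step in a few lines of Python.  This is a sketch under stated assumptions, not the Transquoter's code: the source text is held in a local dictionary instead of being fetched, offsets are taken as zero-based, and the text is assumed to have no trailing newline.

```python
from typing import Dict, List, Tuple

FOX = "http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT"

# Stand-in for the network: source address -> text (assumed contents).
SOURCES: Dict[str, str] = {FOX: "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"}

# The three span URLs above, reduced to (source, start, length).
REARRANGED: List[Tuple[str, int, int]] = [
    (FOX, 26, 19),
    (FOX, 19, 7),
    (FOX, 0, 20),
]


def assemble(edl: List[Tuple[str, int, int]], sources: Dict[str, str]) -> str:
    """Concatenate the spans an EDL points to; the sources are never modified."""
    pieces = []
    for source, start, length in edl:
        text = sources[source]
        # Python slicing stops at the end of the text if a span overruns it.
        pieces.append(text[start:start + length])
    return "".join(pieces)


result = assemble(REARRANGED, SOURCES).strip()
print(result)  # OVER THE LAZY DOG JUMPS THE QUICK BROWN FOX
```

Note that the first span asks for 19 characters where only 17 remain in our local copy.  The sketch leans on Python's slicing to clip it; how the Transquoter itself treats such a span is not something this example claims to know.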


ADDING NEW CONTENT

What if you want to add new content?  Let's substitute the word "purple" for "brown" in this example.

Since we don't change the original document addresses, we have to put the new word somewhere else.  We could append it; as it happens, we've put it in a document called InputPermascroll, which we'll assume is a place where you keep putting your text additions.  (Note that this should be done automatically by a proper editor.)

Here's the same story, but now the fox has changed his color.

 
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:26/19
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:19/7
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:0/10
http://transliterature.org/tlitExx/InputPermascroll.txt?xuversion=1.0&locspec=charrange:16/7
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:15/4
We will refer to this as "the purple EDL" for future reference in later examples.
When we put these into a .EDL file, the Transquoter delivers the desired
OVER THE LAZY DOG JUMPS THE QUICK PURPLE FOX
which remains connected to its original sources, as before.  Note that only "purple" shows a mouseover color, since the others are from the first document transquoted; by default the first document put to the Transquoter in the EDL has no mouseover color.  (See http://transliterature.org/transquoter/.)
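
What "a proper editor" would do automatically can be sketched as follows, in Python.  The Permascroll class, the example.org address and the scroll's earlier contents are all hypothetical; the only point is that new text is appended, never inserted, so that every address already handed out stays valid.

```python
class Permascroll:
    """Append-only text store: existing offsets never change."""

    def __init__(self, url: str, text: str = "") -> None:
        self.url = url
        self.text = text

    def append(self, new_text: str):
        """Add text at the end and return a (source, start, length) pointer to it."""
        start = len(self.text)
        self.text += new_text
        return (self.url, start, len(new_text))


FOX = "http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT"
FOX_TEXT = "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG"  # assumed local copy

# Hypothetical scroll that already holds some earlier additions.
scroll = Permascroll("http://example.org/InputPermascroll.txt", "earlier additions ")
purple = scroll.append("PURPLE ")

# Substitute PURPLE for BROWN by pointing around the old word.
edl = [(FOX, 0, 10), purple, (FOX, 16, 3)]
sources = {FOX: FOX_TEXT, scroll.url: scroll.text}
phrase = "".join(sources[src][start:start + length] for src, start, length in edl)
print(phrase)  # THE QUICK PURPLE FOX
```

The original textfile is untouched; "BROWN" is still there, merely no longer pointed to.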


DELETION

Deletion doesn't actually happen, in that no content is changed; we simply change the EDL to ignore that content.  Content left out of an EDL is effectively deleted.  "No implementation is required," except the editing operation to change the EDL.

Example left to the reader: try deleting "purple" using the above EDL.
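
The editing operation itself might look like the following Python sketch, which removes a range of the virtual document by trimming or splitting the spans that cover it.  The function name and the (source, start, length) tuples are our own illustrative choices, and the sketch assumes every span's length is exact (no overruns).

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (source, start, length)


def delete_range(edl: List[Span], del_start: int, del_length: int) -> List[Span]:
    """Return a new content list that skips part of the virtual document.

    Positions are counted in the assembled document, not in any source.
    """
    del_end = del_start + del_length
    result: List[Span] = []
    pos = 0  # where the current span begins in the virtual document
    for source, start, length in edl:
        span_end = pos + length
        # Portion of this span lying before the deleted range.
        keep_before = max(0, min(span_end, del_start) - pos)
        if keep_before:
            result.append((source, start, keep_before))
        # Portion of this span lying after the deleted range.
        keep_after = max(0, span_end - max(pos, del_end))
        if keep_after:
            result.append((source, start + length - keep_after, keep_after))
        pos = span_end
    return result


FOX = "http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT"
whole = [(FOX, 0, 43)]
without_lazy = delete_range(whole, 35, 5)  # drop "LAZY " from the virtual text
print(without_lazy)
```

Here one span becomes two, pointing to either side of the omitted word.  No content is changed anywhere.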


THE UPPER-AND-LOWER CASE PROBLEM

So far so good.  Textfiles like that, made of all-cap text, are comparatively easy to rearrange.

Unfortunately we run into problems with upper and lower case.  If we do the rearrangement above with a sentence that has normal upper and lower case, what happens?

Take this original (the same story, but no longer all caps)

The quick brown fox jumps over the lazy dog.
which we have stored at transliterature.org/tlitExx/QuickBrownFox.html.
Now let's rearrange this document in the same way we did with the raw uppercase textfile.  Because of our web-page analysis algorithm, the counts required in the span URLs are the same as for the same exercise on the textfile.  Here is the EDL:
http://transliterature.org/tlitExx/QuickBrownFox.html?xuversion=1.0&locspec=charrange:26/17
http://transliterature.org/tlitExx/QuickBrownFox.html?xuversion=1.0&locspec=charrange:19/7
http://transliterature.org/tlitExx/QuickBrownFox.html?xuversion=1.0&locspec=charrange:0/19
http://transliterature.org/tlitExx/QuickBrownFox.html?xuversion=1.0&locspec=charrange:43/2
This presently gives us, with the Xanadu Transquoter,
over the lazy dog jumps The quick brown fox.
Ooops!  Upper and lower case aren't changed automatically, are they.

We'll deal with that later on, but not with the Xanadu Transquoter.  In the larger Transliterature project we intend to use mechanisms (clinks) that leave the Transquoter behind.  (See later in this document.)

  
Highly technical point for our old friends--
THE EDITING MECHANISM FOR TRANSCLUSION: 
No Recursion.

There has been considerable confusion as to what happens internally
 when you transclude from one EDL to another.  It has been widely imagined
 that somehow our method keeps track of where the URLs have been. 
 It does not.  Many people have gotten the impression that we must search
 successively through every document that a content list has passed through. 
 This is not the case.

Let's take an example.

Suppose we take the previous rearranged brown fox example
 (generated by the Purple EDL)--

OVER THE LAZY DOG JUMPS THE QUICK PURPLE FOX
Now suppose you want to create a document transcluding the phrase
 from the previous example, THE QUICK PURPLE FOX, so it will retain
 its connections to its origins.  You want your new document to say
I look forward to meeting THE QUICK PURPLE FOX.
Now let's say you append "I look forward to meeting" at
 transliterature.org/tlitExx/InputPermascroll.txt.

Now your EDL simply copies spanpointers from the old EDL. 
 The result takes each content portion from its source, as before.

To transclude content-- in this case THE QUICK PURPLE FOX,
 which already took three span URLs-- you copy the three span URLs
 already used.

Your new EDL simply gets spanpointers from the previous EDL.
 The resulting EDL:

http://transliterature.org/tlitExx/InputPermascroll.txt?xuversion=1.0&locspec=charrange:192/27
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:0/10
http://transliterature.org/tlitExx/InputPermascroll.txt?xuversion=1.0&locspec=charrange:16/7
http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT?xuversion=1.0&locspec=charrange:16/4
Each portion of the content is taken from its original source; we do not
 pass through the intermediate file.

(If we were transcluding only part of a portion specified by one span URL,
 the parameters would of course have to be adjusted.)
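
The copying step, including the adjustment of parameters for a partial span, can be sketched in Python as below.  The function name, the tuple layout and the scroll contents are assumptions made for the example; the spans use our own exact counts and not the figures of the EDLs above.

```python
from typing import List, Tuple

Span = Tuple[str, int, int]  # (source, start, length)


def transclude(edl: List[Span], want_start: int, want_length: int) -> List[Span]:
    """Copy the spanpointers covering part of an assembled document.

    The result points at the original sources only; the old EDL is
    consulted once and is not referenced afterwards (no recursion).
    """
    want_end = want_start + want_length
    out: List[Span] = []
    pos = 0  # where the current span begins in the old virtual document
    for source, start, length in edl:
        lo = max(pos, want_start)
        hi = min(pos + length, want_end)
        if lo < hi:
            # Trim the span if only part of it falls in the wanted range.
            out.append((source, start + (lo - pos), hi - lo))
        pos += length
    return out


FOX = "http://transliterature.org/tlitExx/QUICKBROWNFOX.TXT"
SCROLL = "http://example.org/InputPermascroll.txt"  # hypothetical address
TEXTS = {
    FOX: "THE QUICK BROWN FOX JUMPS OVER THE LAZY DOG",
    SCROLL: "PURPLE I look forward to meeting ",  # assumed scroll contents
}

old_edl = [(FOX, 26, 17), (FOX, 19, 7), (FOX, 0, 10), (SCROLL, 0, 7), (FOX, 16, 3)]

# "THE QUICK PURPLE FOX" occupies virtual positions 24..43 of the old document.
new_edl = [(SCROLL, 7, 26)] + transclude(old_edl, 24, 20)
new_text = "".join(TEXTS[s][a:a + n] for s, a, n in new_edl)
print(new_text)  # I look forward to meeting THE QUICK PURPLE FOX
```

Asking instead for transclude(old_edl, 28, 16), that is, "QUICK PURPLE FOX" without the "THE", returns a first span of (FOX, 4, 6): the same source, with start and length adjusted.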


2.  Second part of the EDL: CLINKS and clink structures, for structure and decoration [yet to be implemented]
 

FIRST, THE ISSUE OF REPRESENTATION

We want to deal with more complex formats-- not just upper and lower case.  If we add all the features and aspects that we want text to have, however-- fonts and footnotes and links and so on-- things get more complex.

The response of the computer community to this problem has been the baroque embedded formats of today's document world, incompatible and generally incomprehensible without months of study.  We think the following plan is simpler and also will allow us to do much more.

The transliterary alternative is to put these formatting structures outside the content, yielding a simple structure.  The basic unit is what we call the Content LINK or CLINK.*
* We call it "clink" rather than "link" (the term we used from the sixties till now) because--
• everybody thinks they know what a "link" is, meaning 1-way weblinks between pages; clinks are very different.

• as Content LINKS, they are attached to the content itself.

• they are outside, rather than embedded in the content, and they do not refer to positions local to the document.  They refer to the content itself-- to its generalized or absolute addresses.

• We have a useful tradition in the Xanadu project: "When you change the idea, change the word."  This is very important to maintain clarity during the evolving design process.  We originally used the word "link" for these structures, but now a different meaning for "link" has become widespread.  We are changing that meaning, and thus the word.


VISUALIZING CLINKS
We can visualize a clink as having a vertical line at each end (representing the from-set and the to-set in the address space), a line between them, and a type.
This visualization of clinks can be used to show how clinks can overlap, partially sharing endsets.

THE ANATOMY OF CLINKS

1. The simple clink: an endset on only one side.
2.  The typical clink, which is two-sided.
3.  The extended clink, which has more than one consecutive span on at least one side.

HOW CLINKS ARE OBTAINED

Clinks may be constructed locally (by hand or by editor) or derived automatically by deconstructing ambient formats.  Any of today's documents may in principle be deconstructed into a text stream and a clinkpage.  For instance, Andrew Pam's algorithm obtains the text stream from a web page.  Reliably deriving clinkpages may be more difficult.  There are various problems: e.g. the variety of  idiosyncratic paragraphing methods on the web, the evasive capture strategies of certain big corporate formats.
HOW CLINKS ARE REPRESENTED (preliminary)

1.  Reference to a clinkpage

The EDL refers to a clinkpage as it refers to any other document, except by prefixing it with "¢" (the U.S. "cents" sign, which suggests the clinking of coins).

Thus

¢[page URL]
refers to a clinkpage, either waiting prebuilt at the specific address or a clinkpage to be extracted from that page.  Examples:
¢xanadu.com   [to be extracted]
¢file: ... snerd.txt   [prebuilt]


2.  Selection of individual clinks on a page

To refer to the entire clinkpage implicitly invokes all its clinks; but they may be enumerated individually, as in
¢[page URL]
#1
#3
#17
which invokes clinks 1, 3 and 17 from that page, and causes the others to be ignored.
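
A reader for such references might look like the following Python sketch.  The format is preliminary, so this is only one possible reading: it assumes one reference per line, a "¢" line naming the page, and optional "#n" lines selecting clinks from the page named just before.  The function name and the example.org addresses are hypothetical.

```python
def parse_clinkpage_refs(edl_text):
    """Collect clinkpage references and any per-page clink selections.

    Returns a list of {"page": address, "clinks": [numbers] or None};
    None means "invoke every clink on the page".
    """
    refs = []
    for raw in edl_text.splitlines():
        line = raw.strip()
        if line.startswith("¢"):
            # A new clinkpage reference; all clinks apply until "#n" lines say otherwise.
            refs.append({"page": line[1:].strip(), "clinks": None})
        elif line.startswith("#") and refs:
            ref = refs[-1]
            if ref["clinks"] is None:
                ref["clinks"] = []
            ref["clinks"].append(int(line[1:]))
    return refs

example = "¢http://example.org/a\n#1\n#3\n#17\n¢http://example.org/b\n"
for ref in parse_clinkpage_refs(example):
    print(ref)
```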


3.  Representation of clinks in the page

General format.  A clink is represented in a textfile as--
"¢"
the clink type name
the word "from" (if from-set is present)
as many spanpointers as needed
the word "to" (if to-set is present)
as many spanpointers as needed.
Abstractly, the general format is:

¢type from
spanpointer
spanpointer
...
to
spanpointer
spanpointer
...
Examples.

- One-sided clink with only a single to-set field, e.g. "boldface"

¢boldface to
spanpointer
- One-sided clink with multiple to-set fields, e.g. "boldface"
¢boldface to
spanpointer
spanpointer
...
- Two-sided clink with only one from-set field and one to-set field, e.g. "comment"
¢comment from
spanpointer
to
spanpointer
- Two-sided clink with an arbitrary number of from-set fields and an arbitrary number of to-set fields, e.g. "comment"
¢comment from
spanpointer
spanpointer
...
to
spanpointer
spanpointer
...
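
The textual format above is simple enough to read with a short parser.  The following Python sketch is only an illustration of the preliminary format: spanpointers are kept as opaque strings, one per line, and the function name parse_clinks is an assumption of the example.

```python
def parse_clinks(text):
    """Parse clinks written as: ¢type [from|to], then spanpointer lines,
    with an optional bare "to" (or "from") line switching sides."""
    clinks = []
    side = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("¢"):
            words = line[1:].split()
            if not words:
                raise ValueError("clink has no type name")
            side = words[1] if len(words) > 1 else None
            if side not in (None, "from", "to"):
                raise ValueError(f"unexpected word after type: {side}")
            clinks.append({"type": words[0], "from": [], "to": []})
        elif line in ("from", "to"):
            side = line
        else:
            if not clinks or side is None:
                raise ValueError(f"spanpointer outside any endset: {line}")
            clinks[-1][side].append(line)
    return clinks
```

A one-sided clink simply comes back with an empty from-set.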


HOW CLINKS WORK IN DETAIL

Example: Converting Font Attributes to Clinks

Let's look at a very simple example with fonts.  Consider the following fonted text, with two consecutive words in boldface:
The quick brown fox jumps over the lazy dog.
We have put this at http://transliterature.org/QuickBrownFox.html.  (We used this same file in an earlier example with the Transquoter, but you didn't see the boldface because the Transquoter doesn't show fonts or font attributes.)

Let's bring that in and look internally at the actual HTML in the file (check it by "view source" on the menu).

<HTML>The quick brown fox <B>jumps over</B> the lazy dog.  </HTML>
We want to provide a method for disembedding this structure.

We are going to convert this to a clink, so that the markup for boldface (embedded as tags <B>, </B>) is no longer embedded.  Instead we have a clink-- a link to the content saying where in the content address span the boldface begins and ends.  We convert this embedded tag to a boldface clink as follows (format not finalized).  Beginning it with the "¢" mark, it becomes

¢boldface to
http://transliterature.org/tlitExx/QuickBrownFox.html?xuversion=1.0&locspec=charrange:20/11
which means, "embolden 11 characters starting at number 20" (the words "jumps over" and one adjoining space).
When we play this clink against the Purple EDL (our previous rearrangement example), and run it with the Transquoter, it should yield the following.  (Note that this is not yet implemented, but this should be the result.)
over the lazy dog jumps The quick brown fox.
What has happened?  The same two words are still in boldface, even though they are no longer consecutive!  This is because the clink is applied to those same content addresses, wherever they happen to be in the current indirect document.  The clink, when evaluated, is found to cover the address span of both words.

This is a key example of how Transliterature's indirect method keeps links from breaking.  But it does much more.
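
The effect can be simulated in a few lines of Python.  This is a sketch under stated assumptions, not the Transquoter's method: the text is held in memory instead of being fetched, "charrange:20/11" is read as a zero-based start and a length (so the range also takes in the space after "over"), and the spans of the rearranged content list are made up for the example, since the actual Purple EDL is not reproduced here.

```python
from itertools import groupby

SOURCE = "http://transliterature.org/tlitExx/QuickBrownFox.html"
# Stand-in for the fetched text stream of the source page.
TEXT = {SOURCE: "The quick brown fox jumps over the lazy dog."}

def render(content_list, bold_spans):
    """Assemble the document, wrapping emboldened runs in <B>...</B>.

    Every span is (source, start, length).  Boldness is decided per
    content address, so it follows the characters wherever they land.
    """
    flagged = []
    for source, start, length in content_list:
        for addr in range(start, start + length):
            is_bold = any(src == source and s <= addr < s + n
                          for src, s, n in bold_spans)
            flagged.append((TEXT[source][addr], is_bold))
    out = []
    for is_bold, run in groupby(flagged, key=lambda pair: pair[1]):
        chunk = "".join(ch for ch, _ in run)
        out.append(f"<B>{chunk}</B>" if is_bold else chunk)
    return "".join(out)

BOLD = [(SOURCE, 20, 11)]                       # the clink's to-set
ORIGINAL = [(SOURCE, 0, 44)]
REARRANGED = [(SOURCE, 26, 17), (SOURCE, 19, 7), (SOURCE, 0, 19), (SOURCE, 43, 1)]

print(render(ORIGINAL, BOLD))
print(render(REARRANGED, BOLD))
```

The same clink, unchanged, emboldens "jumps" and "over" in both arrangements because it is matched against addresses, not against positions in the assembled document.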
 


CLINK LOGIC: The Power of Generalized Addressing
Because clinks refer to the absolute addresses of the content, they are attached to that content wherever it may appear.  This has many ramifications which we can scarcely begin to expound now, but here are a few more points.

There is no relation between the sequence of the content list and the sequence of clinks in the EDL.  A clink applies only to the characters it points to; for it to take effect, some of those characters must also be on the EDL's content list.

There are a number of different cases to consider.  All must be resolved in the client before presentation.  (However, inconsistencies may be left as options for the user to flip between.)

• The clinks may overlap, so that more than one clink covers the same characters.  (Note that this is explicitly forbidden in the WWW formats, for ideological reasons: you may not have "start boldface... start italic... end boldface... end italic".)

• A clink's addresses may intersect with none of the content spans of the document.  No problem.  The clink is simply ignored.

• Clink addresses may have much greater spans than the content spans that are included in the document.  Extra reference by clinks to non-present content is simply ignored.

For example, suppose all the text of source document A is in red.  Now let's say that in document B you include--

•• spanURLs pointing to two sentences from document A
•• a clink (which has been automatically converted from document A) pointing at all of document A, saying that all the contents of document A are in red text.

This red clink should be interpreted by the TransLit client as applying only to the two sentences from A, since theirs are the only addresses which intersect with the red clink.

• If content with the same span URL is repeated in a document, any clink pointing to those span addresses applies identically to all instances.
RESOLUTION PHASE
When the client program has acquired all content and all clinks, the resolution phase occurs.  (It may occur in stages, since a clink may itself bring in content.)

How is this handled?  The operative clinks, and what portions they apply to, are found by taking the intersection between the content addresses and the clink addresses.  Clinks that touch no included content are ignored.  What remain are the operative clinks.
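
As a rough Python sketch of this intersection step (one possible reading, not a prescribed algorithm): spans are modelled as (source, start, length) tuples, each clink as a type plus a list of spans, and the example.org address and function names are hypothetical.

```python
def intersect(a, b):
    """Overlap of two (source, start, length) spans, or None if they do not meet."""
    if a[0] != b[0]:
        return None
    start = max(a[1], b[1])
    end = min(a[1] + a[2], b[1] + b[2])
    return (a[0], start, end - start) if end > start else None

def operative_clinks(content_list, clinks):
    """Keep only the clinks that touch included content, trimmed to that content."""
    result = []
    for clink_type, spans in clinks:
        hits = []
        for content_span in content_list:
            for clink_span in spans:
                hit = intersect(content_span, clink_span)
                if hit is not None:
                    hits.append(hit)
        if hits:                      # clinks touching nothing are ignored
            result.append((clink_type, hits))
    return result

A = "http://example.org/A.txt"        # hypothetical source document
document_b = [(A, 100, 40), (A, 500, 60)]        # two sentences quoted from A
clinks = [("red", [(A, 0, 1000)]),               # covers all of A
          ("boldface", [(A, 200, 10)])]          # touches nothing quoted in B
print(operative_clinks(document_b, clinks))
```

Here the red clink survives, narrowed to the two quoted spans, and the boldface clink drops out.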

How the clinks are resolved after that, of course, is Just A Small Matter Of Programming (JASMOP).  In the case of conflicts, the client does its best guess or leaves it to the user to flip.  We can always fall back on the WYSIWYL principle ("What You See Is What You Like"), i.e. the user's ability to flip through different views to find the most appropriate ones.



CLINK TYPES.
Example 1: THE HANDLING OF PARAGRAPHS

Paragraphs give us a good example of clink logic.  There can be a number of different types of clink for paragraph representation, existing in parallel.

(In discussing the clinkless Transquoter, we explained that simple hacks have been provided, just for the Transquoter, for creating paragraphs or the appearance of paragraphs.  This is a dead end, since the Transquoter works only in a web browser, and we are seeking a much more general solution for far more interconnected documents.)

There may be a number of different clink representations of paragraphs.  Obviously the simplest is a clink pointing to the first character of a paragraph.  However, the software on which this is based (Reference Xanadu, 88.1) had a very different method.  The paragraph link was intended to embrace the entire content of the paragraph.

Both of these methods, and others, can be valid.  Resolution issues will appear as these methods are implemented.  But these practices are perhaps for now best deferred-- perhaps until there is a user community.

Example 2: SOLVING THE REARRANGED-CAPITALIZATION ISSUE
Now we may consider the general solution to the rearranged-capitalization problem, discussed earlier.

As in the earlier example, content will undoubtedly be switched around so that capital letters appear in the middle of sentences and lower-case letters appear at the beginnings.  For the user to actually change these characters would wreck the transclusion principle.  Instead, we have the editor supply a reverse-case clink (to be put in automatically by the editing program) which means: Capitalize/Decapitalize the leading character(s).

Thus the characters are still correctly transcluded but appear recapitalized.  (For scholarly nicety there can even be cleverer cases, comparable to "[sic]".)
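
A reverse-case clink could be applied at presentation time along these lines.  This Python sketch makes several assumptions: the source text is held in memory, spans are (source, start, length) tuples with zero-based starts, and the content list and clink spans are invented for the example.

```python
SOURCE = "http://transliterature.org/tlitExx/QuickBrownFox.html"
# Stand-in for the fetched text stream; it is never modified.
TEXT = {SOURCE: "The quick brown fox jumps over the lazy dog."}

def present(content_list, reverse_case_spans):
    """Assemble the document, flipping case only on the clinked addresses."""
    out = []
    for source, start, length in content_list:
        for addr in range(start, start + length):
            ch = TEXT[source][addr]
            if any(src == source and s <= addr < s + n
                   for src, s, n in reverse_case_spans):
                ch = ch.swapcase()   # changes the display, not the stored content
            out.append(ch)
    return "".join(out)

REARRANGED = [(SOURCE, 26, 17), (SOURCE, 19, 7), (SOURCE, 0, 19), (SOURCE, 43, 1)]
# Supplied by the editor: flip the "o" of "over" and the "T" of "The".
REVERSE_CASE = [(SOURCE, 26, 1), (SOURCE, 0, 1)]

print(present(REARRANGED, REVERSE_CASE))
```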
 

LESS-SIMPLE CLINKS
A clink is a pointer to spans, as is a span URL.  The examples so far have only a type and one spanpointer.  However, a clink may have a number of spans.

Following the Xanadu reference design, a clink is deemed to have a left side, a right side, and a type.

• the type says what the clink means-- implicitly, how to show it or act on it.

• The from-set (left side) is a list of zero or more spans.

• The to-set (right side) is a list of zero or more spans.  Example: boldface, where the spans point to one or more spans to be emboldened.

Example of a comment clink: a comment clink lists something to be commented on, perhaps several things at once--
• The left spans point to the comment itself.  The comment itself may be content drawn from several places, thus there may be several lefthand spans.

• Right spans point to the material being commented on.  Numerous things may be commented on by a single comment; thus they, too, may be drawn from several places.
 

CLINK TYPES IN GENERAL
In the old days our team made many lists of possible clink types.  These include (the terms below are off-the-cuff)--
• decorative clinks, such as bold and italic, font names and sizes
• parts-of-text clinks, such as paragraphs
• section-of-document clinks, e.g. chapter and verse
• literary-piece clinks, such as footnote, caption, marginal gloss, "box related to this point"
• literary-meaning clinks, such as summary, comment, disagreement, endorsement, corroboration, example, related point, ironic point, interesting side point, tangent, common confusion, fine point
• correspondence clink, used between counterpart portions in different documents (and may serve for the demotion of transcluded sections which are replaced).  This will also attach sound tracks to videos.
And, of course, to be converted from the HTML family,
• "weblink" clinks, pointing in only one direction.  (But we make them followable either way :)
Some clinks are followable (like HTML links) and some are not (like fonts, paragraphs.)


3.  Transliterary Connections:  Transclusions (and how related to clinks)
 
Transclusion means "the same content knowably in more than one place."  If you can go immediately from one instantiation of content directly to another, the connection is transpathic.

We believe this is a fundamental relation of great literary importance, and that it needs to be part of electronic documents of the future.

A key motto of our work is that a transclusion is not a link (or a clink).  Attempting to represent identities of content by the same methods as links carries a number of problems.  (We won't get into these here-- except to point out that Vannevar Bush's "trails" were not links but transclusions.)

Visualizing Transclusion.  We can visualize transclusion dynamically in a way similar to clinks.  We diagram both clinks and transclusions as two vertical bars, one against the document's address space and one going elsewhere, connected by a line.  To visualize a clink (shown earlier), we connect the two endsets by a single line.  For transclusion, however, we connect the two endsets with a double line, indicating the equality at both ends.

Transclusion in transliterature.  In transliterature we automatically maintain one-level transclusion.  The content portions always maintain connection to their origins, because each portion is cached with its original address-- which is ipso facto a pointer to the original.  Thus we maintain transpathic connections.

However, that gives us only a single transclusive relation, from a quotation to the original.

Additional transclusions can be recognized if other documents that share content origin addresses are in current memory.  These are discovered transclusions.
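
Discovery of this kind amounts to comparing origin addresses.  A minimal Python sketch, assuming each document in memory is a list of (source, start, length) spans and using hypothetical example.org addresses:

```python
def overlap(a, b):
    """Shared part of two (source, start, length) spans, or None."""
    if a[0] != b[0]:
        return None
    start = max(a[1], b[1])
    end = min(a[1] + a[2], b[1] + b[2])
    return (a[0], start, end - start) if end > start else None

def discovered_transclusions(doc_x, doc_y):
    """List the origin spans that two documents both include."""
    found = []
    for a in doc_x:
        for b in doc_y:
            shared = overlap(a, b)
            if shared is not None:
                found.append(shared)
    return found

A = "http://example.org/A.txt"                      # hypothetical common origin
doc_b = [(A, 100, 40), ("http://example.org/Q.txt", 0, 10)]
doc_c = [(A, 120, 50)]
print(discovered_transclusions(doc_b, doc_c))
```

Neither document needs to know about the other; the shared origin address is enough to reveal the connection.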
 


MICROVERSION MANAGEMENT (the Transversioner)
A very nice hypertime editing system was programmed by Ken'ichi Unnai from my design as part of the OSMIC project at Keio University.  We will adapt this code as an editing and versioning mechanism for transliterature.  (It appears to work very well; it may or may not need to be extended to network addresses.)

Use of the transversioner will have to be server-side, maintained by anyone who wants to provide the facility (to serve their own versioned documents or, as a service, for other users' documents).
 

GETTING AN EDL FROM THE TRANSVERSIONER
The tentative invocation format to bring an EDL from the transversioner is
http:.../filename.EDL?hypertimestring
This will return an EDL specific to that time and version fork.  This invocation is consistent with our other EDL invocations, so it can be used in transliterary invocations wherever an EDL is wanted.
EDITING WITH THE TRANSVERSIONER
You will need a specific TransLit editing program to register each change with the transversioner.

The editing program (Unnai's version was an Emacs extension) saves individual inputs and operations.  We would like to convert this, as it is hard to get people to use Emacs.


FOLLOWING CLINKS TO OTHER THAN ORIGINAL CONTEXT
Following a clink (if it's followable) takes you by default to the original context of the endset specified.  If you want to see a different context of the same endset, you need to supply, in addition, an EDL for the desired context document.  The invocation will be something like
endset [some delimiter] EDL-of-desired-context
This will be a client-side function, working in the TransLit program itself.

Brief Review of Transliterary Structures

THE DATA STRUCTURE (LUSTR)

With two simple elements-- content list and clinks-- we represent everything.

A transliterary document consists of portions of media content (always connected to their origins) and relations to be applied to that content (Content LINKs or Clinks), which are used for interconnection, decoration, etc.

All conventional documents may in principle be deconstructed to this format.

Clinks are not embedded inside content, but imposed on (or applied to) the content from outside.

Clinks don't point into the document, but to the same address space as the portion spans themselves.  They are applied by finding the operative clink spans, obtained as the intersection of the addresses of content spans and clinks.  (For examples see below.)
 

CACHING FORMAT
The content is cached as snipped portions with their origin addresses.  (Thus it is always connected to its origins, since any portion knows where it came from.)
EDITING AND DISTRIBUTION FORMAT
However, the document is maintained and delivered referentially, as an EDL: a list of contents and a list of clinks.  The user's client program sends for the content spans and the clinks, then applies the clinks to the content and presents the result.  (Users may further select what clinks to apply, thus varying the presentation.)