Continued from Discord

A conversation with myself @z6Mkt...poxF6 and @z6Mkt...uiDGr about AV support and wasm.
Hi Skreutzer, I'm interested in your reply, and in these ideas. I thought taking it here might give the conversation space to breathe a bit. I appear to have written a letter, and I'm glad I did so, because I think I've clarified my ideas a bit.
I've started with a recap of the conversation thus far (for posterity).
The letter starts a little way down the page.

Recap

I started the conversation with the prompt:
Have you had discussions / thoughts on how to properly handle video and audio, maybe even code, as hypermedia?
Skreutzer replied:
For images, audio, video, I point to the Hyperland documentary, in hopes people get that it's not about multimedia (just more than one medium) and not intermedia (inner meaning) and not transmedia (continuing across different mediums, but not with an alternative in another medium), but it's the hyper-properties, mostly controls. For audio, I tried an audio messaging thing with responses in a tree structure (hoping I could develop it into audio curation, but didn't continue and kind of failed). Main problem being, audio is "temporal" (on a linear time scale - while text snippets can branch more easily [more so if ideally written as standalone or contextualized], audio may not fit with voice/tone, which is less of an issue with text - I mean, synthesize all audio? branch music or what?). Jon Udell made some (controls) when employed at Hypothes.is (example https://blog.jonudell.net/2018/03/10/open-web-annotation-of-audio-and-video) but I don't like Hypothes.is and I assume he's probably on to other things.
For code, umm...that's a big topic in itself - from better editors/IDEs to Alan Kay's "the computer revolution hasn't happened yet", so I spend some of my time trying to look into "undoing" the many different programming languages (to not be stuck in a single one, to undo the incompatibilities that come from the many languages), OK there might be some higher points about code/programming and systems design, architecture, implementation, I don't think there's much reason to go into it. I'm struggling a lot with it, don't have a great solution/scheme yet (or soon or ever), it's even more insane than hypertext/intelligence-augmentation alone. Needless to say that Ted Nelson is an actual programmer (even wrote books teaching it), and Engelbart's is pretty much about that too (NLS) (them writing some of NLS in/with NLS).
My reply:
Yes, AV's internal temporal structure is interesting. To me it seems that, just as documents read top to bottom, transcluded video should also be spread top to bottom, and it should be introspectable in the client, so you can make comments and so on at particular times. At its simplest, something like SoundCloud comments.
There are lots of kinds of data with interesting internal structure that should be able to be part of a text and interact in a hypermedia. Another example might be a map.
For code, I don't mean IDEs. I was thinking about whether it's possible to include WebAssembly in a page.
I believe that with wasm you can grant it a restricted set of privileges so that you know its output is deterministic. I don't know, but maybe, with the right design, there is a way to allow pages to include hypermedia that is not defined in the client, and is maybe novel, but nonetheless exposes proper referentiality and controls to the reader. For example, a visualization program embedded in a document, or code that builds a video out of a collection of sources.
Skreutzer
No, not this - doing things in this direction has been a major mistake in my opinion. Means, code comes from server-side. Even if it's p2p, you need to sandbox (which needs to prevent compatibility/interoperability, because execution of untrusted/untrustable remote code is a, if not the, primary security problem, including execution of injected code, etc.). Sure, the WWW people want to make their browser the new computer/OS (might be an instance of the "system in a system" fallacy), and fine, they can do that, but that's their WWW thing they're doing, and I'm not a fan of today's/future WWW nor the WWW "browser" (to the contrary). I can see why you/others would want it, it can make stuff become more popular and spread faster, have more "WWW apps", but I'm against the app model as well. It'll also be hard to find a basis of how/why such embedded scripting and remote code execution should be considered "hyper", both historically and conceptually.
The hyper stuff is about documents (therefore hypermedia AV images whatever any medium) to be less about conventional top to bottom. Most of the hyper stuff is precisely to get away from the linear flat and/or single sequence/direction.

Response

Firstly, I would resist the characterisation of it as "WWW apps". I hope you can see why by the end of this letter.
I think "code as hypermedia" was a terrible way of phrasing what I was driving at, and my point definitely got lost.
"Its extremely hard to convey any non-trivial idea, even to smart people" -- Ted Nelson ( from memory )
Id like to lay my points out properly first, and then reply to parts of your comment in the light there produced.
My point brakes down into 3 parts of decreasing importance.
    AV is a vital component of hypermedia, and needs proper support
    Generalisation of AV with a Uniform Interface
    Modularity with wasm

Main Point

AV has an intrinsic extension. Both audio and video have temporality; video also has width and height. They may also have certain channels. Their structure must be respected when assimilating them as hypermedia.
Consider the way SoundCloud handles comments, song chosen at random.
Songs are inherently linear. SoundCloud respects the linearity of the songs. It exposes this structure to the reader, allowing them to comment at specific times, and say things like:
"wow that drop this goes hard "
By respecting the internal structure of the song, SoundCloud is able to let users interact with the media at a deeper, semantic level.
Now imagine if this were in a hypermedia system. Imagine if these comments were in fact points to which full documents could be attached, or indeed other audio clips, or slices thereof. Imagine if you could reference and transclude sections, and compare them side by side. Imagine the depth of commentary that this would allow.
At the very least, being able to take proper notes on a piece of audio or video is extremely important, I think.

Generalising

As shown by AV, some artefacts have internal structure that, if respected, can enrich the hypermedia. I would characterise the current approach as leaving AV as undigested foreign bodies in a hypertext.
I contend that there are many such kinds of artefact, and that their inner structure can be generalised sufficiently to write a uniform interface.
I would characterise this internal structure in terms of extension and dimensionality.
Audio extends over time
Video extends over a surface and over time
Artefacts with extension define a surface parameterised by their dimensions, i.e. time and space in the case of AV.
This surface is something that hypermedia objects can attach to and reference, i.e. timestamps and annotations, but deeper.
It would be possible to incorporate proper AV support manually, but there are many formats, and other artefacts with their own internal structure and their own dimensions of extension. Image carousels, for example, or CSVs, or animations.
If you think in terms of extension, you can think beyond AV. Consider a map. Cartographic data is very important in many areas. Some information is inherently spatial; why should it be left out of hypermedia? By exposing a uniform interface for xy, you can attach all the hypermedia apparatus to such a map, allowing deep support.
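To make this concrete, here is a minimal sketch of what such a uniform interface might look like, in TypeScript. All names and shapes are my own invention for illustration, not an existing spec: an artefact declares the dimensions it extends over, and hypermedia objects attach to points on that surface.

```ts
// Rough sketch only: names and shapes are hypothetical, not an existing spec.

// A dimension along which an artefact extends.
interface Dimension {
  name: string;   // e.g. "time", "x", "y"
  unit: string;   // e.g. "seconds", "degrees"
  min: number;
  max: number;
}

// The extension of an artefact: the surface hypermedia objects attach to.
interface Extension {
  dimensions: Dimension[];
}

// A point on that surface, addressable by comments, links, and transclusions.
type Anchor = Record<string, number>; // keys are dimension names

// Examples:
const audioTrack: Extension = {
  dimensions: [{ name: "time", unit: "seconds", min: 0, max: 214 }],
};

const worldMap: Extension = {
  dimensions: [
    { name: "x", unit: "degrees longitude", min: -180, max: 180 },
    { name: "y", unit: "degrees latitude", min: -90, max: 90 },
  ],
};

const dropComment: Anchor = { time: 83.5 }; // a comment anchored at a moment
```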

Modularity with wasm

You could implement AV and some other artefacts on a case-by-case basis, handling them specially in the client, perhaps with a specialised video format, say, but I think modularity might be a better approach.
Define a "code" block that includes (sketched after the list):
    A signature that defines the shape of the extension for the Uniform Interface
    The IPFS address of the wasm module(s)
    The IPFS addresses of the data that the code will operate on, to be mounted in a read-only file system for the wasm module
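As a sketch of what such a block descriptor might contain (the field names are mine, not a spec), continuing in TypeScript:

```ts
// Hypothetical descriptor for a "code" block; all field names are invented.
interface CodeBlock {
  // Shape of the extension, declared up front so the client can build the
  // commenting/linking surface without executing anything.
  signature: Extension; // Extension as in the earlier sketch
  module: string;       // IPFS CID of the wasm module
  inputs: string[];     // IPFS CIDs of the data it may read, mounted read-only
}
```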
The client then reads the signature and knows how to construct an interface without running the wasm. It can construct the space onto which comments can be made, documents can be attached, transclusions can be drawn from, and references can be made.
It can also define, via some standard, a protocol for querying the wasm at a particular point (say, a time code) and reading out the corresponding data.
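Such a protocol might look something like the following sketch; again, the names and the exact shape of the call are assumptions of mine, not a defined standard.

```ts
// Hypothetical query protocol between the client and a sandboxed module.
interface HypermediaModule {
  // Given a point on the declared surface (e.g. { time: 83.5 } or
  // { x: 12.5, y: 48.1 }), return the corresponding data: a video frame,
  // an audio window, a map tile, and so on.
  sample(anchor: Anchor): { mimeType: string; data: Uint8Array };
}

// The client resolves the anchor behind a comment or transclusion and asks
// the module for the matching slice; rendering stays under the client's
// control, so the block never dictates its own presentation or theme.
function resolveTransclusion(mod: HypermediaModule, anchor: Anchor) {
  return mod.sample(anchor);
}
```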
Wasm is designed to be sandboxed; capabilities are provided at a very granular level, as explained in this 2024 GOTO talk. I suggest the following (a sketch of the corresponding policy follows the list):
    No network access: no phoning home; the block has to specify its resources via content-addressed links before execution
    Read-only storage: the resources it asks for are mounted, but there is no persistence
    No clock: time relative to the player controls is provided explicitly
    No random numbers: combined with the other restrictions, this makes the code deterministic, and since the inputs are content-addressed, a code block with the same hash will always produce the same result
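For illustration only, and not tied to any particular wasm runtime's API, the capability policy might be summarised like this:

```ts
// Hypothetical capability manifest for the sandbox; not a real runtime's API.
interface SandboxPolicy {
  network: false;                  // no sockets, no fetch, no phoning home
  filesystem: {
    mode: "read-only";
    mounts: string[];              // only the content-addressed inputs
  };
  clock: "none";                   // player time is passed in explicitly
  entropy: "none";                 // no random numbers
}

const policy: SandboxPolicy = {
  network: false,
  filesystem: { mode: "read-only", mounts: ["/inputs"] },
  clock: "none",
  entropy: "none",
};
// With inputs content-addressed and no network, clock, or entropy, the same
// block hash should always yield the same output.
```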
Modularity would mean that the core team focuses on defining the uniform interface and making the client present it well. Specific implementations for different video formats, for example, are left to authors. The core team can maintain creative control while allowing experimentation and deeper adversarial interop on media formats.
Needless to say, a lot of thought should be given to security before introducing a feature like this.

Addressing Concerns

I'm against the app model as well.
Me too. I think what I've laid out is far from that. Maybe you can correct me.
A System in a System?
Can you build a system with no network and no persistence?
I would like to see it if it's possible.
The core team retains control over the expressiveness of the uniform interface, so an SPA-style page isn't possible. The block doesn't even control its own theme. This isn't about "special effects", as Ted Nelson put it.
It'll also be hard to find a basis of how/why such embedded scripting and remote code execution should be considered "hyper".
Via a uniform interface, with dimensionality conveyed prior to execution. This allows commenting, linking, and transclusion into artefacts that would otherwise be opaque or require special client code.
Plus: no remote code execution allowed, no network calls allowed, no persistence allowed.
Most of the hyper stuff is precisely to get away from the linear flat and/or single sequence/direction.
I agree. Let me make the general point: to include reference, transclusion, indirection, and non-linearity in existing media requires one to respect its pre-existing internal structure.
The hyper stuff is about documents [...] to be less about conventional top to bottom.
And let me make the point: for a hypermedia to maximise its potential, it should be able to pull in everything it references, because it is about representing ideas and information, and those things don't stick to their boundaries. Hypermedia should be able to fully digest and assimilate media that was produced prior to its development.
I can see why you/others would want it, it can make stuff become more popular and spread faster, have more "WWW apps",
Hopefully you can see the difference between what I'm suggesting and WWW apps, and hopefully you can see that that's not where I'm coming from.
I'm interested in this because I think it's elegant, and I want to use this kind of deep AV hypermedia support. I'm currently chafing against the shallow way it's currently handled.
I have assumed that the way videos are handled now is a temporary measure and that deeper support is already planned.
As someone who is dyslexic and gets a lot of information from AV sources, this is a very important feature for me.
As someone with a maths and physics education, animations and simulations are also important to me, and they too are just another kind of media.
That's where I'm coming from.
However, with regard to popularity, I can very clearly envision wide adoption of the technology if AV et al. is handled properly. Not because people want the cool shiny web thing, but because it would provide a deeper experience than any web platform can.
Go look at any interesting YouTube video: it is hypermedia trying to get out, filled with references, clips from other videos, documents read through, etc. And look at the impoverished state of the commentary, down there cowering in the gutter under the video.
Commentary should be like that found in the Torah, not like that scrawled on the inside of a public toilet.