Rust Audio

RDF handling for rust-lv2

This is a follow-up to the discussion in rust-lv2’s issue #23, since the questions raised there are rather broad in scope.

First, some introduction: LV2 uses RDF graphs written in Turtle to describe all relevant information, for example what plugins are called and how many ports they have. One example of such a description is the description of the fifths example.

The URIs that are required to write these descriptions are provided as Turtle files too, for example in the core specification. For plugins, these graphs aren’t very important, but hosts need to know the defined graph in order to identify, load and use plugins. Therefore, we need a way to evaluate the predefined graphs of the specifications as well as the graphs describing plugin resources.

@Yruama_Lairba did some research and found out that the RDF ecosystem for Rust is pretty thin; you can read the issue comments for more info. This pretty much means that we have to develop this ecosystem on our own. There’s no way around it, since it is absolutely necessary for a proper host framework.

From my point of view, we need three components:

  1. An RDF triple store where you can dynamically add, modify and remove relationships. I don’t think that it needs to be multi-threaded or perfectly ACID; it only needs to load some Turtle files and query the graph.
  2. Obviously, a parser that can load Turtle files into the aforementioned triple store. Some of this work has already been done, for example by the rio_turtle crate, which can be adapted for our needs.
  3. A serialization format that can be used as constants in Rust code. If the resource descriptions of the specifications should be shipped by the framework, they need to be integrated into the code. The problem here is that most serialization formats for RDF structures are rather lengthy and optimized to be read by humans. Simply copying the Turtle files into Rust code would take about half a megabyte of space and you would still need to properly parse the data, which may also fail. For maximum comfort, space usage and performance, it might be wise to develop a custom format that can be read efficiently by a Rust program without needing to parse it.
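To make the first requirement more concrete, here is a minimal sketch of what such a store could look like. The names `TripleStore`, `insert`, `remove` and `query` are hypothetical and not part of rust-lv2 or any existing crate, and terms are kept as plain strings for brevity:

```rust
use std::collections::BTreeSet;

/// A term is a plain string here; a real store would distinguish IRIs,
/// blank nodes and literals. (Assumption made for this sketch.)
type Term = String;

/// Hypothetical single-threaded, in-memory triple store.
#[derive(Default)]
pub struct TripleStore {
    triples: BTreeSet<(Term, Term, Term)>,
}

impl TripleStore {
    /// Adds a triple; returns `false` if it was already present.
    pub fn insert(&mut self, s: &str, p: &str, o: &str) -> bool {
        self.triples.insert((s.to_owned(), p.to_owned(), o.to_owned()))
    }

    /// Removes a triple; returns `true` if it was present.
    pub fn remove(&mut self, s: &str, p: &str, o: &str) -> bool {
        self.triples.remove(&(s.to_owned(), p.to_owned(), o.to_owned()))
    }

    /// Yields all triples matching the pattern; `None` acts as a wildcard.
    pub fn query<'a>(
        &'a self,
        s: Option<&'a str>,
        p: Option<&'a str>,
        o: Option<&'a str>,
    ) -> impl Iterator<Item = &'a (Term, Term, Term)> + 'a {
        self.triples.iter().filter(move |t| {
            s.map_or(true, |s| s == t.0.as_str())
                && p.map_or(true, |p| p == t.1.as_str())
                && o.map_or(true, |o| o == t.2.as_str())
        })
    }
}

fn main() {
    let mut store = TripleStore::default();
    store.insert("urn:example:amp", "rdf:type", "lv2:Plugin");
    store.insert("urn:example:amp", "lv2:port", "_:in");
    store.insert("urn:example:amp", "lv2:port", "_:out");
    store.remove("urn:example:amp", "lv2:port", "_:out");

    // Print everything that is known about the plugin.
    for (s, p, o) in store.query(Some("urn:example:amp"), None, None) {
        println!("{} {} {}", s, p, o);
    }
}
```

A linear scan per query is enough for a few specification files; indexing by subject or predicate could be added later if it turns out to matter.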

What do you think of these requirements? Did I miss something?

I don’t understand what you want to do in point 3. Where does the need for a serialization format come from? Are you talking about generating Rust source from RDF?

Yep, that’s what I’m talking about. If you copy the Turtle file into the source as a string constant, you still need to parse it, which requires the program to allocate a triple store to parse the RDF structure into and to iterate over every single character in the file. Instead, we could create a crate for a build.rs script or a procedural macro that parses that file and creates a structure that’s more compact and easier to handle.
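As a rough sketch of what such a build.rs step could do once the Turtle file has been parsed: it interns every term into a table and emits the triples as indices into that table. The functions `compile` and `render` are hypothetical names, and the input is assumed to be already-parsed triples (the parsing itself is left out):

```rust
use std::collections::HashMap;

/// Interns every term of the parsed triples and returns the literal table
/// together with the triples expressed as indices into that table.
fn compile<'a>(triples: &[(&'a str, &'a str, &'a str)]) -> (Vec<&'a str>, Vec<(u32, u32, u32)>) {
    let mut literals: Vec<&'a str> = Vec::new();
    let mut index: HashMap<&'a str, u32> = HashMap::new();

    // Returns the index of `term`, adding it to the table on first sight.
    let mut intern = |term: &'a str| -> u32 {
        *index.entry(term).or_insert_with(|| {
            literals.push(term);
            (literals.len() - 1) as u32
        })
    };

    let compiled: Vec<(u32, u32, u32)> = triples
        .iter()
        .map(|&(s, p, o)| (intern(s), intern(p), intern(o)))
        .collect();
    (literals, compiled)
}

/// Renders the tables as Rust source that a build script could write to a file.
/// The `ConstantStore` type in the output is assumed to exist in the host crate.
fn render(name: &str, literals: &[&str], triples: &[(u32, u32, u32)]) -> String {
    let mut out = format!(
        "const {}: ConstantStore<'static> = ConstantStore {{\n    literals: &[\n",
        name
    );
    for literal in literals {
        out.push_str(&format!("        {:?},\n", literal));
    }
    out.push_str("    ],\n    triples: &[\n");
    for triple in triples {
        out.push_str(&format!("        {:?},\n", triple));
    }
    out.push_str("    ],\n};\n");
    out
}

fn main() {
    // Stand-in for the output of a Turtle parser.
    let parsed = [
        ("amp", "type", "Plugin"),
        ("amp", "port", "p1"),
        ("p1", "type", "AudioPort"),
    ];
    let (literals, triples) = compile(&parsed);
    print!("{}", render("AMP_STORE", &literals, &triples));
}
```

Since all parsing happens at build time, a malformed Turtle file would fail the build rather than the host at runtime.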

I’ve sketched up something along the lines of what I mean. It contains the TTL file that’s transcribed, the “generated” store, and a small test that calculates the constant memory needs of each solution:

Code example
/// This is the Turtle file we want to store.
const AMP_TTL: &'static str = "
@prefix doap:  <http://usefulinc.com/ns/doap#> .
@prefix lv2:   <http://lv2plug.in/ns/lv2core#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix units: <http://lv2plug.in/ns/extensions/units#> .

<https://github.com/Janonard/rust-lv2-book#amp>
        a lv2:Plugin ;
        lv2:port [
                a lv2:AudioPort ,
                        lv2:InputPort ;
                lv2:index 1 ;
                lv2:symbol \"in\" ;
                lv2:name \"In\"
        ] , [
                a lv2:AudioPort ,
                        lv2:OutputPort ;
                lv2:index 2 ;
                lv2:symbol \"out\" ;
                lv2:name \"Out\"
        ] .
";

/// simplification for the example:
type URID = u32;

/// The store of the literals, could implement `Map` and `Unmap`.
type Literals = [&'static str];
/// The triple-store, relative to a literal store.
type Triples = [(URID, URID, URID)];

/// Everything combined.
pub struct ConstantStore<'a> {
    pub literals: &'a Literals,
    pub triples: &'a Triples,
}

/// A compiled representation of the RDF structure.
/// This would have been generated by a macro.
const AMP_STORE: ConstantStore<'static> = ConstantStore {
    literals: &[
        "https://github.com/Janonard/rust-lv2-book#amp",    // 0
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",  // 1
        "http://lv2plug.in/ns/lv2core#Plugin",              // 2
        "http://lv2plug.in/ns/lv2core#port",                // 3
        "_:genid1",                                         // 4
        "_:genid2",                                         // 5
        "http://lv2plug.in/ns/lv2core#AudioPort",           // 6
        "http://lv2plug.in/ns/lv2core#InputPort",           // 7
        "http://lv2plug.in/ns/lv2core#index",               // 8
        "1",                                                // 9
        "http://lv2plug.in/ns/lv2core#symbol",              // 10
        "in",                                               // 11
        "http://lv2plug.in/ns/lv2core#name",                // 12
        "In",                                               // 13
        "http://lv2plug.in/ns/lv2core#OutputPort",          // 14
        "2",                                                // 15
        "out",                                              // 16
        "Out",                                              // 17
    ],
    triples: &[
        (0, 1, 2),
        (0, 3, 4),
        (0, 3, 5),
        (4, 1, 6),
        (4, 1, 7),
        (4, 8, 9),
        (4, 10, 11),
        (4, 12, 13),
        (5, 1, 6),
        (5, 1, 14),
        (5, 8, 15),
        (5, 10, 16),
        (5, 12, 17),
    ]
};

fn main() {
    use std::mem::size_of_val;
    
    // Calculating the general size of the constant store: the triples,
    // the list of references to the literals, and the literals themselves.
    let mut store_size = size_of_val(AMP_STORE.triples) + size_of_val(AMP_STORE.literals);
    for literal in AMP_STORE.literals.iter() {
        // `literal` is a `&&str`; dereference it so that the string bytes are
        // measured rather than the 16-byte `&str` fat pointer a second time.
        store_size += size_of_val(*literal);
    }

    // Calculating the size of the Turtle file is easy...
    let ttl_size = size_of_val(AMP_TTL);

    // On an x86_64, this prints "ttl_size = 733". `store_size` is made up of
    // 156 bytes of triples, 288 bytes of `&str` references and the string bytes.
    dbg!(store_size, ttl_size);
}

In this example, I’ve used URIDs to reference the Literals, but you could also use references to the literals, which would make the thing even more robust.
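The reference-based variant mentioned above could look roughly like this. It is only a sketch: `Term`, `Triple`, `AMP_TRIPLES` and `objects` are hypothetical names, and only a few of the amp triples are reproduced:

```rust
use std::mem::size_of;

/// A term of the constant graph, referenced directly instead of by index.
type Term = &'static str;
type Triple = (Term, Term, Term);

const AMP: Term = "https://github.com/Janonard/rust-lv2-book#amp";
const RDF_TYPE: Term = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const LV2_PLUGIN: Term = "http://lv2plug.in/ns/lv2core#Plugin";
const LV2_PORT_PROP: Term = "http://lv2plug.in/ns/lv2core#port";

/// A subset of the amp graph; no separate literal table is needed.
const AMP_TRIPLES: &[Triple] = &[
    (AMP, RDF_TYPE, LV2_PLUGIN),
    (AMP, LV2_PORT_PROP, "_:genid1"),
    (AMP, LV2_PORT_PROP, "_:genid2"),
];

/// Collects the objects of all triples with the given subject and predicate.
fn objects(triples: &'static [Triple], subject: &str, predicate: &str) -> Vec<Term> {
    triples
        .iter()
        .filter(|t| t.0 == subject && t.1 == predicate)
        .map(|t| t.2)
        .collect()
}

fn main() {
    println!("ports: {:?}", objects(AMP_TRIPLES, AMP, LV2_PORT_PROP));
    // Compare the per-triple cost of references and of `u32` indices.
    println!(
        "reference triple: {} bytes, index triple: {} bytes",
        size_of::<Triple>(),
        size_of::<(u32, u32, u32)>()
    );
}
```

The trade-off is size: there is no index that could point outside the table, but on a 64-bit target each `&str` is a 16-byte fat pointer, so a triple of references takes 48 bytes instead of the 12 bytes needed for three `u32` indices.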

Is it clear what I mean now?

OK, but I don’t see what such a constant store would be useful for. As far as I know:

  • Plugins don’t need this kind of store; they only need to know some URIs to get host features and to transmit their extensions.
  • Hosts may use a “triple store” to store information about plugins, but this can only be done at runtime.

This is correct, but the host still needs to know what to look for in the plugin definition. It therefore needs to load both the plugin’s .ttl files and the specification’s .ttl files into its triple store. To keep the host crate usable, we need to ship the specification’s RDF structure with it. Copying the specification’s .ttl files into the source code is inefficient in both time and space, and parsing them at runtime may fail in many situations. We should therefore introduce a way to store an RDF structure in Rust code that is fast, space-efficient and practically infallible to load.
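To illustrate why the host needs the specification graph next to the plugin graph, here is a small sketch. It assumes a one-triple stand-in for the shipped specification store (`SPEC`) and hypothetical helpers `load_constant` and `is_instance_of`; the runtime graph is just a `HashSet` of string triples:

```rust
use std::collections::HashSet;

/// Same layout as the constant store sketched earlier in the thread.
pub struct ConstantStore<'a> {
    pub literals: &'a [&'static str],
    pub triples: &'a [(u32, u32, u32)],
}

const RDF_TYPE: &str = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const SUBCLASS_OF: &str = "http://www.w3.org/2000/01/rdf-schema#subClassOf";
const LV2_PORT: &str = "http://lv2plug.in/ns/lv2core#Port";
const LV2_AUDIO_PORT: &str = "http://lv2plug.in/ns/lv2core#AudioPort";

/// Stand-in for the generated specification store (a single triple only).
const SPEC: ConstantStore<'static> = ConstantStore {
    literals: &[LV2_AUDIO_PORT, SUBCLASS_OF, LV2_PORT],
    triples: &[(0, 1, 2)],
};

type Graph = HashSet<(String, String, String)>;

/// Copies a constant store into the runtime graph without any parsing.
/// Assumes the generator emitted valid indices; otherwise indexing panics.
fn load_constant(graph: &mut Graph, store: &ConstantStore<'_>) {
    for &(s, p, o) in store.triples {
        let term = |i: u32| store.literals[i as usize].to_owned();
        graph.insert((term(s), term(p), term(o)));
    }
}

/// True if `node` has an `rdf:type` that is `class` or a direct subclass of it.
fn is_instance_of(graph: &Graph, node: &str, class: &str) -> bool {
    graph.iter().any(|(s, p, o)| {
        s.as_str() == node
            && p.as_str() == RDF_TYPE
            && (o.as_str() == class
                || graph.contains(&(o.clone(), SUBCLASS_OF.to_owned(), class.to_owned())))
    })
}

fn main() {
    let mut graph = Graph::new();
    // This triple would come from the plugin's .ttl file at runtime.
    graph.insert(("_:port0".to_owned(), RDF_TYPE.to_owned(), LV2_AUDIO_PORT.to_owned()));

    println!("before spec: {}", is_instance_of(&graph, "_:port0", LV2_PORT));
    load_constant(&mut graph, &SPEC);
    println!("after spec:  {}", is_instance_of(&graph, "_:port0", LV2_PORT));
}
```

Without the specification triple, the host only sees an `lv2:AudioPort`; with it, the host can conclude that the node is also an `lv2:Port`.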