Monday, July 26, 2010

Weekly Report #10

Submitted on 2010-07-26
Covers 2010-07-19 to 2010-07-26

Status and Accomplishments
• the stub generation script (c6runapp-rpcgen) is now in place; provided with a number of C source files it can generate the corresponding GPP and DSP stubs for the functions defined in the given files, thus exposing them via RPC
• the documentation is expanded to include architectural details and will be maintained in the project wiki pages here
• more RPC examples to cover newly added things like stdio.h variadics and functions returning structures
• synchronized branch with the latest C6Run trunk, whose modified build system can create the C6Run libs without having to re-build the dependencies every time - but there are problems

Plans and Tasks
• investigate why the trunk-synced branch errors out on the produced executables ("Bus error", including on the non-RPC examples) and fix this
• experiment with building a bitbake recipe for C6Run to get it built inside OE
• for predefined RPC stubs whose pointer parameters are only "in" (i.e., the RPC target won't modify the memory pointed to), implement a mechanism that allows these parameters to be allocated from the DSP stack or heap - having to allocate every little string via rpc_malloc is annoying

Risks, issues, blockers
• it's not clear why the bus errors occur with the latest trunk-synced version but hopefully it'll be something that I overlooked while merging the changes from the trunk, or possibly a problem with the version of the trunk I used

Monday, July 19, 2010

Forwarding invocation of variadic function in C

I had been brooding over how to do the RPC calls for variadic functions for some time now. Although marshalling an arbitrary variadic isn't really possible due to the lack of a general method for obtaining the argument count and sizes (see my weekly report #9, issues section), for commonly used stdio.h variadics such as printf and scanf the arguments are fully described by the format string, so it should be possible to marshal these manually.

I'll be writing a separate blog post about how I went about doing that, but for now I want to talk about a sub-problem of this: given a number of arguments, how do you forward these to a variadic function? A re-statement could be: how do you "dynamically" invoke a variadic function?

Looking this up on the net, I've found these two discussions to be the most relevant:

The second link mentions a library called FFCALL which can be used to pass parameters to variadics dynamically, and this probably is the ideal way of doing things.  
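
For the simpler case where the forwarding function is itself variadic and the target has a va_list variant (as printf has vprintf and vsnprintf), standard C already offers a portable route: hand over the va_list. Below is a minimal sketch of that; format_into is a hypothetical name used only for illustration, and this does not cover the RPC case where the arguments arrive in a marshalled buffer.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper (not part of C6Run): formats into buf by forwarding
 * its own variadic arguments to vsnprintf through a va_list.
 * Returns whatever vsnprintf returns. */
int format_into(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

A call such as format_into(buf, sizeof buf, "%d %s", 1, "hello") behaves like snprintf with the same arguments. The limitation is that the arguments must already exist as a real variadic call frame, which is exactly what is missing on the receiving side of an RPC.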

I may have found another method for this - so far as I've seen, it works on x86 and ARM. It relies on the assumption that the last mandatory parameter and all the variadic parameters reside contiguously in memory, as well as on lots of terrible coding practices, but it should be illustrative enough.

What I'm doing is basically copying a fixed number of bytes from the memory region (=stack) where the variadic parameters are located into a buffer, then passing this buffer to another variadic function which is called with a fixed number of arguments (I picked doubles since they are larger and will allocate more space on the called function's stack). The function then calls memcpy to overwrite its variadic args section with the passed buffer, and afterwards can call the stdarg macros to obtain the variadic arguments, or pass them to something like vprintf.

Here's the code:

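#include <stdio.h>
#include <stdarg.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

void accepter(char *fmt, char *ptr, ...);
void forwarder(char *fmt, ...);

// NOTE: non-portable hack. It assumes fmt and the variadic arguments sit
// contiguously on the stack, which the C standard does not guarantee.
void forwarder(char *fmt, ...)
{
    double d = 9.1;
    char *buf = (char*) malloc(10*sizeof(double));
    if (buf == NULL)
        return;
    // copy the bytes that follow the last mandatory parameter
    memcpy(buf, (void*)((uintptr_t)(&fmt)+sizeof(char*)), 10*sizeof(double));
    // dump the copied bytes into a file for inspection
    FILE *o = fopen("tmp", "wb");
    if (o != NULL) {
        fwrite(buf, sizeof(double), 10, o);
        fclose(o);
    }
    // call function with 10 double arguments to open up stack space
    accepter(fmt, buf, d, d, d, d, d, d, d, d, d, d);
    free(buf);
}

void accepter(char *fmt, char *ptr, ...)
{
    // overwrite our own variadic area with the bytes saved by forwarder
    memcpy((void*)((uintptr_t)(&ptr)+sizeof(char*)), ptr, 10*sizeof(double));
    va_list ap;
    va_start(ap, ptr);
    vprintf(fmt, ap);
    va_end(ap);
}

int main()
{
    double d = 65.98;
    forwarder("%d %d %d ermm %s and more params! %x %f %x %x \n", 1, 2, 3, "hello world", 199, d , 0xdeadbeef, 0xbeefdead);
    return 0;
}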

Weekly Report #9

Submitted on 2010-07-19
Covers 2010-07-12 to 2010-07-19

Status and Accomplishments
• support for returning structures by the addition of a special return type signature ('#')
• address translations are now exposed through RPC, so the user is able to perform translations manually
• build system modifications - DSP and GPP-side stubs are now collected from inside two directories instead of two single files
• dsp-rpc-posix branch updated to contain the latest changes in the C6Run trunk
• manual marshalling for stdio.h variadics (printf, scanf, fprintf, sprintf...)

Plans and Tasks
• implement scripted stub generation (ie, the user will provide function declarations and the corresponding GPP and DSP-side stubs will be generated by a script)
• complete architectural documentation
• expand library of RPC examples to illustrate use-cases with different function signatures

Risks, issues, blockers
• since there is no general way of extracting the number and size of parameters in variadic functions (eg, one function may only specify the number of args as the first fixed parameter and accept only integer args, while another may take a zero-argument as a terminator of a number of float args, and yet another may use printf-style encoding), we can't implement a generic marshaller for variadics. for this reason, support for user-defined variadics is left out for now, although users writing their own variadic functions isn't very common and this is not expected to become an issue.
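
To make the point above concrete, here is a small sketch of two of the conventions mentioned. Both function names are hypothetical. The call sites look alike, but nothing in the declarations tells a generic marshaller how many arguments follow or how large they are; only the function body knows.

```c
#include <stdarg.h>

/* Convention 1: the first fixed parameter gives the number of int arguments. */
int sum_counted(int count, ...)
{
    va_list ap;
    int sum = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        sum += va_arg(ap, int);
    va_end(ap);
    return sum;
}

/* Convention 2: floating-point arguments, terminated by a 0.0 argument.
 * The values are read as double because of the default argument promotions. */
double sum_until_zero(double first, ...)
{
    va_list ap;
    double sum = 0.0;

    va_start(ap, first);
    for (double v = first; v != 0.0; v = va_arg(ap, double))
        sum += v;
    va_end(ap);
    return sum;
}
```

A marshaller that sees only sum_counted(3, 1, 2, 3) and sum_until_zero(1.5, 2.5, 0.0) has no way to tell where each argument list ends without function-specific knowledge.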

Monday, July 12, 2010

Weekly Report #8

Submitted on 2010-07-12
Covers 2010-07-05 to 2010-07-12

Status and Accomplishments
• lots of work went to solving the long-standing buffer/pointer parameters and return types issue, and we finally have a reasonably well-working system in place. all address translation / memory space mapping is now done automatically, provided that any pointers to be passed are pointing to memory allocated using the RPC allocator (CMEM based).
• three kinds of pointer return values are identified and supported: no-translation pointers (such as FILE*, which isn't meant to be dereferenced at all), direct-translation pointers (such as the return value of strpbrk, which points somewhere within the passed parameter) and manual-copy pointers (such as the return value of ctime, where the function allocates memory by some other method and returns that pointer; size information needs to be specified in the GPP stub)
• most C I/O stubs are now complete (the ones remaining are variadics and infeasible things like threads)
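
The "direct-translation" case can be sketched as plain pointer arithmetic: the returned pointer keeps the same offset into the buffer, whichever address space the buffer is viewed from. The helper below is a hypothetical illustration with made-up names, not the code used in the stubs, and it simulates the two address spaces with two ordinary buffers.

```c
#include <stddef.h>

/* Hypothetical helper: given a pointer `ret` that the remote side returned
 * into its own view of a buffer (`remote_base`), produce the matching
 * pointer into the caller's view of the same buffer (`local_base`). */
char *translate_direct(char *local_base, const char *remote_base,
                       const char *ret)
{
    if (ret == NULL)            /* e.g. "not found" results stay NULL */
        return NULL;
    return local_base + (ret - remote_base);
}
```

For instance, if the remote side runs strchr on its view of "key=value" and returns a pointer three bytes in, the caller gets a pointer three bytes into its own view. No-translation pointers would skip this step entirely, and manual-copy pointers need a size and a copy instead.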

Plans and Tasks
• find a way for marshalling variadic function calls (will probably involve manual work) and complete the remaining stubs
• test completed stubs with existing code
• offer more flexibility for buffer/pointer parameters by providing the ability to do address translations manually and maybe detecting non-shared buffers in the DSP-side stubs then syncing them automatically with some shared ones
• support multiple stub input files instead of placing them only in rpc_stubs_gpp.c and rpc_stubs_dsp.c

Risks, issues, blockers
• variadic functions such as printf and scanf are problematic for marshalling - since there is no well-defined way of knowing how many parameters there are and what size each one is, it's likely that the user will have to provide the parameter packing for these manually
• even when the stub generator tool is in place, it's likely that the user will have to provide some manual guidance for certain cases in stub generation such as identifying buffer/pointer parameters which don't need address translation, return types which need manual copying into a shared buffer and variadic functions

Monday, July 5, 2010

So what happened to the variadic marshaller?

As you may recall, I had a variadic marshaller I had been using for the RPC layer for some time, which recently had to be dropped since it was causing trouble with passing float parameters. I wanted to talk about it here a little bit, since it's not really a problem specific to DSPs or marshallers, but rather one about how certain arguments are passed to variadic functions.

Let's start by talking about what a variadic function is, since you may or may not have heard the term. A variadic function is a function that can take a varying number of arguments. There usually are a few "required" arguments but the sky (or rather, the bottom of the stack :)) is the limit. Sounds familiar? Yes indeed, probably the two most famous functions in the C library are variadic: printf and scanf.

Variadics in C are easily identifiable by their declarations, which look something like this:

int summation_function(int count, ...)

notice the ellipsis ( ... ) - this is the notation used to state that there will be an arbitrary number of arguments here.

And for quick reference - accessing the variadic arguments is done via stdarg.h macros, for example:

int sum = 0;
va_list arg_list;
va_start(arg_list, count);
for(int i=0; i < count; i++)
  sum += va_arg(arg_list, int);
va_end(arg_list);

whose detailed usage descriptions you can find easily by Googling.

Moving back to the problem I had with the variadic marshaller: I was passing regular 4-byte floats as arguments to be marshalled, but the corresponding 4-byte region in the marshalled buffer always turned out to be corrupted.

The issue was eventually revealed to be about the default argument promotions that are applied to variadics, as described in:

therefore, any short int or char arguments passed are automatically promoted to ints, and all floats are promoted to doubles - thus having an 8-byte representation whose first 4 bytes don't really mean all that much :)
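
A minimal sketch of what the promotions mean on the receiving side (the function names are hypothetical): the callee has to ask va_arg for the promoted type, no matter what type the caller wrote.

```c
#include <stdarg.h>

/* Reads one variadic argument that the caller wrote as a float.
 * It arrives as a double, so va_arg must request double;
 * va_arg(ap, float) would be undefined behavior. */
double read_promoted_float(int dummy, ...)
{
    va_list ap;
    double d;

    va_start(ap, dummy);
    d = va_arg(ap, double);
    va_end(ap);
    return d;
}

/* Reads one variadic argument that the caller wrote as a char or short.
 * It arrives as an int. */
int read_promoted_int(int dummy, ...)
{
    va_list ap;
    int i;

    va_start(ap, dummy);
    i = va_arg(ap, int);
    va_end(ap);
    return i;
}
```

So a marshaller that copies sizeof(float) bytes from the argument area for an 'f' entry in its signature string picks up half of a double rather than a float.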

My initial idea was to simply cast the obtained double parameter back into a float before marshalling it into the buffer, but unfortunately this leads to loss of precision during double->float conversion. Therefore, I've decided to switch to using macros and doing the parameter marshalling directly inside the stubs. Comparing what the DSP-side stubs look like in the old vs. new methods of marshalling:

/* old: variadic marshaller */
void rpc_mixedprint(int a, char b, float c, double d, short e, int f)
{
    rpc_marshal("rpc_mixedprint", "vicfdsi",a,b,c,d,e,f);
}


/* new: macro-based marshalling inside the stub */
void rpc_mixedprint(int a, char b, float c, double d, short e, int f)
{
    RPC_INIT("rpc_mixedprint", "vicfdsi");
    RPC_PACK(int, &a);
    RPC_PACK(char, &b);
    RPC_PACK(float, &c);
    RPC_PACK(double, &d);
    RPC_PACK(short, &e);
    RPC_PACK(int, &f);
}

I'm aware it doesn't look quite as elegant as the variadic marshaller did... but in terms of ease of stub generation, it's not all that different, and there's no loss of data precision involved. And the GPP-side stub uses macros to extract the parameters as well, once the unmarshaller unpacks the buffer into the void** param_buffer:

void rpc_mixedprint(void **param_buffer, void *result_buffer)
{
    mixedprint(RPC_CAST_PARAM(param_buffer[0], int),
               RPC_CAST_PARAM(param_buffer[1], char),
               RPC_CAST_PARAM(param_buffer[2], float),
               RPC_CAST_PARAM(param_buffer[3], double),
               RPC_CAST_PARAM(param_buffer[4], short),
               RPC_CAST_PARAM(param_buffer[5], int));
}
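
To illustrate why packing through a typed pointer sidesteps the promotion problem, here is a rough sketch of how such a packing macro could work. Everything in it (demo_buf, demo_pack, DEMO_PACK) is a hypothetical stand-in written for this post; it is not the actual RPC_PACK implementation or the real buffer layout.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical packing buffer; not the C6Run RPC buffer layout. */
struct demo_buf {
    unsigned char data[64];
    size_t used;
};

/* Appends `size` bytes from `src`. Returns 0 on success, -1 if full. */
int demo_pack(struct demo_buf *buf, const void *src, size_t size)
{
    if (size > sizeof buf->data - buf->used)
        return -1;
    memcpy(buf->data + buf->used, src, size);
    buf->used += size;
    return 0;
}

/* Copies exactly sizeof(type) bytes from *ptr. The value never travels
 * through a "..." parameter list, so no promotion takes place. */
#define DEMO_PACK(buf, type, ptr) demo_pack((buf), (ptr), sizeof(type))
```

With this approach a float occupies sizeof(float) bytes in the buffer and a char occupies one, which is what the unmarshalling side expects from the signature string.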

GSoC Weekly Report #7

Weekly Report #7
Submitted on 2010-07-05
Covers 2010-06-28 to 2010-07-05
Corresponding Draft Schedule Item:
Create the RPC framework that can call functions on the GPP side from the DSP and return values back to the DSP.  Implement and unit-test the POSIX function wrappers according to the planned order.

Status and Accomplishments

  • the RPC framework was tested with many different kinds and combinations of non-buffer parameters and a few bugs were unearthed; the issue with floats passed as variadic parameters led to the deprecation of the variadic marshaller. DSP-side stubs use macros to do the marshalling now. not as elegant but works far better.
  • the GPP-side RPC stubs dynamic link library is now embedded directly inside the resulting executable for cleaner deployment (ie, the user doesn't have to copy the library manually to the Beagle)
  • rpc_malloc and rpc_free handlers using CMEM for alloc/free and address translation added into the GPP server, so basic buffer/pointer parameters support will soon be in place
  • implemented a simple version of the ARM function caller, but it doesn't work with double arguments or with 4+ params of any kind. the macro-based parameter passing method works fine for now, though, and I intend to keep it for a while longer.
  • all POSIX/C lib stubs for functions that don't take any buffer parameters are now in place (there aren't that many, though :))
  • first usable version of dsp-rpc-posix committed to the repository

Plans and Tasks

  • more testing and support for buffer parameters - there are still open questions here, see issues section
  • write stubs for the POSIX functions with buffer parameters/return types
  • review the build system (how the user provides his/her own sources for use in RPC) and make improvements where necessary, see issues section

Risks, issues, blockers

  • the user needs to be able to provide source code and declarations for functions he/she intends to use with RPC - how should this be ideally done? keep them in a pre-determined directory and have the user add them there (easy but not very flexible)? pass them to the tool as command line parameters prefixed with something special? 
  • the GPP and DSP-side stubs for custom functions need to be provided manually for now. although it's relatively straightforward to do by hand, it's very very suitable for automation and I intend to have a stub generator script for this. I'd prefer not to have to write a C parser though, anything available I can use for this?
  • the situation with the buffer parameters issue is as follows at the moment: any buffer parameters the user wants to pass on via RPC *must* be allocated via the rpc_malloc call. this call is mapped to GPP-side CMEM allocation and the GPP server performs virtual-to-physical address translation before passing the buffer address to the DSP, so the DSP can directly work on this buffer. but there isn't any direct physical->virtual address translation available on the GPP side - what's the best way to deal with this? currently the C6Run allocator saves 16 bytes of extra info, including the virtual base address, along with the allocated buffer, and this is how we get rpc_free to work. but there'll be problems if the user doesn't pass the allocated buffer itself, but just a part of it - how will we find the virtual address then? even if we can find the virtual address for any given physical address... we can only do address translation if we're aware that it's required. what if the user puts a physical address somewhere inside a struct, or even inside another buffer (say, a void** parameter)?
  • one solution to this could be adding the allocated virtual addresses to the DSP MMU TLB and having the DSP work directly with virtual addresses. but there are only 31 slots available in the table :( EDIT: this is not a solution at all, the DSP MMU doesn't do any address translation (virtual = physical)! it's just for protection (preventing the DSP from accessing places it's not supposed to)
  • another solution could be making this the user's problem: providing virtual addresses from rpc_malloc and making the user do virtual->physical translation on their own (via another RPC function, of course :)) if it's needed.
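
The "extra info stored along with the buffer" idea, and why it breaks for pointers into the middle of a buffer, can be sketched roughly as follows. This is a hypothetical illustration on top of plain malloc with made-up names; it is not the C6Run allocator, and the header layout is an assumption.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header placed directly in front of each buffer. */
struct demo_hdr {
    void  *virt_base;   /* address at which the allocation really starts */
    size_t size;        /* usable size requested by the caller */
};

/* Allocates header + payload and returns a pointer to the payload. */
void *demo_alloc(size_t size)
{
    struct demo_hdr *h = malloc(sizeof *h + size);

    if (h == NULL)
        return NULL;
    h->virt_base = h;
    h->size = size;
    return h + 1;
}

/* Recovers the header by stepping back from the payload pointer.
 * This only works when p is exactly the pointer demo_alloc returned. */
void demo_free(void *p)
{
    if (p != NULL) {
        struct demo_hdr *h = (struct demo_hdr *)p - 1;
        free(h->virt_base);
    }
}
```

If the caller passes buffer + 4 instead of buffer, stepping back by one header lands in the middle of the real header and the recovered base address is garbage. That is the interior-pointer problem described in the issues above.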

Saturday, July 3, 2010

DSP-RPC-POSIX initial commit is in place!

I've made the initial commit for dsp->gpp RPC calls (the development had been on my personal SVN repo so far, since I didn't feel it was ready to see daylight :P)

It's nothing spectacular yet (e.g. no buffer/pointer parameters or return values allowed, so only ctype and math C library calls) and you probably won't think very highly of the coding style (or the way I do things in the makefiles...) either, but it still works :)

The SVN URL is:

There's a readme file under the top-level rpc/ directory which should be sufficient to get started.

possible makefile/build issue: the version of LPM I was using (local_power_manager_linux_1_24_02_09) has lpm.av5T instead of lpm_linux.av5T (as was stated in the original C6Run makefiles) so I've changed that. just find/replace it the other way around in build/gpp_libs/modules/Makefile if it complains about that.

Some notes:

  • I had to put away my variadic marshaller since it was causing issues with floats (why? blog post coming soon!) - all marshalling is done inside the stubs with macros. Takes more lines but it's probably a better idea in the longer run. 
  • The main points of interest in terms of code would be: the files under the rpc/ directory (dsp and gpp side stubs, some additional dsp side functions), rpc_server.c and .h under build/gpp_libs (gpp unmarshalling, symbol location and execution), cio_ipc.c (gpp RPC server, inside the C6Run C I/O server) and c6run.c under build/gpp_libs (extracting the embedded GPP stubs library and setting up RPC buffers).
  • I'm using the C6Run C I/O transport system to pass RPC buffers back and forth, for now.
  • There's some additional parts in c6run-cc to handle the --rpc switch (add the dsp stub file to the sources list, compile the gpp stubs into a dynamic link library and embed it inside the final executable)
  • Custom stubs have to be hand-coded, there's no stub generator tool yet

Support for buffer parameters coming soon! (actually, if you write a GPP side stub that does allocation from CMEM or POOL, use this RPC call for allocating memory on the DSP and pass data in this buffer, everything should work).

Comments, criticism and suggestions are very, very welcome!