Monday, June 25, 2018

Structure over flexible array member

I'm writing a C program (g++ compilable) that has to deal with a lot of different structures, all coming from a buffer with a predefined format. The format specifies which type of structure I should load. This could be solved with a union, but the huge difference in the sizes of the structures made me opt for a structure with a void * in it:

struct msg {
    int type;
    void * data; /* may be any of the 50 defined structures: @see type */
};

The problem with that is that I need two malloc calls and two free calls per message. For me, function calls are expensive and malloc is expensive. From the user's side, it would be great to simply free the msgs. So I changed the definition to:

struct msg {
    int type;
    uint8_t data[]; /* flexible array member */
};
...
struct msg_10 {
    uint32_t flags[4];
    ...
};

Whenever I need to deserialize a message, I do:

struct msg * deserialize_10(uint8_t * buffer, size_t length) {
    struct msg * msg = (struct msg *) malloc(offsetof(struct msg, data) + sizeof(struct msg_10));
    struct msg_10 * payload = (__typeof__(payload))msg->data;

    /* load payload */
    return msg;
}

And to get a member of that structure:

uint32_t msg10_flags(const struct msg * msg, int i)
{
    return ((struct msg_10 *)(msg->data))->flags[i];
}

With this change, gcc (and g++) issue a nice warning: "dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing]".
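One way to keep the single allocation and avoid the cast behind this warning is to move bytes with memcpy instead of dereferencing a struct msg_10 pointer into data. The following is only a minimal sketch under assumptions not stated in the post: struct msg_10 is reduced to its flags field, the buffer is assumed to hold the payload in host byte order and layout, and the type value 10 is a guess at how type is filled in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct msg {
    int type;
    uint8_t data[]; /* flexible array member */
};

/* Assumption: only the flags field of the real structure is modelled. */
struct msg_10 {
    uint32_t flags[4];
};

struct msg *deserialize_10(const uint8_t *buffer, size_t length)
{
    struct msg *msg;

    if (length < sizeof(struct msg_10))
        return NULL; /* buffer too short for this payload */

    msg = (struct msg *) malloc(offsetof(struct msg, data) + sizeof(struct msg_10));
    if (msg == NULL)
        return NULL;

    msg->type = 10; /* assumption: the type tag equals the message number */
    /* Byte copy into the trailing storage; no struct msg_10 pointer is formed. */
    memcpy(msg->data, buffer, sizeof(struct msg_10));
    return msg;
}

uint32_t msg10_flags(const struct msg *msg, int i)
{
    uint32_t value;

    assert(i >= 0 && i < 4);
    /* Copy one element out of the payload bytes instead of casting msg->data. */
    memcpy(&value,
           msg->data + offsetof(struct msg_10, flags) + (size_t) i * sizeof value,
           sizeof value);
    return value;
}
```

The caller still releases the whole message with a single free(msg). Compilers commonly turn small fixed-size memcpy calls like these into plain loads and stores, but that is worth checking on the actual target.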

I think this is a common problem (though I found no answer here): how to represent a family of messages in C in an efficient manner.

I understand why the warning appears. My questions are the following:

  1. Is it possible to implement something like this, without the warning or is it intrinsically flawed? (the or is not exclusive :P, and I'm almost convinced I should refactor)
  2. Would it be better to represent the messages using something like the following code?

    struct msg {
        int type;
    };
    ...
    struct msg_10 {
        struct msg base; /* first member; or int type; */
        uint32_t flags[4];
        ...
    };
    
    
  3. If yes, caveats? Can I always write and use the following?

    struct msg * deserialize_10(uint8_t * buffer, size_t length) {
        struct msg_10 * msg = (struct msg_10 *) malloc(sizeof(struct msg_10));
    
        /* load payload */
        return (struct msg *)msg;
    }
    
    uint32_t msg10_flags(const struct msg * msg, int i) {
        const struct msg_10 * msg10 = (const struct msg_10 *) msg;
        return msg10->flags[i];
    }
    
    
  4. Any other?
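For questions 2 and 3, here is a minimal runnable sketch of the "common header as first member" layout. C guarantees that a pointer to a structure, suitably converted, points to its first member and vice versa, so the casts are valid as long as the object really was allocated as a struct msg_10. The member name base, the type value 10, the reduced struct msg_10, and the host-order buffer layout are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct msg {
    int type;
};

/* Assumption: only the flags field of the real structure is modelled. */
struct msg_10 {
    struct msg base; /* must stay the first member */
    uint32_t flags[4];
};

struct msg *deserialize_10(const uint8_t *buffer, size_t length)
{
    struct msg_10 *msg;

    if (length < sizeof msg->flags)
        return NULL; /* buffer too short for this payload */

    msg = (struct msg_10 *) malloc(sizeof *msg);
    if (msg == NULL)
        return NULL;

    msg->base.type = 10; /* assumption: the type tag equals the message number */
    memcpy(msg->flags, buffer, sizeof msg->flags); /* load payload */
    return &msg->base; /* same address as msg, no cast needed */
}

uint32_t msg10_flags(const struct msg *msg, int i)
{
    const struct msg_10 *msg10;

    /* The downcast is only valid if the object really is a struct msg_10. */
    assert(msg->type == 10);
    assert(i >= 0 && i < 4);
    msg10 = (const struct msg_10 *) msg;
    return msg10->flags[i];
}
```

One allocation and one free(msg) per message remain. The main caveat in this sketch is that every accessor has to trust, or check, the type tag before casting down.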

I forgot to say that this runs on low-level systems and performance is a priority but, all in all, the real issue is how to handle this "multi-message" structure. I may refactor once, but changing the implementation of the deserialization of 50 message types...
