Annotations

Annotations in sixgill express conditions which should hold of the code. Annotations are gcc attributes, understood by the sixgill gcc plugin; The following header file defines macros for all sixgill annotations; the macro names can be changed without breaking the plugin, but macro contents (e.g. attribute names) should be left alone.

sixgill.h

Annotations take condition arguments which can be any C expression (anything which can be passed to assert(...)), with a couple caveats. First, no looping or recursive function calls can be performed; for information on this restriction, see Quantified Asserts. Second, the code in the condition is not actually executed, and any side effects will not occur. The following sections describe and give examples for the various kinds of annotations.

Assertions
Preconditions
Postconditions
Type Invariants
Global Invariants

Annotations can refer to values that the program itself cannot. This capability comes through special functions that can be referred to only in annotations. The following sections describe and give examples for these functions.

Buffer Bounds
Null Terminators
Loop Initial Values
Function Initial Values

All of the annotations referred to above are untrusted — while sixgill will assume the annotation is correct when checking assertions in the code, it will also check additional assertions to make sure the annotation is correct. Each of these annotations has a variant which is trusted — sixgill will still assume the annotation is correct, but will not do any checking to ensure it. Untrusted annotations can be added anywhere without affecting the soundness of the tool; trusted annotations must be correct.

Trusted Annotations

Finally, annotations can be used to supply directives for sixgill. These annotations do not express conditions which should hold of the code, but modify sixgill's behavior in limited ways.

Skipping Invariant Inference

Assertions

The static_assert annotation specifies a condition that must hold at a point within a function's execution, in the same manner as the standard assert(...).
void foo(int *buf, int len)
{
  static_assert(len > 0);

  int last_pos = 0;
  for (int i = 0; i < len; i++) {
    if (buf[i] > 5)
      last_pos = i;
  }

  static_assert(last_pos < len);
  buf[last_pos] = 0;
}
static_assert can be used to specify loop invariants, as in the following example. There are no special steps to take to mark an assertion as a loop invariant; if the assertion is somewhere in the loop body, it must hold whenever execution of the loop reaches that point.
void foo(int *buf, int len)
{
  int *pos = buf;
  int rem = len;
  while (rem) {
    static_assert(pos - buf == len - rem);
    *pos++ = 0;
    rem--;
  }
}
The main difference between static_assert and assert is that static_assert is checked statically, while assert is checked at runtime. Assertion macros can be made which check both statically and at runtime, and the static_assert_runtime annotation is intended to support this, such as in the following example:
#ifdef DEBUG
#define Assert(COND)               \
  do {                             \
    static_assert_runtime(COND);   \
    if (!COND) {                   \
      printf("assert failed!\n");  \
      abort();                     \
    }                              \
  } while (0)
#else
#define Assert(COND) static_assert_runtime(COND)
#endif
static_assert_runtime is identical to static_assert, with the following differences:

1. static_assert_runtime annotations are not assumed when checking assertions which occur later in the code. This prevents any runtime assertions from interfering with sixgill's analysis.
2. Reports for static_assert_runtime annotations which might not hold are grouped separately from static_assert annotations which might not hold.

Preconditions

The precondition annotation specifies a condition that must hold whenever a function is called. The following example modifies an earlier example to use precondition instead of static_assert.
precondition(len > 0)
void foo(int *buf, int len)
{
  // static_assert(len > 0);
  ...
}
There are two differences between using precondition and using a static_assert at the beginning of a function.

1. precondition annotations can be placed on function declarations, not just definitions. They can be included in header files, and placed on functions whose definition sixgill is not analyzing (e.g. code from external libraries).
2. sixgill will generate one report for each call site where a precondition might not hold; using a static_assert will generate at most one report, for a single call site which might violate the assert.

Postconditions

The postcondition annotation specifies a condition that must hold at exit from a function. Postconditions can use the keyword return to refer to the return value of the function.
precondition(len > 0)
postcondition(return < len)
int foo(int *buf, int len)
{
  int last_pos = 0;
  for (int i = 0; i < len; i++) {
    if (buf[i] > 5)
      last_pos = i;
  }
  return last_pos;
}

Type Invariants

The invariant annotation specifies a condition that generally holds of objects of a particular type; the condition can be assumed whenever a object with the type is used, and must be established after any object with the type is modified. Invariant annotations can be placed on the type itself or on any of its fields; for the following example, all the invariants are equivalent.
struct str {
  invariant(a <= b)
  int a;
  int b;
};

struct invariant(a <= b) str {
  int a;
  int b;
};

struct str {
  int a;
  int b;
} invariant(a <= b);
Invariants can also use the 'this' keyword (even on C structures) to refer to fields of the structure. The following is equivalent to the above invariants.
struct str {
  invariant(this->a <= this->b)
  int a;
  int b;
};
There are some restrictions on which type invariants can be checked by sixgill. See Type Invariants over multiple objects for more information.

Global Invariants

The invariant annotation can also be used on global variables. In the same fashion as type invariants, these conditions must generally hold of the globals; they can be assumed whenever the globals are used, and must be established after the globals are written.
int globa;
invariant(globb >= globa) int globb;

Buffer Bounds

The lbound and ubound functions can be used in annotations to indicate the lower and upper bounds of a buffer. lbound(buf) is the smallest element index which is allocated for buf, and ubound(buf) is the number of elements allocated, starting at buf itself. buf[lbound(buf)] and buf[ubound(buf) - 1] are accessible, provided that buf has at least one element allocated.
precondition(ubound(buf) >= len)
void foo(int *buf, int len)
{
  for (int i = 0; i < len; i++)
    buf[i] = 0;
}
lbound and ubound are both expressed in terms of the element type of the buffer. The buffer passed to lbound and ubound can be cast to another type to change this element type; casting to void* or char* will give bound extents in bytes.
precondition(ubound((void*)buf) >= bytelen)
void foo(int *buf, int bytelen)
{
  int ind = 0;
  while ((ind + 1) * sizeof(int) <= bytelen)
    buf[ind++] = 0;
}

Null Terminators

The zterm function indicates the index of the last null element in a buffer. zterm(buf) is the same as strlen(buf), with a few differences:

1. zterm can be used on buffers with element types other than char.
2. If the buffer has multiple null values, zterm indicates the last one, rather than the first one after buf.
3. If there is no null value after buf, zterm gives a negative value, which will be less than lbound(buf) if there is no null value at all in the buffer.

The most common use for zterm is a precondition or assertion that some zterm(buf) >= 0. This indicates that buf has a null terminator. strlen itself can be annotated using zterm in the following fashion:
precondition(zterm(str) >= 0)
postcondition(return <= zterm(str))
int strlen(char *str)
{
  char *pos = str;
  while (*pos) { pos++; }
  return pos - str;
}

Loop Initial Values

The initial function can be used in an assertion within a loop body to indicate the value of a variable or other term when the loop was initially entered. This is useful when a variable has been overwritten but its initial value is still relevant, as in the following example.
void foo(int *buf, int len)
{
  while (len) {
    static_assert(buf >= initial(buf));
    static_assert(buf - initial(buf) == initial(len) - len);
    *buf++ = 0;
    len--;
  }
}

Function Initial Values

The initial function can also be used in assertions outside of loop bodies and in postconditions, in which case it indicates the value of a variable or term when the function was entered.
precondition(*pdata < max)
postcondition(*pdata > initial(*pdata))
void foo(int *pdata, int max)
{
  while (*pdata != max) {
    (*pdata)++;
    if (*pdata == 42) return;
  }
}

Trusted Annotations

Trusted annotation variants exist for all the untrusted annotations. These have the names static_assume, precondition_assume, postcondition_assume and invariant_assume. Each behaves identically to the corresponding untrusted annotation, with the exception of not being checked for correctness by sixgill. In most cases it is strongly preferable to use untrusted annotations. The main reasons to use trusted annotations are as follows:

1. To wall off a portion of the code base from analysis. This is necessary for many custom memory allocators: rather than use untrusted annotations to convince sixgill that an allocator is correct, use a trusted annotation to describe the property which sixgill may assume.
postcondition_assume(ubound(return) >= size)
void* internal_malloc(int size)
{
  ...
  // complicated stuff.
  ...
  return value;
}
2. To account for a lack of precision in sixgill's analysis. As described in Precision, there are cases using nonlinear operations and quantified properties which sixgill is not currently capable of understanding. Trusted annotations can be used here to (a) shut the tool up, and (b) allow checking for code which uses the result of the computation which wasn't understood.
postcondition(zterm(return) >= 0)
char* print_radix(unsigned num, unsigned radix)
{
  char *buf = malloc(40);
  char *pos = buf;
  while (num) {
    *pos++ = '0' + (num % radix);
    num /= radix;
    static_assume(pos - buf < 40);
  }
  *pos = 0;
  return buf;
}

Skipping Invariant Inference

The skip_inference function may be used as a directive in a type or global invariant which instructs sixgill not to attempt to infer invariants on that type or global. Normally, sixgill will try to infer type invariants that prove the correctness of its goal condition (see Reports), and then try to prove those type invariants. If it infers a type invariant which is not actually correct, it will end up generating a report, and may infer the same or similar incorrect type invariants in hundreds of places. Adding the skip_inference directive to a type or to one of its fields will direct sixgill to use other dependents. In the following example, without the skip_inference directive sixgill will look at foo2 and try to infer the incorrect type invariant ubound(buf) > 20 on str. With the directive, it will instead infer a precondition on foo2 and prove the correctness of the write.
typedef struct str {
  invariant(skip_inference())
  int *buf;
} str;

void foo1(str *s) { s->buf[10] = 0; }
void foo2(str *s) { s->buf[20] = 0; }

void bar1()
{
  int x[11];
  str s = { x };
  foo1(&s);
}

void bar2()
{
  int x[21];
  str s = { x };
  foo2(&s);
}