Coverity

Overview

Coverity is a static analysis program, lint on steroids, a product of Synopsys. It compiles code to a form that "checkers" can examine and detect errors (defects in coverityese) of a certain type.

As an open source project, FreeRADIUS uses [Coverity Scan](https://scan.coverity.com/projects/freeradius-freeradius-server), a free service Synopsys provides. Synopsys lets open source projects registered with them check their project with coverity f(LOC(project)) times per day (two for FreeRADIUS as things stand). Each such project gets a web page where one can set up the project, see the results of runs, and admins can change settings.

Coverity can miss a defect (false negative) or claim something’s a defect that isn’t (false positive). Coverity appears to consider a function at a time, looking at what it calls, not how you got there. A programmer may honor a function’s preconditions and hence know that the function, and the functions it calls, will always work, and therefore write those calls without error checking. Coverity, at least as it stands, cannot know that, giving rise to "unchecked return value" defects that in a perfect world it could tell won’t cause a problem.

Given defects, one can use the web interface to classify them and assign them to a developer, who can do one of the following:

Rewrite the code to fix the problem.
Annotate the defect with a comment. (Coverity documentation mentions this as a possibility for false positives or intentional defects.)
Model functions to give coverity a better idea of what the function does. A function model uses Coverity primitives, pseudo-functions, to describe what it does to data. Typically you do this to assist coverity in avoiding false positives. One can also model library functions that Coverity can’t see the source for.

Annotation

Defects on single lines

To annotate a defect or defects, place one or more comments of the form /* coverity[<defect type>] */ on the lines immediately before the line with the defect. (The brackets aren’t metalinguistic; they really appear in the comment. C++ style comments work too.) For example, if one wished to annotate the defect shown there, it would look like this (with the line number of the coverity listing removed):

	/* coverity[tainted_data] */
        len = fr_pair_tlvs_from_network(ctx, out, vendor, p, option_len, decode_ctx, decode_option, verify_tlvs, true);

Note: FreeRADIUS uses macros a lot, in particular macros that expand to multiple C statements. Annotations in macro definitions do not appear in the expansions of those macros.

Function-scope Annotations

Function definitions can be preceded by a comment describing the behavior of a function to indicate actions that the functions take. For example,

/* coverity[+abort] */
void my_exit(int code)
{
	...
}

Modeling

Here is a slightly edited explanatory comment that appears in some modeling files in open source C language projects.

/* Coverity Scan model
 *
 * This is a modeling file for Coverity Scan. Modeling helps to avoid false
 * positives.
 *
 * - A model file can't import [sic] any header files.
 * - Therefore only some built-in primitives like int, char and void are
 *   available but not wchar_t, NULL etc.
 * - Modeling doesn't need full structs and typedefs. Rudimentary structs
 *   and similar types are sufficient.
 * - An uninitialized local pointer is not an error. It signifies that the
 *   variable could be either NULL or have some data.
 *
 * Coverity Scan doesn't pick up modifications automatically. The model file
 * must be uploaded by an admin in the analysis settings of
 * http://scan.coverity.com/<path for the particular project>
 */

Note the singular "file"; Coverity Scan only allows one modeling file, which you must upload. For FreeRADIUS, go to [scan](https://scan.coverity.com/projects/freeradius-freeradius-server/) and choose the "Analysis Settings" tab. At the bottom of that page is the interface for uploading or deleting the modeling file. (Coverity Scan uploads the modeling file and then checks it, so keep a copy around to reload in case the newly uploaded version has a problem.)

You can’t include header files, so the modeling file is likely to need typedefs and defines that could otherwise be included, inducing a certain amount of redundancy and possible mismatch between the modeling file and the project headers. This is reduced by the rudimentary (i.e. empty) structures. One might think that if coverity’s compiler can deal with members of aggregates, one could have non-rudimentary structures, but

no model functions in Coverity documentation declare structs with members.
our attempts to use structs with members in models haven’t worked.

Coverity has primitives like coverity_writeall(), which tells it the modeled function overwrites a variable in its entirety, however big it is, so you needn’t ask for sizeof(<rudimentary struct>).

A model function contains only code to indicate the effects coverity cares about and the conditions under which they happen. A very common idiom, which we tried to use to tell coverity that fr_sbuff_in_foo(), on success, writes to out→p:

size_t my_copy(uint8_t *dest, uint8_t const *src, size_t n)
{
  size_t  result;
  if (result > 0) __coverity_write_buffer_bytes__(dest, result);
  return result;
}

Coverity doesn’t care how result is determined, or how the data is copied. result is just checked to determine whether any data was copied, and returned to show what it represents.

Coverity tends to be stricter than actual compilers, though as gcc and clang improve they are adding options to check for more issues. Examples of things coverity will notice but gcc and clang currently won’t:

Coverity notices and complains about subexpressions of conditionals that don’t affect control flow. This tripped up the initial versions of SBUFF_PARSE_[U]INT_DEF(), macros used to generate the fr_sbuff_out_<integral type>() functions. They return an error if the parsed integer is out of range for the type, but comparing with the min and max values gives a defect for the widest integer types, because no out of range value is representable. The check was recast as "did the conversion preserve the value", but that will need to change if coverity gets sufficiently smart.
In C, something that (transitive closure of has) a member with a const-qualified type can only be set in C via an initialization. (Calling a function semantically is like initializing the parameters.) Currently, at least, one can sneak around it in gcc or clang, but coverity sees through the ruse. fr_value_box_t has members that are const-qualified and also has members with members that are const-qualified. Therefore, all one can do about fr_value_box_init() is annotate the reported store_writes_const_field defect.
Coverity considers the result of some pointer casts to be tainted (see below). It calls them "pointer downcasts", an odd term for a non-OOP language like C. The casts in question are from pointers to less aligned types to pointers to more aligned types. Doing that violates some C coding standards, e.g. Carnegie-Mellon’s SEI-CERT C Coding Standard, and if at runtime the cast pointer is not sufficiently aligned, dereferencing it is officially undefined behavior in C. In FreeRADIUS, this happened in fr_dhcpv4_raw_packet_recv(), though the pointer points to memory allocated with talloc, which guarantees TALLOC_ALIGN alignment… but no coverity primitive assures coverity that a pointer is aligned. There is a coverity_alloc() primitive that is part of the model for malloc(), but it’s not documented as telling Coverity about alignment.

Taint

Tainted data is suspicious data. Coverity considers data from certain sources to be tainted, and tainted data should be validated before use.

For example, trunk test code currently simulates mux and demux by writing and writing trunk request pointers to and from sockets. Coverity considers the read trunk request pointer tainted. It will probably take a way to remember written pointers between the write and the read to validate them, or just keeping them in memory instead of writing and reading them to avoid tainted data there.

In a function that loads a tainted value and doesn’t validate it, each use in that function invocation is considered a defect, including passing it to another function. Coverity does not remember validations once the function invocation containing the validation returns. It may therefore be a good idea if a function calls more than one function using the data to have the first called do the validation and pass the validated value to the rest.

When the Heartbleed bug appeared, Synopsys looked for a way coverity could detect such bugs. https://www.synopsys.com/blogs/software-security/detecting-heartbleed-with-static-analysis/ describes what it came up with: a value is considered tainted if

it’s calculated by byte swapping, and
it’s then assigned to a "tainted sink", e.g. something used to index an array or as a length

But there must have been more to it than that since then, as evidenced by the fr_nbo_to_uint*() functions. The ntoh*() functions taint their result…but the fr_nbo_to_uint*() functions take a pointer to memory holding a value stored in network byte order. Here’s one of them:

/** Read an unsigned 16bit integer from wire format (big endian)
 *
 * @param[in] data	To convert to a 16bit unsigned integer of native endianness.
 * @return a 16 bit unsigned integer of native endianness.
 */
static inline uint16_t fr_nbo_to_uint16(uint8_t const data[static sizeof(uint16_t)])
{
	return (((uint16_t)data[0]) << 8) | data[1];
}

Coverity considers not only the value fr_nbo_to_uint16() tainted, but the pointer as well. One can range check a length—how does one validate a pointer? Actually, there are ways. The talloc library lets you associate a string with allocated memory indicating its type, and gives you talloc_get_type() and talloc_get_type_abort() to check that type, and once that’s known, if there’s some property things of that type have, you can check that too. That reduces the issue to letting Coverity know that those actions validate the pointer—Coverity can recognize range checks on numeric values as validation, but this it won’t recognize. The way around it is to split that out into a function and model it as verifying the pointer. That brings its own problem. If in the original function the state of the pointed at object affects what is done with it, that state checking code will have to be replicated in it.

Taint propagation

Unvalidated tainted values, like any other error, propagates. Coverity considers uses of tainted data to be defects. Assigning it to another variable taints that variable; passing it to a function taints the parameter for that invocation.

In addition, once a pointer is considered tainted, so is any value retrieved using that pointer. An example: in decode_vsa() in the defect listing for src/protocols/dhcpv4/decode.c, we see

   	5. tainted_argument: Calling function fr_nbo_to_uint32 taints argument *p. [show details]
375        pen = fr_nbo_to_uint32(p);

and later on, even though no byte swapping is involved,

   	8. var_assign_var: Assigning: option_len = p[0]. Both are now tainted.
397        option_len = p[0];

leading to the defect

CID 1503954 (#1 of 1): Untrusted loop bound (TAINTED_SCALAR)
11. tainted_data: Passing tainted expression option_len to fr_pair_tlvs_from_network, which uses it as a loop boundary. [show details]
   	Ensure that tainted values are properly sanitized, by checking that their values are within a permissible range.
407        len = fr_pair_tlvs_from_network(ctx, out, vendor, p, option_len, decode_ctx, decode_option, verify_tlvs, true);

TAINTED_SCALAR is the checker that detected the defect; the defect is tainted_data.

References

https://sig-product-docs.synopsys.com/bundle/coverity-docs/page/webhelp-files/customize_start.html Customizing Coverity. Look especially at the sections "Identifying vulnerable data" and "Models and Primitives" (and in turn the section in "Models and Primitives" on C and C++ primitives).
https://www.synopsys.com/blogs/software-security/detecting-heartbleed-with-static-analysis/ "On detecting Heartbleed with static analysis". Describes the method Synopsys developed to detect errors like Heartbleed in a way "that scale[s] to large programs with low false positive rates, yet find[s] critical defects."