List Array ADT

Summary

You will implement and test a novel data structure we call a List Array.

Objectives

Gain exposure to additional semantics for a list ADT
Practice implementing code to a given specification
Develop memory management skills
Reap the benefits of a memory error detector
Experience using parameterized preprocessor macros
Reinforce good unit testing habits

Introduction

Typically the list ADT is understood as a linearly indexed data structure supporting get and set operations with an add operation to insert new items at the end of the sequence.

So far we have seen how linked lists can easily support these operations, and clearly arrays can, too. Of course, with arrays, there is the complication that space may run out if the array is not big enough for all the add operations one may want to apply.

One common approach to this problem is to create a new, much larger array when space runs out, copy the old data to this array, and release the previous array. (Java's ArrayList and C++'s vector both use this strategy.) It may seem wasteful at first, but it turns out that the copying time amortizes well if the new array doubles in size.

Unfortunately, that analysis only applies when copying is cheap relative to time spent traversing memory. What if copies are expensive (slow)?

In this assignment, we will take a hybrid approach to expansion by constructing a list which holds arrays of increasingly larger sizes. The (conceptual) concatenation of all the arrays in the list would give rise to the linear index to the data in the collection.

Implementation Details

We will use the following structures to organize the collection's data.

typedef struct node node_t;

struct node {
  data_t * array;
  int length;
  node_t * next;
};

typedef struct {
  node_t * first;
  int size;  
} listarray_t;

Note that data_t is not part of the C standard. Use of the code will require we also define this type somewhere before it is used, as follows.

typedef int data_t;

Of course we need not necessarily store ints, we could just have easily have specified doubles, chars, or even potentially a struct like Picture or a pointer like float *. (However, a struct would require us to rethink our approach to the function contains, which relies on the == operator that cannot be used to compare structures.)

In any case, the definitions above give rise to a structure that we can conceptually visualize in the figure below.

Memory organization for a list array — Conceptual organization of memory and pointers for a list array.

The operations your listarray_t will need to support are given in the accompanying listarray.h header and include create, size, add, get, set, contains, and clear.

While your listarray.c implementation will naturally contain these operations, you can greatly simplify several of them with an additional helper function that either returns the address of the data_t item at a particular index or else returns (via a handle) the address of the last node_t in the list, (i.e., for when the index is the list size and the last node's array is full; this bookkeeping avoids an additional traversal to attach a new node.)

/* Get pointer to data at specified index 
 * Precondition:
 *  index <= size(list) [Note: equality allowed for add() function]
 * Postcondition:
 *  returns address of data item or NULL
 *  if returns NULL and last is not NULL, changes last to point to last node
 * [PRIVATE METHOD] */
data_t *
get_ptr (const listarray_t * list, int index, node_t ** last);

Expansion Behavior

The first node—allocated when create (int capacity) is called—should contain an array of length capacity, but the list's size should be zero, because nothing has yet been added. We put array in quotes because you must use malloc to create a block of memory with the appropriate size; the node will then contain a pointer to that block of memory, which will function as the array in the node.

When the add method is called, the size of the list should increase and the data item should be stored at the next available index (as determined by the list size) in the last node's array.

When there is no such available index (i.e., because the last node's array is full), a new node should be added to the list with an array that is double the size of the previous node's array.

Thus, if the initial capacity is 10, the nodes in the list would have arrays of lengths 10, 20, 40, 80, etc, as add is invoked to necessitate the space.

Testing

Because you're implementing an outright data structure, testing is more straightforward. For this assignment, you can use the handy testing macro from the C Preprocessor Lab, as demonstrated the testing program skeleton below.

test_listarray.c

#include <stdbool.h>
#include <stdio.h>

#include "listarray.h"

#define TEST(EXPR,RESULT)                                                     \
        if ((EXPR) != (RESULT)) {                                             \
          ++numErrors;                                                        \
          printf("Did not get expected result %s for %s.\n", #RESULT, #EXPR); \
        }

/* Print a report of a collection of unit tests.
 * Prints OK when no errors and FAIL with a count otherwise.
 */
void
reportTests (int numErrors)
{
  if (numErrors>0)
    printf (" FAIL: %d errors\n",numErrors);
  else
    printf (" OK\n");
} // reportTests


/* Test collection example */
int
testExample (void)
{
  printf("--EXAMPLE TEST--");

  int numErrors = 0;
  
  listarray_t * list = create (5);

  TEST( size (list),   0 );   // list is empty
  TEST( add (list,42), true); // added item
  TEST( size (list),   1 );   // list is appropriate size
  TEST( get (list,0),  42);   // check item was added / indexable

  clear(list);
  
  reportTests (numErrors);
  return numErrors;
} // testExample


/* Test all methods for a single node (no expansions) */
int
testBasic (void)
{
  printf("--BASIC TESTS--");
  
  const unsigned int CAPACITY = 10;
  int numErrors = 0;
 
  listarray_t * list = create (CAPACITY);

  TEST( size(list), 0 );        // initial list is empty

  int i; // index for all loops
  
  /* Test add (successful), size (incremented), and get (new item) */
  for (i=0 ; i<CAPACITY ; i++) {
    TEST( add (list,i), true);  // added item i
    TEST( size (list),  i+1 );  // list is appropriate size
    TEST( get (list,i), i);     // check index i is item i
  }

  /* Test contains (true and false) at all locations*/
  for (i=0 ; i< 2*CAPACITY ; i++) {
    TEST( contains (list,i), i<CAPACITY); // list contains item
  }

  /* Test set (return), size (same), and get (new value) */
  for ( i=0 ; i<CAPACITY ; i++) {
    TEST( set (list, i, i + CAPACITY), i);
    TEST( size (list), CAPACITY ) ;
    TEST( get (list,i), i + CAPACITY); 
  }

  clear (list); // should run without memory errors
  
  reportTests (numErrors);
  return numErrors;
} // testBasic


/* Test all functionality by running test functions.
 * Returns the total number of errors.
*/
int
testAll (void)
{
  int numErrors = 0;

  numErrors += testExample();
  numErrors += testBasic();
  // Add more test collections here

  if (numErrors==0)
    printf ("TESTS PASSED\n");
  else
    printf ("TOTAL ERRORS: %d\n",numErrors);
  return numErrors;
  
} // testAll


/* Driver program for test suite */
int
main (void)
{
  return testAll();
} // main

You should expand this program to include a complete set of tests.

Important Reminders

Read the documentation carefully to ensure your implementation meets the stated requirements.
Every malloc needs a corresponding free; pay attention to error handling even within create and add.
Be sure to explicitly initialize all fields of a struct.
The pattern for allocating a variable of type foo_t is:
```
foo_t * bar = malloc ( sizeof(foo_t) );
```
A common mistake is to include the * operator in the argument to sizeof. Remember you want enough memory for the actual thing, not just enough memory for a pointer to it. (What you get back is simply a pointer to that amount of space.)
You can get credit for writing correct tests, even if your library doesn't pass them
You might get credit for correctly implementing operations, even if they rely on other operations that are incorrect in your implementation.

Specific Requirements

You must use the listarray.h as provided, without modifications.
Your program and test driver must compile with no warnings (demonstrated in your transcript).
Your library and test driver should run cleanly with no AddressSanitizer reports (demonstrated in your transcript).
All malloc calls should be tested for success; failures should be indicated with perror messages and appropriate return values.
Whenever possible, preconditions should be assertion tested.
You must include a Makefile with targets that compile the listarray library and test driver individually into object files, and a target to link them into an executable test program. Your Makefile should not include debugging symbols or AddressSanitizer by default; these must be requested at compile time through the use of the CFLAGS and/or LDFLAGS variable.

Grading

In addition to the general grading guidelines for evaluation, the assignment is worth 30 points.

Submit files listarray.c, test_listarray.c, Makefile, references.txt, and tests.txt.

[20 points] listarray.c Library
- [1 point] Tests malloc calls for success
- [1 point] Preconditions verified with assert
- [2 points] Makefile separately compiles files and links executable
- [16 points] Implementations efficiently meet specification
  - [3 points] get_ptr
  - [2 points] create
  - [1 point] size
  - [2 points] add
  - [1 point] get
  - [2 points] set
  - [3 points] contains
  - [2 points] clear
Comments on Program Format, Comments, Readability, etc.
(Points not given, but points can be deducted.)
[10 points] Testing
- [3 points] Test plan enumerates problem circumstances
- [4 points] Test program covers cases, with the expected outcome
- [2 points] Transcript records compilation and AddressSanitized test runs
- [1 points] Summary statement explains why the program is correct

A five point penalty will apply to any submission that includes personally identifying information anywhere other than the references/honesty file.